Long-read splice alignment with high accuracy

uLTRA

uLTRA is a tool for splice alignment of long transcriptomic reads to a genome, guided by a database of exon annotations. uLTRA is particularly accurate when aligning to small exons (see some examples).

uLTRA is distributed as a Python package supported on Linux / OSX with Python (versions 3.4 or above).

Here is a YouTube video that describes uLTRA.

INSTALLATION

Conda recipe

There is a bioconda recipe, a docker image, and a singularity container of uLTRA created by sguizard. You can use, e.g., the bioconda recipe for an easy automated installation.

Alternative installation methods are provided below.

Using the INSTALL.sh script

You can clone this repository and run the script INSTALL.sh as

git clone https://github.com/ksahlin/uLTRA.git --depth 1
cd uLTRA
./INSTALL.sh <install_directory>

The install script is tested in a bash environment.

To run uLTRA, you need to activate the conda environment "ultra":

conda activate ultra

Without the INSTALL.sh script

You can also perform the steps below manually for more control.

1. Create conda environment

Create a conda environment called ultra and activate it:

conda create -n ultra python=3 pip
conda activate ultra

2. Install uLTRA

pip install ultra-bioinformatics

3. Install third party tools

Install namfinder and minimap2, and place the generated binaries namfinder and minimap2 in your path.
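A quick way to confirm this step worked is to check that both binaries resolve on your PATH. The following is a minimal sketch using only the Python standard library; the helper names `find_tools` and `missing_tools` are hypothetical and not part of uLTRA.

```python
import shutil

# Tool names taken from the step above; both are expected on PATH.
REQUIRED_TOOLS = ("namfinder", "minimap2")


def find_tools(names=REQUIRED_TOOLS):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names=REQUIRED_TOOLS):
    """Return the tool names that could not be found on PATH."""
    return [name for name, path in find_tools(names).items() if path is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("namfinder and minimap2 were both found on PATH.")
```

Run it inside the activated "ultra" environment so that the same PATH is inspected that uLTRA itself will see.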

4. Verify installation

You should now have 'uLTRA' installed; try it:

uLTRA --help

Each time you start or log in to your server or computer, you need to activate the conda environment "ultra" before running uLTRA:

conda activate ultra

You can also download the test data available in this repository here and run:

uLTRA pipeline [/your/full/path/to/test]/SIRV_genes.fasta \
               [/your/full/path/to/test]/SIRV_genes_C_170612a.gtf \
               [/your/full/path/to/test]/reads.fa outfolder/ [optional parameters]

Entirely from source

Make sure the below-listed dependencies are installed (installation links below). All dependencies except namfinder can be installed with pip install X or through conda.

With these dependencies installed, run

git clone https://github.com/ksahlin/uLTRA.git
cd uLTRA
./uLTRA

USAGE

uLTRA can be used with either PacBio Iso-Seq or ONT cDNA/dRNA reads.

Indexing

uLTRA index genome.fasta  /full/path/to/annotation.gtf  outfolder/  [parameters]

Important parameters:

--disable_infer can speed up the indexing considerably, but it only works if you have the gene feature and transcript feature in your GTF file.
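Before passing --disable_infer, you may want to confirm that your GTF actually contains both feature types. The sketch below scans the third (feature) column of a GTF file. The function names are hypothetical, and the check only establishes that "gene" and "transcript" rows are present; whether uLTRA requires anything beyond that is not stated here.

```python
import io


def gtf_feature_types(handle):
    """Collect the distinct values found in the third GTF column (feature type)."""
    features = set()
    for line in handle:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3:
            features.add(fields[2])
    return features


def has_gene_and_transcript(handle):
    """True if both 'gene' and 'transcript' feature rows occur in the GTF."""
    return {"gene", "transcript"} <= gtf_feature_types(handle)


# Small in-memory example; for a real file use: open("/full/path/to/annotation.gtf")
example = io.StringIO(
    'chr1\tsrc\tgene\t1\t900\t.\t+\t.\tgene_id "g1";\n'
    'chr1\tsrc\ttranscript\t1\t900\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n'
    'chr1\tsrc\texon\t1\t100\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n'
)
print(has_gene_and_transcript(example))
```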

Aligning

For example:

uLTRA align genome.fasta reads.[fa/fq] outfolder/  --ont --t 8   # ONT cDNA reads using 8 cores
uLTRA align genome.fasta reads.[fa/fq] outfolder/  --isoseq --t 8 # PacBio isoseq reads

Important parameters:

--index [PATH]: Sets a custom location to read the index from. By default, uLTRA tries to read the index from outfolder/.
--prefix [PREFIX OF FILE]: The aligned reads will be written to outfolder/reads.sam unless --prefix is set. For example, --prefix sample_X will output the reads in outfolder/sample_X.sam.
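If you launch alignments from a script, the parameters above can be assembled programmatically. The sketch below builds an argument list for `uLTRA align` and derives the SAM path described for --prefix. The helper names are hypothetical, and only the flags shown in this README (--ont, --isoseq, --t, --index, --prefix) are used; uLTRA may accept others.

```python
import os


def build_align_command(genome, reads, outfolder, tech="ont", threads=8,
                        index=None, prefix=None):
    """Build an argument list for 'uLTRA align' from the flags shown above."""
    if tech not in ("ont", "isoseq"):
        raise ValueError("tech must be 'ont' or 'isoseq'")
    cmd = ["uLTRA", "align", genome, reads, outfolder,
           "--" + tech, "--t", str(threads)]
    if index is not None:
        cmd += ["--index", index]    # custom index location
    if prefix is not None:
        cmd += ["--prefix", prefix]  # custom output file prefix
    return cmd


def expected_sam_path(outfolder, prefix=None):
    """Output path as described above: reads.sam unless a prefix is given."""
    name = prefix if prefix else "reads"
    return os.path.join(outfolder, name + ".sam")


cmd = build_align_command("genome.fasta", "reads.fq", "outfolder/",
                          tech="isoseq", prefix="sample_X")
print(" ".join(cmd))
print(expected_sam_path("outfolder/", "sample_X"))
```

To actually run the alignment, pass the list to `subprocess.run(cmd, check=True)` from within the activated "ultra" environment.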

Pipeline

Perform all the steps in one command:

uLTRA pipeline genome.fasta /full/path/to/annotation.gtf reads.fa outfolder/  [parameters]

Common errors

Not having a properly formatted GTF file. Before running uLTRA, note that it requires a properly formatted GTF file. If you have a GFF file or another annotation format, it is advised to use AGAT for conversion to GTF, as many other conversion tools do not respect the GTF format. For example, you can run AGAT as:

agat_convert_sp_gff2gtf.pl --gff annot.gff3 --gtf annot.gtf
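After conversion, a rough structural check can catch obvious formatting problems before you start indexing. The sketch below is not uLTRA's own validation and does not replace AGAT. It assumes only the general GTF conventions: nine tab-separated columns, integer start and end with start not greater than end, and a strand of "+", "-", or ".". The function name `gtf_problems` is hypothetical.

```python
def gtf_problems(lines, max_reports=10):
    """Return (line_number, message) tuples for structurally suspicious GTF lines."""
    problems = []
    for lineno, line in enumerate(lines, start=1):
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            problems.append(
                (lineno, "expected 9 tab-separated columns, found %d" % len(fields)))
        elif not (fields[3].isdigit() and fields[4].isdigit()):
            problems.append((lineno, "start/end are not integers"))
        elif int(fields[3]) > int(fields[4]):
            problems.append((lineno, "start is greater than end"))
        elif fields[6] not in ("+", "-", "."):
            problems.append((lineno, "unexpected strand %r" % fields[6]))
        # Stop early so a badly broken file does not flood the report.
        if len(problems) >= max_reports:
            break
    return problems


sample = [
    'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
    'chr1 src exon 100 200 . + . gene_id "g1";\n',  # spaces instead of tabs
    'chr1\tsrc\texon\t300\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
]
for lineno, message in gtf_problems(sample):
    print("line %d: %s" % (lineno, message))
```

For a real file, pass an open file handle, e.g. `gtf_problems(open("annot.gtf"))`.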

CREDITS

Please cite

Kristoffer Sahlin, Veli Mäkinen, Accurate spliced alignment of long RNA sequencing reads, Bioinformatics, Volume 37, Issue 24, 15 December 2021, Pages 4643–4651, https://doi.org/10.1093/bioinformatics/btab540

when using uLTRA. Please also cite minimap2, as uLTRA incorporates minimap2 for alignment of some genomic reads outside indexed regions. For example: "We aligned reads to the genome using uLTRA [1], which incorporates minimap2 [CIT]."

LICENCE

GPL v3.0, see LICENSE.txt.
