Another Gtf/Gff Analysis Toolkit https://nbisweden.github.io/AGAT/


NBISweden/AGAT


AGAT

Another Gtf/Gff Analysis Toolkit

Suite of tools to handle gene annotations in any GTF/GFF format.

Documentation >>here<<
Previous documentation, up to v1.4.0 (readthedocs): here




What can AGAT do for you?

AGAT can check, fix, and pad missing information (features/attributes) in any kind of GTF or GFF file to create a complete, sorted, and standardised GFF3 file. Over the years it has been enriched with many tools that perform just about any task related to GTF/GFF files (sanitizing, converting, merging, modifying, filtering, FASTA sequence extraction, adding information, etc.). Compared to other methods, AGAT is robust to even the most despicable GTF/GFF files.

  • Standardize/sanitize any GTF/GFF file into a comprehensive GFF3 format (script with _sp_ prefix)

    See standardization/sanitization tool
    task | tool
    check, fix, pad missing information into sorted and standardised GFF3 | agat_convert_sp_gxf2gxf.pl
    • add missing parent features (e.g. gene and mRNA if only CDS/exon exists).
    • add missing features (e.g. exon and UTR).
    • add missing mandatory attributes (i.e. ID, Parent).
    • fix identifiers to be unique.
    • fix feature locations.
    • remove duplicated features.
    • group related features (if spread in different places in the file).
    • sort features (tabix optional).
    • merge overlapping loci into one single locus (only if option activated).
  • Convert many formats

    See conversion tools
    task | tool
    convert any GTF/GFF into BED format | agat_convert_sp_gff2bed.pl
    convert any GTF/GFF into GTF format | agat_convert_sp_gff2gtf.pl
    convert any GTF/GFF into tabulated format | agat_sp_gff2tsv.pl
    convert any BAM from minimap2 into GFF format | agat_convert_sp_minimap2_bam2gff.pl
    convert any GTF/GFF into ZFF format | agat_sp_gff2zff.pl
    convert any GTF/GFF into any GTF/GFF (bioperl) format | agat_convert_sp_gxf2gxf.pl
    convert BED format into GFF3 format | agat_convert_bed2gff.pl
    convert EMBL format into GFF3 format | agat_convert_embl2gff.pl
    convert genscan format into GFF3 format | agat_convert_genscan2gff.pl
    convert mfannot format into GFF3 format | agat_convert_mfannot2gff.pl
  • Perform numerous tasks (Just about anything that is possible)

    See tools
    task | tool
    make feature statistics | agat_sp_statistics.pl
    make function statistics | agat_sp_functional_statistics.pl
    extract any type of sequence | agat_sp_extract_sequences.pl
    extract attributes | agat_sp_extract_attributes.pl
    complement annotations (non-overlapping loci) | agat_sp_complement_annotations.pl
    merge annotations | agat_sp_merge_annotations.pl
    filter gene models by ORF size | agat_sp_filter_by_ORF_size.pl
    filter to keep only longest isoforms | agat_sp_keep_longest_isoform.pl
    create intron features | agat_sp_add_introns.pl
    fix CDS phases | agat_sp_fix_cds_phases.pl
    manage IDs | agat_sp_manage_IDs.pl
    manage UTRs | agat_sp_manage_UTRs.pl
    manage introns | agat_sp_manage_introns.pl
    manage functional annotation | agat_sp_manage_functional_annotation.pl
    specificity sensitivity | agat_sp_sensitivity_specificity.pl
    fusion / split analysis between two annotations | agat_sp_compare_two_annotations.pl
    analyze differences between BUSCO results | agat_sp_compare_two_BUSCOs.pl
    ... and much more ... | ... see here ...

About the GTF/GFF format
The GTF/GFF formats are 9-column text formats used to describe and represent genomic features. The formats have evolved considerably since 1997, and although well-defined specifications now exist, they remain flexible enough to hold a wide variety of information. This flexibility has a drawback: there is an incredible number of flavours of the formats, which can cause problems with downstream programs.
For a complete overview of the GTF/GFF formats, have a look here.
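
To make the 9-column layout concrete, here is a minimal Python sketch (not part of AGAT, which is written in Perl) that splits one tab-separated GFF3 line into its columns. The `GffRecord` type and `parse_gff3_line` function are hypothetical names introduced for illustration, and the attribute parsing assumes the GFF3 `key=value;` syntax; GTF attributes use a different syntax and would need another parser.

```python
from typing import NamedTuple


class GffRecord(NamedTuple):
    """The nine GFF3 columns, in file order (hypothetical helper type)."""
    seqid: str
    source: str
    type: str
    start: int
    end: int
    score: str
    strand: str
    phase: str
    attributes: dict


def parse_gff3_line(line: str) -> GffRecord:
    """Split one tab-separated GFF3 feature line into its 9 columns."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 columns, got {len(cols)}")
    attributes = {}
    # Column 9 holds ';'-separated key=value pairs (GFF3 flavour only).
    for pair in cols[8].split(";"):
        if pair.strip():
            key, _, value = pair.strip().partition("=")
            attributes[key] = value
    return GffRecord(cols[0], cols[1], cols[2], int(cols[3]), int(cols[4]),
                     cols[5], cols[6], cols[7], attributes)


record = parse_gff3_line(
    "scaffold625\tmaker\tgene\t337818\t343277\t.\t+\t.\t"
    "ID=CLUHARG00000005458;Name=TUBB3_2"
)
print(record.type, record.start, record.end, record.attributes["ID"])
```

A strict parser like this one rejects anything that deviates from the specification, which is exactly where the many format flavours cause trouble.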

Installation

Using Docker

See details

First you must have Docker installed and running.
Second, have a look at the available AGAT biocontainers at quay.io.

Then:

# get the chosen AGAT container version
docker pull quay.io/biocontainers/agat:1.4.2--pl5321hdfd78af_0
# use an AGAT tool, e.g. agat_convert_sp_gxf2gxf.pl
docker run quay.io/biocontainers/agat:1.4.2--pl5321hdfd78af_0 agat_convert_sp_gxf2gxf.pl --help

Using Singularity

See details

First you must have Singularity installed and running.
Second, have a look at the available AGAT biocontainers at quay.io.

Then:

# get the chosen AGAT container version
singularity pull docker://quay.io/biocontainers/agat:1.4.2--pl5321hdfd78af_0
# run the container
singularity run agat_1.4.2--pl5321hdfd78af_0.sif

You are now in the container. You can use an AGAT tool, e.g. agat_convert_sp_gxf2gxf.pl, by running:

agat_convert_sp_gxf2gxf.pl --help

Using Bioconda

See details

Install AGAT

conda install -c bioconda agat

Update AGAT

conda update agat

Uninstall AGAT

conda uninstall agat

Old school - Manually

See details
You will have to install all prerequisites and AGAT manually.

Install prerequisites

  • R (optional)
    You can install it with conda (conda install r-base), through CRAN (See here for a nice tutorial), or using your package management tool (e.g. apt for Debian, Ubuntu, and related Linux distributions). R is optional and is used to produce some plots. You will also need to install the Perl dependency Statistics::R.

  • Perl >= 5.8
    It should already be available on your computer. If you are unlucky, perl.org is the place to go.

  • Perl modules
    They can be installed in different ways:

    • using cpan or cpanm
    cpanm install bioperl Clone Graph::Directed LWP::UserAgent Carp Sort::Naturally File::Share File::ShareDir::Install Moose YAML LWP::Protocol::https Term::ProgressBar
    • using conda

      • using the provided yaml file
      conda env create -f conda_environment_AGAT.yml
      conda activate agat
      • manually
      conda install perl-bioperl perl-clone perl-graph perl-lwp-simple perl-carp perl-sort-naturally perl-file-share perl-file-sharedir-install perl-moose perl-yaml perl-lwp-protocol-https perl-term-progressbar
    • using your package management tool (e.g apt for Debian, Ubuntu, and related Linux distributions)

    apt install libbio-perl-perl libclone-perl libgraph-perl liblwp-useragent-determined-perl libstatistics-r-perl libcarp-clan-perl libsort-naturally-perl libfile-share-perl libfile-sharedir-perl libfile-sharedir-install-perl libyaml-perl liblwp-protocol-https-perl libterm-progressbar-perl
  • Optional
    Some scripts offer the possibility to produce plots. You will need R and Statistics::R, which are not included by default.

    • R
      You can install it with conda (conda install r-base), through CRAN (See here for a nice tutorial), or using your package management tool (e.g. apt for Debian, Ubuntu, and related Linux distributions).

    • Statistics::R
      You can install it through conda (conda install perl-statistics-r), using cpan/cpanm (cpanm install Statistics::R), or with your package management tool (apt install libstatistics-r-perl).

Install AGAT

git clone https://github.com/NBISweden/AGAT.git # Clone AGAT
cd AGAT                                         # move into AGAT folder
perl Makefile.PL                                # Check all the dependencies*
make                                            # Compile
make test                                       # Test
make install                                    # Install

*If dependencies are missing you will be warned. Please refer to the Install prerequisites section.

Remark: On MS Windows, instead of make you'd probably have to use dmake or nmake, depending on the toolchain you have.

Update AGAT

From the folder where the repository is located.

git pull                                        # Update to last AGAT
perl Makefile.PL                                # Check all the dependencies*
make                                            # Compile
make test                                       # Test
make install                                    # Install

*If dependencies are missing you will be warned. Please refer to the Install prerequisites section.

Change to a specific version

From the folder where the repository is located.

git pull                                        # Update the code
git checkout v0.1                               # use version v0.1 (See releases tab for a list of available versions)
perl Makefile.PL                                # Check all the dependencies*
make                                            # Compile
make test                                       # Test
make install                                    # Install

*If dependencies are missing you will be warned. Please refer to the Install prerequisites section.

Uninstall AGAT

perl uninstall_AGAT

Usage

script_name.pl -h

List of tools

As AGAT is a toolkit, it contains a lot of tools. The main one is agat_convert_sp_gxf2gxf.pl, which checks, fixes, and pads missing information (features/attributes) in any kind of GTF or GFF file to create a complete, sorted, and standardised GFF3 file. All the installed scripts have the agat_ prefix.

To see the available tools, you have several options:

  • agat --tools
  • Typing agat_ in your terminal followed by the <TAB> key to activate autocompletion will display the complete list of available tools.
  • The documentation.

More about the tools

with _sp_ prefix => Means SLURP

The GFF file is loaded into memory in a specific data structure that gives access to any desired feature at any time. This has a memory cost but makes life smoother: it allows complicated tasks to be performed in a more time-efficient way. Moreover, it allows fixing all potential errors, within the limits of what the format itself permits. See the AGAT parser section for more information about it.

with _sq_ prefix => Means SEQUENTIAL

The GFF file is read and processed line by line from top to bottom, without sanity checks (e.g. on the relationships between features). This is memory efficient.
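
The trade-off between the two modes can be sketched in Python as follows. This is an illustration only, not AGAT's Perl implementation: the function names and the tiny in-memory GFF are assumptions made for the example. Note that the exon appears before its parent mRNA, which is the kind of situation a line-by-line pass cannot check.

```python
import io
from collections import defaultdict

# Tiny illustrative GFF3: the exon is listed before its parent mRNA.
GFF_TEXT = (
    "chr1\tdemo\texon\t100\t200\t.\t+\t.\tID=exon1;Parent=mrna1\n"
    "chr1\tdemo\tgene\t100\t500\t.\t+\t.\tID=gene1\n"
    "chr1\tdemo\tmRNA\t100\t500\t.\t+\t.\tID=mrna1;Parent=gene1\n"
)


def feature_lines(handle):
    """Yield the split columns of every non-comment, non-empty line."""
    for line in handle:
        if line.strip() and not line.startswith("#"):
            yield line.rstrip("\n").split("\t")


def sequential_count(handle):
    """_sq_ style: a single pass that keeps only a counter per feature type."""
    counts = defaultdict(int)
    for cols in feature_lines(handle):
        counts[cols[2]] += 1
    return dict(counts)


def slurp(handle):
    """_sp_ style: keep every feature in memory, indexed by its ID."""
    by_id = {}
    for cols in feature_lines(handle):
        attrs = dict(p.split("=", 1) for p in cols[8].split(";") if p)
        by_id[attrs["ID"]] = {"type": cols[2], "parent": attrs.get("Parent")}
    return by_id


def orphans(by_id):
    """Relationship check that is only possible once everything is loaded."""
    return [fid for fid, feat in by_id.items()
            if feat["parent"] is not None and feat["parent"] not in by_id]


counts = sequential_count(io.StringIO(GFF_TEXT))
features = slurp(io.StringIO(GFF_TEXT))
print(counts, orphans(features))
```

The sequential pass never holds more than one line, while the slurped index costs memory proportional to the file but can answer questions about any feature, whatever the order of the lines.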

The AGAT parser - Standardisation to create GXF files compliant with any tool

All tools with the agat_sp_ prefix parse and slurp the entire data into a specific data structure. Below you will find more information about the peculiarities of the data structure and the parsing approach used.

the data structure

See data structure details

The method creates a hash structure containing all the data in memory. We call it OMNISCIENT.
The OMNISCIENT holds the GFF/GTF header information in this structure:

$omniscient{other}{header} = header information from the beginning of the file starting by #

The OMNISCIENT holds the GFF/GTF feature information in this structure:

$omniscient{level1}{tag_l1}{level1_id} = feature <= tag could be gene, match
$omniscient{level2}{tag_l2}{idY} = @featureListL2 <= tag could be mRNA, rRNA, tRNA, etc. idY is a level1_id (known as the Parent attribute within the level2 feature). The @featureListL2 is a list, to be able to manage isoform cases.
$omniscient{level3}{tag_l3}{idZ} = @featureListL3 <= tag could be exon, cds, utr3, utr5, etc. idZ is the ID of a level2 feature (known as the Parent attribute within the level3 feature). The @featureListL3 is a list, to be able to put all the features of the same tag together.

The OMNISCIENT holds the agat_config.yml information in this structure:

$omniscient{config}{parameter1} = value parameter1
$omniscient{config}{parameter2} = value parameter2

The OMNISCIENT holds the feature_levels.yaml information in this structure:

$omniscient{other}{level}{level1}{featureTypeX} = value featureTypeX (standalone, topfeature)
$omniscient{other}{level}{level2}{featureTypeY} = value featureTypeY
$omniscient{other}{level}{level2}{featureTypeZ} = value featureTypeZ
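
As a rough illustration of the three-level layout described above, the sketch below mirrors it with nested Python dictionaries. It is an analogy, not AGAT code: the real structure is a Perl hash, and the plain dictionaries used here as features, the IDs, and the `children_of` helper are all assumptions made for the example.

```python
# level1 is keyed by its own ID; level2 and level3 are keyed by the ID of
# their parent and hold lists (isoforms, or all features sharing a tag).
omniscient = {
    "other": {"header": ["##gff-version 3"]},
    "level1": {"gene": {"gene1": {"ID": "gene1", "start": 100, "end": 900}}},
    "level2": {
        "mrna": {
            "gene1": [
                {"ID": "mrna1", "Parent": "gene1"},
                {"ID": "mrna2", "Parent": "gene1"},
            ]
        }
    },
    "level3": {
        "exon": {
            "mrna1": [{"ID": "exon1", "Parent": "mrna1"},
                      {"ID": "exon2", "Parent": "mrna1"}],
            "mrna2": [{"ID": "exon3", "Parent": "mrna2"}],
        },
        "cds": {"mrna1": [{"ID": "cds1", "Parent": "mrna1"}]},
    },
}


def children_of(omniscient, level1_id):
    """Walk level1 -> level2 -> level3 and return the IDs found on the way."""
    tree = {}
    for by_parent in omniscient["level2"].values():
        for l2_feature in by_parent.get(level1_id, []):
            l2_id = l2_feature["ID"]
            tree[l2_id] = {
                tag_l3: [feat["ID"] for feat in by_l2_id[l2_id]]
                for tag_l3, by_l2_id in omniscient["level3"].items()
                if l2_id in by_l2_id
            }
    return tree


tree = children_of(omniscient, "gene1")
print(tree)
```

Because each level is keyed by the parent ID, finding every child of a gene is a matter of dictionary lookups rather than a scan of the whole file.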

How does the AGAT parser work

The AGAT parser philosophy:

    1. Parse by Parent/child relationship or gene_id/transcript_id relationship.
    2. ELSE parse by a common tag (an attribute value shared by features that must be grouped together. By default we use locus_tag, but this can be set by parameter).
    3. ELSE parse sequentially (i.e. features are grouped in a bucket; the bucket changes at each level2 feature, and buckets are joined under a common tag at each new level1 feature).

/!\ In cases with only level3 features (e.g. rast or some prokka files), sequential parsing may not work as expected if the Parent/ID or gene_id/transcript_id attributes are missing. Indeed, all features will become children of a single newly created parent. To create a parent per feature or group of features, a common tag must be used to group them correctly (by default gene_id and locus_tag, but you can set the ones of your choice).

To summarize, by order of parsing priority: Parent/child relationship > locus_tag > sequential.
The parser may use only one or a mix of these approaches, depending on the peculiarities of the GTF/GFF file you provide.
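
The priority order can be sketched as a small decision function. This is a simplified Python illustration for a single non-top-level feature, not the actual parser logic: the `choose_strategy` name, its return values, and the `common_tags` parameter are hypothetical.

```python
def choose_strategy(attributes, common_tags=("locus_tag",)):
    """Pick how a feature would be grouped, by decreasing priority."""
    # 1. An explicit relationship wins: GFF3 Parent, or GTF-style
    #    transcript_id / gene_id.
    for name in ("Parent", "transcript_id", "gene_id"):
        if name in attributes:
            return ("relationship", attributes[name])
    # 2. Otherwise fall back on an attribute shared by related features.
    for tag in common_tags:
        if tag in attributes:
            return ("common_tag", attributes[tag])
    # 3. Otherwise only the position in the file can be used.
    return ("sequential", None)


print(choose_strategy({"ID": "exon1", "Parent": "mrna1", "locus_tag": "L1"}))
print(choose_strategy({"ID": "Tob1_00001", "locus_tag": "Tob1_00001"}))
print(choose_strategy({"product": "putative signal peptide"}))
```

The last call corresponds to a feature such as the sig_peptide lines of example 8, which carry neither a relationship nor a common tag and can therefore only be attached by their position in the file.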

What can the AGAT parser do for you

  • It creates missing parental features (e.g. if a level2 or level3 feature does not have parental feature(s), the missing level2 and/or level1 feature(s) are created).
  • It creates missing mandatory attributes (ID and/or Parent).
  • It fixes identifiers to be unique.
  • It removes duplicated features (same position, same ID, same Parent).
  • It expands level3 features sharing multiple parents (e.g. if one exon lists multiple parent mRNAs in its Parent attribute, one exon per parent with a unique ID will be created).
  • It fixes feature location errors (e.g. if an mRNA extends beyond its gene location, the gene location is fixed).
  • It adds UTRs if possible (CDS and exon present).
  • It adds exons if possible (CDS has to be present).
  • It groups features together (if related features are spread across different places in the file).
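
Two of the items above, unique identifiers and the expansion of level3 features that share several parents, can be illustrated with a short Python sketch. The suffix-based naming scheme and both function names are assumptions made for the example; AGAT generates its own identifiers.

```python
def unique_id(base, used):
    """Return base, or base with a numeric suffix, so that no ID repeats."""
    candidate, n = base, 1
    while candidate in used:
        n += 1
        candidate = f"{base}-{n}"  # hypothetical naming scheme
    used.add(candidate)
    return candidate


def expand_multi_parent(features):
    """Create one copy per parent for features whose Parent lists several IDs."""
    used, expanded = set(), []
    for feature in features:
        for parent in feature["Parent"].split(","):
            clone = dict(feature, Parent=parent)  # shallow copy, single parent
            clone["ID"] = unique_id(feature["ID"], used)
            expanded.append(clone)
    return expanded


exons = [{"ID": "exon1", "Parent": "mrna1,mrna2", "start": 100, "end": 200}]
result = expand_multi_parent(exons)
print(result)
```

After expansion every feature has exactly one parent, which is what makes the keyed-by-parent storage of level3 features possible.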

examples

AGAT was tested on 42 different types of GTF/GFF files of different flavours and/or containing errors. A few are listed below, but you can find the full list in the t/gff_syntax directory.

example 8 - only CDS defined
See example
##gff-version 3
Tob1_contig1	Prodigal:2.60	CDS	476	670	.	-	0	ID=Tob1_00001;locus_tag=Tob1_00001;product=hypothetical protein
Tob1_contig1	Prodigal:2.60	CDS	34266	35222	.	+	0	ID=Tob1_00024;locus_tag=Tob1_00024;product=hypothetical protein
Tob1_contig1	SignalP:4.1	sig_peptide	34266	34298	.	+	0	inference=ab initio prediction:SignalP:4.1;note=predicted cleavage at residue 33;product=putative signal peptide
Tob1_contig1	Prodigal:2.60	CDS	35267	37444	.	-	0	ID=Tob1_00025;locus_tag=Tob1_00025;
Tob1_contig1	SignalP:4.1	sig_peptide	37420	37444	.	-	0	inference=ab initio prediction:SignalP:4.1;note=predicted cleavage at residue 25;product=putative signal peptide
Tob1_contig1	Prodigal:2.60	CDS	38304	39338	.	-	0	ID=Tob1_00026;locus_tag=Tob1_00026;

agat_convert_sp_gxf2gxf.pl --gff 8_test.gff

See result
##gff-version 3
Tob1_contig1	Prodigal:2.60	gene	476	670	.	-	0	ID=nbis_NEW-gene-1;locus_tag=Tob1_00001;product=hypothetical protein
Tob1_contig1	Prodigal:2.60	mRNA	476	670	.	-	0	ID=nbis_nol2id-cds-1;Parent=nbis_NEW-gene-1;locus_tag=Tob1_00001;product=hypothetical protein
Tob1_contig1	Prodigal:2.60	exon	476	670	.	-	.	ID=nbis_NEW-exon-1;Parent=nbis_nol2id-cds-1;locus_tag=Tob1_00001;product=hypothetical protein
Tob1_contig1	Prodigal:2.60	CDS	476	670	.	-	0	ID=Tob1_00001;Parent=nbis_nol2id-cds-1;locus_tag=Tob1_00001;product=hypothetical protein
Tob1_contig1	Prodigal:2.60	gene	34266	35222	.	+	0	ID=nbis_NEW-gene-2;locus_tag=Tob1_00024;product=hypothetical protein
Tob1_contig1	Prodigal:2.60	mRNA	34266	35222	.	+	0	ID=nbis_nol2id-cds-2;Parent=nbis_NEW-gene-2;locus_tag=Tob1_00024;product=hypothetical protein
Tob1_contig1	Prodigal:2.60	exon	34266	35222	.	+	.	ID=nbis_NEW-exon-2;Parent=nbis_nol2id-cds-2;locus_tag=Tob1_00024;product=hypothetical protein
Tob1_contig1	Prodigal:2.60	CDS	34266	35222	.	+	0	ID=Tob1_00024;Parent=nbis_nol2id-cds-2;locus_tag=Tob1_00024;product=hypothetical protein
Tob1_contig1	SignalP:4.1	sig_peptide	34266	34298	.	+	0	ID=sig_peptide-1;Parent=nbis_nol2id-cds-2;inference=ab initio prediction:SignalP:4.1;note=predicted cleavage at residue 33;product=putative signal peptide
Tob1_contig1	Prodigal:2.60	gene	35267	37444	.	-	0	ID=nbis_NEW-gene-3;locus_tag=Tob1_00025
Tob1_contig1	Prodigal:2.60	mRNA	35267	37444	.	-	0	ID=nbis_nol2id-cds-3;Parent=nbis_NEW-gene-3;locus_tag=Tob1_00025
Tob1_contig1	Prodigal:2.60	exon	35267	37444	.	-	.	ID=nbis_NEW-exon-3;Parent=nbis_nol2id-cds-3;locus_tag=Tob1_00025
Tob1_contig1	Prodigal:2.60	CDS	35267	37444	.	-	0	ID=Tob1_00025;Parent=nbis_nol2id-cds-3;locus_tag=Tob1_00025
Tob1_contig1	SignalP:4.1	sig_peptide	37420	37444	.	-	0	ID=sig_peptide-2;Parent=nbis_nol2id-cds-3;inference=ab initio prediction:SignalP:4.1;note=predicted cleavage at residue 25;product=putative signal peptide
Tob1_contig1	Prodigal:2.60	gene	38304	39338	.	-	0	ID=nbis_NEW-gene-4;locus_tag=Tob1_00026
Tob1_contig1	Prodigal:2.60	mRNA	38304	39338	.	-	0	ID=nbis_nol2id-cds-4;Parent=nbis_NEW-gene-4;locus_tag=Tob1_00026
Tob1_contig1	Prodigal:2.60	exon	38304	39338	.	-	.	ID=nbis_NEW-exon-4;Parent=nbis_nol2id-cds-4;locus_tag=Tob1_00026
Tob1_contig1	Prodigal:2.60	CDS	38304	39338	.	-	0	ID=Tob1_00026;Parent=nbis_nol2id-cds-4;locus_tag=Tob1_00026
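
The idea behind example 8, building the missing gene, mRNA and exon around a CDS that stands alone, can be sketched in Python as follows. The `wrap_lone_cds` function and the `new-*` identifiers are hypothetical; AGAT uses its own naming (such as nbis_NEW-gene-1) and also carries over attributes, which this sketch leaves out.

```python
def wrap_lone_cds(cds, n):
    """Create gene, mRNA and exon features spanning a parentless CDS."""
    gene_id = f"new-gene-{n}"  # hypothetical identifiers
    mrna_id = f"new-mrna-{n}"
    exon_id = f"new-exon-{n}"
    # The new features inherit the location of the CDS.
    span = {key: cds[key] for key in ("seqid", "start", "end", "strand")}
    gene = dict(span, type="gene", ID=gene_id)
    mrna = dict(span, type="mRNA", ID=mrna_id, Parent=gene_id)
    exon = dict(span, type="exon", ID=exon_id, Parent=mrna_id)
    linked_cds = dict(cds, Parent=mrna_id)  # the CDS keeps its original ID
    return [gene, mrna, exon, linked_cds]


cds = {"seqid": "Tob1_contig1", "type": "CDS", "start": 34266, "end": 35222,
       "strand": "+", "ID": "Tob1_00024"}
locus = wrap_lone_cds(cds, 2)
for feature in locus:
    print(feature["type"], feature["ID"], feature.get("Parent"))
```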
example 9 - level2 feature missing (mRNA) and level3 features missing (UTRs)
See example
##gff-version 3
#!gff-spec-version 1.14
#!source-version NCBI C++ formatter 0.2
##Type DNA NC_003070.9
NC_003070.9	RefSeq	source	1	30427671	.	+	.	organism=Arabidopsis thaliana;mol_type=genomic DNA;db_xref=taxon:3702;chromosome=1;ecotype=Columbia
NC_003070.9	RefSeq	gene	3631	5899	.	+	.	ID=NC_003070.9:NAC001;locus_tag=AT1G01010;
NC_003070.9	RefSeq	exon	3631	3913	.	+	.	ID=NM_099983.2;Parent=NC_003070.9:NAC001;gbkey=mRNA;locus_tag=AT1G01010;
NC_003070.9	RefSeq	exon	3996	4276	.	+	.	ID=NM_099983.2;Parent=NC_003070.9:NAC001;gbkey=mRNA;locus_tag=AT1G01010;
NC_003070.9	RefSeq	exon	4486	4605	.	+	.	ID=NM_099983.2;Parent=NC_003070.9:NAC001;gbkey=mRNA;locus_tag=AT1G01010;
NC_003070.9	RefSeq	exon	4706	5095	.	+	.	ID=NM_099983.2;Parent=NC_003070.9:NAC001;gbkey=mRNA;locus_tag=AT1G01010;
NC_003070.9	RefSeq	exon	5174	5326	.	+	.	ID=NM_099983.2;Parent=NC_003070.9:NAC001;gbkey=mRNA;locus_tag=AT1G01010;
NC_003070.9	RefSeq	exon	5439	5899	.	+	.	ID=NM_099983.2;Parent=NC_003070.9:NAC001;gbkey=mRNA;locus_tag=AT1G01010;
NC_003070.9	RefSeq	CDS	3760	3913	.	+	0	ID=NM_099983.2;Parent=NC_003070.9:NAC001;locus_tag=AT1G01010;
NC_003070.9	RefSeq	CDS	3996	4276	.	+	2	ID=NM_099983.2;Parent=NC_003070.9:NAC001;locus_tag=AT1G01010;
NC_003070.9	RefSeq	CDS	4486	4605	.	+	0	ID=NM_099983.2;Parent=NC_003070.9:NAC001;locus_tag=AT1G01010;
NC_003070.9	RefSeq	CDS	4706	5095	.	+	0	ID=NM_099983.2;Parent=NC_003070.9:NAC001;locus_tag=AT1G01010;
NC_003070.9	RefSeq	CDS	5174	5326	.	+	0	ID=NM_099983.2;Parent=NC_003070.9:NAC001;locus_tag=AT1G01010;
NC_003070.9	RefSeq	CDS	5439	5627	.	+	0	ID=NM_099983.2;Parent=NC_003070.9:NAC001;locus_tag=AT1G01010;
NC_003070.9	RefSeq	start_codon	3760	3762	.	+	0	ID=NM_099983.2;Parent=NC_003070.9:NAC001;locus_tag=AT1G01010;
NC_003070.9	RefSeq	stop_codon	5628	5630	.	+	0	ID=NM_099983.2;Parent=NC_003070.9:NAC001;locus_tag=AT1G01010;

agat_convert_sp_gxf2gxf.pl --gff 9_test.gff

See result
##gff-version 3
#!gff-spec-version 1.14
#!source-version NCBI C++ formatter 0.2
##Type DNA NC_003070.9
NC_003070.9	RefSeq	source	1	30427671	.	+	.	ID=source-1;chromosome=1;db_xref=taxon:3702;ecotype=Columbia;mol_type=genomic DNA;organism=Arabidopsis thaliana
NC_003070.9	RefSeq	gene	3631	5899	.	+	.	ID=nbis_NEW-gene-1;locus_tag=AT1G01010
NC_003070.9	RefSeq	mRNA	3631	5899	.	+	.	ID=NC_003070.9:NAC001;Parent=nbis_NEW-gene-1;locus_tag=AT1G01010
NC_003070.9	RefSeq	exon	3631	3913	.	+	.	ID=NM_099983.2;Parent=NC_003070.9:NAC001;gbkey=mRNA;locus_tag=AT1G01010
NC_003070.9	RefSeq	exon	3996	4276	.	+	.	ID=nbis_NEW-exon-1;Parent=NC_003070.9:NAC001;gbkey=mRNA;locus_tag=AT1G01010
NC_003070.9	RefSeq	exon	4486	4605	.	+	.	ID=nbis_NEW-exon-2;Parent=NC_003070.9:NAC001;gbkey=mRNA;locus_tag=AT1G01010
NC_003070.9	RefSeq	exon	4706	5095	.	+	.	ID=nbis_NEW-exon-3;Parent=NC_003070.9:NAC001;gbkey=mRNA;locus_tag=AT1G01010
NC_003070.9	RefSeq	exon	5174	5326	.	+	.	ID=nbis_NEW-exon-4;Parent=NC_003070.9:NAC001;gbkey=mRNA;locus_tag=AT1G01010
NC_003070.9	RefSeq	exon	5439	5899	.	+	.	ID=nbis_NEW-exon-5;Parent=NC_003070.9:NAC001;gbkey=mRNA;locus_tag=AT1G01010
NC_003070.9	RefSeq	CDS	3760	3913	.	+	0	ID=nbis_NEW-cds-1;Parent=NC_003070.9:NAC001;locus_tag=AT1G01010
NC_003070.9	RefSeq	CDS	3996	4276	.	+	2	ID=nbis_NEW-cds-1;Parent=NC_003070.9:NAC001;locus_tag=AT1G01010
NC_003070.9	RefSeq	CDS	4486	4605	.	+	0	ID=nbis_NEW-cds-1;Parent=NC_003070.9:NAC001;locus_tag=AT1G01010
NC_003070.9	RefSeq	CDS	4706	5095	.	+	0	ID=nbis_NEW-cds-1;Parent=NC_003070.9:NAC001;locus_tag=AT1G01010
NC_003070.9	RefSeq	CDS	5174	5326	.	+	0	ID=nbis_NEW-cds-1;Parent=NC_003070.9:NAC001;locus_tag=AT1G01010
NC_003070.9	RefSeq	CDS	5439	5627	.	+	0	ID=nbis_NEW-cds-1;Parent=NC_003070.9:NAC001;locus_tag=AT1G01010
NC_003070.9	RefSeq	five_prime_UTR	3631	3759	.	+	.	ID=nbis_NEW-five_prime_utr-1;Parent=NC_003070.9:NAC001;gbkey=mRNA;locus_tag=AT1G01010
NC_003070.9	RefSeq	start_codon	3760	3762	.	+	0	ID=nbis_NEW-start_codon-1;Parent=NC_003070.9:NAC001;locus_tag=AT1G01010
NC_003070.9	RefSeq	stop_codon	5628	5630	.	+	0	ID=nbis_NEW-stop_codon-1;Parent=NC_003070.9:NAC001;locus_tag=AT1G01010
NC_003070.9	RefSeq	three_prime_UTR	5628	5899	.	+	.	ID=nbis_NEW-three_prime_utr-1;Parent=NC_003070.9:NAC001;gbkey=mRNA;locus_tag=AT1G01010
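
The UTR coordinates in the result of example 9 follow from simple interval arithmetic: whatever part of an exon lies outside the CDS span is UTR. The Python sketch below reproduces that calculation with the exon and CDS coordinates of example 9. The `infer_utrs` function is a hypothetical illustration of the principle, not AGAT's implementation.

```python
def infer_utrs(exons, cds, strand):
    """Return (five_prime, three_prime) intervals: exon parts outside the CDS."""
    cds_start = min(start for start, _ in cds)
    cds_end = max(end for _, end in cds)
    left, right = [], []
    for start, end in exons:
        if start < cds_start:  # part of the exon lies before the CDS
            left.append((start, min(end, cds_start - 1)))
        if end > cds_end:      # part of the exon lies after the CDS
            right.append((max(start, cds_end + 1), end))
    # On the minus strand the 5' end is on the right-hand side.
    return (left, right) if strand == "+" else (right, left)


exons = [(3631, 3913), (3996, 4276), (4486, 4605),
         (4706, 5095), (5174, 5326), (5439, 5899)]
cds = [(3760, 3913), (3996, 4276), (4486, 4605),
       (4706, 5095), (5174, 5326), (5439, 5627)]
five_prime, three_prime = infer_utrs(exons, cds, "+")
print(five_prime, three_prime)
```

In this input the CDS ends at 5627 and the stop codon is a separate feature, so the computed three_prime_UTR starts at 5628, as in the result above.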
example 18 - related features spread within the file
See example
##gff-version 3
scaffold625	maker	gene	337818	343277	.	+	.	ID=CLUHARG00000005458;Name=TUBB3_2
scaffold625	maker	mRNA	337818	343277	.	+	.	ID=CLUHART00000008717;Parent=CLUHARG00000005458
scaffold625	maker	exon	337818	337971	.	+	.	ID=CLUHART00000008717:exon:1404;Parent=CLUHART00000008717
scaffold625	maker	exon	340733	340841	.	+	.	ID=CLUHART00000008717:exon:1405;Parent=CLUHART00000008717
scaffold789	maker	three_prime_UTR	564589	564780	.	+	.	ID=CLUHART00000006146:three_prime_utr;Parent=CLUHART00000006146
scaffold789	maker	mRNA	558184	564780	.	+	.	ID=CLUHART00000006147;Parent=CLUHARG00000003852
scaffold625	maker	CDS	337915	337971	.	+	0	ID=CLUHART00000008717:cds;Parent=CLUHART00000008717
scaffold625	maker	CDS	340733	340841	.	+	0	ID=CLUHART00000008717:cds;Parent=CLUHART00000008717
scaffold625	maker	CDS	341518	341628	.	+	2	ID=CLUHART00000008717:cds;Parent=CLUHART00000008717
scaffold625	maker	CDS	341964	343033	.	+	2	ID=CLUHART00000008717:cds;Parent=CLUHART00000008717
scaffold625	maker	five_prime_UTR	337818	337914	.	+	.	ID=CLUHART00000008717:five_prime_utr;Parent=CLUHART00000008717
scaffold625	maker	three_prime_UTR	343034	343277	.	+	.	ID=CLUHART00000008717:three_prime_utr;Parent=CLUHART00000008717
scaffold789	maker	gene	558184	564780	.	+	.	ID=CLUHARG00000003852;Name=PF11_0240
scaffold789	maker	mRNA	558184	564780	.	+	.	ID=CLUHART00000006146;Parent=CLUHARG00000003852
scaffold789	maker	exon	558184	560123	.	+	.	ID=CLUHART00000006146:exon:995;Parent=CLUHART00000006146
scaffold789	maker	exon	561401	561519	.	+	.	ID=CLUHART00000006146:exon:996;Parent=CLUHART00000006146
scaffold789	maker	exon	564171	564235	.	+	.	ID=CLUHART00000006146:exon:997;Parent=CLUHART00000006146
scaffold789	maker	exon	564372	564780	.	+	.	ID=CLUHART00000006146:exon:998;Parent=CLUHART00000006146
scaffold789	maker	CDS	558191	560123	.	+	0	ID=CLUHART00000006146:cds;Parent=CLUHART00000006146
scaffold789	maker	CDS	561401	561519	.	+	2	ID=CLUHART00000006146:cds;Parent=CLUHART00000006146
scaffold625	maker	exon	341518	341628	.	+	.	ID=CLUHART00000008717:exon:1406;Parent=CLUHART00000008717
scaffold625	maker	exon	341964	343277	.	+	.	ID=CLUHART00000008717:exon:1407;Parent=CLUHART00000008717
scaffold789	maker	CDS	564171	564235	.	+	0	ID=CLUHART00000006146:cds;Parent=CLUHART00000006146
scaffold789	maker	CDS	564372	564588	.	+	1	ID=CLUHART00000006146:cds;Parent=CLUHART00000006146
scaffold789	maker	five_prime_UTR	558184	558190	.	+	.	ID=CLUHART00000006146:five_prime_utr;Parent=CLUHART00000006146
scaffold789	maker	exon	558184	560123	.	+	.	ID=CLUHART00000006147:exon:997;Parent=CLUHART00000006147
scaffold789	maker	exon	561401	561519	.	+	.	ID=CLUHART00000006147:exon:998;Parent=CLUHART00000006147
scaffold789	maker	exon	562057	562121	.	+	.	ID=CLUHART00000006147:exon:999;Parent=CLUHART00000006147
scaffold789	maker	exon	564372	564780	.	+	.	ID=CLUHART00000006147:exon:1000;Parent=CLUHART00000006147
scaffold789	maker	CDS	558191	560123	.	+	0	ID=CLUHART00000006147:cds;Parent=CLUHART00000006147
scaffold789	maker	CDS	561401	561519	.	+	2	ID=CLUHART00000006147:cds;Parent=CLUHART00000006147
scaffold789	maker	CDS	562057	562121	.	+	0	ID=CLUHART00000006147:cds;Parent=CLUHART00000006147
scaffold789	maker	CDS	564372	564588	.	+	1	ID=CLUHART00000006147:cds;Parent=CLUHART00000006147
scaffold789	maker	five_prime_UTR	558184	558190	.	+	.	ID=CLUHART00000006147:five_prime_utr;Parent=CLUHART00000006147
scaffold789	maker	three_prime_UTR	564589	564780	.	+	.	ID=CLUHART00000006147:three_prime_utr;Parent=CLUHART00000006147

agat_convert_sp_gxf2gxf.pl --gff 18_test.gff

See result
##gff-version 3
scaffold625	maker	gene	337818	343277	.	+	.	ID=CLUHARG00000005458;Name=TUBB3_2
scaffold625	maker	mRNA	337818	343277	.	+	.	ID=CLUHART00000008717;Parent=CLUHARG00000005458
scaffold625	maker	exon	337818	337971	.	+	.	ID=CLUHART00000008717:exon:1404;Parent=CLUHART00000008717
scaffold625	maker	exon	340733	340841	.	+	.	ID=CLUHART00000008717:exon:1405;Parent=CLUHART00000008717
scaffold625	maker	exon	341518	341628	.	+	.	ID=CLUHART00000008717:exon:1406;Parent=CLUHART00000008717
scaffold625	maker	exon	341964	343277	.	+	.	ID=CLUHART00000008717:exon:1407;Parent=CLUHART00000008717
scaffold625	maker	CDS	337915	337971	.	+	0	ID=CLUHART00000008717:cds;Parent=CLUHART00000008717
scaffold625	maker	CDS	340733	340841	.	+	0	ID=CLUHART00000008717:cds;Parent=CLUHART00000008717
scaffold625	maker	CDS	341518	341628	.	+	2	ID=CLUHART00000008717:cds;Parent=CLUHART00000008717
scaffold625	maker	CDS	341964	343033	.	+	2	ID=CLUHART00000008717:cds;Parent=CLUHART00000008717
scaffold625	maker	five_prime_UTR	337818	337914	.	+	.	ID=CLUHART00000008717:five_prime_utr;Parent=CLUHART00000008717
scaffold625	maker	three_prime_UTR	343034	343277	.	+	.	ID=CLUHART00000008717:three_prime_utr;Parent=CLUHART00000008717
scaffold789	maker	gene	558184	564780	.	+	.	ID=CLUHARG00000003852;Name=PF11_0240
scaffold789	maker	mRNA	558184	564780	.	+	.	ID=CLUHART00000006146;Parent=CLUHARG00000003852
scaffold789	maker	exon	558184	560123	.	+	.	ID=CLUHART00000006146:exon:995;Parent=CLUHART00000006146
scaffold789	maker	exon	561401	561519	.	+	.	ID=CLUHART00000006146:exon:996;Parent=CLUHART00000006146
scaffold789	maker	exon	564171	564235	.	+	.	ID=CLUHART00000006146:exon:997;Parent=CLUHART00000006146
scaffold789	maker	exon	564372	564780	.	+	.	ID=CLUHART00000006146:exon:998;Parent=CLUHART00000006146
scaffold789	maker	CDS	558191	560123	.	+	0	ID=CLUHART00000006146:cds;Parent=CLUHART00000006146
scaffold789	maker	CDS	561401	561519	.	+	2	ID=CLUHART00000006146:cds;Parent=CLUHART00000006146
scaffold789	maker	CDS	564171	564235	.	+	0	ID=CLUHART00000006146:cds;Parent=CLUHART00000006146
scaffold789	maker	CDS	564372	564588	.	+	1	ID=CLUHART00000006146:cds;Parent=CLUHART00000006146
scaffold789	maker	five_prime_UTR	558184	558190	.	+	.	ID=CLUHART00000006146:five_prime_utr;Parent=CLUHART00000006146
scaffold789	maker	three_prime_UTR	564589	564780	.	+	.	ID=CLUHART00000006146:three_prime_utr;Parent=CLUHART00000006146
scaffold789	maker	mRNA	558184	564780	.	+	.	ID=CLUHART00000006147;Parent=CLUHARG00000003852
scaffold789	maker	exon	558184	560123	.	+	.	ID=CLUHART00000006147:exon:997;Parent=CLUHART00000006147
scaffold789	maker	exon	561401	561519	.	+	.	ID=CLUHART00000006147:exon:998;Parent=CLUHART00000006147
scaffold789	maker	exon	562057	562121	.	+	.	ID=CLUHART00000006147:exon:999;Parent=CLUHART00000006147
scaffold789	maker	exon	564372	564780	.	+	.	ID=CLUHART00000006147:exon:1000;Parent=CLUHART00000006147
scaffold789	maker	CDS	558191	560123	.	+	0	ID=CLUHART00000006147:cds;Parent=CLUHART00000006147
scaffold789	maker	CDS	561401	561519	.	+	2	ID=CLUHART00000006147:cds;Parent=CLUHART00000006147
scaffold789	maker	CDS	562057	562121	.	+	0	ID=CLUHART00000006147:cds;Parent=CLUHART00000006147
scaffold789	maker	CDS	564372	564588	.	+	1	ID=CLUHART00000006147:cds;Parent=CLUHART00000006147
scaffold789	maker	five_prime_UTR	558184	558190	.	+	.	ID=CLUHART00000006147:five_prime_utr;Parent=CLUHART00000006147
scaffold789	maker	three_prime_UTR	564589	564780	.	+	.	ID=CLUHART00000006147:three_prime_utr;Parent=CLUHART00000006147

How to cite?

This work has not been published (I will think about it), but you can cite it as follows:

Dainat J. 2022. Another Gtf/Gff Analysis Toolkit (AGAT): Resolve interoperability issues and accomplish more with your annotations. Plant and Animal Genome XXIX Conference. https://github.com/NBISweden/AGAT.

and/or (adapt the AGAT version to the one you used):

Dainat J. AGAT: Another Gff Analysis Toolkit to handle annotations in any GTF/GFF format.  (Version v1.4.1). Zenodo. https://www.doi.org/10.5281/zenodo.3552717

Publication using AGAT

See here for examples of publications using AGAT.

Troubleshooting

See the Troubleshooting section of the documentation here.

