Introns are removed and exons joined in the process of RNA splicing. The resulting RNA can be an mRNA or a non-coding RNA.
An exon is any part of a gene that will form a part of the final mature RNA produced by that gene after introns have been removed by RNA splicing. The term exon refers to both the DNA sequence within a gene and the corresponding sequence in RNA transcripts. In RNA splicing, introns are removed and exons are covalently joined to one another as part of generating the mature RNA. Just as the entire set of genes for a species constitutes the genome, the entire set of exons constitutes the exome.
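The joining of exons described above can be illustrated with a minimal sketch. The toy sequence, the coordinates, and the function name `splice` are hypothetical and chosen only for illustration. The sketch treats splicing as simple string concatenation and ignores the splicing machinery itself.

```python
def splice(pre_rna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon intervals (0-based, half-open) in transcript order.

    Everything outside the listed intervals is treated as intron and dropped.
    """
    return "".join(pre_rna[start:end] for start, end in sorted(exons))


# Hypothetical toy transcript: uppercase = exon, lowercase = intron.
pre_rna = "AUGGCC" + "guaaguuuuag" + "GAUUAA"
exons = [(0, 6), (17, 23)]  # assumed exon coordinates within pre_rna

mature = splice(pre_rna, exons)
print(mature)
```

In this toy case the 11-nucleotide intron is removed and the two exons are joined into a single 12-nucleotide mature sequence.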
The term exon is a shortening of the phrase expressed region and was coined by American biochemist Walter Gilbert in 1978:[1]
The notion of the cistron... must be replaced by that of a transcription unit containing regions which will be lost from the mature messenger – which I suggest we call introns (for intragenic regions) – alternating with regions which will be expressed – exons.
This definition was originally made for protein-coding transcripts that are spliced before being translated. The term later came to include sequences removed from rRNA[2] and tRNA,[3] and other ncRNA,[4] and it was also later used for RNA molecules originating from different parts of the genome that are then ligated by trans-splicing.[5]
As of 2002, eukaryotic protein-coding genes in GenBank contained 5.48 exons on average. The average exon encoded 30–36 amino acids.[7] While the longest exon in the human genome is 11,555 bp long, several exons have been found to be only 2 bp long.[8] A single-nucleotide exon has been reported from the Arabidopsis genome.[9] In humans, as with protein-coding mRNAs, most non-coding RNAs also contain multiple exons.[10]
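As a rough worked example, the average of 30–36 amino acids per exon can be converted into a nucleotide length, since each amino acid is encoded by a three-nucleotide codon. The function name below is hypothetical, and the calculation assumes the exon is entirely coding, which is not true of exons that contain untranslated sequence.

```python
NT_PER_CODON = 3  # each amino acid is encoded by one three-nucleotide codon


def coding_length_nt(n_amino_acids: int) -> int:
    """Nucleotides needed to encode n_amino_acids (assumes a fully coding exon)."""
    return n_amino_acids * NT_PER_CODON


low, high = coding_length_nt(30), coding_length_nt(36)
print(f"Average coding exon: roughly {low}-{high} nt")
```

Under that assumption an average exon spans on the order of 90 to 108 nucleotides, far shorter than the longest human exon cited above.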
Exons in a messenger RNA precursor (pre-mRNA). Exons can include both sequences that code for amino acids (red) and untranslated sequences (grey). Introns — those parts of the pre-mRNA that are not in the mRNA — (blue) are removed, and the exons are joined (spliced) to form the final functional mRNA. The 5′ and 3′ ends of the mRNA are marked to differentiate the two untranslated regions (grey).
In protein-coding genes, the exons include both the protein-coding sequence and the 5′- and 3′-untranslated regions (UTR). Often the first exon includes both the 5′-UTR and the first part of the coding sequence, but exons containing only regions of 5′-UTR or (more rarely) 3′-UTR occur in some genes; that is, the UTRs may contain introns.[11] Some non-coding RNA transcripts also have exons and introns.
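The relationship between exons and untranslated regions can be sketched by splitting each exon into its 5′-UTR, coding, and 3′-UTR portions. The coordinates and the function name `annotate_exons` are hypothetical, and the sketch assumes a gene on the forward strand with half-open, 0-based coordinates.

```python
def annotate_exons(exons, cds_start, cds_end):
    """Split each exon into 5'UTR, CDS and 3'UTR parts.

    exons: list of (start, end) half-open intervals on the forward strand.
    cds_start, cds_end: half-open bounds of the coding sequence.
    """
    parts = []
    for start, end in exons:
        if start < cds_start:  # portion upstream of the coding sequence
            parts.append(("5'UTR", start, min(end, cds_start)))
        if end > cds_start and start < cds_end:  # portion overlapping the coding sequence
            parts.append(("CDS", max(start, cds_start), min(end, cds_end)))
        if end > cds_end:  # portion downstream of the coding sequence
            parts.append(("3'UTR", max(start, cds_end), end))
    return parts


# Hypothetical gene: the first exon is entirely 5'UTR, so the UTR contains an intron.
exons = [(0, 50), (100, 180), (250, 400)]
parts = annotate_exons(exons, cds_start=120, cds_end=300)
for label, start, end in parts:
    print(label, start, end)
```

Here the first exon contributes only 5′-UTR, the second contains both 5′-UTR and coding sequence, and the third contains both coding sequence and 3′-UTR.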
Mature mRNAs originating from the same gene need not include the same exons, since different introns in the pre-mRNA can be removed by the process of alternative splicing.
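A minimal sketch of this idea enumerates the mature transcripts obtainable by including or skipping optional exons while keeping exon order. The exon sequences and the function name `isoforms` are hypothetical. Real alternative splicing is regulated, so only a subset of such combinations typically occurs in a cell.

```python
from itertools import combinations


def isoforms(exons, constitutive):
    """Yield mature sequences for every combination of optional exons.

    exons: exon sequences in transcript order.
    constitutive: indices of exons that are always included.
    """
    optional = [i for i in range(len(exons)) if i not in constitutive]
    for k in range(len(optional) + 1):
        for chosen in combinations(optional, k):
            keep = sorted(set(constitutive) | set(chosen))  # preserve exon order
            yield "".join(exons[i] for i in keep)


# Hypothetical gene with one optional (cassette) exon in the middle.
exons = ["AUGGCC", "GGAUCU", "GAUUAA"]
result = list(isoforms(exons, constitutive={0, 2}))
print(result)
```

With one optional exon, the sketch yields two transcripts: one that skips the middle exon and one that includes it.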
Exonization is the creation of a new exon as a result of mutations in introns.[12]
Exon trapping or 'gene trapping' is a molecular biology technique that exploits intron-exon splicing to find new genes.[13] The first exon of a 'trapped' gene splices into the exon that is contained in the insertional DNA. This new exon contains the open reading frame for a reporter gene that can now be expressed using the enhancers that control the target gene. Expression of the reporter gene indicates that a new gene has been trapped.
Splicing can be experimentally modified so that targeted exons are excluded from mature mRNA transcripts by blocking the access of splice-directing small nuclear ribonucleoprotein particles (snRNPs) to pre-mRNA using Morpholino antisense oligos.[14] This has become a standard technique in developmental biology. Morpholino oligos can also be targeted to prevent molecules that regulate splicing (e.g. splice enhancers, splice suppressors) from binding to pre-mRNA, altering patterns of splicing.
Common incorrect uses of the term exon are that 'exons code for protein', 'exons code for amino acids', or 'exons are translated'.[15] However, these sorts of definitions only cover protein-coding genes, and omit those exons that become part of a non-coding RNA[16] or the untranslated region of an mRNA.[17][18] Such incorrect definitions still occur in otherwise reputable secondary sources.[19][20]
Liu AY, Van der Ploeg LH, Rijsewijk FA, Borst P (June 1983). "The transposition unit of variant surface glycoprotein gene 118 of Trypanosoma brucei. Presence of repeated elements at its border and absence of promoter-associated sequences". Journal of Molecular Biology. 167 (1): 57–75. doi:10.1016/S0022-2836(83)80034-5. PMID 6306255.
Sakharkar MK, Chow VT, Kangueane P (2004). "Distributions of exons and introns in the human genome". In Silico Biol. 4 (4): 387–93. doi:10.3233/ISB-00142. PMID 15217358.