Translation is the process in biological cells in which proteins are produced using RNA molecules as templates. The generated protein is a sequence of amino acids determined by the sequence of nucleotides in the RNA. The nucleotides are read three at a time, and each such triple results in the addition of one specific amino acid to the protein being generated. The mapping from nucleotide triple to amino acid is called the genetic code. Translation is performed by the ribosome, a large complex of functional RNA and proteins. The overall process, from gene to protein, is called gene expression.
In translation, messenger RNA (mRNA) is decoded in a ribosome, outside the nucleus, to produce a specific amino acid chain, or polypeptide. The polypeptide later folds into an active protein and performs its functions in the cell. The polypeptide can also start folding during protein synthesis.[1] The ribosome facilitates decoding by inducing the binding of complementary transfer RNA (tRNA) anticodon sequences to mRNA codons. The tRNAs carry specific amino acids that are chained together into a polypeptide as the mRNA passes through and is "read" by the ribosome. The three stages of translation are initiation, elongation, and termination.
A ribosome translating a protein that is secreted into the endoplasmic reticulum (tRNAs colored dark blue).
Protein domain dynamics can now be seen by neutron spin echo spectroscopy.
Tertiary structure of tRNA (CCA tail in yellow, Acceptor stem in purple, Variable loop in orange, D arm in red, Anticodon arm in blue with Anticodon in black, T arm in green).
The basic process of protein production is the addition of one amino acid at a time to the end of a forming polypeptide chain. This operation is performed by a ribosome.[2] A ribosome is made up of two subunits; in eukaryotes, these are a small (40S) subunit and a large (60S) subunit. The subunits come together before the translation of mRNA into a protein to provide a location for translation to be carried out and a polypeptide to be produced.[3] The choice of amino acid type to add is determined by a messenger RNA (mRNA) molecule. Each amino acid added is matched to a three-nucleotide subsequence of the mRNA. For each possible triplet, only the corresponding amino acid is accepted. The successive amino acids added to the chain are matched to successive nucleotide triplets in the mRNA. In this way, the sequence of nucleotides in the template mRNA chain determines the sequence of amino acids in the generated amino acid chain.[4] The addition of an amino acid occurs at the C-terminus of the peptide; thus, translation is said to be amine-to-carboxyl directed.[5]
The mRNA carries genetic information encoded as a ribonucleotide sequence from the chromosomes to the ribosomes. The ribonucleotides are "read" by the translational machinery as a sequence of nucleotide triplets called codons. Each of those triplets codes for a specific amino acid.[citation needed]
The ribosome translates this code into a specific sequence of amino acids. The ribosome is a multisubunit structure containing ribosomal RNA (rRNA) and proteins. It is the "factory" where amino acids are assembled into proteins.
Transfer RNAs (tRNAs) are small noncoding RNA chains (74–93 nucleotides) that transport amino acids to the ribosome. The repertoire of tRNA genes varies widely between species, with some bacteria having between 20 and 30 genes while complex eukaryotes could have thousands.[6] tRNAs have a site for amino acid attachment, and a site called an anticodon. The anticodon is an RNA triplet complementary to the mRNA triplet that codes for their cargo amino acid.
Aminoacyl tRNA synthetases (enzymes) catalyze the bonding between specific tRNAs and the amino acids that their anticodon sequences call for. The product of this reaction is an aminoacyl-tRNA. The amino acid is joined by its carboxyl group to the 3' OH of the tRNA by an ester bond. When the tRNA has an amino acid linked to it, the tRNA is termed "charged". Aminoacyl-tRNA synthetases that mispair tRNAs with the wrong amino acids can produce mischarged aminoacyl-tRNAs, which can result in inappropriate amino acids at the respective position in the protein. This "mistranslation"[7] of the genetic code naturally occurs at low levels in most organisms, but certain cellular environments cause an increase in permissive mRNA decoding, sometimes to the benefit of the cell.
The ribosome has three binding sites for tRNA: the aminoacyl site (abbreviated A), the peptidyl site (abbreviated P), and the exit site (abbreviated E). With respect to the mRNA, the three sites are oriented 5' to 3' E-P-A, because ribosomes move toward the 3' end of the mRNA. The A site binds the incoming tRNA whose anticodon is complementary to the codon on the mRNA. The P site holds the tRNA carrying the growing polypeptide chain, and the E site holds the uncharged tRNA before it leaves the ribosome. When an aminoacyl-tRNA initially binds to its corresponding codon on the mRNA, it is in the A site. Then, a peptide bond forms between the amino acid of the tRNA in the A site and the amino acid of the charged tRNA in the P site, and the growing polypeptide chain is transferred to the tRNA in the A site. Translocation then occurs: the tRNA in the P site, now without an amino acid, moves to the E site; the tRNA that was in the A site, now carrying the polypeptide chain, moves to the P site; the uncharged tRNA leaves; and another aminoacyl-tRNA enters the A site to repeat the process.[8]
After the new amino acid is added to the chain, and after the tRNA is released out of the ribosome and into the cytosol, the energy provided by the hydrolysis of a GTP bound to the translocase EEF2 moves the ribosome down one codon towards the 3' end. The energy required for translation of proteins is significant. For a protein containing n amino acids, the number of high-energy phosphate bonds required to translate it is 4n-1.[9] The rate of translation varies; it is significantly higher in prokaryotic cells (up to 17–21 amino acid residues per second) than in eukaryotic cells (up to 6–9 amino acid residues per second).[10]
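The two figures quoted above (4n-1 phosphate bonds and the upper-bound elongation rates) lend themselves to a quick back-of-the-envelope calculation. The following is a minimal sketch; the function names and the 300-residue example length are illustrative assumptions, not values from the cited sources, and the time estimate ignores initiation, termination, and pausing.

```python
def phosphate_bonds_for_translation(n_residues: int) -> int:
    """High-energy phosphate bonds for an n-residue protein, using the 4n - 1 figure."""
    if n_residues < 1:
        raise ValueError("a protein needs at least one residue")
    return 4 * n_residues - 1


def min_elongation_time_s(n_residues: int, residues_per_second: float) -> float:
    """Lower bound on elongation time at a constant rate (hypothetical helper)."""
    if residues_per_second <= 0:
        raise ValueError("rate must be positive")
    return n_residues / residues_per_second


n = 300  # hypothetical protein length, chosen only for illustration
print(phosphate_bonds_for_translation(n))   # 1199
print(min_elongation_time_s(n, 21.0))       # prokaryotic upper-bound rate
print(min_elongation_time_s(n, 9.0))        # eukaryotic upper-bound rate
```

Under these assumptions, the same protein takes roughly twice as long to elongate at the eukaryotic upper-bound rate as at the prokaryotic one.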
Initiation involves the small subunit of the ribosome binding to the 5' end of mRNA with the help of initiation factors.[12] The ribosome and its associated factors assemble and bind to an mRNA. The first tRNA is attached at the start codon. This process is defined as either cap-dependent, in which the ribosome binds initially at the 5' cap and then travels to the start codon, or as cap-independent, where the ribosome does not initially bind the 5' cap. The 5' cap is added when the nascent pre-mRNA is about 20 nucleotides long.[13]
The process of initiation of translation in eukaryotes.
Some of the protein complexes involved in initiation.
Initiation of translation usually involves the interaction of certain key proteins, the initiation factors, with a special tag bound to the 5'-end of an mRNA molecule, the 5' cap, as well as with the 5' UTR. These proteins bind the small (40S) ribosomal subunit and hold the mRNA in place.[14]
eIF3 is associated with the 40S ribosomal subunit and plays a role in keeping the large (60S) ribosomal subunit from prematurely binding. eIF3 also interacts with the eIF4F complex, which consists of three other initiation factors: eIF4A, eIF4E, and eIF4G. eIF4G is a scaffolding protein that directly associates with both eIF3 and the other two components. eIF4E is the cap-binding protein. Binding of the cap by eIF4E is often considered the rate-limiting step of cap-dependent initiation, and the concentration of eIF4E is a regulatory nexus of translational control. Certain viruses cleave a portion of eIF4G that binds eIF4E, thus preventing cap-dependent translation to hijack the host machinery in favor of the viral (cap-independent) messages. eIF4A is an ATP-dependent RNA helicase that aids the ribosome by resolving certain secondary structures formed along the mRNA transcript. Recent structural biology results also indicated that a second eIF4A protein can simultaneously associate with the initiation complex, specifically interacting with eIF3.[15][16] The poly(A)-binding protein (PABP) also associates with the eIF4F complex via eIF4G, and binds the poly-A tail of most eukaryotic mRNA molecules. This protein has been implicated in playing a role in circularization of the mRNA during translation.[17]
This 43S preinitiation complex (43S PIC), accompanied by the protein factors, moves along the mRNA chain toward its 3'-end, in a process known as 'scanning', to reach the start codon (typically AUG). In eukaryotes and archaea, the amino acid encoded by the start codon is methionine. The Met-charged initiator tRNA (Met-tRNAiMet) is brought to the P-site of the small ribosomal subunit by eukaryotic initiation factor 2 (eIF2). eIF2 hydrolyzes GTP and signals for the dissociation of several factors from the small ribosomal subunit, eventually leading to the association of the large subunit (the 60S subunit). The complete ribosome (80S) then commences translation elongation.
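The scanning step can be illustrated with a deliberately simplified sketch: walk the mRNA from its 5' end and stop at the first AUG. The function name and the example sequence are hypothetical, and the sketch assumes that the first AUG is always chosen, ignoring everything else that influences start-site selection in a real cell.

```python
def find_start_codon(mrna: str, start_codon: str = "AUG") -> int:
    """Return the 0-based index of the first start codon found scanning 5'->3', or -1."""
    # str.find scans left to right, which mirrors the 5'->3' direction of scanning
    return mrna.upper().find(start_codon)


mrna = "GGCACCAUGGCUUAA"  # hypothetical mRNA, written 5'->3'
start = find_start_codon(mrna)
print(start)           # 6
print(mrna[start:])    # AUGGCUUAA, the region read from the start codon onward
```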
Regulation of protein synthesis is partly influenced by phosphorylation of eIF2 (via the α subunit), which is a part of the eIF2-GTP-Met-tRNAiMet ternary complex (eIF2-TC). When a large fraction of eIF2 is phosphorylated, protein synthesis is inhibited. This occurs under amino acid starvation or after viral infection. However, a small fraction of this initiation factor is naturally phosphorylated. Another regulator is 4EBP, which binds to the initiation factor eIF4E and inhibits its interactions with eIF4G, thus preventing cap-dependent initiation. To oppose the effects of 4EBP, growth factors phosphorylate 4EBP, reducing its affinity for eIF4E and permitting protein synthesis.[citation needed]
While protein synthesis is globally regulated by modulating the expression of key initiation factors as well as the number of ribosomes, individual mRNAs can have different translation rates due to the presence of regulatory sequence elements. This has been shown to be important in a variety of settings including yeast meiosis and ethylene response in plants. In addition, recent work in yeast and humans suggests that evolutionary divergence in cis-regulatory sequences can impact translation regulation.[18] Additionally, RNA helicases such as DHX29 and Ded1/DDX3 participate in the process of translation initiation, especially for mRNAs with structured 5' UTRs.[19][20]
The best-studied example of cap-independent translation initiation in eukaryotes uses the internal ribosome entry site (IRES). Unlike cap-dependent translation, cap-independent translation does not require a 5' cap to initiate scanning from the 5' end of the mRNA until the start codon. The ribosome can localize to the start site by direct binding, initiation factors, and/or ITAFs (IRES trans-acting factors), bypassing the need to scan the entire 5' UTR. This method of translation is important in conditions that require the translation of specific mRNAs during cellular stress, when overall translation is reduced. Examples include factors responding to apoptosis and stress-induced responses.[21]
The elongation and membrane targeting stages of eukaryotic translation. The ribosome is green and yellow, the tRNAs are dark-blue, and the other proteins involved are light-blue
Elongation depends on elongation factors. At the end of the initiation step, the mRNA is positioned so that the next codon can be translated during the elongation stage of protein synthesis. The initiator tRNA occupies the P site in the ribosome, and the A site is ready to receive an aminoacyl-tRNA. During chain elongation, each additional amino acid is added to the nascent polypeptide chain in a three-step microcycle. The steps in this microcycle are (1) positioning the correct aminoacyl-tRNA in the A site of the ribosome, which is brought into that site by eEF1, (2) forming the peptide bond, and (3) shifting the mRNA by one codon relative to the ribosome with the help of eEF2.
Unlike bacteria, in which translation initiation occurs as soon as the 5' end of an mRNA is synthesized, in eukaryotes, such tight coupling between transcription and translation is not possible because transcription and translation are carried out in separate compartments of the cell (the nucleus and cytoplasm). Eukaryotic mRNA precursors must be processed in the nucleus (e.g., capping, polyadenylation, splicing) before they are exported to the cytoplasm for translation.
Translation can also be affected by ribosomal pausing, which can trigger endonucleolytic attack of the mRNA, a process termed mRNA no-go decay. Ribosomal pausing also aids co-translational folding of the nascent polypeptide on the ribosome, and delays protein translation while the mRNA is being decoded. This can trigger ribosomal frameshifting.[22]
The last tRNA validated by the small ribosomal subunit (accommodation) transfers the amino acid it carries to the large ribosomal subunit, which binds it to the chain held by the previously admitted tRNA (transpeptidation). The ribosome then moves to the next mRNA codon to continue the process (translocation), creating an amino acid chain.
In bacterial translation and archaeal translation, translation occurs in the cytosol, where the ribosome binds to the mRNA. In eukaryotes, translation can occur in the cytoplasm and also across the membrane of the endoplasmic reticulum through a process called co-translational translocation. In co-translational translocation, the entire ribosome–mRNA complex binds to the outer membrane of the rough endoplasmic reticulum (ER), and the new protein is synthesized and released into the ER; the newly created polypeptide can be immediately secreted or stored inside the ER for future vesicle transport and secretion outside the cell.
Many types of transcribed RNA, such as tRNA, ribosomal RNA, and small nuclear RNA, do not undergo translation into proteins.
Termination of elongation depends on the release factor eRF1, which recognizes all three stop codons. When a stop codon is reached, termination of the polypeptide occurs: the ribosome is disassembled and the completed polypeptide is released. eRF3 is a ribosome-dependent GTPase that helps eRF1 release the completed polypeptide. The human genome encodes a few genes whose mRNA stop codons are surprisingly leaky: in these genes, termination of translation is inefficient due to special RNA bases in the vicinity of the stop codon. Leaky termination in these genes leads to translational readthrough of up to 10% of the stop codons of these genes. Some of these genes encode functional protein domains in their readthrough extension so that new protein isoforms can arise. This process has been termed 'functional translational readthrough'.[23]
Termination of the polypeptide occurs when the A site of the ribosome is occupied by a stop codon (UAA, UAG, or UGA) on the mRNA, completing the primary structure of a protein. tRNA usually cannot recognize or bind to stop codons. Instead, the stop codon induces the binding of a release factor protein[24] (RF1 & RF2) that prompts the disassembly of the entire ribosome/mRNA complex by the hydrolysis of the polypeptide chain from the peptidyl transferase center[2] of the ribosome.[25] Drugs or special sequence motifs on the mRNA can change the ribosomal structure so that near-cognate tRNAs are bound to the stop codon instead of the release factors. In such cases of 'translational readthrough', translation continues until the ribosome encounters the next stop codon.[23]
Even though the ribosomes are usually considered accurate and processive machines, the translation process is subject to errors that can lead either to the synthesis of erroneous proteins or to the premature abandonment of translation, either because a tRNA couples to a wrong codon or because a tRNA is coupled to the wrong amino acid.[26] The rate of error in synthesizing proteins has been estimated to be between 1 in 10⁵ and 1 in 10³ misincorporated amino acids, depending on the experimental conditions.[27] The rate of premature translation abandonment, instead, has been estimated to be of the order of magnitude of 10⁻⁴ events per translated codon.[28][29]
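To get a feel for what these per-codon rates mean for a whole protein, the short sketch below computes the fraction of chains expected to be completed without any such event. It assumes that events at different codons are independent and equally likely, which is a simplification not stated in the sources; the function name and the 300-codon length are illustrative.

```python
def event_free_fraction(n_codons: int, per_codon_rate: float) -> float:
    """Probability that none of n_codons suffers an event, assuming independence."""
    if not 0.0 <= per_codon_rate <= 1.0:
        raise ValueError("rate must be a probability")
    return (1.0 - per_codon_rate) ** n_codons


n = 300  # hypothetical protein length in codons
for label, rate in [
    ("misincorporation, 1 in 10^5", 1e-5),
    ("misincorporation, 1 in 10^3", 1e-3),
    ("premature abandonment, 10^-4", 1e-4),
]:
    print(f"{label}: {event_free_fraction(n, rate):.3f} of chains unaffected")
```

Under these assumptions, the higher misincorporation estimate leaves roughly a quarter of 300-residue chains with at least one wrong amino acid, while the lower estimate affects well under 1% of them.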
Translation is one of the key energy consumers in cells, hence it is strictly regulated. Numerous mechanisms have evolved that control and regulate translation in eukaryotes as well as prokaryotes. Regulation of translation can impact the global rate of protein synthesis, which is closely coupled to the metabolic and proliferative state of a cell.
To study this process, scientists have used a wide variety of methods such as structural biology, analytical chemistry (mass-spectrometry based), imaging of reporter mRNA translation (in which the translation of an mRNA is linked to an output, such as luminescence or fluorescence), detection via radioactive amino acid incorporation, and next-generation sequencing based methods.[30] Other methods such as the toeprinting assay can also be used to determine the location of ribosomes on a particular mRNA in vitro, and the footprints of other proteins regulating translation. For a more detailed view, scientists typically use a technique known as ribosome profiling.[31] This method enables researchers to take a snapshot of the translatome, showing which parts of the mRNA are being translated into proteins by ribosomes at a given time. Ribosome profiling provides valuable insights into translation dynamics, revealing the complex interplay between gene sequence, mRNA structure, and translation regulation.[18] Expanding on this concept, single-cell ribosome profiling is a technique that allows the study of the translation process at the resolution of individual cells.[32] Single-cell ribosome profiling has revealed that genetic differences and their subsequent expression as mRNAs can also impact translation rate in an RNA-specific manner. Single-cell ribosome profiling has the potential to shed light on the heterogeneous nature of cells, leading to a more nuanced understanding of how translation regulation can impact cell behavior, metabolic state, and responsiveness to various stimuli or conditions.
In some cells certain amino acids can be depleted and thus affect translation efficiency. For instance, activated T cells secrete interferon-γ, which triggers intracellular tryptophan shortage by upregulating the indoleamine 2,3-dioxygenase 1 (IDO1) enzyme. Despite tryptophan depletion, in-frame protein synthesis continues across tryptophan codons. This is achieved by incorporation of phenylalanine instead of tryptophan. The resulting peptides are called W>F "substitutants". Such W>F substitutants are abundant in certain cancer types and have been associated with increased IDO1 expression. Functionally, W>F substitutants can impair protein activity.[33]
Translational control is critical for the development and survival of cancer. Cancer cells must frequently regulate the translation phase of gene expression, though it is not fully understood why translation is targeted over steps like transcription. While cancer cells often have genetically altered translation factors, it is much more common for cancer cells to modify the levels of existing translation factors.[34] Several major oncogenic signaling pathways, including the RAS–MAPK, PI3K/AKT/mTOR, MYC, and WNT–β-catenin pathways, ultimately reprogram the genome via translation.[35] Cancer cells also control translation to adapt to cellular stress. During stress, the cell translates mRNAs that can mitigate the stress and promote survival. An example of this is the expression of AMPK in various cancers; its activation triggers a cascade that can ultimately allow the cancer to escape apoptosis (programmed cell death) triggered by nutrition deprivation. Future cancer therapies may involve disrupting the translation machinery of the cell to counter the downstream effects of cancer.[34]
Figure M0. The basic and simplest model, M0, of protein synthesis. Here,
* M – amount of mRNA with translation initiation site not occupied by assembling ribosome,
* F – amount of mRNA with translation initiation site occupied by assembling ribosome,
* R – amount of ribosomes sitting on mRNA synthesizing proteins,
* P – amount of synthesized proteins.[36]
Figure M1'. The extended model of protein synthesis, M1, with explicit presentation of 40S, 60S and initiation factors (IF) binding.[36]
The transcription-translation process description, mentioning only the most basic "elementary" processes, consists of:
production of mRNA molecules (including splicing),
initiation of these molecules with the help of initiation factors (e.g., the initiation can include the circularization step, though it is not universally required),
initiation of translation, recruiting the small ribosomal subunit,
assembly of full ribosomes,
elongation (i.e., movement of ribosomes along mRNA with production of protein),
termination of translation,
degradation of mRNA molecules,
degradation of proteins.
The assembly of amino acids into protein during translation has long been the subject of various physical models, starting from the first detailed kinetic models[37] and others that take into account stochastic aspects of translation and use computer simulations. Many chemical kinetics-based models of protein synthesis have been developed and analyzed in the last four decades.[38][39] Beyond chemical kinetics, various modeling formalisms such as the Totally Asymmetric Simple Exclusion Process,[39] Probabilistic Boolean Networks, Petri Nets and max-plus algebra have been applied to model the detailed kinetics of protein synthesis or some of its stages. A basic model of protein synthesis that takes into account all eight 'elementary' processes has been developed,[36] following the paradigm that "useful models are simple and extendable".[40] The simplest model, M0, is represented by the reaction kinetic mechanism (Figure M0). It was generalised to include 40S, 60S and initiation factors (IF) binding (Figure M1'). It was extended further to include the effect of microRNA on protein synthesis.[41] Most of the models in this hierarchy can be solved analytically. These solutions were used to extract 'kinetic signatures' of different specific mechanisms of synthesis regulation.
It is also possible to translate either by hand (for short sequences) or by computer (after first programming one appropriately, see section below); this allows biologists and chemists to draw out the primary amino acid sequence of the encoded protein on paper.
First, convert each template DNA base to its RNA complement (note that the complement of A is now U), as shown below. Note that the template strand of the DNA is the one the RNA is polymerized against; the other DNA strand would be the same as the RNA, but with thymine instead of uracil.
DNA -> RNA
A -> U
T -> A
C -> G
G -> C
A=T -> A=U
Then split the RNA into triplets (groups of three bases). Note that there are 3 translation "windows", or reading frames, depending on where you start reading the code.
Finally, use the table at Genetic code to translate the above into a structural formula as used in chemistry.
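The first two steps of this hand procedure can be sketched in a few lines of Python. This is an illustrative sketch, not a reference implementation: the function names and the example sequence are hypothetical, and the template strand is assumed to be written 3'→5' so that the base-by-base complement comes out as RNA in the 5'→3' direction.

```python
# Base-by-base complement from template DNA to RNA, as in the mapping above
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe_template(template_dna: str) -> str:
    """Complement each template DNA base into its RNA partner."""
    return "".join(DNA_TO_RNA[base] for base in template_dna.upper())


def split_codons(rna: str, frame: int = 0) -> list[str]:
    """Split RNA into complete triplets, starting at reading frame 0, 1, or 2."""
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1, or 2")
    # stop at len - 2 so that a trailing partial triplet is dropped
    return [rna[i:i + 3] for i in range(frame, len(rna) - 2, 3)]


rna = transcribe_template("TACCGAATT")  # hypothetical template, written 3'->5'
print(rna)                   # AUGGCUUAA
print(split_codons(rna, 0))  # ['AUG', 'GCU', 'UAA']
print(split_codons(rna, 1))  # ['UGG', 'CUU']
print(split_codons(rna, 2))  # ['GGC', 'UUA']
```

The three calls to `split_codons` show why the choice of reading frame matters: the same RNA yields three unrelated codon lists.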
Whereas other aspects of a protein, such as the 3D structure (called tertiary structure), can only be predicted using sophisticated algorithms, the amino acid sequence, called primary structure, can be determined solely from the nucleic acid sequence with the aid of a translation table.
This approach may not give the correct amino acid composition of the protein, in particular if unconventional amino acids such as selenocysteine are incorporated into the protein. Selenocysteine is coded for by a conventional stop codon in combination with a downstream hairpin (SElenoCysteine Insertion Sequence, or SECIS).
There are many computer programs capable of translating a DNA/RNA sequence into a protein sequence. Normally this is performed using the Standard Genetic Code; however, few programs can handle all the "special" cases, such as the use of alternative initiation codons, which are biologically significant. For instance, the rare alternative start codon CTG codes for methionine when used as a start codon, and for leucine in all other positions.
Example: Condensed translation table for the Standard Genetic Code (from the NCBI Taxonomy webpage).[42]
The "Starts" row indicates three start codons: UUG, CUG, and the very common AUG. It also indicates the first amino acid residue when the codon is interpreted as a start: in this case, methionine for all three.
Even when working with ordinary eukaryotic sequences such as the yeast genome, it is often desirable to be able to use alternative translation tables, namely for translation of the mitochondrial genes. Currently the following translation tables are defined by the NCBI Taxonomy Group for the translation of the sequences in GenBank:[42]
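A translator that honors the start-codon rule described above can be sketched as follows. The 64-letter amino acid string is the standard code laid out in the condensed T, C, A, G order, reproduced here as an assumption to be checked against the NCBI page; the function name, the `is_cds_start` flag, and the example sequences are hypothetical. The sketch does not handle special cases such as selenocysteine.

```python
from itertools import product

BASES = "TCAG"  # base order used by the condensed table layout
# Standard genetic code, one letter per codon, '*' marks a stop codon
AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AAS)}
START_CODONS = {"TTG", "CTG", "ATG"}  # UUG, CUG and AUG in RNA terms


def translate(seq: str, is_cds_start: bool = True) -> str:
    """Translate DNA or RNA in frame 0 up to the first stop codon."""
    seq = seq.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        aa = CODON_TABLE[codon]
        if aa == "*":
            break
        # a recognized start codon in first position is read as methionine
        if i == 0 and is_cds_start and codon in START_CODONS:
            aa = "M"
        protein.append(aa)
    return "".join(protein)


print(translate("CUGGCUUAA"))                      # MA
print(translate("AUGCUGUAA"))                      # ML
print(translate("CUGGCUUAA", is_cds_start=False))  # LA
```

The first and third calls differ only in whether the leading CUG is treated as an initiation codon, which is exactly the case that a plain table lookup gets wrong.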
Brooker RJ, Widmaier EP, Graham LE, Stiling PD (2014). Biology (Third international student ed.). New York, NY: McGraw Hill Education. p. 249. ISBN 978-981-4581-85-1.
Neill C (1996). Biology (Fourth ed.). The Benjamin/Cummings Publishing Company. pp. 309–310. ISBN 0-8053-1940-9.
Nakamoto T (February 2011). "Mechanisms of the initiation of protein synthesis: in reading frame binding of ribosomes to mRNA". Molecular Biology Reports. 38 (2): 847–55. doi:10.1007/s11033-010-0176-1. PMID 20467902. S2CID 22038744.
Heinrich R, Rapoport TA (September 1980). "Mathematical modelling of translation of mRNA in eucaryotes; steady state, time-dependent processes and application to reticulocytes". Journal of Theoretical Biology. 86 (2): 279–313. Bibcode:1980JThBi..86..279H. doi:10.1016/0022-5193(80)90008-9. PMID 7442295.
Skjøndal-Bar N, Morris DR (January 2007). "Dynamic model of the process of protein synthesis in eukaryotic cells". Bulletin of Mathematical Biology. 69 (1): 361–93. doi:10.1007/s11538-006-9128-2. PMID 17031456. S2CID 83701439.