RNA polymerase (purple) unwinding the DNA double helix. It uses one strand (darker orange) as a template to create the single-stranded messenger RNA (green).
In molecular biology, RNA polymerase (abbreviated RNAP or RNApol), or more specifically DNA-directed/dependent RNA polymerase (DdRP), is an enzyme that catalyzes the chemical reactions that synthesize RNA from a DNA template.
Using the enzyme helicase, RNAP locally opens the double-stranded DNA so that one strand of the exposed nucleotides can be used as a template for the synthesis of RNA, a process called transcription. A transcription factor and its associated transcription mediator complex must be attached to a DNA binding site called a promoter region before RNAP can initiate the DNA unwinding at that position. RNAP not only initiates RNA transcription, it also guides the nucleotides into position, facilitates attachment and elongation, and has intrinsic proofreading, replacement, and termination recognition capabilities. In eukaryotes, RNAP can build chains as long as 2.4 million nucleotides.
RNAP produces RNA that, functionally, is either for protein coding, i.e. messenger RNA (mRNA); or non-coding (so-called "RNA genes"). Examples of four functional types of RNA genes are:
Functions as an enzymatically active RNA molecule.
RNA polymerase is essential to life, and is found in all living organisms and many viruses. Depending on the organism, an RNA polymerase can be a protein complex (multi-subunit RNAP) or only consist of one subunit (single-subunit RNAP, ssRNAP), each representing an independent lineage. The former is found in bacteria, archaea, and eukaryotes alike, sharing a similar core structure and mechanism.[1] The latter is found in phages as well as eukaryotic chloroplasts and mitochondria, and is related to modern DNA polymerases.[2] Eukaryotic and archaeal RNAPs have more subunits than bacterial ones do, and are controlled differently.
Bacteria and archaea only have one RNA polymerase. Eukaryotes have multiple types of nuclear RNAP, each responsible for synthesis of a distinct subset of RNA:
RNA polymerase I synthesizes a pre-rRNA 45S (35S in yeast), which matures and will form the major RNA sections of the ribosome.
RNA polymerase IV and V found in plants are less understood; they make siRNA. In addition to the ssRNAPs, chloroplasts also encode and use a bacteria-like RNAP.
In most prokaryotes, a single RNA polymerase species transcribes all types of RNA. RNA polymerase "core" from E. coli consists of five subunits: two alpha (α) subunits of 36 kDa, a beta (β) subunit of 150 kDa, a beta prime subunit (β′) of 155 kDa, and a small omega (ω) subunit. A sigma (σ) factor binds to the core, forming the holoenzyme. After transcription starts, the factor can unbind and let the core enzyme proceed with its work.[5][6] The core RNA polymerase complex forms a "crab claw" or "clamp-jaw" structure with an internal channel running along the full length.[7] Eukaryotic and archaeal RNA polymerases have a similar core structure and work in a similar manner, although they have many extra subunits.[8]
All RNAPs contain metal cofactors, in particular zinc and magnesium cations, which aid in the transcription process.[9][10]
An electron micrograph of DNA strands decorated by hundreds of RNAP molecules too small to be resolved. Each RNAP is transcribing an RNA strand, which can be seen branching off from the DNA. "Begin" indicates the 3′ end of the DNA, where RNAP initiates transcription; "End" indicates the 5′ end, where the longer RNA molecules are completely transcribed.
Control of the process of gene transcription affects patterns of gene expression and, thereby, allows a cell to adapt to a changing environment, perform specialized roles within an organism, and maintain basic metabolic processes necessary for survival. Therefore, it is hardly surprising that the activity of RNAP is long, complex, and highly regulated. In Escherichia coli bacteria, more than 100 transcription factors have been identified, which modify the activity of RNAP.[11]
RNAP can initiate transcription at specific DNA sequences known as promoters. It then produces an RNA chain, which is complementary to the template DNA strand. The process of adding nucleotides to the RNA strand is known as elongation; in eukaryotes, RNAP can build chains as long as 2.4 million nucleotides (the full length of the dystrophin gene). RNAP will preferentially release its RNA transcript at specific DNA sequences encoded at the end of genes, which are known as terminators.
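The complementarity between the template strand and the RNA product can be illustrated with a minimal Python sketch. The function name `transcribe` and the convention that the template is supplied in the 3′→5′ direction are assumptions made for this example; the sketch models only Watson-Crick pairing and ignores initiation, termination, and enzyme kinetics.

```python
# Watson-Crick pairing from a DNA template base to the RNA base it specifies.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Return the RNA (written 5'->3') specified by a DNA template strand.

    The template is assumed to be written 3'->5', so the RNA can be built
    left to right, one complementary ribonucleotide per template base.
    Raises KeyError if the template contains anything other than A, C, G, T.
    """
    return "".join(DNA_TO_RNA[base] for base in template_3to5.upper())


if __name__ == "__main__":
    # Hypothetical six-base template used only for illustration.
    print(transcribe("TACGGT"))
```

Because the RNA is complementary to the template, it has the same sequence as the non-template (coding) DNA strand, with uracil in place of thymine.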
Non-coding RNA or "RNA genes": a broad class of genes that encode RNA that is not translated into protein. The most prominent examples of RNA genes are transfer RNA (tRNA) and ribosomal RNA (rRNA), both of which are involved in the process of translation. However, since the late 1990s, many new RNA genes have been found, and thus RNA genes may play a much more significant role than previously thought.
Catalytic RNA (ribozyme): enzymatically active RNA molecules.
RNAP accomplishes de novo synthesis. It is able to do this because specific interactions with the initiating nucleotide hold RNAP rigidly in place, facilitating chemical attack on the incoming nucleotide. Such specific interactions explain why RNAP prefers to start transcripts with ATP (followed by GTP, UTP, and then CTP). In contrast to DNA polymerase, RNAP includes helicase activity, so no separate enzyme is needed to unwind DNA.
RNA polymerase binding in bacteria involves the sigma factor recognizing the core promoter region containing the −35 and −10 elements (located before the beginning of the sequence to be transcribed) and also, at some promoters, the α subunit C-terminal domain recognizing promoter upstream elements.[12] There are multiple interchangeable sigma factors, each of which recognizes a distinct set of promoters. For example, in E. coli, σ70 is expressed under normal conditions and recognizes promoters for genes required under normal conditions ("housekeeping genes"), while σ32 recognizes promoters for genes required at high temperatures ("heat-shock genes"). In archaea and eukaryotes, the functions of the bacterial general transcription factor sigma are performed by multiple general transcription factors that work together. The RNA polymerase-promoter closed complex is usually referred to as the "transcription preinitiation complex."[13][14]
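Recognition of the −35 and −10 elements can be approximated computationally as a search for two short motifs separated by a spacer. The sketch below is illustrative only: the hexamers `TTGACA` and `TATAAT` and the 16 to 18 bp spacer are the commonly cited σ70 consensus values and are assumptions not taken from this article, and the function names are hypothetical. Real promoter strength depends on many additional features, so this is a toy scan rather than a model of sigma factor binding.

```python
# Assumed sigma-70 consensus elements (not stated in the article text).
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"
SPACER_LENGTHS = range(16, 19)  # assumed 16, 17 or 18 bp between the elements


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(seq: str, max_mismatches: int = 2):
    """Scan the non-template strand (5'->3') for candidate promoters.

    Returns a list of (pos_minus35, pos_minus10, spacer, total_mismatches)
    tuples, where each element may differ from its consensus at no more
    than `max_mismatches` positions.
    """
    seq = seq.upper()
    n35, n10 = len(MINUS_35), len(MINUS_10)
    hits = []
    for i in range(len(seq) - n35 + 1):
        m35 = mismatches(seq[i:i + n35], MINUS_35)
        if m35 > max_mismatches:
            continue
        for spacer in SPACER_LENGTHS:
            j = i + n35 + spacer
            box10 = seq[j:j + n10]
            if len(box10) < n10:
                continue  # ran off the end of the sequence
            m10 = mismatches(box10, MINUS_10)
            if m10 <= max_mismatches:
                hits.append((i, j, spacer, m35 + m10))
    return hits


if __name__ == "__main__":
    # Synthetic sequence built for illustration: perfect elements, 17 bp spacer.
    demo = "GG" + MINUS_35 + "C" * 17 + MINUS_10 + "GGGGGGG"
    print(find_promoters(demo, max_mismatches=0))
```

Allowing mismatches reflects the fact that natural promoters rarely match a consensus exactly; a different sigma factor would be represented here by a different pair of motifs.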
After binding to the DNA, the RNA polymerase switches from a closed complex to an open complex. This change involves the separation of the DNA strands to form an unwound section of DNA of approximately 13 bp, referred to as the "transcription bubble". Supercoiling plays an important part in polymerase activity because of the unwinding and rewinding of DNA. Because regions of DNA in front of RNAP are unwound, there are compensatory positive supercoils. Regions behind RNAP are rewound and negative supercoils are present.[14]
RNA polymerase then starts to synthesize the initial DNA-RNA heteroduplex, with ribonucleotides base-paired to the template DNA strand according to Watson-Crick base-pairing interactions. As noted above, RNA polymerase makes contacts with the promoter region. However, these stabilizing contacts inhibit the enzyme's ability to access DNA further downstream and thus the synthesis of the full-length product. In order to continue RNA synthesis, RNA polymerase must escape the promoter. It must maintain promoter contacts while unwinding more downstream DNA for synthesis, "scrunching" more downstream DNA into the initiation complex.[15] During the promoter escape transition, RNA polymerase is considered a "stressed intermediate." Thermodynamically, the stress accumulates from the DNA-unwinding and DNA-compaction activities. Once the DNA-RNA heteroduplex is long enough (~10 bp), RNA polymerase releases its upstream contacts and effectively achieves the promoter escape transition into the elongation phase. The heteroduplex at the active center stabilizes the elongation complex.
However, promoter escape is not the only outcome. RNA polymerase can also relieve the stress by releasing its downstream contacts, arresting transcription. The paused transcribing complex has two options: (1) release the nascent transcript and begin anew at the promoter or (2) reestablish a new 3′-OH on the nascent transcript at the active site via RNA polymerase's catalytic activity and recommence DNA scrunching to achieve promoter escape. Abortive initiation, the unproductive cycling of RNA polymerase before the promoter escape transition, results in short RNA fragments of around 9 nucleotides in a process known as abortive transcription. The extent of abortive initiation depends on the presence of transcription factors and the strength of the promoter contacts.[16]
RNA Polymerase II Transcription: the process of transcript elongation facilitated by disassembly of nucleosomes.
RNAP from T. aquaticus pictured during elongation. Portions of the enzyme were made transparent so as to make the path of RNA and DNA more clear. The magnesium ion (yellow) is located at the enzyme active site.
The 17-bp transcriptional complex has an 8-bp DNA-RNA hybrid, that is, 8 base-pairs involve the RNA transcript bound to the DNA template strand.[17] As transcription progresses, ribonucleotides are added to the 3′ end of the RNA transcript and the RNAP complex moves along the DNA. The characteristic elongation rates in prokaryotes and eukaryotes are about 10–100 nts/sec.[18]
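The quoted elongation rates give a rough sense of how long a transcript takes to make. The following sketch is a back-of-the-envelope calculation only: it assumes a constant rate with no pausing, backtracking, or co-transcriptional processing, and the function name is hypothetical. The 2.4 million nucleotide length is the figure given in this article for the longest eukaryotic transcripts.

```python
def transcription_time_hours(length_nt: int, rate_nt_per_s: float) -> float:
    """Time in hours to synthesize `length_nt` nucleotides at a constant rate."""
    if rate_nt_per_s <= 0:
        raise ValueError("rate_nt_per_s must be positive")
    return length_nt / rate_nt_per_s / 3600.0


if __name__ == "__main__":
    longest_transcript_nt = 2_400_000  # figure quoted in the article
    for rate in (10, 100):             # bounds of the quoted 10-100 nt/s range
        hours = transcription_time_hours(longest_transcript_nt, rate)
        print(f"{rate:>3} nt/s -> {hours:.1f} h")
```

Under these simplifying assumptions, the longest transcripts would take roughly 6.7 hours at 100 nt/s and roughly 66.7 hours at 10 nt/s.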
Aspartyl (asp) residues in the RNAP will hold on to Mg²⁺ ions, which will, in turn, coordinate the phosphates of the ribonucleotides. The first Mg²⁺ will hold on to the α-phosphate of the NTP to be added. This allows the nucleophilic attack of the 3′-OH from the RNA transcript, adding another NTP to the chain. The second Mg²⁺ will hold on to the pyrophosphate of the NTP.[19] The overall reaction equation is:
(NMP)ₙ + NTP → (NMP)ₙ₊₁ + PPᵢ
Unlike the proofreading mechanisms of DNA polymerase, those of RNAP have only recently been investigated. Proofreading begins with separation of the mis-incorporated nucleotide from the DNA template. This pauses transcription. The polymerase then backtracks by one position and cleaves the dinucleotide that contains the mismatched nucleotide. In the RNA polymerase this occurs at the same active site used for polymerization and is therefore markedly different from the DNA polymerase, where proofreading occurs at a distinct nuclease active site.[20]
The overall error rate is around 10⁻⁴ to 10⁻⁶.[21]
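To put the quoted error rate in context, the expected number of misincorporations in a transcript can be estimated from its length. The sketch below assumes that errors occur independently at every position with the same probability, which is a simplification; the function names and the 1,000-nucleotide example length are hypothetical.

```python
def expected_errors(length_nt: int, error_rate: float) -> float:
    """Expected number of misincorporated nucleotides in one transcript."""
    return length_nt * error_rate


def probability_error_free(length_nt: int, error_rate: float) -> float:
    """Probability that a transcript contains no errors at all,
    assuming independent errors with identical per-position probability."""
    return (1.0 - error_rate) ** length_nt


if __name__ == "__main__":
    length = 1_000  # hypothetical transcript length in nucleotides
    for rate in (1e-4, 1e-6):  # bounds of the quoted error-rate range
        print(rate, expected_errors(length, rate),
              probability_error_free(length, rate))
```

At the upper bound of 10⁻⁴, a 1,000-nucleotide transcript would carry about 0.1 errors on average, so under this model most such transcripts would be error-free.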
In bacteria, termination of RNA transcription can be rho-dependent or rho-independent. The former relies on the rho factor, which destabilizes the DNA-RNA heteroduplex and causes RNA release.[22] The latter, also known as intrinsic termination, relies on a palindromic region of DNA. Transcribing the region causes the formation of a "hairpin" structure from the RNA transcript looping and binding upon itself. This hairpin structure is often rich in G-C base-pairs, making it more stable than the DNA-RNA hybrid itself. As a result, the 8 bp DNA-RNA hybrid in the transcription complex shifts to a 4 bp hybrid. These last 4 base pairs are weak A-U base pairs, and the entire RNA transcript will fall off the DNA.[23]
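The features of an intrinsic terminator described above (a G-C-rich hairpin followed by a run of weak A-U pairs) can be searched for with a simple pattern scan. The sketch below is a heuristic under stated assumptions, not an established terminator-prediction algorithm: the stem length, loop size, G-C threshold, and uridine tail criteria are all illustrative parameters chosen for this example, and the function names are hypothetical. It requires a perfectly paired stem and does not compute folding energies.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA string containing only A, C, G, U."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def find_intrinsic_terminators(rna: str, stem_len: int = 6,
                               min_loop: int = 3, max_loop: int = 8,
                               min_gc_fraction: float = 0.5,
                               tail_len: int = 6, min_u_in_tail: int = 4):
    """Return (stem_start, loop_length, tail_start) for each candidate.

    A candidate is a perfectly paired stem of `stem_len` bases that is
    sufficiently G-C rich, closed by a loop of min_loop..max_loop bases,
    and followed immediately by a tail containing enough uridines.
    """
    rna = rna.upper()
    hits = []
    for start in range(len(rna) - stem_len + 1):
        left = rna[start:start + stem_len]
        gc_fraction = sum(base in "GC" for base in left) / stem_len
        if gc_fraction < min_gc_fraction:
            continue
        for loop in range(min_loop, max_loop + 1):
            right_start = start + stem_len + loop
            tail_start = right_start + stem_len
            right = rna[right_start:tail_start]
            tail = rna[tail_start:tail_start + tail_len]
            if len(right) < stem_len or len(tail) < tail_len:
                continue  # not enough sequence left for stem or tail
            if right == reverse_complement(left) and tail.count("U") >= min_u_in_tail:
                hits.append((start, loop, tail_start))
    return hits


if __name__ == "__main__":
    # Synthetic transcript built for illustration: stem, 4-base loop, U tract.
    demo = "CC" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUU" + "A"
    print(find_intrinsic_terminators(demo))
```

The uridine tail in the RNA corresponds to the weak A-U base pairs in the remaining DNA-RNA hybrid, which is why the scan requires it directly downstream of the stem.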
Transcription termination in eukaryotes is less well understood than in bacteria, but involves cleavage of the new transcript followed by template-independent addition of adenines at its new 3′ end, in a process called polyadenylation.[24]
Given that DNA and RNA polymerases both carry out template-dependent nucleotide polymerization, it might be expected that the two types of enzymes would be structurally related. However, X-ray crystallographic studies of both types of enzymes reveal that, other than containing a critical Mg²⁺ ion at the catalytic site, they are virtually unrelated to each other; indeed, template-dependent nucleotide polymerizing enzymes seem to have arisen independently twice during the early evolution of cells. One lineage led to the modern DNA polymerases and reverse transcriptases, as well as to a few single-subunit RNA polymerases (ssRNAP) from phages and organelles.[2] The other multi-subunit RNAP lineage formed all of the modern cellular RNA polymerases.[25][1]
RNAP is a large molecule. The core enzyme has five subunits (~ 400 kDa):[26]
β′
The β′ subunit is the largest subunit, and is encoded by the rpoC gene.[27] The β′ subunit contains part of the active center responsible for RNA synthesis and contains some of the determinants for non-sequence-specific interactions with DNA and nascent RNA. It is split into two subunits in Cyanobacteria and chloroplasts.[28]
β
The β subunit is the second-largest subunit, and is encoded by the rpoB gene. The β subunit contains the rest of the active center responsible for RNA synthesis and contains the rest of the determinants for non-sequence-specific interactions with DNA and nascent RNA.
α (αI and αII)
Two copies of the α subunit, the third-largest subunit, are present in a molecule of RNAP: αI and αII (one and two). Each α subunit contains two domains: αNTD (N-terminal domain) and αCTD (C-terminal domain). αNTD contains determinants for assembly of RNAP. αCTD contains determinants for interaction with promoter DNA, making non-sequence-specific interactions at most promoters and sequence-specific interactions at upstream-element-containing promoters, and contains determinants for interactions with regulatory factors.
ω
The ω subunit is the smallest subunit. The ω subunit facilitates assembly of RNAP and stabilizes assembled RNAP.[29]
In order to bind promoters, RNAP core associates with the transcription initiation factor sigma (σ) to form RNA polymerase holoenzyme. Sigma reduces the affinity of RNAP for nonspecific DNA while increasing specificity for promoters, allowing transcription to initiate at correct sites. The complete holoenzyme therefore has 6 subunits: β′βαI and αIIωσ (~450 kDa).
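The approximate masses of the core enzyme and holoenzyme can be checked by summing the subunit masses. In the sketch below, the α, β, and β′ values are those quoted in this article for E. coli; the ω mass (about 10 kDa) and the σ70 mass (about 70 kDa) are assumptions added for illustration and are not given in the text. The names used are hypothetical.

```python
# Subunit masses in kDa. Values for alpha, beta and beta-prime are from the
# article; omega and sigma-70 are ASSUMED approximate values for illustration.
CORE_SUBUNITS_KDA = {
    "alpha_I": 36.0,
    "alpha_II": 36.0,
    "beta": 150.0,
    "beta_prime": 155.0,
    "omega": 10.0,      # assumption, not stated in the article
}
SIGMA70_KDA = 70.0      # assumption, not stated in the article


def core_mass_kda(subunits=CORE_SUBUNITS_KDA) -> float:
    """Sum of the five core subunit masses."""
    return sum(subunits.values())


def holoenzyme_mass_kda(subunits=CORE_SUBUNITS_KDA,
                        sigma_kda: float = SIGMA70_KDA) -> float:
    """Core mass plus one sigma factor."""
    return core_mass_kda(subunits) + sigma_kda


if __name__ == "__main__":
    print(core_mass_kda(), holoenzyme_mass_kda())
```

With these inputs the totals are 387 kDa for the core and 457 kDa for the holoenzyme, in line with the rounded figures of about 400 kDa and about 450 kDa.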
Structure of eukaryotic RNA polymerase II (light blue) in complex with α-amanitin (red), a strong poison found in death cap mushrooms that targets this vital enzyme.
Eukaryotes have multiple types of nuclear RNAP, each responsible for synthesis of a distinct subset of RNA. All are structurally and mechanistically related to each other and to bacterial RNAP:
RNA polymerase I synthesizes a pre-rRNA 45S (35S in yeast), which matures into 28S, 18S and 5.8S rRNAs, which will form the major RNA sections of the ribosome.[30]
RNA polymerase II synthesizes precursors of mRNAs and most snRNA and microRNAs.[31] This is the most studied type, and, due to the high level of control required over transcription, a range of transcription factors are required for its binding to promoters.
Eukaryotic chloroplasts contain a multi-subunit RNAP ("PEP, plastid-encoded polymerase"). Due to its bacterial origin, the organization of PEP resembles that of current bacterial RNA polymerases: it is encoded by the RPOA, RPOB, RPOC1 and RPOC2 genes on the plastome, which as proteins form the core subunits of PEP, respectively named α, β, β′ and β″.[35] Similar to the RNA polymerase in E. coli, PEP requires the presence of sigma (σ) factors for the recognition of its promoters, containing the −10 and −35 motifs.[36] Despite the many commonalities between plant organellar and bacterial RNA polymerases and their structure, PEP additionally requires the association of a number of nuclear-encoded proteins, termed PAPs (PEP-associated proteins), which form essential components that are closely associated with the PEP complex in plants. Initially, a group consisting of 10 PAPs was identified through biochemical methods, which was later extended to 12 PAPs.[37][38]
Chloroplasts also contain a second, structurally and mechanistically unrelated, single-subunit RNAP ("nucleus-encoded polymerase, NEP"). Eukaryotic mitochondria use POLRMT (human), a nucleus-encoded single-subunit RNAP.[2] Such phage-like polymerases are referred to as RpoT in plants.[39]
Archaea have a single type of RNAP, responsible for the synthesis of all RNA. Archaeal RNAP is structurally and mechanistically similar to bacterial RNAP and eukaryotic nuclear RNAP I-V, and is especially closely structurally and mechanistically related to eukaryotic nuclear RNAP II.[8][40] The history of the discovery of the archaeal RNA polymerase is quite recent. The first analysis of the RNAP of an archaeon was performed in 1971, when the RNAP from the extreme halophile Halobacterium cutirubrum was isolated and purified.[41] Crystal structures of RNAPs from Sulfolobus solfataricus and Sulfolobus shibatae set the total number of identified archaeal subunits at thirteen.[8][42]
Archaea have the subunit corresponding to eukaryotic Rpb1 split into two. There is no homolog to eukaryotic Rpb9 (POLR2I) in the S. shibatae complex, although TFS (TFIIS homolog) has been proposed as one based on similarity. There is an additional subunit dubbed Rpo13; together with Rpo5 it occupies a space filled by an insertion found in bacterial β′ subunits (1,377–1,420 in Taq).[8] An earlier, lower-resolution study on the S. solfataricus structure did not find Rpo13 and only assigned the space to Rpo5/Rpb5. Rpo3 is notable in that it is an iron–sulfur protein. RNAP I/III subunit AC40 found in some eukaryotes shares similar sequences,[42] but does not bind iron.[43] This domain, in either case, serves a structural function.[44]
Archaeal RNAP subunits previously used an "RpoX" nomenclature in which each subunit is assigned a letter in a way unrelated to any other system.[1] In 2009, a new nomenclature based on eukaryotic Pol II subunit "Rpb" numbering was proposed.[8]
T7 RNA polymerase producing an mRNA (green) from a DNA template. The protein is shown as a purple ribbon (PDB: 1MSW).
Orthopoxviruses and some other nucleocytoplasmic large DNA viruses synthesize RNA using a virally encoded multi-subunit RNAP. They are most similar to eukaryotic RNAPs, with some subunits minified or removed.[45] Exactly which RNAP they are most similar to is a topic of debate.[46] Most other viruses that synthesize RNA use unrelated mechanics.
Many viruses use a single-subunit DNA-dependent RNAP (ssRNAP) that is structurally and mechanistically related to the single-subunit RNAP of eukaryotic chloroplasts (RpoT) and mitochondria (POLRMT) and, more distantly, to DNA polymerases and reverse transcriptases. Perhaps the most widely studied such single-subunit RNAP is bacteriophage T7 RNA polymerase. ssRNAPs cannot proofread.[2]
B. subtilis prophage SPβ uses YonO, a homolog of the β+β′ subunits of msRNAPs, to form a monomeric (both barrels on the same chain) RNAP distinct from the usual "right hand" ssRNAP. It probably diverged very long ago from the canonical five-unit msRNAP, before the time of the last universal common ancestor.[47][48]
Werner F, Grohmann D (February 2011). "Evolution of multisubunit RNA polymerases in the three domains of life". Nature Reviews. Microbiology. 9 (2): 85–98. doi:10.1038/nrmicro2507. PMID 21233849. S2CID 30004345. See also Cramer 2002: Cramer P (February 2002). "Multisubunit RNA polymerases". Current Opinion in Structural Biology. 12 (1): 89–97. doi:10.1016/s0959-440x(02)00294-4. PMID 11839495.
Alberts B (2014-11-18). Molecular Biology of the Cell (Sixth ed.). New York, NY: Garland Science, Taylor and Francis Group. ISBN 9780815344322. OCLC 887605755.
Roeder RG (November 1991). "The complexities of eukaryotic transcription initiation: regulation of preinitiation complex assembly". Trends in Biochemical Sciences. 16 (11): 402–408. doi:10.1016/0968-0004(91)90164-Q. PMID 1776168.
Watson JD, Baker TA, Bell SP, Gann AA, Levine M, Losick RM (2013). Molecular Biology of the Gene (7th ed.). Pearson.
Richardson JP (September 2002). "Rho-dependent termination and ATPases in transcript termination". Biochimica et Biophysica Acta (BBA) - Gene Structure and Expression. 1577 (2): 251–260. doi:10.1016/S0167-4781(02)00456-6. PMID 12213656.
Porrua O, Boudvillain M, Libri D (August 2016). "Transcription Termination: Variations on Common Themes". Trends in Genetics. 32 (8): 508–522. doi:10.1016/j.tig.2016.05.007. PMID 27371117.
Lykke-Andersen S, Jensen TH (October 2007). "Overlapping pathways dictate termination of RNA polymerase II transcription". Biochimie. 89 (10): 1177–1182. doi:10.1016/j.biochi.2007.05.007. PMID 17629387.
Mathew R, Chatterji D (October 2006). "The evolving story of the omega subunit of bacterial RNA polymerase". Trends in Microbiology. 14 (10): 450–455. doi:10.1016/j.tim.2006.08.002. PMID 16908155.
Pfannschmidt T, Ogrzewalla K, Baginsky S, Sickmann A, Meyer HE, Link G (January 2000). "The multisubunit chloroplast RNA polymerase A from mustard (Sinapis alba L.). Integration of a prokaryotic core into a larger complex with organelle-specific functions". European Journal of Biochemistry. 267 (1): 253–261. doi:10.1046/j.1432-1327.2000.00991.x. PMID 10601874.
Chi W, He B, Mao J, Jiang J, Zhang L (September 2015). "Plastid sigma factors: Their individual functions and regulation in transcription". Biochimica et Biophysica Acta (BBA) - Bioenergetics. SI: Chloroplast Biogenesis. 1847 (9): 770–778. doi:10.1016/j.bbabio.2015.01.001. PMID 25596450.
Schweer J, Türkeri H, Kolpack A, Link G (December 2010). "Role and regulation of plastid sigma factors and their functional interactors during chloroplast transcription - recent lessons from Arabidopsis thaliana". European Journal of Cell Biology. 89 (12): 940–946. doi:10.1016/j.ejcb.2010.06.016. PMID 20701995.
Hager DA, Jin DJ, Burgess RR (August 1990). "Use of Mono Q high-resolution ion-exchange chromatography to obtain highly pure and active Escherichia coli RNA polymerase". Biochemistry. 29 (34): 7890–7894. doi:10.1021/bi00486a016. PMID 2261443.