
Protein production is the biotechnological process of generating a specific protein. It is typically achieved by the manipulation of gene expression in an organism such that it expresses large amounts of a recombinant gene. This includes the transcription of the recombinant DNA to messenger RNA (mRNA) and the translation of mRNA into polypeptide chains, which are ultimately folded into functional proteins and may be targeted to specific subcellular or extracellular locations.[1]
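The transcription and translation steps described above can be illustrated with a minimal Python sketch. It assumes the standard genetic code and a toy coding-strand sequence invented for illustration; the function names `transcribe` and `translate` are hypothetical, and the sketch ignores promoters, splicing, folding and targeting.

```python
# Toy model of gene expression: DNA -> mRNA -> polypeptide.
# Assumptions: standard genetic code; input is the coding (sense) strand;
# the example sequence is invented for illustration only.

BASES = "UCAG"
# Amino acids for all 64 codons in UCAG order (first base varies slowest);
# "*" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def transcribe(coding_strand: str) -> str:
    """Return the mRNA matching a DNA coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first stop codon.

    Returns one-letter amino acid codes; raises KeyError on non-ACGU input.
    """
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    residues = []
    # Step through complete codons only.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


if __name__ == "__main__":
    dna = "GGATGGCCAAATGGTAA"  # hypothetical recombinant coding sequence
    mrna = transcribe(dna)
    print(mrna)             # GGAUGGCCAAAUGGUAA
    print(translate(mrna))  # MAKW
```

In a real host, the amount of protein made also depends on factors this sketch leaves out, such as promoter strength and gene copy number.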
Protein production systems (also known as expression systems) are used in the life sciences, biotechnology, and medicine. Molecular biology research uses numerous proteins and enzymes, many of which come from expression systems, particularly DNA polymerase for PCR, reverse transcriptase for RNA analysis, and restriction endonucleases for cloning. Expression systems are also used to make proteins that are screened in drug discovery as biological targets or as potential drugs themselves. There are also significant applications for expression systems in industrial fermentation, notably the production of biopharmaceuticals such as human insulin to treat diabetes, and the manufacture of enzymes.
Commonly used protein production systems include those derived from bacteria,[2][3] yeast,[4][5] baculovirus/insect,[6] mammalian cells,[7][8] and more recently filamentous fungi such as Myceliophthora thermophila.[9] When biopharmaceuticals are produced with one of these systems, process-related impurities termed host cell proteins also end up in the final product in trace amounts.[10]
The oldest and most widely used expression systems are cell-based and may be defined as the "combination of an expression vector, its cloned DNA, and the host for the vector that provide a context to allow foreign gene function in a host cell, that is, produce proteins at a high level".[11][12] Overexpression is an abnormally and excessively high level of gene expression which produces a pronounced gene-related phenotype.[13][14][clarification needed]
There are many ways to introduce foreign DNA to a cell for expression, and many different host cells may be used for expression; each expression system has distinct advantages and liabilities. Expression systems are normally referred to by the host and the DNA source or the delivery mechanism for the genetic material. For example, common hosts are bacteria (such as E. coli and B. subtilis), yeast (such as S. cerevisiae[5]) or eukaryotic cell lines. Common DNA sources and delivery mechanisms are viruses (such as baculovirus, retrovirus, and adenovirus), plasmids, artificial chromosomes and bacteriophage (such as lambda). The best expression system depends on the gene involved; for example, Saccharomyces cerevisiae is often preferred for proteins that require significant posttranslational modification. Insect or mammalian cell lines are used when human-like splicing of mRNA is required. Nonetheless, bacterial expression has the advantage of easily producing large amounts of protein, which is required for X-ray crystallography or nuclear magnetic resonance experiments for structure determination.
Because bacteria are prokaryotes, they are not equipped with the full enzymatic machinery to accomplish the required post-translational modifications or molecular folding. Hence, multi-domain eukaryotic proteins expressed in bacteria are often non-functional. Also, many proteins become insoluble as inclusion bodies that are difficult to recover without harsh denaturants and subsequent cumbersome protein refolding.
To address these concerns, expression systems using eukaryotic cells were developed for applications that require proteins to be conformed as in, or closer to, eukaryotic organisms: cells of plants (e.g. tobacco), insects, or mammals (e.g. bovines) are transfected with genes and cultured in suspension, or even as tissues or whole organisms, to produce fully folded proteins. Mammalian in vivo expression systems, however, have low yield and other limitations, such as long production times and toxicity to host cells. To combine the high yield, productivity, and scalability of bacteria and yeast with the advanced epigenetic features of plant, insect, and mammalian systems, other protein production systems are being developed using unicellular eukaryotes (e.g. non-pathogenic Leishmania cells).

E. coli is one of the most widely used expression hosts, and DNA is normally introduced in a plasmid expression vector. The techniques for overexpression in E. coli are well developed and work by increasing the number of copies of the gene or by increasing the binding strength of the promoter region, thereby assisting transcription.[3]
For example, a DNA sequence for a protein of interest could be cloned or subcloned into a high copy-number plasmid containing the lac (often LacUV5) promoter, which is then transformed into the bacterium E. coli. Addition of IPTG (a lactose analog) activates the lac promoter and causes the bacteria to express the protein of interest.[2]
E. coli BL21 and BL21(DE3) are two strains commonly used for protein production. As members of the B lineage, they lack the lon and OmpT proteases, which protects the produced proteins from degradation. The DE3 prophage found in BL21(DE3) provides T7 RNA polymerase (driven by the LacUV5 promoter), allowing vectors with the T7 promoter to be used instead.[15]
Non-pathogenic species of the gram-positive Corynebacterium are used for the commercial production of various amino acids. The C. glutamicum species is widely used for producing glutamate and lysine,[16] components of human food, animal feed and pharmaceutical products.
Expression of functionally active human epidermal growth factor has been achieved in C. glutamicum,[17] demonstrating a potential for industrial-scale production of human proteins. Expressed proteins can be targeted for secretion through either the general secretory pathway (Sec) or the twin-arginine translocation pathway (Tat).[18]
Unlike gram-negative bacteria, the gram-positive Corynebacterium species lack lipopolysaccharides that function as antigenic endotoxins in humans.[citation needed]
The non-pathogenic, gram-negative bacterium Pseudomonas fluorescens is used for high-level production of recombinant proteins, commonly for the development of biotherapeutics and vaccines. P. fluorescens is a metabolically versatile organism, allowing for high-throughput screening and rapid development of complex proteins. P. fluorescens is best known for its ability to rapidly and successfully produce high titers of active, soluble protein.[19]
Expression systems using either S. cerevisiae or Pichia pastoris allow stable and lasting production of proteins that are processed similarly to those in mammalian cells, at high yield, in chemically defined media.[4][5]
Filamentous fungi, especially Aspergillus and Trichoderma, have long been used to produce diverse industrial enzymes from their own genomes ("native", "homologous") and from recombinant DNA ("heterologous").[9]
More recently, Myceliophthora thermophila C1 has been developed into an expression platform for screening and production of native and heterologous proteins. The C1 expression system shows a low-viscosity morphology in submerged culture, enabling the use of complex growth and production media. C1 also does not "hyperglycosylate" heterologous proteins, as Aspergillus and Trichoderma tend to do.[9]
Baculovirus-infected insect cells[20] (Sf9, Sf21, High Five strains) or mammalian cells[21] (HeLa, HEK 293) allow production of glycosylated or membrane proteins that cannot be produced using fungal or bacterial systems.[20][6] These systems are useful for producing proteins in high quantity. In the baculovirus system, genes are not expressed continuously because infected host cells eventually lyse and die during each infection cycle.[22]
Non-lytic insect cell expression is an alternative to the lytic baculovirus expression system. In non-lytic expression, vectors are transiently or stably transfected into the chromosomal DNA of insect cells for subsequent gene expression.[23][24] This is followed by selection and screening of recombinant clones.[25] The non-lytic system has been used to give higher protein yield and quicker expression of recombinant genes compared to baculovirus-infected cell expression.[24] Cell lines used for this system include Sf9 and Sf21 from Spodoptera frugiperda cells, Hi-5 from Trichoplusia ni cells, and Schneider 2 and Schneider 3 cells from Drosophila melanogaster cells.[23][25] With this system, cells do not lyse and several cultivation modes can be used.[23] Additionally, protein production runs are reproducible,[23][24] and the system gives a homogeneous product.[24] A drawback of this system is the requirement of an additional screening step for selecting viable clones.[25]
Expression systems based on Leishmania tarentolae, which cannot infect mammals, allow stable and lasting production of proteins at high yield in chemically defined media. Produced proteins exhibit fully eukaryotic post-translational modifications, including glycosylation and disulfide bond formation.[citation needed]
The most common mammalian expression systems are Chinese hamster ovary (CHO) and human embryonic kidney (HEK) cells.[26][27][28]
Cell-free production of proteins is performed in vitro using purified RNA polymerase, ribosomes, tRNA and ribonucleotides. These reagents may be produced by extraction from cells or from a cell-based expression system. Due to the low expression levels and high cost of cell-free systems, cell-based systems are more widely used.[29]