In biology, reprogramming refers to erasure and remodeling of epigenetic marks, such as DNA methylation, during mammalian development or in cell culture.[1] Such control is also often associated with alternative covalent modifications of histones.
Reprogrammings that are both large scale (10% to 100% of epigenetic marks) and rapid (hours to a few days) occur at three life stages of mammals. Almost 100% of epigenetic marks are reprogrammed in two short periods early in development after fertilization of an ovum by a sperm. In addition, almost 10% of DNA methylations in neurons of the hippocampus can be rapidly altered during formation of a strong fear memory.
After fertilization in mammals, DNA methylation patterns are largely erased and then re-established during early embryonic development. Almost all of the methylations from the parents are erased, first during early embryogenesis, and again in gametogenesis, with demethylation and remethylation occurring each time. Demethylation during early embryogenesis occurs in the preimplantation period. After a sperm fertilizes an ovum to form a zygote, rapid DNA demethylation of the paternal DNA and slower demethylation of the maternal DNA occur until formation of a morula, which has almost no methylation. After the blastocyst is formed, methylation can begin, and with formation of the epiblast a wave of methylation then takes place until the implantation stage of the embryo. Another period of rapid and almost complete demethylation occurs during gametogenesis within the primordial germ cells (PGCs). Other than the PGCs, in the post-implantation stage, methylation patterns in somatic cells are stage- and tissue-specific, with changes that presumably define each individual cell type and last stably over a long time.[2]
The mouse sperm genome is 80–90% methylated at its CpG sites in DNA, amounting to about 20 million methylated sites.[citation needed] After fertilization, the paternal chromosome is almost completely demethylated in six hours by an active process, before DNA replication (blue line in Figure). In the mature oocyte, about 40% of the CpG sites are methylated. Demethylation of the maternal chromosome largely takes place by blockage of the methylating enzymes from acting on maternal-origin DNA and by dilution of the methylated maternal DNA during replication (red line in Figure). The morula (at the 16-cell stage) has only a small amount of DNA methylation (black line in Figure). Methylation begins to increase at 3.5 days after fertilization in the blastocyst, and a large wave of methylation then occurs on days 4.5 to 5.5 in the epiblast, going from 12% to 62% methylation, and reaching its maximum level after implantation in the uterus.[3] By day seven after fertilization, the newly formed primordial germ cells (PGCs) in the implanted embryo segregate from the remaining somatic cells. At this point the PGCs have about the same level of methylation as the somatic cells.
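The dilution of maternal methylation during replication can be illustrated with a rough calculation. The sketch below is an idealized model, not a description of measured data: it assumes that no maintenance or de novo methylation acts on maternal DNA, so each round of replication halves the methylated fraction, and that four synchronous divisions take the zygote to the 16-cell morula. The function name `passive_dilution` is hypothetical; only the 40% starting value comes from the text.

```python
def passive_dilution(initial_fraction: float, divisions: int) -> list[float]:
    """Return the methylated fraction before and after each replication round.

    Assumption (idealized): no maintenance or de novo methylation, so every
    round of replication halves the fraction of methylated strands.
    """
    fractions = [initial_fraction]
    for _ in range(divisions):
        fractions.append(fractions[-1] / 2)
    return fractions


# About 40% of CpG sites are methylated in the mature oocyte (from the text).
# Four divisions (1 -> 2 -> 4 -> 8 -> 16 cells) reach the morula stage.
levels = passive_dilution(0.40, 4)
for n, level in enumerate(levels):
    print(f"after {n} division(s): {level:.1%}")
```

Under these assumptions the maternal methylation level falls to a few percent by the 16-cell stage, which is consistent with the low morula methylation described above. Real embryos deviate from this model, since some sequences, such as imprinted regions, are protected from demethylation.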
The newly formed primordial germ cells (PGCs) in the implanted embryo arise from the somatic cells. At this point the PGCs have high levels of methylation. These cells migrate from the epiblast toward the gonadal ridge. The cells then proliferate rapidly and begin demethylation in two waves. In the first wave, demethylation is by replicative dilution, but in the second wave demethylation is by an active process. The second wave leads to demethylation of specific loci. At this point the PGC genomes display the lowest levels of DNA methylation of any cells in the entire life cycle [at embryonic day 13.5 (E13.5), see the second figure in this section].[4]
After fertilization some cells of the newly formed embryo migrate to the germinal ridge and will eventually become the germ cells (sperm and oocytes) of the next generation. Due to the phenomenon of genomic imprinting, maternal and paternal genomes are differentially marked and must be properly reprogrammed every time they pass through the germline. Therefore, during the process of gametogenesis the primordial germ cells must have their original biparental DNA methylation patterns erased and re-established based on the sex of the transmitting parent.
After fertilization, the paternal and maternal genomes are demethylated in order to erase their epigenetic signatures and acquire totipotency. There is asymmetry at this point: the male pronucleus undergoes a quick and active demethylation, while the female pronucleus is demethylated passively during consecutive cell divisions. The process of DNA demethylation involves base excision repair and likely other DNA-repair-based mechanisms.[5] Despite the global nature of this process, certain sequences avoid it, such as differentially methylated regions (DMRs) associated with imprinted genes, retrotransposons and centromeric heterochromatin. Remethylation is needed again to differentiate the embryo into a complete organism.[6]
In vitro manipulation of pre-implantation embryos has been shown to disrupt methylation patterns at imprinted loci[7] and plays a crucial role in cloned animals.[8]
Learning and memory have levels of permanence, differing from other mental processes such as thought, language, and consciousness, which are temporary in nature. Learning and memory can be accumulated either slowly (multiplication tables) or rapidly (touching a hot stove), but once attained, can be recalled into conscious use for a long time. Rats subjected to one instance of contextual fear conditioning create an especially strong long-term memory. At 24 hours after training, 9.17% of the genes in the genomes of rat hippocampus neurons were found to be differentially methylated. This included more than 2,000 differentially methylated genes, with over 500 genes being demethylated.[9] The hippocampus region of the brain is where contextual fear memories are first stored (see figure of the brain, this section), but this storage is transient and does not remain in the hippocampus. In rats, contextual fear conditioning is abolished when the hippocampus is removed by hippocampectomy just 1 day after conditioning, but rats retain a considerable amount of contextual fear when a long delay (28 days) is imposed between the time of conditioning and the time of hippocampectomy.[10]
Three molecular stages are required for reprogramming the DNA methylome. Stage 1: Recruitment. The enzymes needed for reprogramming are recruited to genome sites that require demethylation or methylation. Stage 2: Implementation. The initial enzymatic reactions take place. In the case of methylation, this is a short step that results in the methylation of cytosine to 5-methylcytosine. Stage 3: Base excision DNA repair. The intermediate products of demethylation are processed by specific enzymes of the base excision DNA repair pathway that finally restore cytosine in the DNA sequence.
The Figure in this section indicates the central roles of ten-eleven translocation methylcytosine dioxygenases (TETs) in the demethylation of 5-methylcytosine (5mC) to form cytosine.[12] As reviewed in 2018,[12] 5mC is very often initially oxidized by TET dioxygenases to generate 5-hydroxymethylcytosine (5hmC). In successive steps (see Figure) TET enzymes further oxidize 5hmC to generate 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC). Thymine-DNA glycosylase (TDG) recognizes the intermediate bases 5fC and 5caC and cleaves the glycosidic bond, resulting in an apyrimidinic site (AP site). In an alternative oxidative deamination pathway, 5hmC can be oxidatively deaminated by APOBEC (AID/APOBEC) deaminases to form 5-hydroxymethyluracil (5hmU), or 5mC can be converted to thymine (Thy). 5hmU can be cleaved by TDG, SMUG1, NEIL1, or MBD4. AP sites and T:G mismatches are then repaired by base excision repair (BER) enzymes to yield cytosine (Cyt).
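The routes just described can be written down as a small directed graph, which makes it easier to see how many distinct paths lead from 5mC back to cytosine. The following is a minimal sketch that encodes only the steps named in the paragraph above; the dictionary `PATHWAY` and the function `routes` are illustrative names, and the graph is a simplification that ignores kinetics and sequence context.

```python
# Each key is a base or intermediate; each value lists (product, enzyme) steps
# taken from the description above.
PATHWAY = {
    "5mC": [("5hmC", "TET"), ("Thy", "AID/APOBEC")],
    "5hmC": [("5fC", "TET"), ("5hmU", "AID/APOBEC")],
    "5fC": [("5caC", "TET"), ("AP site", "TDG")],
    "5caC": [("AP site", "TDG")],
    "5hmU": [("AP site", "TDG, SMUG1, NEIL1 or MBD4")],
    "AP site": [("Cyt", "BER")],
    "Thy": [("Cyt", "BER")],  # T:G mismatch repaired by BER
}


def routes(start, goal, path=None):
    """Enumerate every route from start to goal by depth-first search."""
    path = (path or []) + [start]
    if start == goal:
        return [path]
    found = []
    for product, _enzyme in PATHWAY.get(start, []):
        if product not in path:  # guard against revisiting a state
            found.extend(routes(product, goal, path))
    return found


all_routes = routes("5mC", "Cyt")
for route in all_routes:
    print(" -> ".join(route))
```

Listing the routes this way shows that every path ends in a base excision repair step, which matches Stage 3 of the three-stage scheme.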
The isoforms of the TET enzymes include at least two isoforms of TET1, one of TET2 and three isoforms of TET3.[13][14] The full-length canonical TET1 isoform appears virtually restricted to early embryos, embryonic stem cells and primordial germ cells (PGCs). The dominant TET1 isoform in most somatic tissues, at least in the mouse, arises from alternative promoter usage which gives rise to a short transcript and a truncated protein designated TET1s. The isoforms of TET3 are the full-length form TET3FL, a short-form splice variant TET3s, and a form that occurs in oocytes and neurons designated TET3o. TET3o is created by alternative promoter use and contains an additional first N-terminal exon coding for 11 amino acids. TET3o only occurs in oocytes and neurons and was not expressed in embryonic stem cells or in any other cell type or adult mouse tissue tested. Whereas TET1 expression can barely be detected in oocytes and zygotes, and TET2 is only moderately expressed, the TET3 variant TET3o shows extremely high levels of expression in oocytes and zygotes, but is nearly absent at the 2-cell stage. It is possible that TET3o, high in neurons, oocytes and zygotes at the one-cell stage, is the major TET enzyme utilized when very large scale rapid demethylations occur in these cells.
The TET enzymes do not specifically bind to 5-methylcytosine except when recruited. Without recruitment or targeting, TET1 predominantly binds to high-CG promoters and CpG islands (CGIs) genome-wide by its CXXC domain, which can recognize unmethylated CGIs.[15] TET2 does not have an affinity for 5-methylcytosine in DNA.[16] The CXXC domain of the full-length TET3, which is the predominant form expressed in neurons, binds most strongly to CpGs where the C was converted to 5-carboxylcytosine (5caC). However, it also binds to unmethylated CpGs.[14]
For a TET enzyme to initiate demethylation it must first be recruited to a methylated CpG site in DNA. Two of the proteins shown to recruit a TET enzyme to a methylated cytosine in DNA are OGG1 (see figure Initiation of DNA demethylation)[17] and EGR1.[18]
Oxoguanine glycosylase (OGG1) catalyses the first step in base excision repair of the oxidatively damaged base 8-OHdG. OGG1 finds 8-OHdG very rapidly, sliding along the linear DNA at 1,000 base pairs in 0.1 seconds.[19] OGG1 proteins bind to oxidatively damaged DNA with a half-maximum time of about 6 seconds.[20] When OGG1 finds 8-OHdG it changes conformation and complexes with 8-OHdG in its binding pocket.[21] OGG1 does not immediately act to remove the 8-OHdG. Half-maximum removal of 8-OHdG takes about 30 minutes in HeLa cells in vitro,[22] or about 11 minutes in the livers of irradiated mice.[23] DNA oxidation by reactive oxygen species preferentially occurs at a guanine in a methylated CpG site, because of a lowered ionization potential of guanine bases adjacent to 5-methylcytosine.[24] TET1 binds (is recruited to) the OGG1 bound to 8-OHdG (see figure).[17] This likely allows TET1 to demethylate an adjacent methylated cytosine. When human mammary epithelial cells (MCF-10A) were treated with H2O2, 8-OHdG increased in DNA by 3.5-fold and this caused large scale demethylation of 5-methylcytosine to about 20% of its initial level in DNA.[17]
The gene early growth response protein 1 (EGR1) is an immediate early gene (IEG). The defining characteristic of IEGs is the rapid and transient up-regulation, within minutes, of their mRNA levels independent of protein synthesis.[25] EGR1 can rapidly be induced by neuronal activity.[26] In adulthood, EGR1 is expressed widely throughout the brain, maintaining baseline expression levels in several key areas of the brain including the medial prefrontal cortex, striatum, hippocampus and amygdala.[25] This expression is linked to control of cognition, emotional response, social behavior and sensitivity to reward.[25] EGR1 binds to DNA at sites with the motifs 5′-GCGTGGGCG-3′ and 5′-GCGGGGGCGG-3′, and these motifs occur primarily in promoter regions of genes.[26] The short isoform TET1s is expressed in the brain. EGR1 and TET1s form a complex mediated by the C-terminal regions of both proteins, independently of association with DNA.[26] EGR1 recruits TET1s to genomic regions flanking EGR1 binding sites.[26] In the presence of EGR1, TET1s is capable of locus-specific demethylation and activation of the expression of downstream genes regulated by EGR1.[26]
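As an illustration of how candidate EGR1 binding sites might be located, the two motifs quoted above can be searched for on both strands of a DNA sequence. This is a minimal sketch using exact string matching only; the function names and the short `promoter` fragment are hypothetical, and real binding-site prediction would normally allow for degenerate positions rather than requiring a perfect match.

```python
# The two EGR1 motifs quoted in the text, written 5' to 3'.
MOTIFS = ("GCGTGGGCG", "GCGGGGGCGG")

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motifs(seq: str) -> list[tuple[int, str, str]]:
    """Return sorted (start, strand, motif) tuples for exact motif matches."""
    seq = seq.upper()
    hits = []
    for motif in MOTIFS:
        for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
            start = seq.find(pattern)
            while start != -1:
                hits.append((start, strand, motif))
                start = seq.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)


# Hypothetical fragment: one motif on the forward strand and the other
# on the reverse strand, separated by arbitrary filler bases.
promoter = "TTA" + MOTIFS[0] + "AA" + reverse_complement(MOTIFS[1]) + "TT"
for start, strand, motif in find_motifs(promoter):
    print(f"{motif} at position {start} on the {strand} strand")
```

Positions are 0-based and refer to the forward strand, so a hit on the "-" strand marks where the motif's reverse complement begins.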
The first person to successfully demonstrate reprogramming was John Gurdon, who in 1962 demonstrated that differentiated somatic cells could be reprogrammed back into an embryonic state when he managed to obtain swimming tadpoles following the transfer of differentiated intestinal epithelial cells into enucleated frog eggs.[27] For this achievement he received the 2012 Nobel Prize in Physiology or Medicine alongside Shinya Yamanaka.[28] Yamanaka was the first to demonstrate (in 2006) that the somatic cell nuclear transfer or oocyte-based reprogramming process (see below) which Gurdon discovered could be recapitulated (in mice) by defined factors (Oct4, Sox2, Klf4, and c-Myc) to generate induced pluripotent stem cells (iPSCs).[29] Other combinations of genes have also been used, including LIN28[30] and Homeobox protein NANOG.[30][31]
With the discovery that cell fate could be altered, the question arose of what progression of events signifies a cell undergoing reprogramming. As the final products of iPSC reprogramming were similar in morphology, proliferation, gene expression, pluripotency, and telomerase activity, genetic and morphological markers were used as a way to determine what phase of reprogramming was occurring.[32] Reprogramming is divided into three phases: initiation, maturation, and stabilization.[33]
The initiation phase is associated with the downregulation of cell type specific genes and the upregulation of pluripotent genes.[33] As the cells move towards pluripotency, telomerase activity is reactivated to extend telomeres. The cell morphology can directly affect the reprogramming process as the cell is modifying itself to prepare for the gene expression of pluripotency.[34] The main indicator that the initiation phase has completed is that the first genes associated with pluripotency are expressed. This includes the expression of Oct-4 or Homeobox protein NANOG, while undergoing a mesenchymal–epithelial transition (MET), and the loss of apoptosis and senescence.[35]
If the cell is directly reprogrammed from one somatic cell to another, the genes associated with each cell type begin to be upregulated and downregulated accordingly.[33] This can occur either through direct cell reprogramming or by creating an intermediate, such as an iPSC, and differentiating it into the desired cell type.[35]
The initiation phase is completed through one of three pathways: nuclear transfer, cell fusion, or defined factors (microRNA, transcription factors, epigenetic markers, and other small molecules).[30][35]
An oocyte can reprogram an adult nucleus into an embryonic state after somatic cell nuclear transfer, so that a new organism can be developed from such a cell.[36]
Reprogramming is distinct from development of a somatic epitype,[37] as somatic epitypes can potentially be altered after an organism has left the developmental stage of life.[38] During somatic cell nuclear transfer, the oocyte turns off tissue-specific genes in the somatic cell nucleus and turns embryo-specific genes back on. This process has been shown through cloning, as seen in John Gurdon's tadpoles[27] and Dolly the Sheep.[39] Notably, these events have shown that cell fate is reversible.
Cell fusion is used to create a multinucleated cell called a heterokaryon.[35] The fused cells allow otherwise silenced genes to become reactivated and expressed. As the genes are reactivated, the cells can re-differentiate. There are instances where transcription factors, such as the Yamanaka factors, are still needed to aid in heterokaryon cell reprogramming.[40]
Unlike nuclear transfer and cell fusion, defined factors do not require a full genome, only reprogramming factors. These reprogramming factors include microRNA, transcription factors, epigenetic markers, and other small molecules.[35] The original transcription factors discovered by Yamanaka that lead to iPSC development are Oct4, Sox2, Klf4, and c-Myc (OSKM factors).[29][32] Although the OSKM factors have been shown to induce and aid in pluripotency, other factors such as Homeobox protein NANOG,[41] LIN28,[30] TRA-1-60,[41] and C/EBPα[42] aid in the efficiency of reprogramming. MicroRNA and other small molecule-driven processes have been used as a means of increasing the efficiency of the conversion from somatic cells to pluripotency.[35]
The maturation phase begins at the end of the initiation phase, when the first pluripotent genes are expressed.[33] The cell is preparing itself to be independent from the defined factors that started the reprogramming process. The first genes to be detected in iPSCs are Oct4, Homeobox protein NANOG, and Esrrb, followed later by Sox2.[35] In the later stages of maturation, transgene silencing marks the start of the cell becoming independent from the induced transcription factors. Once the cell is independent, the maturation phase ends and the stabilization phase begins.
As reprogramming has proven to be a variable and low-efficiency process, not all the cells complete the maturation phase and achieve pluripotency.[42] Some cells that undergo reprogramming still undergo apoptosis at the beginning of the maturation stage, owing to oxidative stress brought on by the changes in gene expression. The use of microRNA, proteins, and different combinations of the OSKM factors has started to lead towards a higher efficiency rate of reprogramming.
The stabilization phase refers to the processes in the cell that occur after the cell reaches pluripotency. Genetic markers include the expression of Sox2 and X chromosome reactivation, while epigenetic changes include telomerase extending the telomeres[30] and loss of the cell’s epigenetic memory.[33] The epigenetic memory of a cell is reset by changes in DNA methylation,[43] using activation-induced cytidine deaminase (AID), TET enzymes (TET), and DNA methyltransferases (DNMTs), starting in the maturation phase and continuing into the stabilization phase.[33] Once the epigenetic memory of the cell is lost, the possibility of differentiation into the three germ layers is achieved.[32] This is considered a fully reprogrammed cell as it can be passaged without reverting to its original somatic cell type.[35]
Reprogramming can also be induced artificially through the introduction of exogenous factors, usually transcription factors. In this context, it often refers to the creation of induced pluripotent stem cells from mature cells such as adult fibroblasts. This allows the production of stem cells for biomedical research, such as research into stem cell therapies, without the use of embryos. It is carried out by the transfection of stem-cell associated genes into mature cells using viral vectors such as retroviruses.
One of the first trans-acting factors discovered to change a cell was found in a myoblast, when the complementary DNA (cDNA) coding for MyoD was expressed and converted a fibroblast to a myoblast. Another trans-acting factor, C/EBPα, directly transformed a lymphoid cell into a myeloid cell. MyoD and C/EBPα are examples of a small number of single factors that can transform cells. More often, a combination of transcription factors works in conjunction to reprogram a cell.
The OSKM factors (Oct4, Sox2, Klf4, and c-Myc) were initially discovered by Yamanaka in 2006, by the induction of a mouse fibroblast into an induced pluripotent stem cell (iPSC).[29] Within the following year, these factors were used to induce human fibroblasts into iPSCs.[32]
Oct4 is one of the core regulatory genes needed for pluripotency, as it is seen in both embryonic stem cells and tumors.[44] Even small increases in Oct4 allow for the start of differentiation into pluripotency. Oct4 works in conjunction with Sox2 for the expression of FGF4, which could aid in differentiation.
Sox2 is a gene used in maintaining pluripotency in stem cells. Oct4 and Sox2 work together to regulate hundreds of genes utilized in pluripotency.[44] However, Sox2 is not the only Sox family member that can participate in gene regulation with Oct4; Sox4, Sox11, and Sox15 also participate, as the Sox protein is redundant throughout the stem cell genome.
Klf4 is a transcription factor used in proliferation, differentiation, apoptosis, and somatic cell reprogramming. When being utilized in cellular reprogramming, Klf4 prevents cell division of damaged cells using its apoptotic ability, and aids in histone acetyltransferase activity.[32]
c-Myc is also known as an oncogene, and under certain conditions can become cancer-causing.[45] In cellular reprogramming, c-Myc is used for cell cycle progression, apoptosis, and cellular transformation for further differentiation.
Homeobox protein NANOG (NANOG) is a transcription factor used to aid in the efficiency of generating iPSCs by maintaining pluripotency[46] and suppressing cell determination factors.[47] NANOG works by promoting chromatin accessibility through repression of histone markers, such as H3K27me3. NANOG aids recruitment of Oct4, Sox2, and Esrrb used in transcription, while also recruiting Brahma-related gene-1 (BRG1) for chromatin accessibility.
C/EBPα (encoded by the CEBPA gene) is a commonly used factor when reprogramming cells into not only iPSCs, but also other cells. C/EBPα has been shown to act as a single trans-acting factor during direct reprogramming of a lymphoid cell into a myeloid cell.[42] C/EBPα is considered a 'path breaker' that aids in preparing the cell for intake of the OSKM factors and specific transcription events.[41] C/EBPα has also been shown to increase the efficiency of the reprogramming events.[33]
The properties of cells obtained after reprogramming can vary significantly, in particular among iPSCs.[48] Factors leading to variation in the performance of reprogramming and functional features of end products include genetic background, tissue source, reprogramming factor stoichiometry and stressors related to cell culture.[48]