CCCTC-binding factor (CTCF) was initially discovered as a negative regulator of the chicken c-myc gene. The protein was found to bind three regularly spaced repeats of the core sequence CCCTC and was therefore named CCCTC-binding factor.[9]
Although CTCF was initially discovered as a transcription factor,[9] and most of its binding occurs at cis-regulatory regions in combination with other transcription factors,[10] later research has shifted to its role in regulating the 3D structure of chromatin.[8] CTCF bookmarks distant regions of the DNA that can be connected by chromatin loops, and such loops can account for the formation of chromatin compartments, domains, nanodomains, territories, and specific structures such as topologically associating domains (TADs) or lamina-associated domains (LADs).[11] It also defines the boundaries between active and heterochromatic DNA.
Since the 3D structure of DNA influences the regulation of genes, CTCF's activity influences gene expression. CTCF is thought to be a primary part of the activity of insulators, sequences that block the interaction between enhancers and promoters. CTCF binding has been shown both to promote and to repress gene expression. It is unknown whether CTCF affects gene expression solely through its looping activity or whether it has some other, as yet unidentified, activity.[8] A recent study showed that, in addition to demarcating TADs, CTCF mediates promoter–enhancer loops, often located in promoter-proximal regions, to facilitate promoter–enhancer interactions within one TAD.[12] This is in line with the concept that a subpopulation of CTCF associates with the RNA polymerase II (Pol II) protein complex to activate transcription. It is likely that CTCF helps to bridge transcription factor-bound enhancers to transcription start site-proximal regulatory elements and to initiate transcription by interacting with Pol II, thus supporting a role for CTCF in facilitating contacts between transcription regulatory sequences. This model has been demonstrated by previous work on the beta-globin locus.
The binding of CTCF has been shown to have many effects, which are enumerated below. In each case, it is unknown if CTCF directly evokes the outcome or if it does so indirectly (in particular through its looping role).
CTCF plays a major role in repressing the insulin-like growth factor 2 gene by binding to the H-19 imprinting control region (ICR) along with differentially-methylated region-1 (DMR1) and MAR3.[13][14]
Binding of target sequence elements by CTCF can block the interaction between enhancers and promoters, thereby limiting the activity of enhancers to certain functional domains. Besides acting as an enhancer blocker, CTCF can also act as a chromatin barrier[15] by preventing the spread of heterochromatin structures.
CTCF physically binds to itself to form homodimers,[16] which causes the bound DNA to form loops.[17] CTCF also occurs frequently at the boundaries of sections of DNA bound to the nuclear lamina.[11] Using chromatin immunoprecipitation (ChIP) followed by ChIP-seq, it was found that CTCF localizes with cohesin genome-wide and affects gene regulatory mechanisms and higher-order chromatin structure.[18][19] It is currently believed that the DNA loops are formed by the loop extrusion mechanism, whereby the cohesin ring is actively translocated along the DNA until it meets CTCF. CTCF has to be in the proper orientation to stop cohesin.[20][21]
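The orientation dependence of loop extrusion can be illustrated with a deliberately simplified, deterministic toy model. It is a sketch under stated assumptions, not a biophysical simulation: the one-dimensional lattice, the function name extrude_loop, and the site positions are hypothetical. The stalling rule used here (the left-moving side of cohesin stops only at a "+" site and the right-moving side only at a "-" site, i.e. convergent sites) is an assumption chosen to represent "proper orientation".

```python
def extrude_loop(length, ctcf_sites, load_pos):
    """Toy loop extrusion on a 1D lattice of `length` positions.

    ctcf_sites maps a lattice position to an orientation, "+" or "-".
    Assumed rule (illustrative): the left-moving side of cohesin stalls
    only at "+" sites, the right-moving side only at "-" sites.
    Returns the (left, right) anchor positions of the final loop.
    """
    if not 0 <= load_pos < length:
        raise ValueError("load_pos must lie on the lattice")

    left = right = load_pos
    left_stalled = right_stalled = False

    while not (left_stalled and right_stalled):
        if not left_stalled:
            # Stall at a properly oriented site or at the lattice edge.
            if ctcf_sites.get(left) == "+" or left == 0:
                left_stalled = True
            else:
                left -= 1
        if not right_stalled:
            if ctcf_sites.get(right) == "-" or right == length - 1:
                right_stalled = True
            else:
                right += 1

    return left, right


# Hypothetical site layout: one "+" site and two "-" sites.
sites = {20: "+", 40: "-", 80: "-"}
print(extrude_loop(100, sites, load_pos=30))
print(extrude_loop(100, sites, load_pos=50))
```

In this sketch, cohesin loaded between positions 20 and 40 is held by that convergent pair, while cohesin loaded at position 50 passes the "-" site at 40 on its way left, because that site faces the wrong way, and stops only at the "+" site at 20. Real extrusion is stochastic and CTCF sites are not perfectly efficient barriers; neither effect is modeled here.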
CTCF binds to the consensus sequence CCGCGNGGNGGCAG (in IUPAC notation).[23][24] This sequence is defined by the 11 zinc finger motifs in the protein's structure. CTCF's binding is disrupted by CpG methylation of the DNA it binds to.[25] On the other hand, CTCF binding may set boundaries for the spreading of DNA methylation.[26] In recent studies, loss of CTCF binding has been reported to increase localized CpG methylation, reflecting another epigenetic remodeling role of CTCF in the human genome.[27][28][29]
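As an illustration of how the IUPAC consensus above can be read, the following minimal sketch scans both strands of a DNA string for exact matches. The function names and the example sequence are hypothetical. Exact consensus matching is a simplification: binding sites are more commonly scored with position weight matrices, and the effect of CpG methylation is not modeled here.

```python
import re

# Consensus as quoted in the text, in IUPAC notation (N = any base).
CTCF_CONSENSUS = "CCGCGNGGNGGCAG"

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(motif):
    """Translate an IUPAC motif into a regular-expression string."""
    return "".join(IUPAC[base] for base in motif.upper())


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_consensus_sites(seq, motif=CTCF_CONSENSUS):
    """Return sorted (start, strand) pairs for exact consensus matches.

    `start` is the 0-based leftmost coordinate on the forward strand.
    """
    seq = seq.upper()
    # A lookahead lets overlapping matches be reported as well.
    pattern = re.compile("(?=(" + iupac_to_regex(motif) + "))")
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]

    # Scan the reverse complement and map hits back to forward coordinates.
    rc = reverse_complement(seq)
    n, k = len(seq), len(motif)
    hits += [(n - m.start() - k, "-") for m in pattern.finditer(rc)]
    return sorted(hits)


# Hypothetical example: one consensus instance flanked by filler bases.
example = "AAAA" + "CCGCGAGGTGGCAG" + "TTTT"
print(find_consensus_sites(example))
```

The strand reported for each hit is the kind of orientation information that matters for whether a site can stop cohesin.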
CTCF binds to an average of about 55,000 DNA sites in 19 diverse cell types (12 normal and 7 immortal), with a total of 77,811 distinct binding sites across all 19 cell types.[30] CTCF's ability to bind to multiple sequences through the use of various combinations of its zinc fingers earned it the status of a "multivalent protein".[5] More than 30,000 CTCF binding sites have been characterized.[31] The human genome contains anywhere between 15,000 and 40,000 CTCF binding sites depending on cell type, suggesting a widespread role for CTCF in gene regulation.[15][23][32] In addition, CTCF binding sites act as nucleosome positioning anchors so that, when used to align various genomic signals, multiple flanking nucleosomes can be readily identified.[15][33] On the other hand, high-resolution nucleosome mapping studies have demonstrated that differences in CTCF binding between cell types may be attributed to differences in nucleosome locations.[34] Methylation loss at the CTCF-binding sites of some genes has been found to be related to human diseases, including male infertility.[24]
CTCF binds to itself to form homodimers.[16] CTCF has also been shown to interact with Y box binding protein 1.[35] CTCF also co-localizes with cohesin, which extrudes chromatin loops by actively translocating one or two DNA strands through its ring-shaped structure until it meets CTCF in the proper orientation.[36] CTCF is also known to interact with chromatin remodellers such as Chd4 and Snf2h (SMARCA5).[10]
Lobanenkov VV, Nicolas RH, Adler VV, Paterson H, Klenova EM, Polotskaja AV, Goodwin GH (December 1990). "A novel sequence-specific DNA binding protein which interacts with three regularly spaced direct repeats of the CCCTC-motif in the 5'-flanking sequence of the chicken c-myc gene". Oncogene. 5 (12): 1743–53. PMID 2284094.
Guelen L, Pagie L, Brasset E, Meuleman W, Faza MB, Talhout W, Eussen BH, de Klein A, Wessels L, de Laat W, van Steensel B (June 2008). "Domain organization of human chromosomes revealed by mapping of nuclear lamina interactions". Nature. 453 (7197): 948–51. Bibcode:2008Natur.453..948G. doi:10.1038/nature06947. PMID 18463634. S2CID 4429401.
Ohlsson R, Renkawitz R, Lobanenkov V (2001). "CTCF is a uniquely versatile transcription regulator linked to epigenetics and disease". Trends Genet. 17 (9): 520–7. doi:10.1016/S0168-9525(01)02366-6. PMID 11525835.
Klenova EM, Morse HC, Ohlsson R, Lobanenkov VV (2003). "The novel BORIS + CTCF gene family is uniquely involved in the epigenetics of normal biology and cancer". Semin. Cancer Biol. 12 (5): 399–414. doi:10.1016/S1044-579X(02)00060-3. PMID 12191639.
Filippova GN, Lindblom A, Meincke LJ, Klenova EM, Neiman PE, Collins SJ, Doggett NA, Lobanenkov VV (1998). "A widely expressed transcription factor with multiple DNA sequence specificity, CTCF, is localized at chromosome segment 16q22.1 within one of the smallest regions of overlap for common deletions in breast and prostate cancers". Genes Chromosomes Cancer. 22 (1): 26–36. doi:10.1002/(SICI)1098-2264(199805)22:1<26::AID-GCC4>3.0.CO;2-9. PMID 9591631. S2CID 34221526.