Genome Res. 2013 Jan;23(1):121-8.
doi: 10.1101/gr.141705.112. Epub 2012 Oct 11.

Sequencing the unsequenceable: expanded CGG-repeat alleles of the fragile X gene

Erick W Loomis et al. Genome Res. 2013 Jan.

Abstract

The human fragile X mental retardation 1 (FMR1) gene contains a (CGG)n trinucleotide repeat in its 5' untranslated region (5'UTR). Expansions of this repeat result in a number of clinical disorders with distinct molecular pathologies, including fragile X syndrome (FXS; full mutation range, greater than 200 CGG repeats) and fragile X-associated tremor/ataxia syndrome (FXTAS; premutation range, 55–200 repeats). Study of these diseases has been limited by an inability to sequence expanded CGG repeats, particularly in the full mutation range, with existing DNA sequencing technologies. Single-molecule, real-time (SMRT) sequencing provides an approach to sequencing that is fundamentally different from other "next-generation" sequencing platforms, and is well suited for long, repetitive DNA sequences. We report the first sequence data for expanded CGG-repeat FMR1 alleles in the full mutation range that reveal the confounding effects of CGG-repeat tracts on both cloning and PCR. A unique feature of SMRT sequencing is its ability to yield real-time information on the rates of nucleoside addition by the tethered DNA polymerase; for the CGG-repeat alleles, we find a strand-specific effect of CGG-repeat DNA on the interpulse distance. This kinetic signature reveals a novel aspect of the repeat element; namely, that the particular G bias within the CGG/CCG-repeat element influences polymerase activity in a manner that extends beyond simple nearest-neighbor effects. These observations provide a baseline for future kinetic studies of repeat elements, as well as for studies of epigenetic and other chemical modifications thereof.
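
The repeat-size ranges quoted in the abstract can be written as a small lookup. This is only an illustrative sketch: the function name `classify_fmr1_allele` and the label for alleles under 55 repeats are assumptions, since the abstract defines only the premutation (55–200) and full mutation (greater than 200) ranges.

```python
def classify_fmr1_allele(cgg_repeats: int) -> str:
    """Map a CGG-repeat count to the ranges named in the abstract.

    Hypothetical helper: only the two thresholds (55 and 200) come from
    the abstract; the label for smaller alleles is a placeholder.
    """
    if cgg_repeats < 0:
        raise ValueError("repeat count cannot be negative")
    if cgg_repeats > 200:
        return "full mutation"        # greater than 200 CGG repeats
    if cgg_repeats >= 55:
        return "premutation"          # 55-200 CGG repeats
    return "below premutation range"  # not subdivided in the abstract


if __name__ == "__main__":
    # Repeat sizes of the three libraries described in the figure captions.
    for n in (36, 95, 750):
        print(n, classify_fmr1_allele(n))
```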

Figures

Figure 1.
Schematic representation of SMRT sequencing. DNA polymerase synthesizes a nascent strand complementary to a closed-circular SMRTbell DNA template. Fluorescent phospholinked nucleotides produce real-time fluorescent pulse data for both basecalls and incorporation kinetics. Inter-pulse distance (IPD) is defined as the time gap between two consecutive pulses and includes the time taken to move the DNA polymerase between base positions on the DNA template. Pulse width (PW) is the duration of a given pulse and represents the residence time of the nucleotide at the active site of DNA polymerase, up to the point of incorporation and cleavage of the fluorescent tag. Closed SMRTbell sequencing templates yield multiple, overlapping subreads of both forward and reverse strands of the insert, which are then assembled in silico with the primary consensus core algorithm into circular consensus sequence (CCS), eliminating randomly distributed pulse-read sequencing errors (Supplemental Table S1). Note also that although a single enzymatic misincorporation is indicated for completeness, such errors occur during polymerization with a frequency that is several orders of magnitude less than photonic miscall errors and therefore do not contribute to the error profile.
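
The two kinetic quantities defined in this caption can be made concrete with a short sketch. It assumes each pulse is available as a hypothetical `(start, end)` time pair in arbitrary units; that representation and the function name `pulse_kinetics` are illustrative choices, not the instrument's actual data format.

```python
from typing import List, Tuple


def pulse_kinetics(
    pulses: List[Tuple[float, float]]
) -> Tuple[List[float], List[float]]:
    """Return (pulse widths, inter-pulse distances) for ordered pulses.

    Each pulse is an assumed (start, end) pair. PW is the duration of a
    pulse; IPD is the gap from the end of one pulse to the start of the
    next, so there is one fewer IPD than there are pulses.
    """
    widths = [end - start for start, end in pulses]
    ipds = [
        pulses[i][0] - pulses[i - 1][1]  # start of current - end of previous
        for i in range(1, len(pulses))
    ]
    return widths, ipds


if __name__ == "__main__":
    example = [(0.0, 0.5), (1.0, 1.25), (2.0, 2.5)]
    pw, ipd = pulse_kinetics(example)
    print("PW:", pw)
    print("IPD:", ipd)
```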
Figure 2.
SMRT sequencing of short CGG repeats. (A) Sequence alignment of representative reads from a library of plasmid-derived FMR1 sequence with nominally 36 CGG repeats. Three major CGG-repeat size species are observed. Flanking and CGG-repeat regions are delineated by vertical tick marks. (B) Frequency of sequence lengths in the top 1000 reads (by predicted quality) plotted by region as indicated. Three major peaks observed in the repeats (red) correspond to 34, 35, and 36 repeats as seen in A. Both the left (green broken line) and right (blue) flanking sequence regions are uniform. (C) Accuracy by alignment to reference of each region of the insert increases with each successive pass of consensus coverage, saturating after four subreads for the flanking regions. Accuracy of the reads within the CGG-repeat region has improved through the use of reference sequences corresponding to the individual lengths within the distribution (see Supplemental Fig. S2).
Figure 3.
SMRT sequencing of a mid-premutation CGG-repeat expansion (approximately 95 CGG repeats). (A) Sequence alignment of representative CCS reads from a library of plasmid-generated FMR1 sequence; note that the original construct was generated from PCR-amplified genomic DNA followed by bacterial clonal selection. Sporadic single-base additions and deletions result from comparatively lower CCS coverage than the smaller repeat library; upper and lower sets represent sample CCS reads from the main peak at ∼280 nucleotides and from the smaller, broad distribution, respectively; horizontal lines indicate the CGG-repeat regions. (B) Expanded view of the transition from flanking sequence into the CGG repeats. An AGG repeat (boxed) is unambiguously recognized in all reads, demonstrating the utility in genotyping polymorphic CGG-repeat interruptions. (C) Frequency distribution of sequence lengths in the top 1000 reads plotted by region. A major peak is observed in the repeats (red), with minor peaks generally corresponding to units of single repeats, and a spread of shorter fragments produced by bacterial deletion of the CGG repeats. Both left (green broken line) and right (blue) flanking sequence regions are uniform.
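
Genotyping an AGG interruption, as described in panel B, amounts to finding triplets in the repeat tract that differ from CGG. The sketch below is a simplified illustration and not the authors' pipeline: it assumes the tract has already been excised from the flanking sequence, is read in frame, and contains no insertions or deletions. The function name `find_interruptions` is hypothetical.

```python
from typing import List, Tuple


def find_interruptions(tract: str, unit: str = "CGG") -> List[Tuple[int, str]]:
    """List (0-based triplet index, triplet) for every non-`unit` triplet.

    Assumes an in-frame, indel-free repeat tract; real reads would need
    alignment first.
    """
    step = len(unit)
    if len(tract) % step != 0:
        raise ValueError("tract length must be a multiple of the repeat unit")
    triplets = [tract[i:i + step] for i in range(0, len(tract), step)]
    return [(i, t) for i, t in enumerate(triplets) if t != unit]


if __name__ == "__main__":
    # Toy tract: three CGG units, one AGG interruption, two more CGG units.
    toy = "CGG" * 3 + "AGG" + "CGG" * 2
    print(find_interruptions(toy))
```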
Figure 4.
SMRT sequencing of a full mutation allele of (nominally) 750 CGG repeats. (A) Representative CCS sequences from a library of PCR-amplified FMR1 genomic DNA. These reads are the result of reading through >2 kb of CGG repeats at least three times. (B) Size-corrected distribution of sequence lengths in the sequenced library plotted by region. PCR amplification creates a broad distribution of repeat sizes with a mode at 720 repeats.
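
The ">2 kb" figure follows from the repeat counts by simple arithmetic, since each CGG unit is three nucleotides long. A minimal check, with the helper name `repeat_tract_length` chosen here purely for illustration:

```python
def repeat_tract_length(n_repeats: int, unit_length: int = 3) -> int:
    """Length in nucleotides of a tract of n trinucleotide repeats."""
    return n_repeats * unit_length


if __name__ == "__main__":
    nominal = repeat_tract_length(750)  # nominal allele size
    mode = repeat_tract_length(720)     # mode of the PCR distribution
    print(nominal, mode)                # both exceed 2000 nt
    # At least three passes over the nominal tract:
    print(3 * nominal)
```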
Figure 5.
Analysis of the SD in read length for CGG-repeat-containing FMR1 alleles. (A) Mean SD comparison between inter- and intramolecular subreads for the two plasmid- (underlined tick labels) and PCR-generated libraries. Higher intermolecular (versus intramolecular) SDs are apparent only for the repeat region, consistent with the presence of complex populations of repeat sizes. (B,C) Comparisons of CGG-repeat size distributions for plasmid- and PCR-generated CGG-repeat–containing DNA for (B) normal (∼30 CGG repeats) and (C) premutation (∼100 CGG repeats). Broader distributions for PCR-generated fragments reflect errors associated with PCR amplification of CGG-repeat elements.
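
One way to read the inter- versus intramolecular comparison in panel A is sketched below. The definitions used here are assumptions, since the caption does not spell them out: intramolecular SD is taken as the mean of the per-molecule SDs across each molecule's subreads, and intermolecular SD as the SD of the per-molecule mean lengths. The molecule names and lengths are made-up toy values.

```python
import statistics
from typing import Dict, List, Tuple


def length_sds(subread_lengths: Dict[str, List[int]]) -> Tuple[float, float]:
    """Return (mean intramolecular SD, intermolecular SD) of region lengths.

    Assumed definitions (not taken from the paper): each molecule needs at
    least two subreads, and at least two molecules are required.
    """
    intra = statistics.mean(
        statistics.stdev(lengths) for lengths in subread_lengths.values()
    )
    inter = statistics.stdev(
        statistics.mean(lengths) for lengths in subread_lengths.values()
    )
    return float(intra), float(inter)


if __name__ == "__main__":
    # Toy library: each molecule is read consistently, but molecules differ
    # in repeat number, so only the intermolecular SD is non-zero.
    toy = {"m1": [34, 34, 34], "m2": [36, 36, 36], "m3": [35, 35, 35]}
    print(length_sds(toy))
```

With data of this shape, a population of mixed repeat sizes inflates the intermolecular SD while leaving the intramolecular SD small, which is the pattern the caption describes for the repeat region.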
Figure 6.
Time-domain analysis with mean IPD values faceted by base, colored according to the strand being synthesized (CGG, red; GCC, blue), and synchronized by aligned template position. (Error bars) SD from 200 reads. Vertical black lines demarcate the start and end of the repeat region per sample. The IPD, as illustrated in Figure 1, is the time interval from the end of the previous incorporation pulse to the start of the current incorporation pulse. (A,C) 36-mer sample shows an increased G IPD inside the repeat region only for the GCC strand. (B,D) 95-mer sample with the same increased G IPD only for the GCC strand, with a dip localized to an AGG interruption (arrow). (C,D) Expanded view of the start of the repeat region for both 36-mer (C) and 95-mer (D) reveals that the IPD increase begins at the fourth CGG repeat.
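
The per-position averaging behind this kind of plot can be sketched as a group-by over aligned observations. The record layout `(strand, template_position, ipd)` and the function name `mean_ipd_by_position` are hypothetical; the sketch only illustrates computing a mean and SD per strand and position, not the authors' analysis code.

```python
import statistics
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Key = Tuple[str, int]  # (strand label, aligned template position)


def mean_ipd_by_position(
    records: Iterable[Tuple[str, int, float]]
) -> Dict[Key, Tuple[float, float]]:
    """Group IPD values by (strand, position) and return (mean, SD) per group.

    SD is the sample standard deviation; groups with a single observation
    get an SD of 0.0.
    """
    groups: Dict[Key, List[float]] = defaultdict(list)
    for strand, position, ipd in records:
        groups[(strand, position)].append(ipd)
    summary = {}
    for key, values in groups.items():
        sd = statistics.stdev(values) if len(values) > 1 else 0.0
        summary[key] = (statistics.mean(values), sd)
    return summary


if __name__ == "__main__":
    # Toy observations at one template position on each strand.
    toy = [("GCC", 10, 1.0), ("GCC", 10, 3.0), ("CGG", 10, 0.5), ("CGG", 10, 0.5)]
    for key, (mean, sd) in sorted(mean_ipd_by_position(toy).items()):
        print(key, round(mean, 3), round(sd, 3))
```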