WO2025054389A1 - Identification of methylated cytosine using landmarks - Google Patents

Identification of methylated cytosine using landmarks

Info

Publication number
WO2025054389A1
Authority
WO
WIPO (PCT)
Prior art keywords
dna
polymerase
methylcytosine
composition
lesion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/US2024/045481
Other languages
French (fr)
Inventor
Jeffrey Fisher
Boyan Boyanov
Egor DOLZHENKO
Seth MCDONALD
Ali ASADI
Eric Brustad
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Illumina Inc
Original Assignee
Illumina Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Illumina Inc

Abstract

A method for detecting 5'-methylcytosine and/or 5'-hydroxymethylcytosine in a target DNA is disclosed. Such methods comprise: converting 5'-methylcytosine and/or 5'-hydroxymethylcytosine to 5'-formylcytosine and/or 5'-carboxylcytosine to produce converted DNA; treating the converted DNA with a coupling reagent to attach a functional group at the 5'-formyl and/or 5'-carboxyl group to produce a coupled DNA; amplifying the coupled DNA so as to introduce a discrepant region at the site of the functional group to produce discrepant DNA; and sequencing the discrepant DNA to identify the discrepant region that deviates from a reference sequence, thereby detecting 5'-methylcytosine and/or 5'-hydroxymethylcytosine in the target DNA. In particular embodiments, the DNA is amplified with at least two distinct polymerases. At least one of the polymerases is a processive polymerase. At least one of these polymerases is a lesion polymerase. The lesion polymerase introduces the discrepant region. The attached functional group causes the processive polymerase to release from the DNA.

Description

Identification of Methylated Cytosine using Landmarks
TECHNICAL FIELD
[0001] This disclosure relates to nucleotide base identification, specifically to the identification of methylated cytosine.
BACKGROUND
[0002] The primary DNA sequence of the four-letter alphabet G, C, A, and T forms the genetic information of life on earth. Chemical modifications of DNA bases do not change the underlying sequence, but instead carry an extra layer of information. 5-Methylcytosine (5mC), the first modified base to be discovered, is also the most studied, and it plays crucial roles in a broad range of biological processes from gene regulation to normal development.
[0003] It is well established that hypo- and hypermethylation of promoter regions can be a reliable marker for early cancer onset. The two established mapping approaches rely on converting either C (bisulfite/EM-seq) or 5mC (TAPS) into a different sequenceable base (typically U or T analogues, respectively) and then using deep sequencing and comparison to a non-converted sample to map the presence of 5mC and 5hmC bases.
[0004] Both methods are subject to similar limitations: significant sample loss during conversion, and the need to convert to another sequenceable base, which often requires harsh reagents and/or significantly limits the available chemistry space. In addition, methods that convert C also reduce genome complexity significantly (often denoted as the 3-base genome), leading to an increase in downstream computational burdens and enforcing a need for high concentrations of co-sequenced control DNA such as PhiX.
SUMMARY
[0005] Disclosed herein are methods for detecting 5'-methylcytosine and/or 5'-hydroxymethylcytosine in a target DNA. Such methods comprise: converting 5'-methylcytosine and/or 5'-hydroxymethylcytosine to 5'-formylcytosine and/or 5'-carboxylcytosine to produce converted DNA; treating the converted DNA with a coupling reagent to attach a functional group at the 5'-formyl and/or 5'-carboxyl group to produce a coupled DNA; amplifying the coupled DNA so as to introduce a discrepant region at the site of the functional group to produce discrepant DNA; and sequencing the discrepant DNA to identify the discrepant region that deviates from a reference sequence, thereby identifying the location of 5'-methylcytosine and/or 5'-hydroxymethylcytosine in the target DNA.
[0006] In particular embodiments, amplifying the coupled DNA comprises amplifying the DNA with at least two distinct polymerases. In certain of these embodiments, at least one of the polymerases is a processive polymerase. In certain of these embodiments, at least one of these polymerases is a lesion polymerase.
[0007] In particular embodiments, the lesion polymerase introduces the discrepant region.
[0008] In particular embodiments, the attached functional group in the coupled DNA causes the processive polymerase to release from the DNA. In certain of these embodiments, the functional group is a potassium oxoruthenate. In certain of these embodiments, the lesion polymerase attaches to the DNA where the processive polymerase released from the DNA. The lesion polymerase introduces replication errors over a window of bases which begins at a given 5-methylcytosine site.
[0009] In particular embodiments, if the rate of replication errors within a window associated with a cytosine is at or above a base call error rate threshold, then the cytosine is identified as methylated. In certain of these embodiments, the replication error rate threshold is above 0.35.
[00010] In particular embodiments, the 5-methylcytosine is converted to 5-methylformylcytosine before being converted to 5-methylcarboxylcytosine. In some of these embodiments, the 5-methylcytosine is converted to 5-methylformylcytosine and then to 5-methylcarboxylcytosine by an enzyme from the class defined by the EC number 1.14.11.-.
[00011] In particular embodiments, the functional group is attached via benzotriazol-1-yloxytripyrrolidinophosphonium hexafluorophosphate. In other embodiments, the functional group is attached via any of the reagents from the group consisting of benzotriazol-1-yloxytris(dimethylamino)phosphonium hexafluorophosphate (BOP), benzotriazol-1-yloxytripyrrolidinophosphonium hexafluorophosphate (PyBOP), (7-azabenzotriazol-1-yloxy)tripyrrolidinophosphonium hexafluorophosphate (PyAOP), bromo-tris-pyrrolidino-phosphonium hexafluorophosphate (PyBrOP), and bis(2-oxo-3-oxazolidinyl)phosphinic chloride (BOP-Cl).
[00012] In particular embodiments, the functional group is a fluorescent probe.
[00013] In particular embodiments, the sequencing is SBS sequencing.
[00014] In particular embodiments, the reference DNA is the original, unconverted, target DNA. In certain of these embodiments, the DNA is a genome.
[00015] In a second aspect, the disclosure provides a first in vitro or ex vivo composition comprising: target DNA comprising functional groups at the site of the 5'-methylcytosine and/or 5'-hydroxymethylcytosine that have been converted; a processive polymerase; and a lesion polymerase.
[00016] In another aspect the disclosure provides a kit for detecting 5'-methylcytosine and/or 5'-hydroxymethylcytosine in a target DNA, the kit comprising: an enzyme for converting 5'-methylcytosine and/or 5'-hydroxymethylcytosine to 5'-formylcytosine and/or 5'-carboxylcytosine to produce converted DNA; a reaction agent for treating the converted DNA with a coupling reagent to attach a functional group at the 5'-formyl group of the 5'-formylcytosine and/or the 5'-carboxyl group of the 5'-carboxylcytosine to produce a coupled DNA.
[00017] In particular embodiments, the enzyme for converting 5'-methylcytosine and/or 5'-hydroxymethylcytosine to 5'-formylcytosine and/or 5'-carboxylcytosine is from the family having enzymatic activity defined by EC class number 1.14.11.-.
[00018] In other particular embodiments, the enzyme is the TET protein.
[00019] In particular embodiments, the enzyme for converting 5'-methylcytosine and/or 5'-hydroxymethylcytosine to 5'-formylcytosine and/or 5'-carboxylcytosine is a potassium oxoruthenate.
BRIEF DESCRIPTION OF THE DRAWINGS
[00020] The following drawings are provided to illustrate certain embodiments described herein. The drawings are merely illustrative and are not intended to limit the scope of claimed inventions and are not intended to show every potential feature or embodiment of the claimed inventions. The drawings are not necessarily drawn to scale; in some instances, certain elements of the drawing may be enlarged with respect to other elements of the drawing for purposes of illustration.
[00021] Figure 1 is the conversion pathway of 5-methylcytosine to 5-carboxylcytosine.
[00022] Figure 2 is a depiction of the PyBOP coupling reaction for attaching a functional group to the carboxyl group of the 5-carboxylcytosine.
[00023] Figure 3 is a depiction of the reaction pathway in which a K2RuO4-treated oligo is labeled with oxyamines or hydrazines to block a natural polymerase.
[00024] Figure 4 is a depiction of one functional group that can be attached.
[00025] Figure 5 is the workflow for identifying 5mC sites.
[00026] Figure 6 is a graph showing the true positive rate corresponding to different values of the p_error and t_meth parameters, assuming that the size of the window is set to 10.
[00027] Figure 7 is a nucleotide sequence of the STK11 gene.
[00028] Figure 8 is a depiction of a possible sequence of the STK11 gene following conversion of the 5-methylcytosine and the addition of a landmark functional group.
DETAILED DESCRIPTION
[00029] The following description recites various aspects and embodiments of the inventions disclosed herein. No particular embodiment is intended to define the scope of the invention. Rather, the embodiments provide non-limiting examples of various compositions, and methods that are included within the scope of the claimed inventions. The description is to be read from the perspective of one of ordinary skill in the art. Therefore, information that is well known to the ordinarily skilled artisan is not necessarily included.
Definitions
[00030] The following terms and phrases have the meanings indicated below, unless otherwise provided herein. This disclosure may employ other terms and phrases not expressly defined herein. Such other terms and phrases shall have the meanings that they would possess within the context of this disclosure to those of ordinary skill in the art. In some instances, a term or phrase may be defined in the singular or plural. In such instances, it is understood that any term in the singular may include its plural counterpart and vice versa, unless expressly indicated to the contrary.
[00031] As used herein, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. For example, reference to “a substituent” encompasses a single substituent as well as two or more substituents, and the like.
[00032] As used herein, “for example,” “for instance,” “such as,” or “including” are meant to introduce examples that further clarify more general subject matter. Unless otherwise expressly indicated, such examples are provided only as an aid for understanding embodiments illustrated in the present disclosure and are not meant to be limiting in any fashion. Nor do these phrases indicate any kind of preference for the disclosed embodiment.
[00033] As used herein “converted DNA” is meant to refer to DNA where enzymatic or chemical reactions have altered any 5'-methylcytosine and/or 5'-hydroxymethylcytosine to 5'-formylcytosine and/or 5'-carboxylcytosine. One method for conversion is through the action of an enzyme from EC class number 1.14.11.-. Such enzymes are oxidoreductases that act on paired donors with incorporation or reduction of molecular oxygen. Any oxygen thus incorporated may be derived from sources other than O2. The enzyme further utilizes 2-oxoglutarate as one donor, and incorporates one atom of oxygen into each donor. One example of such an enzyme is methylcytosine dioxygenase. More specifically, the ten-eleven translocation (TET) family of methylcytosine dioxygenases is used.
[00034] As used herein “coupled DNA” is meant to refer to DNA on which the methylcytosine has been replaced with formylcytosine or carboxylcytosine, and on which the formyl and/or carboxyl groups have undergone reactions to couple a functional group to those formyl and/or carboxyl groups. These functional groups become landmarks that can cause a polymerase or replisome to fall off the DNA.
[00035] As used herein, “processive polymerase” is meant to refer to a polymerase that has a high average number of nucleotide additions per association event. The processivity of a polymerase is the ability to add nucleotides without coming off the DNA, or the ability of DNA polymerase to carry out continuous DNA synthesis on a template DNA without frequent dissociation. It can be measured by the average number of nucleotides incorporated by a DNA polymerase in a single association/dissociation event. The higher the number of nucleotide additions per association event, the higher the processivity of the polymerase. Some polymerases such as Pol ε (epsilon) have high processivity without the aid of a clamp region to attach and remain connected to the DNA. Other polymerases, such as Pol α, Pol δ, and Pol […], with high processivity are part of a replisome that utilizes a clamp region to remain attached to the DNA. It is generally understood that accurate and efficient synthesis of genomic DNA in all three kingdoms of life is carried out by the multi-protein complex called the DNA replisome. DNA polymerase constitutes the core of the replisome and has the ability to synthesize complementary DNA in a 5'-to-3' direction on both leading and lagging strands. The replicative DNA polymerases possess remarkable fidelity that guarantees the faithful replication of genomic DNA. Most DNA polymerases are intrinsically low-processivity enzymes and produce short DNA product strands per binding event. The low processivity of most DNA polymerases alone is insufficient for the timely replication of a large DNA genome. Polymerases interact with the phosphate backbone and the minor groove of the DNA, so their interactions do not depend on the specific nucleotide sequence. The binding is largely mediated by electrostatic interactions between the DNA and the "thumb" and "palm" domains of the metaphorically hand-shaped DNA polymerase molecule. When the polymerase advances along the DNA sequence after adding a nucleotide, the interactions with the minor groove dissociate but those with the phosphate backbone remain more stable, allowing rapid re-binding to the minor groove at the next nucleotide.
[00036] Interactions with the DNA are also facilitated by DNA clamp proteins, which are multimeric proteins that completely encircle the DNA, with which they associate at replication forks. Their central pore is sufficiently large to admit the DNA strands and some surrounding water molecules, which allows the clamp to slide along the DNA without dissociating from it and without loosening the protein-protein interactions that maintain the toroid shape. When associated with a DNA clamp, DNA polymerase is dramatically more processive; without the clamp most polymerases have a processivity of only about 100 nucleotides. The interactions between the polymerase and the clamp are more persistent than those between the polymerase and the DNA.
[00037] As used herein, “lesion polymerase” is meant to refer to template-dependent DNA polymerases with low fidelity, low processivity, and no proofreading. Certain lesion polymerases can cope with completely foreign material in DNA, and some can overcome a chain of 12 methylene residues. (Lesion bypass DNA polymerases replicate across non-DNA segments, Ayelet Maor-Shoshani, Vered Ben-Ari, and Zvi Livneh, Proc. Natl. Acad. Sci. USA, December 1, 2003, 100 (25) 14760-14765, https://doi.org/10.1073/pnas.2433503100). Some examples of lesion polymerases include Family Y DNA polymerases such as Pol η (eta), Pol ι (iota), and Pol κ (kappa).
[00038] As used herein “discrepant region” is meant to refer to regions of the DNA that have been replicated incorrectly; that is, they are discrepant from the original DNA template.
[00039] As used herein “discrepant DNA” is meant to refer to DNA that contains discrepant regions.
[00040] DNA cytosine modifications are important epigenetic mechanisms that play crucial roles in a broad range of biological processes from gene regulation to normal development. 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) are by far the two most common epigenetic marks found in the mammalian genome. 5hmC is generated from 5mC by the ten-eleven translocation (TET) family of dioxygenases. TET can further oxidize 5hmC to 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC), which exist in much lower abundances in the mammalian genome compared with 5mC and 5hmC (10-fold to 100-fold lower than 5hmC). Aberrant DNA methylation and hydroxymethylation have been associated with various diseases and are well accepted hallmarks of cancer. Therefore, determination of the genomic distribution of 5mC and 5hmC is not only important for our understanding of development and homeostasis but is also invaluable for clinical applications.
[00041] One option currently used for base-level resolution and quantitative DNA methylation and hydroxymethylation analysis is bisulfite sequencing and its derived methods, including TET-assisted bisulfite sequencing (TAB-Seq) and oxidative bisulfite sequencing (oxBS). All these methods employ bisulfite treatment to convert unmethylated cytosine to uracil while leaving 5mC and/or 5hmC intact. Since PCR amplification of the bisulfite-treated DNA reads uracil as thymine, the modification of each cytosine can be inferred at single-base resolution, where C-to-T transitions provide the locations of the unmethylated cytosines. There are, however, two main drawbacks to bisulfite sequencing. First, bisulfite treatment is a harsh chemical reaction, which degrades up to 99% of the DNA. Some of the degradation is due to depyrimidination under the required acidic and thermal conditions. The degradation severely limits the utility of this method, especially if sample DNA quantities are low. Second, bisulfite sequencing relies on the complete conversion of unmodified cytosine to uracil, which is then read as thymine. Unmodified cytosine accounts for approximately 95% of the total cytosine in the human genome. Converting all these positions to thymine severely reduces sequence complexity, leading to poor sequencing quality, low mapping rates, uneven genome coverage and increased sequencing cost. In addition, bisulfite sequencing suffers from pronounced sequencing biases due to selective and context-specific DNA degradation.
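The loss of sequence complexity described above can be illustrated with a minimal Python sketch of an in-silico bisulfite conversion. The function name `bisulfite_convert`, the example fragment, and the methylated position are hypothetical and are not part of this disclosure; the sketch only mimics the read-out (unmethylated C read as T), not the chemistry.

```python
def bisulfite_convert(seq, methylated_positions):
    """Return the sequence as it would be read after bisulfite treatment and PCR.

    Unmethylated C is read as T; a C at a methylated position is left as C.
    Positions are 0-based indices into `seq` (an assumption of this sketch).
    """
    methylated = set(methylated_positions)
    out = []
    for i, base in enumerate(seq):
        if base == "C" and i not in methylated:
            out.append("T")  # unmethylated cytosine is read as thymine
        else:
            out.append(base)  # methylated C and all other bases are unchanged
    return "".join(out)


original = "ACGTCCGATCGA"  # hypothetical fragment
converted = bisulfite_convert(original, methylated_positions=[1])
print(original)
print(converted)
```

In the converted string almost every C has become T, which is the "3-base genome" effect that makes mapping harder.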
[00042] The method described in this disclosure converts 5mC and 5hmC and related sites into reactive moieties, then attaches side groups to the DNA backbone that serve as “landmarks” that affect DNA processivity. In addition to the common methylation on the 5-position of the cytosine (5mC), other types of modifications at the same position, such as 5-hydroxymethyl (5hmC), 5-formyl (5fC), and 5-carboxyl (5caC), are also significant. Lately, 5-hydroxymethyl (5hmC), a product of 5mC demethylation by the TET (Ten-Eleven Translocation) family enzymes, was shown to control many cellular and evolving processes, including the pluripotency of embryonic stem cells, neuron development, and tumorigenesis in mammals (Acc. Chem. Res. 2019, 52, 4, 1016-1024).
[00043] The TET enzymes readily oxidize 5mC and 5hmC to the final oxidation product 5caC in vitro. The TET enzymes convert 5mC to 5CaC as seen in Figure 1.
[00044] Through use of a combination of processive and lesion bypass polymerases we introduce landmarks at the location of the methylated sites, which can then be detected with sequencing and comparison to an unprocessed reference sequence. The advantages over bisulfite-based methylation analysis are no use of harsh reagents and no reduction in the size of the DNA alphabet or the diversity of the sample.
[00045] The described method relies on the conversion of 5-methylcytosine to 5-formylcytosine and 5-carboxylcytosine via established enzymatic pathways from the enzymes in EC class number 1.14.11.-. One such enzyme from this EC class is the TET protein. The enzymatic pathway of the TET protein is depicted in Figure 1. For example, the TET enzymes first oxidize the methyl group of 5mC, resulting in 5HmC. Further reacting the 5HmC with TET produces 5FoC. Further reacting the 5FoC with TET converts the 5FoC to 5CaC.
[00046] This is also the starting point for the TAPS and EM-seq methods. However, in TAPS and related approaches, the 5-CaC base is treated with pyridine and borane to convert 5-CaC to dihydrouracil (DHU), which is then replicated via PCR and sequenced. For EM-seq, subsequent to TET treatment, DNA is further processed by APOBEC, a cytosine deaminase, that converts non-modified Cs to T, leaving MeC adducts untouched.
[00047] As mentioned previously, the disadvantages of such approaches are: use of harsh and hazardous reagents, significant DNA damage and sample loss, and reduction of the size of the DNA alphabet to a 3-base genome.
[00048] Instead of attempting to convert 5-CaC into another sequenceable base, once the TET enzymes have converted the 5mC to 5CaC another reaction is performed to convert the carboxyl group into a larger functional group, referred to as a landmark, that will cause a processive polymerase to release from the DNA. The PyBOP ((benzotriazol-1-yloxy)tripyrrolidinophosphonium hexafluorophosphate) coupling reaction, or a similar reaction using e.g. BOP, PyAOP, PyBrOP, or BOP-Cl, is used to create these landmarks by coupling an additional moiety to the carboxyl group of 5-CaC in a one-step reaction with high yield.
[00049] In Figure 2, the PyBOP reaction couples a functional group to the carboxyl of the 5CaC. The groups R' and R'' can be a range of functional groups of varying size and polarity introduced as a landmark on the DNA backbone (e.g. 1-pyrenemethylamine hydrochloride as a fluorescent probe and/or steric hindrance designed for targeted analytical and assay performance). The landmark interrupts replication of the DNA by causing the processive polymerase to release from the DNA strand. The landmark also interferes with the processive polymerase reattaching to the DNA strand.
[00050] An alternative to the TET-mediated conversion is the use of potassium oxoruthenates to introduce the landmark functional group. The versatile potassium oxoruthenate (K2RuO4) (J. Am. Chem. Soc. 2018, 140, 41, 13190-13194) targets 5hmC, providing a comprehensive toolkit for direct and quantitative sequencing of the two major epigenetic modifications that yields both the exact 5-methylcytosine (5mC) and the exact 5-hydroxymethylcytosine (5hmC) content.
[00051] The original DNA is treated with the TET enzymes to convert 5mC to 5CaC and then treated with a landmark-attaching reagent such as PyBOP. Following attachment of the landmark, the DNA is run in a single PCR cycle with a processive polymerase and a lesion polymerase. The processive polymerase is a high-fidelity polymerase that will faithfully conserve the DNA sequence as it is replicated. When the processive polymerase runs into the landmarks attached to the DNA, the processive polymerase will be knocked off the DNA. The landmark will also interfere with the processive polymerase immediately reattaching to the DNA.
[00052] Once the processive polymerase has been knocked off the DNA by the landmark, the lesion polymerase will attach to the DNA. The lesion polymerase will be chosen for its size, to get in as close to the landmark as possible. Lesion polymerases do not adhere to the DNA as well as processive polymerases. There may be one or several lesion polymerases that extend the sequence of the replicated DNA after each landmark. Additionally, lesion polymerases are not as accurate in their replication of the DNA, and base call errors will be introduced to the replicated DNA. These base call errors will be helpful in identifying the location of 5mC.
[00053] A variety of lesion polymerases have properties compatible with the intended application. Lesion bypass DNA polymerases are known to replicate across non-DNA segments. Other lesion polymerases with limited processivity and high error rate include archaeal and prokaryotic members of the Y-family such as Dpo4 and DinB, respectively, as well as human enzymes such as pol η, pol ι, pol κ, and Rev1. These polymerases bypass lesions stemming from many carcinogenic reactions with environmental pollutants. These reactions typically result in base modifications with pyrene moieties, such as benzo[a]pyrene diol epoxide, aminopyrene, and acetylaminofluorene adducts. For this proposed application, we can exploit these polymerase properties and EDC coupling to attach related chemical moieties to 5-carboxylcytosine.
[00054] It is also possible to engineer the lesion bypass polymerase to accept unnatural nucleotides as a means of increasing the mutagenesis rate in the bypass zone, an approach similar to that of the Infinity assay for long read phasing.
[00055] Figure 5 is a diagram depicting the process of this method.
[00056] Step 5a: The 5mC sites of cfDNA are converted to 5-CaC via the reaction path from Fig. 1.
[00057] Step 5b: A subsequent coupling reaction (Figs. 2a and 2b) attaches an “appropriate” landmark to both strands in a CpG island.
[00058] Step 5c: PCR adapters are attached to the DNA fragments.
[00059] Step 5d: A copy cycle is performed with a mix of processive and lesion polymerases. The processive polymerase copies with high fidelity up to the landmark but stalls there and eventually falls off. A lesion polymerase extends for a few bases past the landmark and then falls off. Since lesion polymerases tend to be very error prone, they introduce a “discrepant” region in the copied DNA, indicated with the red bars on fig. 6d. The size and nature of this region will be discussed below.
[00060] Step 5e: subsequent PCR cycles replicate the discrepant region into subsequent copies.
[00061] Step 5f: sequencing adapters are attached and normal SBS sequencing is performed.
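The copy cycle of step 5d can be illustrated with a toy simulation. The following Python sketch is a simplification under stated assumptions, not the disclosed procedure: the copy is written in template coordinates (complementarity and strand direction are ignored), the processive polymerase is treated as error-free, and the lesion polymerase miscalls each base independently with probability `p_error` inside a fixed `window` starting at each landmark. The function name `copy_with_landmarks` and all parameter defaults are hypothetical.

```python
import random

BASES = "ACGT"


def copy_with_landmarks(template, landmark_positions, window=10, p_error=0.4, seed=0):
    """Copy `template`, introducing errors only in windows that start at landmarks.

    Outside the windows the copy is exact (idealized processive polymerase).
    Inside a window each base is replaced by a different base with
    probability `p_error` (idealized lesion polymerase).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    error_zone = set()
    for pos in landmark_positions:
        error_zone.update(range(pos, min(pos + window, len(template))))

    copied = []
    for i, base in enumerate(template):
        if i in error_zone and rng.random() < p_error:
            # Substitute a base that differs from the template base.
            copied.append(rng.choice([b for b in BASES if b != base]))
        else:
            copied.append(base)
    return "".join(copied)


template = "ACGT" * 10  # hypothetical 40-base fragment
copy = copy_with_landmarks(template, landmark_positions=[5])
mismatches = [i for i, (a, b) in enumerate(zip(template, copy)) if a != b]
print(mismatches)  # all mismatch positions fall inside the window 5..14
```

The mismatch positions produced this way play the role of the discrepant region that later PCR cycles (step 5e) would propagate into every copy.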
[00062] Subsequent analysis of the data then seeks out short segments with a high Q-factor and high concordance that deviate from the reference genome or an unprocessed sample. The onset of the region (in the 3'->5' direction) marks the location of the original 5mC site.
[00063] Consider a simple model in which the lesion polymerase introduces a discrepant region characterized by base call errors in an N-base window overlapping a given 5-mC site. If the rate of base call errors is p_error, then the total number of incorrect base calls in the window has a binomial distribution, Binomial(N, p_error).
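Written out, and assuming (as the binomial model implies) that base call errors at different positions in the window are independent, the number of incorrect base calls X in the window satisfies the following. The symbols X and k are introduced here for illustration only.

```latex
\[
P(X = k) = \binom{N}{k}\, p_{\mathrm{error}}^{\,k}\, \left(1 - p_{\mathrm{error}}\right)^{N-k},
\qquad k = 0, 1, \dots, N,
\]
\[
\mathbb{E}[X] = N\, p_{\mathrm{error}}.
\]
```

For example, with N = 10 and a base call error rate of 0.35, the expected number of incorrect base calls in the window is 3.5.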
[00064] A very simple methylation calling algorithm is used: If the number of base calling errors within a window associated with the C is at or above some threshold t_meth, then C is labeled methylated. Otherwise it is labeled unmethylated. No call is made if a given read does not cover the entire window.
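The calling rule above can be sketched in a few lines of Python. This is a minimal sketch under assumptions that are not in the disclosure: the read is already aligned to the reference and given as a string of the same length in the same coordinates, the window starts at `window_start`, and the function name `call_methylation` and its defaults (window of 10, threshold t_meth of 2) are hypothetical.

```python
def call_methylation(read, reference, window_start, window=10, t_meth=2):
    """Apply the threshold rule to one read.

    Returns True (methylated), False (unmethylated), or None (no call,
    because the read does not cover the entire window).
    """
    window_end = window_start + window
    if window_start < 0 or window_end > len(read) or window_end > len(reference):
        return None  # read does not cover the entire window: no call

    # Count positions in the window where the read differs from the reference.
    mismatches = sum(
        1
        for r, g in zip(read[window_start:window_end], reference[window_start:window_end])
        if r != g
    )
    return mismatches >= t_meth


reference = "ACGTACGTACGTACGT"  # hypothetical reference fragment
bases = list(reference)
bases[4], bases[6], bases[8] = "T", "A", "C"  # three mismatches in the window
read = "".join(bases)

print(call_methylation(read, reference, window_start=4))       # three mismatches
print(call_methylation(reference, reference, window_start=4))  # no mismatches
print(call_methylation(read, reference, window_start=10))      # window not covered
```

Returning None rather than False for uncovered windows keeps "no call" distinct from "unmethylated", matching the three outcomes described in the text.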
[00065] Figure 6 shows the true positive (TP) rate corresponding to different values of p_error and t_meth, assuming that the size of the window, N, is 10.
[00066] This Figure shows that it can be possible to call 5-mC with high accuracy when the base call error rate (p_error) is above 0.35.
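Under the binomial model, the true positive rate for a given threshold is the probability that at least t_meth of the N positions in the window are miscalled. The following Python sketch computes that tail probability; it is an illustration of the model as stated, not a reproduction of Figure 6, and the function name `true_positive_rate` and the grid of parameter values are assumptions chosen for the example.

```python
from math import comb


def true_positive_rate(p_error, t_meth, n=10):
    """P(X >= t_meth) for X ~ Binomial(n, p_error)."""
    return sum(
        comb(n, k) * p_error**k * (1 - p_error) ** (n - k)
        for k in range(t_meth, n + 1)
    )


# Tabulate the modeled TP rate over a small, arbitrary grid of parameters.
for p in (0.1, 0.2, 0.35, 0.5):
    row = ", ".join(f"t={t}: {true_positive_rate(p, t):.3f}" for t in range(1, 5))
    print(f"p_error={p}: {row}")
```

For a fixed window size, the modeled TP rate rises with p_error and falls as t_meth is raised, which is the trade-off the threshold has to balance against false positives.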
[00067] The rate of false positives that the above methylation calling algorithm entails was verified by applying it to a whole-genome DNA sequencing sample (where any mC called by the algorithm is a false positive). False positive error rates of 0.254, 0.084, 0.045, and 0.026 were observed, corresponding to t_meth parameter values of 1, 2, 3, and 4, respectively. These results indicate that two or more mismatches per 10 bp is a good indicator of a true 5-mC.
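One way to use the reported false positive rates is to pick the smallest threshold that keeps false positives under a chosen limit, since a smaller threshold gives a higher true positive rate under the binomial model. The Python sketch below does this with the four values reported above; the function name `smallest_threshold` and the example limits are hypothetical, and the selection rule itself is an assumption rather than part of the disclosed method.

```python
# False positive rates reported in the text, keyed by the t_meth threshold.
REPORTED_FP = {1: 0.254, 2: 0.084, 3: 0.045, 4: 0.026}


def smallest_threshold(max_fp):
    """Return the smallest t_meth whose reported FP rate is <= max_fp.

    Returns None if none of the reported thresholds meets the limit.
    """
    for t_meth in sorted(REPORTED_FP):
        if REPORTED_FP[t_meth] <= max_fp:
            return t_meth
    return None


print(smallest_threshold(0.10))  # limit of 10% false positives
print(smallest_threshold(0.05))  # limit of 5% false positives
```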
[00068] Note that in the above analysis all mismatches consistently present in 4 or more reads were removed since these likely correspond to real SNPs.
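The filtering step described above can be sketched as follows. This Python sketch assumes that each read's mismatches have already been extracted as a set of (position, observed base) pairs and that "consistently present" means the same pair occurs in at least `min_reads` reads; both the data layout and the function name `filter_recurrent_mismatches` are hypothetical.

```python
from collections import Counter


def filter_recurrent_mismatches(read_mismatches, min_reads=4):
    """Remove mismatches seen in `min_reads` or more reads (likely real SNPs).

    `read_mismatches` is a list with one set per read; each set holds
    (position, observed_base) tuples. Returns a new list of filtered sets.
    """
    counts = Counter(m for mismatches in read_mismatches for m in mismatches)
    recurrent = {m for m, n in counts.items() if n >= min_reads}
    return [mismatches - recurrent for mismatches in read_mismatches]


reads = [
    {(5, "T"), (9, "A")},
    {(5, "T")},
    {(5, "T"), (12, "G")},
    {(5, "T")},
]
# (5, "T") appears in four reads and is dropped; the others are kept.
print(filter_recurrent_mismatches(reads))
```

Applying this filter before counting mismatches per window keeps genuine variants from being mistaken for lesion polymerase errors.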
[00069] While this method looks to convert a CaC-EDC adduct into a sequenceable system for detection of MeC in DNA, this method is also amenable to CaC detection by other readouts. In particular, CaC modification with fluorescent amines will create an “imageable” label that can be read out to provide structural information for CaC localization. Typical structural genomics efforts make use of sequence-specific restriction enzymes to label DNA with fluorophores and identify the location of select DNA sequences. In this case, structural imaging can be used to detect the location of MeC and MeC hot-spots.
[00070] One challenge associated with this method is the need for the complete conversion of MeC to CaC. TET enzymes proceed through three oxidative intermediates (hydroxy, formyl and CaC) and in many cases the conversion does not go to completion. To overcome this burden, probes and roadblocks selective for formylcytosine (e.g. well-established alkoxyamine and hydrazide probes) can be used in conjunction with PyBOP and oxime coupling to modify both MeC and HMeC adducts.
[00071] All patents and published patent applications referred to herein are incorporated herein by reference. The invention has been described with reference to various specific and preferred embodiments and techniques. Nevertheless, it is understood that many variations and modifications may be made while remaining within the spirit and scope of the invention.
[00072] The following example illustrates the function of the landmarks in identifying regions of methylation. A potassium oxoruthenate may be used to convert 5-hydroxymethylcytosine to 5-formylcytosine, and then an oxyamine is coupled to the formyl as a functional group, as depicted in Figure 3. The coupled DNA is then replicated. For replication a processive polymerase, such as Pol ε, and a lesion polymerase, such as Pol η, are used. The Pol ε processive polymerase has higher fidelity and processivity, therefore the Pol ε processive polymerase will attach to the DNA and begin replication. When the leading edge of the Pol ε processive polymerase runs into the coupled oxyamine, the Pol ε processive polymerase will be knocked off the DNA. The Pol ε processive polymerase will attempt to reattach to the DNA but will be blocked from doing so by the oxyamine. The lesion polymerase Pol η will then attach to the DNA. The Pol η lesion polymerase is smaller than the Pol ε processive polymerase, which allows Pol η to attach to the DNA when the oxyamine is attached to the DNA. The Pol η lesion polymerase does not remain attached. The Pol η lesion polymerase is a low-processivity polymerase, so it will only add a few nucleotides before dropping off the DNA. When the Pol η lesion polymerase drops off the DNA, the Pol ε processive polymerase will attach and continue replication. This process will occur at each location where the oxyamine has been coupled to the DNA. Once the DNA has been replicated it is sequenced and analyzed. Each location where an oxyamine was added to the DNA becomes a part of a discrepant region.
[00073] The methods described herein can be used in conjunction with a variety of nucleic acid sequencing techniques.
[00074] In some embodiments, the process to determine the nucleotide sequence of a target nucleic acid can be an automated process. An exemplary embodiment includes sequencing-by-synthesis ("SBS") techniques. Where sequencing by synthesis is used in combination with a high-fidelity non-natural/unnatural base pair, a polymerase that is able to incorporate the high-fidelity non-natural/unnatural bases is used. Exemplary polymerases have greater than 50%, 60%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% and/or 99% fidelity during incorporation of the non-natural/unnatural bases during amplification of the repaired target polynucleotide.
[00075] Sequencing techniques can utilize nucleotide monomers that have one or more label moieties or those that lack a label moiety. Accordingly, incorporation events can be detected based on a characteristic of the label, such as fluorescence of the label; a characteristic of the nucleotide monomer, such as molecular weight or charge; a byproduct of incorporation of the nucleotide, such as release of pyrophosphate; or the like. In embodiments where two or more different nucleotides are present in a sequencing reagent, the different nucleotides can be distinguishable from each other, or alternatively, the two or more different labels can be indistinguishable under the detection techniques being used. For example, the different nucleotides present in a sequencing reagent can have different labels and they can be distinguished using appropriate optics as exemplified by the sequencing methods developed by Solexa (now Illumina, Inc.).
[00076] Other exemplary embodiments include pyrosequencing techniques. Pyrosequencing detects the release of inorganic pyrophosphate (PPi) as particular nucleotides are incorporated into the nascent strand (Ronaghi, M., Karamohamed, S., Pettersson, B., Uhlen, M. and Nyren, P. (1996) "Real-time DNA sequencing using detection of pyrophosphate release." Analytical Biochemistry 242(1), 84-9; Ronaghi, M. (2001) "Pyrosequencing sheds light on DNA sequencing." Genome Res. 11(1), 3-11; Ronaghi, M., Uhlen, M. and Nyren, P. (1998) "A sequencing method based on real-time pyrophosphate." Science 281(5375), 363; U.S. Pat. No. 6,210,891; U.S. Pat. No. 6,258,568 and U.S. Pat. No. 6,274,320, the disclosures of which are incorporated herein by reference in their entireties). In pyrosequencing, released PPi can be detected by being immediately converted to adenosine triphosphate (ATP) by ATP sulfurylase, and the level of ATP generated is detected via luciferase-produced photons. The nucleic acids to be sequenced can be attached to features in an array and the array can be imaged to capture the chemiluminescent signals that are produced due to incorporation of nucleotides at the features of the array. An image can be obtained after the array is treated with a particular nucleotide type (e.g. A, T, C, G or a non-natural/unnatural base (X)). Images obtained after addition of each nucleotide type will differ with regard to which features in the array are detected. These differences in the image reflect the different sequence content of the features on the array. However, the relative locations of each feature will remain unchanged in the images. The images can be stored, processed and analyzed.
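As a minimal sketch of how the per-nucleotide signals for one feature might be turned into a sequence, the following assumes that each signal has already been normalized so that one incorporated base gives an intensity of about 1.0. The function name `decode_flowgram` and the normalization are assumptions for illustration and are not taken from the cited references.

```python
def decode_flowgram(flow_order, signals):
    """Convert per-flow light signals for one feature into a base string.

    flow_order: nucleotide types in the order they were dispensed,
                e.g. "TACGTACG".
    signals:    one normalized intensity per flow; roughly 1.0 per
                incorporated base (an assumption of this sketch).
    """
    bases = []
    for base, signal in zip(flow_order, signals):
        # 0 means no incorporation; 2 means a homopolymer run of length 2.
        count = int(round(signal))
        bases.append(base * count)
    return "".join(bases)


if __name__ == "__main__":
    print(decode_flowgram("TACGTACG",
                          [1.02, 0.05, 1.9, 0.0, 0.1, 1.1, 0.0, 0.97]))
```

The decoded string is the nascent strand, so it is complementary to the template attached at the feature.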
[00077] Another exemplary type of cycle sequencing is accomplished by stepwise addition of reversible terminator nucleotides containing, for example, a cleavable or photobleachable dye label as described, for example, in WO 04/018497 and U.S. Pat. No. 7,057,026, the disclosures of which are incorporated herein by reference. This approach is being commercialized by Solexa (now Illumina Inc.), and is also described in WO 91/06678 and WO 07/123,744, each of which is incorporated herein by reference. The availability of fluorescently-labeled terminators in which both the termination can be reversed and the fluorescent label cleaved facilitates efficient cyclic reversible termination (CRT) sequencing. Polymerases can also be co-engineered to efficiently incorporate and extend from these modified nucleotides.
[00078] Preferably, in reversible terminator-based sequencing embodiments, the labels do not substantially inhibit extension under SBS reaction conditions. However, the detection labels can be removable, for example, by cleavage or degradation. Images can be captured following incorporation of labels into arrayed nucleic acid features. In particular embodiments, each cycle involves simultaneous delivery of four different nucleotide types to the array and each nucleotide type has a spectrally distinct label. Four images can then be obtained, each using a detection channel that is selective for one of the four different labels. Alternatively, different nucleotide types can be added sequentially and an image of the array can be obtained between each addition step. In such embodiments, each image will show nucleic acid features that have incorporated nucleotides of a particular type. Different features will be present or absent in the different images due to the different sequence content of each feature. However, the relative position of the features will remain unchanged in the images. Images obtained from such reversible terminator-SBS methods can be stored, processed and analyzed as known in the art. Following the image capture step, labels can be removed and reversible terminator moieties can be removed for subsequent cycles of nucleotide addition and detection. Removal of the labels after they have been detected in a particular cycle and prior to a subsequent cycle can provide the advantage of reducing background signal and crosstalk between cycles.
[00079] In particular embodiments some or all of the nucleotide monomers can include reversible terminators. In such embodiments, reversible terminators/cleavable fluorophore can include fluorophore linked to the ribose moiety via a 3' ester linkage (Metzker, Genome Res. 15:1767-1776 (2005), which is incorporated herein by reference). Other approaches have separated the terminator chemistry from the cleavage of the fluorescence label (Ruparel et al., Proc Natl Acad Sci USA 102: 5932-7 (2005), which is incorporated herein by reference in its entirety). Ruparel et al described the development of reversible terminators that used a small 3' allyl group to block extension, but could easily be deblocked by a short treatment with a palladium catalyst. The fluorophore was attached to the base via a photocleavable linker that could easily be cleaved by a 30 second exposure to long wavelength UV light. Thus, either disulfide reduction or photocleavage can be used as a cleavable linker. Another approach to reversible termination is the use of natural termination that ensues after placement of a bulky dye on a dNTP. The presence of a charged bulky dye on the dNTP can act as an effective terminator through steric and/or electrostatic hindrance. The presence of one incorporation event prevents further incorporations unless the dye is removed. Cleavage of the dye removes the fluor and effectively reverses the termination. Examples of modified nucleotides are also described in U.S. Pat. No. 7,427,673, and U.S. Pat. No. 7,057,026, the disclosures of which are incorporated herein by reference in their entireties.
[00080] Additional exemplary SBS systems and methods which can be utilized with the methods and systems described herein are described in U.S. Patent Application Publication No. 2007/0166705, U.S. Patent Application Publication No. 2006/0188901, U.S. Pat. No. 7,057,026, U.S. Patent Application Publication No. 2006/0240439, U.S. Patent Application Publication No. 2006/0281109, PCT Publication No. WO 05/065814, U.S. Patent Application Publication No. 2005/0100900, PCT Publication No. WO 06/064199, PCT Publication No. WO 07/010,251, U.S. Patent Application Publication No. 2012/0270305 and U.S. Patent Application Publication No. 2013/0260372, the disclosures of which are incorporated herein by reference in their entireties.
[00081] Some embodiments can utilize detection of four different nucleotides using fewer than four different labels. For example, SBS can be performed utilizing methods and systems described in the incorporated materials of U.S. Patent Application Publication No. 2013/0079232. As a first example, a pair of nucleotide types can be detected at the same wavelength, but distinguished based on a difference in intensity for one member of the pair compared to the other, or based on a change to one member of the pair (e.g. via chemical modification, photochemical modification or physical modification) that causes apparent signal to appear or disappear compared to the signal detected for the other member of the pair. As a second example, three of four different nucleotide types can be detected under particular conditions while a fourth nucleotide type lacks a label that is detectable under those conditions, or is minimally detected under those conditions (e.g., minimal detection due to background fluorescence, etc.). Incorporation of the first three nucleotide types into a nucleic acid can be determined based on presence of their respective signals and incorporation of the fourth nucleotide type into the nucleic acid can be determined based on absence or minimal detection of any signal. As a third example, one nucleotide type can include label(s) that are detected in two different channels, whereas other nucleotide types are detected in no more than one of the channels. The aforementioned three exemplary configurations are not considered mutually exclusive and can be used in various combinations. An exemplary embodiment that combines all three examples is a fluorescent-based SBS method that uses a first nucleotide type that is detected in a first channel (e.g. dATP having a label that is detected in the first channel when excited by a first excitation wavelength), a second nucleotide type that is detected in a second channel (e.g. dCTP having a label that is detected in the second channel when excited by a second excitation wavelength), a third nucleotide type that is detected in both the first and the second channel (e.g. dTTP having at least one label that is detected in both channels when excited by the first and/or second excitation wavelength) and a fourth nucleotide type that either lacks a label or is not, or is only minimally, detected in either channel (e.g. dGTP having no label).
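The two-channel scheme in the exemplary embodiment above can be sketched as a lookup on which channels show signal. The function name `call_base`, the normalized intensities and the `threshold` value are assumptions for illustration; the base assignments follow the example in the text (dATP in the first channel, dCTP in the second, dTTP in both, dGTP in neither).

```python
def call_base(ch1, ch2, threshold=0.5):
    """Call a base from two channel intensities for one feature and cycle.

    ch1, ch2:  normalized intensities in the first and second channel.
    threshold: hypothetical cutoff separating signal from background.
    """
    on1 = ch1 >= threshold
    on2 = ch2 >= threshold
    table = {
        (True, False): "A",   # first channel only
        (False, True): "C",   # second channel only
        (True, True): "T",    # both channels
        (False, False): "G",  # dark in both channels
    }
    return table[(on1, on2)]


if __name__ == "__main__":
    cycles = [(0.9, 0.1), (0.1, 0.8), (0.7, 0.9), (0.05, 0.02)]
    print("".join(call_base(a, b) for a, b in cycles))
```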
[00082] Another exemplary embodiment is a fluorescent-based SBS method that uses a first nucleotide type that is detected in a first channel (e.g. dATP having a label that is detected in the first channel when excited by a first excitation wavelength), a second nucleotide type that is detected in a second channel (e.g. dCTP having a label that is detected in the second channel when excited by a second excitation wavelength), a third nucleotide type that is detected in both the first and the second channel (e.g. dTTP having at least one label that is detected in both channels when excited by the first and/or second excitation wavelength), a fourth nucleotide type that either lacks a label or is not, or is only minimally, detected in either channel (e.g. dGTP having no label) and a fifth nucleotide type that is detected in the second channel when excited by a first excitation wavelength (e.g. dPaTP having a label that is excited by the first excitation wavelength, but that emits in the second channel).
[00083] Another exemplary embodiment is a fluorescent-based method that uses four channels, wherein a first nucleotide type emits in channel 1 (e.g. dATP), a second nucleotide type emits in channel 2 (e.g. dTTP), a third nucleotide type emits in channel 3 (e.g. dCTP), a fourth nucleotide type emits in channel 4 (e.g. dGTP) and a fifth nucleotide type does not emit in channels 1 through 4 (e.g. dPaTP); the fifth nucleotide type may contain no fluor or it may contain a fluor that emits in a fifth channel. For example, the non-natural/unnatural base may be detected using a dye set with an orthogonal excitation/emission characteristic, such as, but not limited to, a FRET dye.
[00084] Any combination of detection methods may be used to identify the four natural bases and the fifth non-natural/unnatural base.
[00085] Further, as described in the incorporated materials of U.S. Patent Application Publication No. 2013/0079232, sequencing data can be obtained using a single channel. In such so-called one-dye sequencing approaches, the first nucleotide type is labeled but the label is removed after the first image is generated, and the second nucleotide type is labeled only after a first image is generated. The third nucleotide type retains its label in both the first and second images, and the fourth nucleotide type remains unlabeled in both images.
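As an illustrative sketch only, the one-dye scheme can be decoded per feature from the pair of images: signal in the first image only, in the second image only, in both, or in neither. The function name `call_one_dye`, the `threshold` value and the generic labels "first" to "fourth" are assumptions; the text does not assign particular bases to the four nucleotide types.

```python
def call_one_dye(image1, image2, threshold=0.5):
    """Classify each feature by its signal pattern across two images.

    image1, image2: per-feature normalized intensities from the first and
                    second image of one cycle (same feature order).
    threshold:      hypothetical cutoff separating signal from background.
    """
    names = {
        (True, False): "first",    # labeled, label removed before image 2
        (False, True): "second",   # labeled only after image 1
        (True, True): "third",     # label retained in both images
        (False, False): "fourth",  # unlabeled in both images
    }
    return [names[(a >= threshold, b >= threshold)]
            for a, b in zip(image1, image2)]


if __name__ == "__main__":
    print(call_one_dye([0.9, 0.0, 0.8, 0.1], [0.1, 0.9, 0.7, 0.0]))
```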
[00086] Some embodiments can utilize sequencing by ligation techniques. Such techniques utilize DNA ligase to incorporate oligonucleotides and identify the incorporation of such oligonucleotides. The oligonucleotides typically have different labels that are correlated with the identity of a particular nucleotide in a sequence to which the oligonucleotides hybridize. As with other SBS methods, images can be obtained following treatment of an array of nucleic acid features with the labeled sequencing reagents. Each image will show nucleic acid features that have incorporated labels of a particular type. Different features will be present or absent in the different images due to the different sequence content of each feature, but the relative position of the features will remain unchanged in the images. Images obtained from ligation-based sequencing methods can be stored, processed and analyzed. Exemplary SBS systems and methods which can be utilized with the methods and systems described herein are described in U.S. Pat. No. 6,969,488, U.S. Pat. No. 6,172,218, and U.S. Pat. No. 6,306,597, the disclosures of which are incorporated herein by reference in their entireties.
[00087] Some embodiments can utilize nanopore sequencing (Deamer, D. W. & Akeson, M. "Nanopores and nucleic acids: prospects for ultrarapid sequencing." Trends Biotechnol. 18, 147-151 (2000); Deamer, D. and D. Branton, "Characterization of nucleic acids by nanopore analysis". Acc. Chem. Res. 35:817-825 (2002); Li, J., M. Gershow, D. Stein, E. Brandin, and J. A. Golovchenko, "DNA molecules and configurations in a solid-state nanopore microscope" Nat. Mater. 2:611-615 (2003), the disclosures of which are incorporated herein by reference in their entireties). In such embodiments, the target nucleic acid passes through a nanopore. The nanopore can be a synthetic pore or biological membrane protein, such as α-hemolysin. As the target nucleic acid passes through the nanopore, each base-pair can be identified by measuring fluctuations in the electrical conductance of the pore. (U.S. Pat. No. 7,001,792; Soni, G. V. & Meller, A. "Progress toward ultrafast DNA sequencing using solid-state nanopores." Clin. Chem. 53, 1996-2001 (2007); Healy, K. "Nanopore-based single-molecule DNA analysis." Nanomed. 2, 459-481 (2007); Cockroft, S. L., Chu, J., Amorin, M. & Ghadiri, M. R. "A single-molecule nanopore device detects DNA polymerase activity with single-nucleotide resolution." J. Am. Chem. Soc. 130, 818-820 (2008), the disclosures of which are incorporated herein by reference in their entireties). Data obtained from nanopore sequencing can be stored, processed and analyzed. In particular, the data can be treated as an image in accordance with the exemplary treatment of optical images and other images.
[00088] Some embodiments can utilize methods involving the real-time monitoring of DNA polymerase activity. Nucleotide incorporations can be detected through fluorescence resonance energy transfer (FRET) interactions between a fluorophore-bearing polymerase and γ-phosphate-labeled nucleotides as described, for example, in U.S. Pat. No. 7,329,492 and U.S. Pat. No. 7,211,414 (each of which is incorporated herein by reference) or nucleotide incorporations can be detected with zero-mode waveguides as described, for example, in U.S. Pat. No. 7,315,019 (which is incorporated herein by reference) and using fluorescent nucleotide analogs and engineered polymerases as described, for example, in U.S. Pat. No. 7,405,281 and U.S. Patent Application Publication No. 2008/0108082 (each of which is incorporated herein by reference). The illumination can be restricted to a zeptoliter-scale volume around a surface-tethered polymerase such that incorporation of fluorescently labeled nucleotides can be observed with low background (Levene, M. J. et al. "Zero-mode waveguides for single-molecule analysis at high concentrations." Science 299, 682-686 (2003); Lundquist, P. M. et al. "Parallel confocal detection of single molecules in real time." Opt. Lett. 33, 1026-1028 (2008); Korlach, J. et al. "Selective aluminum passivation for targeted immobilization of single DNA polymerase molecules in zero-mode waveguide nanostructures." Proc. Natl. Acad. Sci. USA 105, 1176-1181 (2008), the disclosures of which are incorporated herein by reference in their entireties). Images obtained from such methods can be stored, processed and analyzed.
[00089] Some SBS embodiments include detection of a proton released upon incorporation of a nucleotide into an extension product. For example, sequencing based on detection of released protons can use an electrical detector and associated techniques that are commercially available from Ion Torrent (Guilford, CT, a Life Technologies subsidiary) or sequencing methods and systems described in US 2009/0026082 A1; US 2009/0127589 A1; US 2010/0137143 A1; or US 2010/0282617 A1, each of which is incorporated herein by reference. Methods set forth herein for amplifying target nucleic acids using kinetic exclusion can be readily applied to substrates used for detecting protons. More specifically, methods set forth herein can be used to produce clonal populations of amplicons that are used to detect protons.
[00090] The above SBS methods can be advantageously carried out in multiplex formats such that multiple different target nucleic acids are manipulated simultaneously. In particular embodiments, different target nucleic acids can be treated in a common reaction vessel or on a surface of a particular substrate. This allows convenient delivery of sequencing reagents, removal of unreacted reagents and detection of incorporation events in a multiplex manner. In embodiments using surface-bound target nucleic acids, the target nucleic acids can be in an array format. In an array format, the target nucleic acids can be typically bound to a surface in a spatially distinguishable manner. The target nucleic acids can be bound by direct covalent attachment, attachment to a bead or other particle or binding to a polymerase or other molecule that is attached to the surface. The array can include a single copy of a target nucleic acid at each site (also referred to as a feature) or multiple copies having the same sequence can be present at each site or feature. Multiple copies can be produced by amplification methods such as, bridge amplification or emulsion PCR as described in further detail below.
[00091] The methods set forth herein can use arrays having features at any of a variety of densities including, for example, at least about 10 features/cm², 100 features/cm², 500 features/cm², 1,000 features/cm², 5,000 features/cm², 10,000 features/cm², 50,000 features/cm², 100,000 features/cm², 1,000,000 features/cm², 5,000,000 features/cm², or higher. In an embodiment, a low fidelity non-natural/unnatural base may be identified in combination with the four natural bases. As depicted in Fig. 5, the amplification of a strand containing a low fidelity non-natural/unnatural base will lead to the incorporation of one of the natural bases in the daughter strand. However, when the complementary strand is amplified, it will always incorporate a C at that location. After sequencing is complete, the sites with variants at a particular location are identified as being methylated cytosine. In addition, to aid in the alignment of locations with variants for a particular location, the double stranded nucleic acid may be fragmented and labeled at the 3’ and/or 5’ end with Unique Molecular Identifiers (UMIs), as is well known in the art, prior to the treatment of the double stranded nucleic acid with the glycosylase. Unique molecular indices or unique molecular identifiers (UMIs) are sequences of nucleotides applied to or identified in DNA molecules that may be used to distinguish individual DNA molecules from one another. See, e.g., Kivioja, Nature Methods 9, 72-74 (2012). UMIs may be sequenced along with the DNA molecules with which they are associated to determine whether the read sequences are those of one source DNA molecule or another. The term “UMI” is used herein to refer to both the sequence information of a polynucleotide and the physical polynucleotide per se.
[00092] Commonly, multiple instances of a single source molecule are sequenced. In the case of sequencing by synthesis using Illumina's sequencing technology, the source molecule may be PCR amplified before delivery to a flow cell.
[00093] UMIs are similar to bar codes, which are commonly used to distinguish reads of one sample from reads of other samples, but UMIs are instead used to distinguish one source DNA molecule from another when many DNA molecules are sequenced together. Because there may be many more DNA molecules in a sample than samples in a sequencing run, there are typically many more distinct UMIs than distinct barcodes in a sequencing run.
[00094] As mentioned, UMIs may be applied to or identified in individual DNA molecules. In some implementations, the UMIs may be applied to the DNA molecules by methods that physically link or bond the UMIs to the DNA molecules, e.g., by ligation or transposition through a polymerase, endonuclease, transposase, etc. These “applied” UMIs are therefore also referred to as physical UMIs. In some contexts, they may also be referred to as exogenous UMIs. The UMIs identified within source DNA molecules are referred to as virtual UMIs. In some contexts, virtual UMIs may also be referred to as endogenous UMIs.
[00095] Physical UMIs may be defined in many ways. For example, they may be random, pseudorandom or partially random, or nonrandom nucleotide sequences that are inserted in adapters or otherwise incorporated in source DNA molecules to be sequenced. In some implementations, the physical UMIs may be so unique that each of them is expected to uniquely identify any given source DNA molecule present in a sample. A collection of adapters is generated, each having a physical UMI, and those adapters are attached to fragments or other source DNA molecules to be sequenced, so that the individual sequenced molecules each have a UMI that helps distinguish them from all other fragments. In such implementations, a very large number of different physical UMIs (e.g., many thousands to millions) may be used to uniquely identify DNA fragments in a sample.
[00096] Of course, the physical UMI must have a sufficient length to ensure this uniqueness for each and every source DNA molecule. In some implementations, a less unique molecular identifier can be used in conjunction with other identification techniques to ensure that each source DNA molecule is uniquely identified during the sequencing process. In such implementations, multiple fragments or adapters may have the same physical UMI. Other information such as alignment location or virtual UMIs may be combined with the physical UMI to uniquely identify reads as being derived from a single source DNA molecule/fragment. In some implementations, adaptors include physical UMIs limited to a relatively small number of nonrandom sequences, e.g., 120 nonrandom sequences. Such physical UMIs are also referred to as nonrandom UMIs. In some implementations, the nonrandom UMIs may be combined with sequence position information and/or virtual UMIs to identify reads attributable to a same source DNA molecule. The identified reads may be combined to obtain a consensus sequence that reflects the sequence of the source DNA molecule as described herein. Using physical UMIs, virtual UMIs, and/or alignment locations, one can identify reads having the same or related UMIs or locations, which identified reads can then be combined to obtain one or more consensus sequences. The process for combining reads to obtain a consensus sequence is also referred to as “collapsing” reads.
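A minimal sketch of collapsing reads follows, assuming that reads sharing both a physical UMI and an alignment position derive from the same source DNA molecule and that reads in such a family have equal length. The function name `collapse_reads`, the tuple layout and the per-position majority vote are assumptions chosen for illustration, not a description of any particular pipeline.

```python
from collections import Counter, defaultdict


def collapse_reads(reads):
    """Group reads by (UMI, alignment position) and build a consensus.

    reads: iterable of (umi, position, sequence) tuples. Reads in one
           family are assumed to have the same length (sketch assumption).
    Returns a dict mapping (umi, position) to the consensus sequence.
    """
    families = defaultdict(list)
    for umi, position, sequence in reads:
        families[(umi, position)].append(sequence)

    consensus = {}
    for key, sequences in families.items():
        # Majority vote at each column across the reads in the family.
        consensus[key] = "".join(
            Counter(column).most_common(1)[0][0]
            for column in zip(*sequences)
        )
    return consensus


if __name__ == "__main__":
    example = [
        ("AAC", 100, "ACGT"),
        ("AAC", 100, "ACGT"),
        ("AAC", 100, "ACTT"),  # one read with an error at the third base
        ("GGT", 100, "TTTT"),  # same position, different source molecule
    ]
    print(collapse_reads(example))
```

Keying on both the UMI and the position reflects the case in which the physical UMI alone is not unique and alignment location is used to separate source molecules.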
[00097] In an exemplary embodiment, the non-natural/unnatural base read out may be marked as a cytosine for the purpose of mapping to a reference genome. The non-natural/unnatural base read out may be marked as a 5-meC or 5-hmeC, before or after mapping to a reference genome, and then analyzed to identify and/or visualize the methylome.
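One way this marking step could look is sketched below, assuming the non-natural/unnatural base appears in a read as the symbol `X`. The function name `mark_for_mapping` and the returned 1-based position list are assumptions for illustration.

```python
def mark_for_mapping(read, symbol="X"):
    """Replace the non-natural base symbol with C and record where it was.

    read:   read sequence in which `symbol` marks the non-natural base.
    symbol: hypothetical character used for the non-natural base call.
    Returns (sequence for mapping, 1-based positions of the replaced bases).
    """
    positions = [i + 1 for i, base in enumerate(read) if base == symbol]
    return read.replace(symbol, "C"), positions


if __name__ == "__main__":
    print(mark_for_mapping("ACXGTX"))
```

The recorded positions can be carried alongside the mapped read so that the sites can later be annotated as 5-meC or 5-hmeC.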
Examples
[00098] An option for seeing the functionality of the method described above follows. A sample of an STK11 gene is obtained. Figure 7 is a DNA nucleotide sequence of the STK11 gene. The STK11 gene encodes a serine/threonine kinase involved in regulating cell polarity and energy metabolism. Additionally, this protein may function as a tumor suppressor. Mutations in this gene have been associated with skin, pancreatic, and testicular cancers. Methylation/hydroxymethylation of cytosine is one indicator of possible cancers. A sample of STK11 can be used with the method. A potassium oxoruthenate may be used to convert 5-hydroxymethylcytosine to 5-formylcytosine. In the STK11 gene there are numerous cytosine nucleotides, any number of which could be methylated/hydroxymethylated. In the example described herein, several cytosine nucleotides have been chosen to represent the identification of the methylated/hydroxymethylated cytosine. In this example, the cytosines located at positions 56, 256, 754, and 1095 have been chosen to represent the methylated/hydroxymethylated cytosines. The reaction continues and an oxyamine is coupled to the formyl group as a functional group, as depicted in Figure 3. This will take place at the nucleotide sites 56, 256, 754, and 1095. The coupled DNA is then replicated. For replication, a processive polymerase, such as Pol ε, and a lesion polymerase are used. Pol ε has higher fidelity and processivity, and therefore it will attach to the DNA and begin replication. Pol ε will add bases to the replicating strand, progressing along the DNA strand of the STK11 gene until it reaches the oxyamine. The oxyamine interferes with the binding of Pol ε to the DNA, causing Pol ε to release from the DNA. Once Pol ε releases from the DNA, it will attempt to reattach to the DNA. The oxyamine will interfere with Pol ε reattaching to the DNA. While Pol ε is blocked, the lesion polymerase, Pol η, will attach to the DNA. Pol η has a lower processivity than Pol ε, so Pol η will add a few bases and then drop off the DNA. After Pol η drops off the DNA, Pol ε will attach. In relation to the cytosines located at positions 56, 256, 754, and 1095, the DNA will be replicated with high fidelity up to the point where Pol ε drops off the DNA. At the point where Pol ε drops off and Pol η begins replication, the DNA will be replicated with low fidelity, creating a discrepant region in which the nucleotides will not match the original sample. The discrepant region will be from two to several nucleotides long. The cytosine at position 56 will likely become a different nucleotide in the replicated DNA; additionally, the nucleotides in positions 57 and 58 will each likely be a different nucleotide. The discrepant nucleotides will continue for as long as Pol η is the polymerase doing the replication. Once Pol ε takes over replication, the nucleotides will return to being correctly replicated. This is true for each cytosine that was methylated/hydroxymethylated. Therefore, the cytosines at positions 256, 754, and 1095 would each also likely become a different nucleotide on the replicated strand, and a number of nucleotides following those methylated/hydroxymethylated cytosines would also likely be different. Following replication, the discrepant DNA is amplified and sequenced. Analysis of the sequenced DNA and comparison to the original sample DNA will highlight the discrepant regions. Those discrepant regions will then identify the methylated/hydroxymethylated cytosines and their positions.
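The comparison step can be sketched as follows, assuming the sequenced copy has already been aligned to the original sequence without insertions or deletions. The function names `find_discrepant_regions` and `candidate_sites` and the `max_gap` parameter (the number of matching bases tolerated inside one region, since a low-fidelity polymerase may insert the correct base by chance) are assumptions made for this sketch.

```python
def find_discrepant_regions(reference, replicated, max_gap=1):
    """Return 1-based (start, end) runs where the two sequences differ.

    reference:  original sample sequence.
    replicated: sequenced copy, aligned base-for-base to `reference`.
    max_gap:    hypothetical number of matching bases allowed inside a run.
    """
    if len(reference) != len(replicated):
        raise ValueError("sequences must be aligned to equal length")
    regions = []
    pairs = zip(reference, replicated)
    for pos, (ref_base, rep_base) in enumerate(pairs, start=1):
        if ref_base == rep_base:
            continue
        if regions and pos - regions[-1][1] <= max_gap + 1:
            regions[-1][1] = pos        # extend the current region
        else:
            regions.append([pos, pos])  # open a new region
    return [tuple(region) for region in regions]


def candidate_sites(reference, regions):
    """Keep region starts whose reference base is a cytosine."""
    return [start for start, _ in regions if reference[start - 1] == "C"]


if __name__ == "__main__":
    ref = "AAGCTTAGGCATTA"
    rep = "AAGTAAAGGTAGTA"
    found = find_discrepant_regions(ref, rep)
    print(found, candidate_sites(ref, found))
```

Filtering on a reference cytosine at the start of each region reflects the expectation that the discrepant run begins at the modified cytosine.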
[00099] Figure 8 is a nucleotide sequence showing one possible sequence configuration of the STK11 gene after addition of a landmark and DNA replication. The cytosines located at positions 56, 256, 754, and 1095 have been chosen to represent the methylated/hydroxymethylated cytosines. In relation to the cytosines located at positions 56, 256, 754, and 1095, the DNA will be replicated with high fidelity up to the point where Pol ε drops off the DNA. At the point where Pol ε drops off and Pol η begins replication, the DNA will be replicated with low fidelity, creating a discrepant region in which the nucleotides will not match the original sample. The discrepant region will be from two to several nucleotides long. The cytosine at position 56 will become a different nucleotide in the replicated DNA, represented by an X in Figure 8; additionally, the nucleotides in positions 57, 58, 59, and 60 will each be a different nucleotide, also designated as X in Figure 8. The cytosine at position 256 will become a different nucleotide in the replicated DNA, represented by an X in Figure 8; additionally, the nucleotides in positions 257, 258, 259, and 260 will each be a different nucleotide, also designated as X in Figure 8. The cytosine at position 754 will become a different nucleotide in the replicated DNA, represented by an X in Figure 8; additionally, the nucleotides in positions 755, 756, 757, and 758 will each be a different nucleotide, also designated as X in Figure 8. The cytosine at position 1095 will become a different nucleotide in the replicated DNA, represented by an X in Figure 8; additionally, the nucleotides in positions 1096, 1097, 1098, and 1099 will each be a different nucleotide, also designated as X in Figure 8.
[000100] An alternative option for seeing the functionality of the method described above follows. A sample of an STK11 gene is obtained. Figure 7 is a DNA nucleotide sequence of the STK11 gene. The STK11 gene encodes a serine/threonine kinase involved in regulating cell polarity and energy metabolism. Additionally, this protein may function as a tumor suppressor. Mutations in this gene have been associated with skin, pancreatic, and testicular cancers. Methylation/hydroxymethylation of cytosine is one indicator of possible cancers. A sample of STK11 can be used with the method. A potassium oxoruthenate may be used to convert 5 -hydromethylcytosine to 5 -formylcytosine. For example, in the STK11 gene there are numerous cytosine nucleotides any number of which could be methylated/hydroxymethylated. In the example described herein, several cytosine nucleotides have been chosen to represent the identification of the methylated/hydroxymethylated cytosine. In this example, the cytosines located at postions 56, 256, 754, and 1095 have been chosen to represent the methylated/hydroxymethylated cytosines. The reaction continues and an hydrazine is coupled to the formyl as a functional group, as depicted in Figure 3. This will take place at the nucleotide sites 56, 256, 754, and 1095. The coupled DNA is then replicated. For replication a processive polymerase, such as Pol 5 and a lesion polymerase are used. The Pol 5 will attach with higher fidelity and processivity, therefore the Pol 5 will attach to the DNA and begin replication. The Pol 5 will add base pairs to the replicating strand. As the Pol 5 progresses along the DNA strand of the STK11 gene until the Pol 5 reaches the hydrazine. The hydrazine interferes with the binding of the Pol 5 to the DNA causing the Pol 5 to release from the DNA. Once the Pol 5 releases from the DNA it will attempt to reattach to the DNA. The hydrazine will interfere with the Pol 5 reattaching to the DNA. 
As the Pol δ is interfered with, the lesion polymerase, Pol η, will attach to the DNA. Pol κ has a lower processivity than Pol δ, so the Pol κ will add a few bases and then drop off the DNA. After the Pol η drops off the DNA, the Pol δ will attach. In relation to the cytosines located at positions 56, 256, 754, and 1095, the DNA will be replicated with high fidelity up to the point where the Pol δ drops off the DNA. At the point where the Pol δ drops off and the Pol κ begins replication, the DNA will be replicated with low fidelity, creating a discrepant region in which the nucleotides will not match the original sample. The discrepant region will be from two to several nucleotides long. The cytosine at position 56 will likely become a different nucleotide in the replicated DNA; additionally, the nucleotides in positions 57 and 58 will each likely be a different nucleotide. The discrepant nucleotides will continue for as long as the Pol κ is the polymerase doing the replication. Once the Pol δ takes over replication, the nucleotides will return to being correctly replicated. This is true for each cytosine that was methylated/hydroxymethylated. Therefore, the cytosines at positions 256, 754, and 1095 would each also likely become a different nucleotide on the replicated strand, and a number of nucleotides following those methylated/hydroxymethylated cytosines would also likely be different. Following replication, the discrepant DNA is amplified and sequenced. The sequencing leads to analysis of the amplified DNA. Analysis of the sequenced DNA and comparison to the original sample DNA will highlight the discrepant regions. Those discrepant regions will then identify methylated/hydroxymethylated cytosines and their positions.
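By way of non-limiting illustration, the comparison of the sequenced DNA to the original sample DNA may be sketched in Python as follows. This is a minimal sketch under stated assumptions and not the analysis actually used: the replicated read is assumed to be already aligned to the reference, in the same orientation and without insertions or deletions. The function names and the parameters `max_gap` and `min_length` are hypothetical. `max_gap` tolerates a position inside the region where the low-fidelity polymerase inserted the correct base by chance, and `min_length=2` follows the statement that the discrepant region is from two to several nucleotides long.

```python
def find_discrepant_regions(reference, replicated, max_gap=1):
    """Group mismatching positions into runs.

    Returns a list of (start, end) tuples, 1-based and inclusive.
    Mismatches separated by at most `max_gap` matching bases are merged
    into the same region.
    """
    if len(reference) != len(replicated):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = [
        i + 1
        for i, (ref_base, new_base) in enumerate(zip(reference, replicated))
        if ref_base != new_base
    ]
    regions = []
    for pos in mismatches:
        if regions and pos - regions[-1][1] <= max_gap + 1:
            regions[-1][1] = pos  # extend the current region
        else:
            regions.append([pos, pos])  # open a new region
    return [tuple(region) for region in regions]


def call_modified_cytosines(reference, replicated, min_length=2):
    """Return 1-based positions of reference cytosines that begin a
    discrepant region of at least `min_length` bases."""
    calls = []
    for start, end in find_discrepant_regions(reference, replicated):
        if end - start + 1 >= min_length and reference[start - 1] == "C":
            calls.append(start)
    return calls


# Toy sequences (assumptions, not STK11): positions 6-8 differ in the copy.
reference = "ATGGACGTTAGCCATGCAAT"
replicated = "ATGGATCATAGCCATGCAAT"
print(find_discrepant_regions(reference, replicated))  # [(6, 8)]
print(call_modified_cytosines(reference, replicated))  # [6]
```

Requiring the region to begin at a cytosine is a simplification for this sketch; an analysis of real reads would also need to handle the case in which the landmarked cytosine itself happens to be copied correctly.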
[000101] An alternative option for seeing the functionality of the method described above follows. A sample of an STK11 gene is obtained. Figure 7 is a DNA nucleotide sequence of the STK11 gene. The STK11 gene encodes a serine/threonine kinase involved in regulating cell polarity and energy metabolism. Additionally, this protein may function as a tumor suppressor. Mutations in this gene have been associated with skin, pancreatic, and testicular cancers. Methylation/hydroxymethylation of cytosine is one indicator of possible cancers. A sample of STK11 can be used with the method. A potassium oxoruthenate may be used to convert 5-hydroxymethylcytosine to 5-formylcytosine. For example, in the STK11 gene there are numerous cytosine nucleotides, any number of which could be methylated/hydroxymethylated. In the example described herein, several cytosine nucleotides have been chosen to represent the identification of the methylated/hydroxymethylated cytosine. In this example, the cytosines located at positions 56, 256, 754, and 1095 have been chosen to represent the methylated/hydroxymethylated cytosines. The reaction continues and a pyrene hydrazine is coupled to the formyl as a functional group. The pyrene hydrazine is a fluorophore and may be used in fluorescent imaging to identify the methylated/hydroxymethylated sites. This will take place at the nucleotide sites 56, 256, 754, and 1095 as well.
[000102] In another exemplary embodiment, the present invention provides a kit for detecting methylated cytosine in a target DNA. The kit may include one or more of the following: an enzyme for converting 5mC or 5HmC to 5FoC and/or 5CaC; and a reaction agent for coupling a functional group to the 5FoC and/or 5CaC so as to create a landmark on the target DNA.

Claims

WHAT IS CLAIMED IS:
1. A method for detecting 5’-methylcytosine and/or 5’-hydroxymethylcytosine in a target DNA, the method comprising: converting 5’-methylcytosine and/or 5’-hydroxymethylcytosine to 5’-formylcytosine and/or 5’-carboxylcytosine to produce converted DNA; treating the converted DNA with a coupling reagent to attach a functional group at the 5’-formyl group of the 5’-formylcytosine and/or the 5’-carboxyl group of the 5’-carboxylcytosine to produce a coupled DNA; amplifying the coupled DNA so as to introduce a discrepant region at the site of the functional group to produce discrepant DNA; sequencing the discrepant DNA to identify the discrepant region that deviates from a reference sequence, thereby detecting 5’-methylcytosine and/or 5’-hydroxymethylcytosine in a target DNA.
2. The method of claim 1, wherein amplifying the coupled DNA comprises amplifying the DNA with at least two distinct polymerases.
3. The method of claim 5, wherein at least one of the polymerases is a processive polymerase.
4. The method of claim 3, wherein the processive polymerase is selected from ....
5. The method of claim 2, wherein at least one of the polymerases is a lesion polymerase.
6. The method of claim 5, wherein the lesion polymerase is selected from the group comprising: Dpo4, DinB, pol η, pol ι, pol κ, and Rev1.
7. The method according to claim 5, wherein the lesion polymerase introduces the discrepant region.
8. The method of claim 3, wherein the attached functional group causes the processive polymerase to release from the DNA.
9. The method of claim 5, wherein the lesion polymerase attaches where the processive polymerase released.
10. The method of claim 6, wherein the lesion polymerase introduces base call errors over a window of bases which overlap a given 5-methylcytosine site.
11. The method of claim 7, wherein if the number of base pair calling errors within a window associated with cytosine is at or above a base call error rate threshold then the cytosine is identified as methylated.
12. The method of claim 8, wherein the base pair error rate threshold is above 0.35.
13. The method of claim 1, wherein the 5-methylcytosine is converted to 5-methylformylcytosine before being converted to 5-methylcarboxylcytosine.
14. The method of claim 7, wherein the 5-methylcytosine is converted to 5-methylformylcytosine and then to 5-methylcarboxylcytosine by the TET protein.
15. The method of any of claims 1-7, and 11, wherein the functional group is attached via a benzotriazol-1-yloxy-tripyrrolidinophosphonium-hexafluorophosphate.
16. The method of any of claims 1-7 and 11-12, wherein the functional group is attached via any of the reagents from the group consisting of BOB, PyAOP, PyBrOP, BOP-Cl.
17. The method of any of claims 1-7 and 11-13, wherein the functional group is a fluorescent probe.
18. The method of any of claims 1-15, wherein the sequencing is SBS sequencing.
19. The method of any of claims 1-16, wherein the reference DNA is the original target DNA.
20. The method of any of claims 1-16, wherein the reference DNA is a genome.
21. A first in vitro or ex vivo composition comprising: target DNA comprising functional groups at the site of the 5’-methylcytosine and/or 5’-hydroxymethylcytosine that have been converted; a processive polymerase; and a lesion polymerase.
22. The composition of claim 22, wherein the processive polymerase is selected from ...
23. The composition of claim 22, wherein the lesion polymerase is selected from the group comprising: Dpo4, DinB, pol η, pol ι, pol κ, and Rev1.
24. The composition of claim 24, wherein the lesion polymerase introduces a discrepant region.
25. The composition of claim 22, wherein the attached functional group causes the processive polymerase to release from the DNA.
26. The composition of claim 26, wherein the lesion polymerase attaches where the processive polymerase released.
27. The composition of claim 27, wherein the lesion polymerase introduces base call errors over a window of bases which overlap a given 5-methylcytosine site.
28. The composition of claim 22, wherein the 5-methylcytosine is converted to 5-methylformylcytosine before being converted to 5-methylcarboxylcytosine.
29. The composition of claim 22, wherein the 5-methylcytosine is converted to 5-methylformylcytosine and then to 5-methylcarboxylcytosine by the TET protein.
30. The composition of any of claims 23-29, wherein the functional group is attached via a benzotriazol-1-yloxy-tripyrrolidinophosphonium-hexafluorophosphate.
31. The composition of any of claims 22-31, wherein the functional group is attached via any of the reagents from the group consisting of BOB, PyAOP, PyBrOP, BOP-Cl.
32. The composition of any of claims 22-33, wherein the functional group is a fluorescent probe.
33. A second in vitro or ex vivo composition comprising: the discrepant DNA produced by the method of claim 5.
34. The composition of claim 35, wherein base pairs introduced by the lesion polymerase are random insertions resulting in base pair calling errors.
35. The composition of claim 36, wherein a statistically significant number of the second composition is analyzed, and if the number of base pair calling errors within a window of base pairs associated with cytosine is at or above a base call error rate threshold then the cytosine is identified as methylated.
36. The composition of claim 37, wherein the base pair error rate threshold is above 0.35.
37. A kit for detecting 5’-methylcytosine and/or 5’-hydroxymethylcytosine in a target DNA, the kit comprising: an enzyme for converting 5’-methylcytosine and/or 5’-hydroxymethylcytosine to 5’-formylcytosine and/or 5’-carboxylcytosine to produce converted DNA; a reaction agent for treating the converted DNA with a coupling reagent to attach a functional group at the 5’-formyl group of the 5’-formylcytosine and/or the 5’-carboxyl group of the 5’-carboxylcytosine to produce a coupled DNA.
38. The kit of claim 37, wherein the enzyme for converting 5’-methylcytosine and/or 5’-hydroxymethylcytosine to 5’-formylcytosine and/or 5’-carboxylcytosine is from the family having enzymatic activity defined by EC class number 1.14.11.-.
39. The kit of claim 38, wherein the enzyme is the TET protein.
40. The kit of claim 37, wherein the enzyme for converting 5’-methylcytosine and/or 5’-hydroxymethylcytosine to 5’-formylcytosine and/or 5’-carboxylcytosine is a potassium oxoruthenate.
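
By way of non-limiting illustration, the base call error rate threshold recited in the claims above may be sketched in Python as follows. This is a sketch under stated assumptions: the error rate is taken here to be the fraction of mismatching base calls within the window, pooled over all aligned reads, which is only one possible reading of the claim language. The function names, the window length of 5, and the example threshold of 0.4 (an arbitrary value above 0.35) are hypothetical, and the reads are toy sequences assumed to be aligned to the reference without insertions or deletions.

```python
def window_error_rate(reference, reads, site, window=5):
    """Fraction of mismatching base calls in the window that starts at the
    1-based position `site`, pooled over all reads."""
    start = site - 1
    end = min(start + window, len(reference))
    errors = 0
    total = 0
    for read in reads:
        for i in range(start, min(end, len(read))):
            total += 1
            if read[i] != reference[i]:
                errors += 1
    return errors / total if total else 0.0


def is_methylated(reference, reads, site, threshold, window=5):
    """Identify the cytosine at `site` as methylated when the pooled error
    rate is at or above `threshold`."""
    return window_error_rate(reference, reads, site, window) >= threshold


# Toy data (assumptions, not STK11): three reads covering a 20-base reference.
reference = "ATGGACGTTAGCCATGCAAT"
reads = [
    "ATGGATCATAGCCATGCAAT",  # 3 errors in positions 6-10
    "ATGGAGGTAAGCCATGCAAT",  # 2 errors in positions 6-10
    "ATGGAACTTAGCCATGCAAT",  # 2 errors in positions 6-10
]
print(window_error_rate(reference, reads, site=6))             # 7/15, about 0.467
print(is_methylated(reference, reads, site=6, threshold=0.4))   # True
print(is_methylated(reference, reads, site=17, threshold=0.4))  # False
```

Pooling over many reads reflects the recitation that a statistically significant number of copies is analyzed; the number of reads needed is not specified here.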
Application PCT/US2024/045481 (priority date 2023-09-07; filing date 2024-09-06): Identification of methylated cytosine using landmarks. Status: Pending. Publication: WO2025054389A1 (en).

Applications Claiming Priority (2)

Application Number: US202363581162P; Priority Date: 2023-09-07; Filing Date: 2023-09-07
Application Number: US63/581,162; Date: 2023-09-07

Publications (1)

Publication Number: WO2025054389A1 (en); Publication Date: 2025-03-13

Family

ID=92895702

Family Applications (1)

Application Number: PCT/US2024/045481; Status: Pending; Publication: WO2025054389A1 (en); Priority Date: 2023-09-07; Filing Date: 2024-09-06; Title: Identification of methylated cytosine using landmarks

Country Status (1)

Country: WO; Link: WO2025054389A1 (en)

Legal Events

Code: 121
Title: EP: the EPO has been informed by WIPO that EP was designated in this application
Ref document number: 24776714
Country of ref document: EP
Kind code of ref document: A1

