WO2025125413A1 - DNA polymerases for the detection of epigenetic DNA marks - Google Patents

DNA polymerases for the detection of epigenetic DNA marks

Info

Publication number
WO2025125413A1
Authority
WO
WIPO (PCT)
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/EP2024/085898
Other languages
French (fr)
Inventor
Andreas Marx
Melanie HENKEL
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Universitaet Konstanz
Original Assignee
Universitaet Konstanz
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Universitaet Konstanz
Publication of WO2025125413A1

Abstract

The present invention relates to a mutated DNA polymerase derived from wild-type Thermus aquaticus (Taq) DNA polymerase, a method for the detection of 5-methylcytosine nucleotides (5mC) in a DNA molecule of interest using said DNA polymerase, and a kit comprising said DNA polymerase. Further, the present invention concerns related DNA polymerases of Sequence Family A carrying the corresponding mutations.

Description

DNA POLYMERASES FOR THE DETECTION OF EPIGENETIC DNA MARKS

FIELD OF THE INVENTION

The present invention relates to a mutated DNA polymerase derived from wild-type Thermus aquaticus (Taq) DNA polymerase, a method for the detection of 5-methylcytosine nucleotides (5mC) in a DNA molecule of interest using said DNA polymerase, and a kit comprising said DNA polymerase. Further, the present invention concerns related DNA polymerases of Sequence Family A carrying the corresponding mutations.

BACKGROUND OF THE INVENTION

5-methylcytosine (5mC) is the most common epigenetic modification within the mammalian genome. DNA methylation is predominantly found in the CpG dinucleotide and plays a crucial role in fundamental cellular processes, acting mainly through the silencing of gene expression. Dysregulation of methylation is associated with human diseases, especially cancer, and evaluation of aberrant methylation patterns is promising for clinical diagnostics. However, DNA polymerases do not adequately discriminate between 5mC and unmodified cytosine, so the methylation information is lost during PCR amplification or sequencing.

The most abundant nucleic acid modification in the human genome is 5-methylcytosine (5mC), representing 4% of all cytosines. The post-replicative addition of a methyl group to the 5-carbon atom of the nucleobase cytosine (C) is described as the process of DNA methylation, performed by DNA methyltransferase enzymes. In vertebrate genomes, this covalent modification occurs predominantly in a symmetrical pattern at CpG dinucleotides. DNA methylation is involved in both silencing and activation of gene expression, depending on the level and location of 5mC within the gene context. Depending on cell type and developmental stage, passive (replication-based) and active (enzymatic) demethylation allow dynamic changes in methylation patterns.
In fact, consistent as well as variable DNA methylation has a crucial impact on processes like X-inactivation, genomic imprinting, cellular development and differentiation. As a consequence, dysfunction and alterations in methylation patterns can be related to the onset of diverse human diseases, especially malignancy. On the one hand, hypermethylated promoter regions of tumor suppressor genes lead to repressed gene expression; on the other hand, global hypomethylation is associated with highly diverse transcriptional outcomes. Both are connected to tumorigenesis. Hence, hyper- and hypomethylation are considered promising biomarkers, and analysis of aberrant methylation events has great potential for cancer prognosis and diagnosis.

Sequencing-based methods are commonly applied for methylation profiling at single-base resolution. However, the methyl moiety does not directly affect Watson-Crick base pairing, which makes direct discrimination between 5mC and C during sequencing difficult. Some detection methods therefore rely on a chemical and/or enzymatic pre-treatment of the DNA sample. Here, 5mC sites can only be distinguished from unmodified cytosines after methylation-dependent conversion of the template DNA. Indeed, the most common strategy for 5mC detection is based on bisulfite treatment, which converts C to uracil (U), whereas 5mC remains mostly unchanged. The conversion is followed by amplification and sequencing, so that 5mC is read as C, while unmethylated cytosines are read as thymine (T). Although this method is well established and facilitates accurate mapping and quantification of 5mC sites, bisulfite sequencing has substantial drawbacks: the conversion of unmodified C to U reduces the complexity of DNA sequences and impairs downstream processes and sequencing data analysis.
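The bisulfite read-out described above can be illustrated with a short Python sketch. It is an idealized model only: it assumes complete conversion of every unmethylated C and full protection of every 5mC, which the text itself qualifies ("mostly unchanged"). The function name, the example template, and the methylated position are hypothetical and chosen for illustration.

```python
def bisulfite_readout(template, methylated_positions):
    """Return the bases read after idealized bisulfite conversion and amplification.

    Unmethylated C is converted to U and therefore read as T.
    5mC (0-based positions in `methylated_positions`) is still read as C.
    """
    read = []
    for index, base in enumerate(template):
        if base == "C" and index not in methylated_positions:
            read.append("T")  # C -> U -> read as T
        else:
            read.append(base)  # 5mC and all other bases are unchanged
    return "".join(read)


# Hypothetical template: the C at index 1 (a CpG site) is methylated.
template = "ACGTCCGA"
read = bisulfite_readout(template, methylated_positions={1})
print(read)  # ACGTTTGA
```

Only the methylated C survives as C in the read; every other C appears as T, which is the loss of sequence complexity named as a drawback above.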
Furthermore, harsh reaction conditions lead to fragmentation and degradation of up to 99% of sample DNA, which decreases validity and requires large amounts of starting material. Enzymatic bisulfite-free conversion techniques feature considerably milder reaction conditions, and the sample DNA stays largely intact after treatment. Still, conversion-based methods in general suffer from a number of disadvantages such as multiple process steps, adenine/thymine (AT)-rich sequence products and incomplete or incorrect conversion.

Considering this, long-read and conversion-free sequencing methods that feature direct 5mC sensing represent a promising alternative. Here, single-molecule real-time (SMRT) sequencing is of particular interest for direct 5mC profiling. SMRT sequencing captures the real-time kinetics of a DNA polymerase that incorporates fluorescently labelled nucleotides opposite an untreated DNA template. This enables the detection of methylation sites because the monitored elongation speed changes slightly when modified bases are processed. However, large amounts of unamplified template DNA are required to achieve sufficient read depth for 5mC detection. Moreover, although base modifications can be detected directly during sequencing, the DNA polymerase dynamics are only marginally changed, and the detection sensitivity is limited. In addition, sophisticated data analysis and algorithm training are needed for valid methylation profiling in both established and new sequence contexts. Nevertheless, SMRT sequencing highlights the great potential of utilising DNA polymerases for direct 5mC detection.

DNA polymerases are essential enzymes for the reliable replication and maintenance of the genetic information encoded in DNA. Polymerase specificity is described as the vital ability to discriminate between incorporating the complementary base and a mismatched base during primer elongation opposite a DNA template strand.
The mechanism of correct nucleotide incorporation results in the specific fidelity of DNA polymerases. Here, it is assumed that fidelity depends on several mechanistic checkpoints. In addition to correct hydrogen bonding between the template and the incoming 2'-deoxynucleoside-5'-triphosphate (dNTP) according to Watson-Crick base pairing, fidelity is also defined by minor groove hydrogen bonding, base stacking, solvation, and steric effects, as well as by the mismatch extension and proofreading efficiency of the enzyme. Because the methyl group of 5mC is oriented towards the major groove of the DNA double helix and does not interfere with the base hydrogen bonding at the Watson-Crick face, DNA polymerases do not discriminate between C and 5mC and retain their fidelity at methylated sites.

SUMMARY OF THE INVENTION

Known 5mC detection methods require multi-step DNA conversion treatments and/or extensive analysis of sequencing data to decode single 5mC bases. To overcome the drawbacks of current detection strategies, the present invention provides a DNA polymerase-mediated 5mC sensing approach. In particular, the engineering of a thermostable Thermus aquaticus-derived DNA polymerase variant with altered fidelity opposite 5mC is described. Using a screening-based evolution approach, a DNA polymerase that shows enhanced misincorporation opposite 5mC during DNA synthesis was identified. The DNA polymerase generates mutation signatures at methylated CpG sites, which allows direct and facile 5mC detection by next-generation sequencing without prior treatment of the DNA template. Thus, an engineered thermostable enzyme based on the N-terminally truncated form of the Thermus aquaticus (Taq) DNA polymerase (KlenTaq DNA polymerase, henceforth referred to as KTq) that discriminates methylated from unmethylated DNA by increased mismatch incorporation opposite 5mC during DNA synthesis is presented herein.
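The read-out principle summarized above, namely elevated error rates at methylated CpG positions after sequencing, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the data processing used by the applicant: the function names, the per-position count layout, the example numbers, and the threshold `min_rate` are all hypothetical, and only cytosines on the given strand are considered.

```python
def cpg_cytosines(reference):
    """Return 0-based positions of cytosines that are followed by G (CpG sites)."""
    return [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]


def candidate_5mc_sites(reference, mismatches, depth, min_rate=0.05):
    """Flag CpG cytosines whose mismatch rate reaches `min_rate`.

    `mismatches[i]` and `depth[i]` are read counts at reference position i.
    `min_rate` is an arbitrary illustrative threshold (assumption).
    """
    sites = []
    for i in cpg_cytosines(reference):
        if depth[i] == 0:
            continue  # no coverage, nothing to evaluate
        rate = mismatches[i] / depth[i]
        if rate >= min_rate:
            sites.append((i, rate))
    return sites


# Hypothetical amplicon with two CpG sites (positions 1 and 5).
reference = "ACGTTCGA"
depth = [1000] * len(reference)
mismatches = [2, 120, 3, 1, 2, 4, 1, 0]  # invented counts for illustration

sites = candidate_5mc_sites(reference, mismatches, depth)
print(sites)  # [(1, 0.12)]
```

In this invented example only the CpG cytosine at position 1 shows an elevated mismatch rate and is reported as a candidate 5mC site; the CpG at position 5 stays at background level.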
By generating site-specific mutations as a 5mC-dependent signature in the PCR product, the DNA polymerase variant is able to identify methylated CpG sites, e.g. in human genomic DNA. This enzymatic tool enables direct 5mC detection by reading elevated error rates at 5mC positions after next-generation sequencing (NGS) and data processing. Taken together, this strategy provides a new approach for 5mC profiling at single-base resolution without additional treatment prior to detection. Accordingly, the technical problem underlying the present invention is the provision of fast, easy, and sensitive means for the detection of 5mC in a DNA molecule of interest that do not require any pre-treatment of said DNA molecule. The solution to the above technical problem is achieved by the embodiments characterized in the claims.

In a first aspect, the present invention relates to a DNA polymerase that is derived from wild-type Thermus aquaticus (Taq) DNA polymerase and comprises the mutations N483K, E507K/A/R, S515K/N, K540N, A570K, V586G, and I614M/K with regard to the amino acid sequence of wild-type Taq DNA polymerase (SEQ ID NO: 1). In some embodiments, the DNA polymerase comprises an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 and includes said mutations. Alternatively, the DNA polymerase of the present invention can be derived from a fragment of the wild-type Thermus aquaticus (Taq) DNA polymerase and comprises the mutations N483K, E507K/A/R, S515K/N, K540N, A570K, V586G, and I614M/K with regard to the amino acid sequence of wild-type Taq DNA polymerase (SEQ ID NO: 1).
In some embodiments, the DNA polymerase of the present invention can comprise or consist of an amino acid sequence being at least 90% identical to or comprising or consisting of the amino acid sequence corresponding to amino acids 293 to 832 of SEQ ID NO: 1 (i.e., the fragment being known as KlenTaq DNA polymerase) including said mutations or being at least 90% identical to or comprising or consisting of the corresponding amino acid sequence as shown in SEQ ID NO: 2 including said mutations. In other embodiments, the DNA polymerase of the present invention can comprise or consist of the amino acid sequence being at least 90% identical to or comprising or consisting of an amino acid sequence corresponding to (i) amino acids 4 to 832 of SEQ ID NO: 1 (i.e., the fragment being known as AmpliTaq DNA polymerase), (ii) amino acids 279 to 832 of SEQ ID NO: 1 (i.e., the fragment being known as Klentaq1 DNA polymerase), or (iii) amino acids 290 to 832 of SEQ ID NO: 1 (i.e., the fragment being known as Stoffel fragment), including said mutations. In certain embodiments, the fragment may comprise or consist of the amino acid sequence corresponding to amino acids 293 to 832 of SEQ ID NO: 1 (known as KlenTaq DNA polymerase) including said mutations, or the corresponding amino acid sequence as shown in SEQ ID NO: 2 including said mutations. Further, the fragment may comprise or consist of the amino acid sequence corresponding to (i) amino acids 4 to 832 of SEQ ID NO: 1 (known as AmpliTaq DNA polymerase), (ii) amino acids 279 to 832 of SEQ ID NO: 1 (known as Klentaq1 DNA polymerase), or (iii) amino acids 290 to 832 of SEQ ID NO: 1 (known as Stoffel fragment), including said mutations. The expression “derived from wild-type Taq DNA polymerase” as used herein relates to the fact that the DNA polymerase of the present invention is substantially identical to wild-type Taq DNA polymerase, provided the above mutations are present. 
However, said expression also includes DNA polymerases whose amino acid sequence has one or more further amino acid substitutions, deletions or additions as compared to the amino acid sequence of wild-type Taq DNA polymerase, provided the above mutations are present and provided the DNA polymerase retains its DNA polymerase activity. In particular, the DNA polymerase of the present invention can comprise an amino acid sequence that has more than 70%, more than 80%, more than 85%, more than 90%, more than 92%, more than 94%, more than 96%, more than 97%, more than 98%, or more than 99% identity to SEQ ID NO: 1 or SEQ ID NO: 2, provided the above mutations are present. As indicated above, the amino acid sequence as shown in SEQ ID NO: 2, known as KlenTaq DNA polymerase, corresponds to amino acids 293 to 832 of SEQ ID NO: 1. More specifically, amino acids 1 to 540 of SEQ ID NO: 2 correspond to amino acids 293 to 832 of SEQ ID NO: 1, i.e., KlenTaq DNA polymerase is a C-terminal fragment of Taq DNA polymerase. Hence, in one embodiment, the DNA polymerase of the present invention comprises an amino acid sequence that is at least 90% identical to SEQ ID NO: 1 or to SEQ ID NO: 2 and includes said mutations. In another embodiment, the DNA polymerase comprises an amino acid sequence that is at least 94% identical to SEQ ID NO: 1 or to SEQ ID NO: 2 and includes said mutations. In another embodiment, the DNA polymerase comprises an amino acid sequence that is at least 98% identical to SEQ ID NO: 1 or to SEQ ID NO: 2 and includes said mutations. In certain embodiments, the DNA polymerase comprises or consists of the amino acid sequence as shown in SEQ ID NO: 1 or SEQ ID NO: 2 except that it includes said mutations. The expression “including said mutations” as used herein refers to the fact that the above mutations, i.e. 
the mutations N483K, E507K/A/R, S515K/N, K540N, A570K, V586G, and I614M/K with regard to the amino acid sequence of wild-type Taq DNA polymerase (SEQ ID NO: 1) are present in any case. As an example, the expression “comprising or consisting of the amino acid sequence as shown in SEQ ID NO: 1 including said mutations” as used herein refers to a DNA polymerase comprising or consisting of the amino acid sequence as shown in SEQ ID NO: 1, with the exception that in comparison to the amino acid sequence shown in SEQ ID NO: 1, the amino acid sequence of the DNA polymerase of the present invention comprises (i) a lysine (K) in position 483 instead of an asparagine (N) (N483K), (ii) a lysine (K), an alanine (A), or an arginine (R) in position 507 instead of glutamic acid (E) (E507K/A/R), (iii) a lysine (K) or an asparagine (N) in position 515 instead of a serine (S) (S515K/N), (iv) an asparagine (N) in position 540 instead of a lysine (K) (K540N), (v) a lysine (K) in position 570 instead of an alanine (A) (A570K), (vi) a glycine (G) in position 586 instead of a valine (V) (V586G), and (vii) a methionine (M) or a lysine (K) in position 614 instead of an isoleucine (I) (I614M/K). Thus, the DNA polymerase of the present invention comprises an amino acid sequence, wherein seven amino acid residues present in the corresponding wild-type DNA polymerase are exchanged to a specific other amino acid residue.
These seven amino acid residues are specified in correspondence to their position in SEQ ID NO: 1 and are the asparagine (N) residue at a position corresponding to amino acid residue 483 of SEQ ID NO: 1 (N483), the glutamic acid (E) residue at a position corresponding to amino acid residue 507 of SEQ ID NO: 1 (E507), the serine (S) residue at a position corresponding to amino acid residue 515 of SEQ ID NO: 1 (S515), the lysine (K) residue at a position corresponding to amino acid residue 540 of SEQ ID NO: 1 (K540), the alanine (A) residue at a position corresponding to amino acid residue 570 of SEQ ID NO: 1 (A570), the valine (V) residue at a position corresponding to amino acid residue 586 of SEQ ID NO: 1 (V586), and the isoleucine (I) residue at a position corresponding to amino acid residue 614 of SEQ ID NO: 1 (I614). Herein, the asparagine (N) residue at a position corresponding to amino acid residue 483 of SEQ ID NO: 1 is exchanged to a lysine (K) residue (N483K), the glutamic acid (E) residue at a position corresponding to amino acid residue 507 of SEQ ID NO: 1 is exchanged to either one of a lysine (K), an alanine (A), or an arginine (R) residue (E507K/A/R), the serine (S) residue at a position corresponding to amino acid residue 515 of SEQ ID NO: 1 is exchanged to either one of a lysine (K) or an asparagine (N) residue (S515K/N), the lysine (K) residue at a position corresponding to amino acid residue 540 of SEQ ID NO: 1 is exchanged to an asparagine (N) residue (K540N), the alanine (A) residue at a position corresponding to amino acid residue 570 of SEQ ID NO: 1 is exchanged to a lysine (K) residue (A570K), the valine (V) residue at a position corresponding to amino acid residue 586 of SEQ ID NO: 1 is exchanged to a glycine (G) residue (V586G), and the isoleucine (I) residue at a position corresponding to amino acid residue 614 of SEQ ID NO: 1 is exchanged to either one of a methionine (M) or a lysine (K) residue (I614M/K). 
Herein, either one of the alternative amino acid exchanges of the glutamic acid (E) residue at a position corresponding to amino acid residue 507 of SEQ ID NO: 1 to either one of a lysine (K), an alanine (A), or an arginine (R) residue (E507K/A/R) may be combined with either one of the alternative amino acid exchanges of the serine (S) residue at a position corresponding to amino acid residue 515 of SEQ ID NO: 1 to either one of a lysine (K) or an asparagine (N) residue (S515K/N) and either one of the alternative amino acid exchanges of the isoleucine (I) residue at a position corresponding to amino acid residue 614 of SEQ ID NO: 1 to either one of a methionine (M) or a lysine (K) residue (I614M/K).

The notation of mutations as used herein is a standard notation known in the art. As an example, mutation N483K is a mutation at a position corresponding to amino acid 483 of SEQ ID NO: 1, where an asparagine (N) has been exchanged for a lysine (K). As a further example, mutation E507K/A/R is a mutation at a position corresponding to amino acid 507 of SEQ ID NO: 1, where glutamic acid (E) has been exchanged for either one of a lysine (K), an alanine (A), or an arginine (R). The expression “with regard to SEQ ID NO: 1” or “corresponding to amino acid X of SEQ ID NO: 1” as used herein refers to the fact that all respective mutations are to be seen in relation to the wild-type sequence of Taq DNA polymerase provided in SEQ ID NO: 1. As an example, a DNA polymerase according to the present invention can have the amino acid sequence shown in SEQ ID NO: 2 including the mutations N483K, E507K/A/R, S515K/N, K540N, A570K, V586G, and I614M/K with regard to SEQ ID NO: 1. In the amino acid sequence of this DNA polymerase itself (i.e., SEQ ID NO: 2), these mutations are located at positions 191, 215, 223, 248, 278, 294, and 322.
However, said mutations are nevertheless labeled N483K, E507K/A/R, S515K/N, K540N, A570K, V586G, and I614M/K, since all mutations are to be seen with regard to their position in SEQ ID NO: 1 (i.e., these mutations are present at a position corresponding to their amino acid position in SEQ ID NO: 1).

In some embodiments, the DNA polymerase of the present invention further comprises at least one additional mutation. In some embodiments, the at least one additional mutation is F697S with regard to SEQ ID NO: 1 (i.e., the phenylalanine (F) residue at a position corresponding to amino acid residue 697 of SEQ ID NO: 1 is exchanged to a serine (S) residue).

In some embodiments, the DNA polymerase of the invention is derived from wild-type Thermus aquaticus (Taq) DNA polymerase comprising the mutations N483K, E507A, S515K, K540N, A570K, V586G, I614M, and F697S with regard to the amino acid sequence of wild-type Taq DNA polymerase (SEQ ID NO: 1). In some embodiments, the DNA polymerase of the invention comprises an amino acid sequence at least 90% identical to SEQ ID NO: 1 or SEQ ID NO: 2 and further comprises the mutations N483K, E507A, S515K, K540N, A570K, V586G, I614M, and F697S with regard to the amino acid sequence of wild-type Taq DNA polymerase (SEQ ID NO: 1). In some embodiments, the DNA polymerase of the invention comprises an amino acid sequence corresponding to or being at least 90% identical to (i) amino acids 293 to 832 of SEQ ID NO: 1; (ii) amino acids 4 to 832 of SEQ ID NO: 1; (iii) amino acids 279 to 832 of SEQ ID NO: 1; or (iv) amino acids 290 to 832 of SEQ ID NO: 1 and further comprises the mutations N483K, E507A, S515K, K540N, A570K, V586G, I614M, and F697S with regard to the amino acid sequence of wild-type Taq DNA polymerase (SEQ ID NO: 1). In specific embodiments, the DNA polymerase of the present invention comprises SEQ ID NO: 2 and further comprises the mutations N483K, E507A, S515K, K540N, A570K, V586G, I614M, and F697S with regard to SEQ ID NO: 1.
This DNA polymerase is sometimes referred to as “DNA polymerase RIV A8” or “RIV A8” hereinafter. In another specific embodiment, the DNA polymerase of the invention comprises or consists of the amino acid sequence as shown in SEQ ID NO: 3.

In some embodiments, the DNA polymerase of the invention is derived from wild-type Thermus aquaticus (Taq) DNA polymerase comprising the mutations N483K, E507R, S515K, K540N, A570K, V586G, and I614K with regard to the amino acid sequence of wild-type Taq DNA polymerase (SEQ ID NO: 1). In some embodiments, the DNA polymerase of the invention comprises an amino acid sequence at least 90% identical to SEQ ID NO: 1 or SEQ ID NO: 2 and further comprises the mutations N483K, E507R, S515K, K540N, A570K, V586G, and I614K with regard to the amino acid sequence of wild-type Taq DNA polymerase (SEQ ID NO: 1). In some embodiments, the DNA polymerase of the invention comprises an amino acid sequence corresponding to or being at least 90% identical to (i) amino acids 293 to 832 of SEQ ID NO: 1; (ii) amino acids 4 to 832 of SEQ ID NO: 1; (iii) amino acids 279 to 832 of SEQ ID NO: 1; or (iv) amino acids 290 to 832 of SEQ ID NO: 1 and further comprises the mutations N483K, E507R, S515K, K540N, A570K, V586G, and I614K with regard to the amino acid sequence of wild-type Taq DNA polymerase (SEQ ID NO: 1). In specific embodiments, the DNA polymerase of the present invention comprises SEQ ID NO: 2 and further comprises the mutations N483K, E507R, S515K, K540N, A570K, V586G, and I614K with regard to SEQ ID NO: 1. This DNA polymerase is sometimes referred to as “DNA polymerase RIII H20” or “RIII H20” hereinafter. In another specific embodiment, the DNA polymerase of the invention comprises or consists of the amino acid sequence as shown in SEQ ID NO: 4.
In some embodiments, the DNA polymerase of the invention is derived from wild-type Thermus aquaticus (Taq) DNA polymerase comprising the mutations N483K, E507K, S515N, K540N, A570K, V586G, and I614K with regard to the amino acid sequence of wild-type Taq DNA polymerase (SEQ ID NO: 1). In some embodiments, the DNA polymerase of the invention comprises an amino acid sequence at least 90% identical to SEQ ID NO: 1 or SEQ ID NO: 2 and further comprises the mutations N483K, E507K, S515N, K540N, A570K, V586G, and I614K with regard to the amino acid sequence of wild-type Taq DNA polymerase (SEQ ID NO: 1). In some embodiments, the DNA polymerase of the invention comprises an amino acid sequence corresponding to or being at least 90% identical to (i) amino acids 293 to 832 of SEQ ID NO: 1; (ii) amino acids 4 to 832 of SEQ ID NO: 1; (iii) amino acids 279 to 832 of SEQ ID NO: 1; or (iv) amino acids 290 to 832 of SEQ ID NO: 1 and further comprises the mutations N483K, E507K, S515N, K540N, A570K, V586G, and I614K with regard to the amino acid sequence of wild-type Taq DNA polymerase (SEQ ID NO: 1). In specific embodiments, the DNA polymerase of the present invention comprises SEQ ID NO: 2 and further comprises the mutations N483K, E507K, S515N, K540N, A570K, V586G, and I614K with regard to SEQ ID NO: 1. This DNA polymerase is sometimes referred to as “DNA polymerase RIV D15” or “RIV D15” hereinafter. In another specific embodiment, the DNA polymerase of the invention comprises or consists of the amino acid sequence as shown in SEQ ID NO: 5.

In respective embodiments, the DNA polymerase of the present invention comprises or consists of the amino acid sequence as shown in SEQ ID NO: 3 (RIV A8), SEQ ID NO: 4 (RIII H20), or SEQ ID NO: 5 (RIV D15). In some embodiments, the DNA polymerases according to the first aspect are thermostable and have altered fidelity opposite 5mC leading to increased nucleotide misincorporation (e.g., dAMP misincorporation) during a polymerase chain reaction (PCR).
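The mutation notation, the renumbering between SEQ ID NO: 1 and SEQ ID NO: 2, and the permitted combinations of alternatives described for the first aspect can be checked with a short Python sketch. The helper names (`parse`, `to_klentaq`, `enumerate_variants`, `matches_core`) are hypothetical and introduced only for illustration; the mutation lists and the start position 293 are taken from the text.

```python
import re
from itertools import product

KLENTAQ_START = 293  # residue 1 of SEQ ID NO: 2 corresponds to residue 293 of SEQ ID NO: 1

CORE = ["N483K", "E507K/A/R", "S515K/N", "K540N", "A570K", "V586G", "I614M/K"]

NAMED_VARIANTS = {
    "RIV A8": ["N483K", "E507A", "S515K", "K540N", "A570K", "V586G", "I614M", "F697S"],
    "RIII H20": ["N483K", "E507R", "S515K", "K540N", "A570K", "V586G", "I614K"],
    "RIV D15": ["N483K", "E507K", "S515N", "K540N", "A570K", "V586G", "I614K"],
}


def parse(mutation):
    """Split e.g. 'E507K/A/R' into ('E', 507, ['K', 'A', 'R'])."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z](?:/[A-Z])*)", mutation)
    if match is None:
        raise ValueError("not a valid mutation label: " + mutation)
    wild_type, position, alternatives = match.groups()
    return wild_type, int(position), alternatives.split("/")


def to_klentaq(position):
    """Convert a SEQ ID NO: 1 position into the SEQ ID NO: 2 (KlenTaq) numbering."""
    return position - KLENTAQ_START + 1


def enumerate_variants(core):
    """List every concrete combination allowed by the '/' alternatives."""
    choices = []
    for label in core:
        wild_type, position, alternatives = parse(label)
        choices.append([f"{wild_type}{position}{alt}" for alt in alternatives])
    return [tuple(combo) for combo in product(*choices)]


def matches_core(mutations, core=CORE):
    """True if every core position carries one of its permitted exchanges."""
    present = {}
    for label in mutations:
        wild_type, position, alternatives = parse(label)
        present[position] = (wild_type, alternatives[0])
    for label in core:
        wild_type, position, alternatives = parse(label)
        if position not in present:
            return False
        found_wild_type, found_exchange = present[position]
        if found_wild_type != wild_type or found_exchange not in alternatives:
            return False
    return True


klentaq_positions = [to_klentaq(parse(label)[1]) for label in CORE]
variants = enumerate_variants(CORE)

print(klentaq_positions)  # [191, 215, 223, 248, 278, 294, 322]
print(len(variants))      # 12
print({name: matches_core(muts) for name, muts in NAMED_VARIANTS.items()})
```

The computed KlenTaq positions reproduce the values 191, 215, 223, 248, 278, 294, and 322 stated in the text. The three, two, and two alternatives at positions 507, 515, and 614 give twelve concrete combinations, and each of the three named variants satisfies the seven core exchanges.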
In a second aspect, the present invention relates to related DNA polymerases bearing the corresponding mutations as defined for the DNA polymerase according to the first aspect of the present invention. In particular, based on sequence homology to other DNA polymerases, Taq DNA polymerase belongs to a family of DNA polymerases known as sequence family A (Family A DNA polymerases, PolAs). This family includes prokaryotic and eukaryotic DNA polymerases having replicative functions. Specific examples of family A DNA polymerases, besides Taq DNA polymerase, are Thermus thermophilus DNA polymerase (Tth), Escherichia coli DNA polymerase I, E. coli phage T7 DNA polymerase (T7), Bacillus stearothermophilus DNA polymerase (Bst), Bacillus subtilis DNA polymerase (Bsu), and Bacillus phage SP01 DNA polymerase (SP01). Of note, the DNA polymerases of sequence family A share common sequence motifs, which are detailed in Table 1 below. In this table, the respective sequence motifs, their position within the respective wild-type amino acid sequence of the DNA polymerase, and the respective mutation sites are indicated. In this context, the wild-type amino acid sequences of Tth DNA polymerase, E. coli DNA polymerase I, T7 DNA polymerase, Bst DNA polymerase, Bsu DNA polymerase, and SP01 DNA polymerase, are shown as SEQ ID NOs: 42, 43, 44, 45, 46, and 47, respectively. 
Table 1: Sequence motifs of Family A DNA polymerases (Taq, Tth, E. coli DNA polymerase I, T7, Bst, Bsu, and SP01), their positions within the respective wild-type amino acid sequences, and the corresponding mutation sites. [Table contents not reproduced.]

Thus, according to the second aspect, the present invention relates to a DNA polymerase, selected from the following DNA polymerases (i) to (vi): (i) a DNA polymerase derived from wild-type Thermus thermophilus (Tth) DNA polymerase, comprising the mutations N485K, Q509K/A/R, S517K/N, K542N, A572K, V588G, and I616M/K with regard to the amino acid sequence of wild-type Tth DNA polymerase (SEQ ID NO: 42); (ii) a DNA polymerase derived from wild-type E. coli DNA polymerase I, comprising the mutations N579K, P603K/A/R, S610K/N, K635N, A665K, V681G, and I709M/K with regard to the amino acid sequence of wild-type E. coli DNA polymerase I (SEQ ID NO: 43); (iii) a DNA polymerase derived from wild-type E. coli phage T7 DNA polymerase, comprising the mutations N335K, one of the mutations T357K/A/R and V368K/A/R, one of the mutations D365K/N and D376K/N, one of the mutations K394N and K404N, V426K, V443G, and L479M/K with regard to the amino acid sequence of wild-type E.
coli phage T7 DNA polymerase I (SEQ ID NO: 44); (iv) a DNA polymerase derived from wild-type Bacillus stearothermophilus (Bst) DNA polymerase, comprising the mutations N527K, S557K/N, K582N, Q612K, I628G, and I657M/K, and optionally the mutation K551A/R, with regard to the amino acid sequence of wild-type Bacillus stearothermophilus DNA polymerase (SEQ ID NO: 45); (v) a DNA polymerase derived from wild-type Bacillus subtilis (Bsu) DNA polymerase, comprising the mutations N531K, S561K/N, K586N, Q616K, I632G, and I661M/K, and optionally the mutation K555A/R, with regard to the amino acid sequence of wild-type Bacillus subtilis DNA polymerase (SEQ ID NO: 46); (vi) a DNA polymerase derived from wild-type Bacillus phage SP01 DNA polymerase, comprising the mutations N502K, D526K/A/R, H558N, V587K, V605G, and L639M/K, and optionally the mutation N533K, with regard to the amino acid sequence of wild-type Bacillus phage SP01 DNA polymerase (SEQ ID NO: 47). In some embodiments, the DNA polymerase comprises an amino acid sequence at least 90% identical to Thermus thermophilus (Tth) DNA polymerase of SEQ ID NO: 42 and further comprises the mutations N485K, Q509K/A/R, S517K/N, K542N, A572K, V588G, and I616M/K with regard to the amino acid sequence of wild-type Tth DNA polymerase (SEQ ID NO: 42). In some embodiments, the DNA polymerase comprises an amino acid sequence at least 90% identical to a DNA polymerase derived from wild-type E. coli DNA polymerase I of SEQ ID NO: 43 and further comprises the mutations N579K, P603K/A/R, S610K/N, K635N, A665K, V681G, and I709M/K with regard to the amino acid sequence of wild-type E. coli DNA polymerase I (SEQ ID NO: 43). In some embodiments, the DNA polymerase comprises an amino acid sequence at least 90% identical to a DNA polymerase derived from wild-type E. 
coli phage T7 DNA polymerase of SEQ ID NO: 44 and further comprises the mutations N335K, one of the mutations T357K/A/R and V368K/A/R, one of the mutations D365K/N and D376K/N, one of the mutations K394N and K404N, V426K, V443G, and L479M/K with regard to the amino acid sequence of wild-type E. coli phage T7 DNA polymerase I (SEQ ID NO: 44). In some embodiments, the DNA polymerase comprises an amino acid sequence at least 90% identical to a DNA polymerase derived from wild-type Bacillus stearothermophilus (Bst) DNA polymerase of SEQ ID NO: 45 and further comprises the mutations N527K, S557K/N, K582N, Q612K, I628G, and I657M/K, and optionally the mutation K551A/R, with regard to the amino acid sequence of wild-type Bacillus stearothermophilus DNA polymerase (SEQ ID NO: 45). In some embodiments, the DNA polymerase comprises an amino acid sequence at least 90% identical to a DNA polymerase derived from wild-type Bacillus subtilis (Bsu) DNA polymerase of SEQ ID NO: 46 and further comprises the mutations N531K, S561K/N, K586N, Q616K, I632G, and I661M/K, and optionally the mutation K555A/R, with regard to the amino acid sequence of wild-type Bacillus subtilis DNA polymerase (SEQ ID NO: 46). In some embodiments, the DNA polymerase comprises an amino acid sequence at least 90% identical to a DNA polymerase derived from wild-type Bacillus phage SP01 DNA polymerase of SEQ ID NO: 47 and further comprises the mutations N502K, D526K/A/R, H558N, V587K, V605G, and L639M/K, and optionally the mutation N533K, with regard to the amino acid sequence of wild-type Bacillus phage SP01 DNA polymerase (SEQ ID NO: 47). In particular embodiments, the DNA polymerases of the present invention according to the second aspect can comprise or consist of the amino acid sequence as shown in SEQ ID NOs: 42, 43, 44, 45, 46, and 47 and further include said mutations recited above, respectively. 
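The mutation notation used above (wild-type residue, position, then one or more permitted replacement residues separated by slashes, e.g. Q509K/A/R) can be read mechanically. The following is a minimal sketch, assuming one-letter amino acid codes and 1-based positions; the names `Mutation`, `parse_mutation`, and `carries_mutation` and the four-residue toy sequences are hypothetical illustrations and are not taken from the sequence listing.

```python
import re
from typing import NamedTuple, Tuple


class Mutation(NamedTuple):
    wild_type: str                 # one-letter code of the wild-type residue
    position: int                  # 1-based position in the wild-type sequence
    alternatives: Tuple[str, ...]  # permitted replacement residues


# Matches tokens such as "N485K" or "Q509K/A/R".
_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z](?:/[A-Z])*)$")


def parse_mutation(token: str) -> Mutation:
    """Split a token like 'Q509K/A/R' into residue, position and alternatives."""
    match = _PATTERN.match(token.strip())
    if match is None:
        raise ValueError(f"not a valid mutation token: {token!r}")
    wild_type, position, alternatives = match.groups()
    return Mutation(wild_type, int(position), tuple(alternatives.split("/")))


def carries_mutation(wild_type_seq: str, variant_seq: str, mutation: Mutation) -> bool:
    """Return True if the variant shows one of the permitted residues at the position."""
    index = mutation.position - 1
    if wild_type_seq[index] != mutation.wild_type:
        raise ValueError("wild-type sequence does not match the mutation token")
    return variant_seq[index] in mutation.alternatives


# Toy example with hypothetical four-residue sequences (not real polymerases).
toy_wild_type = "MNQS"
toy_variant = "MKRS"
has_q3 = carries_mutation(toy_wild_type, toy_variant, parse_mutation("Q3K/A/R"))
```

In this sketch a token with several alternatives is satisfied by any one of them, which mirrors the reading of "K/A/R" as a choice between replacement residues.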
Of note, the DNA polymerases of the present invention according to the second aspect encompass fragments of E. coli DNA polymerase I, Bacillus stearothermophilus (Bst) DNA polymerase, and Bacillus subtilis (Bsu) DNA polymerase, said fragments lacking the respective 5’->3’ exonuclease domain, and said fragments including the above mutations. In some embodiments, the DNA polymerases of the present invention according to the second aspect further comprise (i) the mutation F699S with regard to SEQ ID NO: 42; (ii) the mutation F792S with regard to SEQ ID NO: 43; (iii) the mutation L556S with regard to SEQ ID NO: 44; (iv) the mutation F740S with regard to SEQ ID NO: 45; (v) the mutation L744S with regard to SEQ ID NO: 46; and (vi) the mutation F727S with regard to SEQ ID NO: 47. In some embodiments, the DNA polymerases according to the second aspect are thermostable and have altered fidelity opposite 5mC leading to increased nucleotide misincorporation (e.g., dAMP misincorporation) during a polymerase chain reaction (PCR). According to the second aspect of the present invention the definitions of the terms “derived from wild-type DNA polymerase”, “including said mutations”, “comprising or consisting of the amino acid sequence as shown in SEQ ID NO: XY including said mutations”, and “DNA polymerase”, as well as the definition of preferred sequence identities, and the notation of the mutations, as given for the first aspect of the present invention, apply in an analogous manner. In a further aspect, the present invention relates to a nucleic acid or recombinant nucleic acid comprising a nucleotide sequence coding for a DNA polymerase according to the present invention. Preferably, said nucleic acid comprises a nucleotide sequence coding for a DNA polymerase according to the first aspect of the present invention. 
However, in other embodiments, said nucleic acid comprises a nucleotide sequence coding for a DNA polymerase according to the second aspect of the present invention. In specific embodiments, the nucleic acid comprises or consists of the nucleotide sequence as shown in SEQ ID NO: 6 (coding for KlenTaq RIV A8), SEQ ID NO: 7 (coding for KlenTaq RIII H20), or SEQ ID NO: 8 (coding for KlenTaq RIV D15). In another aspect, the present invention relates to a vector comprising a nucleic acid or a recombinant nucleic acid according to the present invention. Herein, the vector may be an expression vector. The term “vector” as used herein relates to any vehicle for the transportation of a nucleic acid into a cell. In particular, said term includes plasmid vectors, viral vectors, cosmid vectors, and artificial chromosomes, wherein plasmid vectors are particularly preferred. Preferably, plasmid vectors are suitable for expression of the DNA polymerases of the present invention in a prokaryotic or eukaryotic cell. Respective plasmid vectors are known in the art. In a further aspect, the present invention relates to a host cell comprising the vector and/or the nucleic acid of the present invention. Host cells comprising such expression vectors are useful in methods for producing the DNA polymerase of the invention by culturing the host cells under conditions suitable for expression of the recombinant nucleic acid. Suitable host cells that can be used for the recombinant expression of the DNA polymerases of the present invention are not particularly limited and are known in the art. They include for example suitable bacterial cells, yeast cells, plant cells, insect cells and mammalian cells. 
In a further aspect, the present invention relates to a method for the detection of 5-methylcytosine nucleotides (5mC) in a DNA molecule of interest, comprising the steps of: (a) amplifying a first aliquot of the DNA molecule of interest in a polymerase chain reaction (PCR), said PCR using a thermostable DNA polymerase having altered fidelity towards a 5mC modification in the DNA molecule of interest leading to increased nucleotide misincorporation (e.g. dAMP misincorporation) opposite the 5mC nucleotide during PCR, (b) sequencing the amplified PCR product obtained in step (a) to generate a test sequence, (c) comparing the test sequence obtained in step (b) to a reference sequence, wherein said reference sequence is obtained by way of (i) amplifying a second aliquot of the DNA molecule of interest in a PCR, said PCR using a High-Fidelity DNA polymerase, thereby generating an unmodified reference template, (ii) amplifying the reference template obtained in step (c)(i) in a PCR, said PCR using a DNA polymerase of the present invention, and (iii) sequencing the amplified PCR product obtained in step (c)(ii) to generate the reference sequence, and (d) identifying mismatches in the test sequence as compared to the reference sequence at positions in which the reference sequence shows a C, and the test sequence shows a T at the same positions, wherein a mismatch identified in step (d) indicates the presence of a 5-methylcytosine at the corresponding positions in the DNA molecule of interest. DNA molecules of interest that can be subject to the methods of the present invention are not particularly limited and include any DNA molecules containing, or being suspected of containing, 5mC. These include e.g. mammalian DNA, human DNA, such as human genomic DNA. In some embodiments, the thermostable DNA polymerase having altered fidelity opposite 5mC leading to increased nucleotide misincorporation (e.g. 
dAMP misincorporation) during PCR is a DNA polymerase according to the invention as described above. In certain embodiments, the DNA polymerase used in step (a) of the above method is a DNA polymerase according to the first aspect of the present invention. In other embodiments, the DNA polymerase used in said step is a DNA polymerase according to the second aspect of the present invention. In some embodiments, sequencing in step (b) and/or step (c) comprises Next Generation Sequencing (NGS). In some embodiments, the identification of mismatches in step (d) comprises determining a relative error rate in the test sequence as compared to the reference sequence at positions in which the reference sequence shows a C. In some embodiments, the DNA molecule of interest does not require any chemical and/or enzymatic pre-treatment prior to step (a). Means of performing PCR using a thermostable DNA polymerase having altered fidelity opposite 5mC leading to increased nucleotide misincorporation, such as, e.g., the DNA polymerases of the present invention, are not particularly limited and are known in the art. Further, means of performing PCR using a High-Fidelity DNA polymerase, as well as suitable High-Fidelity DNA polymerases, are not particularly limited and are known in the art. Herein, fidelity of a DNA polymerase refers to its ability to insert the correct base during PCR. Therefore, High-Fidelity DNA polymerases exhibit a low error rate resulting in a high degree of accuracy in the replication of the DNA of interest. Furthermore, means of sequencing DNA are not particularly limited and are known in the art. These include, in particular, Next Generation Sequencing (NGS) techniques known in the art. Further, means for comparing DNA sequences and identifying mismatches between DNA sequences are not particularly limited and are known in the art. Respective means include bioinformatic and/or statistical methods, including automated methods, known in the art. 
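The comparison of test and reference sequences at C positions can be illustrated with a short sketch. It rests on several simplifying assumptions that are not part of the disclosure: reads are already aligned to a consensus without insertions or deletions, the relative error rate is taken as the difference between the fraction of T calls in the test reads and in the reference reads, and the threshold `min_excess` is an arbitrary illustrative value. The function names are hypothetical.

```python
from typing import Dict, Sequence


def t_fraction(reads: Sequence[str], pos: int) -> float:
    """Fraction of reads that show a T at the given 0-based position."""
    column = [read[pos] for read in reads]
    return column.count("T") / len(column)


def call_candidate_5mc(
    consensus: str,
    test_reads: Sequence[str],
    reference_reads: Sequence[str],
    min_excess: float = 0.1,  # hypothetical threshold, chosen for illustration only
) -> Dict[int, float]:
    """Return C positions where C->T reads are enriched in the test sample.

    The returned value per position is the excess T fraction of the test
    reads over the reference reads (one possible 'relative error rate').
    """
    calls: Dict[int, float] = {}
    for pos, base in enumerate(consensus):
        if base != "C":
            continue  # only positions where the reference shows a C are evaluated
        excess = t_fraction(test_reads, pos) - t_fraction(reference_reads, pos)
        if excess >= min_excess:
            calls[pos] = excess
    return calls


# Toy data: the C at index 1 is read as T in half of the test reads only.
consensus = "ACGC"
test_reads = ["ATGC", "ATGC", "ACGC", "ACGC"]
reference_reads = ["ACGC", "ACGC", "ACGC", "ACGC"]
candidates = call_candidate_5mc(consensus, test_reads, reference_reads)
```

Subtracting the reference rate is one way to discount the background error of the polymerase on unmethylated cytosine; a ratio or a statistical test over read counts would be an equally plausible choice.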
In a specific embodiment, the identification of mismatches in step (d) of the method of the present invention comprises determining a relative error rate in the test sequence as compared to the reference sequence at positions in which the reference sequence shows a C. Of note, in the method of the present invention, the DNA molecule of interest does not require any chemical and/or enzymatic pre-treatment prior to step (a). Accordingly, no additional steps other than the provision of the DNA molecule of interest, e.g. by way of isolating the same by methods known in the art, are required. In another aspect, the present invention relates to a kit comprising a DNA polymerase of the present invention. In particular, a kit is provided comprising at least one container providing the DNA polymerases of the invention described under the first and second aspect above. In certain embodiments, said kit comprises a DNA polymerase according to the first aspect of the present invention. However, in other embodiments, the kit comprises a DNA polymerase according to the second aspect of the present invention. In some embodiments, the kit of the present invention further comprises suitable buffers and/or suitable disposables and/or suitable enzymes. Herein, the kit may further comprise one or more additional containers selected from the group consisting of (a) a container providing at least one primer hybridizable, under primer extension conditions, to a predetermined polynucleotide template; (b) a container providing dNTPs; and (c) a container providing a buffer suitable for primer extension. In some embodiments, the kit can also include a blood collection tube, container, or unit that comprises heparin or a salt thereof, or releases heparin into solution. The blood collection unit can be a heparinized tube. 
Such additional containers can include any reagents or other elements recognized by the skilled artisan for use in primer extension procedures in accordance with the methods described above, including reagents for use in, e.g., nucleic acid amplification procedures (e.g., PCR, RT-PCR), DNA sequencing procedures, or DNA labeling procedures. For example, in certain embodiments, the kit further includes a container providing a 5’ sense primer hybridizable, under primer extension conditions, to a predetermined polynucleotide template, or a primer pair comprising the 5' sense primer and a corresponding 3' antisense primer. In other, non-mutually exclusive variations, the kit includes one or more containers providing nucleoside triphosphates (conventional and/or unconventional dNTPs). Definitions Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention pertains. Although essentially any methods and materials similar to those described herein can be used in the practice or testing of the present invention, only exemplary methods and materials are described. For purposes of the present invention, the following terms are defined below. The terms “a,” “an,” and “the” include plural referents, unless the context clearly indicates otherwise. As used herein, the term “comprising”/”comprises” expressly includes the terms “consisting essentially of”/”consists essentially of” and “consisting of”/”consists of”, i.e., all of said terms are interchangeable with each other herein. Further, as used herein, the term “about” refers to a modifier of the specified value of ± 10%, preferably ± 9%, ± 8%, ± 7%, ± 6%, ± 5%, ± 4%, ± 3%, ± 2%, ± 1%, ± 0.5%, or ± 0.1%. 
Thus, by way of example the term “about 100” as used herein can refer to ranges of 90 to 110, 91 to 109, 92 to 108, 93 to 107, 94 to 106, 95 to 105, 96 to 104, 97 to 103, 98 to 102, 99 to 101, 99.5 to 100.5, or 99.9 to 100.1. An “amino acid” refers to any monomer unit that can be incorporated into a peptide, polypeptide, or protein. As used herein, the term “amino acid” includes the following twenty natural or genetically encoded alpha-amino acids: alanine (Ala or A), arginine (Arg or R), asparagine (Asn or N), aspartic acid (Asp or D), cysteine (Cys or C), glutamine (Gln or Q), glutamic acid (Glu or E), glycine (Gly or G), histidine (His or H), isoleucine (Ile or I), leucine (Leu or L), lysine (Lys or K), methionine (Met or M), phenylalanine (Phe or F), proline (Pro or P), serine (Ser or S), threonine (Thr or T), tryptophan (Trp or W), tyrosine (Tyr or Y), and valine (Val or V). In cases where “X” residues are undefined, these should be defined as “any amino acid.” The structures of these twenty natural amino acids are shown in, e.g., Stryer et al., Biochemistry, 5th ed., Freeman and Company (2002). Additional amino acids, such as selenocysteine and pyrrolysine, can also be genetically coded for (Stadtman (1996) “Selenocysteine,” Annu Rev Biochem 65:83-100 and Ibba et al. (2002) “Genetic code introducing pyrrolysine,” Curr Biol. 12(13):R464-R466). The term “amino acid” also includes unnatural amino acids, modified amino acids (e.g., having modified side chains and/or backbones), and amino acid analogs. See, e.g., Zhang et al. (2004) “Selective incorporation of 5-hydroxytryptophan into proteins in mammalian cells,” Proc. Natl. Acad. Sci. U.S.A. 101(24):8882-8887, Anderson et al. (2004) “An expanded genetic code with a functional quadruplet codon” Proc. Natl. Acad. Sci. U.S.A. 101(20):7566-7571, Ikeda et al. (2003) “Synthesis of a novel histidine analogue and its efficient incorporation into a protein in vivo,” Protein Eng. Des. Sel.16(9):699-706, Chin et al. 
(2003) “An Expanded Eukaryotic Genetic Code,” Science 301(5635):964-967, James et al. (2001) “Kinetic characterization of ribonuclease S mutants containing photoisomerizable phenylazophenylalanine residues,” Protein Eng. Des. Sel. 14(12):983-991, Kohrer et al. (2001) “Import of amber and ochre suppressor tRNAs into mammalian cells: A general approach to site-specific insertion of amino acid analogues into proteins,” Proc. Natl. Acad. Sci. U.S.A. 98(25):14310-14315, Bacher et al. (2001) “Selection and Characterization of Escherichia coli Variants Capable of Growth on an Otherwise Toxic Tryptophan Analogue,” J. Bacteriol. 183(18):5414-5425, Hamano-Takaku et al. (2000) “A Mutant Escherichia coli Tyrosyl-tRNA Synthetase Utilizes the Unnatural Amino Acid Azatyrosine More Efficiently than Tyrosine,” J. Biol. Chem. 275(51):40324-40328, and Budisa et al. (2001) “Proteins with β-(thienopyrrolyl)alanines as alternative chromophores and pharmaceutically active amino acids,” Protein Sci. 10(7):1281-1292. The DNA molecule of interest analyzed using the DNA polymerases of the instant invention and in the methods disclosed herein may be obtained from a biological sample. Herein, the term “biological sample” encompasses a variety of sample types obtained from an organism and can be used in a diagnostic or monitoring assay. The term encompasses urine, urine sediment, blood, saliva, and other liquid samples of biological origin, solid tissue samples, such as a biopsy specimen or tissue cultures or cells derived therefrom and the progeny thereof. The term encompasses samples that have been manipulated in any way after their procurement, such as by treatment with reagents, solubilization, sedimentation, or enrichment for certain components. The term encompasses a clinical sample, and also includes cells in cell culture, cell supernatants, cell lysates, serum, plasma, biological fluids, and tissue samples. 
The term “mutant,” in the context of DNA polymerases of the present invention, means a polypeptide, typically recombinant, that comprises one or more amino acid substitutions relative to a corresponding, unmodified or wild-type DNA polymerase. The term “unmodified form” in the context of a mutant polymerase, is a term used herein for purposes of defining a mutant DNA polymerase of the present invention: the term “unmodified form” or “parental form” or “wild-type” refers to a functional DNA polymerase that has the amino acid sequence of the mutant polymerase except at one or more amino acid position(s) specified as characterizing the mutant polymerase. Thus, reference to a mutant DNA polymerase in terms of (a) its unmodified form and (b) one or more specified amino acid substitutions means that, with the exception of the specified amino acid substitution(s), the mutant polymerase otherwise has an amino acid sequence identical to the unmodified form in the specified motif. The “unmodified polymerase” may contain additional mutations to provide desired functionality, e.g., improved transcriptase efficiency, mismatch tolerance, extension rate; improved tolerance of RT and polymerase inhibitors; and/or improved incorporation of dideoxyribonucleotides, ribonucleotides, ribonucleotide analogs, dye-labeled nucleotides, modulating 5’-nuclease activity, modulating 3’-nuclease (or proofreading) activity, or the like. Accordingly, in carrying out the present invention as described herein, the unmodified form of a DNA polymerase is predetermined. The unmodified form of a DNA polymerase can be, for example, a wild-type and/or a naturally occurring DNA polymerase, or a DNA polymerase that has already been intentionally modified. 
An unmodified form of the polymerase is preferably a thermostable DNA polymerase, such as DNA polymerases from various thermophilic bacteria, as well as functional variants thereof having substantial sequence identity to a wild-type or naturally occurring thermostable polymerase. In the context of mutant DNA polymerases, “correspondence” to another sequence (e.g., regions, fragments, nucleotide or amino acid positions, or the like) is based on the convention of numbering according to nucleotide or amino acid position number and then aligning the sequences in a manner that maximizes the percentage of sequence identity. An amino acid “corresponding to position [X] of [specific sequence]” refers to an amino acid in a polypeptide of interest that aligns with the equivalent amino acid of a specified sequence. Generally, as described herein, the amino acid corresponding to a position of a polymerase can be determined using an alignment algorithm such as BLAST as described below. Because not all positions within a given “corresponding region” need be identical, non-matching positions within a corresponding region may be regarded as “corresponding positions.” Accordingly, as used herein, reference to an “amino acid position corresponding to amino acid position [X]” of a specified DNA polymerase refers to equivalent positions, based on alignment, in other DNA polymerases and structural homologues and families. In some embodiments of the present invention, “correspondence” of amino acid positions is determined with respect to a region of the polymerase comprising one or more motifs. “Recombinant,” as used herein, refers to an amino acid sequence or a nucleotide sequence that has been intentionally modified by recombinant methods. By the term “recombinant nucleic acid” herein is meant a nucleic acid, originally formed in vitro, in general, by the manipulation of a nucleic acid by restriction endonucleases, in a form not normally found in nature. 
Thus an isolated, mutant DNA polymerase nucleic acid, in a linear form, or an expression vector formed in vitro by ligating DNA molecules that are not normally joined, are both considered recombinant for the purposes of this invention. It is understood that once a recombinant nucleic acid is made and reintroduced into a host cell, it will replicate non-recombinantly, i.e., using the in vivo cellular machinery of the host cell rather than in vitro manipulations; however, such nucleic acids, once produced recombinantly, although subsequently replicated non-recombinantly, are still considered recombinant for the purposes of the invention. A “recombinant protein” is a protein made using recombinant techniques, i.e., through the expression of a recombinant nucleic acid as depicted above. The term “vector” refers to a piece of DNA, typically double-stranded, which may have inserted into it a piece of foreign DNA. The vector may be, for example, of plasmid origin. Vectors contain "replicon" polynucleotide sequences that facilitate the autonomous replication of the vector in a host cell. Foreign DNA is defined as heterologous DNA, which is DNA not naturally found in the host cell, which, for example, replicates the vector molecule, encodes a selectable or screenable marker, or encodes a transgene. The vector is used to transport the foreign or heterologous DNA into a suitable host cell. Once in the host cell, the vector can replicate independently of or coincidental with the host chromosomal DNA, and several copies of the vector and its inserted DNA can be generated. In addition, the vector can also contain the necessary elements that permit transcription of the inserted DNA into an mRNA molecule or otherwise cause replication of the inserted DNA into multiple copies of RNA. 
Some expression vectors additionally contain sequence elements adjacent to the inserted DNA that increase the half-life of the expressed mRNA and/or allow translation of the mRNA into a protein molecule. Many molecules of mRNA and polypeptide encoded by the inserted DNA can thus be rapidly synthesized. The term “nucleotide,” in addition to referring to the naturally occurring ribonucleotide or deoxyribonucleotide monomers, shall herein be understood to refer to related structural variants thereof, including derivatives and analogs, that are functionally equivalent with respect to the particular context in which the nucleotide is being used (e.g., hybridization to a complementary base), unless the context clearly indicates otherwise. The term “deoxyribonucleoside triphosphate” or “dNTP” is a generic term referring to the deoxyribonucleotides, dATP, dCTP, dGTP, dTTP, dITP, and/or dUTP. The nucleoside triphosphates containing deoxyribose are called dNTPs, and take the prefix deoxy- in their names and small d- in their abbreviations: deoxyadenosine triphosphate (dATP), deoxyguanosine triphosphate (dGTP), deoxycytidine triphosphate (dCTP), deoxythymidine triphosphate (dTTP), deoxyinosine triphosphate (dITP), and deoxyuridine triphosphate (dUTP). The dNTPs are the building blocks for DNA replication (they lose two of the phosphate groups in the process of incorporation). Each dNTP is made up of a triphosphate group, a deoxyribose sugar, and a nitrogenous base. The strands of the DNA double helix are assembled from the dNTPs, much like monomer units in a polymer. The term “nucleic acid” or “polynucleotide” refers to a polymer that corresponds to a ribose nucleic acid (RNA) or deoxyribose nucleic acid (DNA) polymer, or an analog thereof. This includes polymers of nucleotides such as RNA and DNA, as well as synthetic forms, modified (e.g., chemically or biochemically modified) forms thereof, and mixed polymers (e.g., including both RNA and DNA subunits). 
Also included are synthetic molecules that mimic polynucleotides in their ability to bind to a designated sequence via hydrogen bonding and other chemical interactions. Typically, the nucleotide monomers are linked via phosphodiester bonds, although synthetic forms of nucleic acids can comprise other linkages (e.g., peptide nucleic acids as described in Nielsen et al. (Science 254:1497-1500, 1991)). A nucleic acid can be or can include, e.g., a chromosome or chromosomal segment, a vector (e.g., an expression vector), an expression cassette, a naked DNA or RNA polymer, the product of a polymerase chain reaction (PCR), an oligonucleotide, a probe, and a primer. A nucleic acid can be, e.g., single-stranded, double-stranded, or triple-stranded and is not limited to any particular length. Unless otherwise indicated, a particular nucleic acid sequence optionally comprises or encodes complementary sequences, in addition to any sequence explicitly indicated. The term “oligonucleotide” refers to a nucleic acid that includes at least two nucleic acid monomer units (e.g., nucleotides). An oligonucleotide typically includes from about six to about 175 nucleic acid monomer units, more typically from about eight to about 100 nucleic acid monomer units, and still more typically from about 10 to about 50 nucleic acid monomer units (e.g., about 15, about 20, about 25, about 30, about 35, or more nucleic acid monomer units). The exact size of an oligonucleotide will depend on many factors, including the ultimate function or use of the oligonucleotide. Oligonucleotides are optionally prepared by any suitable method, including, but not limited to, isolation of an existing or natural sequence, DNA replication or amplification, reverse transcription, cloning and restriction digestion of appropriate sequences, or direct chemical synthesis by a method such as the phosphotriester method of Narang et al. (Meth. Enzymol. 68:90-99, 1979); the phosphodiester method of Brown et al. (Meth. 
Enzymol. 68:109-151, 1979); the diethylphosphoramidite method of Beaucage et al. (Tetrahedron Lett. 22:1859-1862, 1981); the triester method of Matteucci et al. (J. Am. Chem. Soc. 103:3185-3191, 1981); automated synthesis methods; or the solid support method of U.S. Pat. No. 4,458,066, or other methods known to those skilled in the art. The term “primer” as used herein refers to a polynucleotide capable of acting as a point of initiation of template-directed nucleic acid synthesis when placed under conditions in which polynucleotide extension is initiated (e.g., under conditions comprising the presence of requisite nucleoside triphosphates (as dictated by the template that is copied) and a polymerase in an appropriate buffer and at a suitable temperature or cycle(s) of temperatures (e.g., as in a polymerase chain reaction)). To further illustrate, primers can also be used in a variety of other oligonucleotide-mediated synthesis processes, including as initiators of de novo RNA synthesis and in vitro transcription-related processes (e.g., nucleic acid sequence-based amplification (NASBA), transcription mediated amplification (TMA), etc.). A primer is typically a single-stranded oligonucleotide (e.g., oligodeoxyribonucleotide). The appropriate length of a primer depends on the intended use of the primer but typically ranges from 6 to 40 nucleotides, more typically from 15 to 35 nucleotides. Short primer molecules generally require cooler temperatures to form sufficiently stable hybrid complexes with the template. A primer need not reflect the exact sequence of the template but must be sufficiently complementary to hybridize with a template for primer elongation to occur. 
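The relationship between primer length and the temperature needed for stable hybridization can be illustrated with the Wallace rule, a widely used rule of thumb for short oligonucleotides (roughly 2 °C per A or T and 4 °C per G or C). This rule is not part of the present disclosure and is only a rough approximation; the function name `wallace_tm` is hypothetical.

```python
def wallace_tm(primer: str) -> int:
    """Estimate the melting temperature (deg C) of a short primer.

    Uses the Wallace rule of thumb: 2 * (A + T) + 4 * (G + C).
    Intended only as a rough estimate for short oligonucleotides.
    """
    sequence = primer.upper()
    if set(sequence) - set("ACGT"):
        raise ValueError("primer may only contain A, C, G and T")
    at_count = sequence.count("A") + sequence.count("T")
    gc_count = sequence.count("G") + sequence.count("C")
    return 2 * at_count + 4 * gc_count


# A shorter primer yields a lower estimate than a longer one of similar composition.
short_estimate = wallace_tm("ATGCATGC")
long_estimate = wallace_tm("ATGCATGCATGCAT")
```

Under this approximation every additional nucleotide raises the estimate, which is consistent with the statement that short primers generally require cooler annealing temperatures.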
In certain embodiments, the term “primer pair” means a set of primers including a 5’ sense primer (sometimes called “forward”) that hybridizes with the complement of the 5’ end of the nucleic acid sequence to be amplified and a 3’ antisense primer (sometimes called “reverse”) that hybridizes with the 3’ end of the sequence to be amplified (e.g., if the target sequence is expressed as RNA or is an RNA). A primer can be labeled, if desired, by incorporating a label detectable by spectroscopic, photochemical, biochemical, immunochemical, or chemical means. For example, useful labels include 32P, fluorescent dyes, electron-dense reagents, enzymes (as commonly used in ELISA assays), biotin, or haptens and proteins for which antisera or monoclonal antibodies are available. As used herein, “percentage of sequence identity” is determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the sequence in the comparison window can comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity. The terms “identical” or percent “identity,” in the context of two or more nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that are the same. 
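The calculation of the percentage of sequence identity described above can be sketched as follows. The sketch assumes that the two sequences have already been optimally aligned, that gaps are written as "-", and that the whole alignment is taken as the comparison window; the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical positions in two pre-aligned sequences.

    Matched positions are divided by the total number of positions in the
    comparison window (here: the full alignment) and multiplied by 100.
    Gap characters ('-') never count as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Four of five alignment columns are identical; the gapped column is a non-match.
identity = percent_identity("AC-GT", "ACAGT")
```

Counting gapped columns in the denominator is one common convention; other tools exclude them, so reported percentages can differ slightly between programs.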
Sequences are "substantially identical" to each other if they have a specified percentage of nucleotides or amino acid residues that are the same (e.g., at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% identity over a specified region), when compared and aligned for maximum correspondence over a comparison window, or designated region as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection. These definitions also refer to the complement of a test sequence. Optionally, the identity exists over a region that is at least about 50 nucleotides in length, or more typically over a region that is 100 to 500 or 1,000 or more nucleotides in length. The terms “similarity” or “percent similarity,” in the context of two or more polypeptide sequences, refer to two or more sequences or subsequences that have a specified percentage of amino acid residues that are either the same or similar as defined by conservative amino acid substitutions (e.g., 60% similarity, optionally 65%, 70%, 75%, 80%, 85%, 90%, or 95% similar over a specified region), when compared and aligned for maximum correspondence over a comparison window, or designated region as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection. Sequences are also "substantially similar" to each other if they are at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, or at least 55% similar to each other. Optionally, this similarity exists over a region that is at least about 50 amino acids in length, or more typically over a region that is at least about 100 to 500 or 1,000 or more amino acids in length. For sequence comparison, typically one sequence acts as a reference sequence, to which test sequences are compared. 
When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Default program parameters are commonly used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percent sequence identities or similarities for the test sequences relative to the reference sequence, based on the program parameters. A “comparison window,” as used herein, includes reference to a segment of any one of the number of contiguous positions selected from the group consisting of from 20 to 600, usually about 50 to about 200, more usually about 100 to about 150 in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned. Methods of alignment of sequences for comparison are well known in the art. Optimal alignment of sequences for comparison can be conducted, for example, by the local homology algorithm of Smith and Waterman (Adv. Appl. Math. 2:482, 1981), by the homology alignment algorithm of Needleman and Wunsch (J. Mol. Biol. 48:443, 1970), by the search for similarity method of Pearson and Lipman (Proc. Natl. Acad. Sci. USA 85:2444, 1988), by computerized implementations of these algorithms (e.g., GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by manual alignment and visual inspection (see, e.g., Ausubel et al., Current Protocols in Molecular Biology (1995 supplement)). Examples of an algorithm that is suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al. (J. Mol. Biol. 215:403-10, 1990) and Altschul et al. (Nuc. Acids Res. 25:3389-402, 1997), respectively. 
Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff and Henikoff, Proc. Natl. Acad. Sci. USA 89:10915, 1992), alignments (B) of 50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands. 
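The seeding and extension steps described above can be illustrated by the following simplified Python sketch for nucleotide sequences. It is not the BLAST implementation: seeds are restricted to exact word matches (the neighborhood threshold T is not modelled), extension is ungapped, and the function names as well as the value chosen for X are assumptions made for this example. M=5 and N=-4 follow the nucleotide defaults stated above.

```python
def find_seeds(query, subject, w):
    """Return (query_pos, subject_pos) pairs of exact word matches of length w."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    return [
        (i, j)
        for i in range(len(query) - w + 1)
        for j in index.get(query[i:i + w], [])
    ]


def extend_hit(query, subject, qi, sj, w, m=5, n=-4, x=20):
    """Extend one seed without gaps in both directions.

    Extension in a direction stops when the cumulative score drops by x or
    more below its maximum, when it reaches zero or below, or when the end
    of either sequence is reached. Returns (score, query_start, query_end),
    with query_end exclusive.
    """
    def one_direction(i, j, step, start_score):
        score = best = start_score
        best_len = length = 0
        while 0 <= i < len(query) and 0 <= j < len(subject):
            score += m if query[i] == subject[j] else n
            length += 1
            if score > best:
                best, best_len = score, length
            if best - score >= x or score <= 0:
                break
            i += step
            j += step
        return best, best_len

    seed_score = w * m  # a seed consists of w matching residues
    right_best, right_len = one_direction(qi + w, sj + w, 1, seed_score)
    total, left_len = one_direction(qi - 1, sj - 1, -1, right_best)
    return total, qi - left_len, qi + w + right_len


if __name__ == "__main__":
    q, s = "ACGTACGTGGGG", "ACGTACGTCCCC"
    for qi, sj in find_seeds(q, s, 4):
        print((qi, sj), extend_hit(q, s, qi, sj, 4))
```

A smaller X terminates extensions earlier and speeds up the search at the cost of sensitivity, which is the trade-off between W, T and X referred to above.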
The BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin and Altschul, Proc. Natl. Acad. Sci. USA 90:5873-87, 1993). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2, typically less than about 0.01, and more typically less than about 0.001. The term “thermostable polymerase” refers to an enzyme that is stable to heat, is heat resistant, and retains sufficient activity to effect subsequent polynucleotide extension reactions and does not become irreversibly denatured (inactivated) when subjected to the elevated temperatures for the time necessary to effect denaturation of double-stranded nucleic acids. The heating conditions necessary for nucleic acid denaturation are well known in the art and are exemplified in, e.g., U.S. Patent Nos. 4,683,202, 4,683,195, and 4,965,188. As used herein, a thermostable polymerase is suitable for use in a temperature cycling reaction such as the polymerase chain reaction ("PCR"). Irreversible denaturation for purposes herein refers to permanent and complete loss of enzymatic activity. For a thermostable polymerase, enzymatic activity refers to the catalysis of the combination of the nucleotides in the proper manner to form polynucleotide extension products that are complementary to a template nucleic acid strand. Thermostable DNA polymerases from thermophilic bacteria include, e.g., DNA polymerases from Thermus aquaticus, Thermus thermophilus, E. coli DNA polymerase I, E. coli phage T7 DNA polymerase, Bacillus subtilis, Bacillus stearothermophilus, and Bacillus phage SP01. 
Additional thermostable DNA polymerases are from Thermus spp. including Thermus filiformis (Tfi) DNA polymerase, Thermus flavus (Tfl) DNA polymerase, N-terminally truncated Tfl DNA polymerase (encompassing residues 280 to 932 of the wild-type Tfl polymerase), Thermus brockianus (Tbr) DNA polymerase, and Thermus caldophilus (Tca) DNA polymerase. The term “DNA polymerase” as used herein includes DNA polymerases that have been modified by e.g. natural process such as posttranslational processing, or non-natural process such as chemical modification. Such modifications can occur on the peptide backbone, amino acid side chains, or the N- or C-terminus. Modifications include e.g. acetylations, acylations, ADP-ribosylations, amidations, covalent attachment of flavines, haem-groups, nucleotides or nucleotide derivatives, lipids or lipid derivatives, cyclizations, disulfide bridges, methylations and demethylations, cystine linkages, formylations, γ-carboxylations, glycosylations, hydroxylations, phosphorylations and the tRNA-mediated addition of amino acids. Of note, based on sequence homology to other DNA polymerases, Taq DNA polymerase belongs to a family of DNA polymerases known as sequence family A (Family A DNA polymerases, PolAs). This family includes prokaryotic and eukaryotic DNA polymerases having replicative functions. Specific examples of family A DNA polymerases, besides Taq DNA polymerase, are Thermus thermophilus DNA polymerase (Tth), Escherichia coli DNA polymerase I, E. coli phage T7 DNA polymerase (T7), Bacillus stearothermophilus DNA polymerase (Bst), Bacillus subtilis DNA polymerase (Bsu), and Bacillus phage SP01 DNA polymerase (SP01). Of note, further DNA polymerases of this family derived from Thermus spp. 
include Thermus filiformis (Tfi) DNA polymerase, Thermus flavus (Tfl) DNA polymerase, N-terminally truncated Tfl DNA polymerase (encompassing residues 280 to 932 of the wild-type Tfl polymerase), Thermus brockianus (Tbr) DNA polymerase, and Thermus caldophilus (Tca) DNA polymerase. These DNA polymerases, including the mutations corresponding to the above mutations defined for Taq DNA polymerase, are encompassed herein. BRIEF DESCRIPTION OF THE DRAWINGS The figures show: Figure 1: Primer extension with KTq wild-type for determination of optimal lysate dilution for screening reactions. KTq wild-type (wt) lysate was diluted in different ratios and used for single-nucleotide incorporation of dGMP and dAMP opposite C template (100 nM). 10 nM 5'-FAM labelled primers of different length were used in primer extension reactions in presence of 70 nM dGTP and 10 nM 5'-HEX labelled primers of different length in presence of 35 µM dATP. Reactions were stopped after 5, 10 and 15 min and fluorescently labelled primers were analysed by CE. Optimal lysate dilution of 1:20 (indicated by a black arrow) was selected based on the ideal reaction window to study discrimination and misincorporation characteristics. At optimal lysate concentration the KTq wild-type shows complete elongation of the primer by dGMP incorporation after 10 min and starts with dAMP incorporation just after 15 min (indicated by grey arrows). Primer extension experiments were carried out once for each expressed KTq library 96-well lysate plate and displayed electropherograms represent one exemplary experiment. Figure 2: Sequencing results after bisulfite conversion of template DNA. Base content per sequence position of the bisulfite converted 5'-3' strand of (A) unmodified 803 bp C template DNA generated by PCR, (B) modified 803 bp 5mC template DNA generated by PCR, (C) native gDNA and (D) CpG methylated (mCpG) gDNA. 
Base calls for each position of the 95 bp amplicon product with converted sequence (top) and original template sequence (bottom). 5mC at methylated CpG sites remains unchanged during bisulfite treatment and is still read as C. Unmethylated cytosines are converted to U and after PCR amplification read as thymines. Methylation status at CpG sites C24, C32 and C72, analysed by reading C base calls of the 5'-3' strand of (E) unmodified 803 bp C template DNA generated by PCR, (F) modified 803 bp 5mC template DNA generated by PCR, (G) native gDNA and (H) CpG methylated (mCpG) gDNA after bisulfite conversion. Note, C base calls at CpG sites for unmethylated C template derive from unconverted cytosines. Each NGS library was sequenced once using the Illumina MiSeqTM (C and 5mC template) or NextSeq 2000 (native gDNA and mCpG gDNA) system. Figure 3: Screening hits RIII H20 and RIV D15 discriminate 5mC by increased misincorporation. (A) Error rates of linear PCR products generated by the RIII H20 variant in presence of 2 µM dGTP and 200 µM d(A/T/C)TP (each) by amplifying either the unmodified C or modified 5mC template. (B) Error rate difference (left) and mutation signature (right) at C and 5mC positions of RIII H20. (C) Error rates of linear PCR products generated by the RIV D15 variant in presence of 2 µM dGTP and 200 µM d(A/T/C)TP (each) by amplifying either the unmodified C or modified 5mC template. (D) Error rate difference (left) and mutation signature (right) at C and 5mC positions of RIV D15. All CpG sites are indicated by black arrows. Each NGS library was sequenced once using the Illumina NextSeq 2000 system. Figure 4: Standardised error profile of KTq variant RIV A8 for 5mC detection in human genomic DNA. 
Z-score standardised error rates at C positions of linear PCR products generated by the RIV A8 variant in presence of an unbalanced dNTP pool with 10 µM dGTP (match) and 200 µM d(A/T/C)TP (each) by amplifying the unmodified C template, modified 5mC template, natively methylated gDNA (gDNA native) and CpG methylated gDNA (gDNA mCpG). Z-scores per sample were calculated by dividing the difference between error rate and the mean error rate of exclusively C base positions by the standard deviation of the latter. CpG sites are indicated by black arrows. Error rates derived from NGS libraries which were sequenced once using the Illumina NextSeq 2000 system. Figure 5: Screening for KTq variants with increased misincorporation opposite 5mC. (A) DNA polymerase library expression lysates were used for single nucleotide incorporation of dGMP or dAMP opposite C and 5mC. Utilisation of primers with different length and 5'-fluorescent labelling, FAM for reactions with C and HEX for reactions with 5mC, enabled the multiplexed analysis of 12 primer extension reactions in one capillary by capillary electrophoresis (CE). Primer size correlates with oligonucleotide migration time and differences for FAM- and HEX-labelled primers of same length derive from differential electrophoretic mobility of the fluorophores. Fluorescent signal shift to the right in the electropherograms corresponds to nucleotide incorporation (R = purine base) and the comparison of extension peak intensity to the intensity of non-elongated primer allows a qualitative evaluation of the primer extension reaction efficiencies. (B) Screened characteristics of promising KTq variants: high 5mC discrimination for dGMP incorporation and low to moderate efficiency for dAMP misincorporation opposite C and 5mC. Displayed electropherograms were obtained during screening experiments and are exemplarily shown for anticipated results. Figure 6: Misincorporation experiment with KTq wild-type DNA polymerase. 
(A) Chemical structure of C (left) and 5mC (right). (B) 50 nM KTq wild-type was tested for mismatch formation during extension of 150 nM radioactively labelled primer annealed to either C or 5mC template (200 nM) by applying 50 µM dGTP, dATP, dTTP and dCTP as substrates. Reactions were stopped after indicated time points. Reaction mixtures were separated by 12% denaturing PAGE and primers were visualised by phosphor imaging. KTq wild-type misincorporated dAMP with higher efficiency in comparison to dTMP and dCMP, therefore dATP was selected as mismatch substrate in screening experiments. Experiments were repeated three times with similar results. Figure 7: Screening for KTq variants with high 5mC discrimination. KTq wild-type (wt), empty vector (ev) and variant lysates were used for single-nucleotide incorporation with 70 nM dGTP and 100 nM template in reaction. 10 nM 5'-FAM labelled primers were used in primer extension reactions opposite C and 10 nM 5'-HEX labelled primers opposite 5mC. Reactions were stopped after 10 min. Experiments were performed for >3000 KTq variants and promising hits (indicated by name) were selected after five rounds of screening considering the combination of discrimination and misincorporation characteristics. Displayed electropherograms were obtained in the last screening round, in which primer extension reactions were carried out three times (one experiment shown). Figure 8: Screening for KTq variants with misincorporation opposite C and 5mC. KTq wild-type (wt), empty vector (ev) and variant lysates were used for single-nucleotide incorporation with 35 µM dATP and 100 nM template in reaction. 10 nM 5'-FAM labelled primers were used in primer extension reactions opposite C and 10 nM 5'-HEX labelled primers opposite 5mC. Reactions were stopped after 10 min. 
Experiments were performed for >3000 KTq variants and promising hits (indicated by name) were selected after 5 rounds of screening considering the combination of discrimination and misincorporation characteristics. Displayed electropherograms were obtained in the last screening round, in which primer extension reactions were carried out three times (one experiment shown). Figure 9: Characterisation of most interesting screening hits. 2.5 nM purified KTq wild-type (wt) and variants were used in primer extension experiments with 10 nM 5'-FAM labelled primers for nucleotide incorporation opposite C template (100 nM) and 10 nM 5'-HEX labelled primers opposite 5mC template (100 nM). Reactions were stopped after 10 min. (A) Single-nucleotide incorporation of indicated dGTP (G) concentrations to validate 5mC discrimination and of indicated dATP (A) concentrations to validate misincorporation characteristics of variants in comparison to KTq wild-type. (B) 100 nM dGTP and 100 nM dCTP (C) were used as substrates for multiple-nucleotide incorporation opposite templating bases indicated in bold letters (top). Shift of the fluorescence signal to the right corresponds to increased primer size, indicating the elongation efficiency of the respective DNA polymerase. (C) 70 µM dATP and 10 µM dCTP were used as substrates for multiple-nucleotide incorporation opposite templating bases indicated in bold letters (top) to study mismatch elongation efficiency of the KTq variants. Shift of the fluorescence signal to the right corresponds to primer size increase according to the mismatch elongation efficiency of the respective DNA polymerase after dAMP misincorporation. Shown KTq variants feature efficient mismatch elongation opposite 5mC, indicated by the rightmost fluorescence signal shift of HEX-labelled primers. Note, KTq variants show elongation only after dAMP misincorporation opposite A in reaction with C template. 
Most promising hits (RIII H20, RIV A8 and RIV D15) were selected considering both the discrimination and misincorporation characteristics in combination with high mismatch elongation efficiency. Primer extension reactions were carried out twice with comparable results and displayed electropherograms were obtained in one experiment. Figure 10: Characterisation of promising screening hits by single-nucleotide incorporation. 2.5 nM of purified KTq variants were used in primer extension experiments with 10 nM 5'-FAM labelled primers for nucleotide incorporation opposite C template (100 nM) and 10 nM 5'-HEX labelled primers opposite 5mC template (100 nM). Single-nucleotide incorporation experiments were performed with indicated dGTP (G) match substrate concentrations to validate 5mC discrimination and with indicated dATP (A) mismatch substrate concentrations to validate misincorporation characteristics. Reactions were stopped after 10 min. Primer extension experiments were carried out once. Figure 11: Characterisation of promising screening hits by multiple-nucleotide incorporation. 2.5 nM of purified KTq variants were used in primer extension experiments with 10 nM 5'-FAM labelled primers for nucleotide incorporation opposite C template (100 nM) and 10 nM 5'-HEX labelled primers opposite 5mC template (100 nM). (A) 100 nM dGTP and 100 nM dCTP (C) were used as substrates for multiple-nucleotide incorporation opposite templating bases indicated in bold letters (top). Shift of the fluorescence signal to the right corresponds to primer size increased by extension according to the elongation efficiency of the respective DNA polymerase. (B) 70 µM dATP and 10 µM dCTP were used as substrates for multiple-nucleotide incorporation opposite templating bases indicated in bold letters (top) to study mismatch elongation efficiency of the KTq variants. 
Shift of the fluorescence signal to the right corresponds to primer size increased according to the mismatch elongation efficiency of the respective DNA polymerase after dAMP misincorporation. Reactions were stopped after 10 min. Primer extension experiments were carried out once. Figure 12: PCR activity of promising KTq variants. 250 nM of purified KTq wild-type (wt) and variants were used in qPCR with 400 nM forward and reverse primer and 50 pM unmodified 803 bp C template to amplify a 109 bp PCR product. PCR was performed using 20 µM dGTP and 200 µM d(A/T/C)TP (each) to determine KTq variants showing high PCR activity with reduced concentration of match nucleotide. (A) qPCR amplification curves of KTq wild-type and variants. KTq wild-type was applied as positive control reaction. Reaction without enzyme (no variant) was applied as negative control for PCR. (B) Specificity and correct PCR product formation of KTq variants were verified by comparing melting curves of PCR products with the melting peak obtained by the KTq wild-type reaction. (C) Cq value of each KTq variant as indicated at fluorescence intensity threshold of 0.2. Most promising KTq variants (RIII H20, RIV A8 and RIV D15) featured highest qPCR activity with Cq 6.05-6.46. qPCR was carried out once under displayed condition with one reaction per DNA polymerase. Figure 13: Strategy for the detection of 5mC by reading an increased error rate. (A) KTq variants were used for primer elongation opposite C and 5mC in a linear PCR. The increased error rate opposite 5mC (indicated by dots) derives from a selective mismatch formation during DNA synthesis by dAMP misincorporation. Methylation information is represented as a mutation, that is a mispaired A base, in the product DNA. The PCR product functions as template for the labelling with unique molecular identifier (UMI) sequences (sequencing primer binding site in dark grey and one colour per UMI). 
Errors are retained during amplification in the amplicon PCR (P5 and P7 adapter in black, indexes in blue or green) and libraries were analysed by NGS. (B) KNIME data analysis workflow used for error calculation. Errors based on the misincorporation by the KTq variant can be distinguished from sequencing errors by using the depicted UMI strategy. Reads were sorted by identical UMI sequences into UMI family groups (one UMI family derived from one linear PCR product) and UMI families with a minimum of three reads were further processed. Reads were aligned to the reference sequence and the error rate at each position was calculated within each family. Next, only errors which are present in 90% of all reads within one family (true error cut-off: 0.9), were considered for calculating the mean error rate over all UMI families. For this, the true UMI family errors were set to 1 and the calculated mean error rate represents the error derived from the KTq variant. 5mC readout is facilitated by reading an increased error rate opposite 5mC positions. Figure 14: KTq variant RIV A8 discriminates 5mC by increased misincorporation. (A) Error rates of linear PCR products generated by the RIV A8 variant in presence of an unbalanced dNTP pool with 2 µM dGTP (match) and 200 µM d(A/T/C)TP (each) by amplifying either the unmodified C or modified 5mC template. (B) Error rate difference (left) and mutation signature (right) at C and 5mC positions of RIV A8. Error rate differences were calculated by subtracting error rates obtained by amplifying the C template from error rates obtained by amplifying the 5mC template. All CpG sites are indicated by black arrows. Each NGS library was sequenced once using the Illumina NextSeq 2000 system. Figure 15: Misincorporation profile of KTq wild-type. 
(A) Error rates of linear PCR products generated by the KTq wild-type in presence of an unbalanced dNTP pool with 2 µM dGTP (match) and 200 µM d(A/T/C)TP (each) by amplifying either the unmodified C or modified 5mC template. (B) Error rate difference (left) and mutation signature (right) at C and 5mC positions of KTq wild-type. Error rate differences were calculated by subtracting error rates obtained by amplifying the C template from error rates obtained by amplifying the 5mC template. All CpG sites are indicated by black arrows. KTq wild-type shows low misincorporation efficiency opposite C positions and only minor 5mC discrimination at CpG sites C24 and C32, but not C72. Each NGS library was sequenced once using the Illumina NextSeq 2000 system. Figure 16: KTq variant RIV A8 verifies misincorporation profile and 5mC discrimination. (A) Error rates of linear PCR products generated by the RIV A8 variant in presence of an unbalanced dNTP pool with 2 µM dGTP (match) and 200 µM d(A/T/C)TP (each) by amplifying either the unmodified C or modified 5mC template. (B) Error rate difference (left) and mutation signature (right) at C and 5mC positions of RIV A8. Error rate differences were calculated by subtracting error rates obtained by amplifying the C template from error rates obtained by amplifying the 5mC template. All CpG sites are indicated by black arrows. KTq variant RIV A8 reproduces high mismatch formation efficiency opposite C positions and 5mC discrimination by increased dAMP misincorporation. Each NGS library was sequenced once using the Illumina MiSeqTM system. Figure 17: Comparison of the DNA polymerases by statistical analysis of C and 5mC error differences from NGS libraries prepared by using template DNA generated by PCR. Black dots represent error rate differences (Δ error rate) at C and 5mC positions of linear PCR products received from KTq variants in presence of an unbalanced dNTP pool with 2 µM dGTP (match) and 200 µM d(A/T/C)TP (each). 
Error differences were calculated by subtracting error rates obtained by amplifying the unmodified C template from error rates obtained by amplifying the modified 5mC template. Grey lines represent the mean ± standard deviation of each individual group. (A) Dot plot of the KTq wild-type based Δ error rates with combined data from one experiment (n (C) = 10, n (5mC) = 3). (B) Dot plot of the RIII H20 variant based Δ error rates with combined data from one experiment (n (C) = 10, n (5mC) = 3). (C) Dot plot of the RIV A8 variant based Δ error rates with combined data from two experiments (n (C) = 20, n (5mC) = 6). (D) Dot plot of the RIV D15 variant based Δ error rates with combined data from one experiment (n (C) = 10, n (5mC) = 3). KTq variants RIII H20, RIV A8 and RIV D15 significantly detect 5mC at methylated CpG sites in modified 5mC template. The two-tailed P-values were calculated according to the Wilcoxon-Mann-Whitney (WMW) test comparing C positions with 5mC positions using GraphPad Prism version 6.00 for Windows, GraphPad Software, La Jolla California USA. Figure 18: Detection of 5mC in human genomic DNA by the KTq variant RIV A8. (A) Z-score difference at C positions calculated by subtracting z-scores of standardised error rates of the C template from z-scores of standardised error rates of the 5mC template. Z-scores were calculated by dividing the difference between error rate and the mean error rate of exclusively C base positions by the standard deviation of the latter. Error rates are from linear PCR products received from RIV A8 variant in presence of an unbalanced dNTP pool with 10 µM dGTP (match) and 200 µM d(A/T/C)TP (each). RIV A8 verifies the 5mC discrimination by increased misincorporation opposite CpG sites C24, C32 and C72 in the 5mC template. 
(B) Z-score difference at C positions calculated by subtracting z-scores of standardised error rates of the C template from z-scores of standardised error rates of the natively methylated gDNA (gDNA native). RIV A8 detects CpG methylation at C24 and C32 by increased misincorporation opposite 5mC in native gDNA. At CpG site C72 (only minor methylation) no increased misincorporation can be detected. (C) Z-score difference at C positions calculated by subtracting z-scores of standardised error rates of the C template from z-scores of standardised error rates of the CpG methylated gDNA (mCpG). RIV A8 detects methylation at CpG sites C24, C32 and C72 by increased misincorporation opposite 5mC in mCpG gDNA. All CpG sites are indicated by black arrows. Error rates derived from NGS libraries which were sequenced once using the Illumina NextSeq 2000 system. Figure 19: Misincorporation profile of KTq variant RIV A8 for 5mC detection in human genomic DNA. (A) Error rates of linear PCR products generated by the RIV A8 variant in presence of an unbalanced dNTP pool with 10 µM dGTP (match) and 200 µM d(A/T/C)TP (each) by amplifying the unmodified C template, modified 5mC template, natively methylated gDNA (gDNA native) and CpG methylated gDNA (gDNA mCpG). (B) Error rate difference at C positions calculated by subtracting error rates obtained by amplifying the C template from error rates obtained by amplifying the 5mC template. (C) Error rate difference at C positions calculated by subtracting error rates obtained by amplifying the C template from error rates obtained by amplifying native gDNA. (D) Error rate difference at C positions calculated by subtracting error rates obtained by amplifying the C template from error rates obtained by amplifying mCpG gDNA. All CpG sites are indicated by black arrows. 
Absolute error rate data show 5mC discrimination of RIV A8 by increased misincorporation for processing of native gDNA (CpG sites C24 and C32) and mCpG gDNA (CpG sites C24, C32 and C72), but not for processing of the 5mC template in comparison with the unmodified C template. Each NGS library was sequenced once using the Illumina NextSeq 2000 system. Figure 20: Statistical analysis of C and 5mC z-score differences from NGS libraries prepared by using human genomic DNA. Black dots represent z-score differences (Δz-score) at C and 5mC positions of linear PCR products received from KTq variant RIV A8 in presence of an unbalanced dNTP pool with 10 µM dGTP (match) and 200 µM d(A/T/C)TP (each). Z-score differences were calculated by subtracting z-scores based on error rates obtained by amplifying the unmodified C template from z-scores based on error rates obtained by amplifying the respective modified templates. Grey lines represent the mean ± standard deviation of each individual group. (A) Dot plot of 5mC template Δz-scores with combined data from one experiment (n (C) = 10, n (5mC) = 3). (B) Dot plot of natively methylated gDNA (gDNA native) Δz-scores with combined data from one experiment (n (C) = 10, n (5mC) = 3). (C) Dot plot of CpG methylated gDNA (mCpG gDNA) Δz-scores with combined data from one experiment (n (C) = 10, n (5mC) = 3). RIV A8 significantly detects 5mC at highly methylated CpG sites in native and mCpG gDNA templates in addition to modified 5mC template control. The two-tailed P-values were calculated according to the Wilcoxon-Mann-Whitney (WMW) test comparing C positions with 5mC positions using GraphPad Prism version 6.00 for Windows, GraphPad Software, La Jolla California USA. Figure 21: Crystal structure of the KTq DNA polymerase with highlighted mutation sites. (A) Crystal structure of the KTq DNA polymerase in a closed ternary complex with primer (grey) and template (black) DNA. 
Incoming dNTP is depicted in orange and amino acids which are identified as mutation sites in most promising KTq variants RIII H20, RIV A8 and RIV D15 are highlighted in blue and amino acid site which is only identified in RIV A8 in purple. (B) Ribbon model (right) with mutation sites labelled and displayed as sticks in KTq variants RIII H20, RIV A8 and RIV D15 (blue) and only in RIV A8 (purple). Highlighted mutations contributing to improved 5mC discrimination and increased misincorporation of KTq variants RIII H20, RIV A8 and RIV D15 (left). Crystal structure is adapted from PDB ID: 3RTV using PyMOL (Schrödinger, LLC; New York, NY). SEQUENCE OVERVIEW The present invention relates to the following amino acid and nucleotide sequences. Additional nucleotide sequences are indicated in Tables 1, 2, and 3. SEQ ID NO: 1 (amino acid sequence of wild-type Thermus aquaticus DNA polymerase) MRGMLPLFEPKGRVLLVDGHHLAYRTFHALKGLTTSRGEPVQAVYGFAKSLLKALKEDGD AVIVVFDAKAPSFRHEAYGGYKAGRAPTPEDFPRQLALIKELVDLLGLARLEVPGYEADD VLASLAKKAEKEGYEVRILTADKDLYQLLSDRIHVLHPEGYLITPAWLWEKYGLRPDQWA DYRALTGDESDNLPGVKGIGEKTARKLLEEWGSLEALLKNLDRLKPAIREKILAHMDDLK LSWDLAKVRTDLPLEVDFAKRREPDRERLRAFLERLEFGSLLHEFGLLESPKALEEAPWP PPEGAFVGFVLSRKEPMWADLLALAAARGGRVHRAPEPYKALRDLKEARGLLAKDLSVLA LREGLGLPPGDDPMLLAYLLDPSNTTPEGVARRYGGEWTEEAGERAALSERLFANLWGRL EGEERLLWLYREVERPLSAVLAHMEATGVRLDVAYLRALSLEVAEEIARLEAEVFRLAGH PFNLNSRDQLERVLFDELGLPAIGKTEKTGKRSTSAAVLEALREAHPIVEKILQYRELTK LKSTYIDPLPDLIHPRTGRLHTRFNQTATATGRLSSSDPNLQNIPVRTPLGQRIRRAFIA EEGWLLVALDYSQIELRVLAHLSGDENLIRVFQEGRDIHTETASWMFGVPREAVDPLMRR AAKTINFGVLYGMSAHRLSQELAIPYEEAQAFIERYFQSFPKVRAWIEKTLEEGRRRGYV ETLFGRRRYVPDLEARVKSVREAAERMAFNMPVQGTAADLMKLAMVKLFPRLEEMGARML LQVHDELVLEAPKERAEAVARLAKEVMEGVYPLAVPLEVEVGIGEDWLSAKE SEQ ID NO: 2 (amino acid sequence of a C-terminal fragment of wild-type Thermus aquaticus DNA polymerase; KlenTaq DNA polymerase): ALEEAPWPPPEGAFVGFVLSRKEPMWADLLALAAARGGRVHRAPEPYKALRDLKEARGLL AKDLSVLALREGLGLPPGDDPMLLAYLLDPSNTTPEGVARRYGGEWTEEAGERAALSERL 
FANLWGRLEGEERLLWLYREVERPLSAVLAHMEATGVRLDVAYLRALSLEVAEEIARLEA EVFRLAGHPFNLNSRDQLERVLFDELGLPAIGKTEKTGKRSTSAAVLEALREAHPIVEKI LQYRELTKLKSTYIDPLPDLIHPRTGRLHTRFNQTATATGRLSSSDPNLQNIPVRTPLGQ RIRRAFIAEEGWLLVALDYSQIELRVLAHLSGDENLIRVFQEGRDIHTETASWMFGVPRE AVDPLMRRAAKTINFGVLYGMSAHRLSQELAIPYEEAQAFIERYFQSFPKVRAWIEKTLE EGRRRGYVETLFGRRRYVPDLEARVKSVREAAERMAFNMPVQGTAADLMKLAMVKLFPRL EEMGARMLLQVHDELVLEAPKERAEAVARLAKEVMEGVYPLAVPLEVEVGIGEDWLSAKE SEQ ID NO: 3 (amino acid sequence of DNA polymerase KlenTaq RIV A8) MRGSHHHHHHTDPHAALEEAPWPPPEGAFVGFVLSRKEPMWADLLALAAARGGRVHRAPE PYKALRDLKEARGLLAKDLSVLALREGLGLPPGDDPMLLAYLLDPSNTTPEGVARRYGGE WTEEAGERAALSERLFANLWGRLEGEERLLWLYREVERPLSAVLAHMEATGVRLDVAYLR ALSLEVAEEIARLEAEVFRLAGHPFKLNSRDQLERVLFDELGLPAIGKTAKTGKRSTKAA VLEALREAHPIVEKILQYRELTNLKSTYIDPLPDLIHPRTGRLHTRFNQTATKTGRLSSS DPNLQNIPGRTPLGQRIRRAFIAEEGWLLVALDYSQMELRVLAHLSGDENLIRVFQEGRD IHTETASWMFGVPREAVDPLMRRAAKTINFGVLYGMSAHRLSQELAIPYEEAQAFIERYS QSFPKVRAWIEKTLEEGRRRGYVETLFGRRRYVPDLEARVKSVREAAERMAFNMPVQGTA ADLMKLAMVKLFPRLEEMGARMLLQVHDELVLEAPKERAEAVARLAKEVMEGVYPLAVPL EVEVGIGEDWLSAKE SEQ ID NO: 4 (amino acid sequence of DNA polymerase KlenTaq RIII H20) MRGSHHHHHHTDPHAALEEAPWPPPEGAFVGFVLSRKEPMWADLLALAAARGGRVHRAPE PYKALRDLKEARGLLAKDLSVLALREGLGLPPGDDPMLLAYLLDPSNTTPEGVARRYGGE WTEEAGERAALSERLFANLWGRLEGEERLLWLYREVERPLSAVLAHMEATGVRLDVAYLR ALSLEVAEEIARLEAEVFRLAGHPFKLNSRDQLERVLFDELGLPAIGKTRKTGKRSTKAA VLEALREAHPIVEKILQYRELTNLKSTYIDPLPDLIHPRTGRLHTRFNQTATKTGRLSSS DPNLQNIPGRTPLGQRIRRAFIAEEGWLLVALDYSQKELRVLAHLSGDENLIRVFQEGRD IHTETASWMFGVPREAVDPLMRRAAKTINFGVLYGMSAHRLSQELAIPYEEAQAFIERYF QSFPKVRAWIEKTLEEGRRRGYVETLFGRRRYVPDLEARVKSVREAAERMAFNMPVQGTA ADLMKLAMVKLFPRLEEMGARMLLQVHDELVLEAPKERAEAVARLAKEVMEGVYPLAVPL EVEVGIGEDWLSAKE SEQ ID NO: 5 (amino acid sequence of DNA polymerase KlenTaq RIV D15) MRGSHHHHHHTDPHAALEEAPWPPPEGAFVGFVLSRKEPMWADLLALAAARGGRVHRAPE PYKALRDLKEARGLLAKDLSVLALREGLGLPPGDDPMLLAYLLDPSNTTPEGVARRYGGE WTEEAGERAALSERLFANLWGRLEGEERLLWLYREVERPLSAVLAHMEATGVRLDVAYLR 
ALSLEVAEEIARLEAEVFRLAGHPFKLNSRDQLERVLFDELGLPAIGKTKKTGKRSTNAA VLEALREAHPIVEKILQYRELTNLKSTYIDPLPDLIHPRTGRLHTRFNQTATKTGRLSSS DPNLQNIPGRTPLGQRIRRAFIAEEGWLLVALDYSQKELRVLAHLSGDENLIRVFQEGRD IHTETASWMFGVPREAVDPLMRRAAKTINFGVLYGMSAHRLSQELAIPYEEAQAFIERYF QSFPKVRAWIEKTLEEGRRRGYVETLFGRRRYVPDLEARVKSVREAAERMAFNMPVQGTA ADLMKLAMVKLFPRLEEMGARMLLQVHDELVLEAPKERAEAVARLAKEVMEGVYPLAVPL EVEVGIGEDWLSAKE SEQ ID NO: 6 (nucleotide sequence coding for KlenTaq RIV A8) ATGAGAGGATCTCACCATCACCATCACCATACGGATCCGCATGCAGCACTGGAAGAAGCA CCTTGGCCTCCGCCTGAAGGTGCATTTGTTGGTTTTGTTCTGAGCCGTAAAGAACCGATG TGGGCAGATCTGCTGGCACTGGCAGCAGCACGTGGTGGTCGTGTTCATCGTGCACCGGAA CCGTATAAAGCTCTGCGCGATCTGAAAGAAGCACGCGGTCTGCTGGCAAAAGATCTGAGC GTTCTGGCACTGCGTGAAGGTCTGGGACTGCCTCCGGGTGATGATCCGATGCTGCTGGCA TATCTGCTGGATCCGAGCAATACCACACCGGAAGGTGTTGCACGTCGTTATGGTGGTGAA TGGACCGAAGAAGCAGGCGAACGCGCAGCACTGAGCGAACGTCTGTTTGCAAATCTGTGG GGTCGTCTGGAAGGTGAAGAACGTCTGCTGTGGCTGTATCGTGAAGTTGAACGTCCGCTG TCTGCAGTTCTGGCACACATGGAAGCAACCGGTGTTCGTCTGGATGTTGCATATCTGCGT GCACTGAGCCTGGAAGTTGCAGAAGAAATTGCACGTCTGGAAGCAGAAGTTTTTCGTCTG GCCGGCCATCCGTTTAAACTGAATAGCCGTGATCAGCTGGAACGTGTTCTGTTTGATGAA
[Figure imgf000037_0001: this row of SEQ ID NO: 6 appears only as an image in the source]
GTTCTGGAAGCCCTGCGTGAAGCACATCCGATTGTGGAAAAAATTCTGCAGTATCGCGAA CTGACCAACCTGAAAAGCACCTATATCGATCCGCTGCCGGATCTGATTCATCCGCGTACC GGTCGTCTGCATACCCGTTTTAATCAGACCGCAACCAAAACCGGTCGCCTGAGCAGCAGC GATCCGAATCTGCAGAATATTCCGGGTCGTACACCGCTGGGTCAGCGTATTCGTCGTGCA TTTATTGCAGAAGAAGGTTGGCTGCTGGTTGCACTGGATTATAGCCAGATGGAACTGCGT GTTCTGGCCCATCTGAGCGGTGATGAAAATCTGATTCGCGTGTTTCAGGAAGGTCGCGAT ATTCATACCGAAACCGCAAGCTGGATGTTTGGTGTTCCGCGTGAAGCAGTTGATCCGCTG ATGCGTCGTGCAGCAAAAACCATTAATTTTGGGGTGCTGTATGGTATGAGCGCACATCGT CTGAGCCAGGAACTGGCAATTCCGTACGAAGAAGCCCAGGCATTTATCGAACGTTATTCT CAGAGCTTTCCGAAAGTTCGTGCCTGGATTGAAAAAACCCTGGAAGAAGGTCGTCGTCGC GGTTATGTTGAAACCCTGTTTGGTCGTCGTCGTTATGTTCCGGATCTGGAAGCACGTGTT AAAAGCGTTCGTGAAGCAGCAGAACGTATGGCCTTTAATATGCCGGTTCAGGGCACCGCA GCAGATCTGATGAAACTGGCCATGGTTAAACTGTTTCCGCGTCTGGAAGAAATGGGTGCA CGTATGCTGCTGCAGGTTCATGATGAACTGGTGCTGGAAGCACCGAAAGAACGTGCAGAA GCAGTTGCCCGTCTGGCAAAAGAAGTTATGGAAGGCGTTTATCCGCTGGCAGTTCCGCTG GAAGTTGAAGTTGGTATTGGTGAAGATTGGCTGTCTGCAAAAGAA bold: mutated codon Dark grey: KlenTaq wild-type base Light grey: mutated base SEQ ID NO: 7 (nucleotide sequence coding for KlenTaq RIII H20) ATGAGAGGATCTCACCATCACCATCACCATACGGATCCGCATGCAGCACTGGAAGAAGCA CCTTGGCCTCCGCCTGAAGGTGCATTTGTTGGTTTTGTTCTGAGCCGTAAAGAACCGATG TGGGCAGATCTGCTGGCACTGGCAGCAGCACGTGGTGGTCGTGTTCATCGTGCACCGGAA CCGTATAAAGCTCTGCGCGATCTGAAAGAAGCACGCGGTCTGCTGGCAAAAGATCTGAGC GTTCTGGCACTGCGTGAAGGTCTGGGACTGCCTCCGGGTGATGATCCGATGCTGCTGGCA TATCTGCTGGATCCGAGCAATACCACACCGGAAGGTGTTGCACGTCGTTATGGTGGTGAA TGGACCGAAGAAGCAGGCGAACGCGCAGCACTGAGCGAACGTCTGTTTGCAAATCTGTGG GGTCGTCTGGAAGGTGAAGAACGTCTGCTGTGGCTGTATCGTGAAGTTGAACGTCCGCTG TCTGCAGTTCTGGCACACATGGAAGCAACCGGTGTTCGTCTGGATGTTGCATATCTGCGT GCACTGAGCCTGGAAGTTGCAGAAGAAATTGCACGTCTGGAAGCAGAAGTTTTTCGTCTG GCCGGCCATCCGTTTAAACTGAATAGCCGTGATCAGCTGGAACGTGTTCTGTTTGATGAA CTGGGTCTGCCAGCAATTGGTAAAACCCGTAAAACCGGTAAACGTAGCACCAAAGCAGCA GTTCTGGAAGCCCTGCGTGAAGCACATCCGATTGTGGAAAAAATTCTGCAGTATCGCGAA CTGACCAACCTGAAAAGCACCTATATCGATCCGCTGCCGGATCTGATTCATCCGCGTACC 
GGTCGTCTGCATACCCGTTTTAATCAGACCGCAACCAAAACCGGTCGCCTGAGCAGCAGC GATCCGAATCTGCAGAATATTCCGGGTCGTACACCGCTGGGTCAGCGTATTCGTCGTGCA TTTATTGCAGAAGAAGGTTGGCTGCTGGTTGCACTGGATTATAGCCAGAAAGAACTGCGT GTTCTGGCCCATCTGAGCGGTGATGAAAATCTGATTCGCGTGTTTCAGGAAGGTCGCGAT ATTCATACCGAAACCGCAAGCTGGATGTTTGGTGTTCCGCGTGAAGCAGTTGATCCGCTG ATGCGTCGTGCAGCAAAAACCATTAATTTTGGGGTGCTGTATGGTATGAGCGCACATCGT CTGAGCCAGGAACTGGCAATTCCGTACGAAGAAGCCCAGGCATTTATCGAACGTTATTTT CAGAGCTTTCCGAAAGTTCGTGCCTGGATTGAAAAAACCCTGGAAGAAGGTCGTCGTCGC GGTTATGTTGAAACCCTGTTTGGTCGTCGTCGTTATGTTCCGGATCTGGAAGCACGTGTT AAAAGCGTTCGTGAAGCAGCAGAACGTATGGCCTTTAATATGCCGGTTCAGGGCACCGCA GCAGATCTGATGAAACTGGCCATGGTTAAACTGTTTCCGCGTCTGGAAGAAATGGGTGCA CGTATGCTGCTGCAGGTTCATGATGAACTGGTGCTGGAAGCACCGAAAGAACGTGCAGAA GCAGTTGCCCGTCTGGCAAAAGAAGTTATGGAAGGCGTTTATCCGCTGGCAGTTCCGCTG GAAGTTGAAGTTGGTATTGGTGAAGATTGGCTGTCTGCAAAAGAA bold: mutated codon Dark grey: KlenTaq wild-type base Light grey: mutated base SEQ ID NO: 8 (nucleotide sequence coding for KlenTaq RIV D15) ATGAGAGGATCTCACCATCACCATCACCATACGGATCCGCATGCAGCACTGGAAGAAGCA CCTTGGCCTCCGCCTGAAGGTGCATTTGTTGGTTTTGTTCTGAGCCGTAAAGAACCGATG TGGGCAGATCTGCTGGCACTGGCAGCAGCACGTGGTGGTCGTGTTCATCGTGCACCGGAA CCGTATAAAGCTCTGCGCGATCTGAAAGAAGCACGCGGTCTGCTGGCAAAAGATCTGAGC GTTCTGGCACTGCGTGAAGGTCTGGGACTGCCTCCGGGTGATGATCCGATGCTGCTGGCA TATCTGCTGGATCCGAGCAATACCACACCGGAAGGTGTTGCACGTCGTTATGGTGGTGAA TGGACCGAAGAAGCAGGCGAACGCGCAGCACTGAGCGAACGTCTGTTTGCAAATCTGTGG GGTCGTCTGGAAGGTGAAGAACGTCTGCTGTGGCTGTATCGTGAAGTTGAACGTCCGCTG TCTGCAGTTCTGGCACACATGGAAGCAACCGGTGTTCGTCTGGATGTTGCATATCTGCGT GCACTGAGCCTGGAAGTTGCAGAAGAAATTGCACGTCTGGAAGCAGAAGTTTTTCGTCTG GCCGGCCATCCGTTTAAACTGAATAGCCGTGATCAGCTGGAACGTGTTCTGTTTGATGAA CTGGGTCTGCCAGCAATTGGTAAAACCAAAAAAACCGGTAAACGTAGCACCAACGCAGCA GTTCTGGAAGCCCTGCGTGAAGCACATCCGATTGTGGAAAAAATTCTGCAGTATCGCGAA CTGACCAACCTGAAAAGCACCTATATCGATCCGCTGCCGGATCTGATTCATCCGCGTACC GGTCGTCTGCATACCCGTTTTAATCAGACCGCAACCAAAACCGGTCGCCTGAGCAGCAGC GATCCGAATCTGCAGAATATTCCGGGTCGTACACCGCTGGGTCAGCGTATTCGTCGTGCA 
TTTATTGCAGAAGAAGGTTGGCTGCTGGTTGCACTGGATTATAGCCAGAAAGAACTGCGT GTTCTGGCCCATCTGAGCGGTGATGAAAATCTGATTCGCGTGTTTCAGGAAGGTCGCGAT ATTCATACCGAAACCGCAAGCTGGATGTTTGGTGTTCCGCGTGAAGCAGTTGATCCGCTG ATGCGTCGTGCAGCAAAAACCATTAATTTTGGGGTGCTGTATGGTATGAGCGCACATCGT CTGAGCCAGGAACTGGCAATTCCGTACGAAGAAGCCCAGGCATTTATCGAACGTTATTTT CAGAGCTTTCCGAAAGTTCGTGCCTGGATTGAAAAAACCCTGGAAGAAGGTCGTCGTCGC GGTTATGTTGAAACCCTGTTTGGTCGTCGTCGTTATGTTCCGGATCTGGAAGCACGTGTT AAAAGCGTTCGTGAAGCAGCAGAACGTATGGCCTTTAATATGCCGGTTCAGGGCACCGCA GCAGATCTGATGAAACTGGCCATGGTTAAACTGTTTCCGCGTCTGGAAGAAATGGGTGCA CGTATGCTGCTGCAGGTTCATGATGAACTGGTGCTGGAAGCACCGAAAGAACGTGCAGAA GCAGTTGCCCGTCTGGCAAAAGAAGTTATGGAAGGCGTTTATCCGCTGGCAGTTCCGCTG GAAGTTGAAGTTGGTATTGGTGAAGATTGGCTGTCTGCAAAAGAA bold: mutated codon Dark grey: KlenTaq wild-type base Light grey: mutated base SEQ ID NO: 9 (fragment of SEQ ID NO: 38 (Table 2), as shown in Fig.2) CTGCTGCTTGAAAATGGATTGTGCGTAAAGACGGAGGGTAATTATAGATATACCACCTAG TCTTCTTGATCCGAGGCCTACAGCTTTTGATCCCT SEQ ID NOs: 10 to 36 are shown in Table 2 SEQ ID NOs: 37 to 41 are shown in Table 3 SEQ ID NO: 42 (amino acid sequence of wild-type Thermus thermophilus DNA polymerase) MEAMLPLFEPKGRVLLVDGHHLAYRTFFALKGLTTSRGEPVQAVYGFAKSLLKALKEDGY KAVFVVFDAKAPSFRHEAYEAYKAGRAPTPEDFPRQLALIKELVDLLGFTRLEVPGYEAD DVLATLAKKAEKEGYEVRILTADRDLYQLVSDRVAVLHPEGHLITPEWLWEKYGLRPEQW VDFRALVGDPSDNLPGVKGIGEKTALKLLKEWGSLENLLKNLDRVKPENVREKIKAHLED LRLSLELSRVRTDLPLEVDLAQGREPDREGLRAFLERLEFGSLLHEFGLLEAPAPLEEAP WPPPEGAFVGFVLSRPEPMWAELKALAACRDGRVHRAADPLAGLKDLKEVRGLLAKDLAV LASREGLDLVPGDDPMLLAYLLDPSNTTPEGVARRYGGEWTEDAAHRALLSERLHRNLLK RLEGEEKLLWLYHEVEKPLSRVLAHMEATGVRLDVAYLQALSLELAEEIRRLEEEVFRLA GHPFNLNSRDQLERVLFDELRLPALGKTQKTGKRSTSAAVLEALREAHPIVEKILQHREL TKLKNTYVDPLPSLVHPRTGRLHTRFNQTATATGRLSSSDPNLQNIPVRTPLGQRIRRAF VAEAGWALVALDYSQIELRVLAHLSGDENLIRVFQEGKDIHTQTASWMFGVPPEAVDPLM RRAAKTVNFGVLYGMSAHRLSQELAIPYEEAVAFIERYFQSFPKVRAWIEKTLEEGRKRG YVETLFGRRRYVPDLNARVKSVREAAERMAFNMPVQGTAADLMKLAMVKLFPRLREMGAR MLLQVHDELLLEAPQARAEEVAALAKEAMEKAYPLAVPLEVEVGMGEDWLSAKG SEQ ID NO: 43 
(amino acid sequence of wild-type Escherichia coli DNA polymerase I) MVQIPQNPLILVDGSSYLYRAYHAFPPLTNSAGEPTGAMYGVLNMLRSLIMQYKPTHAAV VFDAKGKTFRDELFEHYKSHRPPMPDDLRAQIEPLHAMVKAMGLPLLAVSGVEADDVIGT LAREAEKAGRPVLISTGDKDMAQLVTPNITLINTMTNTILGPEEVVNKYGVPPELIIDFL ALMGDSSDNIPGVPGVGEKTAQALLQGLGGLDTLYAEPEKIAGLSFRGAKTMAAKLEQNK EVAYLSYQLATIKTDVELELTCEQLEVQQPAAEELLGLFKKYEFKRWTADVEAGKWLQAK GAKPAAKPQETSVADEAPEVTATVISYDNYVTILDEETLKAWIAKLEKAPVFAFDTETDS LDNISANLVGLSFAIEPGVAAYIPVAHDYLDAPDQISRERALELLKPLLEDEKALKVGQN LKYDRGILANYGIELRGIAFDTMLESYILNSVAGRHDMDSLAERWLKHKTITFEEIAGKG KNQLTFNQIALEEAGRYAAEDADVTLQLHLKMWPDLQKHKGPLNVFENIEMPLVPVLSRI ERNGVKIDPKVLHNHSEELTLRLAELEKKAHEIAGEEFNLSSTKQLQTILFEKQGIKPLK KTPGGAPSTSEEVLEELALDYPLPKVILEYRGLAKLKSTYTDKLPLMINPKTGRVHTSYH QAVTATGRLSSTDPNLQNIPVRNEEGRRIRQAFIAPEDYVIVSADYSQIELRIMAHLSRD KGLLTAFAEGKDIHRATAAEVFGLPLETVTSEQRRSAKAINFGLIYGMSAFGLARQLNIP RKEAQKYMDLYFERYPGVLEYMERTRAQAKEQGYVETLDGRRLYLPDIKSSNGARRAAAE RAAINAPMQGTAADIIKRAMIAVDAWLQAEQPRVRMIMQVHDELVFEVHKDDVDAVAKQI HQLMENCTRLDVPLLVEVGSGENWDQAH SEQ ID NO: 44 (amino acid sequence of wild-type E. 
coli phage T7 DNA polymerase) MIVSDIEANALLESVTKFHCGVIYDYSTAEYVSYRPSDFGAYLDALEAEVARGGLIVFHN GHKYDVPALTKLAKLQLNREFHLPRENCIDTLVLSRLIHSNLKDTDMGLLRSGKLPGKRF GSHALEAWGYRLGEMKGEYKDDFKRMLEEQGEEYVDGMEWWNFNEEMMDYNVQDVVVTKA LLEKLLSDKHYFPPEIDFTDVGYTTFWSESLEAVDIEHRAAWLLAKQERNGFPFDTKAIE ELYVELAARRSELLRKLTETFGSWYQPKGGTEMFCHPRTGKPLPKYPRIKTPKVGGIFKK PKNKAQREGREPCELDTREYVAGAPYTPVEHVVFNPSSRDHIQKKLQEAGWVPTKYTDKG APVVDDEVLEGVRVDDPEKQAAIDLIKEYLMIQKRIGQSAEGDKAWLRYVAEDGKIHGSV NPNGAVTGRATHAFPNLAQIPGVRSPYGEQCRAAFGAEHHLDGITGKPWVQAGIDASGLE LRCLAHFMARFDNGEYAHEILNGDIHTKNQIAAELPTRDNAKTFIYGFLYGAGDEKIGQI VGAGKERGKELKKKFLENTPAIAALRESIQQTLVESSQWVAGEQQVKWKRRWIKGLDGRK VHVRSPHAALNTLLQSAGALICKLWIIKTEEMLVEKGLKHGWDGDFAYMAWVHDEIQVGC RTEEIAQVVIETAQEAMRWVGDHWNFRCLLDTEGKMGPNWAICH SEQ ID NO: 45 (amino acid sequence of wild-type Bacillus stearothermophilus DNA polymerase) MKNKLVLIDGNSVAYRAFFALPLLHNDKGIHTNAVYGFTMMLNKILAEEQPTHILVAFDA GKTTFRHETFQDYKGGRQQTPPELSEQFPLLRELLKAYRIPAYELDHYEADDIIGTMAAR AEREGFAVKVISGDRDLTQLASPQVTVEITKKGITDIESYTPETVVEKYGLTPEQIVDLK GLMGDKSDNIPGVPGIGEKTAVKLLKQFGTVENVLASIDEIKGEKLKENLRQYRDLALLS KQLAAICRDAPVELTLDDIVYKGEDREKVVALFQELGFQSFLDKMAVQTDEGEKPLAGMD FAIADSVTDEMLADKAALVVEVVGDNYHHAPIVGIALANERGRFFLRPETALADPKFLAW LGDETKKKTMFDSKRAAVALKWKGIELRGVVFDLLLAAYLLDPAQAAGDVAAVAKMHQYE AVRSDEAVYGKGAKRTVPDEPTLAEHLVRKAAAIWALEEPLMDELRRNEQDRLLTELEQP LAGILANMEFTGVKVDTKRLEQMGAELTEQLQAVERRIYELAGQEFNINSPKQLGTVLFD KLQLPVLKKTKTGYSTSADVLEKLAPHHEIVEHILHYRQLGKLQSTYIEGLLKVVHPVTG KVHTMFNQALTQTGRLSSVEPNLQNIPIRLEEGRKIRQAFVPSEPDWLIFAADYSQIELR VLAHIAEDDNLIEAFRRGLDIHTKTAMDIFHVSEEDVTANMRRQAKAVNFGIVYGISDYG LAQNLNITRKEAAEFIERYFASFPGVKQYMDNIVQEAKQKGYVTTLLHRRRYLPDITSRN FNVRSFAERTAMNTPIQGSAADIIKKAMIDLSVRLREERLQARLLLQVHDELILEAPKEE IERLCRLVPEVMEQAVTLRVPLKVDYHYGPTWYDAK SEQ ID NO: 46 (amino acid sequence of wild-type Bacillus subtilis DNA polymerase) MTERKKLVLVDGNSLAYRAFFALPLLSNDKGVHTNAVYGFAMILMKMLEDEKPTHMLVAF DAGKTTFRHGTFKEYKGGRQKTPPELSEQMPFIRELLDAYQISRYELEQYEADDIIGTLA KSAEKDGFEVKVFSGDKDLTQLATDKTTVAITRKGITDVEFYTPEHVKEKYGLTPEQIID 
MKGLMGDSSDNIPGVPGVGEKTAIKLLKQFDSVEKLLESIDEVSGKKLKEKLEEFKDQAL MSKELATIMTDAPIEVSVSGLEYQGFNREQVIAIFKDLGFNTLLERLGEDSAEAEQDQSL EDINVKTVTDVTSDILVSPSAFVVEQIGDNYHEEPILGFSIVNETGAYFIPKDIAVESEV FKEWVENDEQKKWVFDSKRAVVALRWQGIELKGAEFDTLLAAYIINPGNSYDDVASVAKD YGLHIVSSDESVYGKGAKRAVPSEDVLSEHLGRKALAIQSLREKLVQELENNDQLELFEE LEMPLALILGEMESTGVKVDVDRLKRMGEELGAKLKEYEEKIHEIAGEPFNINSPKQLGV ILFEKIGLPVVKKTKTGYSTSADVLEKLADKHDIVDYILQYRQIGKLQSTYIEGLLKVTR PDSHKVHTRFNQALTQTGRLSSTDPNLQNIPIRLEEGRKIRQAFVPSEKDWLIFAADYSQ IELRVLAHISKDENLIEAFTNDMDIHTKTAMDVFHVAKDEVTSAMRRQAKAVNFGIVYGI SDYGLSQNLGITRKEAGAFIDRYLESFQGVKAYMEDSVQEAKQKGYVTTLMHRRRYIPEL TSRNFNIRSFAERTAMNTPIQGSAADIIKKAMIDMAAKLKEKQLKARLLLQVHDELIFEA PKEEIEILEKLVPEVMEHALALDVPLKVDFASGPSWYDAK SEQ ID NO: 47 (amino acid sequence of wild-type Bacillus phage SP01 DNA polymerase) MGSALDTLKEFNPKPMKGQGSKKARIIIVQENPFDYEYRKKKYMTGKAGKLLKFGLAEVG IDPDEDVYYTSIVKYPTPENRLPTPDEIKESMDYMWAEIEVIDPDIIIPTGNLSLKFLTK MTAITKVRGKLYEIEGRKFFPMIHPNTVLKQPKYQDFFIKDLEILASLLEGKTPKNVLAF TKERRYCDTFEDAIDEIKRYLELPAGSRVVIDLETVKTNPFIEKVTMKKTTLEAYPMSQQ PKIVGIGLSDRSGYGCAIPLYHRENLMKGNQIGTIVKFLRKLLEREDLEFIAHNGKFDIR WLRASLDIYLDISIWDTMLIHIIDYRGERYSWSKRLAWLETDMGGYDDALDGEKPKGEDE GNYDLIPWDILKVYLADDCDVTFRLSEKYIPLVEENEEKKWLWENIMVPGYYTLLDIEMD GIHVDREWLEVLRVSYEKEISRLEDKMREFPEGVAMEREMRDKWKERVMIGNIKSANRTP EQQDKFKKYKKYDPSKGGDKINFGSTKQLGELLFERMGLETVIFTDKGAPSTNDDSLKFM GSQSDFVKVLMEFRKANHLYNNFVSKLSLMIDPDNIVHPSYNIHGTVTGRLSSNEPNAQQ FPRKVNTPTLFQYNFEIKKMFNSRFGDGGVIVQFDYSQLELRILVCYYSRPYTIDLYRSG ADLHKAVASDAFGVAIEEVSKDQRTASKKIQFGIVYQESARGLSEDLRAEGITMSEDECE IFIKKYFKRFPKVSKWIRDTKKHVKDISTVKTLTGATRNLPDIDSIDQSKANEAERQAVN TPIQGTGSDCTLMSLILINQWLRESGLRSRICITVHDSIVLDCPKDEVLEVAKKVKHIME NLGEYNEFYKFLGDVPILSEMEIGRNYGDAFEATIEDIEEHGVDGFIEMKEKEKLEKDMK EFTKIIEDGGSIPDYARIYWENIS SEQ ID NOs: 48 to 83 are shown in Table 1 EXAMPLES The present invention will be further illustrated by the following examples without being limited thereto. 
Material and methods:

Oligonucleotides and HeLa genomic DNA
DNA oligonucleotides were purchased from biomers.net GmbH in HPLC grade and used directly for primer extension reactions with capillary electrophoresis (CE) analysis, PCR and NGS library preparation. Oligonucleotides applied in primer extension reactions with denaturing polyacrylamide gel electrophoresis (PAGE) analysis and phosphor imaging were purified by preparative denaturing PAGE and radioactively 5'-end labelled with [γ-32P]-ATP and T4 polynucleotide kinase (New England Biolabs) according to the manufacturer's instructions. DNA sequences of the oligonucleotides used are listed in Table 2. HeLa genomic DNA (gDNA) and CpG methylated HeLa gDNA (mCpG gDNA) were purchased from New England Biolabs and used directly in experiments. DNA sequences of templates and the gDNA amplicon target region are listed in Table 3.

Table 2: DNA oligonucleotides used herein.
Oligonucleotide | Sequence | SEQ ID NO:
Primer extension experiments
Radioactive primer extension primer | 5'-ACTACAAGCCCCAAAAGCAG-3' | 10
Oligonucleotide C template (P = Phosphate) | 5'-ATCTGCTCGAGGCCTGCTTTTGGGGCTTGTAGT-(P)-3' | 11
Oligonucleotide 5mC template (P = Phosphate) | 5'-ATCTGCTCGAGG5mCCTGCTTTTGGGGCTTGTAGT-(P)-3' | 12
CE primer 20 nt (F = FAM/HEX) | 5'-(F)-ACTACAAGCCCCAAAAGCAG-3' | 13
CE primer 25 nt (F = FAM/HEX) | 5'-(F)-CGATCACTACAAGCCCCAAAAGCAG-3' | 14
CE primer 30 nt (F = FAM/HEX) | 5'-(F)-TCGATCGATCACTACAAGCCCCAAAAGCAG-3' | 15
CE primer 35 nt (F = FAM/HEX) | 5'-(F)-ATCGATCGATCGATCACTACAAGCCCCAAAAGCAG-3' | 16
CE primer 40 nt (F = FAM/HEX) | 5'-(F)-GATCGATCGATCGATCGATCACTACAAGCCCCAAAAGCAG-3' | 17
CE primer 45 nt (F = FAM/HEX) | 5'-(F)-CGATCGATCGATCGATCGATCGATCACTACAAGCCCCAAAAGCAG-3' | 18
PCR experiments
109 bp forward primer | 5'-GAATGGGATAGAGAAGGGATCAAAAG-3' | 19
109 bp reverse primer | 5'-CTGCTGCTTGAAAATGGATTGTGC-3' | 20
803 bp forward primer | 5'-TCTGTCTTTTCATCATTGGTTCT-3' | 21
803 bp reverse primer | 5'-TCCTAGACACAACTGAATCCCAA-3' | 22
Bisulfite conversion forward primer | 5'-CATAATACTACTTAAAAAAATCACTCTAACA-3' | 23
Bisulfite conversion reverse primer | 5'-GATTTTTTGGAATTTTAAATATAATTTTGAAGT-3' | 24
NGS library preparation
UMI forward primer 1 | 5'-CTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNNGAATGGGATAGAGAAGGGATCAA-3' | 25
UMI forward primer 2 | 5'-CTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNNCGAATGGGATAGAGAAGGGATCAA-3' | 26
UMI forward primer 3 | 5'-CTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNNTCGAATGGGATAGAGAAGGGATCAA-3' | 27
UMI reverse primer 1 | 5'-GGAGTTCAGACGTGTGCTCTTCCGATCTNNNNNNNNCTGCTGCTTGAAAATGGATTGTG-3' | 28
UMI reverse primer 2 | 5'-GGAGTTCAGACGTGTGCTCTTCCGATCTNNNNNNNNACTGCTGCTTGAAAATGGATTGTG-3' | 29
UMI reverse primer 3 | 5'-GGAGTTCAGACGTGTGCTCTTCCGATCTNNNNNNNNTACTGCTGCTTGAAAATGGATTGTG-3' | 30
183 bp forward primer | 5'-CTTTCCCTACACGACGCTCTTCCGAT-3' | 31
183 bp reverse primer | 5'-GGAGTTCAGACGTGTGCTCTTCCGAT-3' | 32
Amplicon forward primer ([i7] = Illumina TruSeq CD i7 indexes 1) | 5'-CAAGCAGAAGACGGCATACGAGAT [i7] GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC-3' | 33
Amplicon reverse primer ([i5] = Illumina TruSeq CD i5 indexes 2) | 5'-AATGATACGGCGACCACCGAGATCTACAC [i5] ACACTCTTTCCCTACACGACGCTCTTCCGATCT-3' | 34
Bisulfite conversion UMI forward primer | 5'-CTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNNAAAAATCAAAAACTATAAACCTC-3' | 35
Bisulfite conversion UMI reverse primer | 5'-GGAGTTCAGACGTGTGCTCTTCCGATCTNNNNNNNNTTGTTGTTTGAAAATGGATTGTG-3' | 36

Table 3: Templates and PCR products used herein.
803 bp template DNA generated by PCR (SEQ ID NO: 37)
5'-TCTGTCTTTTCATCATTGGTTCTTTTATTATTTTTTAAACTTACATTTGTTTTTCTGAAACCGAGCTAA
AAACTGTAGACATTGCTTCATTTAATGTTTAGCATTTCTGAGAAATCTTAGATCAGTTTGATTATAATTCT
TTTATAAGAATGGTGTTTTTTCCTTCATAGATTCTCTGGAATTTTAAACATAACCTTGAAGTTCAAATTAT
TCACCAAGACCTGACTAATATTTAGCCTCTTTTAAATAAGTTGTCTGCTGCTTGAAAATGGATTGTGCGTA
AAGACGGAGGGTAATTATAGATATACCACCTAGTCTTCTTGATCCGAGGCCTACAGCTTTTGATCCCTTCT
CTATCCCATTCTATCAACAATGTCAGAGTGATCCTTCTAAGTAGCATTATGACAATGTCACTCTGCAGCTT
CAAATATTCAGGTGAATCTCCTCATCTATAAAATAAAGTCCAAAATTCTCAGCATGTAATATAAGTCTATT
AATGTAATATATCCAAAACACTGACCATATCTTTGTTTATCTTTTACTTTGTGTCAGTTCTGGTTTTTACT
GTTCCTAATAAAGTTTTTAATTTTTATTATTAATTTTTTTTAACCAGAGTTTGCTAACCACATTCATTTCT
TTTTTTATTTATCCCAGCCTCTCAATTTCATTTTCCAGCTTAATATCACCATTTCTCCACTAGACTTCAAT
ATTCTAGATTACGAATACAATAAGAAGAGCCTCTAGATAACCAACTGCAAATTAGAAACCAGTAATACAAA
ATTGGGATTCAGTTGTGTCTAGGA-3'

109 bp PCR product and NGS amplicon target OR10A2 olfactory receptor family 10 subfamily A member 2 [Homo sapiens (human)] (SEQ ID NO: 38)
5'-CTGCTGCTTGAAAATGGATTGTGCGTAAAGACGGAGGGTAATTATAGATATACCACCTAGTCTTCTTGA
TCCGAGGCCTACAGCTTTTGATCCCTTCTCTATCCCATTC-3'

235 bp bisulfite conversion product (SEQ ID NO: 39)
5'-GATTTTTTGGAATTTTAAATATAATTTTGAAGTTTAAATTATTTATTAAGATTTGATTAATATTTAGTT
TTTTTTAAATAAGTTGTTTGTTGTTTGAAAATGGATTGTGTGTAAAGATGGAGGGTAATTATAGATATATTATT
TAGTTTTTTTGATTTGAGGTTTATAGTTTTTGATTTTTTTTTTATTTTATTTTATTAATAATGTTAGAGTGATT
TTTTTAAGTAGTATTATG-3'

95 bp NGS bisulfite conversion target (SEQ ID NO: 40)
5'-TTGTTGTTTGAAAATGGATTGTGTGTAAAGATGGAGGGTAATTATAGATATATTATTTAGTTTTTTTGA
TTTGAGGTTTATAGTTTTTGATTTTT-3'

Sequence region for 5mC detection (SEQ ID NO: 41)
5'-CGTAAAGACGGAGGGTAATTATAGATATACCACCTAGTCTTCTTGATCCGAGGCCTACAG-3'

a Primer binding sites highlighted in grey
b C at CpG sites highlighted in bold
c C positions highlighted by underlining

Primer extension with radioactively labelled primer
The reaction mixtures contained 150 nM [γ-32P]-labelled primer, 200 nM oligonucleotide template (C or 5mC) and 50 nM KTq
wild-type in 1× KTq reaction buffer (50 mM Tris HCl (pH 9.2), 16 mM (NH4)2SO4, 2.5 mM MgCl2, 0.1% (v/v) Tween 20). Reaction mixtures were heated to 95°C for 2 min and subsequently cooled down stepwise to 4°C for annealing. Reaction mixtures were then incubated at 55°C and primer extension was started by adding 50 µM of the respective dNTP in a final volume of 35 µL. After the indicated reaction times, 5 µL of reaction mixture was stopped by mixing with 5 µL stop solution (80% formamide, 20 mM EDTA, 0.25% (w/v) bromophenol blue, 0.25% (w/v) xylene cyanol). After denaturation for 3 min at 95°C, reactions were analysed by 12% denaturing PAGE and visualised by phosphor imaging (Typhoon™ FLA 9500, GE Healthcare Life Sciences).

Library expression and lysate preparation in 96-well plates
KTq variant libraries applied in screening included all 19 single mutants at positions N485, E507, S515, K540, Y545, T569, A570, T571, R573, D578, N583, I584, V586, R587, I614, E615, L616, I638, H639, R659, R660, A661, K663, T664, I665, F667, G668, V669, L670, Y671, G672, M673, R677, E681, R728, A743, R746, M747, F749, N750, Q754, V783 and H784. In addition, 153 double mutants were generated and screened based on rational combination of functional single amino acid mutations. KTq libraries were prepared by site-directed mutagenesis of the respective codons as known in the art and stored as glycerol stocks in 96-deep-well plates. 2167 PCR-active variants, derived from a combinatorial library generated by random chimeragenesis on a transient template (RACHITT), were used directly for gene expression in 96-deep-well plates. For library expression, 990 µL LB medium supplemented with 100 µg/mL carbenicillin disodium salt was inoculated with 10 µL of overnight grown cultures of Escherichia coli (E. coli) BL21 (DE3) cells (Novagen) harbouring library plasmids.
Cells were grown at 37°C on a plate shaker to an OD600 of 0.4-0.6 and gene expression was induced by the addition of IPTG (final concentration 0.4 mM). After incubation at 37°C for 3 h, cells were harvested by centrifugation at 4°C for 30 min. Pellets were lysed as known in the art. Lysates were stored at 4°C for up to four weeks and used without any further purification for primer extension reactions.

Primer extension for determination of lysate dilution
The screening reaction conditions were chosen to lie within the optimal reaction window, in which KTq wild-type elongated the primer completely after 10 min with 70 nM dGTP substrate and only started to elongate it after 15 min with 35 µM dATP substrate (Fig. 1). To conduct experiments with the same reaction conditions and lysate concentration for all KTq variants, lysate dilutions were determined in a preliminary primer extension reaction. For this, KTq wild-type lysate was diluted in 1× KTq reaction buffer (50 mM Tris HCl (pH 9.2), 16 mM (NH4)2SO4, 2.5 mM MgCl2, 0.1% (v/v) Tween 20) in ratios between 1:10 and 1:100. Lysate dilutions (final 20% (v/v)) were mixed with 100 nM oligonucleotide C template in 1× KTq reaction buffer and added to 10 nM fluorescently labelled primers. Primers varied in length and each size was assigned to a different dilution ratio. Reaction mixtures were prepared twice, using either a 5'-6-carboxyfluorescein (FAM) labelled primer set for primer extension in the presence of the match base guanine (G) or a 5'-hexachlorofluorescein (HEX) labelled primer set for primer extension in the presence of the mismatch base A. Reaction mixtures were heated to 95°C for 2 min and then cooled down to 4°C for annealing. The primer extension reaction was started at 55°C by adding 70 nM dGTP or 35 µM dATP. After 5, 10 and 15 min, 5 µL of each reaction mixture was stopped by mixing with 5 µL CE stop solution (80% (v/v) formamide, 20 mM EDTA).
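Why dGTP serves as the match substrate and dATP as the mismatch substrate follows from the sequences in Table 2: the primer anneals to the 3' end of the oligonucleotide template, so the first base the polymerase reads is the C (or 5mC) directly 5' of the primer binding site. A minimal Python sketch of that check, assuming the 20 nt primer (SEQ ID NO: 13, without the dye) and the C template (SEQ ID NO: 11, without the 3'-phosphate); the function names are illustrative and not part of the described method:

```python
# Illustrative check: which template bases are read first during primer extension?
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def templating_bases(template: str, primer: str, n: int = 2) -> str:
    """Return the first n template bases read when the annealed primer is extended.

    Both sequences are written 5'->3'. The primer binds where its reverse
    complement occurs in the template; extension then proceeds towards the
    5' end of the template, one base at a time.
    """
    site = template.find(reverse_complement(primer))
    if site < 0:
        raise ValueError("primer does not anneal to this template")
    # Bases 5' of the binding site, in the order the polymerase meets them.
    return template[:site][::-1][:n]


TEMPLATE_C = "ATCTGCTCGAGGCCTGCTTTTGGGGCTTGTAGT"  # SEQ ID NO: 11
PRIMER_20NT = "ACTACAAGCCCCAAAAGCAG"              # SEQ ID NO: 13

read = templating_bases(TEMPLATE_C, PRIMER_20NT)
matched_dntps = ["d" + base.translate(COMPLEMENT) + "TP" for base in read]
print(read, matched_dntps)  # CG ['dGTP', 'dCTP']
```

The first templating base is the C at the position that carries 5mC in SEQ ID NO: 12, which pairs with dGTP; the second is G, which pairs with dCTP.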
After denaturation for 3 min at 95°C, single-nucleotide incorporation was analysed by CE. The applicable concentrations of dGTP (70 nM) and dATP (35 µM) were determined in a preceding experiment by a dNTP dilution series employing KTq wild-type lysate.

Primer extension in screening experiment
Screening experiments were performed with either dGTP or dATP as substrate for single-nucleotide incorporation. KTq variant lysates were diluted in 1× KTq buffer (50 mM Tris HCl (pH 9.2), 16 mM (NH4)2SO4, 2.5 mM MgCl2, 0.1% (v/v) Tween 20) according to the predefined dilution ratios in 96-deep-well plates. 4 µL per column of diluted lysates were transferred twice into adjacent columns in a 96-well reaction plate (on ice). As six primers of different length (20 nt, 25 nt, 30 nt, 35 nt, 40 nt, 45 nt) and two different fluorescence dyes (5'-FAM and 5'-HEX) were utilised in the primer extension experiment, 48 lysates could be multiplexed and analysed in one 8-capillary CE run. Prior to reaction, 10 nM fluorescently labelled primers (sorted by size in ascending order, same primer length in consecutive tubes) were mixed in a 12-tube PCR strip with 100 nM oligonucleotide template (5'-FAM labelled primers with C template, 5'-HEX labelled primers with 5mC template) in sufficient amount. Mixtures were heated for 2 min at 95°C and cooled down to 4°C. Annealed primer/template pairs were mixed with 1× KTq reaction buffer and 12 µL were distributed to each lysate row. Primer extension was started at 55°C by addition of 70 nM dGTP or 35 µM dATP in a final volume of 20 µL. After 10 min, reactions were stopped by adding 20 µL CE stop solution (80% (v/v) formamide, 20 mM EDTA). After denaturation for 3 min at 95°C, single-nucleotide incorporation was analysed by CE.

Capillary electrophoresis (CE)
CE was used for separation and analysis of extended 5'-fluorescently labelled primers.
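The multiplexing arithmetic behind one CE run can be made explicit. The Python sketch below assumes that each lysate is assigned one primer length within a plate row and is run once per dye (FAM with the C template, HEX with the 5mC template), which is how the figure of 48 lysates per 8-capillary run is read here; the helper `decode_peak` and the lysate numbering are hypothetical.

```python
# Illustrative bookkeeping for one 8-capillary CE run (assumed plate layout).
PRIMER_LENGTHS_NT = (20, 25, 30, 35, 40, 45)   # six CE primers (Table 2)
DYE_TO_TEMPLATE = {"FAM": "C", "HEX": "5mC"}   # dye/template pairing in the screen
CAPILLARIES = 8                                # one reaction-plate row per capillary

reactions_per_row = len(PRIMER_LENGTHS_NT) * len(DYE_TO_TEMPLATE)
lysates_per_run = len(PRIMER_LENGTHS_NT) * CAPILLARIES
reactions_per_plate = reactions_per_row * CAPILLARIES

# Pooling for CE: 38 uL formamide/size standard plus 1 uL of each reaction of a row.
ce_well_volume_ul = 38 + 1 * reactions_per_row


def decode_peak(row: int, primer_length_nt: int, dye: str) -> tuple:
    """Map a CE signal back to (lysate index within the run, template)."""
    if not 0 <= row < CAPILLARIES:
        raise ValueError("row must be between 0 and 7")
    slot = PRIMER_LENGTHS_NT.index(primer_length_nt)
    lysate = row * len(PRIMER_LENGTHS_NT) + slot
    return lysate, DYE_TO_TEMPLATE[dye]


print(reactions_per_row, lysates_per_run, reactions_per_plate, ce_well_volume_ul)
# 12 48 96 50
print(decode_peak(row=2, primer_length_nt=30, dye="HEX"))  # (14, '5mC')
```

Under these assumptions the 96 wells of the reaction plate hold 48 lysates, each tested on both templates, and the 50 µL per CE well agrees with the pooling volumes stated for the CE run.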
For one CE run, 38 µL Hi-Di™ formamide (Thermo Fisher Scientific) mixed with 0.15% (v/v) GeneScan™ 120 LIZ Size Standard (Thermo Fisher Scientific) was added to each well of one column of a MicroAmp™ 96-well plate (Thermo Fisher Scientific). Then 1 µL of each of the 12 reactions from one row of the 96-well reaction plate was combined in a single well of the MicroAmp™ 96-well plate to obtain a final volume of 50 µL per well. The MicroAmp™ 96-well plate was briefly centrifuged and placed into the Applied Biosystems Genetic Analyzer 3500 (Thermo Fisher Scientific) with an 8-capillary array (35 cm) filled with POP-6™ polymer (Thermo Fisher Scientific). The following parameters were applied for the CE run: G5 dye set, 60°C oven temperature, 1900 s run time, 13.0 kV run voltage, 180 s pre-run time, 13.0 kV pre-run voltage, 50 s injection time, 1.6 kV injection voltage and 200 s data delay. Qualitative CE data analysis was performed using the GeneMapper™ Software 5.

KTq variant expression and protein purification
For KTq variant expression, 25 mL LB medium supplemented with 100 µg/mL carbenicillin disodium salt was inoculated with 250 µL of overnight grown cultures of E. coli BL21 (DE3) cells (Novagen) harbouring gene plasmids. 5 mL of the remaining overnight grown cultures was used for plasmid extraction and purification employing the QIAprep® Spin Miniprep Kit (Qiagen). KTq variant plasmids were analysed by Sanger sequencing (Azenta Life Sciences) and mutation sites are listed in Table 4. Inoculated media were incubated at 37°C on a shaker to an OD600 of 0.4-0.6 and gene expression was induced by the addition of IPTG (final concentration 1 mM). After incubation at 37°C under shaking for 4 h, cells were harvested by centrifugation at 4°C for 30 min. Pellets were lysed in 15 mL 1× KTq basis buffer (10 mM Tris HCl (pH 9.2), 300 mM NaCl, 2.5 mM MgCl2, 0.1% (v/v) Triton X-100) containing 1 mg/mL lysozyme at 37°C for 20 min. After heat denaturation of E.
coli host proteins at 75°C for 40 min, bacterial cell debris was pelleted by centrifugation at 20000 rpm for 45 min at 4°C. For 6× His-tagged protein purification, the supernatant was supplemented with 5 mM imidazole and metal ion-based affinity purification was performed employing 0.5 mL equilibrated cOmplete™ His-Tag Purification Resin (Roche). Before use, the Ni2+ chelate resin was washed and equilibrated 4 times with 9 mL 1× KTq basis buffer containing 5 mM imidazole by mixing and subsequent centrifugation at 900 rpm for 2 min at 4°C. The lysate/nickel beads suspension was incubated overnight at 4°C in an overhead shaker. After centrifugation at 900 rpm for 2 min at 4°C, the supernatant was removed and the Ni2+ chelate resin was washed 2 times with 15 mL 1× KTq basis buffer containing 20 mM imidazole. For protein elution, the Ni2+ chelate resin was incubated with 5 mL 2× elution buffer (100 mM Tris HCl (pH 9.2), 5 mM MgCl2) containing 100 mM imidazole for 30 min at 4°C in an overhead shaker. The elution fraction was obtained after centrifugation and the elution step was repeated with an additional 5 mL 2× elution buffer. The imidazole was removed from the combined elution fractions using Amicon® Ultra Centrifugal Filters 30000 MWCO (Merck) and washing 4 times with 10 mL 2× elution buffer at 4°C. Finally, the eluate was concentrated to a final volume of 0.1 to 0.3 mL and 1× KTq storage buffer (5 mM Tris HCl (pH 9.2), 16 mM (NH4)2SO4, 0.25 mM MgCl2, 0.1% (v/v) Tween 20) with 50% (v/v) final glycerol was added for storage at -20°C. Protein concentrations were determined using the Bradford assay, and adjusted protein concentrations and purity of enzymes were verified by SDS-PAGE.
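Mutation sites of a variant can be read off by comparing its sequence with KlenTaq wild-type position by position. The following Python sketch illustrates this on a short fragment of SEQ ID NO: 2 and the corresponding fragment of KlenTaq RIV A8 (SEQ ID NO: 3). The start position 479 is an assumption obtained by counting residues in the full-length sequence of SEQ ID NO: 1, and the function name is illustrative; it is not the procedure used to evaluate the Sanger sequencing data.

```python
def list_substitutions(wild_type: str, variant: str, first_position: int) -> list:
    """List substitutions such as 'E507A' between two pre-aligned sequences.

    Both sequences must have the same length; first_position is the residue
    number (full-length Taq numbering) of the first character.
    """
    if len(wild_type) != len(variant):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        f"{wt}{first_position + i}{mut}"
        for i, (wt, mut) in enumerate(zip(wild_type, variant))
        if wt != mut
    ]


# Fragments copied from SEQ ID NO: 2 (wild-type) and SEQ ID NO: 3 (RIV A8).
# Assumption: the leading G is residue 479 in full-length Taq numbering.
WT_FRAGMENT = "GHPFNLNSRDQLERVLFDELGLPAIGKTEKTGKRSTSAA"
A8_FRAGMENT = "GHPFKLNSRDQLERVLFDELGLPAIGKTAKTGKRSTKAA"

print(list_substitutions(WT_FRAGMENT, A8_FRAGMENT, first_position=479))
# ['N483K', 'E507A', 'S515K']
```

For full-length comparisons the N-terminal His-tag of the variant sequences would first have to be trimmed so that both sequences start at the same residue.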
Table 4: Mutation sites of KTq variants. [Table body garbled in the source extraction; not recoverable here.]

Primer extension with fluorescently labelled primer and purified KTq variant
Primer extension reactions with purified KTq variants were executed similarly to reactions in the screening experiment, with the difference that one row of the reaction plate (one capillary) was assigned to one DNA polymerase variant. In short, reaction mixtures contained 2.5 nM KTq variant, 100 nM C or 5mC oligonucleotide template and 10 nM fluorescently labelled primer in 1× KTq buffer (50 mM Tris HCl (pH 9.2), 16 mM (NH4)2SO4, 2.5 mM MgCl2, 0.1% (v/v) Tween 20) with a final volume of 20 µL. Reactions were started at 55°C with addition of dNTPs.
For single-nucleotide incorporation experiments, 35 nM, 70 nM and 100 nM dGTP were added as match substrate and 35 µM, 50 µM and 70 µM dATP as mismatch substrate. For multiple-nucleotide incorporation experiments, dNTP mixtures with either 100 nM dGTP and 100 nM dCTP, or 70 µM dATP and 10 µM dCTP were added. Reactions were stopped after 10 min by adding 20 µL CE stop solution (80% (v/v) formamide, 20 mM EDTA). After denaturation for 3 min at 95°C, extended primers were analysed by CE.

PCR activity of purified KTq variants
Reaction mixtures for quantitative real-time PCR (qPCR) contained 20 µM dGTP and 200 µM d(A/T/C)TP (each), 400 nM 109 bp forward and reverse primer, 50 pM 803 bp C template generated by PCR, 1× SYBR Green I (Sigma-Aldrich) and 250 nM purified KTq variant in 1× KTq reaction buffer (50 mM Tris HCl (pH 9.2), 16 mM (NH4)2SO4, 2.5 mM MgCl2, 0.1% (v/v) Tween 20). qPCR was performed in 10 µL using the LightCycler® 96 instrument (Roche Diagnostics) with an initial denaturation at 95°C for 1 min followed by amplification over 30 cycles with denaturation at 95°C for 10 s, annealing at 62°C for 30 s and elongation at 72°C for 4 min. High resolution melting curves were measured immediately after PCR amplification. qPCR data were analysed with the LightCycler® 96 Application Software (Version 1.1.0.1320) and quantification cycle (Cq) values were determined with the predefined fluorescence threshold value of 0.2. Formation of the correct 109 bp amplicon product was confirmed by comparing melting curves of KTq variant derived PCR products with the melting curve profile of the PCR product obtained with KTq wild-type.

Amplification of human genomic DNA for generation of the 803 bp template
Reaction mixtures contained 200 µM dNTPs (each), 500 nM 803 bp forward and reverse primer, 15 ng/µL HeLa gDNA (New England Biolabs) and 0.02 U/µL Q5® High-Fidelity DNA Polymerase in 1× Q5® Reaction Buffer (New England Biolabs).
PCR was performed in six separate 50 µL reaction mixtures with an initial denaturation at 98°C for 3 min followed by amplification over 30 cycles with denaturation at 98°C for 10 s, annealing at 62°C for 30 s and elongation at 72°C for 30 s. Final elongation was performed for 2 min at 72°C. The 803 bp PCR product was purified by preparative agarose gel electrophoresis using the NucleoSpin® Gel and PCR Clean-up kit (Macherey-Nagel) according to the manufacturer's instructions. Extracted DNA was combined and treated with the PreCR® Repair Mix (New England Biolabs) for DNA damage repair according to the manufacturer's instructions (in short: 100 µM dNTPs (each), 1× NAD+, 1 µL PreCR® Repair Mix in 1× ThermoPol® Reaction Buffer for 20 min at 37°C). The repaired DNA product was purified using the QIAEX II system (Qiagen) according to the manufacturer's instructions and was eluted in 20 µL Milli-Q water. The DNA sequence was verified by Sanger sequencing (Azenta Life Sciences). Part of the purified and repaired PCR product was used as the unmodified 803 bp C template and the remaining part was used directly for the generation of the methylated template DNA.

Methylation of template DNA generated by PCR
Reaction mixtures for CpG methylation of the PCR product contained 600 µM S-adenosyl methionine and 12 U CpG Methyltransferase (M.SssI) in 1× NEBuffer™ 2 (New England Biolabs). The methylation reaction was performed with purified and repaired 803 bp template DNA in a final volume of 30 µL by incubation at 37°C for 1 h. The reaction mixture was purified using the QIAEX II system (Qiagen) according to the manufacturer's instructions and DNA was eluted in 20 µL Milli-Q water. The methylation reaction and purification step were repeated seven times for full CpG methylation. CpG methylation of the modified 803 bp 5mC template DNA was verified by bisulfite conversion and NGS (Fig. 2).
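How bisulfite conversion reports methylation can be illustrated in silico: after conversion and PCR, an unmethylated C is read as T, whereas 5mC is still read as C. The Python sketch below applies this rule to the target region of SEQ ID NO: 9. Without methylation it reproduces the 95 bp target of SEQ ID NO: 40, and with all CpG sites methylated only the CpG cytosines remain. The functions are an illustrative model of the readout, not the analysis pipeline used for Fig. 2.

```python
def bisulfite_read(seq: str, methylated: frozenset = frozenset()) -> str:
    """Model the sequence read after bisulfite conversion and PCR.

    seq: top-strand DNA written 5'->3'; methylated: 0-based indices of 5mC.
    Unmethylated C is deaminated to U and amplified as T; 5mC stays C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def cpg_cytosines(seq: str) -> frozenset:
    """Return the 0-based indices of every C that is directly followed by G."""
    return frozenset(i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")


# SEQ ID NO: 9 (fragment of SEQ ID NO: 38)
TARGET = (
    "CTGCTGCTTGAAAATGGATTGTGCGTAAAGACGGAGGGTAATTATAGATATACCACCTAG"
    "TCTTCTTGATCCGAGGCCTACAGCTTTTGATCCCT"
)

unmethylated = bisulfite_read(TARGET)
fully_methylated = bisulfite_read(TARGET, cpg_cytosines(TARGET))

print(unmethylated)      # all C read as T
print(fully_methylated)  # only the CpG cytosines are still read as C
```

Comparing the two outputs position by position gives the methylation status at each CpG site, which is the information that a conventional DNA polymerase loses during amplification of untreated DNA.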
Bisulfite conversion

10 µL of template DNA generated by PCR or HeLa gDNA (New England Biolabs) were bisulfite converted using the EpiMark® Bisulfite Conversion Kit (New England Biolabs) and immediately desulphonated and purified according to the manufacturer's instructions. Next, the 5'-3' strand of the bisulfite-treated DNA was amplified in two consecutive PCRs. Reaction mixtures for the first PCR contained 200 µM dNTP (each), 100 nM bisulfite conversion forward and reverse primer, 10 µL bisulfite converted DNA and 0.025 U/µL EpiMark® Hot Start Taq DNA Polymerase in 1× EpiMark® Hot Start Taq Reaction Buffer (New England Biolabs). PCR was performed in 50 µL with an initial denaturation at 95°C for 30 s, followed by amplification over 30 cycles with denaturation at 95°C for 20 s, annealing at 49.2°C for 30 s and elongation at 68°C for 40 s. Next, 10 µL of the completed PCR reaction was used as template for the subsequent PCR under the same reaction conditions in 50 µL volume. Five of these reactions were performed and combined. The formation of the 235 bp product was verified by agarose gel electrophoresis and DNA was purified using the NucleoSpin® Gel and PCR Clean-up kit (Macherey-Nagel) according to the manufacturer's instructions. 285 pM of bisulfite converted and amplified 5'-3' strand dsDNA was employed for NGS library preparation following the protocol for gDNA based samples. Methylation status at CpG sites was analysed by NGS (Fig. 2).

Linear amplification of the 5'-3' strand of template DNA generated by PCR

Reaction mixtures for linear PCR contained 2 µM dGTP and 200 µM d(A/T/C)TP (each), 100 nM 109 bp forward primer, 50 pM 803 bp C or 5mC template and 250 nM KTq variant in 1× KTq reaction buffer (50 mM Tris HCl (pH 9.2), 16 mM (NH4)2SO4, 2.5 mM MgCl2, 0.1% (v/v) Tween 20).
PCR was performed in 25 µL with an initial denaturation at 95°C for 3 min followed by amplification over 20 cycles with denaturation at 95°C for 10 s, annealing at 62°C for 30 s and elongation at 72°C for 4 min. Final elongation was performed for 10 min at 72°C. Single-stranded DNA (ssDNA) PCR product from 109 to 364 nt was purified by preparative agarose gel electrophoresis using the NucleoSpin® Gel and PCR Clean-up kit (Macherey-Nagel) in combination with the NTC binding buffer according to the manufacturer's instructions with elution in 22 µL Milli-Q water. To verify that no original template DNA was co-extracted, 1 µL of purified PCR product was subsequently used in a PCR for purity verification. Reaction mixtures contained 200 µM dNTPs (each), 500 nM 803 bp forward and reverse primer, 10% (v/v) PCR product as template and 0.02 U/µL Q5® Hot Start High-Fidelity DNA Polymerase in 1× Q5® Reaction Buffer (New England Biolabs). PCR was performed in 10 µL reaction mixtures with an initial denaturation at 98°C for 1 min followed by amplification over 25 cycles with denaturation at 98°C for 10 s, annealing at 62°C for 30 s and elongation at 72°C for 30 s. Final elongation was performed for 2 min at 72°C. The absence of the 803 bp PCR product was verified by agarose gel electrophoresis.

NGS library preparation using template DNA generated by PCR

17.5 µL of purified linear PCR product ssDNA was treated with the PreCR® Repair Mix (New England Biolabs) in 20 µL according to the manufacturer's instructions. Without further purification, the reaction mixture was applied in a 2-cycle UMI PCR. Reaction mixtures for UMI introduction contained 200 µM dNTPs (each), 200 nM forward and reverse UMI primer, 0.02 U/µL Q5® Hot Start High-Fidelity DNA Polymerase and 0.2× Q5® Reaction Buffer (New England Biolabs) in 25 µL end volume.
PCR was performed with an initial denaturation at 98°C for 1 min followed by primer elongation over 2 cycles with denaturation at 98°C for 10 s, annealing at 48°C for 30 s and elongation at 72°C for 30 s. Final elongation was performed for 2 min at 72°C. The 183 bp UMI PCR product was purified by preparative agarose gel electrophoresis using the NucleoSpin® Gel and PCR Clean-up kit (Macherey-Nagel) according to the manufacturer's instructions. To improve purification, NTI binding buffer was diluted 1:1.5 with Milli-Q water for solubilising agarose gel slices and the dissolved mixture was diluted in total to 1:2.5 prior to DNA binding on the purification column. Purified UMI PCR product was eluted in 15 µL Milli-Q water. DNA concentration was determined by qPCR, using a decadic dilution series of previously prepared 183 bp reference DNA for quantification based on standard amplification curves. Reaction mixtures contained 0.96× NEBNext® Ultra™ II Q5® Master Mix (New England Biolabs), 400 nM 183 bp forward and reverse primer, 1× SYBR Green I (Sigma-Aldrich) and 1.85 µL reference template or UMI PCR product elution in 5 µL end volume. qPCR was performed in a LightCycler® 96 instrument (Roche Diagnostics) with an initial denaturation at 98°C for 1 min followed by amplification over 30 cycles with denaturation at 98°C for 10 s, annealing at 60°C for 30 s and elongation at 72°C for 30 s. High resolution melting curves were measured immediately after PCR amplification. For absolute quantification, Cq values of reactions with reference template were plotted against the logarithm of their DNA concentrations and the linear regression function was used to calculate the concentration of the UMI PCR product. Calculated concentrations were multiplied by two, because the reference template DNA was double stranded whereas only one strand of the UMI PCR product carried both primer binding sites.
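The standard-curve quantification described above can be illustrated with a minimal Python sketch. It is not the procedure actually used for the application (the Cq values were obtained with the instrument software); the function names are hypothetical, and the efficiency formula E = 10^(-1/slope) - 1 is the common textbook definition, assumed here because the text does not state how efficiency was calculated.

```python
import math

def fit_standard_curve(concentrations, cq_values):
    """Least-squares fit of Cq = slope * log10(concentration) + intercept.

    Returns (slope, intercept, r), where r is the Pearson correlation coefficient.
    """
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in cq_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

def amplification_efficiency(slope):
    """Assumed definition: E = 10^(-1/slope) - 1 (E = 1.0 means perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0

def quantify(cq, slope, intercept, strand_factor=2.0):
    """Read a concentration off the standard curve.

    strand_factor=2 mirrors the correction in the text: the reference DNA is
    double stranded, while only one strand of the UMI PCR product is amplifiable.
    """
    return strand_factor * 10 ** ((cq - intercept) / slope)
```

A curve would then be accepted only if `0.9 <= amplification_efficiency(slope) <= 1.1` and `abs(r) >= 0.98`, reading the stated correlation cut-off as applying to the magnitude of r (an assumption; it could equally refer to r squared).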
Reactions for reference and UMI DNA were performed twice and a no-template control reaction was performed to determine side product signals. The accepted reaction efficiency for the standard curve was 0.9 to 1.1 and the cut-off for the correlation coefficient of the linear regression was 0.98. 67.63 fM of purified UMI PCR product was treated with the PreCR® Repair Mix (New England Biolabs) in 11.25 µL according to the manufacturer's instructions. Without further purification, the reaction mixture was applied in the Amplicon PCR. Reaction mixtures for Amplicon library preparation contained 1× NEBNext® Ultra™ II Q5® Master Mix (New England Biolabs), 400 nM forward and reverse Amplicon primer (containing Illumina TruSeq adapter and index sequences) and 26.57 fM UMI PCR product (for final analysis of approximately 100000 UMI families) in 25 µL end volume. PCR was performed with an initial denaturation at 98°C for 1 min followed by amplification over 35 cycles with denaturation at 98°C for 10 s, annealing at 70°C for 30 s and elongation at 72°C for 30 s. Final elongation was performed for 2 min at 72°C. The 263 bp Amplicon PCR product was purified by preparative agarose gel electrophoresis using the NucleoSpin® Gel and PCR Clean-up kit (Macherey-Nagel) according to the manufacturer's instructions. To improve purification, NTI binding buffer was diluted 1:1.5 with Milli-Q water for solubilising agarose gel slices and the dissolved mixture was diluted in total to 1:6.5 prior to DNA binding on the purification column. The PCR product was eluted in 20 µL Milli-Q water and treated with the PreCR® Repair Mix (New England Biolabs) in 20 µL according to the manufacturer's instructions. Repaired product DNA was purified using the QIAEX II system (Qiagen) according to the manufacturer's instructions and was eluted in 18 µL Milli-Q water. DNA library concentrations were determined using the Quantus™ Fluorometer (Promega).
After quality control by electrophoresis with the Bioanalyzer 2100 system (Agilent), DNA libraries were pooled equimolarly and sequencing was performed in paired-end mode on an Illumina MiSeq™ or NextSeq 2000 system with 2 × 75 bp read length. Each NGS library was prepared and sequenced once.

Linear amplification of human genomic DNA 5'-3' strand

Reaction mixtures for linear PCR contained 10 µM dGTP and 200 µM d(A/T/C)TP (each), 100 nM 109 bp forward primer, template DNA (3 fM 803 bp C or 5mC template generated by PCR, 62.5 ng HeLa gDNA native or CpG methylated (New England Biolabs)) and 300 nM RIV A8 KTq variant in 1× KTq reaction buffer (50 mM Tris HCl (pH 9.2), 16 mM (NH4)2SO4, 2.5 mM MgCl2, 0.1% (v/v) Tween 20) in 25 µL end volume. PCR was performed with an initial denaturation at 95°C for 3 min followed by amplification over 20 cycles with denaturation at 95°C for 10 s, annealing at 60°C for 30 s and elongation at 72°C for 10 min. Reaction mixtures were purified using the NucleoSpin® Gel and PCR Clean-up XS kit (Macherey-Nagel) in combination with the NTC binding buffer according to the manufacturer's instructions for ssDNA PCR product purification. DNA was eluted with 18 µL Elution Buffer NE (5 mM Tris HCl (pH 8.5)). To use the linear PCR product as template for NGS library preparation, the ssDNA was exponentially amplified by a high-fidelity DNA polymerase to increase the 109 bp product concentration. Reaction mixtures contained 200 µM dNTPs (each), 100 nM 109 bp forward and reverse primer, 67% (v/v) of the PCR elution and 0.02 U/µL Q5® Hot Start High-Fidelity DNA Polymerase in 1× Q5® Reaction Buffer (New England Biolabs). PCR was performed in 25 µL reaction mixtures with an initial denaturation at 98°C for 30 s followed by amplification over 10 cycles with denaturation at 98°C for 5 s, annealing at 62°C for 10 s and elongation at 72°C for 5 s.
Reaction mixtures were purified using the NucleoSpin® Gel and PCR Clean-up XS kit (Macherey-Nagel) and a 1:2 dilution of the NTI buffer according to the manufacturer's instructions to optimise PCR product purification and yield. DNA was eluted with 18 µL Elution Buffer NE. Note that all reactions were performed in Thermo Scientific Low Profile Tubes (Thermo Scientific) and DNA was eluted in Eppendorf DNA LoBind® Tubes (Eppendorf).

NGS library preparation using human genomic DNA

16.25 µL of purified linear PCR product dsDNA was applied as template for a 2-cycle UMI PCR. Reaction mixtures for UMI introduction contained 200 µM dNTPs (each), 200 nM forward and reverse UMI primer, 0.02 U/µL Q5® Hot Start High-Fidelity DNA Polymerase and 1× Q5® Reaction Buffer (New England Biolabs) in 25 µL end volume. PCR was performed with an initial denaturation at 98°C for 30 s followed by primer elongation over 2 cycles with denaturation at 98°C for 10 s, annealing at 48°C for 30 s and elongation at 72°C for 30 s. Final elongation was performed for 2 min at 72°C. The 183 bp UMI PCR product was purified by preparative agarose gel electrophoresis using the NucleoSpin® Gel and PCR Clean-up XS kit (Macherey-Nagel) according to the manufacturer's instructions. Purified UMI PCR product was eluted with 18 µL Elution Buffer NE. DNA concentration was determined by qPCR according to the method described for NGS library preparation using template DNA generated by PCR. DNA repair using the PreCR® Repair Mix (New England Biolabs) and Amplicon PCR were performed according to the method described for NGS library preparation using template DNA generated by PCR. Amplicon PCR was performed with an initial denaturation at 98°C for 30 s followed by amplification over 30 cycles with denaturation at 98°C for 10 s, annealing at 70°C for 30 s and elongation at 72°C for 30 s.
Final elongation was performed for 2 min at 72°C. The 263 bp Amplicon PCR product was purified by preparative agarose gel electrophoresis using the NucleoSpin® Gel and PCR Clean-up XS kit (Macherey-Nagel) according to the manufacturer's instructions. To improve purification, non-diluted NTI binding buffer was first used for solubilising agarose gel slices and the dissolved mixture was subsequently diluted in total to 1:6.5 with Milli-Q water prior to DNA binding on the purification column. The PCR product was eluted with Elution Buffer NE. Treatment of extracted DNA with PreCR® Repair Mix (New England Biolabs) and QIAEX II system (Qiagen) purification were performed according to the method described for NGS library preparation using template DNA generated by PCR. Repaired product DNA was eluted with 22 µL Milli-Q water. DNA library concentration determination, quality control and sequencing were performed as described for NGS library preparation using template DNA generated by PCR. Note that all reactions were performed in Thermo Scientific Low Profile Tubes (Thermo Scientific) and DNA was eluted in Eppendorf DNA LoBind® Tubes (Eppendorf).

NGS data processing for the detection of 5mC

Sequencing data quality control, processing and error calculation were executed using the KNIME Analytics Platform software. First, raw sequence and quality values were extracted from the FASTQ file format and Phred quality scores (Q scores) were transformed into base calling error probabilities (P) by using:

$$P = 10^{-Q/10}$$

The data were pre-processed by defining the UMI sequence context and translating the Read 1 sequence and P values into the reverse complement orientation. Read 1 and Read 2 were aligned to give the expected size of the template and merged into one sequence; within the segment where Read 1 and Read 2 overlap, the base call with the lower error probability was retained.
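The quality-score conversion and read merging described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the KNIME workflow used for the application: the function names are hypothetical, the FASTQ quality string is assumed to use the Phred+33 encoding, and the two reads passed to `merge_overlap` are assumed to be already aligned, equally long and in the same orientation (i.e. only the overlay segment is handled).

```python
def phred_to_error_probability(q):
    """Convert a Phred quality score Q into a base calling error probability P."""
    return 10 ** (-q / 10)

def decode_quality_string(quality, offset=33):
    """Decode a FASTQ quality string (Phred+33 assumed) into error probabilities."""
    return [phred_to_error_probability(ord(ch) - offset) for ch in quality]

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq, probs):
    """Return the reverse complement of seq with the error probabilities reversed to match."""
    return seq.translate(_COMPLEMENT)[::-1], probs[::-1]

def merge_overlap(seq1, probs1, seq2, probs2):
    """Merge two aligned reads, keeping at each position the base call with the lower P."""
    merged_seq = []
    merged_probs = []
    for base1, p1, base2, p2 in zip(seq1, probs1, seq2, probs2):
        if p1 <= p2:
            merged_seq.append(base1)
            merged_probs.append(p1)
        else:
            merged_seq.append(base2)
            merged_probs.append(p2)
    return "".join(merged_seq), merged_probs
```

Ties are resolved in favour of the first read here; the text does not specify how equal error probabilities were handled.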
High-quality data were obtained by removing reads containing a base within the UMI contexts with a P value above the threshold, and by removing merged reads whose mean error probability over all bases in the sequence context exceeded the threshold value. For each filtered read, a deletion and insertion correction was performed to adjust frameshifts by employing the Levenshtein distance between the read and the reference template. Additionally, N base calls were replaced by reference bases to prevent false positive error detection. Next, reads were aligned to the reference template sequence and reads with a misalignment higher than 6% or 12% were removed from the data set. For error calculation, reads were sorted into UMI family groups with identical UMIs and the analysis proceeded with UMI families containing a minimum of three reads. First, error calculation of UMI families was performed by averaging the error at each sequence position over all reads in one UMI family. By employing an error cut-off of 0.9, true KTq derived errors were then set to 1 and errors below the cut-off were set to 0 for each UMI family. The mean error was calculated over all UMI families at each sequence position, yielding the KTq based error rate. 5mC detection was facilitated by comparing the error rates of the unmodified C template data set with the error rates of the 5mC template sequencing data. Calculating the error difference with:

$$\Delta\,\text{error rate} = \text{error rate}_{\text{5mC template}} - \text{error rate}_{\text{C template}}$$

results in an increased Δ error rate at 5mC positions. Base calls were analysed with sequencing data that was grouped into UMI families but not yet further processed by the error cut-off. The coverage and UMI family number used for error calculation of each NGS library are listed in Table 5. P-values for statistical analysis in Figure 17 were determined using the Wilcoxon-Mann-Whitney (WMW) test (GraphPad Prism Version 6.00).
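The UMI-based error calculation described above can be sketched in Python as follows. This is a simplified illustration rather than the KNIME workflow itself: the function names are hypothetical, reads are assumed to be already filtered, indel-corrected and of the same length as the reference, and a per-family error fraction equal to the cut-off is counted as an error (the text does not say whether the cut-off is inclusive).

```python
from collections import defaultdict

def family_error_calls(reads, reference, min_family_size=3, cutoff=0.9):
    """Group (umi, sequence) pairs into UMI families and call errors per family.

    For each family with at least min_family_size reads, the fraction of reads
    that mismatch the reference is computed per position; fractions at or above
    the cut-off are set to 1 (polymerase-derived error), all others to 0.
    """
    families = defaultdict(list)
    for umi, seq in reads:
        families[umi].append(seq)

    calls = []
    for seqs in families.values():
        if len(seqs) < min_family_size:
            continue
        vector = []
        for pos, ref_base in enumerate(reference):
            mismatch_fraction = sum(s[pos] != ref_base for s in seqs) / len(seqs)
            vector.append(1 if mismatch_fraction >= cutoff else 0)
        calls.append(vector)
    return calls

def error_rate_per_position(calls):
    """Mean of the 0/1 family calls at each sequence position."""
    n = len(calls)
    return [sum(column) / n for column in zip(*calls)]

def delta_error_rate(rate_5mc_template, rate_c_template):
    """Position-wise difference: error rate (5mC template) minus error rate (C template)."""
    return [m - c for m, c in zip(rate_5mc_template, rate_c_template)]
```

Applying the cut-off per family means that an error seen in only a minority of the reads sharing a UMI, as expected for errors introduced during library amplification or sequencing, does not contribute to the polymerase error rate.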
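Table 5: The coverage and UMI family number used for error calculation of NGS libraries.

Bisulfite conversion of template DNA

| Figure | Illumina System | Template | Coverage | UMI families |
| --- | --- | --- | --- | --- |
| Fig. 2 | MiSeq | 803 bp C template | 1173293 | 189832 |
| Fig. 2 | MiSeq | 803 bp 5mC template | 1091292 | 180331 |
| Fig. 2 | NextSeq 2000 | gDNA native | 2983694 | 155223 |
| Fig. 2 | NextSeq 2000 | gDNA mCpG | 4191504 | 89429 |

Template DNA generated by PCR

| Figure | Illumina System | KTq variant | dNTP concentration | Template | Coverage | UMI families |
| --- | --- | --- | --- | --- | --- | --- |
| Figs. 3, 4, 15 | NextSeq 2000 | KTq wild-type | 2 µM dGTP & 200 µM d(A/T/C)TP | 803 bp C template | 1159368 | 137314 |
| Figs. 3, 4, 15 | NextSeq 2000 | KTq wild-type | 2 µM dGTP & 200 µM d(A/T/C)TP | 803 bp 5mC template | 1113872 | 158298 |
| Figs. 3, 4, 15 | NextSeq 2000 | RIII H20 | 2 µM dGTP & 200 µM d(A/T/C)TP | 803 bp C template | 1233697 | 139850 |
| Figs. 3, 4, 15 | NextSeq 2000 | RIII H20 | 2 µM dGTP & 200 µM d(A/T/C)TP | 803 bp 5mC template | 662646 | 21464 |
| Figs. 3, 4, 15 | NextSeq 2000 | RIV A8 | 2 µM dGTP & 200 µM d(A/T/C)TP | 803 bp C template | 762172 | 103562 |
| Figs. 3, 4, 15 | NextSeq 2000 | RIV A8 | 2 µM dGTP & 200 µM d(A/T/C)TP | 803 bp 5mC template | 858997 | 138209 |
| Figs. 3, 4, 15 | NextSeq 2000 | RIV D15 | 2 µM dGTP & 200 µM d(A/T/C)TP | 803 bp C template | 911981 | 129173 |
| Figs. 3, 4, 15 | NextSeq 2000 | RIV D15 | 2 µM dGTP & 200 µM d(A/T/C)TP | 803 bp 5mC template | 1437209 | 143204 |
| Fig. 16 | MiSeq | RIV A8 | 2 µM dGTP & 200 µM d(A/T/C)TP | 803 bp C template | 1993496 | 72857 |
| Fig. 16 | MiSeq | RIV A8 | 2 µM dGTP & 200 µM d(A/T/C)TP | 803 bp 5mC template | 2008642 | 75481 |

Human genomic DNA

| Figure | Illumina System | KTq variant | dNTP concentration | Template | Coverage | UMI families |
| --- | --- | --- | --- | --- | --- | --- |
| Fig. 19 | NextSeq 2000 | RIV A8 | 10 µM dGTP & 200 µM d(A/T/C)TP | 803 bp C template | 1896898 | 150058 |
| Fig. 19 | NextSeq 2000 | RIV A8 | 10 µM dGTP & 200 µM d(A/T/C)TP | 803 bp 5mC template | 1805204 | 206996 |
| Fig. 19 | NextSeq 2000 | RIV A8 | 10 µM dGTP & 200 µM d(A/T/C)TP | gDNA native | 1375518 | 78540 |
| Fig. 19 | NextSeq 2000 | RIV A8 | 10 µM dGTP & 200 µM d(A/T/C)TP | gDNA mCpG | 1668979 | 155454 |

Error rate processing for the detection of 5mC in human genomic DNA

Error rates, calculated from sequencing data of human genomic DNA based NGS libraries, were processed further for 5mC detection. Errors were standardised using an adapted z-score in which the mean and the standard deviation are calculated only from error rates at C positions; values from CpG sites (where increased misincorporation would arise) were excluded. Error rates were standardised using:

$$z\text{-score} = \frac{\text{error rate} - \overline{\text{error rate}}_{\text{C positions}}}{\mathrm{SD}\left(\text{error rate}_{\text{C positions}}\right)}$$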
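The adapted z-score defined above can be sketched in Python as follows. The function name and argument layout are hypothetical, the CpG cytosine positions are assumed to be supplied by the caller as zero-based indices, and the sample standard deviation (`statistics.stdev`) is an assumption, since the text does not state whether the sample or the population form was used.

```python
from statistics import mean, stdev

def adapted_z_scores(error_rates, reference, cpg_c_positions):
    """Standardise per-position error rates against non-CpG C positions.

    The mean and standard deviation are computed only from error rates at
    C positions of the reference that are not listed in cpg_c_positions;
    every position is then expressed in units of that standard deviation.
    """
    background = [
        rate
        for pos, (rate, base) in enumerate(zip(error_rates, reference))
        if base == "C" and pos not in cpg_c_positions
    ]
    mu = mean(background)
    sigma = stdev(background)  # sample SD; requires at least two background positions
    return [(rate - mu) / sigma for rate in error_rates]
```

Because CpG sites are left out of the background, a methylation-dependent increase in error rate does not inflate the mean or standard deviation it is later compared against.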
With this, the z-score gave the number of standard deviations by which each error lay above or below the mean error rate opposite unmodified C bases. 5mC detection was thus independent of absolute error rates and was carried out by comparing the z-score of data from the unmodified C template with the z-score of data from the modified template. Calculating the z-score difference with:

$$\Delta\,z\text{-score} = z\text{-score}_{\text{5mC template}} - z\text{-score}_{\text{C template}}$$
results in an increased Δ z-score at 5mC positions. P-values for statistical analysis in Figure 20 were determined using the Wilcoxon-Mann-Whitney (WMW) test (GraphPad Prism Version 6.00).

EXAMPLE 1: Screening for DNA polymerase variants with altered fidelity opposite 5mC

First, a screening-based engineering approach was developed to discover a DNA polymerase variant with increased misincorporation opposite methylated bases. Recent studies showed that the KTq already discriminates 5mC while processing 3'-modified or 3'-mismatched primer strands. These results, together with the fact that the KTq lacks a proofreading functionality due to the structural origin of the enzyme, rendered this DNA polymerase a promising starting point for the evolution of altered fidelity characteristics opposite 5mC. This was considered to be highly challenging, since the methyl group is oriented away from the Watson-Crick site and instead positioned in the major groove, where DNA polymerases accept very bulky modifications, up to several orders of magnitude larger than the natural substrate. The DNA polymerase libraries applied for screening included single amino acid substitutions with a broad spectrum of target mutation sites. The mutated residues were chosen based on their close proximity to the nascent base pair in a ternary complex crystal structure, their previously published influence on fidelity and their evolutionary conservation in family A DNA polymerases. In addition, functionally promising mutation sites were rationally combined to generate double mutation variants, resulting in a total of 970 focused KTq variants. Furthermore, over 2100 KTq variants containing multiple mutations were tested; these were generated by combinatorial shuffling of active mutants using the RACHITT method followed by preselection for PCR activity. The libraries were expressed in E. coli and the DNA polymerase variants were directly evaluated from cell lysates in single-nucleotide incorporation experiments.
The screening reactions were performed in parallel with oligonucleotide templates of identical sequence carrying either C or 5mC at the site of first incorporation. The 5'-labelling of the primers with two different fluorescence dyes, FAM and HEX, and the varying 5'-overhangs enabled a pooling of primer extension reactions for the multiplexed analysis of several KTq variants using CE (Fig. 5A). As substrates for single-nucleotide incorporation, either the complementary dGTP (match) or the non-complementary dATP (mismatch) was applied. dATP was selected to identify KTq variants with increased misincorporation activity, because the KTq wild-type enzyme misincorporated dAMP with a higher efficiency than the other mismatching nucleotides opposite C and 5mC (Fig. 6). This means that a total of four primer extension reactions per KTq variant (C or 5mC template, dGMP or dAMP incorporation) were performed to evaluate the corresponding incorporation characteristics. To identify a DNA polymerase with an increased error rate opposite 5mC, KTq variants were screened for 5mC discrimination by reduced dGMP (match) incorporation opposite the modified template base. Guided by previous studies on DNA polymerase fidelity, it was reasoned that this decreased efficiency of the KTq variants to incorporate the matching nucleotide opposite 5mC would increase the ratio of mismatch nucleotide incorporation instead. To promote this misincorporation, promising KTq variants were additionally screened for a low to moderately increased dAMP misincorporation, but without discrimination, opposite C and 5mC (Fig. 5B). This should result in a catalytically active DNA polymerase variant which generates mutation signatures opposite 5mC but processes canonical nucleotides without increased error rates. Discrimination and misincorporation efficiencies of the KTq variants were evaluated in comparison to the incorporation characteristics of the KTq wild-type enzyme.
In the first three screening rounds, all KTq variants showing 5mC discrimination and/or increased misincorporation were selected. 5mC discrimination was considered high if dGMP was incorporated at least 1.5-fold more efficiently opposite C than opposite 5mC. Misincorporation efficiency was considered increased as soon as 20% of the primer was extended by dAMP incorporation. The screening revealed that most of the KTq variants featured only one, if any, of the desired characteristics and that none of the screened KTq variants preferred to misincorporate dAMP opposite 5mC. Around 15% of the screened KTq variants combined 5mC discrimination by reduced dGMP incorporation with increased dAMP misincorporation opposite both C and 5mC. However, several KTq variants additionally discriminated 5mC by reduced dAMP incorporation or showed a high misincorporation rate combined with reduced activity to incorporate dGMP opposite C. Both characteristics could influence the incorporation fidelity opposite unmodified C, possibly rendering these KTq variants catalytically inefficient and generally unsuitable for 5mC detection. Therefore, a fourth screening round was performed. By selecting only KTq variants that misincorporated dAMP with a comparable efficiency opposite C and 5mC and elongated not more than 50% of the primer by dAMP incorporation, about 3% of the initially applied KTq variants were chosen for a final screening. Since it was observed that the KTq wild-type already discriminates 5mC to some extent, only those KTq variants with the same or higher 5mC discrimination efficiency were selected in the fifth screening round. In addition, KTq variants were considered promising if they incorporated dAMP and dGMP with comparable efficiency opposite 5mC under the applied reaction conditions (70 nM dGTP and 35 µM dATP).
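The selection thresholds named above can be summarised in a small Python sketch. It is only an illustration: all names are hypothetical, incorporation efficiency is assumed to be expressed as the fraction of primer extended (0 to 1) in each of the four screening reactions, and the tolerance used to decide whether dAMP misincorporation opposite C and 5mC is "comparable" is an invented parameter, since the text gives no numerical criterion for it.

```python
from dataclasses import dataclass

@dataclass
class PrimerExtension:
    """Fraction of primer extended (0..1) in the four screening reactions."""
    dgmp_c: float    # dGMP (match) incorporation opposite C
    dgmp_5mc: float  # dGMP (match) incorporation opposite 5mC
    damp_c: float    # dAMP (mismatch) incorporation opposite C
    damp_5mc: float  # dAMP (mismatch) incorporation opposite 5mC

def discrimination(x):
    """Fold preference for dGMP incorporation opposite C over 5mC."""
    if x.dgmp_5mc == 0:
        return float("inf")
    return x.dgmp_c / x.dgmp_5mc

def screening_flags(variant, wild_type, comparable_tolerance=0.25):
    """Evaluate the screening criteria for one variant.

    comparable_tolerance is a hypothetical relative tolerance for calling
    dAMP misincorporation opposite C and 5mC comparable.
    """
    damp_max = max(variant.damp_c, variant.damp_5mc)
    damp_gap = abs(variant.damp_c - variant.damp_5mc)
    return {
        "high_discrimination": discrimination(variant) >= 1.5,
        "increased_misincorporation": damp_max >= 0.20,
        "misincorporation_capped": damp_max <= 0.50,
        "damp_comparable": damp_gap <= comparable_tolerance * damp_max,
        "at_least_wild_type": discrimination(variant) >= discrimination(wild_type),
    }
```

Returning the individual flags rather than a single verdict reflects the staged nature of the screen, in which early rounds accepted variants meeting either criterion and later rounds applied the stricter filters.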
Considering the combination of selected discrimination and misincorporation characteristics with the previously screened features, 12 KTq variants were identified as the most promising hits (Fig. 7 and Fig. 8). All these KTq variants derived from the combinatorial RACHITT mutant library and were named according to their location in the library (Table 4). Sequence analysis revealed that all variants are mutated at residue I614, with 75% of the variants having the I614K mutation and 25% having the I614M mutation. Indeed, the KTq variant RII G7 is a single mutant carrying only the I614K mutation. The other 11 KTq variants are multiple mutation variants with four to nine randomly combined mutation sites and a median of seven mutations. Here, the residues N483, E507, S515, K540, A570 and V586 are mutated in the majority of the multiple mutation variants. The mutation sites D578, N485 and R587 were detected less frequently. The mutations D655G, F697S, M747L, V783G and I823M were each identified in only one KTq variant. Promising KTq enzymes were purified, and primer extension experiments for characterisation were performed. Here, reaction conditions similar to those of the screening experiments were applied, but with differing dNTP concentrations. Analysis of single-nucleotide incorporation reactions verified that each screening hit featured an improved 5mC discrimination by incorporating dGMP with reduced efficiency opposite 5mC (Fig. 9A and Fig. 10). It became clear that some mutants also showed a reduced incorporation of dGMP opposite C, indicating a decreased catalytic efficiency. Based on this, only the variants RII L1, RII G7, RIII H20, RIII J18, RIV A8 and RIV D15, which incorporate dGMP opposite C about as efficiently as the KTq wild-type, were considered for a detailed analysis of the discrimination characteristics.
By estimating to what extent dGMP was more efficiently incorporated opposite C in comparison to 5mC (at 35 nM substrate input), the respective variants could be compared and evaluated. The KTq variants RIII H20 and RIV D15 showed the strongest discrimination against 5mC, with 3.5-fold higher incorporation of dGMP opposite C than opposite 5mC. This was followed by RII L1 (2.5-fold) and RIV A8 (2.1-fold). The variants RIII J18 (1.8-fold) and RII G7 (1.6-fold) still had an increased 5mC discrimination compared with the KTq wild-type (1.3-fold), but were considered less promising with regard to discrimination characteristics. Likewise, the analysis of dAMP incorporation verified the increased misincorporation efficiency of the KTq variants. In particular, the two variants RII G7 and RIV A8 featured an increased dAMP misincorporation, as more than 80% of primer was elongated at 70 µM dATP. The variants RIII H20 and RIII O16 elongated more than 60% of primer under these reaction conditions and the remaining variants elongated between 20% and 40% of primer by dAMP misincorporation. The KTq wild-type enzyme elongated less than 10% of primer when processing 70 µM dATP substrate. Taking the verified incorporation characteristics into account, the KTq variants RIII H20 (high discrimination, moderate misincorporation), RIV A8 (moderate discrimination, high misincorporation) and RIV D15 (high discrimination, low misincorporation) represented the most interesting screening hits.

EXAMPLE 2: Testing the DNA polymerase variants for mismatch extension and DNA synthesis activity

Further primer extension and amplification studies with the purified screening hits were performed to investigate how efficiently the KTq variants process incorporated mismatches and synthesise DNA while having less dGTP (match nucleotide) available for the amplification reaction (Table 3).
Multiple-nucleotide incorporation experiments were conducted to gain insight into the elongation capability of the DNA polymerases by using reaction conditions similar to those of the screening and adding dCTP as the second nucleotide for incorporation. First, the efficiency of the DNA polymerases to process correctly incorporated nucleotides was evaluated by studying the primer elongation after dGMP incorporation. Here, the KTq variants RII B22, RII G7, RII L1, RII O16, RIII J18 and RIV D15 showed an extension efficiency comparable to the KTq wild-type. Around 90% full-length product formation was found after incorporation opposite C and between 30% and 90% full-length product formation after incorporation opposite 5mC (Fig. 9B and Fig. 11A). The mutants RI A9, RI A12, RIII H20, RIII N21, RIV A8 and RVI G16 extended up to 30% of primer to the full-length product after dGMP incorporation opposite C and at most 20% of primer after dGMP incorporation opposite 5mC (the RVI G16 variant only incorporated one dCMP nucleotide with high efficiency). Both findings highlight that all KTq variants, including the KTq wild-type enzyme, discriminated against 5mC by reduced elongation after dGMP incorporation opposite the methylated template base. However, the processing and elongation of an incorrectly incorporated nucleotide represents a challenge for DNA polymerases and contributes to overall replication fidelity. Therefore, the focus was placed on selecting DNA polymerase mutants that feature efficient mismatch elongation. Monitoring primer elongation after dAMP misincorporation revealed that only the KTq variants RII G7, RIII H20, RIV A8 and RIV D15 were able to efficiently extend a mismatch (Fig. 9C and Fig. 11B). For processing the C template, a fluorescence intensity peak after the first incorporation indicated that the KTq variants tend to pause after the dAMP-dC mismatch formation. Here, approximately 5% of full-length product was obtained in the presence of the C template.
In comparison, the KTq variants featured a mismatch extension efficiency of 15% to 30% full-length primer elongation after dAMP was incorporated opposite 5mC. This translates into a 5mC discrimination that favours the mismatch elongation opposite methylated bases. In detail, the KTq variant RIV A8 showed a 5.8-fold more efficient elongation after dAMP incorporation opposite 5mC compared to C. This was followed by the mutant RII G7 with a 4.6-fold more efficient mismatch elongation, 4.2-fold for RIII H20 and 3.5-fold for the RIV D15 mutant. The KTq wild-type, however, discriminated against 5mC, processing a mismatch opposite C with 2-fold higher efficiency. Testing the PCR efficiency and robustness of the DNA polymerases confirmed that all KTq variants were PCR active and amplified the correct PCR product (Fig. 12A, 12B). Here, the KTq variants RIII H20, RIV A8 and RIV D15 showed the highest PCR efficiency of the mutants (Fig. 12C). Considering this, only the KTq variants RIII H20, RIV A8 and RIV D15 combined the improved incorporation characteristics, namely 5mC discrimination, increased misincorporation, mismatch extension capability and sufficient activity in DNA synthesis.

EXAMPLE 3: DNA polymerase variants with increased misincorporation opposite 5mC

Next, it was evaluated whether the KTq variants RIII H20, RIV A8 and RIV D15 were able to generate 5mC-dependent mutation signatures in PCR. In order to enhance the 5mC-dependent signatures, linear amplification of an unmodified C and a modified 5mC template was performed in the presence of a reduced concentration of the match nucleotide (Fig. 9 and Fig. 13A). It was assumed that supplying a decreased dGTP concentration of 2 µM, in comparison to 200 µM of d(A/T/C)TP (each), would result in an increased mismatch formation opposite cytosines as less of the matching nucleotide is available.
In this case, it was reasoned that the error formation at methylated sites might be favoured due to the reduced dGMP incorporation efficiency of the KTq variants opposite 5mC. Thus, an unbalanced dNTP pool promotes both dAMP misincorporation and 5mC discrimination, and thereby facilitates specific 5mC sensing. Subsequently, the respective PCR products served as templates for NGS library preparation and were subjected to sequencing. Data analysis and UMI-based error calculation were conducted using a self-scripted KNIME workflow (Fig. 13B). The unmodified template DNA was generated by PCR in which an 803 bp long flanking region of a 109 bp target was amplified. Additionally, a portion of the generated template was methylated using the M.SssI methyltransferase and full methylation of CpG sites C24, C32 and C72 was confirmed by bisulfite sequencing (Fig. 2). For evaluating the error rates, only the 109 bp target region from positions 24 bp to 84 bp was considered, excluding the primer binding sites. Indeed, the KTq RIV A8 variant showed an up to twofold increased error rate at the methylated CpG sites C24, C32 and C72 in the 5mC template compared to processing the C template (Fig. 14A, black arrows). Detailed analysis of the error rates for the amplification from both templates revealed that the DNA polymerase preferentially incorporated mismatches opposite templating C bases, with an average error of 3.9% opposite C compared to 0.25% opposite all non-C bases. As expected, the KTq wild-type enzyme showed a much lower error profile with an average error of 0.12% opposite C bases and 0.02% opposite all non-C bases, rendering the RIV A8 variant more error-prone in general (Fig. 15A). Interestingly, different sequence positions (even non-C bases) showed variable error rates.
It is known that DNA polymerase fidelity heavily depends on the DNA sequence context and secondary structures; consequently, the characteristics of the engineered mismatch formation would be equally affected. Here, it is particularly important that RIV A8 faithfully processed both templates with comparable fidelity at similar positions. Therefore, it is even more striking that a significant error difference was exclusively detected by comparing methylated and unmethylated CpG sites (Fig. 14B, left black arrows). At the methylated CpG sites C24, C32 and C72, the RIV A8 variant featured an average 16.5-fold error increase compared to the KTq wild-type enzyme, which featured a marginally increased error rate opposite 5mC only at positions C24 and C32 (with a mean error difference of 5.93% for RIV A8 and 0.36% for the KTq wild-type) (Fig. 15B). Analysis of the mutation signature verified that an increased dAMP misincorporation (detected as T base calls) led to the enhanced mismatch formation opposite 5mC by RIV A8 (Fig. 14B, right). Furthermore, RIV A8 reproduced this distinctive 5mC-dependent error signature with a similar outcome in a repeat experiment, rendering RIV A8 applicable for the detection of 5mC by increased misincorporation (Fig. 16). The mutants RIII H20 and RIV D15 also featured increased dAMP misincorporation opposite 5mC, although with lower efficiency, resulting in a 10.8-fold increase for the RIII H20 variant (mean error difference of 3.89%) and a 5.4-fold increase for the RIV D15 variant (mean error difference of 1.93%) in comparison to the KTq wild-type (Fig. 3 and Fig. 17).

EXAMPLE 4: KTq variant RIV A8 detects 5mC in human genomic DNA

Finally, the RIV A8 mutant was tested for 5mC detection in human genomic DNA. In addition to the C and 5mC templates generated by PCR, genomic DNA isolated from HeLa cells was used to investigate 5mC sensing in a native methylation pattern (gDNA native).
To provide fully methylated genomic DNA, HeLa genomic DNA was methylated at CpG sites C24, C32 and C72 using the M.SssI methyltransferase (mCpG gDNA). Methylation levels were determined by bisulfite sequencing (Fig. 2). The RIV A8 variant was also able to sense methylated CpG sites in human genomic DNA by generating site-specific 5mC-dependent signatures (Fig. 18, black arrows). Due to the inherently low copy number of genomic DNA, only small amounts of linear PCR product were obtained after the reaction with the KTq variant. Therefore, an exponential amplification by a high-fidelity DNA polymerase followed to generate the required concentration for NGS library preparation. In this step, co-amplification of the template DNA before library preparation led to a mixture of linear PCR product and the starting material. As a consequence, absolute error rates did not represent actual errors derived from RIV A8 (Fig. 19). Based on the previous findings that RIV A8 processed identical positions in each template with similar fidelity and that a significant error difference derived only from methylation (Fig. 17C), the absolute error rates were standardised into customised z-scores (Fig. 4). Using the z-score values, calculation of the differences between the modified 5mC and unmodified C template confirmed the previously observed capability of RIV A8 to sense 5mC, displayed by an 11.37-fold average increase in z-score difference opposite the CpG sites C24, C32 and C72 (Fig. 18A). Impressively, despite the challenging and complex nature of amplification from genomic DNA, RIV A8 detected methylation levels greater than 50% at CpG sites C24 and C32 in the native gDNA, indicated by a 6.06-fold increase of average z-score difference opposite the CpG sites in comparison to C bases (Fig. 18B).
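The standardisation step described above can be sketched as follows. The exact definition of the customised z-scores is given in Fig. 4 and is not reproduced here; this sketch simply assumes an ordinary z-score computed over all evaluated positions of one template, followed by a per-position difference between the 5mC and the C template. All function names are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(error_rates):
    """Standardise per-position error rates of one template.

    Assumption: a plain z-score (value minus mean, divided by the
    population standard deviation) stands in for the customised z-score.
    """
    mu = mean(error_rates)
    sigma = pstdev(error_rates)
    if sigma == 0:
        # All positions behave identically; no position stands out.
        return [0.0 for _ in error_rates]
    return [(rate - mu) / sigma for rate in error_rates]


def z_score_differences(rates_5mc, rates_c):
    """Per-position z-score difference between the 5mC and the C template."""
    return [a - b for a, b in zip(z_scores(rates_5mc), z_scores(rates_c))]
```

Because each template is scaled by its own mean and spread, a constant background contributed by co-amplified starting material shifts all positions of a template alike and largely cancels out in the difference.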
This result was confirmed by reading an increased misincorporation at the CpG sites C24, C32 and C72 in the mCpG gDNA template (10.74-fold increase in average z-score difference opposite CpG sites in comparison to C bases) (Fig. 18C), making RIV A8 suitable for 5mC detection in highly methylated genomic DNA (Fig. 20).

Discussion

Here, the engineering of a thermostable DNA polymerase variant with altered fidelity opposite 5mC is described. The modified base is discriminated from its unmodified form by increased dAMP misincorporation during PCR, and the resulting mutation signatures are directly detected by NGS. This facilitates easy and straightforward 5mC detection without the need for sample conversion prior to usage or extensive data analysis subsequent to sequencing. The DNA polymerase variant was identified by screening a KTq library for altered incorporation characteristics using primer extension reactions. Here, promising DNA polymerase variants were selected for enhanced 5mC discrimination by match nucleotide incorporation and for simultaneously increased misincorporation activity, but without discrimination by mismatch nucleotide incorporation. Further monitoring of mismatch extension and DNA synthesis efficiencies yielded the KTq variants RIII H20, RIV A8 and RIV D15, which were validated for effective 5mC sensing. The DNA polymerase variants showed significantly increased misincorporation opposite 5mC. Unmodified G, A and T bases produced no or only slightly elevated error rates, whereas increased error rates also occurred when processing templating C bases. This strategy was applied for the valid detection of highly methylated CpG sites in human genomic DNA by RIV A8. Analysis of the sequencing data revealed 5mC detection at multiple CpG sites within the naturally occurring sequence context.
Of note, based on the nature of DNA polymerase mismatch processing, in which enhanced mismatch formation leads to less full-length product, and the additional co-amplification of the starting template, only a qualitative 5mC detection could be performed. Interestingly, the identified KTq variants all derive from the same mutant library and feature a relatively high mutational load of seven to eight mutations with similarly mutated residues (Fig. 21). The library was generated by combinatorial shuffling of functional mutations which are located in the proximity of the active site or are evolutionarily conserved in family A DNA polymerases. Indeed, the mutations N483K and A570K can be found in all variants and are located in evolutionarily conserved motifs. N483K is located in Motif 1 and A570K in Motif 2, which are both in contact with the template. Residue I614 is evolutionarily conserved in Motif A and directly located at the active site, forming a part of the hydrophobic binding pocket for the incoming nucleotide. Hydrophilic substitution to I614K is known to decrease the DNA polymerase fidelity as well as enhance mismatch extension capability. The I614M mutation contributes to an increased ribonucleotide incorporation activity. The mutations K540N and V586G are also present in all three KTq variants, and both residues make direct contact with the primer. Here, the hydrophilic mutation K540R, together with other mutations, is involved in an increased DNA-binding affinity and contributes to enhanced resistance to inhibitors such as heparin. Mutation of the negatively charged residue E507 into the positively charged lysine (E507K) led to an improved resistance to several inhibitors and, in combination with other mutations, displayed even more enhanced heparin resistance. Furthermore, the single mutation variant E507K exhibits increased activity and stability during PCR by a strong interaction with the primed template DNA.
Residue S515 contacts the primer strand, and mutation S515R was found in a reverse-transcription-active KTq variant (RT-KTq, L459M S515R I638F M747K). The mutation F697S shows no direct interaction with the nascent base pair and probably acts through interactions with the dNTP binding residues. Notably, F697 was not a target for mutagenesis; the mutation was introduced during amplification of the mutant gene. The mechanism by which the KTq variants discriminate 5mC and exhibit increased misincorporation is currently unclear. However, because no single or rationally designed variant comprised the required characteristics, it can be speculated that the individual effects of the distal mutations contribute to a synergistic alteration of the DNA polymerase's fidelity. Here, crucial features that contribute to the replication fidelity, such as reliable 5mC processing, high substrate specificity and mismatch discrimination, had to be overcome to engineer a DNA polymerase that efficiently detects 5mC and still retains catalytic activity. It is conceivable that the mutations act on different mechanistic levels. Substitutions at the active site could promote misincorporation by maintaining a proper alignment of the residues upon encountering mismatches and thus facilitating incorporation of incorrect nucleotides. Furthermore, mutated residues contacting the primer and/or template strand could stabilise transition states and therefore enhance DNA binding, promote mismatch elongation and improve DNA synthesis in general. Moreover, mutation sites in close proximity to the nascent base pair could discriminate incorporation and elongation opposite methylated template bases, resulting in a reduced catalytic efficiency for dGMP incorporation opposite 5mC and thus an enhanced tolerance for mismatch formation at this site. Therefore, 5mC sensing relies more on a generally increased misincorporation rate and specificity against 5mC, but less on targeted misincorporation opposite methylated template sites.
Intriguingly, previously identified KTq variants from the same library feature mutational patterns very similar to those of the mutants described herein, namely Mut_ADL (N483K E507K S515N K540G A570E D578G V586G I614M) and Mut_RT (N483K E507K K540Y V586G I614K). Although these enzymes derive from a single library, they comprise highly diverse functional scopes. Mut_ADL was evolved to exclusively extend from matched primer strands, which would have been rather disadvantageous in this study. Mut_RT efficiently reverse transcribes from RNA substrate templates. Furthermore, by applying a similar evolution approach, the mutant RT-KTq I614Y was identified, which is capable of discriminating modified from unmodified RNA bases during reverse transcription. This suggests that changes in these identified key residues affect fundamental properties of the polymerase and that the interplay of distant mutations leads to a variety of incorporation characteristics. Consequently, the mutation sites are targets for further optimisation of the evolved 5mC sensing capability. Of note, the DNA polymerases in Sequence Family A, to which Taq DNA polymerase belongs, bear highly conserved motifs. This also holds true for many mutation sites discussed herein with respect to Taq DNA polymerase and KlenTaq DNA polymerase. Therefore, it is expected that the properties of these mutated DNA polymerases can also be transferred to other members of this sequence family by implementing the corresponding mutations (Table 1). In this study, the ability of the discovered DNA polymerase to replicate DNA templates modulated by 5mC, combined with the sequencing workflow established here, allows reliable detection of even slight 5mC-induced error differences.
Considering that the methyl group is not involved in Watson-Crick base pairing, even the subtle 5mC-dependent mutation signatures shown here highlight DNA polymerase engineering as a powerful tool to overcome given fidelity characteristics and obtain enzymes with desired properties for future applications.
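The transfer of the Taq mutations to other Sequence Family A polymerases discussed above can be illustrated with a small lookup. The residue correspondences below are taken from the mutation lists in claims 1 and 12 for Tth DNA polymerase and E. coli DNA polymerase I, where all seven sites map one to one; the data layout and the function name `transfer_mutation` are hypothetical and serve only as a sketch.

```python
import re

# Wild-type residue and position of the seven claimed Taq sites (claim 1).
TAQ_SITES = ["N483", "E507", "S515", "K540", "A570", "V586", "I614"]

# Corresponding sites in two other family A polymerases (claim 12 (i) and (ii)).
CORRESPONDING_SITES = {
    "Tth": ["N485", "Q509", "S517", "K542", "A572", "V588", "I616"],
    "E. coli Pol I": ["N579", "P603", "S610", "K635", "A665", "V681", "I709"],
}


def transfer_mutation(taq_mutation, target):
    """Translate a Taq mutation such as 'E507K' to the corresponding target site."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", taq_mutation)
    if match is None:
        raise ValueError(f"not a point mutation: {taq_mutation!r}")
    site = match.group(1) + match.group(2)
    if site not in TAQ_SITES:
        raise KeyError(f"{site} is not one of the seven claimed Taq sites")
    target_site = CORRESPONDING_SITES[target][TAQ_SITES.index(site)]
    # Keep the substituted amino acid, swap in the target's residue and position.
    return target_site + match.group(3)


print(transfer_mutation("E507K", "Tth"))            # Q509K
print(transfer_mutation("I614M", "E. coli Pol I"))  # I709M
```

A positional lookup is used here instead of a sequence alignment because the claims already state the corresponding residues; for polymerases not listed there, the correspondence would have to be derived from an alignment.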

Claims

1. A DNA polymerase derived from wild-type Thermus aquaticus (Taq) DNA polymerase, comprising the mutations N483K, E507K/A/R, S515K/N, K540N, A570K, V586G, and I614M/K with regard to the amino acid sequence of wild-type Taq DNA polymerase (SEQ ID NO: 1).

2. The DNA polymerase of claim 1 comprising an amino acid sequence at least 90% identical to SEQ ID NO: 1 including said mutations.

3. The DNA polymerase of claim 1 or claim 2 comprising an amino acid sequence corresponding to and being at least 90% identical to (i) amino acids 293 to 832 of SEQ ID NO: 1 including said mutations, (ii) amino acids 4 to 832 of SEQ ID NO: 1 including said mutations, (iii) amino acids 279 to 832 of SEQ ID NO: 1 including said mutations, or (iv) amino acids 290 to 832 of SEQ ID NO: 1 including said mutations.

4. The DNA polymerase of claim 3 comprising the amino acid sequence corresponding to amino acids 293 to 832 of SEQ ID NO: 1 including said mutations.

5. The DNA polymerase of claim 4 comprising the amino acid sequence as shown in SEQ ID NO: 2 including said mutations.

6. The DNA polymerase of any one of claims 1 to 5 further comprising the mutation F697S with regard to SEQ ID NO: 1.

7. The DNA polymerase of any one of claims 1 to 6 comprising the mutations N483K, E507A, S515K, K540N, A570K, V586G, I614M, and F697S with regard to SEQ ID NO: 1.

8. The DNA polymerase of any one of claims 1 to 5 comprising the mutations N483K, E507R, S515K, K540N, A570K, V586G, and I614K with regard to SEQ ID NO: 1.

9. The DNA polymerase of any one of claims 1 to 5 comprising the mutations N483K, E507K, S515N, K540N, A570K, V586G, and I614K with regard to SEQ ID NO: 1.

10. The DNA polymerase of any one of claims 1 to 5 comprising the amino acid sequence as shown in one of SEQ ID NOs: 3 to 5.

11. The DNA polymerase of any one of claims 1 to 10, wherein the DNA polymerase is thermostable.

12. A DNA polymerase selected from the following DNA polymerases (i) to (vi):
  (i) a DNA polymerase derived from wild-type Thermus thermophilus (Tth) DNA polymerase, comprising the mutations N485K, Q509K/A/R, S517K/N, K542N, A572K, V588G, and I616M/K with regard to the amino acid sequence of wild-type Tth DNA polymerase (SEQ ID NO: 42);
  (ii) a DNA polymerase derived from wild-type E. coli DNA polymerase I, comprising the mutations N579K, P603K/A/R, S610K/N, K635N, A665K, V681G, and I709M/K with regard to the amino acid sequence of wild-type E. coli DNA polymerase I (SEQ ID NO: 43);
  (iii) a DNA polymerase derived from wild-type E. coli phage T7 DNA polymerase, comprising the mutations N335K, one of the mutations T357K/A/R and V368K/A/R, one of the mutations D365K/N and D376K/N, one of the mutations K394N and K404N, V426K, V443G, and L479M/K with regard to the amino acid sequence of wild-type E. coli phage T7 DNA polymerase I (SEQ ID NO: 44);
  (iv) a DNA polymerase derived from wild-type Bacillus stearothermophilus (Bst) DNA polymerase, comprising the mutations N527K, S557K/N, K582N, Q612K, I628G, and I657M/K, and optionally the mutation K551A/R, with regard to the amino acid sequence of wild-type Bacillus stearothermophilus DNA polymerase (SEQ ID NO: 45);
  (v) a DNA polymerase derived from wild-type Bacillus subtilis (Bsu) DNA polymerase, comprising the mutations N531K, S561K/N, K586N, Q616K, I632G, and I661M/K, and optionally the mutation K555A/R, with regard to the amino acid sequence of wild-type Bacillus subtilis DNA polymerase (SEQ ID NO: 46);
  (vi) a DNA polymerase derived from wild-type Bacillus phage SP01 DNA polymerase, comprising the mutations N502K, D526K/A/R, H558N, V587K, V605G, and L639M/K, and optionally the mutation N533K, with regard to the amino acid sequence of wild-type Bacillus phage SP01 DNA polymerase (SEQ ID NO: 46).

13. The DNA polymerase of claim 11 comprising an amino acid sequence at least 90% identical to
  (i) Thermus thermophilus (Tth) DNA polymerase of SEQ ID NO: 42 and further comprising the mutations N485K, Q509K/A/R, S517K/N, K542N, A572K, V588G, and I616M/K with regard to the amino acid sequence of wild-type Tth DNA polymerase (SEQ ID NO: 42);
  (ii) a DNA polymerase derived from wild-type E. coli DNA polymerase I of SEQ ID NO: 43 and further comprising the mutations N579K, P603K/A/R, S610K/N, K635N, A665K, V681G, and I709M/K with regard to the amino acid sequence of wild-type E. coli DNA polymerase I (SEQ ID NO: 43);
  (iii) a DNA polymerase derived from wild-type E. coli phage T7 DNA polymerase of SEQ ID NO: 44 and further comprising the mutations N335K, one of the mutations T357K/A/R and V368K/A/R, one of the mutations D365K/N and D376K/N, one of the mutations K394N and K404N, V426K, V443G, and L479M/K with regard to the amino acid sequence of wild-type E. coli phage T7 DNA polymerase I (SEQ ID NO: 44);
  (iv) a DNA polymerase derived from wild-type Bacillus stearothermophilus (Bst) DNA polymerase of SEQ ID NO: 45 and further comprising the mutations N527K, S557K/N, K582N, Q612K, I628G, and I657M/K, and optionally the mutation K551A/R, with regard to the amino acid sequence of wild-type Bacillus stearothermophilus DNA polymerase (SEQ ID NO: 45);
  (v) a DNA polymerase derived from wild-type Bacillus subtilis (Bsu) DNA polymerase of SEQ ID NO: 46 and further comprising the mutations N531K, S561K/N, K586N, Q616K, I632G, and I661M/K, and optionally the mutation K555A/R, with regard to the amino acid sequence of wild-type Bacillus subtilis DNA polymerase (SEQ ID NO: 46);
  (vi) a DNA polymerase derived from wild-type Bacillus phage SP01 DNA polymerase of SEQ ID NO: 46 and further comprising the mutations N502K, D526K/A/R, H558N, V587K, V605G, and L639M/K, and optionally the mutation N533K, with regard to the amino acid sequence of wild-type Bacillus phage SP01 DNA polymerase (SEQ ID NO: 46).

14. A nucleic acid comprising a nucleotide sequence coding for a DNA polymerase according to any one of claims 1 to 13.

15. A vector comprising the nucleic acid of claim 14.

16. A host cell comprising the vector of claim 15 or the nucleic acid of claim 14.

17. A method for the detection of 5-methylcytosine nucleotides (5mC) in a DNA molecule of interest, comprising the steps of:
  (a) amplifying a first aliquot of the DNA molecule of interest in a polymerase chain reaction (PCR), said PCR using a thermostable DNA polymerase having altered fidelity opposite 5mC leading to increased nucleotide misincorporation opposite the 5mC nucleotide during PCR,
  (b) sequencing the amplified PCR product obtained in step (a) to generate a test sequence,
  (c) comparing the test sequence obtained in step (b) to a reference sequence, wherein said reference sequence is obtained by way of
    (i) amplifying a second aliquot of the DNA molecule of interest in a PCR, said PCR using a High-Fidelity DNA polymerase, thereby generating an unmodified reference template,
    (ii) amplifying the reference template obtained in step (c)(i) in a PCR, said PCR using a DNA polymerase of the present invention, and
    (iii) sequencing the amplified PCR product obtained in step (c)(ii) to generate the reference sequence, and
  (d) identifying mismatches in the test sequence as compared to the reference sequence at positions in which the reference sequence shows a C, and the test sequence shows a T at the same positions,
  wherein a mismatch identified in step (d) indicates the presence of a 5-methylcytosine at the corresponding positions in the DNA molecule of interest.

18. The method of claim 17, wherein the thermostable DNA polymerase is a DNA polymerase of any one of claims 1 to 13.

19. The method of any one of claims 17 to 18, wherein sequencing in step (b) and/or step (c) comprises Next Generation Sequencing (NGS).

20. The method of any one of claims 17 to 19, wherein the identification of mismatches in step (d) comprises determining a relative error rate in the test sequence as compared to the reference sequence at positions in which the reference sequence shows a C.

21. The method of any one of claims 17 to 20, wherein the DNA molecule of interest does not require any chemical and/or enzymatic pre-treatment prior to step (a).

22. A kit comprising at least one container providing the DNA-polymerase of any one of claims 1 to 13.

23. The kit of claim 22, further comprising one or more additional containers selected from the group consisting of:
  (a) a container providing a primer hybridizable, under primer extension conditions, to a predetermined polynucleotide template;
  (b) a container providing dNTPs; and
  (c) a container providing a buffer suitable for primer extension.
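The comparison of test and reference sequencing data in claims 17 and 20 can be sketched in Python as follows. The sketch assumes gap-free aligned reads of equal length, reports 0-based positions, and uses a hypothetical fold threshold `min_fold`; none of these details are prescribed by the claims, and the function names are illustrative only.

```python
def c_to_t_fraction(reference, reads):
    """For each C in the reference, return the fraction of reads calling T there."""
    fractions = {}
    for i, base in enumerate(reference):
        if base != "C":
            continue
        calls = [read[i] for read in reads]
        fractions[i] = calls.count("T") / len(calls) if calls else 0.0
    return fractions


def candidate_5mc_positions(reference, test_reads, reference_reads, min_fold=2.0):
    """Flag C positions whose C-to-T error rate is elevated in the test reads.

    Assumption: a position is flagged when its C-to-T fraction in the test
    reads is at least `min_fold` times the fraction in the reference reads.
    """
    test = c_to_t_fraction(reference, test_reads)
    ref = c_to_t_fraction(reference, reference_reads)
    hits = []
    for position, test_rate in test.items():
        ref_rate = ref[position]
        if test_rate > 0 and (ref_rate == 0 or test_rate / ref_rate >= min_fold):
            hits.append(position)
    return hits
```

Using a relative rather than an absolute error rate reflects the observation in the description that the variant polymerase is error-prone opposite every templating C, so only the difference between the test and the reference amplification is informative.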
PCT/EP2024/085898 | 2023-12-13 | 2024-12-12 | Dna polymerases for the detection of epigenetic dna marks | Pending | WO2025125413A1 (en)

Applications Claiming Priority (2)

Application Number | Priority Date | Filing Date | Title
EPPCT/EP2023/085548 | 2023-12-13 | |
PCT/EP2023/085548 (WO2025124705A1 (en)) | 2023-12-13 | 2023-12-13 | Dna polymerases for the detection of epigenetic dna marks

Publications (1)

Publication Number | Publication Date
WO2025125413A1 (en) | 2025-06-19

Family

ID=89386305

Family Applications (2)

Application Number | Title | Priority Date | Filing Date
PCT/EP2023/085548 (Pending, WO2025124705A1 (en)) | Dna polymerases for the detection of epigenetic dna marks | 2023-12-13 | 2023-12-13
PCT/EP2024/085898 (Pending, WO2025125413A1 (en)) | Dna polymerases for the detection of epigenetic dna marks | 2023-12-13 | 2024-12-12

Family Applications Before (1)

Application Number | Title | Priority Date | Filing Date
PCT/EP2023/085548 (Pending, WO2025124705A1 (en)) | Dna polymerases for the detection of epigenetic dna marks | 2023-12-13 | 2023-12-13

Country Status (1)

Country | Link
WO (2) | WO2025124705A1 (en)


Also Published As

Publication number | Publication date
WO2025124705A1 (en) | 2025-06-19


Legal Events

Date | Code | Title | Description
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 24828056; Country of ref document: EP; Kind code of ref document: A1

