CN115093482B - High-precision adenine base editor and application thereof - Google Patents

High-precision adenine base editor and application thereof

Info

Publication number
CN115093482B
Authority
CN
China
Prior art keywords
leu
lys
glu
asp
ser
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210538473.1A
Other languages
Chinese (zh)
Other versions
CN115093482A (en)
Inventor
欧阳红生
袁泓明
王子茹
逄大欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Jitang Biotechnology Research Institute Co ltd
Original Assignee
Chongqing Jitang Biotechnology Research Institute Co ltd
Application filed by Chongqing Jitang Biotechnology Research Institute Co ltd
Priority to CN202210538473.1A
Publication of CN115093482A
Application granted
Publication of CN115093482B

Abstract

The present invention provides a high-precision ABEs base editor and its application. The fusion protein comprises, from the N-terminus to the C-terminus, a first region and a second region. The first region comprises ABEs, which comprise an adenine deaminase (or an enzymatically active component thereof) and nCas9; the second region comprises the e18 protein. The fusion protein fuses the e18 protein to the ABEs base editing system, and the ABEs base editor uses the e18 expressed in fusion with it to accelerate degradation of its own nCas9-TadA, which shortens its half-life and thereby increases precision. Compared with DNA base editing RNP complexes, it requires no in vitro synthesis of RNP and is cheaper, simpler to operate and easier to store.

Description

High-precision adenine base editor and application thereof
Technical Field
The invention belongs to the technical field of biology, and particularly relates to a high-precision adenine base editor and application thereof.
Background
The single base editor is a gene editing system formed by fusion expression of a Cas9 protein, a deaminase (mainly adenine deaminase or cytosine deaminase) and other proteins. The system can convert one base pair into another precisely and irreversibly, without introducing DNA double-strand breaks or an exogenous repair template. Single base editors currently include mainly adenine base editors (ABEs), cytosine base editors (CBEs) and guanine base editors (GCBEs), as well as DdCBEs and TALEDs, which can achieve precise editing of the mitochondrial genome.
The adenine base editor (ABE) is a fusion protein composed of nCas9 (D10A) and an artificially modified adenine deaminase. Guided by an sgRNA, the protein specifically recognizes and binds a target sequence and deaminates adenine (A) to form inosine (I); I is then read as guanine (G) during DNA repair and replication, so that an A-to-G conversion is achieved. Because base editing is efficient and simple to perform, various ABE versions (mainly ABE7.10, ABEmax, ABE8e and the like) have been widely used, which has greatly promoted the life sciences (for example gene therapy, human disease models, biomedicine and research on disease mechanisms). However, as research has deepened, researchers have found that ABEs, CBEs and the like suffer from problems such as an excessively large editing window and genome-wide off-target effects (including Cas9-dependent and Cas9-independent off-target effects).
Developing a base editor with higher precision is therefore of great importance for applications in the life sciences. Studies show that a DNA base editing ribonucleoprotein (RNP) complex, composed of an sgRNA and an ABEs fusion protein, is rapidly degraded by proteases in cells, which markedly improves the purity of the editing products and reduces off-target effects. In practice, however, RNPs are costly to synthesize and difficult to store, which greatly limits their application. The ubiquitin-proteasome system (UPS), the primary pathway of protein degradation, is the most important mechanism for controlling protein levels. Ubiquitination involves three main steps: activation, conjugation and ligation. Ubiquitin is first activated by an E1 ubiquitin-activating enzyme, then transferred to an E2 ubiquitin-conjugating enzyme, and finally attached selectively by an E3 ligase to lysine, serine, threonine or cysteine residues of the target protein. The E3 ligase can bind the substrate directly and determines the specificity of the ubiquitin system. Rad18 is a RING-type E3 ubiquitin ligase that plays a critical role in DNA damage repair; the e18 protein is a Rad18 variant from which the SAP domain has been removed.
Disclosure of Invention
The invention aims to construct a high-precision ABEs base editor by fusing the e18 protein to an ABEs base editing system. The ABEs base editor uses the e18 expressed in fusion with it to accelerate degradation of its own nCas9-TadA, which shortens its half-life and thereby increases precision.
The aim of the invention is realized by the following technical scheme:
A fusion protein comprising, from the N-terminus to the C-terminus, a first region and a second region, the first region comprising ABEs, the ABEs comprising an adenine deaminase or an enzymatically active component thereof and nCas9, and the second region comprising an e18 protein, the fusion protein optionally comprising one or more linker amino acid sequences located within the first region and between the first and second regions of the fusion protein.
As a preferred technical scheme of the invention, the ABEs is one of the ABE7.10, ABEmax and ABE8e base editors.
As a preferred embodiment of the present invention, the fusion protein further comprises a nuclear localization signal fragment.
In a preferred technical scheme of the invention, the amino acid sequence of the fusion protein is shown as SEQ ID NO. 40 when the ABEs is ABEmax, and as SEQ ID NO. 41 when the ABEs is ABE8e.
It is a further object of the present invention to provide a polynucleotide encoding the fusion protein described above. The polynucleotide sequence comprises a first region encoding the ABEs and a second region encoding the e18 protein, and optionally comprises sequences encoding one or more linker amino acid sequences located within the first region and between the first and second regions of the fusion protein.
When the ABEs is ABEmax, the polynucleotide sequence is constructed as follows: the ABEmax gene sequence before modification is shown as SEQ ID NO. 2; the upstream primer F1 for amplifying target e18 gene fragment 1 is shown as SEQ ID NO. 3 and the downstream primer R1 as SEQ ID NO. 4; the upstream primer F2 for amplifying target e18 gene fragment 2 is shown as SEQ ID NO. 5 and the downstream primer R2 as SEQ ID NO. 6; the upstream primer F3 for amplifying the e18 fragment carrying the restriction sites is shown as SEQ ID NO. 7 and the downstream primer R3 as SEQ ID NO. 8; the e18 gene sequence with double restriction sites is shown as SEQ ID NO. 9; and the sequence obtained by ligating the digested fragment of SEQ ID NO. 9 to the digested fragment of SEQ ID NO. 2 is the polynucleotide sequence shown as SEQ ID NO. 1.
In a preferred technical scheme of the invention, when the ABEs is ABE8e, the polynucleotide sequence shown as SEQ ID NO. 10 is constructed as follows: the ABE8e gene sequence before modification is shown as SEQ ID NO. 11; the e18 gene sequence with double restriction sites is shown as SEQ ID NO. 9; and the sequence obtained by ligating the fragment of SEQ ID NO. 9 to the digested fragment of SEQ ID NO. 11 is shown as SEQ ID NO. 10.
It is also an object of the present invention to provide a construct comprising said polynucleotide. The construct may be constructed by inserting the polynucleotide into a suitable expression vector. The expression vector may be, but is not limited to, a pCMV expression vector, a pSV2 expression vector, and the like.
It is a further object of the present invention to provide an expression system comprising said construct, or having said polynucleotide integrated into its genome as an exogenous sequence. The expression system may be a host cell that expresses the fusion protein described above; the fusion protein can complex with an sgRNA so that it is localized to a target region, enabling base editing of the target region.
It is also an object of the present invention to provide a use, specifically the use of said fusion protein, said polynucleotide, said construct or said expression system in gene editing, the gene editing being conversion of base A to G.
It is a further object of the present invention to provide a base editing system comprising the fusion protein and an sgRNA, the fusion protein cooperating with the sgRNA to localize the fusion protein to a target region.
It is still another object of the present invention to provide a gene editing method comprising performing gene editing with the fusion protein or the base editing system, the gene editing being conversion of base A to G.
The beneficial effects are as follows:
The high-precision adenine base editor fuses the exogenous protein e18 to an existing adenine base editor; the gene sequences of the resulting plasmids, ABEmax-e18 and ABE8e-e18, are shown as SEQ ID NO. 1 and SEQ ID NO. 10 respectively. Their editing windows are markedly more precise, spanning positions 5-7 and positions 1-9 respectively. PCSK9-edited positive clone cells show significantly increased LDL uptake, indicating potential for use in gene therapy. The invention offers further options for developing precise base editors and adds to the available editing tools.
The invention shortens the half-life of the ABEs protein expressed from the plasmid, accelerates its degradation in cells, increases ABEs precision and reduces off-target effects.
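The reasoning that a shorter protein half-life reduces the time the editor is available to act at unintended sites can be illustrated with a simple first-order decay model. The following Python sketch is illustrative only: the decay model, the function names and the half-life values are assumptions made here and are not data from this patent.

```python
import math


def remaining_fraction(t_hours: float, half_life_hours: float) -> float:
    """Fraction of editor protein left after t_hours under first-order decay."""
    return 0.5 ** (t_hours / half_life_hours)


def total_exposure(half_life_hours: float, initial_level: float = 1.0) -> float:
    """Area under the decay curve from t = 0 to infinity: P0 * t_half / ln 2."""
    return initial_level * half_life_hours / math.log(2)


# Hypothetical half-lives chosen only for illustration (not measured values).
for label, t_half in [("unfused editor", 24.0), ("e18-fused editor", 6.0)]:
    left = remaining_fraction(24.0, t_half)
    exposure = total_exposure(t_half)
    print(f"{label}: fraction left at 24 h = {left:.4f}, exposure = {exposure:.2f}")
```

Under this model, cumulative exposure scales linearly with half-life, so a fourfold shorter half-life gives a fourfold smaller exposure. Whether off-target editing falls in the same proportion is not established by the model.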
Drawings
FIG. 1a is a schematic diagram of the ABEmax-e18 plasmid vector of the present invention; FIG. 1b is a schematic diagram of the ABE8e-e18 plasmid vector of the present invention;
FIG. 2 is a statistical chart of sequencing-based sgRNA base editing efficiency at multiple endogenous sites for ABEmax and ABEmax-e18 of the present invention;
FIG. 3 is a sequencing chromatogram of a positive PCSK9-edited clone of the present invention;
FIG. 4 is a comparison of LDL uptake by wild-type cells and positive PCSK9-edited clone cells of the present invention;
FIG. 5 is a statistical chart of sequencing-based sgRNA base editing efficiency at multiple endogenous sites for ABE8e and ABE8e-e18 of the present invention.
Detailed Description
The first aspect of the invention provides a fusion protein comprising, from the N-terminus to the C-terminus, a first region and a second region, the first region comprising ABEs, the ABEs comprising an adenine deaminase or an enzymatically active component thereof and nCas9, and the second region comprising an e18 protein, the fusion protein optionally comprising one or more linker amino acid sequences located within the first region and between the first and second regions of the fusion protein. The fusion protein performs base editing at a target position under the guidance of an sgRNA. The ABEs fragment is an existing ABEs base editor, and the editor uses the e18 sequence expressed in fusion with it to accelerate degradation of nCas9-TadA, which shortens the half-life of the fusion protein and thereby increases precision.
In the fusion protein provided by the invention, ABEs is one of ABE7.10, ABEmax and ABE8e base editors.
The fusion protein provided by the invention has the amino acid sequence shown in SEQ ID NO. 40 or SEQ ID NO. 41, or an amino acid sequence that has 80% or more sequence similarity with SEQ ID NO. 40 or SEQ ID NO. 41 and retains the function of the amino acid sequence defined by SEQ ID NO. 40 or SEQ ID NO. 41. Specifically, the amino acid sequence having 80% or more sequence similarity with SEQ ID NO. 40 or SEQ ID NO. 41 refers to a polypeptide fragment obtained by substituting, deleting or adding one or more (specifically 1 to 50, 1 to 30, 1 to 20, 1 to 10, 1 to 5, 1 to 3, 1 to 2, or 3) amino acids in the amino acid sequence shown in SEQ ID NO. 40 or SEQ ID NO. 41, or by adding one or more (specifically 1 to 50, 1 to 30, 1 to 20, 1 to 10, 1 to 5, 1 to 3, 1 to 2, or 3) amino acids at the N-terminus and/or the C-terminus, and having the function of the polypeptide shown in SEQ ID NO. 40 or SEQ ID NO. 41. Sequence similarity generally refers to the percentage of identical amino acid residues among the aligned sequences; the similarity of two or more sequences may be calculated using software well known in the art, for example software from NCBI.
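As an illustration of the "percentage of identical amino acid residues" definition above, the following Python sketch computes percent identity over a pre-computed pairwise alignment. The function names, the toy peptides and the choice of alignment length as the denominator are assumptions made here; alignment tools differ in how they define the denominator, and this sketch does not reproduce any particular NCBI program.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns where both sequences carry the same residue.

    Inputs must already be aligned: equal length, with '-' marking gaps.
    The denominator is the alignment length (one common convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the aligned pair reaches the stated similarity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy peptides for illustration only; they are not SEQ ID NO. 40 or 41.
print(percent_identity("MKRTADGSEF", "MKRTADGSDF"))
print(meets_threshold("MKRTADGSEF", "MKRTADGSDF"))
```

A single substitution in a ten-residue alignment leaves nine identical columns, so the pair clears the 80% threshold used in the text.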
In the fusion proteins provided by the present invention, the substitution, deletion or addition may be conservative amino acid substitutions. The term "conservative amino acid substitution" may specifically refer to the case where an amino acid residue is substituted for another amino acid residue having a similar side chain. Families of amino acid residues with similar side chains should be known to those skilled in the art.
In the fusion proteins provided by the present invention, the fusion protein may further comprise a nuclear localization signal fragment, which typically interacts with a nuclear import carrier so that the protein can be transported into the nucleus.
In a second aspect the invention provides a polynucleotide encoding a fusion protein as described above.
The polynucleotide sequence provided by the invention is shown as SEQ ID NO.1 or SEQ ID NO. 10. The polynucleotide provided by the invention has the construction mode that the polynucleotide sequence shown as SEQ ID NO.1 is constructed in the following way, the ABEmax gene sequence before gene modification is shown as SEQ ID NO. 2, the upstream primer sequence F1 capable of effectively amplifying the target e18 gene fragment 1 is shown as SEQ ID NO. 3, the downstream primer sequence R1 is shown as SEQ ID NO.4, the upstream primer sequence F2 capable of effectively amplifying the target e18 gene fragment 2 is shown as SEQ ID NO.5, the downstream primer sequence R2 is shown as SEQ ID NO. 6, the upstream primer sequence F3 capable of effectively amplifying the e18 fragment gene sequence with the enzyme cleavage site is shown as SEQ ID NO. 7, the downstream primer sequence R3 is shown as SEQ ID NO.8, the e18 fragment with the double enzyme cleavage site is shown as SEQ ID NO. 9, and the sequence obtained by connecting the fragment after enzyme cleavage of SEQ ID NO. 9 with the fragment after enzyme cleavage of SEQ ID NO. 2 is shown as SEQ ID NO. 1.
The polynucleotide provided by the invention has the construction mode of a polynucleotide sequence shown as SEQ ID NO. 10, wherein the ABE8e gene sequence before gene modification is shown as SEQ ID NO. 11, and the sequence obtained by connecting a fragment SEQ ID NO. 9 with a fragment obtained by enzyme digestion of SEQ ID NO. 11 is shown as SEQ ID NO. 10.
In a third aspect the invention provides a construct comprising said polynucleotide. The construct may be constructed by inserting the polynucleotide into a suitable expression vector. The skilled artisan can select a suitable expression vector, including but not limited to a pCMV expression vector, a pSV2 expression vector, and the like.
In a fourth aspect, the present invention provides an expression system comprising the construct described above, or having the polynucleotide described above integrated into its genome as an exogenous sequence. The expression system may be a host cell that expresses the fusion protein described above; the fusion protein can complex with an sgRNA so that it is localized to a target region, enabling base editing of the target region. In another embodiment of the present invention, the host cell may be a eukaryotic cell and/or a prokaryotic cell, more specifically a mouse cell, a human cell, etc., and more specifically a mouse brain neuroma cell, a human embryonic kidney cell, a human cervical cancer cell, a human colon cancer cell, a human osteosarcoma cell, etc.
In a fifth aspect the invention provides the use of said fusion protein, said polynucleotide, said construct or said expression system in gene editing, preferably gene editing of eukaryotes. The eukaryote may specifically be a metazoan, including but not limited to humans, mice and the like. The use may specifically be base editing, including but not limited to A-to-G conversion, which can be applied to editing splice acceptor/donor sites to modulate RNA splicing, to constructing models (e.g., disease models, cell models, animal models), or to treating human diseases. In one embodiment of the invention, the object being edited may be an embryo, a cell, or the like.
In a sixth aspect, the invention provides a base editing system comprising the fusion protein and an sgRNA. One skilled in the art can select an appropriate sgRNA targeting a specific site based on the region of the gene to be edited. For example, the sgRNA sequence can be at least partially complementary to the target region, so that it cooperates with the fusion protein to localize the fusion protein to the target region and effect base editing within it, the base editing being conversion of base A to G.
In a seventh aspect of the present invention, there is provided a gene editing method comprising performing gene editing with the fusion protein or the base editing system, the gene editing being conversion of base A to G. For example, the gene editing method may comprise culturing the expression system provided in the fourth aspect of the invention under appropriate conditions to express the fusion protein, which performs base editing of the target region in the presence of an sgRNA that targets that region and complexes with the fusion protein. Methods of providing conditions under which the sgRNA is present should be known to those skilled in the art; for example, an expression system capable of expressing the sgRNA may be cultured under appropriate conditions, and this expression system may be a host cell comprising an expression vector that contains a polynucleotide encoding the sgRNA, or a host cell having a polynucleotide encoding the sgRNA integrated in its chromosome. In a specific embodiment of the invention, the sgRNA and the fusion protein may be expressed in the same host cell, which may be a target cell. In another embodiment of the present invention, the gene editing is in vitro gene editing.
Other advantages and effects of the present invention will become apparent to those skilled in the art from the following disclosure, which describes embodiments of the present invention with reference to specific examples. The invention may also be practiced or applied in other embodiments, and the details of the present description may be modified or varied without departing from the spirit and scope of the present invention.
Before further describing embodiments of the present invention, it is to be understood that the scope of the invention is not limited to the specific embodiments described below, and that the terminology used in the examples of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the scope of the invention, as the singular forms "a", "an" and "the" include plural forms unless the context clearly dictates otherwise.
Where numerical ranges are provided in the examples, it is understood that, unless otherwise stated herein, both endpoints of each numerical range and any value between the two endpoints may be selected. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. In addition to the specific methods, devices and materials used in the embodiments, any prior-art methods, devices and materials similar or equivalent to those described in the embodiments may be used to practice the present invention, according to the knowledge of one skilled in the art and the description of the present invention.
Unless otherwise indicated, the experimental methods, detection methods, and preparation methods disclosed in the present invention employ techniques conventional in the art of molecular biology, biochemistry, chromatin structure and analysis, analytical chemistry, cell culture, recombinant DNA techniques, and related arts.
Example 1. Construction of the ABEmax-e18 and ABE8e-e18 plasmids
Primers were designed and synthesized for the e18 sequence, and an e18 fragment carrying restriction sites was obtained by in vitro PCR amplification. The fragment was ligated separately to ABEmax and ABE8e fragments digested with the same enzymes to obtain the ABEmax-e18 and ABE8e-e18 plasmids, whose main elements are, in order, adenine deaminase, nCas9, e18 and the nuclear localization signal. Characteristics such as editing activity can be determined through the combined action of the editor plasmid and an sgRNA plasmid. After the constructed plasmids were verified by sequencing, the target plasmids were extracted and ethanol-precipitated, yielding purified expression vectors at a set concentration. The constructed plasmid maps are shown in FIG. 1.
The invention constructs an ABEs expression plasmid that shortens the half-life of the ABEs protein, accelerates its degradation in cells, increases ABEs precision and reduces off-target effects. The fusion protein comprises the deaminase, nCas9 and e18 fragments in sequence.
Example 2. Design of sgRNA sequences and plasmid construction
sgRNA sequences were designed and synthesized for use in cells of human origin. The single-stranded DNA oligonucleotides for each designed sgRNA were synthesized and annealed to form double-stranded sgRNA oligonucleotides for the different sites, which were then ligated into the sgRNA backbone plasmid vector. The sgRNA sequences are as follows:
sgRNA-1 sequence: 5'-GAATACTAAGCATAGACTCC-3'
sgRNA-2 sequence: 5'-GTAAACAAAGCATAGACTGA-3'
sgRNA-3 sequence: 5'-GAACACAAAGCATAGACTGC-3'
sgRNA-4 sequence: 5'-GATGAGATAATGATGAGTCA-3'
sgRNA-5 sequence: 5'-GACAAACCAGAAGCCGCTCC-3'
sgRNA-6 sequence: 5'-GGGAATAAATCATAGAATCC-3'
sgRNA-7 sequence: 5'-GGAACACAAAGCATAGACTG-3'
sgRNA-8 sequence: 5'-GCACCTACCTCGGGAGCTGA-3'
sgRNA-9 sequence: 5'-GGAATCCCTTCTGCAGCACC-3'
sgRNA-10 sequence: 5'-TCAGAAAGTGGTGGCTGGTG-3'
sgRNA-11 sequence: 5'-GGCCCAGACTGAGCACGTGA-3'
sgRNA-12 sequence: 5'-ATATTTGCATTGAGATAGTG-3'
sgRNA-13 sequence: 5'-GTCATCTTAGTCATTACCTG-3'
sgRNA-14 sequence: 5'-GAAGATAGAGAATAGACTGC-3'
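Annealing requires a complementary strand for each sgRNA oligonucleotide. As a minimal sketch, the following Python function derives the reverse complement of a spacer sequence. The function and constant names are hypothetical, and any cloning overhangs needed for a particular sgRNA backbone are deliberately omitted because the text does not specify them.

```python
# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    seq = seq.upper()
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq.translate(_COMPLEMENT)[::-1]


# sgRNA-1 spacer as listed in Example 2.
SGRNA_1 = "GAATACTAAGCATAGACTCC"

# Bottom-strand oligo (5' to 3') that would pair with the sgRNA-1 top strand.
print(reverse_complement(SGRNA_1))
```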
After the constructed sgRNA expression vectors were verified by sequencing, the target plasmids were extracted and ethanol-precipitated, yielding purified sgRNA expression vectors at a set concentration.
Example 3. Co-transfection of ABEs plasmids with sgRNA expression vectors
3-1 Co-transfection of ABEmax and ABEmax-e18 plasmids with sgRNA expression vectors.
HEK293T cells were plated, and when they reached about 80% density, ABEmax or ABEmax-e18 was introduced into the cells together with an sgRNA expression vector by liposome transfection. 72 hours after transfection, the genome of each group of cells was extracted, PCR was performed with specific primers for detecting mutation efficiency, and the PCR products were sent for sequencing. The editing efficiency at each sgRNA site and the editing window of each editor were evaluated from the sequencing chromatograms. The results in FIG. 2 show that the sgRNA1 sequence (corresponding to SEQ ID NO. 12 and SEQ ID NO. 13), the sgRNA2 sequence (SEQ ID NO. 14 and SEQ ID NO. 15), the sgRNA3 sequence (SEQ ID NO. 16 and SEQ ID NO. 17), the sgRNA4 sequence (SEQ ID NO. 18 and SEQ ID NO. 19), the sgRNA5 sequence (SEQ ID NO. 20 and SEQ ID NO. 21), the sgRNA6 sequence (SEQ ID NO. 22 and SEQ ID NO. 23) and the sgRNA7 sequence (SEQ ID NO. 24 and SEQ ID NO. 25) can effectively guide the Cas9 protein to edit the target sites in cells and reveal the characteristics of the editing window: the ABEmax editing window spans positions 3-8, and the ABEmax-e18 editing window spans positions 5-7.
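To make the effect of a narrower editing window concrete, the following Python sketch lists which adenines of a protospacer fall inside a given window. It assumes positions are counted from 1 at the 5' end of the 20-nt protospacer, which is a common convention but is not stated in the text; the function names are hypothetical, and the output says nothing about actual editing efficiency at each position.

```python
def editable_adenines(protospacer: str, window: tuple[int, int]) -> list[int]:
    """Return 1-based positions of adenines inside the inclusive window."""
    start, end = window
    return [
        pos
        for pos, base in enumerate(protospacer.upper(), start=1)
        if base == "A" and start <= pos <= end
    ]


def apply_a_to_g(protospacer: str, window: tuple[int, int]) -> str:
    """Idealized outcome: every adenine inside the window is converted to G."""
    start, end = window
    return "".join(
        "G" if base == "A" and start <= pos <= end else base
        for pos, base in enumerate(protospacer.upper(), start=1)
    )


SGRNA_1 = "GAATACTAAGCATAGACTCC"  # sgRNA-1 spacer from Example 2

# Windows reported in the text: ABEmax 3-8, ABEmax-e18 5-7.
for name, window in [("ABEmax", (3, 8)), ("ABEmax-e18", (5, 7))]:
    print(name, editable_adenines(SGRNA_1, window), apply_a_to_g(SGRNA_1, window))
```

For this spacer, the wider window contains three adenines while the narrower window contains one, which is the sense in which a narrower window reduces bystander edits.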
3-2 Co-transfection of ABE8e and ABE8e-e18 plasmids with sgRNA expression vectors.
HEK293T cells were plated in the same manner, and when they reached about 80% density, ABE8e or ABE8e-e18 was introduced into the cells by liposome transfection. 72 hours after transfection, the genome of each group of cells was extracted, PCR was performed with specific primers for detecting mutation efficiency, and the PCR products were sent for sequencing. The editing efficiency at each sgRNA site and the editing window of each editor were evaluated from the sequencing chromatograms. The results in FIG. 5 show that the sgRNA1 sequence (corresponding to SEQ ID NO. 12 and SEQ ID NO. 13), the sgRNA2 sequence (SEQ ID NO. 14 and SEQ ID NO. 15), the sgRNA3 sequence (SEQ ID NO. 16 and SEQ ID NO. 17), the sgRNA4 sequence (SEQ ID NO. 18 and SEQ ID NO. 19), the sgRNA6 sequence (SEQ ID NO. 22 and SEQ ID NO. 23), the sgRNA9 sequence (SEQ ID NO. 28 and SEQ ID NO. 29), the sgRNA10 sequence (SEQ ID NO. 30 and SEQ ID NO. 31), the sgRNA11 sequence (SEQ ID NO. 32 and SEQ ID NO. 33), the sgRNA13 sequence (SEQ ID NO. 36 and SEQ ID NO. 37) and the sgRNA14 sequence (SEQ ID NO. 38 and SEQ ID NO. 39) can effectively guide the Cas9 protein to edit the target sites in cells and reveal the characteristics of the editing window, such as the range of positions and the range of efficiencies within it: the ABE8e editing window spans positions 1-14, and the ABE8e-e18 editing window spans positions 1-9.
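The text evaluates editing efficiency from sequencing peak diagrams without giving a formula. One simple way to express this, offered here as an assumption rather than the authors' method, is the G peak height as a share of the combined A and G peak heights at each target adenine. The function names, the threshold and the peak values in the Python sketch below are hypothetical.

```python
from typing import Optional


def a_to_g_efficiency(peak_a: float, peak_g: float) -> float:
    """Percent A-to-G editing at one position from A and G peak heights."""
    total = peak_a + peak_g
    if total <= 0:
        raise ValueError("peak heights must sum to a positive value")
    return 100.0 * peak_g / total


def editing_window(
    efficiency_by_position: dict[int, float], threshold: float = 10.0
) -> Optional[tuple[int, int]]:
    """First and last position whose efficiency reaches the threshold."""
    edited = [pos for pos, eff in efficiency_by_position.items() if eff >= threshold]
    return (min(edited), max(edited)) if edited else None


# Hypothetical (A, G) peak heights at three adenine positions, for illustration.
peaks = {3: (900.0, 100.0), 5: (400.0, 600.0), 8: (950.0, 50.0)}
efficiencies = {pos: a_to_g_efficiency(a, g) for pos, (a, g) in peaks.items()}
print(efficiencies)
print(editing_window(efficiencies))
```

Dedicated chromatogram analysis tools also correct for background signal, which this sketch ignores.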
3-3 Co-transfection of the ABEmax-e18 plasmid with the PCSK9-sgRNA expression vector.
HepG2 cells were revived; when nearly confluent, they were washed 2-3 times with PBS, the supernatant was discarded, and electrotransfection buffer was added. The ABEmax-e18 plasmid and the sgRNA expression plasmid were then added to the cells and buffer in proportion and mixed gently with a pipette, and the mixture was gently drawn into the electrotransfection pipette fitted with its dedicated tip. The pipette was inserted into the pipette holder with the electrode cup, electrotransfection buffer was added to the electrode cup, the program was set on the device, and the run was started. After the pulse, the sample was left to stand for 2 minutes, and the mixture in the electrotransfection pipette was transferred into a cell culture dish. Finally, the cell culture dish was placed in a 37 °C carbon dioxide incubator for culture. The medium was changed after 12 hours of culture. The results in FIG. 3 show that the sgRNA8 sequence corresponding to SEQ ID NO. 26 and SEQ ID NO. 27 can effectively guide the Cas9 protein to edit the target site in HepG2 cells.
Example 4. Preparation of PCSK9-edited positive clone HepG2 cells
72 h after electrotransfection, the HepG2 cells were digested with trypsin and plated into 100 mm cell culture dishes by limiting dilution, and the culture medium was replaced every 2-3 days. After 8-10 days, once cell clones had grown, they were marked under a fluorescence microscope, and the marked clones were picked into 24-well cell culture plates for further culture. After 2-3 days, when the cells in the 24-well plate had reached a suitable confluence, half of the cells were lysed with NP40 lysis buffer, and PCSK9 site-directed base editing events were verified by PCR and sequencing; see FIG. 3.
Example 5. In vitro LDL uptake assay of positive PCSK9 site-directed base-edited HepG2 cells
The HepG2 cells obtained above were cultured in DMEM supplemented with 5% FBS, 1% penicillin-streptomycin, 1 mM sodium pyruvate, 1% glutamine and 1% non-essential amino acids. The cells were seeded at a set density in 6-well cell culture plates. When the cell density reached 70%, DiI-LDL was diluted 1:100 and incubated with the cells for 3 hours. The supernatant was then discarded and the cells were washed three times with PBS, fixed in the culture plates with 4% paraformaldehyde solution for 2 h, and washed three times with PBS for 5 minutes each. The cells were permeabilized with 0.5% Triton X-100 (prepared in PBS) for 10 min and washed three times with PBS for 5 minutes each. They were then stained for 10 minutes with 0.5 µg/ml DAPI (prepared in PBS), washed three times with PBS, and observed under a fluorescence microscope. Judged by fluorescence brightness, the site-directed edited cells took up more LDL than the unedited control cells; see FIG. 4.
In conclusion, the invention makes use of an exogenous protein: the e18 gene sequence shown as SEQ ID NO. 9 (the rad18 gene with the SAP domain removed) is fused to ABEmax and ABE8e, the commonly used adenine base editors with the highest editing activity, yielding editors with a more precise editing window and lower editing activity and providing a new direction for further improvement of base editors.
The above embodiments merely illustrate the principles of the present invention and its effects, and are not intended to limit the invention. Those skilled in the art may modify or vary the above embodiments without departing from the spirit and scope of the invention. Accordingly, all equivalent modifications and variations made by those of ordinary skill in the art without departing from the spirit and technical ideas disclosed herein shall be covered by the claims of the present invention.
Sequence listing
<110> Chongqing Jitang Biotechnology Research Institute Co., Ltd.
<120> A highly accurate adenine base editor and use thereof
<160> 41
<170> SIPOSequenceListing 1.0
<210> 1
<211> 10224
<212> DNA
<213> Artificial sequence
<400> 1
atatgccaag tacgccccct attgacgtca atgacggtaa atggcccgcc tggcattatg 60
cccagtacat gaccttatgg gactttccta cttggcagta catctacgta ttagtcatcg 120
ctattaccat ggtgatgcgg ttttggcagt acatcaatgg gcgtggatag cggtttgact 180
cacggggatt tccaagtctc caccccattg acgtcaatgg gagtttgttt tggcaccaaa 240
atcaacggga ctttccaaaa tgtcgtaaca actccgcccc attgacgcaa atgggcggta 300
ggcgtgtacg gtgggaggtc tatataagca gagctggttt agtgaaccgt cagatccgct 360
agagatccgc ggccgctaat acgactcact atagggagag ccgccaccat gaaacggaca 420
gccgacggaa gcgagttcga gtcaccaaag aagaagcgga aagtctctga agtcgagttt 480
agccacgagt attggatgag gcacgcactg accctggcaa agcgagcatg ggatgaaaga 540
gaagtccccg tgggcgccgt gctggtgcac aacaatagag tgatcggaga gggatggaac 600
aggccaatcg gccgccacga ccctaccgca cacgcagaga tcatggcact gaggcaggga 660
ggcctggtca tgcagaatta ccgcctgatc gatgccaccc tgtatgtgac actggagcca 720
tgcgtgatgt gcgcaggagc aatgatccac agcaggatcg gaagagtggt gttcggagca 780
cgggacgcca agaccggcgc agcaggctcc ctgatggatg tgctgcacca ccccggcatg 840
aaccaccggg tggagatcac agagggaatc ctggcagacg agtgcgccgc cctgctgagc 900
gatttcttta gaatgcggag acaggagatc aaggcccaga agaaggcaca gagctccacc 960
gactctggag gatctagcgg aggatcctct ggaagcgaga caccaggcac aagcgagtcc 1020
gccacaccag agagctccgg cggctcctcc ggaggatcct ctgaggtgga gttttcccac 1080
gagtactgga tgagacatgc cctgaccctg gccaagaggg cacgcgatga gagggaggtg 1140
cctgtgggag ccgtgctggt gctgaacaat agagtgatcg gcgagggctg gaacagagcc 1200
atcggcctgc acgacccaac agcccatgcc gaaattatgg ccctgagaca gggcggcctg 1260
gtcatgcaga actacagact gattgacgcc accctgtacg tgacattcga gccttgcgtg 1320
atgtgcgccg gcgccatgat ccactctagg atcggccgcg tggtgtttgg cgtgaggaac 1380
gcaaaaaccg gcgccgcagg ctccctgatg gacgtgctgc actaccccgg catgaatcac 1440
cgcgtcgaaa ttaccgaggg aatcctggca gatgaatgtg ccgccctgct gtgctatttc 1500
tttcggatgc ctagacaggt gttcaatgct cagaagaagg cccagagctc caccgactcc 1560
ggaggatcta gcggaggctc ctctggctct gagacacctg gcacaagcga gagcgcaaca 1620
cctgaaagca gcgggggcag cagcgggggg tcagacaaga agtacagcat cggcctggcc 1680
atcggcacca actctgtggg ctgggccgtg atcaccgacg agtacaaggt gcccagcaag 1740
aaattcaagg tgctgggcaa caccgaccgg cacagcatca agaagaacct gatcggagcc 1800
ctgctgttcg acagcggcga aacagccgag gccacccggc tgaagagaac cgccagaaga 1860
agatacacca gacggaagaa ccggatctgc tatctgcaag agatcttcag caacgagatg 1920
gccaaggtgg acgacagctt cttccacaga ctggaagagt ccttcctggt ggaagaggat 1980
aagaagcacg agcggcaccc catcttcggc aacatcgtgg acgaggtggc ctaccacgag 2040
aagtacccca ccatctacca cctgagaaag aaactggtgg acagcaccga caaggccgac 2100
ctgcggctga tctatctggc cctggcccac atgatcaagt tccggggcca cttcctgatc 2160
gagggcgacc tgaaccccga caacagcgac gtggacaagc tgttcatcca gctggtgcag 2220
acctacaacc agctgttcga ggaaaacccc atcaacgcca gcggcgtgga cgccaaggcc 2280
atcctgtctg ccagactgag caagagcaga cggctggaaa atctgatcgc ccagctgccc 2340
ggcgagaaga agaatggcct gttcggaaac ctgattgccc tgagcctggg cctgaccccc 2400
aacttcaaga gcaacttcga cctggccgag gatgccaaac tgcagctgag caaggacacc 2460
tacgacgacg acctggacaa cctgctggcc cagatcggcg accagtacgc cgacctgttt 2520
ctggccgcca agaacctgtc cgacgccatc ctgctgagcg acatcctgag agtgaacacc 2580
gagatcacca aggcccccct gagcgcctct atgatcaaga gatacgacga gcaccaccag 2640
gacctgaccc tgctgaaagc tctcgtgcgg cagcagctgc ctgagaagta caaagagatt 2700
ttcttcgacc agagcaagaa cggctacgcc ggctacattg acggcggagc cagccaggaa 2760
gagttctaca agttcatcaa gcccatcctg gaaaagatgg acggcaccga ggaactgctc 2820
gtgaagctga acagagagga cctgctgcgg aagcagcgga ccttcgacaa cggcagcatc 2880
ccccaccaga tccacctggg agagctgcac gccattctgc ggcggcagga agatttttac 2940
ccattcctga aggacaaccg ggaaaagatc gagaagatcc tgaccttccg catcccctac 3000
tacgtgggcc ctctggccag gggaaacagc agattcgcct ggatgaccag aaagagcgag 3060
gaaaccatca ccccctggaa cttcgaggaa gtggtggaca agggcgcttc cgcccagagc 3120
ttcatcgagc ggatgaccaa cttcgataag aacctgccca acgagaaggt gctgcccaag 3180
cacagcctgc tgtacgagta cttcaccgtg tataacgagc tgaccaaagt gaaatacgtg 3240
accgagggaa tgagaaagcc cgccttcctg agcggcgagc agaaaaaggc catcgtggac 3300
ctgctgttca agaccaaccg gaaagtgacc gtgaagcagc tgaaagagga ctacttcaag 3360
aaaatcgagt gcttcgactc cgtggaaatc tccggcgtgg aagatcggtt caacgcctcc 3420
ctgggcacat accacgatct gctgaaaatt atcaaggaca aggacttcct ggacaatgag 3480
gaaaacgagg acattctgga agatatcgtg ctgaccctga cactgtttga ggacagagag 3540
atgatcgagg aacggctgaa aacctatgcc cacctgttcg acgacaaagt gatgaagcag 3600
ctgaagcggc ggagatacac cggctggggc aggctgagcc ggaagctgat caacggcatc 3660
cgggacaagc agtccggcaa gacaatcctg gatttcctga agtccgacgg cttcgccaac 3720
agaaacttca tgcagctgat ccacgacgac agcctgacct ttaaagagga catccagaaa 3780
gcccaggtgt ccggccaggg cgatagcctg cacgagcaca ttgccaatct ggccggcagc 3840
cccgccatta agaagggcat cctgcagaca gtgaaggtgg tggacgagct cgtgaaagtg 3900
atgggccggc acaagcccga gaacatcgtg atcgaaatgg ccagagagaa ccagaccacc 3960
cagaagggac agaagaacag ccgcgagaga atgaagcgga tcgaagaggg catcaaagag 4020
ctgggcagcc agatcctgaa agaacacccc gtggaaaaca cccagctgca gaacgagaag 4080
ctgtacctgt actacctgca gaatgggcgg gatatgtacg tggaccagga actggacatc 4140
aaccggctgt ccgactacga tgtggaccat atcgtgcctc agagctttct gaaggacgac 4200
tccatcgaca acaaggtgct gaccagaagc gacaagaacc ggggcaagag cgacaacgtg 4260
ccctccgaag aggtcgtgaa gaagatgaag aactactggc ggcagctgct gaacgccaag 4320
ctgattaccc agagaaagtt cgacaatctg accaaggccg agagaggcgg cctgagcgaa 4380
ctggataagg ccggcttcat caagagacag ctggtggaaa cccggcagat cacaaagcac 4440
gtggcacaga tcctggactc ccggatgaac actaagtacg acgagaatga caagctgatc 4500
cgggaagtga aagtgatcac cctgaagtcc aagctggtgt ccgatttccg gaaggatttc 4560
cagttttaca aagtgcgcga gatcaacaac taccaccacg cccacgacgc ctacctgaac 4620
gccgtcgtgg gaaccgccct gatcaaaaag taccctaagc tggaaagcga gttcgtgtac 4680
ggcgactaca aggtgtacga cgtgcggaag atgatcgcca agagcgagca ggaaatcggc 4740
aaggctaccg ccaagtactt cttctacagc aacatcatga actttttcaa gaccgagatt 4800
accctggcca acggcgagat ccggaagcgg cctctgatcg agacaaacgg cgaaaccggg 4860
gagatcgtgt gggataaggg ccgggatttt gccaccgtgc ggaaagtgct gagcatgccc 4920
caagtgaata tcgtgaaaaa gaccgaggtg cagacaggcg gcttcagcaa agagtctatc 4980
cggcccaaga ggaacagcga taagctgatc gccagaaaga aggactggga ccctaagaag 5040
tacggcggct tcgtgagccc caccgtggcc tattctgtgc tggtggtggc caaagtggaa 5100
aagggcaagt ccaagaaact gaagagtgtg aaagagctgc tggggatcac catcatggaa 5160
agaagcagct tcgagaagaa tcccatcgac tttctggaag ccaagggcta caaagaagtg 5220
aaaaaggacc tgatcatcaa gctgcctaag tactccctgt tcgagctgga aaacggccgg 5280
aagagaatgc tggcctctgc cagattcctg cagaagggaa acgaactggc cctgccctcc 5340
aaatatgtga acttcctgta cctggccagc cactatgaga agctgaaggg ctcccccgag 5400
gataatgagc agaaacagct gtttgtggaa cagcacaagc actacctgga cgagatcatc 5460
gagcagatca gcgagttctc caagagagtg atcctggccg acgctaatct ggacaaagtg 5520
ctgtccgcct acaacaagca ccgggataag cccatcagag agcaggccga gaatatcatc 5580
cacctgttta ccctgaccaa tctgggagcc cctcgggcct tcaagtactt tgacaccacc 5640
atcgaccgga aggtgtaccg gagcaccaaa gaggtgctgg acgccaccct gatccaccag 5700
agcatcaccg gcctgtacga gacacggatc gacctgtctc agctgggagg tgactctggc 5760
ggctcaaaaa gaaccgccga cggcagcgaa ttcagcacag ggagcatggg aatggactcc 5820
ctggccgagt ctcggtggcc tccgggcctg gcagtcatga agacaataga tgatttgctg 5880
cggtgtggaa tttgcttcga gtatttcaac attgcaatga taatacctca gtgttcacat 5940
aactactgct ctctctgtat aagaaaattt ctgtcctata aaactcagtg tccaacttgc 6000
tgtgtgactg tcacagagcc ggatctgaaa aataaccgca tattagatga actggtaaaa 6060
agcttgaatt ttgcacggaa tcatctgctg cagtttgctt tagagtcacc agccaaatct 6120
cctgcttctt cctcttcaaa gaatcttgct gtcaaagtat atactcctgt agcctccaga 6180
cagtctttaa agcaggggag caggttaatg gataatttct tgatcagaga aatgagtggt 6240
tctacatcag agttgttgat aaaagaaaat aaaagcaaat tcagccctca aaaagaggcg 6300
agccctgctg caaagaccaa agagacacgt tctgtagaag agatcgctcc agatccctca 6360
gaggctaagc gtcctgagcc accctcgaca tccactttga aacaagttac taaagtggat 6420
tgtcctgttt gcggggttaa cattccagaa agtcacatta ataagcattt agacagctgt 6480
ttatcacgcg aagagaagaa ggaaagcctc agaagttctg ttcacaaaag gaagccgcac 6540
atgtacaatg cccaatgcga tgctttgcat cctaaatcag ctgctgaaat agttcgagaa 6600
atcgaaaata tagagaagac taggatgcgt cttgaagcta gtaaactcaa tgaaagtgta 6660
atggttttta caaaggacca aacagaaaag gaaatagatg aaatccacag taaatatcgt 6720
aaaaaacata agagtgaatt tcagcttctg gtggatcagg ctagaaaagg atacaagaaa 6780
attgctggaa tgtcacaaaa aacagtaaca ataacaaaag aagatgaatc tacagaaaag 6840
ctatcttctg tatgcatggg acaggaagat aatatgacct cagtaacaaa ccacttttct 6900
caatcaaagc tggactcccc agaggaattg gaacctgaca gagaagagga ttcttctagc 6960
tgtattgata ttcaagaagt tctttcttca tcagaatcag attcatgcaa tagttccagt 7020
tcagacatca taagagatct tttagaagaa gaggaagcct gggaagcatc acataaaaac 7080
gatcttcaag acacagaaat aagtccaaga cagaatcgcc gcacaagagc cgctgaaagt 7140
gctgagattg aaccaagaaa caagcgtaat aggaatgaaa aaagaaccgc cgacggcagc 7200
gagttcgagc ccaagaagaa gaggaaagtc caaccggtca tcatcaccat caccattgag 7260
tttaaacccg ctgatcagcc tcgactgtgc cttctagttg ccagccatct gttgtttgcc 7320
cctcccccgt gccttccttg accctggaag gtgccactcc cactgtcctt tcctaataaa 7380
atgaggaaat tgcatcgcat tgtctgagta ggtgtcattc tattctgggg ggtggggtgg 7440
ggcaggacag caagggggag gattgggaag acaatagcag gcatgctggg gatgcggtgg 7500
gctctatggc ttctgaggcg gaaagaacca gctggggctc gataccgtcg acctctagct 7560
agagcttggc gtaatcatgg tcatagctgt ttcctgtgtg aaattgttat ccgctcacaa 7620
ttccacacaa catacgagcc ggaagcataa agtgtaaagc ctagggtgcc taatgagtga 7680
gctaactcac attaattgcg ttgcgctcac tgcccgcttt ccagtcggga aacctgtcgt 7740
gccagctgca ttaatgaatc ggccaacgcg cggggagagg cggtttgcgt attgggcgct 7800
cttccgcttc ctcgctcact gactcgctgc gctcggtcgt tcggctgcgg cgagcggtat 7860
cagctcactc aaaggcggta atacggttat ccacagaatc aggggataac gcaggaaaga 7920
acatgtgagc aaaaggccag caaaaggcca ggaaccgtaa aaaggccgcg ttgctggcgt 7980
ttttccatag gctccgcccc cctgacgagc atcacaaaaa tcgacgctca agtcagaggt 8040
ggcgaaaccc gacaggacta taaagatacc aggcgtttcc ccctggaagc tccctcgtgc 8100
gctctcctgt tccgaccctg ccgcttaccg gatacctgtc cgcctttctc ccttcgggaa 8160
gcgtggcgct ttctcatagc tcacgctgta ggtatctcag ttcggtgtag gtcgttcgct 8220
ccaagctggg ctgtgtgcac gaaccccccg ttcagcccga ccgctgcgcc ttatccggta 8280
actatcgtct tgagtccaac ccggtaagac acgacttatc gccactggca gcagccactg 8340
gtaacaggat tagcagagcg aggtatgtag gcggtgctac agagttcttg aagtggtggc 8400
ctaactacgg ctacactaga agaacagtat ttggtatctg cgctctgctg aagccagtta 8460
ccttcggaaa aagagttggt agctcttgat ccggcaaaca aaccaccgct ggtagcggtg 8520
gtttttttgt ttgcaagcag cagattacgc gcagaaaaaa aggatctcaa gaagatcctt 8580
tgatcttttc tacggggtct gacactcagt ggaacgaaaa ctcacgttaa gggattttgg 8640
tcatgagatt atcaaaaagg atcttcacct agatcctttt aaattaaaaa tgaagtttta 8700
aatcaatcta aagtatatat gagtaaactt ggtctgacag ttaccaatgc ttaatcagtg 8760
aggcacctat ctcagcgatc tgtctatttc gttcatccat agttgcctga ctccccgtcg 8820
tgtagataac tacgatacgg gagggcttac catctggccc cagtgctgca atgataccgc 8880
gagacccacg ctcaccggct ccagatttat cagcaataaa ccagccagcc ggaagggccg 8940
agcgcagaag tggtcctgca actttatccg cctccatcca gtctattaat tgttgccggg 9000
aagctagagt aagtagttcg ccagttaata gtttgcgcaa cgttgttgcc attgctacag 9060
gcatcgtggt gtcacgctcg tcgtttggta tggcttcatt cagctccggt tcccaacgat 9120
caaggcgagt tacatgatcc cccatgttgt gcaaaaaagc ggttagctcc ttcggtcctc 9180
cgatcgttgt cagaagtaag ttggccgcag tgttatcact catggttatg gcagcactgc 9240
ataattctct tactgtcatg ccatccgtaa gatgcttttc tgtgactggt gagtactcaa 9300
ccaagtcatt ctgagaatag tgtatgcggc gaccgagttg ctcttgcccg gcgtcaatac 9360
gggataatac cgcgccacat agcagaactt taaaagtgct catcattgga aaacgttctt 9420
cggggcgaaa actctcaagg atcttaccgc tgttgagatc cagttcgatg taacccactc 9480
gtgcacccaa ctgatcttca gcatctttta ctttcaccag cgtttctggg tgagcaaaaa 9540
caggaaggca aaatgccgca aaaaagggaa taagggcgac acggaaatgt tgaatactca 9600
tactcttcct ttttcaatat tattgaagca tttatcaggg ttattgtctc atgagcggat 9660
acatatttga atgtatttag aaaaataaac aaataggggt tccgcgcaca tttccccgaa 9720
aagtgccacc tgacgtcgac ggatcgggag atcgatctcc cgatccccta gggtcgactc 9780
tcagtacaat ctgctctgat gccgcatagt taagccagta tctgctccct gcttgtgtgt 9840
tggaggtcgc tgagtagtgc gcgagcaaaa tttaagctac aacaaggcaa ggcttgaccg 9900
acaattgcat gaagaatctg cttagggtta ggcgttttgc gctgcttcgc gatgtacggg 9960
ccagatatac gcgttgacat tgattattga ctagttatta atagtaatca attacggggt 10020
cattagttca tagcccatat atggagttcc gcgttacata acttacggta aatggcccgc 10080
ctggctgacc gcccaacgac ccccgcccat tgacgtcaat aatgacgtat gttcccatag 10140
taacgccaat agggactttc cattgacgtc aatgggtgga gtatttacgg taaactgccc 10200
acttggcagt acatcaagtg tatc 10224
<210> 2
<211> 8811
<212> DNA
<213> Artificial sequence
<400> 2
atatgccaag tacgccccct attgacgtca atgacggtaa atggcccgcc tggcattatg 60
cccagtacat gaccttatgg gactttccta cttggcagta catctacgta ttagtcatcg 120
ctattaccat ggtgatgcgg ttttggcagt acatcaatgg gcgtggatag cggtttgact 180
cacggggatt tccaagtctc caccccattg acgtcaatgg gagtttgttt tggcaccaaa 240
atcaacggga ctttccaaaa tgtcgtaaca actccgcccc attgacgcaa atgggcggta 300
ggcgtgtacg gtgggaggtc tatataagca gagctggttt agtgaaccgt cagatccgct 360
agagatccgc ggccgctaat acgactcact atagggagag ccgccaccat gaaacggaca 420
gccgacggaa gcgagttcga gtcaccaaag aagaagcgga aagtctctga agtcgagttt 480
agccacgagt attggatgag gcacgcactg accctggcaa agcgagcatg ggatgaaaga 540
gaagtccccg tgggcgccgt gctggtgcac aacaatagag tgatcggaga gggatggaac 600
aggccaatcg gccgccacga ccctaccgca cacgcagaga tcatggcact gaggcaggga 660
ggcctggtca tgcagaatta ccgcctgatc gatgccaccc tgtatgtgac actggagcca 720
tgcgtgatgt gcgcaggagc aatgatccac agcaggatcg gaagagtggt gttcggagca 780
cgggacgcca agaccggcgc agcaggctcc ctgatggatg tgctgcacca ccccggcatg 840
aaccaccggg tggagatcac agagggaatc ctggcagacg agtgcgccgc cctgctgagc 900
gatttcttta gaatgcggag acaggagatc aaggcccaga agaaggcaca gagctccacc 960
gactctggag gatctagcgg aggatcctct ggaagcgaga caccaggcac aagcgagtcc 1020
gccacaccag agagctccgg cggctcctcc ggaggatcct ctgaggtgga gttttcccac 1080
gagtactgga tgagacatgc cctgaccctg gccaagaggg cacgcgatga gagggaggtg 1140
cctgtgggag ccgtgctggt gctgaacaat agagtgatcg gcgagggctg gaacagagcc 1200
atcggcctgc acgacccaac agcccatgcc gaaattatgg ccctgagaca gggcggcctg 1260
gtcatgcaga actacagact gattgacgcc accctgtacg tgacattcga gccttgcgtg 1320
atgtgcgccg gcgccatgat ccactctagg atcggccgcg tggtgtttgg cgtgaggaac 1380
gcaaaaaccg gcgccgcagg ctccctgatg gacgtgctgc actaccccgg catgaatcac 1440
cgcgtcgaaa ttaccgaggg aatcctggca gatgaatgtg ccgccctgct gtgctatttc 1500
tttcggatgc ctagacaggt gttcaatgct cagaagaagg cccagagctc caccgactcc 1560
ggaggatcta gcggaggctc ctctggctct gagacacctg gcacaagcga gagcgcaaca 1620
cctgaaagca gcgggggcag cagcgggggg tcagacaaga agtacagcat cggcctggcc 1680
atcggcacca actctgtggg ctgggccgtg atcaccgacg agtacaaggt gcccagcaag 1740
aaattcaagg tgctgggcaa caccgaccgg cacagcatca agaagaacct gatcggagcc 1800
ctgctgttcg acagcggcga aacagccgag gccacccggc tgaagagaac cgccagaaga 1860
agatacacca gacggaagaa ccggatctgc tatctgcaag agatcttcag caacgagatg 1920
gccaaggtgg acgacagctt cttccacaga ctggaagagt ccttcctggt ggaagaggat 1980
aagaagcacg agcggcaccc catcttcggc aacatcgtgg acgaggtggc ctaccacgag 2040
aagtacccca ccatctacca cctgagaaag aaactggtgg acagcaccga caaggccgac 2100
ctgcggctga tctatctggc cctggcccac atgatcaagt tccggggcca cttcctgatc 2160
gagggcgacc tgaaccccga caacagcgac gtggacaagc tgttcatcca gctggtgcag 2220
acctacaacc agctgttcga ggaaaacccc atcaacgcca gcggcgtgga cgccaaggcc 2280
atcctgtctg ccagactgag caagagcaga cggctggaaa atctgatcgc ccagctgccc 2340
ggcgagaaga agaatggcct gttcggaaac ctgattgccc tgagcctggg cctgaccccc 2400
aacttcaaga gcaacttcga cctggccgag gatgccaaac tgcagctgag caaggacacc 2460
tacgacgacg acctggacaa cctgctggcc cagatcggcg accagtacgc cgacctgttt 2520
ctggccgcca agaacctgtc cgacgccatc ctgctgagcg acatcctgag agtgaacacc 2580
gagatcacca aggcccccct gagcgcctct atgatcaaga gatacgacga gcaccaccag 2640
gacctgaccc tgctgaaagc tctcgtgcgg cagcagctgc ctgagaagta caaagagatt 2700
ttcttcgacc agagcaagaa cggctacgcc ggctacattg acggcggagc cagccaggaa 2760
gagttctaca agttcatcaa gcccatcctg gaaaagatgg acggcaccga ggaactgctc 2820
gtgaagctga acagagagga cctgctgcgg aagcagcgga ccttcgacaa cggcagcatc 2880
ccccaccaga tccacctggg agagctgcac gccattctgc ggcggcagga agatttttac 2940
ccattcctga aggacaaccg ggaaaagatc gagaagatcc tgaccttccg catcccctac 3000
tacgtgggcc ctctggccag gggaaacagc agattcgcct ggatgaccag aaagagcgag 3060
gaaaccatca ccccctggaa cttcgaggaa gtggtggaca agggcgcttc cgcccagagc 3120
ttcatcgagc ggatgaccaa cttcgataag aacctgccca acgagaaggt gctgcccaag 3180
cacagcctgc tgtacgagta cttcaccgtg tataacgagc tgaccaaagt gaaatacgtg 3240
accgagggaa tgagaaagcc cgccttcctg agcggcgagc agaaaaaggc catcgtggac 3300
ctgctgttca agaccaaccg gaaagtgacc gtgaagcagc tgaaagagga ctacttcaag 3360
aaaatcgagt gcttcgactc cgtggaaatc tccggcgtgg aagatcggtt caacgcctcc 3420
ctgggcacat accacgatct gctgaaaatt atcaaggaca aggacttcct ggacaatgag 3480
gaaaacgagg acattctgga agatatcgtg ctgaccctga cactgtttga ggacagagag 3540
atgatcgagg aacggctgaa aacctatgcc cacctgttcg acgacaaagt gatgaagcag 3600
ctgaagcggc ggagatacac cggctggggc aggctgagcc ggaagctgat caacggcatc 3660
cgggacaagc agtccggcaa gacaatcctg gatttcctga agtccgacgg cttcgccaac 3720
agaaacttca tgcagctgat ccacgacgac agcctgacct ttaaagagga catccagaaa 3780
gcccaggtgt ccggccaggg cgatagcctg cacgagcaca ttgccaatct ggccggcagc 3840
cccgccatta agaagggcat cctgcagaca gtgaaggtgg tggacgagct cgtgaaagtg 3900
atgggccggc acaagcccga gaacatcgtg atcgaaatgg ccagagagaa ccagaccacc 3960
cagaagggac agaagaacag ccgcgagaga atgaagcgga tcgaagaggg catcaaagag 4020
ctgggcagcc agatcctgaa agaacacccc gtggaaaaca cccagctgca gaacgagaag 4080
ctgtacctgt actacctgca gaatgggcgg gatatgtacg tggaccagga actggacatc 4140
aaccggctgt ccgactacga tgtggaccat atcgtgcctc agagctttct gaaggacgac 4200
tccatcgaca acaaggtgct gaccagaagc gacaagaacc ggggcaagag cgacaacgtg 4260
ccctccgaag aggtcgtgaa gaagatgaag aactactggc ggcagctgct gaacgccaag 4320
ctgattaccc agagaaagtt cgacaatctg accaaggccg agagaggcgg cctgagcgaa 4380
ctggataagg ccggcttcat caagagacag ctggtggaaa cccggcagat cacaaagcac 4440
gtggcacaga tcctggactc ccggatgaac actaagtacg acgagaatga caagctgatc 4500
cgggaagtga aagtgatcac cctgaagtcc aagctggtgt ccgatttccg gaaggatttc 4560
cagttttaca aagtgcgcga gatcaacaac taccaccacg cccacgacgc ctacctgaac 4620
gccgtcgtgg gaaccgccct gatcaaaaag taccctaagc tggaaagcga gttcgtgtac 4680
ggcgactaca aggtgtacga cgtgcggaag atgatcgcca agagcgagca ggaaatcggc 4740
aaggctaccg ccaagtactt cttctacagc aacatcatga actttttcaa gaccgagatt 4800
accctggcca acggcgagat ccggaagcgg cctctgatcg agacaaacgg cgaaaccggg 4860
gagatcgtgt gggataaggg ccgggatttt gccaccgtgc ggaaagtgct gagcatgccc 4920
caagtgaata tcgtgaaaaa gaccgaggtg cagacaggcg gcttcagcaa agagtctatc 4980
cggcccaaga ggaacagcga taagctgatc gccagaaaga aggactggga ccctaagaag 5040
tacggcggct tcgtgagccc caccgtggcc tattctgtgc tggtggtggc caaagtggaa 5100
aagggcaagt ccaagaaact gaagagtgtg aaagagctgc tggggatcac catcatggaa 5160
agaagcagct tcgagaagaa tcccatcgac tttctggaag ccaagggcta caaagaagtg 5220
aaaaaggacc tgatcatcaa gctgcctaag tactccctgt tcgagctgga aaacggccgg 5280
aagagaatgc tggcctctgc cagattcctg cagaagggaa acgaactggc cctgccctcc 5340
aaatatgtga acttcctgta cctggccagc cactatgaga agctgaaggg ctcccccgag 5400
gataatgagc agaaacagct gtttgtggaa cagcacaagc actacctgga cgagatcatc 5460
gagcagatca gcgagttctc caagagagtg atcctggccg acgctaatct ggacaaagtg 5520
ctgtccgcct acaacaagca ccgggataag cccatcagag agcaggccga gaatatcatc 5580
cacctgttta ccctgaccaa tctgggagcc cctcgggcct tcaagtactt tgacaccacc 5640
atcgaccgga aggtgtaccg gagcaccaaa gaggtgctgg acgccaccct gatccaccag 5700
agcatcaccg gcctgtacga gacacggatc gacctgtctc agctgggagg tgactctggc 5760
ggctcaaaaa gaaccgccga cggcagcgaa ttcgagccca agaagaagag gaaagtctaa 5820
ccggtcatca tcaccatcac cattgagttt aaacccgctg atcagcctcg actgtgcctt 5880
ctagttgcca gccatctgtt gtttgcccct cccccgtgcc ttccttgacc ctggaaggtg 5940
ccactcccac tgtcctttcc taataaaatg aggaaattgc atcgcattgt ctgagtaggt 6000
gtcattctat tctggggggt ggggtggggc aggacagcaa gggggaggat tgggaagaca 6060
atagcaggca tgctggggat gcggtgggct ctatggcttc tgaggcggaa agaaccagct 6120
ggggctcgat accgtcgacc tctagctaga gcttggcgta atcatggtca tagctgtttc 6180
ctgtgtgaaa ttgttatccg ctcacaattc cacacaacat acgagccgga agcataaagt 6240
gtaaagccta gggtgcctaa tgagtgagct aactcacatt aattgcgttg cgctcactgc 6300
ccgctttcca gtcgggaaac ctgtcgtgcc agctgcatta atgaatcggc caacgcgcgg 6360
ggagaggcgg tttgcgtatt gggcgctctt ccgcttcctc gctcactgac tcgctgcgct 6420
cggtcgttcg gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca 6480
cagaatcagg ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga 6540
accgtaaaaa ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc 6600
acaaaaatcg acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg 6660
cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat 6720
acctgtccgc ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt 6780
atctcagttc ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc 6840
agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg 6900
acttatcgcc actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg 6960
gtgctacaga gttcttgaag tggtggccta actacggcta cactagaaga acagtatttg 7020
gtatctgcgc tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg 7080
gcaaacaaac caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca 7140
gaaaaaaagg atctcaagaa gatcctttga tcttttctac ggggtctgac actcagtgga 7200
acgaaaactc acgttaaggg attttggtca tgagattatc aaaaaggatc ttcacctaga 7260
tccttttaaa ttaaaaatga agttttaaat caatctaaag tatatatgag taaacttggt 7320
ctgacagtta ccaatgctta atcagtgagg cacctatctc agcgatctgt ctatttcgtt 7380
catccatagt tgcctgactc cccgtcgtgt agataactac gatacgggag ggcttaccat 7440
ctggccccag tgctgcaatg ataccgcgag acccacgctc accggctcca gatttatcag 7500
caataaacca gccagccgga agggccgagc gcagaagtgg tcctgcaact ttatccgcct 7560
ccatccagtc tattaattgt tgccgggaag ctagagtaag tagttcgcca gttaatagtt 7620
tgcgcaacgt tgttgccatt gctacaggca tcgtggtgtc acgctcgtcg tttggtatgg 7680
cttcattcag ctccggttcc caacgatcaa ggcgagttac atgatccccc atgttgtgca 7740
aaaaagcggt tagctccttc ggtcctccga tcgttgtcag aagtaagttg gccgcagtgt 7800
tatcactcat ggttatggca gcactgcata attctcttac tgtcatgcca tccgtaagat 7860
gcttttctgt gactggtgag tactcaacca agtcattctg agaatagtgt atgcggcgac 7920
cgagttgctc ttgcccggcg tcaatacggg ataataccgc gccacatagc agaactttaa 7980
aagtgctcat cattggaaaa cgttcttcgg ggcgaaaact ctcaaggatc ttaccgctgt 8040
tgagatccag ttcgatgtaa cccactcgtg cacccaactg atcttcagca tcttttactt 8100
tcaccagcgt ttctgggtga gcaaaaacag gaaggcaaaa tgccgcaaaa aagggaataa 8160
gggcgacacg gaaatgttga atactcatac tcttcctttt tcaatattat tgaagcattt 8220
atcagggtta ttgtctcatg agcggataca tatttgaatg tatttagaaa aataaacaaa 8280
taggggttcc gcgcacattt ccccgaaaag tgccacctga cgtcgacgga tcgggagatc 8340
gatctcccga tcccctaggg tcgactctca gtacaatctg ctctgatgcc gcatagttaa 8400
gccagtatct gctccctgct tgtgtgttgg aggtcgctga gtagtgcgcg agcaaaattt 8460
aagctacaac aaggcaaggc ttgaccgaca attgcatgaa gaatctgctt agggttaggc 8520
gttttgcgct gcttcgcgat gtacgggcca gatatacgcg ttgacattga ttattgacta 8580
gttattaata gtaatcaatt acggggtcat tagttcatag cccatatatg gagttccgcg 8640
ttacataact tacggtaaat ggcccgcctg gctgaccgcc caacgacccc cgcccattga 8700
cgtcaataat gacgtatgtt cccatagtaa cgccaatagg gactttccat tgacgtcaat 8760
gggtggagta tttacggtaa actgcccact tggcagtaca tcaagtgtat c 8811
<210> 3
<211> 21
<212> DNA
<213> Artificial sequence
<400> 3
atggactccc tggccgagtc t 21
<210> 4
<211> 20
<212> DNA
<213> Artificial sequence
<400> 4
cggcttcctt ttgtgaacag 20
<210> 5
<211> 42
<212> DNA
<213> Artificial sequence
<400> 5
ctgttcacaa aaggaagccg cacatgtaca atgcccaatg cg 42
<210> 6
<211> 24
<212> DNA
<213> Artificial sequence
<400> 6
attcctatta cgcttgtttc ttgg 24
<210> 7
<211> 45
<212> DNA
<213> Artificial sequence
<400> 7
gaattcagca cagggagcat gggaatggac tccctggccg agtct 45
<210> 8
<211> 86
<212> DNA
<213> Artificial sequence
<400> 8
accggttgga ctttcctctt cttcttgggc tcgaactcgc tgccgtcggc ggttcttttt 60
tcattcctat tacgcttgtt tcttgg 86
<210> 9
<211> 1451
<212> DNA
<213> Artificial sequence
<400> 9
gaattcagca cagggagcat gggaatggac tccctggccg agtctcggtg gcctccgggc 60
ctggcagtca tgaagacaat agatgatttg ctgcggtgtg gaatttgctt cgagtatttc 120
aacattgcaa tgataatacc tcagtgttca cataactact gctctctctg tataagaaaa 180
tttctgtcct ataaaactca gtgtccaact tgctgtgtga ctgtcacaga gccggatctg 240
aaaaataacc gcatattaga tgaactggta aaaagcttga attttgcacg gaatcatctg 300
ctgcagtttg ctttagagtc accagccaaa tctcctgctt cttcctcttc aaagaatctt 360
gctgtcaaag tatatactcc tgtagcctcc agacagtctt taaagcaggg gagcaggtta 420
atggataatt tcttgatcag agaaatgagt ggttctacat cagagttgtt gataaaagaa 480
aataaaagca aattcagccc tcaaaaagag gcgagccctg ctgcaaagac caaagagaca 540
cgttctgtag aagagatcgc tccagatccc tcagaggcta agcgtcctga gccaccctcg 600
acatccactt tgaaacaagt tactaaagtg gattgtcctg tttgcggggt taacattcca 660
gaaagtcaca ttaataagca tttagacagc tgtttatcac gcgaagagaa gaaggaaagc 720
ctcagaagtt ctgttcacaa aaggaagccg cacatgtaca atgcccaatg cgatgctttg 780
catcctaaat cagctgctga aatagttcga gaaatcgaaa atatagagaa gactaggatg 840
cgtcttgaag ctagtaaact caatgaaagt gtaatggttt ttacaaagga ccaaacagaa 900
aaggaaatag atgaaatcca cagtaaatat cgtaaaaaac ataagagtga atttcagctt 960
ctggtggatc aggctagaaa aggatacaag aaaattgctg gaatgtcaca aaaaacagta 1020
acaataacaa aagaagatga atctacagaa aagctatctt ctgtatgcat gggacaggaa 1080
gataatatga cctcagtaac aaaccacttt tctcaatcaa agctggactc cccagaggaa 1140
ttggaacctg acagagaaga ggattcttct agctgtattg atattcaaga agttctttct 1200
tcatcagaat cagattcatg caatagttcc agttcagaca tcataagaga tcttttagaa 1260
gaagaggaag cctgggaagc atcacataaa aacgatcttc aagacacaga aataagtcca 1320
agacagaatc gccgcacaag agccgctgaa agtgctgaga ttgaaccaag aaacaagcgt 1380
aataggaatg aaaaaagaac cgccgacggc agcgagttcg agcccaagaa gaagaggaaa 1440
gtccaaccgg t 1451
<210> 10
<211> 9630
<212> DNA
<213> Artificial sequence
<400> 10
atatgccaag tacgccccct attgacgtca atgacggtaa atggcccgcc tggcattatg 60
cccagtacat gaccttatgg gactttccta cttggcagta catctacgta ttagtcatcg 120
ctattaccat ggtgatgcgg ttttggcagt acatcaatgg gcgtggatag cggtttgact 180
cacggggatt tccaagtctc caccccattg acgtcaatgg gagtttgttt tggcaccaaa 240
atcaacggga ctttccaaaa tgtcgtaaca actccgcccc attgacgcaa atgggcggta 300
ggcgtgtacg gtgggaggtc tatataagca gagctggttt agtgaaccgt cagatccgct 360
agagatccgc ggccgctaat acgactcact atagggagag ccgccaccat gaaacggaca 420
gccgacggaa gcgagttcga gtcaccaaag aagaagcgga aagtctctga ggtggagttt 480
tcccacgagt actggatgag acatgccctg accctggcca agagggcacg ggatgagagg 540
gaggtgcctg tgggagccgt gctggtgctg aacaatagag tgatcggcga gggctggaac 600
agagccatcg gcctgcacga cccaacagcc catgccgaaa ttatggccct gagacagggc 660
ggcctggtca tgcagaacta cagactgatt gacgccaccc tgtacgtgac attcgagcct 720
tgcgtgatgt gcgccggcgc catgatccac tctaggatcg gccgcgtggt gtttggcgtg 780
aggaactcaa aaagaggcgc cgcaggctcc ctgatgaacg tgctgaacta ccccggcatg 840
aatcaccgcg tcgaaattac cgagggaatc ctggcagatg aatgtgccgc cctgctgtgc 900
gatttctatc ggatgcctag acaggtgttc aatgctcaga agaaggccca gagctccatc 960
aactccggag gatctagcgg aggctcctct ggctctgaga cacctggcac aagcgagagc 1020
gcaacacctg aaagcagcgg gggcagcagc ggggggtcag acaagaagta cagcatcggc 1080
ctggccatcg gcaccaactc tgtgggctgg gccgtgatca ccgacgagta caaggtgccc 1140
agcaagaaat tcaaggtgct gggcaacacc gaccggcaca gcatcaagaa gaacctgatc 1200
ggagccctgc tgttcgacag cggcgaaaca gccgaggcca cccggctgaa gagaaccgcc 1260
agaagaagat acaccagacg gaagaaccgg atctgctatc tgcaagagat cttcagcaac 1320
gagatggcca aggtggacga cagcttcttc cacagactgg aagagtcctt cctggtggaa 1380
gaggataaga agcacgagcg gcaccccatc ttcggcaaca tcgtggacga ggtggcctac 1440
cacgagaagt accccaccat ctaccacctg agaaagaaac tggtggacag caccgacaag 1500
gccgacctgc ggctgatcta tctggccctg gcccacatga tcaagttccg gggccacttc 1560
ctgatcgagg gcgacctgaa ccccgacaac agcgacgtgg acaagctgtt catccagctg 1620
gtgcagacct acaaccagct gttcgaggaa aaccccatca acgccagcgg cgtggacgcc 1680
aaggccatcc tgtctgccag actgagcaag agcagacggc tggaaaatct gatcgcccag 1740
ctgcccggcg agaagaagaa tggcctgttc ggaaacctga ttgccctgag cctgggcctg 1800
acccccaact tcaagagcaa cttcgacctg gccgaggatg ccaaactgca gctgagcaag 1860
gacacctacg acgacgacct ggacaacctg ctggcccaga tcggcgacca gtacgccgac 1920
ctgtttctgg ccgccaagaa cctgtccgac gccatcctgc tgagcgacat cctgagagtg 1980
aacaccgaga tcaccaaggc ccccctgagc gcctctatga tcaagagata cgacgagcac 2040
caccaggacc tgaccctgct gaaagctctc gtgcggcagc agctgcctga gaagtacaaa 2100
gagattttct tcgaccagag caagaacggc tacgccggct acattgacgg cggagccagc 2160
caggaagagt tctacaagtt catcaagccc atcctggaaa agatggacgg caccgaggaa 2220
ctgctcgtga agctgaacag agaggacctg ctgcggaagc agcggacctt cgacaacggc 2280
agcatccccc accagatcca cctgggagag ctgcacgcca ttctgcggcg gcaggaagat 2340
ttttacccat tcctgaagga caaccgggaa aagatcgaga agatcctgac cttccgcatc 2400
ccctactacg tgggccctct ggccagggga aacagcagat tcgcctggat gaccagaaag 2460
agcgaggaaa ccatcacccc ctggaacttc gaggaagtgg tggacaaggg cgcttccgcc 2520
cagagcttca tcgagcggat gaccaacttc gataagaacc tgcccaacga gaaggtgctg 2580
cccaagcaca gcctgctgta cgagtacttc accgtgtata acgagctgac caaagtgaaa 2640
tacgtgaccg agggaatgag aaagcccgcc ttcctgagcg gcgagcagaa aaaggccatc 2700
gtggacctgc tgttcaagac caaccggaaa gtgaccgtga agcagctgaa agaggactac 2760
ttcaagaaaa tcgagtgctt cgactccgtg gaaatctccg gcgtggaaga tcggttcaac 2820
gcctccctgg gcacatacca cgatctgctg aaaattatca aggacaagga cttcctggac 2880
aatgaggaaa acgaggacat tctggaagat atcgtgctga ccctgacact gtttgaggac 2940
agagagatga tcgaggaacg gctgaaaacc tatgcccacc tgttcgacga caaagtgatg 3000
aagcagctga agcggcggag atacaccggc tggggcaggc tgagccggaa gctgatcaac 3060
ggcatccggg acaagcagtc cggcaagaca atcctggatt tcctgaagtc cgacggcttc 3120
gccaacagaa acttcatgca gctgatccac gacgacagcc tgacctttaa agaggacatc 3180
cagaaagccc aggtgtccgg ccagggcgat agcctgcacg agcacattgc caatctggcc 3240
ggcagccccg ccattaagaa gggcatcctg cagacagtga aggtggtgga cgagctcgtg 3300
aaagtgatgg gccggcacaa gcccgagaac atcgtgatcg aaatggccag agagaaccag 3360
accacccaga agggacagaa gaacagccgc gagagaatga agcggatcga agagggcatc 3420
aaagagctgg gcagccagat cctgaaagaa caccccgtgg aaaacaccca gctgcagaac 3480
gagaagctgt acctgtacta cctgcagaat gggcgggata tgtacgtgga ccaggaactg 3540
gacatcaacc ggctgtccga ctacgatgtg gaccatatcg tgcctcagag ctttctgaag 3600
gacgactcca tcgacaacaa ggtgctgacc agaagcgaca agaaccgggg caagagcgac 3660
aacgtgccct ccgaagaggt cgtgaagaag atgaagaact actggcggca gctgctgaac 3720
gccaagctga ttacccagag aaagttcgac aatctgacca aggccgagag aggcggcctg 3780
agcgaactgg ataaggccgg cttcatcaag agacagctgg tggaaacccg gcagatcaca 3840
aagcacgtgg cacagatcct ggactcccgg atgaacacta agtacgacga gaatgacaag 3900
ctgatccggg aagtgaaagt gatcaccctg aagtccaagc tggtgtccga tttccggaag 3960
gatttccagt tttacaaagt gcgcgagatc aacaactacc accacgccca cgacgcctac 4020
ctgaacgccg tcgtgggaac cgccctgatc aaaaagtacc ctaagctgga aagcgagttc 4080
gtgtacggcg actacaaggt gtacgacgtg cggaagatga tcgccaagag cgagcaggaa 4140
atcggcaagg ctaccgccaa gtacttcttc tacagcaaca tcatgaactt tttcaagacc 4200
gagattaccc tggccaacgg cgagatccgg aagcggcctc tgatcgagac aaacggcgaa 4260
accggggaga tcgtgtggga taagggccgg gattttgcca ccgtgcggaa agtgctgagc 4320
atgccccaag tgaatatcgt gaaaaagacc gaggtgcaga caggcggctt cagcaaagag 4380
tctatcctgc ccaagaggaa cagcgataag ctgatcgcca gaaagaagga ctgggaccct 4440
aagaagtacg gcggcttcga cagccccacc gtggcctatt ctgtgctggt ggtggccaaa 4500
gtggaaaagg gcaagtccaa gaaactgaag agtgtgaaag agctgctggg gatcaccatc 4560
atggaaagaa gcagcttcga gaagaatccc atcgactttc tggaagccaa gggctacaaa 4620
gaagtgaaaa aggacctgat catcaagctg cctaagtact ccctgttcga gctggaaaac 4680
ggccggaaga gaatgctggc ctctgccggc gaactgcaga agggaaacga actggccctg 4740
ccctccaaat atgtgaactt cctgtacctg gccagccact atgagaagct gaagggctcc 4800
cccgaggata atgagcagaa acagctgttt gtggaacagc acaagcacta cctggacgag 4860
atcatcgagc agatcagcga gttctccaag agagtgatcc tggccgacgc taatctggac 4920
aaagtgctgt ccgcctacaa caagcaccgg gataagccca tcagagagca ggccgagaat 4980
atcatccacc tgtttaccct gaccaatctg ggagcccctg ccgccttcaa gtactttgac 5040
accaccatcg accggaagag gtacaccagc accaaagagg tgctggacgc caccctgatc 5100
caccagagca tcaccggcct gtacgagaca cggatcgacc tgtctcagct gggaggtgac 5160
tctggcggct caaaaagaac cgccgacggc agcgaattca gcacagggag catgggaatg 5220
gactccctgg ccgagtctcg gtggcctccg ggcctggcag tcatgaagac aatagatgat 5280
ttgctgcggt gtggaatttg cttcgagtat ttcaacattg caatgataat acctcagtgt 5340
tcacataact actgctctct ctgtataaga aaatttctgt cctataaaac tcagtgtcca 5400
acttgctgtg tgactgtcac agagccggat ctgaaaaata accgcatatt agatgaactg 5460
gtaaaaagct tgaattttgc acggaatcat ctgctgcagt ttgctttaga gtcaccagcc 5520
aaatctcctg cttcttcctc ttcaaagaat cttgctgtca aagtatatac tcctgtagcc 5580
tccagacagt ctttaaagca ggggagcagg ttaatggata atttcttgat cagagaaatg 5640
agtggttcta catcagagtt gttgataaaa gaaaataaaa gcaaattcag ccctcaaaaa 5700
gaggcgagcc ctgctgcaaa gaccaaagag acacgttctg tagaagagat cgctccagat 5760
ccctcagagg ctaagcgtcc tgagccaccc tcgacatcca ctttgaaaca agttactaaa 5820
gtggattgtc ctgtttgcgg ggttaacatt ccagaaagtc acattaataa gcatttagac 5880
agctgtttat cacgcgaaga gaagaaggaa agcctcagaa gttctgttca caaaaggaag 5940
ccgcacatgt acaatgccca atgcgatgct ttgcatccta aatcagctgc tgaaatagtt 6000
cgagaaatcg aaaatataga gaagactagg atgcgtcttg aagctagtaa actcaatgaa 6060
agtgtaatgg tttttacaaa ggaccaaaca gaaaaggaaa tagatgaaat ccacagtaaa 6120
tatcgtaaaa aacataagag tgaatttcag cttctggtgg atcaggctag aaaaggatac 6180
aagaaaattg ctggaatgtc acaaaaaaca gtaacaataa caaaagaaga tgaatctaca 6240
gaaaagctat cttctgtatg catgggacag gaagataata tgacctcagt aacaaaccac 6300
ttttctcaat caaagctgga ctccccagag gaattggaac ctgacagaga agaggattct 6360
tctagctgta ttgatattca agaagttctt tcttcatcag aatcagattc atgcaatagt 6420
tccagttcag acatcataag agatctttta gaagaagagg aagcctggga agcatcacat 6480
aaaaacgatc ttcaagacac agaaataagt ccaagacaga atcgccgcac aagagccgct 6540
gaaagtgctg agattgaacc aagaaacaag cgtaatagga atgaaaaaag aaccgccgac 6600
ggcagcgagt tcgagcccaa gaagaagagg aaagtccaac cggtcatcat caccatcacc 6660
attgagttta aacccgctga tcagcctcga ctgtgccttc tagttgccag ccatctgttg 6720
tttgcccctc ccccgtgcct tccttgaccc tggaaggtgc cactcccact gtcctttcct 6780
aataaaatga ggaaattgca tcgcattgtc tgagtaggtg tcattctatt ctggggggtg 6840
gggtggggca ggacagcaag ggggaggatt gggaagacaa tagcaggcat gctggggatg 6900
cggtgggctc tatggcttct gaggcggaaa gaaccagctg gggctcgata ccgtcgacct 6960
ctagctagag cttggcgtaa tcatggtcat agctgtttcc tgtgtgaaat tgttatccgc 7020
tcacaattcc acacaacata cgagccggaa gcataaagtg taaagcctag ggtgcctaat 7080
gagtgagcta actcacatta attgcgttgc gctcactgcc cgctttccag tcgggaaacc 7140
tgtcgtgcca gctgcattaa tgaatcggcc aacgcgcggg gagaggcggt ttgcgtattg 7200
ggcgctcttc cgcttcctcg ctcactgact cgctgcgctc ggtcgttcgg ctgcggcgag 7260
cggtatcagc tcactcaaag gcggtaatac ggttatccac agaatcaggg gataacgcag 7320
gaaagaacat gtgagcaaaa ggccagcaaa aggccaggaa ccgtaaaaag gccgcgttgc 7380
tggcgttttt ccataggctc cgcccccctg acgagcatca caaaaatcga cgctcaagtc 7440
agaggtggcg aaacccgaca ggactataaa gataccaggc gtttccccct ggaagctccc 7500
tcgtgcgctc tcctgttccg accctgccgc ttaccggata cctgtccgcc tttctccctt 7560
cgggaagcgt ggcgctttct catagctcac gctgtaggta tctcagttcg gtgtaggtcg 7620
ttcgctccaa gctgggctgt gtgcacgaac cccccgttca gcccgaccgc tgcgccttat 7680
ccggtaacta tcgtcttgag tccaacccgg taagacacga cttatcgcca ctggcagcag 7740
ccactggtaa caggattagc agagcgaggt atgtaggcgg tgctacagag ttcttgaagt 7800
ggtggcctaa ctacggctac actagaagaa cagtatttgg tatctgcgct ctgctgaagc 7860
cagttacctt cggaaaaaga gttggtagct cttgatccgg caaacaaacc accgctggta 7920
gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag 7980
atcctttgat cttttctacg gggtctgaca ctcagtggaa cgaaaactca cgttaaggga 8040
ttttggtcat gagattatca aaaaggatct tcacctagat ccttttaaat taaaaatgaa 8100
gttttaaatc aatctaaagt atatatgagt aaacttggtc tgacagttac caatgcttaa 8160
tcagtgaggc acctatctca gcgatctgtc tatttcgttc atccatagtt gcctgactcc 8220
ccgtcgtgta gataactacg atacgggagg gcttaccatc tggccccagt gctgcaatga 8280
taccgcgaga cccacgctca ccggctccag atttatcagc aataaaccag ccagccggaa 8340
gggccgagcg cagaagtggt cctgcaactt tatccgcctc catccagtct attaattgtt 8400
gccgggaagc tagagtaagt agttcgccag ttaatagttt gcgcaacgtt gttgccattg 8460
ctacaggcat cgtggtgtca cgctcgtcgt ttggtatggc ttcattcagc tccggttccc 8520
aacgatcaag gcgagttaca tgatccccca tgttgtgcaa aaaagcggtt agctccttcg 8580
gtcctccgat cgttgtcaga agtaagttgg ccgcagtgtt atcactcatg gttatggcag 8640
cactgcataa ttctcttact gtcatgccat ccgtaagatg cttttctgtg actggtgagt 8700
actcaaccaa gtcattctga gaatagtgta tgcggcgacc gagttgctct tgcccggcgt 8760
caatacggga taataccgcg ccacatagca gaactttaaa agtgctcatc attggaaaac 8820
gttcttcggg gcgaaaactc tcaaggatct taccgctgtt gagatccagt tcgatgtaac 8880
ccactcgtgc acccaactga tcttcagcat cttttacttt caccagcgtt tctgggtgag 8940
caaaaacagg aaggcaaaat gccgcaaaaa agggaataag ggcgacacgg aaatgttgaa 9000
tactcatact cttccttttt caatattatt gaagcattta tcagggttat tgtctcatga 9060
gcggatacat atttgaatgt atttagaaaa ataaacaaat aggggttccg cgcacatttc 9120
cccgaaaagt gccacctgac gtcgacggat cgggagatcg atctcccgat cccctagggt 9180
cgactctcag tacaatctgc tctgatgccg catagttaag ccagtatctg ctccctgctt 9240
gtgtgttgga ggtcgctgag tagtgcgcga gcaaaattta agctacaaca aggcaaggct 9300
tgaccgacaa ttgcatgaag aatctgctta gggttaggcg ttttgcgctg cttcgcgatg 9360
tacgggccag atatacgcgt tgacattgat tattgactag ttattaatag taatcaatta 9420
cggggtcatt agttcatagc ccatatattg agttccgcgt tacataactt acggtaaatg 9480
gcccgcctgg ctgaccgccc aacgaccccc gcccattgac gtcaataatg acgtatgttc 9540
ccatagtaac gccaataggg actttccatt gacgtcaatg ggtggagtat ttacggtaaa 9600
ctgcccactt ggcagtacat caagtgtatc 9630
<210> 11
<211> 8217
<212> DNA
<213> Artificial sequence
<400> 11
atatgccaag tacgccccct attgacgtca atgacggtaa atggcccgcc tggcattatg 60
cccagtacat gaccttatgg gactttccta cttggcagta catctacgta ttagtcatcg 120
ctattaccat ggtgatgcgg ttttggcagt acatcaatgg gcgtggatag cggtttgact 180
cacggggatt tccaagtctc caccccattg acgtcaatgg gagtttgttt tggcaccaaa 240
atcaacggga ctttccaaaa tgtcgtaaca actccgcccc attgacgcaa atgggcggta 300
ggcgtgtacg gtgggaggtc tatataagca gagctggttt agtgaaccgt cagatccgct 360
agagatccgc ggccgctaat acgactcact atagggagag ccgccaccat gaaacggaca 420
gccgacggaa gcgagttcga gtcaccaaag aagaagcgga aagtctctga ggtggagttt 480
tcccacgagt actggatgag acatgccctg accctggcca agagggcacg ggatgagagg 540
gaggtgcctg tgggagccgt gctggtgctg aacaatagag tgatcggcga gggctggaac 600
agagccatcg gcctgcacga cccaacagcc catgccgaaa ttatggccct gagacagggc 660
ggcctggtca tgcagaacta cagactgatt gacgccaccc tgtacgtgac attcgagcct 720
tgcgtgatgt gcgccggcgc catgatccac tctaggatcg gccgcgtggt gtttggcgtg 780
aggaactcaa aaagaggcgc cgcaggctcc ctgatgaacg tgctgaacta ccccggcatg 840
aatcaccgcg tcgaaattac cgagggaatc ctggcagatg aatgtgccgc cctgctgtgc 900
gatttctatc ggatgcctag acaggtgttc aatgctcaga agaaggccca gagctccatc 960
aactccggag gatctagcgg aggctcctct ggctctgaga cacctggcac aagcgagagc 1020
gcaacacctg aaagcagcgg gggcagcagc ggggggtcag acaagaagta cagcatcggc 1080
ctggccatcg gcaccaactc tgtgggctgg gccgtgatca ccgacgagta caaggtgccc 1140
agcaagaaat tcaaggtgct gggcaacacc gaccggcaca gcatcaagaa gaacctgatc 1200
ggagccctgc tgttcgacag cggcgaaaca gccgaggcca cccggctgaa gagaaccgcc 1260
agaagaagat acaccagacg gaagaaccgg atctgctatc tgcaagagat cttcagcaac 1320
gagatggcca aggtggacga cagcttcttc cacagactgg aagagtcctt cctggtggaa 1380
gaggataaga agcacgagcg gcaccccatc ttcggcaaca tcgtggacga ggtggcctac 1440
cacgagaagt accccaccat ctaccacctg agaaagaaac tggtggacag caccgacaag 1500
gccgacctgc ggctgatcta tctggccctg gcccacatga tcaagttccg gggccacttc 1560
ctgatcgagg gcgacctgaa ccccgacaac agcgacgtgg acaagctgtt catccagctg 1620
gtgcagacct acaaccagct gttcgaggaa aaccccatca acgccagcgg cgtggacgcc 1680
aaggccatcc tgtctgccag actgagcaag agcagacggc tggaaaatct gatcgcccag 1740
ctgcccggcg agaagaagaa tggcctgttc ggaaacctga ttgccctgag cctgggcctg 1800
acccccaact tcaagagcaa cttcgacctg gccgaggatg ccaaactgca gctgagcaag 1860
gacacctacg acgacgacct ggacaacctg ctggcccaga tcggcgacca gtacgccgac 1920
ctgtttctgg ccgccaagaa cctgtccgac gccatcctgc tgagcgacat cctgagagtg 1980
aacaccgaga tcaccaaggc ccccctgagc gcctctatga tcaagagata cgacgagcac 2040
caccaggacc tgaccctgct gaaagctctc gtgcggcagc agctgcctga gaagtacaaa 2100
gagattttct tcgaccagag caagaacggc tacgccggct acattgacgg cggagccagc 2160
caggaagagt tctacaagtt catcaagccc atcctggaaa agatggacgg caccgaggaa 2220
ctgctcgtga agctgaacag agaggacctg ctgcggaagc agcggacctt cgacaacggc 2280
agcatccccc accagatcca cctgggagag ctgcacgcca ttctgcggcg gcaggaagat 2340
ttttacccat tcctgaagga caaccgggaa aagatcgaga agatcctgac cttccgcatc 2400
ccctactacg tgggccctct ggccagggga aacagcagat tcgcctggat gaccagaaag 2460
agcgaggaaa ccatcacccc ctggaacttc gaggaagtgg tggacaaggg cgcttccgcc 2520
cagagcttca tcgagcggat gaccaacttc gataagaacc tgcccaacga gaaggtgctg 2580
cccaagcaca gcctgctgta cgagtacttc accgtgtata acgagctgac caaagtgaaa 2640
tacgtgaccg agggaatgag aaagcccgcc ttcctgagcg gcgagcagaa aaaggccatc 2700
gtggacctgc tgttcaagac caaccggaaa gtgaccgtga agcagctgaa agaggactac 2760
ttcaagaaaa tcgagtgctt cgactccgtg gaaatctccg gcgtggaaga tcggttcaac 2820
gcctccctgg gcacatacca cgatctgctg aaaattatca aggacaagga cttcctggac 2880
aatgaggaaa acgaggacat tctggaagat atcgtgctga ccctgacact gtttgaggac 2940
agagagatga tcgaggaacg gctgaaaacc tatgcccacc tgttcgacga caaagtgatg 3000
aagcagctga agcggcggag atacaccggc tggggcaggc tgagccggaa gctgatcaac 3060
ggcatccggg acaagcagtc cggcaagaca atcctggatt tcctgaagtc cgacggcttc 3120
gccaacagaa acttcatgca gctgatccac gacgacagcc tgacctttaa agaggacatc 3180
cagaaagccc aggtgtccgg ccagggcgat agcctgcacg agcacattgc caatctggcc 3240
ggcagccccg ccattaagaa gggcatcctg cagacagtga aggtggtgga cgagctcgtg 3300
aaagtgatgg gccggcacaa gcccgagaac atcgtgatcg aaatggccag agagaaccag 3360
accacccaga agggacagaa gaacagccgc gagagaatga agcggatcga agagggcatc 3420
aaagagctgg gcagccagat cctgaaagaa caccccgtgg aaaacaccca gctgcagaac 3480
gagaagctgt acctgtacta cctgcagaat gggcgggata tgtacgtgga ccaggaactg 3540
gacatcaacc ggctgtccga ctacgatgtg gaccatatcg tgcctcagag ctttctgaag 3600
gacgactcca tcgacaacaa ggtgctgacc agaagcgaca agaaccgggg caagagcgac 3660
aacgtgccct ccgaagaggt cgtgaagaag atgaagaact actggcggca gctgctgaac 3720
gccaagctga ttacccagag aaagttcgac aatctgacca aggccgagag aggcggcctg 3780
agcgaactgg ataaggccgg cttcatcaag agacagctgg tggaaacccg gcagatcaca 3840
aagcacgtgg cacagatcct ggactcccgg atgaacacta agtacgacga gaatgacaag 3900
ctgatccggg aagtgaaagt gatcaccctg aagtccaagc tggtgtccga tttccggaag 3960
gatttccagt tttacaaagt gcgcgagatc aacaactacc accacgccca cgacgcctac 4020
ctgaacgccg tcgtgggaac cgccctgatc aaaaagtacc ctaagctgga aagcgagttc 4080
gtgtacggcg actacaaggt gtacgacgtg cggaagatga tcgccaagag cgagcaggaa 4140
atcggcaagg ctaccgccaa gtacttcttc tacagcaaca tcatgaactt tttcaagacc 4200
gagattaccc tggccaacgg cgagatccgg aagcggcctc tgatcgagac aaacggcgaa 4260
accggggaga tcgtgtggga taagggccgg gattttgcca ccgtgcggaa agtgctgagc 4320
atgccccaag tgaatatcgt gaaaaagacc gaggtgcaga caggcggctt cagcaaagag 4380
tctatcctgc ccaagaggaa cagcgataag ctgatcgcca gaaagaagga ctgggaccct 4440
aagaagtacg gcggcttcga cagccccacc gtggcctatt ctgtgctggt ggtggccaaa 4500
gtggaaaagg gcaagtccaa gaaactgaag agtgtgaaag agctgctggg gatcaccatc 4560
atggaaagaa gcagcttcga gaagaatccc atcgactttc tggaagccaa gggctacaaa 4620
gaagtgaaaa aggacctgat catcaagctg cctaagtact ccctgttcga gctggaaaac 4680
ggccggaaga gaatgctggc ctctgccggc gaactgcaga agggaaacga actggccctg 4740
ccctccaaat atgtgaactt cctgtacctg gccagccact atgagaagct gaagggctcc 4800
cccgaggata atgagcagaa acagctgttt gtggaacagc acaagcacta cctggacgag 4860
atcatcgagc agatcagcga gttctccaag agagtgatcc tggccgacgc taatctggac 4920
aaagtgctgt ccgcctacaa caagcaccgg gataagccca tcagagagca ggccgagaat 4980
atcatccacc tgtttaccct gaccaatctg ggagcccctg ccgccttcaa gtactttgac 5040
accaccatcg accggaagag gtacaccagc accaaagagg tgctggacgc caccctgatc 5100
caccagagca tcaccggcct gtacgagaca cggatcgacc tgtctcagct gggaggtgac 5160
tctggcggct caaaaagaac cgccgacggc agcgaattcg agcccaagaa gaagaggaaa 5220
gtctaaccgg tcatcatcac catcaccatt gagtttaaac ccgctgatca gcctcgactg 5280
tgccttctag ttgccagcca tctgttgttt gcccctcccc cgtgccttcc ttgaccctgg 5340
aaggtgccac tcccactgtc ctttcctaat aaaatgagga aattgcatcg cattgtctga 5400
gtaggtgtca ttctattctg gggggtgggg tggggcagga cagcaagggg gaggattggg 5460
aagacaatag caggcatgct ggggatgcgg tgggctctat ggcttctgag gcggaaagaa 5520
ccagctgggg ctcgataccg tcgacctcta gctagagctt ggcgtaatca tggtcatagc 5580
tgtttcctgt gtgaaattgt tatccgctca caattccaca caacatacga gccggaagca 5640
taaagtgtaa agcctagggt gcctaatgag tgagctaact cacattaatt gcgttgcgct 5700
cactgcccgc tttccagtcg ggaaacctgt cgtgccagct gcattaatga atcggccaac 5760
gcgcggggag aggcggtttg cgtattgggc gctcttccgc ttcctcgctc actgactcgc 5820
tgcgctcggt cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt 5880
tatccacaga atcaggggat aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg 5940
ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca taggctccgc ccccctgacg 6000
agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga ctataaagat 6060
accaggcgtt tccccctgga agctccctcg tgcgctctcc tgttccgacc ctgccgctta 6120
ccggatacct gtccgccttt ctcccttcgg gaagcgtggc gctttctcat agctcacgct 6180
gtaggtatct cagttcggtg taggtcgttc gctccaagct gggctgtgtg cacgaacccc 6240
ccgttcagcc cgaccgctgc gccttatccg gtaactatcg tcttgagtcc aacccggtaa 6300
gacacgactt atcgccactg gcagcagcca ctggtaacag gattagcaga gcgaggtatg 6360
taggcggtgc tacagagttc ttgaagtggt ggcctaacta cggctacact agaagaacag 6420
tatttggtat ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt 6480
gatccggcaa acaaaccacc gctggtagcg gtggtttttt tgtttgcaag cagcagatta 6540
cgcgcagaaa aaaaggatct caagaagatc ctttgatctt ttctacgggg tctgacactc 6600
agtggaacga aaactcacgt taagggattt tggtcatgag attatcaaaa aggatcttca 6660
cctagatcct tttaaattaa aaatgaagtt ttaaatcaat ctaaagtata tatgagtaaa 6720
cttggtctga cagttaccaa tgcttaatca gtgaggcacc tatctcagcg atctgtctat 6780
ttcgttcatc catagttgcc tgactccccg tcgtgtagat aactacgata cgggagggct 6840
taccatctgg ccccagtgct gcaatgatac cgcgagaccc acgctcaccg gctccagatt 6900
tatcagcaat aaaccagcca gccggaaggg ccgagcgcag aagtggtcct gcaactttat 6960
ccgcctccat ccagtctatt aattgttgcc gggaagctag agtaagtagt tcgccagtta 7020
atagtttgcg caacgttgtt gccattgcta caggcatcgt ggtgtcacgc tcgtcgtttg 7080
gtatggcttc attcagctcc ggttcccaac gatcaaggcg agttacatga tcccccatgt 7140
tgtgcaaaaa agcggttagc tccttcggtc ctccgatcgt tgtcagaagt aagttggccg 7200
cagtgttatc actcatggtt atggcagcac tgcataattc tcttactgtc atgccatccg 7260
taagatgctt ttctgtgact ggtgagtact caaccaagtc attctgagaa tagtgtatgc 7320
ggcgaccgag ttgctcttgc ccggcgtcaa tacgggataa taccgcgcca catagcagaa 7380
ctttaaaagt gctcatcatt ggaaaacgtt cttcggggcg aaaactctca aggatcttac 7440
cgctgttgag atccagttcg atgtaaccca ctcgtgcacc caactgatct tcagcatctt 7500
ttactttcac cagcgtttct gggtgagcaa aaacaggaag gcaaaatgcc gcaaaaaagg 7560
gaataagggc gacacggaaa tgttgaatac tcatactctt cctttttcaa tattattgaa 7620
gcatttatca gggttattgt ctcatgagcg gatacatatt tgaatgtatt tagaaaaata 7680
aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc acctgacgtc gacggatcgg 7740
gagatcgatc tcccgatccc ctagggtcga ctctcagtac aatctgctct gatgccgcat 7800
agttaagcca gtatctgctc cctgcttgtg tgttggaggt cgctgagtag tgcgcgagca 7860
aaatttaagc tacaacaagg caaggcttga ccgacaattg catgaagaat ctgcttaggg 7920
ttaggcgttt tgcgctgctt cgcgatgtac gggccagata tacgcgttga cattgattat 7980
tgactagtta ttaatagtaa tcaattacgg ggtcattagt tcatagccca tatattgagt 8040
tccgcgttac ataacttacg gtaaatggcc cgcctggctg accgcccaac gacccccgcc 8100
cattgacgtc aataatgacg tatgttccca tagtaacgcc aatagggact ttccattgac 8160
gtcaatgggt ggagtattta cggtaaactg cccacttggc agtacatcaa gtgtatc 8217
<210> 12
<211> 20
<212> DNA
<213> Artificial sequence
<400> 12
gaatactaag catagactcc 20
<210> 13
<211> 20
<212> DNA
<213> Artificial sequence
<400> 13
ggagtctatg cttagtattc 20
<210> 14
<211> 20
<212> DNA
<213> Artificial sequence
<400> 14
gtaaacaaag catagactga 20
<210> 15
<211> 20
<212> DNA
<213> Artificial sequence
<400> 15
tcagtctatg ctttgtttac 20
<210> 16
<211> 20
<212> DNA
<213> Artificial sequence
<400> 16
gaacacaaag catagactgc 20
<210> 17
<211> 20
<212> DNA
<213> Artificial sequence
<400> 17
gcagtctatg ctttgtgttc 20
<210> 18
<211> 20
<212> DNA
<213> Artificial sequence
<400> 18
gatgagataa tgatgagtca 20
<210> 19
<211> 20
<212> DNA
<213> Artificial sequence
<400> 19
tgactcatca ttatctcatc 20
<210> 20
<211> 20
<212> DNA
<213> Artificial sequence
<400> 20
gacaaaccag aagccgctcc 20
<210> 21
<211> 20
<212> DNA
<213> Artificial sequence
<400> 21
ggagcggctt ctggtttgtc 20
<210> 22
<211> 20
<212> DNA
<213> Artificial sequence
<400> 22
gggaataaat catagaatcc 20
<210> 23
<211> 20
<212> DNA
<213> Artificial sequence
<400> 23
ggattctatg atttattccc 20
<210> 24
<211> 20
<212> DNA
<213> Artificial sequence
<400> 24
ggaacacaaa gcatagactg 20
<210> 25
<211> 20
<212> DNA
<213> Artificial sequence
<400> 25
cagtctatgc tttgtgttcc 20
<210> 26
<211> 20
<212> DNA
<213> Artificial sequence
<400> 26
gcacctacct cgggagctga 20
<210> 27
<211> 20
<212> DNA
<213> Artificial sequence
<400> 27
tcagctcccg aggtaggtgc 20
<210> 28
<211> 20
<212> DNA
<213> Artificial sequence
<400> 28
ggaatccctt ctgcagcacc 20
<210> 29
<211> 20
<212> DNA
<213> Artificial sequence
<400> 29
ggtgctgcag aagggattcc 20
<210> 30
<211> 20
<212> DNA
<213> Artificial sequence
<400> 30
tcagaaagtg gtggctggtg 20
<210> 31
<211> 20
<212> DNA
<213> Artificial sequence
<400> 31
caccagccac cactttctga 20
<210> 32
<211> 20
<212> DNA
<213> Artificial sequence
<400> 32
ggcccagact gagcacgtga 20
<210> 33
<211> 20
<212> DNA
<213> Artificial sequence
<400> 33
tcacgtgctc agtctgggcc 20
<210> 34
<211> 20
<212> DNA
<213> Artificial sequence
<400> 34
atatttgcat tgagatagtg 20
<210> 35
<211> 20
<212> DNA
<213> Artificial sequence
<400> 35
cactatctca atgcaaatat 20
<210> 36
<211> 20
<212> DNA
<213> Artificial sequence
<400> 36
gtcatcttag tcattacctg 20
<210> 37
<211> 20
<212> DNA
<213> Artificial sequence
<400> 37
caggtaatga ctaagatgac 20
<210> 38
<211> 20
<212> DNA
<213> Artificial sequence
<400> 38
gaagatagag aatagactgc 20
<210> 39
<211> 20
<212> DNA
<213> Artificial sequence
<400> 39
gcagtctatt ctctatcttc 20
<210> 40
<211> 1791
<212> PRT
<213> Artificial sequence
<400> 40
Pro Lys Lys Lys Arg Lys Val Ser Glu Val Glu Phe Ser His Glu Tyr
1 5 10 15
Trp Met Arg His Ala Leu Thr Leu Ala Lys Arg Ala Trp Asp Glu Arg
20 25 30
Glu Val Pro Val Gly Ala Val Leu Val His Asn Asn Arg Val Ile Gly
35 40 45
Glu Gly Trp Asn Arg Pro Ile Gly Arg His Asp Pro Thr Ala His Ala
50 55 60
Glu Ile Met Ala Leu Arg Gln Gly Gly Leu Val Met Gln Asn Tyr Arg
65 70 75 80
Leu Ile Asp Ala Thr Leu Tyr Val Thr Leu Glu Pro Cys Val Met Cys
85 90 95
Ala Gly Ala Met Ile His Ser Arg Ile Gly Arg Val Val Phe Gly Ala
100 105 110
Arg Asp Ala Lys Thr Gly Ala Ala Gly Ser Leu Met Asp Val Leu His
115 120 125
His Pro Gly Met Asn His Arg Val Glu Ile Thr Glu Gly Ile Leu Ala
130 135 140
Asp Glu Cys Ala Ala Leu Leu Ser Asp Phe Phe Arg Met Arg Arg Gln
145 150 155 160
Glu Ile Lys Ala Gln Lys Lys Ala Gln Ser Ser Thr Asp Ser Gly Gly
165 170 175
Ser Ser Gly Gly Ser Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser
180 185 190
Ala Thr Pro Glu Ser Ser Gly Gly Ser Ser Gly Gly Ser Ser Glu Val
195 200 205
Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu Thr Leu Ala Lys
210 215 220
Arg Ala Arg Asp Glu Arg Glu Val Pro Val Gly Ala Val Leu Val Leu
225 230 235 240
Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Ala Ile Gly Leu His
245 250 255
Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg Gln Gly Gly Leu
260 265 270
Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu Tyr Val Thr Phe
275 280 285
Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His Ser Arg Ile Gly
290 295 300
Arg Val Val Phe Gly Val Arg Asn Ala Lys Thr Gly Ala Ala Gly Ser
305 310 315 320
Leu Met Asp Val Leu His Tyr Pro Gly Met Asn His Arg Val Glu Ile
325 330 335
Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu Leu Cys Tyr Phe
340 345 350
Phe Arg Met Pro Arg Gln Val Phe Asn Ala Gln Lys Lys Ala Gln Ser
355 360 365
Ser Thr Asp Ser Gly Gly Ser Ser Gly Gly Ser Ser Gly Ser Glu Thr
370 375 380
Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser Ser Gly Gly Ser Ser
385 390 395 400
Gly Gly Ser Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn
405 410 415
Ser Val Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys
420 425 430
Lys Phe Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn
435 440 445
Leu Ile Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr
450 455 460
Arg Leu Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg
465 470 475 480
Ile Cys Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp
485 490 495
Asp Ser Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp
500 505 510
Lys Lys His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val
515 520 525
Ala Tyr His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu
530 535 540
Val Asp Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu
545 550 555 560
Ala His Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu
565 570 575
Asn Pro Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln
580 585 590
Thr Tyr Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val
595 600 605
Asp Ala Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu
610 615 620
Glu Asn Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe
625 630 635 640
Gly Asn Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser
645 650 655
Asn Phe Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr
660 665 670
Tyr Asp Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr
675 680 685
Ala Asp Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu
690 695 700
Ser Asp Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser
705 710 715 720
Ala Ser Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu
725 730 735
Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile
740 745 750
Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly
755 760 765
Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys
770 775 780
Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu
785 790 795 800
Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile
805 810 815
His Leu Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr
820 825 830
Pro Phe Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe
835 840 845
Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe
850 855 860
Ala Trp Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe
865 870 875 880
Glu Glu Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg
885 890 895
Met Thr Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys
900 905 910
His Ser Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys
915 920 925
Val Lys Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly
930 935 940
Glu Gln Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys
945 950 955 960
Val Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys
965 970 975
Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser
980 985 990
Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe
995 1000 1005
Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr
1010 1015 1020
Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr
1025 1030 1035 1040
Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg
1045 1050 1055
Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile
1060 1065 1070
Arg Asp Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp
1075 1080 1085
Gly Phe Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu
1090 1095 1100
Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp
1105 1110 1115 1120
Ser Leu His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys
1125 1130 1135
Lys Gly Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val
1140 1145 1150
Met Gly Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu
1155 1160 1165
Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys
1170 1175 1180
Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu
1185 1190 1195 1200
His Pro Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr
1205 1210 1215
Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile
1220 1225 1230
Asn Arg Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe
1235 1240 1245
Leu Lys Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys
1250 1255 1260
Asn Arg Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys
1265 1270 1275 1280
Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln
1285 1290 1295
Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu
1300 1305 1310
Leu Asp Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln
1315 1320 1325
Ile Thr Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys
1330 1335 1340
Tyr Asp Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu
1345 1350 1355 1360
Lys Ser Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys
1365 1370 1375
Val Arg Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn
1380 1385 1390
Ala Val Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser
1395 1400 1405
Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile
1410 1415 1420
Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1425 1430 1435 1440
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn
1445 1450 1455
Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr Gly
1460 1465 1470
Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg Lys Val
1475 1480 1485
Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu Val Gln Thr
1490 1495 1500
Gly Gly Phe Ser Lys Glu Ser Ile Arg Pro Lys Arg Asn Ser Asp Lys
1505 1510 1515 1520
Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys Lys Tyr Gly Gly Phe
1525 1530 1535
Val Ser Pro Thr Val Ala Tyr Ser Val Leu Val Val Ala Lys Val Glu
1540 1545 1550
Lys Gly Lys Ser Lys Lys Leu Lys Ser Val Lys Glu Leu Leu Gly Ile
1555 1560 1565
Thr Ile Met Glu Arg Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu
1570 1575 1580
Glu Ala Lys Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu
1585 1590 1595 1600
Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu
1605 1610 1615
Ala Ser Ala Arg Phe Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser
1620 1625 1630
Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys
1635 1640 1645
Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His
1650 1655 1660
Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1665 1670 1675 1680
Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr
1685 1690 1695
Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile Ile
1700 1705 1710
His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Arg Ala Phe Lys Tyr
1715 1720 1725
Phe Asp Thr Thr Ile Asp Arg Lys Val Tyr Arg Ser Thr Lys Glu Val
1730 1735 1740
Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly Leu Tyr Glu Thr
1745 1750 1755 1760
Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp Ser Gly Gly Ser Lys Arg
1765 1770 1775
Thr Ala Asp Gly Ser Glu Phe Glu Pro Lys Lys Lys Arg Lys Val
1780 1785 1790
<210> 41
<211> 1605
<212> PRT
<213> Artificial sequence
<400> 41
Met Lys Arg Thr Ala Asp Gly Ser Glu Phe Glu Ser Pro Lys Lys Lys
1 5 10 15
Arg Lys Val Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His
20 25 30
Ala Leu Thr Leu Ala Lys Arg Ala Arg Asp Glu Arg Glu Val Pro Val
35 40 45
Gly Ala Val Leu Val Leu Asn Asn Arg Val Ile Gly Glu Gly Trp Asn
50 55 60
Arg Ala Ile Gly Leu His Asp Pro Thr Ala His Ala Glu Ile Met Ala
65 70 75 80
Leu Arg Gln Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala
85 90 95
Thr Leu Tyr Val Thr Phe Glu Pro Cys Val Met Cys Ala Gly Ala Met
100 105 110
Ile His Ser Arg Ile Gly Arg Val Val Phe Gly Val Arg Asn Ser Lys
115 120 125
Arg Gly Ala Ala Gly Ser Leu Met Asn Val Leu Asn Tyr Pro Gly Met
130 135 140
Asn His Arg Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala
145 150 155 160
Ala Leu Leu Cys Asp Phe Tyr Arg Met Pro Arg Gln Val Phe Asn Ala
165 170 175
Gln Lys Lys Ala Gln Ser Ser Ile Asn Ser Gly Gly Ser Ser Gly Gly
180 185 190
Ser Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu
195 200 205
Ser Ser Gly Gly Ser Ser Gly Gly Ser Asp Lys Lys Tyr Ser Ile Gly
210 215 220
Leu Ala Ile Gly Thr Asn Ser Val Gly Trp Ala Val Ile Thr Asp Glu
225 230 235 240
Tyr Lys Val Pro Ser Lys Lys Phe Lys Val Leu Gly Asn Thr Asp Arg
245 250 255
His Ser Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu Phe Asp Ser Gly
260 265 270
Glu Thr Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala Arg Arg Arg Tyr
275 280 285
Thr Arg Arg Lys Asn Arg Ile Cys Tyr Leu Gln Glu Ile Phe Ser Asn
290 295 300
Glu Met Ala Lys Val Asp Asp Ser Phe Phe His Arg Leu Glu Glu Ser
305 310 315 320
Phe Leu Val Glu Glu Asp Lys Lys His Glu Arg His Pro Ile Phe Gly
325 330 335
Asn Ile Val Asp Glu Val Ala Tyr His Glu Lys Tyr Pro Thr Ile Tyr
340 345 350
His Leu Arg Lys Lys Leu Val Asp Ser Thr Asp Lys Ala Asp Leu Arg
355 360 365
Leu Ile Tyr Leu Ala Leu Ala His Met Ile Lys Phe Arg Gly His Phe
370 375 380
Leu Ile Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp Val Asp Lys Leu
385 390 395 400
Phe Ile Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe Glu Glu Asn Pro
405 410 415
Ile Asn Ala Ser Gly Val Asp Ala Lys Ala Ile Leu Ser Ala Arg Leu
420 425 430
Ser Lys Ser Arg Arg Leu Glu Asn Leu Ile Ala Gln Leu Pro Gly Glu
435 440 445
Lys Lys Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu Ser Leu Gly Leu
450 455 460
Thr Pro Asn Phe Lys Ser Asn Phe Asp Leu Ala Glu Asp Ala Lys Leu
465 470 475 480
Gln Leu Ser Lys Asp Thr Tyr Asp Asp Asp Leu Asp Asn Leu Leu Ala
485 490 495
Gln Ile Gly Asp Gln Tyr Ala Asp Leu Phe Leu Ala Ala Lys Asn Leu
500 505 510
Ser Asp Ala Ile Leu Leu Ser Asp Ile Leu Arg Val Asn Thr Glu Ile
515 520 525
Thr Lys Ala Pro Leu Ser Ala Ser Met Ile Lys Arg Tyr Asp Glu His
530 535 540
His Gln Asp Leu Thr Leu Leu Lys Ala Leu Val Arg Gln Gln Leu Pro
545 550 555 560
Glu Lys Tyr Lys Glu Ile Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala
565 570 575
Gly Tyr Ile Asp Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile
580 585 590
Lys Pro Ile Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu Val Lys
595 600 605
Leu Asn Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly
610 615 620
Ser Ile Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile Leu Arg
625 630 635 640
Arg Gln Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu Lys Ile
645 650 655
Glu Lys Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ala
660 665 670
Arg Gly Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser Glu Glu Thr
675 680 685
Ile Thr Pro Trp Asn Phe Glu Glu Val Val Asp Lys Gly Ala Ser Ala
690 695 700
Gln Ser Phe Ile Glu Arg Met Thr Asn Phe Asp Lys Asn Leu Pro Asn
705 710 715 720
Glu Lys Val Leu Pro Lys His Ser Leu Leu Tyr Glu Tyr Phe Thr Val
725 730 735
Tyr Asn Glu Leu Thr Lys Val Lys Tyr Val Thr Glu Gly Met Arg Lys
740 745 750
Pro Ala Phe Leu Ser Gly Glu Gln Lys Lys Ala Ile Val Asp Leu Leu
755 760 765
Phe Lys Thr Asn Arg Lys Val Thr Val Lys Gln Leu Lys Glu Asp Tyr
770 775 780
Phe Lys Lys Ile Glu Cys Phe Asp Ser Val Glu Ile Ser Gly Val Glu
785 790 795 800
Asp Arg Phe Asn Ala Ser Leu Gly Thr Tyr His Asp Leu Leu Lys Ile
805 810 815
Ile Lys Asp Lys Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu
820 825 830
Glu Asp Ile Val Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met Ile
835 840 845
Glu Glu Arg Leu Lys Thr Tyr Ala His Leu Phe Asp Asp Lys Val Met
850 855 860
Lys Gln Leu Lys Arg Arg Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg
865 870 875 880
Lys Leu Ile Asn Gly Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile Leu
885 890 895
Asp Phe Leu Lys Ser Asp Gly Phe Ala Asn Arg Asn Phe Met Gln Leu
900 905 910
Ile His Asp Asp Ser Leu Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln
915 920 925
Val Ser Gly Gln Gly Asp Ser Leu His Glu His Ile Ala Asn Leu Ala
930 935 940
Gly Ser Pro Ala Ile Lys Lys Gly Ile Leu Gln Thr Val Lys Val Val
945 950 955 960
Asp Glu Leu Val Lys Val Met Gly Arg His Lys Pro Glu Asn Ile Val
965 970 975
Ile Glu Met Ala Arg Glu Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn
980 985 990
Ser Arg Glu Arg Met Lys Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly
995 1000 1005
Ser Gln Ile Leu Lys Glu His Pro Val Glu Asn Thr Gln Leu Gln Asn
1010 1015 1020
Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val
1025 1030 1035 1040
Asp Gln Glu Leu Asp Ile Asn Arg Leu Ser Asp Tyr Asp Val Asp His
1045 1050 1055
Ile Val Pro Gln Ser Phe Leu Lys Asp Asp Ser Ile Asp Asn Lys Val
1060 1065 1070
Leu Thr Arg Ser Asp Lys Asn Arg Gly Lys Ser Asp Asn Val Pro Ser
1075 1080 1085
Glu Glu Val Val Lys Lys Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn
1090 1095 1100
Ala Lys Leu Ile Thr Gln Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu
1105 1110 1115 1120
Arg Gly Gly Leu Ser Glu Leu Asp Lys Ala Gly Phe Ile Lys Arg Gln
1125 1130 1135
Leu Val Glu Thr Arg Gln Ile Thr Lys His Val Ala Gln Ile Leu Asp
1140 1145 1150
Ser Arg Met Asn Thr Lys Tyr Asp Glu Asn Asp Lys Leu Ile Arg Glu
1155 1160 1165
Val Lys Val Ile Thr Leu Lys Ser Lys Leu Val Ser Asp Phe Arg Lys
1170 1175 1180
Asp Phe Gln Phe Tyr Lys Val Arg Glu Ile Asn Asn Tyr His His Ala
1185 1190 1195 1200
His Asp Ala Tyr Leu Asn Ala Val Val Gly Thr Ala Leu Ile Lys Lys
1205 1210 1215
Tyr Pro Lys Leu Glu Ser Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr
1220 1225 1230
Asp Val Arg Lys Met Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala
1235 1240 1245
Thr Ala Lys Tyr Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr
1250 1255 1260
Glu Ile Thr Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu
1265 1270 1275 1280
Thr Asn Gly Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe
1285 1290 1295
Ala Thr Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys
1300 1305 1310
Lys Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro
1315 1320 1325
Lys Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1330 1335 1340
Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu
1345 1350 1355 1360
Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser Val
1365 1370 1375
Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe Glu Lys
1380 1385 1390
Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu Val Lys Lys
1395 1400 1405
Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn
1410 1415 1420
Gly Arg Lys Arg Met Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn
1425 1430 1435 1440
Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser
1445 1450 1455
His Tyr Glu Lys Leu Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln
1460 1465 1470
Leu Phe Val Glu Gln His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln
1475 1480 1485
Ile Ser Glu Phe Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp
1490 1495 1500
Lys Val Leu Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu
1505 1510 1515 1520
Gln Ala Glu Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala
1525 1530 1535
Pro Ala Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr
1540 1545 1550
Thr Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile
1555 1560 1565
Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1570 1575 1580
Ser Gly Gly Ser Lys Arg Thr Ala Asp Gly Ser Glu Phe Glu Pro Lys
1585 1590 1595 1600
Lys Lys Arg Lys Val
1605

Claims (7)

Application CN202210538473.1A | Priority date 2022-05-17 | Filing date 2022-05-17 | High-precision adenine base editor and application thereof | Active | CN115093482B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202210538473.1A / CN115093482B (en) | 2022-05-17 | 2022-05-17 | High-precision adenine base editor and application thereof

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202210538473.1A / CN115093482B (en) | 2022-05-17 | 2022-05-17 | High-precision adenine base editor and application thereof

Publications (2)

Publication Number | Publication Date
CN115093482A (en) | 2022-09-23
CN115093482B (en) | 2025-08-12

Family

ID=83288420

Family Applications (1)

Application Number | Status | Publication | Title | Priority Date | Filing Date
CN202210538473.1A | Active | CN115093482B (en) | High-precision adenine base editor and application thereof | 2022-05-17 | 2022-05-17

Country Status (1)

Country | Link
CN (1) | CN115093482B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN110029096A (en) * | 2019-05-09 | 2019-07-19 | ShanghaiTech University (上海科技大学) | A kind of adenine base edit tool and application thereof

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN111172133B (en) * | 2020-03-10 | 2021-12-31 | ShanghaiTech University (上海科技大学) | Base editing tool and application thereof
CN112852791B (en) * | 2020-11-20 | 2022-05-24 | Institute of Plant Protection, Chinese Academy of Agricultural Sciences (中国农业科学院植物保护研究所) | Adenine base editor and related biological material and application thereof

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN110029096A (en) * | 2019-05-09 | 2019-07-19 | ShanghaiTech University (上海科技大学) | A kind of adenine base edit tool and application thereof

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Stimulation of CRISPR-mediated homology-directed repair by an engineered RAD18 variant; Tarun S Nambiar, et al.; Nat Commun.; 2019-07-30; Vol. 10, No. 1; pp. 1-13 *

Also Published As

Publication number | Publication date
CN115093482A (en) | 2022-09-23

Similar Documents

Publication | Title
AU2019368215B2 (en) | Engineered enzymes
KR102181258B1 (en) | Virus like particle composition
CN110029096B (en) | Adenine base editing tool and application thereof
CN112639104B (en) | Novel promoter derived from organic acid-tolerant yeast and method for expressing target gene using the same
CN113186140B (en) | Genetically engineered bacteria for preventing and/or treating hangover and liver disease
CN111549062A (en) | Whole genome knockout vector library of silkworm based on CRISPR/Cas9 system and construction method
CN113584033B (en) | A CRISPR/Cpf1 gene editing system and its construction method and its application in Gibberella
CN101657097A (en) | With the inflammation is the treatment of diseases of feature
CN112011574B (en) | Lentiviral vector, construction method and application thereof
KR102335519B1 (en) | Vaccine composition for preventing or reducing clinical symptom of severe acute respiratory syndrome coronavirus 2
CN111534541A (en) | Eukaryotic organism CRISPR-Cas9 double gRNA vector and construction method thereof
CN113584062B (en) | Fusion imaging gene, lentivirus expression plasmid, lentivirus and cell thereof, and preparation method and application thereof
CN113652451B (en) | Lentiviral vector, construction method and application thereof
CN111534543A (en) | A eukaryotic CRISPR/Cas9 knockout system, basic vector, vector and cell line
CN111549060A (en) | A eukaryotic CRISPR/Cas9 genome-wide editing cell library and construction method
CN115093482B (en) | High-precision adenine base editor and application thereof
CN106086054A (en) | A kind of method of helicobacter pylori gene traceless knockout
CN114606265B (en) | Mini base editor capable of realizing single AAV virus coating
CN114058607B (en) | Fusion protein for editing C to U base, and preparation method and application thereof
CN106399373B (en) | A kind of Cas9 expression vector
CN111041039B (en) | A thermophilic anaerobic Ethanolbacterium genome editing vector and its application
KR102721142B1 (en) | Method for preparing a reassortant virus of the family Reoviridae and vector library therefor
CN112209883B (en) | A kind of fluorescein dye that specifically binds to RNA and its application
KR20110017146A (en) | Interleukin-10 inhibitory siRNA, compositions and cells comprising same
CN116324408A (en) | Cell lines that secrete α-synuclein-targeting antibodies, progranulin and prosaposin and their complexes, and GDNF

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
