CN104313172A

Info

Publication number: CN104313172A
Application number: CN201410624588.8A
Authority: CN
Inventors: 王师; 焦文倩; 包振民; 吕佳; 付晓腾; 张玲玲; 胡晓丽
Original assignee: Ocean University of China
Current assignee: Ocean University of China
Priority date: 2014-11-06
Filing date: 2014-11-06
Publication date: 2015-01-28

Abstract

The present invention provides a method for simultaneously genotyping a large number of samples that is applicable to SNP screening and genotyping across multiple sequencing platforms. Extracted genomic DNA is digested with a type IIB restriction endonuclease, and adapters are ligated to both ends of the digested fragments. From 5' to 3', one adapter comprises a universal sequencing primer of the sequencing platform followed by a degenerate sequence used for ligation; the other adapter comprises the universal sequencing primer for the opposite end of the platform, a Barcode sequence for distinguishing the target samples, and a degenerate sequence used for ligation. A library built once is thus compatible with multiple high-throughput sequencing platforms, and the dual-Barcode strategy allows up to 576 samples to be sequenced and genotyped simultaneously in a single sequencing lane.

Description

A method for simultaneous genotyping of a large number of samples

Technical field

The invention belongs to the field of DNA genetic marker technology in molecular biology. Specifically, it relates to a method for simultaneously genotyping a large number of samples that is applicable to SNP screening and genotyping across multiple sequencing platforms.

Background

A single nucleotide polymorphism (SNP) is a variation of a single nucleotide at a specific position in the genome. As third-generation molecular markers, SNPs are abundant, widely distributed, and genetically stable, which makes them the most suitable markers for studies of evolution and population genetics and for linkage and association analysis. With the emergence and development of high-throughput sequencing, SNP marker development has moved toward high-throughput, genome-wide scale. For model organisms, genome-wide SNP markers are mainly developed by resequencing: reads are aligned against the known genome sequence to screen for SNPs across the whole genome.

Driven by rapid progress in materials science, computer science, and related disciplines, sequencing technology has advanced dramatically. A group of new technologies represented by Roche 454, Illumina Solexa, and ABI SOLiD has emerged, collectively called high-throughput sequencing or next-generation sequencing (NGS). As the technology has developed and platforms have been upgraded, the Illumina platform has become one of the mainstream platforms for SNP genotyping owing to its range of read lengths (single-end 36 bp, single-end 50 bp, paired-end 100 bp, paired-end 150 bp), flexible throughput, high accuracy, and low cost. The HiSeq2000 offers high data output and high accuracy; the HiSeq2500 offers longer reads (paired-end 150 bp) and faster runs; and the MiSeq allows rapid library checks and the acquisition of small amounts of data. Researchers can choose a platform according to the purpose of the study.

In recent years, a series of reduced-representation genome methods based on high-throughput sequencing platforms has provided a powerful route to genome-wide SNP discovery in non-model organisms and to population genetics studies with large sample sizes. The current mainstream reduced-representation sequencing techniques all fragment genomic DNA by restriction digestion and then sequence the resulting representative tags at high throughput.

However, the existing methods are not well suited to non-model organisms with limited genomic information or to SNP studies on large population samples. A method is therefore needed that develops and genotypes high-density, genome-wide SNP markers without relying on the genome sequence of the species under study, that is compatible with diverse high-throughput sequencing platforms, and that allows a large number of samples to be sequenced and genotyped in parallel.

Summary of the invention

The object of the invention is to provide a method for simultaneously genotyping a large number of samples that is applicable to SNP screening and genotyping across multiple sequencing platforms. The resulting library is compatible with several sequencing platforms, including the Illumina HiSeq2000, HiSeq2500, and MiSeq, and up to 576 samples can be sequenced and genotyped simultaneously in one sequencing lane.

The method of the invention comprises the following steps:

1) Extract genomic DNA from each of the target samples to be analyzed.

2) Digest the extracted genomic DNA with a type IIB restriction endonuclease and ligate adapters to both ends of the digested fragments. From 5' to 3', one adapter comprises a universal sequencing primer of the sequencing platform followed by a degenerate sequence used for ligation; the other adapter comprises the universal sequencing primer for the opposite end of the platform, a Barcode sequence for distinguishing the target samples, and a degenerate sequence used for ligation.

The preferred adapters are Slx-MpAd1 and Slx-MpAd2, where Slx-MpAd1 contains the Barcode sequence needed to distinguish samples; the digested fragments are fragments produced by the BsaXI enzyme.

One specific set of sequences for adapters Slx-MpAd1 and Slx-MpAd2 is as follows:

Slx-MpAd1:

5'-ACACTCTTTCCCTACACGACGCTCTTCCGATCTXXXXXX(N_x)-3';

5'-XXXXXXAGATCGGAAGAGC-3';

Slx-MpAd2:

5'-GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT(N_x)-3';

5'-AGATCGGAAGAGC-3';

Here XXXXXX is the Barcode sequence. In N_x, N is a degenerate base and x is the number of degenerate bases, which matches the number of overhanging bases produced by the digestion. The sequence at the 5' end is the universal sequencing primer sequence of the sequencing platform.

3) PCR amplification and recovery of the target fragment: amplify the adapter-ligated fragments from step 2) by PCR, using primers that bind to the 5' sequences of the adapters.

4) Re-amplify the target fragment from step 3) as template. The forward primer matches the forward sequencing primer sequence of the sequencing platform; the reverse primer matches the reverse sequencing primer sequence of the platform and contains the Barcode2 sequence needed to distinguish samples.

5) Sequence the amplification product of step 4) at high throughput on an Illumina HiSeq2000.
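The adapter layout described in step 2) can be sketched in Python. This is a minimal illustration rather than the inventors' tooling: the function names and the example barcode are hypothetical, and the sketch assumes that the XXXXXX on the shorter strand is the reverse complement of the Barcode on the longer strand, so that the two strands base-pair.

```python
# Illustrative sketch: assemble both strands of each adapter.
PRIMER_1 = "ACACTCTTTCCCTACACGACGCTCTTCCGATCT"   # 5' part of Slx-MpAd1 (from the text)
PRIMER_2 = "GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT"  # 5' part of Slx-MpAd2 (from the text)
SHORT_STRAND = "AGATCGGAAGAGC"                   # shared part of the shorter strands

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_mpad1(barcode: str, overhang: int = 3):
    """Return (longer strand, shorter strand) of Slx-MpAd1 for one Barcode."""
    long_strand = PRIMER_1 + barcode + "N" * overhang
    # Assumption: the shorter strand carries the reverse complement of the Barcode.
    short_strand = reverse_complement(barcode) + SHORT_STRAND
    return long_strand, short_strand


def build_mpad2(overhang: int = 3):
    """Return (longer strand, shorter strand) of Slx-MpAd2."""
    return PRIMER_2 + "N" * overhang, SHORT_STRAND


print(build_mpad1("ACGTCA"))  # "ACGTCA" is a made-up barcode
print(build_mpad2())
```

The `overhang` parameter plays the role of x in N_x; the default of 3 is the value used for BsaXI in the worked example of the description.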

The specific primers preferably used in step 3) in the example are as follows:

Slx-1ST-MpPrimer-1:

5'-ACACTCTTTCCCTACACGACGCT-3';

Slx-1ST-MpPrimer-2:

5'-GTGACTGGAGTTCAGACGTGTGCT-3';

The specific primers used in step 4) in the example are as follows:

Slx-2ND-MpPrimer:

5'-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCT-3';

Slx-index-Barcode:

5'-CAAGCAGAAGACGGCATACGAGATXXXXXXGTGACTGGAGTTCAGACGTGTGCT-3'.
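As a small consistency check on the primer sequences listed above, the following Python sketch builds the Slx-index-Barcode primer for a given Barcode2 and confirms that each second-round primer ends with the corresponding first-round primer. The function names and the example barcode are hypothetical; only the primer sequences come from the text.

```python
# Sketch: derive the index primer for one Barcode2 and check primer nesting.
FIRST_ROUND_FWD = "ACACTCTTTCCCTACACGACGCT"    # Slx-1ST-MpPrimer-1
FIRST_ROUND_REV = "GTGACTGGAGTTCAGACGTGTGCT"   # Slx-1ST-MpPrimer-2
SECOND_ROUND_FWD = "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCT"  # Slx-2ND-MpPrimer
INDEX_PREFIX = "CAAGCAGAAGACGGCATACGAGAT"      # Slx-index-Barcode, part before XXXXXX


def index_primer(barcode2: str) -> str:
    """Fill the XXXXXX slot of Slx-index-Barcode with a six-base Barcode2."""
    if len(barcode2) != 6 or set(barcode2) - set("ACGT"):
        raise ValueError("Barcode2 must be six bases drawn from A, C, G, T")
    return INDEX_PREFIX + barcode2 + FIRST_ROUND_REV


def nests(outer: str, inner: str) -> bool:
    """True if the second-round primer ends with the first-round primer."""
    return outer.endswith(inner)


primer = index_primer("ATCACG")  # "ATCACG" is a made-up barcode
print(primer)
print(nests(SECOND_ROUND_FWD, FIRST_ROUND_FWD), nests(primer, FIRST_ROUND_REV))
```

Because each second-round primer ends with the matching first-round primer, the second round can anneal to the first-round product and extend it with the additional 5' sequence.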

As a further refinement of the technical solution, in step (2) 100 ng of target genomic DNA is digested in a 15 μl reaction: 100 ng DNA, 1.5 μl 10× NEBuffer4, 2 μl restriction endonuclease, made up to 15 μl with ddH₂O.

The ligation reaction (22 μl) contains: 10 μl digestion product, 2 μl 10 mM ATP, 4 μl 1 μM slx-ad1, 4 μl 1 μM slx-ad2, and 2 μl T4 DNA ligase. Ligate at 4°C for 16 h.

Further refinement: in step (3) the PCR reaction contains: 7 μl ligation product, 4 μl 5× HF Buffer, 0.6 μl 10 mM dNTP, 0.2 μl 10 μM slx-PCR1, 0.2 μl 10 μM slx-PCR2, 0.2 μl DNA polymerase, and 7.8 μl ddH₂O.

Further refinement: in step (3) the PCR conditions are 22 cycles of 98°C for 5 s, 60°C for 20 s, and 72°C for 10 s, followed by a final extension at 72°C for 10 min.

Further refinement: in step (4) the reaction (20 μl) contains: 5 μl recovered product, 4 μl 5× HF Buffer, 0.6 μl 10 mM dNTP, 0.2 μl 10 μM slx-PCR1, 0.2 μl 10 μM slx-PCR3, 0.2 μl DNA polymerase, and 9.8 μl ddH₂O.

Further refinement: in step (4) the PCR conditions are 5-7 cycles of 98°C for 5 s, 60°C for 20 s, and 72°C for 10 s, followed by a final extension at 72°C for 10 min.
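When many samples are processed, the reaction recipes above are usually scaled into a master mix. The Python sketch below checks that the step (3) volumes add up to 20 μl and scales every component except the template. The helper names and the `excess` pipetting allowance are assumptions for illustration, not part of the described method.

```python
# Sketch: verify a recipe total and scale it to several reactions.
FIRST_PCR_UL = {   # step (3) volumes in microlitres, taken from the text
    "ligation product": 7.0,
    "5x HF Buffer": 4.0,
    "10 mM dNTP": 0.6,
    "10 uM slx-PCR1": 0.2,
    "10 uM slx-PCR2": 0.2,
    "DNA polymerase": 0.2,
    "ddH2O": 7.8,
}


def total_volume(mix: dict) -> float:
    """Sum of all component volumes, rounded to avoid float noise."""
    return round(sum(mix.values()), 2)


def master_mix(mix: dict, template: str, reactions: int, excess: float = 0.1) -> dict:
    """Scale every component except the template.

    `excess` is a hypothetical allowance for pipetting loss (10% by default).
    """
    factor = reactions * (1 + excess)
    return {name: round(vol * factor, 2) for name, vol in mix.items() if name != template}


print(total_volume(FIRST_PCR_UL))
print(master_mix(FIRST_PCR_UL, "ligation product", reactions=10))
```

The template is left out of the master mix because it differs per sample and is added to each tube separately.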

By using its particular adapters and primers, the invention makes a library built once compatible with multiple high-throughput sequencing platforms, and the dual-Barcode strategy allows up to 576 samples to be sequenced and genotyped simultaneously in a single sequencing lane. The resulting library can be run directly on the MiSeq platform for library testing or small amounts of data, and on the HiSeq2000 and HiSeq2500 platforms for larger data volumes.
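The dual-Barcode idea is combinatorial: each sample is labelled by a pair consisting of one adapter barcode and one index barcode. The Python sketch below enumerates such pairs. The pool sizes of 24 and 24 are purely hypothetical; the text states only the total of 576 and not how it is split between the two barcode sets.

```python
from itertools import product


def barcode_pairs(barcode1_pool, barcode2_pool):
    """Every (Barcode1, Barcode2) combination can label one sample."""
    return list(product(barcode1_pool, barcode2_pool))


# Hypothetical pool sizes: the text gives only the total of 576, not the split.
pool1 = [f"BC1_{i:02d}" for i in range(1, 25)]   # 24 adapter barcodes (assumed)
pool2 = [f"BC2_{i:02d}" for i in range(1, 25)]   # 24 index barcodes (assumed)

pairs = barcode_pairs(pool1, pool2)
sample_to_pair = {f"sample_{n:03d}": pair for n, pair in enumerate(pairs, start=1)}
print(len(pairs))
```

Any two pool sizes whose product reaches 576 would serve equally well in this sketch; the point is that the number of distinguishable samples grows as the product of the two pools, not their sum.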

Detailed description

The terms used in the invention are defined as follows:

1. Endonuclease: among the nucleic acid hydrolases, an enzyme that hydrolyzes phosphodiester bonds within a nucleic acid chain to produce oligonucleotides. The type IIB restriction endonuclease used in the invention is BsaXI, but other type IIB restriction endonucleases may also be used.

2. Adapter (adaptor DNA): a short synthetic DNA fragment that contains a restriction site and can match a blunt or cohesive end. Adapter DNA is often used to join a blunt-ended DNA to a cohesive-ended DNA. Adapter DNA is sometimes ligated to a cohesive end to give an unknown DNA fragment a known sequence, from which primers can be designed to amplify the unknown fragment.

3. N (degenerate base): any one of the bases A, T, G, and C, which represent the four deoxynucleosides that make up a DNA molecule.

4. Barcode: a short signature sequence. When multiple samples are sequenced together at high throughput, sequencing the specific short sequence (the barcode) carried on each read identifies the sample the read came from.
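The barcode definition above amounts to a lookup on a fixed part of each read. The Python sketch below shows one simple way to do this, assuming that the six-base barcode sits at the very start of the read (as the adapter layout suggests, where the Barcode directly follows the sequencing primer) and that only exact matches are accepted. The barcodes, sample names, and reads are made up for illustration.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample, barcode_len=6):
    """Group reads by their leading barcode; unmatched reads go to 'undetermined'."""
    bins = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_len], read[barcode_len:]
        sample = barcode_to_sample.get(barcode, "undetermined")
        bins[sample].append(insert)   # store the read with the barcode trimmed off
    return dict(bins)


# Hypothetical barcodes and reads, for illustration only.
barcodes = {"ACGTCA": "scallop_01", "TGCATG": "scallop_02"}
reads = ["ACGTCAGGATTC", "TGCATGCCATAG", "ACGTCATTTGGA", "NNGTCAGGATTC"]
bins = demultiplex(reads, barcodes)
print({sample: len(inserts) for sample, inserts in bins.items()})
```

A production pipeline would typically also tolerate a sequencing error in the barcode; exact matching is used here only to keep the sketch short.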

The technical solution of the invention is described in further detail below with reference to a specific embodiment.

Example 1

The technical solution is described in detail below through an experiment on 120 individual scallops.

1. Extraction of scallop genomic DNA

High-purity genomic DNA was extracted from scallops by phenol-chloroform extraction, as follows:

Take about 0.1 g of scallop adductor muscle, add 500 μl STE lysis buffer (100 mM NaCl; 1 mM EDTA, pH 8.0; 10 mM Tris-Cl, pH 8.0), and mince the tissue. Add 50 μl 10% SDS and 5 μl 20 mg/ml proteinase K and incubate at 56°C until the lysate is clear. Add equal volumes of saturated phenol (250 μl) and chloroform/isoamyl alcohol (24:1) (250 μl) and extract three times. Transfer the supernatant, add an equal volume of chloroform/isoamyl alcohol (500 μl), and extract once. Transfer the supernatant, add 50 μl NaAc (3 M, pH 5.2) and 1 ml ice-cold absolute ethanol, mix gently, and centrifuge at 12,000 rpm for 10 min to pellet the nucleic acid. Wash the pellet with 70% ethanol (1 ml) and dry until the ethanol has fully evaporated. Add 100 μl ddH₂O and a small amount of RNase A, digest at 37°C for 30 min, quantify with a UV spectrophotometer, and adjust to 100 ng/μl for later use.
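Adjusting each extract to 100 ng/μl is a standard C1·V1 = C2·V2 dilution. The Python sketch below computes how much water to add; the function name and the example concentration are hypothetical, and the sketch assumes the measured stock is at least as concentrated as the target.

```python
def dilution_volume(stock_ng_per_ul: float, stock_ul: float,
                    target_ng_per_ul: float = 100.0) -> float:
    """Return the microlitres of water to add so the stock reaches the target.

    Uses C1 * V1 = C2 * V2, so the final volume is V2 = C1 * V1 / C2.
    """
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already more dilute than the target")
    final_ul = stock_ng_per_ul * stock_ul / target_ng_per_ul
    return final_ul - stock_ul


# Hypothetical reading: 40 microlitres of extract measured at 250 ng/ul.
print(dilution_volume(250.0, 40.0))
```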

2. Type IIB restriction endonuclease digestion and adapter ligation

Digest 100 ng of DNA from each sample in a 15 μl reaction: 100 ng DNA, 1.5 μl 10× NEBuffer4, 2 μl BsaXI, made up to 15 μl with ddH₂O. Incubate at 37°C for 3 hours. Run 5 μl of the digestion product on a 1% agarose gel to check whether digestion is complete.

Ligate the designed specific adapters (Slx-MpAd1 and Slx-MpAd2) to the digestion products. Slx-MpAd1 contains the Barcode1 sequence (Table 1), which distinguishes the samples. In this example, the degenerate bases at the 3' end of both adapters are three random bases, complementary to the cohesive ends of the tags produced by BsaXI digestion.
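Three degenerate positions mean that each adapter is in effect a mixture of 4^3 = 64 oligonucleotides, one for every possible three-base overhang. A minimal Python sketch of that expansion follows; the function name is hypothetical.

```python
from itertools import product


def expand_degenerate(length: int):
    """List every concrete sequence represented by N repeated `length` times."""
    return ["".join(bases) for bases in product("ACGT", repeat=length)]


overhangs = expand_degenerate(3)
print(len(overhangs), overhangs[:4])
```

Because every possible overhang is represented in the mixture, the adapters can ligate to a digested tag whatever bases its cohesive ends happen to carry.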

The reaction (22 μl) contains: 10 μl digestion product, 2 μl 10 mM ATP, 4 μl 1 μM Slx-MpAd1, 4 μl 1 μM Slx-MpAd2, and 2 μl T4 DNA ligase. Ligate at 4°C for 16 h.

Adapters:

Slx-MpAd1:

5' ACACTCTTTCCCTACACGACGCTCTTCCGATCTXXXXXXNNN 3'

5' XXXXXXAGATCGGAAGAGC 3'

Slx-MpAd2:

5' GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTNNN 3'

5' AGATCGGAAGAGC 3'.

3. PCR amplification and recovery of the target fragment

PCR amplification used a high-fidelity DNA polymerase (Phusion DNA polymerase, NEB) and Solexa-specific primers (slx-PCR1 and slx-PCR2). The reaction (20 μl) contained: 7 μl ligation product, 4 μl 5× HF Buffer, 0.6 μl 10 mM dNTP, 0.2 μl 10 μM Slx-1ST-MpPrimer-1, 0.2 μl 10 μM Slx-1ST-MpPrimer-2, 0.2 μl Phusion DNA polymerase, and 7.8 μl ddH₂O.

Primers:

Slx-1ST-MpPrimer-1:

5' ACACTCTTTCCCTACACGACGCT 3';

Slx-1ST-MpPrimer-2:

5' GTGACTGGAGTTCAGACGTGTGCT 3';

PCR conditions: 22 cycles of 98°C for 5 s, 60°C for 20 s, and 72°C for 10 s, followed by a final extension at 72°C for 10 min. Three amplification reactions were run in parallel for each sample, and the products were pooled into one tube for recovery of the target fragment.

Load the 60 μl of PCR product on an 8% polyacrylamide gel and run at 300 V for 40 min. Excise the target band (96 bp) from the gel, crush it in a 1.5 ml centrifuge tube, add 50 μl ddH₂O, and leave at 4°C for 12 h. After centrifugation, the supernatant has a concentration of roughly 5-10 ng/μl.

4. Adapter extension

The recovered product was re-amplified with a second primer pair, Slx-2ND-MpPrimer and Slx-index-Barcode, where Slx-index-Barcode contains the Barcode2 sequence (Table 2). The reaction (20 μl) contained: 5 μl recovered product (5 ng/μl), 4 μl 5× HF Buffer, 0.6 μl 10 mM dNTP, 0.2 μl 10 μM Slx-2ND-MpPrimer, 0.2 μl 10 μM Slx-index-Barcode, 0.2 μl Phusion DNA polymerase, and 9.8 μl ddH₂O.

Primers:

Slx-2ND-MpPrimer:

5' AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCT 3';

Slx-index-Barcode:

5' CAAGCAGAAGACGGCATACGAGATXXXXXXGTGACTGGAGTTCAGACGTGTGCT 3'

PCR conditions: 5-7 cycles of 98°C for 5 s, 60°C for 20 s, and 72°C for 10 s, followed by a final extension at 72°C for 10 min. Three amplification reactions were run in parallel for each sample, and the products were pooled into one tube for purification.

The PCR product was purified with Qiagen's QIAquick PCR purification kit according to the manufacturer's instructions, except that the final elution was done with 35 μl ddH₂O.

5. High-throughput sequencing

The purified products of each set of 120 samples were mixed in equal amounts into one sample for sequencing (100 ng), which occupied one lane of a HiSeq2000 run.
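Mixing in equal amounts means that each sample contributes the same mass of DNA, so the volume taken from each depends on its measured concentration. The Python sketch below works this out for a pool of a given total mass. The function name and the example concentrations are hypothetical; only the 100 ng pool size comes from the text.

```python
def pooling_volumes(conc_ng_per_ul: dict, pool_total_ng: float = 100.0) -> dict:
    """Return the microlitres to take from each sample for an equal-mass pool."""
    per_sample_ng = pool_total_ng / len(conc_ng_per_ul)
    return {name: per_sample_ng / conc for name, conc in conc_ng_per_ul.items()}


# Hypothetical concentrations (ng/ul) for two purified libraries.
volumes = pooling_volumes({"scallop_01": 10.0, "scallop_02": 20.0})
print(volumes)
```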

Table 1: Slx-MpAd1 sequences containing Barcode1

Table 2: Slx-index-Barcode sequences containing Barcode2

6. Data analysis

The pooled library of 120 scallop individuals occupied one lane on the HiSeq2000 platform and was sequenced single-end. A lane typically yields about 150M reads; this run yielded 237M reads, or 8.5G of data. Quality statistics on the raw reads showed Q20 above 98%, indicating that the adapter and primer sequences of the invention are compatible with the HiSeq2000 platform and give good sequencing quality. The 120 samples were separated by barcode sequence: every sample had between 1.83M and 2.07M reads, so the data were evenly distributed among individuals and barcode amplification showed no bias. The barcode sequences used in the invention therefore distinguish samples effectively and allow parallel sequencing and genotyping of a large number of samples in a single lane. After further quality filtering to remove low-quality reads and reads containing N, high-quality reads accounted for more than 95% of the raw reads in all 120 individuals, and 98% of these contained the correct restriction recognition site and could proceed to SNP genotyping. The adapter and primer design of the invention thus enriches the correct target fragments and can be used to build high-quality 2b-RAD reduced-representation libraries with stable and reliable results.
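The figures quoted above can be cross-checked with a few lines of Python, and the two read filters (no N, recognition site present) are easy to express. This is only a sketch: the recognition motif AC(N)5CTCC is the commonly cited BsaXI specificity rather than something stated in the text, so it should be verified against the enzyme supplier's documentation, and the function name is hypothetical.

```python
import re

# Figures quoted in the text.
TOTAL_READS = 237e6
TOTAL_BASES = 8.5e9
SAMPLES = 120

implied_read_length = TOTAL_BASES / TOTAL_READS   # bases per read
mean_reads_per_sample = TOTAL_READS / SAMPLES     # reads per individual if all were assigned

# Assumption: BsaXI recognition motif AC(N)5CTCC, searched on both strands.
SITE = re.compile(r"AC[ACGT]{5}CTCC|GGAG[ACGT]{5}GT")


def keep_read(read: str) -> bool:
    """Keep reads that contain no N and carry the recognition motif."""
    return "N" not in read and SITE.search(read) is not None


print(round(implied_read_length, 1), round(mean_reads_per_sample))
print(keep_read("TTTACGGATCCTCCAAA"), keep_read("TTTACGGATNCTCCAAA"))
```

The implied read length comes out close to 36 bases, matching the single-end 36 bp mode mentioned in the background, and the mean of about 1.98M reads per individual falls inside the reported 1.83M-2.07M range.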

The above examples only illustrate the technical solution of the invention and do not limit it. Although the invention has been described in detail with reference to the foregoing examples, a person of ordinary skill in the art may still modify the technical solutions described in them or replace some of their technical features with equivalents; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions claimed by the invention.