Specific embodiment
It should be noted that, provided they do not conflict, the embodiments of the present application and the features of those embodiments may be combined with one another. The present invention is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
It should first be explained that high-throughput sequencing can detect the various possible mutations present in a sample, including InDels, SNPs, and large-fragment deletions. The deletion mutations detected by the method and device of the present invention are inferred mainly from a statistical standpoint: they indicate the gene mutation sites that may be present in the sample to be tested and the specific type of each mutation. Whether a mutation is directly or indirectly related to a disease must be verified by multiple other test results. The method and device are therefore suitable only for scientific and basic academic research, and not for the clinical diagnosis of disease.
As mentioned in the background section, prior-art methods that detect gene mutations by high-throughput sequencing cannot accurately detect, in batches, both the mutation sites and all possible mutation types. To remedy this defect, a typical embodiment of the present invention provides, as shown in Figure 1, a method for detecting gene mutations. The method comprises the following steps: obtaining sequencing data of a sample to be tested and of control samples; judging whether an SNP mutation and/or an InDel mutation is present in the data to be tested; and judging whether a deletion mutation is present in the data to be tested. The step of judging whether a deletion mutation is present comprises: normalization, in which the sequencing data are cut into windows, the sequence counts of the sample to be tested and of the control samples in each window are tallied, and the sequence count of each window is normalized to obtain the normalized sequence counts of the sample to be tested and of the control samples in each window; standard deviation and median calculation, in which the standard deviation and median of the normalized sequence counts of the control group samples in each window are calculated; deviation calculation, in which the deviation value Z between the normalized sequence count of the sample to be tested and the median of the control samples is calculated for each window according to formula (1):

$$Z = \frac{\text{normalized sequence count of the sample to be tested} - \text{median}}{\text{standard deviation}} \tag{1}$$

and deletion judgment, in which a window is judged to contain a deletion mutation when its Z value is greater than 3.
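The deviation calculation and deletion judgment described above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the sample standard deviation is assumed because the text does not name an estimator, and the comparison is applied literally as "Z greater than 3" because the text does not spell out a sign convention.

```python
from statistics import median, stdev


def deviation_z(test_count, control_counts):
    """Formula (1): (test normalized count - control median) / control standard deviation."""
    # Assumption: sample standard deviation; the text does not say which estimator is used.
    sd = stdev(control_counts)
    if sd == 0:
        raise ValueError("control counts have zero spread; Z is undefined")
    return (test_count - median(control_counts)) / sd


def call_deletions(test_counts, control_counts_by_window, threshold=3.0):
    """Return the indices of windows whose Z value exceeds the threshold."""
    flagged = []
    for index, (test_count, controls) in enumerate(
        zip(test_counts, control_counts_by_window)
    ):
        # Literal reading of the criterion in the text: Z greater than 3.
        if deviation_z(test_count, controls) > threshold:
            flagged.append(index)
    return flagged
```

A caller who reads the criterion as a test on the magnitude of the deviation could compare `abs(deviation_z(...))` against the threshold instead; the text itself gives only "greater than 3".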
By subjecting the sequencing data of the sample to be tested and of the control samples to both SNP and/or InDel mutation judgment and deletion mutation judgment, the above method can detect all sites in the sample to be tested that carry any of these mutation types. Moreover, in the deletion judgment, whether a given window contains a deletion is decided by a statistic that measures how far the normalized sequence count of that window deviates from the median of the control samples in the same window. Compared with a statistic based on deviation from the mean of the normalized sequence counts, this is more valid and more accurate in statistical terms and makes false positives easier to identify.
In the above normalization step, counting sequences per window makes it convenient to choose the window size flexibly according to the sequencing depth of the data and the size of the target deletion fragment, which widens the range of deletion sizes that can be detected. When determining whether a window contains a deletion mutation, the difference between the normalized sequence count of the sample to be tested in that window and the median of the control samples in that window is calculated first; the window is then judged according to whether the ratio of this difference to the standard deviation of the control samples' normalized sequence counts in that window is greater than 3. Because this method takes the median of a group of control samples as the standard for comparison, and the median, unlike the mean, is not easily affected by individual abnormally fluctuating normalized sequence counts, the judgment is more accurate.
In the above method, the way the sequencing data are cut into windows can be weighed and set according to the trade-off between detection sensitivity and detection accuracy. In the present invention, the windows are preferably contiguous and non-overlapping: a longer exon is divided into contiguous, non-overlapping windows, whereas a shorter exon is placed whole into a single window. When the window is set to a smaller value, smaller deletion mutations are easier to find, but the sequence counts in the same window vary widely between samples and are difficult to compare. When the window is set to a larger value, the sequence counts in the same window vary less between samples, but smaller deletion mutations cannot be found.
In the above embodiment, the window length is set according to the sequencing depth of the sample to be tested and the control samples and the length of the deletion mutation expected to be detected. Because the lengths of all genes and of their exons are known in this sequencing assay, the window size can be set according to the expected mutation length. If the mutant fragments to be detected are small, a smaller window value is set; otherwise, a larger one can be set. The shorter the window, the higher the detection sensitivity and, correspondingly, the lower the accuracy; the longer the window, the higher the accuracy and the lower the sensitivity. In a preferred embodiment of the present invention, when the sequencing depth is 300× or greater, the length of each window is 50 to 160 bp. At a sequencing depth of 300× or greater, keeping the window length within 50 to 100 bp gives a better balance between detection sensitivity and detection accuracy.
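The window-cutting rule described above (contiguous, non-overlapping windows for long exons, a single window for short exons) might look as follows. This is only a sketch: the function name, the half-open coordinate convention, and the handling of a shorter final window are assumptions, since the text does not say how a remainder is treated.

```python
def split_exon(start, end, window=100):
    """Cut the half-open interval [start, end) into contiguous, non-overlapping windows.

    An exon no longer than `window` is returned whole as a single window.
    Assumption: a leftover shorter than `window` becomes its own final window.
    """
    if end <= start:
        raise ValueError("exon must have positive length")
    if window <= 0:
        raise ValueError("window length must be positive")
    if end - start <= window:
        return [(start, end)]
    return [(s, min(s + window, end)) for s in range(start, end, window)]
```

For example, with a 100 bp window a 250 bp exon yields three windows, the last one 50 bp long, while a 60 bp exon stays in one window.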
In the above method, the main purpose of the normalization step is to put the sequence counts of the windows on a common scale, so that differences in sequencing depth between the sample to be tested and the control samples do not bias the comparison of per-window sequence counts. Any operation known in the art for normalizing such data is therefore applicable to the present invention. Preferably, the sequencing data of the sample to be tested and of each control sample are cut into windows; the sequence count of each window is recorded as the first sequence count, and the sum of all first sequence counts of a sample is recorded as the second sequence count. The sequence counts of the sample to be tested and of the control samples are then normalized according to formula (2):

$$\text{normalized sequence count} = \frac{\text{first sequence count} \times 1000}{\text{second sequence count}} \tag{2}$$

to obtain the normalized sequence count of the sample to be tested and of each control sample in each window.
In the above embodiment, normalizing with formula (2) more effectively eliminates the bias in sequence counts caused by differences in sequencing depth between samples, so that the per-window sequence counts of the different samples are on a comparable scale.
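Formula (2) can be illustrated with a short Python sketch. The function name and the `scale` parameter are hypothetical; the factor of 1000 is the one given in the text.

```python
def normalize_counts(window_counts, scale=1000):
    """Formula (2): first sequence count * 1000 / second sequence count, per window."""
    total = sum(window_counts)  # the "second sequence count" of this sample
    if total == 0:
        raise ValueError("sample has no sequences in any window")
    return [count * scale / total for count in window_counts]
```

Because each window is divided by the sample's own total, two samples with the same relative distribution of sequences but different sequencing depths receive the same normalized values, which is the property the paragraph above relies on.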
In the above method, the step of judging whether an SNP mutation and/or an InDel mutation is present in the sequencing data of the sample to be tested detects whether the sample carries an SNP mutation or an InDel mutation, whether both are present at the same time, or whether different sites of interest in the sample carry one or both of these mutation types. Judgment methods conventional in the art may therefore be used.
In a preferred embodiment of the present invention, the step of judging whether an SNP mutation and/or an InDel mutation is present in the sequencing data of the sample to be tested comprises: a sequence alignment step, in which the sequencing data are aligned to a reference genome to obtain an alignment result; a first screening step, in which sites with SNP and/or InDel mutations are screened out from the alignment result and recorded as first candidate sites; a second screening step, in which sites with a population mutation frequency of less than 2% are screened out from the first candidate sites and recorded as second candidate sites; an SNP and/or InDel mutation judgment step, in which, according to the functional annotation of the second candidate sites in a functional annotation database, it is judged whether the second candidate sites include SNP and/or InDel mutation sites that alter gene function, and, if so, those second candidate sites are recorded as third candidate sites; and an SNP and/or InDel mutation confirmation step, in which, when third candidate sites exist, they are determined to be SNP mutation sites and/or InDel mutation sites.
In the above preferred embodiment, sequence alignment can use software such as SOAP (http://soap.genomics.org.cn/) to map the sequenced reads to the corresponding positions of the reference genome, which yields the SNP sites and/or InDel sites that differ from the reference genome at those positions. In practice, the coverage depth of each SNP and/or InDel site (that is, the number of times the site was sequenced) must also be counted, and, to ensure the accuracy of the mutation sites found, sites with a coverage depth below 30 are removed. Then, among the remaining SNP and/or InDel sites, the functional annotation of each site in a functional annotation database is used to determine whether any site may affect the function of a gene; if one or several such sites exist, it can be confirmed that an SNP mutation and/or InDel mutation is present at those sites. In addition, depending on the function of the gene of interest, other auxiliary methods can be used to exclude or confirm whether a site is a mutation site that causes dysfunction. For example, to confirm whether such a site is disease-related, besides judging dysfunction from existing database information, one can also take the SNP and/or InDel mutation site information of disease samples and normal control samples, select sites with a population frequency below 2%, predict their effect on protein function using the SIFT software, and treat the sites that alter protein function as candidate pathogenic sites for the disease.
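The successive screens described above (coverage depth, population frequency, functional annotation) can be sketched as a chain of filters. The `Site` record and its field names are hypothetical stand-ins for whatever an aligner and an annotation database actually return; only the thresholds of 30 and 2% come from the text.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical record for one candidate SNP/InDel site."""
    label: str
    depth: int               # number of times the site was sequenced
    population_freq: float   # population mutation frequency, as a fraction
    alters_function: bool    # taken from a functional annotation database


def screen_sites(sites, min_depth=30, max_freq=0.02):
    """Keep sites that pass the depth, frequency, and functional screens in turn."""
    covered = [s for s in sites if s.depth >= min_depth]         # drop depth below 30
    rare = [s for s in covered if s.population_freq < max_freq]  # keep frequency under 2%
    return [s for s in rare if s.alters_function]                # keep function-altering sites
```

Keeping the three filters as separate lists mirrors the first, second, and third candidate sites of the text and makes it easy to report how many sites each screen removes.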
In the above method, before step S1, the method further comprises preparing exon libraries for the sample to be tested and the control samples, respectively, the exon libraries being prepared by liquid-phase capture. Preparing exon libraries by liquid-phase capture gives a higher capture efficiency and saves considerable time.
In the above method, before the liquid-phase capture preparation, the method further comprises designing liquid-phase capture probes according to the target exon regions. The probes can be designed by probe design methods conventional in the art, for example by customizing liquid-phase probes according to the official manual of Agilent or of NimbleGen.
In the above exon library preparation step, depending on the research purpose, exon libraries can be prepared for multiple genes of interest in one sample to be tested, or for one or more genes in each of multiple samples to be tested. In a preferred embodiment of the present invention, when exon libraries are prepared for multiple genes, the genes include at least the following: MLH1, MSH2, MSH3, MSH6, PMS1, PMS2, BUB1, BUB3, STK11, PTEN, SMAD4, APC, MUTYH, EPCAM, SETD2, MAX, TSC2, ATM, and FANCC. When the sequencing data cover these genes, the above method can simultaneously detect the mutation sites that may be present in them and their mutation types.
The above genes are known. In the present invention, the inventors provide, for the first time, a method for the centralized detection of at least these 19 genes, so that the mutation sites that may be present in these genes, and their mutation types, can be detected in the same sample to be tested in a single run.
In a specific embodiment of the present invention, the step of preparing the exon libraries of the sample to be tested and the control samples comprises: fragmenting the genomic DNA of the sample to be tested and of the control samples to obtain fragmented DNA; subjecting the fragmented DNA to end repair and A-tailing to obtain repaired DNA with an "A" at the 3' end; ligating adapters to the repaired DNA to obtain adapter-ligated DNA; amplifying the adapter-ligated DNA by PCR to obtain amplified DNA; and hybridizing the amplified DNA with the liquid-phase capture probes to obtain the exon libraries of the sample to be tested and the control samples. Preparing the exon libraries by liquid-phase capture yields sequencing libraries containing the exon regions of the target genes, with high efficiency and a saving of time.
In the above method, after the exon libraries of the sample to be tested and the control samples are obtained and before they are sequenced, the method further comprises denaturing the exon libraries. The purpose of denaturation here is to make the libraries ready for high-throughput sequencing.
Another typical embodiment of the present invention provides a device for detecting deletion mutations. The device comprises: an acquisition module for obtaining sequencing data of a sample to be tested and of control samples; a first judgment module for judging whether an SNP mutation and/or an InDel mutation is present in the data to be tested; and a second judgment module for judging whether a deletion mutation is present in the data to be tested. The second judgment module comprises: a normalization submodule for cutting the sequencing data into windows, tallying the sequence counts of the sample to be tested and of the control samples in each window, and normalizing the sequence count of each window to obtain the normalized sequence counts of the sample to be tested and of the control samples in each window; a first calculation submodule for calculating the standard deviation and median of the normalized sequence counts of the control group samples in each window; a second calculation submodule for calculating, for each window according to formula (1), the deviation value Z between the normalized sequence count of the sample to be tested and the median of the control samples:

$$Z = \frac{\text{normalized sequence count of the sample to be tested} - \text{median}}{\text{standard deviation}} \tag{1}$$

and a deletion judgment submodule for judging that a window contains a deletion mutation when its Z value is greater than 3.
In the above device, the acquisition module obtains the sequencing data of the sample to be tested and of the control samples; the first judgment module judges whether an SNP mutation and/or an InDel mutation is present in the data to be tested; and the second judgment module judges whether a deletion mutation is present in the data to be tested. Within the second judgment module, the normalization submodule cuts the sequencing data into windows, tallies the sequence counts of the sample to be tested and of the control samples in each window, and normalizes the sequence count of each window to obtain the normalized sequence counts of the sample to be tested and of the control samples in each window. The first calculation submodule then calculates the standard deviation and median of the normalized sequence counts of the control group samples in each window. The second calculation submodule calculates, for each window according to formula (1), the deviation value Z between the normalized sequence count of the sample to be tested and the median of the control samples:

$$Z = \frac{\text{normalized sequence count of the sample to be tested} - \text{median}}{\text{standard deviation}} \tag{1}$$

Finally, the deletion judgment submodule judges that a window contains a deletion mutation when its Z value is greater than 3.
By using the normalization submodule, the above device makes it convenient to choose the window size flexibly according to the sequencing depth of the data and the size of the target deletion fragment, which widens the range of deletion sizes that can be detected. Moreover, when determining whether a window contains a deletion mutation, the second judgment module uses the median of the control samples as the standard against which the deviation of the sample's normalized sequence count in each window is measured. Compared with using the mean and standard deviation as the standard of comparison, the median is not easily affected by individual abnormal sequence counts, which makes false positives easier to identify and the result more accurate.
In the above device, the normalization submodule can be obtained by suitably adapting a normalization submodule commonly used in the art; any normalization submodule capable of standardizing the per-window sequence counts of the present invention is suitable. In a preferred embodiment, the normalization submodule further comprises: a counting unit for tallying the sequence count of the sample to be tested and of each control sample in each window, recorded as the respective first sequence count, and for summing the first sequence counts of all windows of each sample, recorded as the respective second sequence count; and a calculation unit for normalizing the first sequence count of the sample to be tested and of the control samples in each window according to formula (2):

$$\text{normalized sequence count} = \frac{\text{first sequence count} \times 1000}{\text{second sequence count}} \tag{2}$$

to obtain the normalized sequence count of the sample to be tested and of each control sample in each window. In this embodiment, the normalization submodule effectively reduces the influence of sequencing depth differences between samples on the result.
In the above device, the first judgment module judges whether the sample to be tested carries an SNP mutation or an InDel mutation, whether both are present at the same time, or whether different sites of interest in the sample carry one or both of these mutation types. A judgment module conventional in the art may therefore be used.
In a preferred embodiment of the present invention, the first judgment module comprises: a sequence alignment submodule for aligning the sequencing data to a reference genome to obtain an alignment result; a first screening submodule for screening out, from the alignment result, sites with SNP and/or InDel mutations, recorded as first candidate sites; a second screening submodule for screening out, from the first candidate sites, sites with a population mutation frequency of less than 2%, recorded as second candidate sites; an SNP and/or InDel mutation judgment submodule for judging, according to the functional annotation of the second candidate sites in a functional annotation database, whether the second candidate sites include SNP and/or InDel mutation sites that alter gene function and, if so, recording those second candidate sites as third candidate sites; and an SNP and/or InDel mutation confirmation submodule for determining, when third candidate sites exist, that the third candidate sites are SNP mutation sites and/or InDel mutation sites.
In the above preferred embodiment, the sequence alignment submodule can perform the alignment with an alignment module such as SOAP (http://soap.genomics.org.cn/). Depending on the sequencing quality of the actual data, the first screening submodule can also include a screening submodule that removes sites with a coverage depth below 30. The second screening submodule screens the first candidate sites for sites whose population mutation frequency, as recorded in currently available databases, is below 2%. The resulting second candidate sites all have a population mutation frequency below 2%, which suggests that they are not high-frequency mutations reflecting individual variation and may instead be disease-related. The SNP and/or InDel mutation judgment submodule is then executed to judge, from the gene function annotations in known databases, whether the mutation at a given site alters gene function. If such sites exist, the SNP and/or InDel mutation confirmation submodule is further executed, and the sites that alter gene function are determined to be SNP mutation sites and/or InDel mutation sites.
The above databases for annotating gene function include, but are not limited to, dbSNP (http://www.ncbi.nlm.nih.gov/projects/SNP/), HGMD (www.hgmd.cf.ac.uk), ClinVar (http://www.ncbi.nlm.nih.gov/clinvar/), and LOVInSiGHT (http://insight-group.org/lovd.html).
In the above device, upstream of the detection modules, the device further comprises an exon library preparation module for preparing the exon libraries of the sample to be tested and the control samples by liquid-phase capture. Because this module obtains sequencing libraries containing the exon regions of the target genes by liquid-phase capture, exon capture is efficient and time-saving.
In the above device, upstream of the exon library preparation module, the device further comprises a probe design module for designing liquid-phase capture probes according to the target exon regions. The design principle of the probe design module is to design short fragments complementary to the target region in order to capture the target sequence. A probe design module commonly used in the art can be adopted, for example liquid-phase probe customization according to the official manual of Agilent or of NimbleGen.
The beneficial effects of the present invention are further illustrated below with reference to specific examples.
It should be noted that the following example describes the method of the present invention in detail using the 19 genes listed in Table 1. Unless otherwise indicated, the reagents, drugs, and instruments used were all from Agilent (USA). In this example, 96 possible carriers of gene mutations and 10 normal individuals were recruited and signed written informed consent; the mutant genes that the carriers might harbor and their specific mutation types were then detected. Buccal swab samples were extracted according to the buccal swab extraction method, chips were prepared and hybridized according to Agilent's instructions, and sequencing was performed according to Illumina's instructions. The specific steps are as follows:
Table 1:
| MLH1 | MSH2 | MSH3 | MSH6 | PMS1 | PMS2 |
| BUB1 | BUB3 | STK11 | PTEN | SMAD4 | APC |
| MUTYH | EPCAM | SETD2 | MAX | TSC2 | ATM |
| FANCC | | | | | |
Experiment 1: Chip design
The reference sequences are the genomic exon sequences of the above 19 genes in NCBI build 37/hg19 (from www.ncbi.nlm.nih.gov), together with 10 bp of flanking sequence on each side; the design was completed by Agilent (USA).
Experiment 2: DNA extraction
1) Sample processing: transfer the cotton swab that was wiped across the inside of the cheek into a 2 ml centrifuge tube, cutting off the cotton tip with scissors.
2) Add cell lysis buffer and proteinase K (E.C. 3.4.21.64) and incubate at 56 °C for 60 min to digest the cells.
3) Add buffer, incubate at 70 °C for 10 min, squeeze out and discard the cotton swab, and transfer the lysate to a new centrifuge tube.
4) Add absolute ethanol to precipitate the DNA.
5) Load the solution onto a spin column, centrifuge, and discard the waste liquid in the collection tube.
6) Add buffer, centrifuge, and discard the waste liquid in the collection tube.
7) Add wash buffer, centrifuge, and discard the waste liquid in the collection tube. Repeat once.
8) Dry the column matrix.
9) Add elution buffer to the spin column to elute the DNA.
10) Centrifuge to collect the DNA, repeat the elution once, and store the DNA product at -20 °C.
Experiment 3: Library preparation
Step 1: DNA fragmentation
1) gDNA quality check: make sure the DNA is of acceptable quality (no degradation; A260/A280 between 1.8 and 2.0). Measure the concentration of the sample gDNA with Qubit.
2) Set up the Covaris according to the parameters in Table 2, as follows:
Table 2:
| Setting | Value |
| Duty factor | 10% |
| Peak incident power (PIP) | 175 |
| Cycles per burst | 200 |
| Treatment time | 360 s |
| Temperature | 4 °C to 8 °C |
a. Add deionized water to the Covaris water bath until the water level reaches mark "12";
b. Check that the water level covers the glass part of the sample tube;
c. Set the chiller temperature to 2-5 °C and cool the bath to 5 °C;
d. Optionally, add ethylene glycol to 20% of the total volume to prevent freezing.
e. Press the "Degas" button on the panel and run degassing for at least 30 min before use.
3) In a 1.5 ml EP tube, dilute 3 ug of gDNA to 130 ul with 1X Low TE Buffer;
4) Load a Covaris microTube onto the Covaris;
5) Carefully draw up 130 ul of DNA sample with a tapered pipette tip and add it to the Covaris microTube. (Work carefully; do not introduce bubbles at the bottom of the tube.)
6) Fragment the DNA using the Covaris parameters set in Table 2; the main peak of the fragmented product is at 150-200 bp.
7) Carefully transfer the fragmented DNA sample with a tapered pipette tip to a new 1.5 ml EP tube.
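The purity check and the dilution in Step 1 involve simple arithmetic that is easy to get wrong at the bench. The following sketch shows one way to compute it; the function names are hypothetical, and the stock concentration passed in is whatever the Qubit measurement returns.

```python
def passes_purity_check(a260_a280, low=1.8, high=2.0):
    """True when the A260/A280 ratio lies within the 1.8 to 2.0 range given in the text."""
    return low <= a260_a280 <= high


def dilution_volumes(conc_ng_per_ul, mass_ug=3.0, final_ul=130.0):
    """Return (stock volume, Low TE volume) in ul to bring `mass_ug` of gDNA to `final_ul`."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    stock_ul = mass_ug * 1000.0 / conc_ng_per_ul  # ug converted to ng, then divided by ng/ul
    if stock_ul > final_ul:
        raise ValueError("stock is too dilute to fit the required mass in the final volume")
    return stock_ul, final_ul - stock_ul
```

With a stock at 100 ng/ul, for instance, 30 ul of gDNA and 100 ul of 1X Low TE Buffer give the 3 ug in 130 ul called for in step 3).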
Step 2: Purify the DNA sample with Agencourt AMPure XP magnetic beads
1) Leave the AMPure XP beads at room temperature for at least 30 min; then mix the bead suspension thoroughly until its colour is uniform (do not freeze).
2) Add 180 ul of the mixed AMPure XP bead suspension and the fragmented DNA library (~130 ul) to a new 1.5 ml tube. Vortex to mix and leave at room temperature for 5 min.
3) Place the tube on a magnetic rack and let it stand for about 3-5 min until the solution becomes clear.
4) With the tube on the magnetic rack, carefully remove the supernatant without touching the beads with the pipette tip.
5) With the tubes on the magnetic rack, add 500 ul of 70% ethanol to each tube. Freshly prepared ethanol gives better results.
6) Let stand for 1 min to allow the beads to settle, then remove the ethanol.
7) Repeat steps 5) and 6) once.
8) Heat on a heat block at 37 °C for 5 min, or until the residual ethanol in the tube has evaporated completely. Note: do not heat until the bead surface cracks; over-dried beads markedly reduce elution efficiency.
9) Add 50 ul of RNase-free water, mix on a vortex mixer, and leave at room temperature for 2 min.
10) Place the EP tube on the magnetic rack and let it stand for about 2-3 min until the solution becomes clear.
11) Transfer about 50 ul of supernatant to a new 1.5 ml tube. The beads can be discarded after this step. If the next step is not carried out immediately, store the sample in a -20 °C freezer.
Step 3: End repair
1) Prepare the reaction mixture on ice using the SureSelect Library Prep Kit, ILM.
2) Prepare the reaction mixture in PCR tubes (or strip tubes or a PCR plate) according to the formula in Table 3 below, and mix.
3) Add 52 ul of reaction mix to each PCR tube (or well).
4) Add 48 ul of DNA sample to each PCR tube (or well) and mix by pipetting.
5) Place in a thermal cycler and incubate at 20 °C for 30 min; do not use the heated lid.
Table 3:
Step 4: Purify the DNA sample with Agencourt AMPure XP magnetic beads (same procedure as Step 2)
Step 5: A-tailing of the DNA fragment ends
1) Using the SureSelect Library Prep Kit, ILM, prepare the reaction mixture on ice according to the formula in Table 4 below.
Table 4:
2) Place in a thermal cycler and incubate at 37 °C for 30 min. If the heated lid is used, make sure its temperature does not exceed 50 °C.
Step 6: Purify the DNA sample with Agencourt AMPure XP magnetic beads (same procedure as Step 2)
Step 7: Ligate adapters carrying specific tags
1) Prepare the adapter ligation reaction mixture according to the formula in Table 5;
Table 5:
2) Place in a thermal cycler and incubate at 20 °C for 15 min. Do not use the heated lid. If the next step is not carried out immediately, store the sample in a -20 °C freezer.
Step 8: Purify the DNA sample with Agencourt AMPure XP magnetic beads (same procedure as Step 2)
Step 9: Amplify the adapter-ligated library
1) Use only 1/3 of the adapter-ligated library for amplification; store the remainder in a -20 °C freezer.
2) Prepare the PCR reaction mixture according to the formula in Table 6:
Table 6:
Note: the amount of DNA library added can also be 250 ng (quantified with a Bioanalyzer DNA 1000 chip).
3) Place in the thermal cycler and run the PCR program set according to Table 7 below.
Table 7:
Step 10: Purify the DNA sample with Agencourt AMPure XP magnetic beads (same procedure as Step 2)
Experiment 4: Liquid-phase capture
Step 1: Library hybridization
This part comprises the following: the prepared library is hybridized with the hybridization reagents, the blocking reagents, and the SureSelect capture probe library. Each DNA library must be hybridized and captured individually, after which the index is introduced by PCR.
Each library undergoes one hybridization and one capture; do not pool samples at this step. Hybridization requires 750 ng of DNA input in a maximum volume of 3.4 ul. The procedure is as follows:
1) Prepare the hybridization buffer at room temperature according to the formula in Table 8 below.
Table 8:
2) Prepare the SureSelect capture library mix for target capture in a PCR plate; keep the tubes on ice. For each sample, add the appropriate amount of SureSelect Capture Library according to the size (Mb) of the target region, in the proportions shown in Table 9 below, and dilute the SureSelect RNase Block with RNase-free water as indicated in Table 9. Prepare enough dilution for all sample reactions, with some excess. Add the SureSelect RNase Block dilution as indicated in Table 9 and mix by pipetting.
Table 9:
3) Prepare enough SureSelect Block Mix for all sample reactions according to Table 10.
Table 10:
4) In another PCR plate, process the prepared library for target capture.
A. Arrange the samples in rows A and B; add 3.4 μl of 221 ng/μl library to each well of row B.
B. Add 5.6 μl of SureSelect Block Mix to each well of row B. Mix by pipetting up and down.
C. Seal each sample well with a cap and place the plate in the PCR instrument.
D. Run the program shown in Table 11:
Table 11:
| Step | Temperature | Time |
| 1 | 95℃ | 5min |
| 2 | 65℃ | Constant temperature |
5) During the 65 °C incubation, use a heated lid at 105 °C.
6) Keeping the PCR plate at 65 °C, add 40 μl of hybridization buffer to each well of row A of the 96-well plate; the number of wells filled equals the number of libraries in row B of that plate. Note: Make sure the PCR plate has been incubated at 65 °C for at least 5 min before proceeding to step 10).
7) Add the capture library mix prepared in step 2) to the PCR plate:
A. Keeping the PCR plate at 65 °C, add 7 μl of capture library mix to the wells of row C of the 96-well plate.
B. Seal the wells with strip caps, making sure the seal is tight.
C. Incubate at 65 °C for 2 min.
8) Keeping the PCR plate at 65 °C, use a multichannel pipette to transfer 13 μl of hybridization buffer from row A into the capture library mix in row C.
9) Keeping the PCR plate at 65 °C, use a multichannel pipette to transfer the entire library mixture from row B into the hybridization solution in row C. Pipette slowly up and down 8-10 times to mix thoroughly. The volume of the hybridization mixture is now approximately 27-29 μl, depending on how much volume was lost to evaporation during the preceding incubation.
10) Seal with strip caps or double adhesive film, making sure all wells are tightly sealed.
Note: Use new strip caps or sealing film; used ones lose integrity during heating. If strip tubes are used, check evaporation in a preliminary experiment before first use to make sure the evaporated volume does not exceed 3-4 μl.
11) Incubate the hybridization mixture at 65 °C for 24 h with a heated lid at 105 °C.
Step two: Prepare the magnetic beads
This step uses reagents from SureSelect Target Enrichment Kit Box #1: SureSelect Binding Buffer and SureSelect Wash 2.
1) Preheat SureSelect Wash 2 to 65 °C in a water bath or on a heat block, for use in Step three.
2) The magnetic beads settle during storage; vortex vigorously to resuspend the Dynabeads MyOne Streptavidin T1.
3) For each hybridization, transfer 50 μl of Dynabeads MyOne Streptavidin T1 into a 1.5 ml centrifuge tube.
4) Wash the beads:
A. Add 200 μl of SureSelect Binding Buffer and vortex for 5 s.
B. Place the tube on a magnetic rack and remove the supernatant once the solution has cleared.
C. Repeat steps A-B twice, for a total of 3 washes.
5) Resuspend the beads in 200 μl of SureSelect Binding Buffer.
Step three: Capture and elution
This step uses reagents from SureSelect Target Enrichment Kit Box #1: SureSelect Wash 1 and SureSelect Wash 2.
1) After the 24-hour incubation, estimate (with a pipette) and record the volume of the remaining hybridization mixture.
2) Keeping the PCR plate at 65 °C, add the hybridization mixture directly to the bead solution and invert 3-5 times to mix.
Note: If excessive evaporation during the 24 h hybridization leaves less than 20 μl, the subsequent capture efficiency will be affected.
3) Place the mixture on a nutator and mix at room temperature for 30 min.
4) Centrifuge briefly.
5) Place the tube on a magnetic rack, let it stand until the solution clears, and remove the supernatant.
6) Add 500 μl of SureSelect Wash 1 and vortex for 5 s to resuspend the beads.
7) Leave at room temperature for 15 min, vortexing several times during this period.
8) Centrifuge briefly.
9) Place the tube on a magnetic rack, let it stand until the solution clears, and remove the supernatant.
10) Wash the beads:
A. Add 500 μl of SureSelect Wash 2 preheated to 65 °C and vortex for 5 s to resuspend the beads.
B. Incubate at 65 °C for 10 min in a water bath or on a heat block, vortexing several times during this period.
C. If the beads have settled, invert the tube a few times to resuspend them.
D. Centrifuge briefly.
E. Place the tube on a magnetic rack, let it stand until the solution clears, and remove the supernatant.
F. Repeat steps A-E twice, for a total of 3 washes. Make sure all of the wash buffer is removed.
G. Add 30 μl of nuclease-free water and vortex for 5 s to resuspend the beads.
Experiment five: Post-hybridization PCR amplification and introduction of the index
This part comprises the following steps: introducing the index by PCR amplification, purifying the PCR product, and library quality control.
Step one: Introduce the index by PCR amplification
Reagents used in this step:
·Herculase II Fusion DNA Polymerase(Agilent)
·SureSelect Target Enrichment Kit ILM Indexing Hyb Module Box#2
·SureSelect Library Prep Kit,ILM
Note: Do not use any PCR enzyme other than Herculase II Fusion DNA Polymerase; the performance of other enzymes has not been validated.
1) Run one PCR reaction per hybridization, plus one negative control (no template).
2) Place the samples on ice and proceed as follows:
A. Prepare the reaction mix according to the formulation in Table 12 below and mix well.
B. Add 35 μl of reaction mix to each PCR tube (or well).
C. Take PCR Primer Index 1 through Index 16 (clear caps) from the kit "SureSelect Library Prep Kit, ILM", add 1 μl of the appropriate index to each well, and mix by pipetting.
D. Use different index primers for different samples that will be sequenced on the same lane.
E. Pipette each DNA sample up and down to make sure the bead solution is homogeneous.
F. Transfer 14 μl of each sample into the corresponding PCR tube (or well) and mix by pipetting up and down.
Table 12: Herculase II Master Mix formulation
* Taken from the Herculase II Fusion DNA Polymerase (Agilent) kit. Do not use buffer or dNTP mix from other kits.
a Taken from the kit: SureSelect Target Enrichment Kit ILM Indexing Hyb Module Box #2.
b Use 1 of the 16 primers in the SureSelect Library Prep Kit, ILM.
3) Place the PCR tubes in the PCR instrument for amplification; the amplification program is shown in Table 13 below:
Table 13:
Step two: Purify the DNA sample with Agencourt AMPure XP magnetic beads (same as Step two of Experiment three)
Experiment six: High-throughput sequencing
Step one: Dilute and denature the library
1) Prepare 0.2 N NaOH for denaturation: add 200 μL of 1.0 N NaOH to 800 μL of pure water to obtain a 0.2 N NaOH solution.
2) Dilute each library to 2 nM and pool according to the amount of data required from each library, obtaining a 2 nM library dilution.
3) Take 10 μL of the 2 nM library dilution and add an equal volume (10 μL) of 0.2 N NaOH. Pipette 3 times to mix and start a 5 min timer. Mix by shaking during this period: shake for 10 s, centrifuge, and repeat the shake-and-centrifuge operation twice.
4) After 5 min of denaturation, add 970 μL of HT1 to the denatured library and shake to mix the library solution.
5) Centrifuge at 280 × g for 1 min to obtain a 20 pM denatured library.
6) Dilute the 20 pM denatured library to 3 pM for loading: add 450 μL of the denatured library solution to 2550 μL of pre-chilled HT1, invert several times to mix, and centrifuge to obtain 3 mL of 3 pM denatured library.
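The library concentrations quoted in steps 3) to 6) follow from simple dilution arithmetic. The sketch below is only a worked check. The helper name `diluted_concentration` is hypothetical, and the second calculation starts from the nominal 20 pM figure given in the text rather than from the exact value produced by the first.

```python
def diluted_concentration(stock_conc, stock_vol, added_vol):
    """Return the concentration after adding diluent: C2 = C1 * V1 / (V1 + V_added).

    Concentration units are preserved; both volumes must share one unit (here microlitres).
    """
    return stock_conc * stock_vol / (stock_vol + added_vol)


# Steps 3)-5): 10 uL of 2 nM library (2000 pM) plus 10 uL NaOH plus 970 uL HT1.
denatured_pm = diluted_concentration(2000.0, 10.0, 10.0 + 970.0)

# Step 6): 450 uL of the nominal 20 pM denatured library into 2550 uL HT1.
loading_pm = diluted_concentration(20.0, 450.0, 2550.0)

print(f"denatured library: {denatured_pm:.2f} pM")
print(f"loading library:   {loading_pm:.2f} pM")
```

The first dilution gives slightly more than 20 pM because the final volume is 990 μL rather than 1000 μL; the text rounds this to 20 pM.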
Step two: Load the sequencer
1) Prepare the reagent cartridge (Reagent Cartridge): thaw it, inspect it, and add sodium hypochlorite. Prepare the flow cell: equilibrate it to room temperature, unpack it, and inspect it.
2) Prepare the reagent cartridge: first thaw the reagent cartridge, then check its large reservoirs to confirm that the reagents are completely thawed.
(1) Thaw the reagent cartridge: The reagent cartridge can be thawed overnight at 2-8 °C. At this temperature the reagents need at least 18 h to thaw completely and can be kept for one week. ① Remove the reagent cartridge from storage at -25 °C to -15 °C; ② Place the cartridge in a room-temperature water bath deep enough to submerge its base. Note: The water must not reach the top of the cartridge. ③ Thaw the reagents in the room-temperature water bath for about 60 min, until completely thawed. ④ Remove the cartridge from the water bath and tap it gently on the bench to dislodge water from the base, then dry the base.
(2) Inspect the reagent cartridge: ① Invert the cartridge 5 times to mix the thawed reagents. ② Check reservoirs 29, 30, 31 and 32 on the underside of the cartridge to make sure the reagents in these reservoirs are completely thawed. ③ Tap the cartridge gently on the bench to dislodge air bubbles from the reagents.
(3) Add fresh NaOCl: To avoid contamination of the instrument by the previous run, add diluted NaOCl to the Reagent Cartridge before loading it into the NextSeq 500. Illumina recommends diluting 3%-6% NaOCl to 0.03%-0.06%. Note: Use the prepared NaOCl within 24 h. ① Prepare 2 mL of 0.03%-0.06% NaOCl by combining 20 μL of 3%-6% NaOCl with 1980 μL of pure water. ② Invert the centrifuge tube several times to mix; ③ Wipe the foil seal of position 28 clean with a paper towel; ④ Pierce the foil seal of position 28 with a clean 1 mL pipette tip; ⑤ Add the 2 mL of 0.03%-0.06% NaOCl to position 28.
(4) Prepare the flow cell: Remove the flow cell from 2-8 °C storage, open the package, wipe the flow cell clean, and set it aside for loading.
3) Add the library dilution to position 10 of the reagent cartridge.
4) Select Sequence on the software interface to start the run setup steps.
5) Load the flow cell.
6) Empty the waste tray and put it back. Load the buffer cartridge and the reagent cartridge.
7) Review the run parameters and the pre-run check results, then start the run.
8) Monitor the run with the NCS and SAV software.
Experiment seven: Data analysis
Step one: Data filtering
Raw sequencing data are stored in fastq file format (file name: *.fq). Before further analysis the data must be filtered as follows:
(1) Remove sequences (reads) that contain adapter sequence;
(2) When the proportion of N bases in a single-end read exceeds 10% of the read length, remove the read pair (paired reads);
(3) When the number of low-quality (Q<=5) bases in a single-end read exceeds 50% of the read length, remove the read pair (paired reads).
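The three filtering rules above can be sketched as a small predicate over a read pair. This is a minimal illustration under stated assumptions, not the pipeline actually used: the `Read` type, the function names and the adapter string are hypothetical, base qualities are assumed to be already decoded to Phred integers, and dropping the whole pair when either mate contains adapter sequence is an assumption (the text states pair removal only for rules (2) and (3)).

```python
from typing import NamedTuple, Sequence


class Read(NamedTuple):
    seq: str                # base calls
    quals: Sequence[int]    # Phred quality per base (assumed already decoded)


# Hypothetical adapter sequence for illustration; the source does not name one.
ADAPTER = "AGATCGGAAGAGC"


def read_fails(read, adapter=ADAPTER, max_n_frac=0.10, low_q=5, max_low_q_frac=0.50):
    """Return True if a single read violates any of the three filtering rules."""
    length = len(read.seq)
    if length == 0:
        return True
    if adapter in read.seq:                                # rule (1): adapter present
        return True
    if read.seq.upper().count("N") / length > max_n_frac:  # rule (2): more than 10% N
        return True
    low_quality = sum(1 for q in read.quals if q <= low_q)
    return low_quality / length > max_low_q_frac           # rule (3): more than 50% low quality


def keep_pair(read1, read2):
    """Keep a read pair only if neither mate fails; one bad mate drops the pair."""
    return not (read_fails(read1) or read_fails(read2))
```

For example, a pair in which one 100-base mate carries 11 N bases is discarded, while a pair whose worst mate carries exactly 10 N bases is kept, because the rule is "exceeds 10%".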
Step two: Sequence alignment and quality control
Strict filtering of the sequencing data yields high-quality sequences (clean data). These sequences are aligned to the NCBI build 37/hg19 reference genome with BWA (Burrows-Wheeler Alignment tool); duplicates are removed from the alignment results with picard (http://broadinstitute.github.io/picard/), and sequences with more than 5 mismatched bases are filtered out.
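The mismatch filter in this step could be applied to SAM-format alignment records along the following lines. This is a hedged sketch only: it assumes the alignments carry the optional `NM:i:` tag (edit distance to the reference) and uses that as a stand-in for the number of mismatched bases, although NM also counts inserted and deleted bases. Dropping records that lack the tag is likewise an assumption, and the function names are hypothetical.

```python
def edit_distance(sam_line):
    """Return the NM:i value of a SAM record, or None if the tag is absent."""
    fields = sam_line.rstrip("\n").split("\t")
    for tag in fields[11:]:          # optional tags start at column 12
        if tag.startswith("NM:i:"):
            return int(tag[5:])
    return None


def filter_sam_lines(lines, max_mismatches=5):
    """Yield header lines and records whose NM value is at most max_mismatches."""
    for line in lines:
        if line.startswith("@"):     # SAM header lines pass through unchanged
            yield line
            continue
        nm = edit_distance(line)
        if nm is not None and nm <= max_mismatches:
            yield line
```

Working on plain SAM text keeps the sketch free of third-party dependencies; in practice the same test would more likely be applied through a BAM library or an existing filtering tool.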
Step three: Pathogenic mutation analysis of the target sequences
3.1 SNP and InDel analysis comprises the following steps:
(1) Use the software SOAP (http://soap.genomics.org.cn/) to map the sequenced reads to the corresponding positions in the human genome;
(2) Count the coverage depth at SNP and InDel sites and remove sites with a coverage depth below 30.
(3) Using the information from the disease and normal samples, select sites with a population frequency below 2% for further analysis, predict protein function with the SIFT software, and take the resulting sites as candidate pathogenic sites for the disease.
(4) Annotate the mutation sites by combining dbSNP (http://www.ncbi.nlm.nih.gov/projects/SNP/), HGMD (www.hgmd.cf.ac.uk), ClinVar (http://www.ncbi.nlm.nih.gov/clinvar/) and LOVD InSiGHT (http://insight-group.org/lovd.html). The candidate pathogenic sites obtained from the analysis are shown in Table 14 below.
Table 14:
Notes:
"-": No change was detected, or no relevant information is available.
Heterozygous: The two alleles at the same site have different genotypes.
Homozygous: The two alleles at the same site have the same genotype.
Nonsense mutation: A base change converts a codon encoding an amino acid into a stop codon, so that peptide chain synthesis terminates prematurely.
Missense mutation: A base substitution converts a codon encoding one amino acid into a codon encoding another amino acid, changing the amino acid type and sequence of the polypeptide chain.
Splice site: A variation that may affect the transcription of the gene into messenger RNA.
Insertion mutation: Nucleotides are inserted into the genome, changing the coding of the gene.
Deletion mutation: Several nucleotides are lost from the genome, changing the coding of the gene.
3.2 Large-fragment deletion analysis comprises the following steps:
(1) Divide into windows: 100 bp is chosen as the window size for the analysis, and longer target regions are divided into windows of 100 bp. To avoid overly short windows, target regions shorter than 160 bp are not divided.
(2) Use the DepthOfCoverage module of GATK (The Genome Analysis Toolkit) to count the sequencing reads of the target sample and of the control sample group in each window, and normalize both. The normalization formula is:
$$\text{normalized sequence number of a window} = \frac{\text{raw sequence number of the window} \times 1000}{\text{total sequence number of all windows}}$$
(3) Using the normalized sequence numbers, calculate the standard deviation of the control samples' sequence numbers in each window and denote it Sd. Calculate the median of the control samples' sequence numbers in each window and denote it Med.
(4) For a given window, compute the difference between the normalized sequence number of the test sample and the median of the control samples to obtain the degree of deviation from the median; a deletion mutation is called when the deviation exceeds 3*Sd. The deletion criterion is as follows:
$$Z_i = \frac{\text{normalized sequence number of the test sample in window } i - \mathrm{Med}_i}{\mathrm{Sd}_i}$$
A deletion is judged to have occurred in the i-th window when $Z_i$ is greater than 3.
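Steps (1) to (4) can be sketched in a few functions. This is an illustrative sketch under stated assumptions rather than the implementation used here: all function names are hypothetical, raw per-window counts are assumed to be already available (for example from a depth-of-coverage tool), a trailing window shorter than 100 bp is simply kept as its own window, the sample standard deviation is used, windows with zero standard deviation are skipped, and the sketch compares the magnitude of Z with the threshold because the text does not state a sign convention.

```python
from statistics import median, stdev


def split_region(start, end, window=100, min_split=160):
    """Split the half-open region [start, end) into windows.

    Regions shorter than min_split are returned undivided, as in step (1).
    """
    if end - start < min_split:
        return [(start, end)]
    return [(s, min(s + window, end)) for s in range(start, end, window)]


def normalize(counts, scale=1000):
    """Step (2): raw count * 1000 / total count over all windows."""
    total = sum(counts)
    return [c * scale / total for c in counts]


def z_scores(test_counts, control_counts):
    """Steps (3)-(4): Z_i = (normalized test count - Med_i) / Sd_i per window.

    control_counts holds one list of raw per-window counts per control sample
    (at least two controls are needed for a standard deviation).
    """
    test = normalize(test_counts)
    controls = [normalize(c) for c in control_counts]
    scores = []
    for i, value in enumerate(test):
        column = [c[i] for c in controls]
        med, sd = median(column), stdev(column)
        # Windows where the controls show no spread cannot be scored.
        scores.append((value - med) / sd if sd > 0 else None)
    return scores


def flag_windows(scores, threshold=3.0):
    """Return indices of windows whose deviation magnitude exceeds the threshold.

    Using abs() is an assumption of this sketch, not a statement of the source.
    """
    return [i for i, z in enumerate(scores) if z is not None and abs(z) > threshold]
```

With three control samples and one test sample, `flag_windows(z_scores(test, controls))` returns the indices of the windows whose normalized count deviates from the control median by more than three standard deviations.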
The genes in which a deletion mutation was detected by the above method are shown in Table 15 below:
Table 15:
| Chromosome | Genomic locations | Fragment length | Gene | Variable region | Variation type |
| 5 | 112043353-112198302 | 154949bp | APC | Exon region | Large fragment deletion |
The above method was compared with two existing detection methods: one that calculates the degree of deviation against the mean, and one that calls a deletion when the ratio to the median is below 0.6. The comparison results are shown in Table 16 below:
Table 16:
| / | Detection rate | Positive predictive value (PPV) |
| The present invention | 13.54% | 100% |
| Deviation calculated against the mean | 15.63% | 86.67% |
| Ratio to the median below 0.6 | 20.8% | 65% |
As can be seen from Table 16, compared with the prior-art methods, the method of the present invention reduces the share of false positives within the detection rate, so that the positive predictive value reaches 100%. This shows that the accuracy of positive prediction is significantly improved by the method of the present invention.
Experiment eight: Verification
To verify the accuracy of the above large-fragment deletion result, the mutation in the carrier with the positive result was analyzed by the DHPLC method. The experimental steps are as follows:
1: Design primers for the target sites with the software Primer Premier 5.0 (www.premierbiosoft.com/primerdesign). Details are shown in Table 17:
Table 17:
2: PCR amplification. The amplification system is shown in Table 18 below and the amplification program in Table 19 below.
Table 18:
| PCR reaction component | Amount per reaction |
| 10×Buffer I | 5 μl |
| 2.5 mM dNTP | 4 μl |
| Primer set | 10 μl |
| HS Taq enzyme (5 U/μl) | 0.4 μl |
| DNA | 2.0 μl |
| ddH2O | Make up to 50 μl |
Table 19:
3. Sequencing of the PCR products
Take 1 μl of PCR product for detection by 2.0% agarose gel electrophoresis, and send it for sequencing.
4. The sequencing result is shown in Fig. 1.
In Fig. 1, the DHPLC result analysis mainly examines the peak areas of the amplification products. Because the peak area is approximately base (peak width) × height (peak height) / 2, the amount of PCR product (i.e. the copy number) can be judged indirectly from the peak heights of the test sample and the standard control. Each peak in Fig. 1 is an individually designed product; apart from one standard reference gene, all are different exons of the gene under test. Once the reference gene peaks are aligned (peak base and peak height), the heights of the other products (peaks) in the test sample and the standard control show the copy number differences of the different exons. The peak heights of the test sample and the control in Fig. 1 show that APC in the test sample carries a large-fragment deletion, which is consistent with the next-generation sequencing result. This demonstrates the validity and accuracy of the sequencing method of the present invention.
As can be seen from the above description, the above embodiments of the present invention achieve the intended technical effects. Counting sequence numbers after cutting the sequencing data of the test sample and the control samples into windows makes it possible to choose the window size flexibly according to the sequencing depth of the data and the size of the target deletion fragment, which widens the size range of detectable deletion fragments. Furthermore, whether a given window carries a deletion mutation is determined from the ratio of the second sequence number of the test sample in each window to the median of the control samples, that is, with the median of the control samples as the standard of comparison. Compared with using the mean and standard deviation as the standard of comparison, this makes false positives easier to distinguish and the result more accurate, because when a copy number variation is present in a given window, a determination based on the mean and standard deviation impairs the accuracy of the result.
It should be noted that the steps illustrated in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in an order different from the one given here.
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above may be implemented with a general-purpose computing device. They may be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they may be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; alternatively, they may each be fabricated as an individual integrated circuit module, or several of the modules or steps may be fabricated as a single integrated circuit module. The present invention is therefore not limited to any specific combination of hardware and software.
The foregoing describes only the preferred embodiments of the present invention and is not intended to limit the present invention; for those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Sequence listing
<110>Tianjin Nuo Hezhi sources bio information Science and Technology Ltd.
<120>The method and apparatus of detection gene mutation
<130> PN41432NHZY
<160> 14
<170> PatentIn version 3.5
<210> 1
<211> 20
<212> DNA
<213>Synthetic
<400> 1
tcgggaagcg gagagagaag 20
<210> 2
<211> 20
<212> DNA
<213>Synthetic
<400> 2
agacagtgcg agggaaaacc 20
<210> 3
<211> 20
<212> DNA
<213>Synthetic
<400> 3
atttaccagt gagggacggg 20
<210> 4
<211> 20
<212> DNA
<213>Synthetic
<400> 4
acgcttttga gggttgattc 20
<210> 5
<211> 20
<212> DNA
<213>Synthetic
<400> 5
taaggtgcgt gctttgagag 20
<210> 6
<211> 21
<212> DNA
<213>Synthetic
<400> 6
acatcctgag ggtaaggcta a 21
<210> 7
<211> 25
<212> DNA
<213>Synthetic
<400> 7
tgactgtaat attctaagtc ctacc 25
<210> 8
<211> 20
<212> DNA
<213>Synthetic
<400> 8
gagattctga agttgagcgt 20
<210> 9
<211> 22
<212> DNA
<213>Synthetic
<400> 9
cacaacatca ttcactcaca gc 22
<210> 10
<211> 22
<212> DNA
<213>Synthetic
<400> 10
tacttggatt tttgtcctgg tc 22
<210> 11
<211> 25
<212> DNA
<213>Synthetic
<400> 11
tgacaaagga agaacagata gcaaa 25
<210> 12
<211> 22
<212> DNA
<213>Synthetic
<400> 12
aagcctgggt gacagagtga ga 22
<210> 13
<211> 19
<212> DNA
<213>Synthetic
<400> 13
tgttgactcg atccacccc 19
<210> 14
<211> 21
<212> DNA
<213>Synthetic
<400> 14
tgagctgcaa gtttggctga a 21