Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to those embodiments. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort shall fall within the protection scope of the present invention. In addition, the apparatus and reagents used in the following examples are commercially available unless otherwise specified.
Example 1
This example provides a flanking sequence of the insertion site of GhFPF1 transgenic cotton, namely a left flanking sequence. The nucleotide sequence of the left flanking sequence is shown as SEQ ID NO. 2 in the sequence listing, and its length is 937 bp; positions 1 to 501 of the left flanking sequence are derived from the genome sequence of Zhongmian 24, and positions 502 to 937 are derived from the exogenous insert sequence of GhFPF1 transgenic cotton. Here, the left flanking sequence refers to the left border flanking sequence of the exogenous insert of GhFPF1 transgenic cotton.
Example 2
This example provides a flanking sequence of the insertion site of GhFPF1 transgenic cotton, namely a right flanking sequence. The nucleotide sequence of the right flanking sequence is shown as SEQ ID NO. 3 in the sequence listing, and its length is 967 bp; positions 1 to 466 of the right flanking sequence are derived from the exogenous insert sequence of GhFPF1 transgenic cotton, and positions 467 to 967 are derived from the genome sequence of Zhongmian 24. Here, the right flanking sequence refers to the right border flanking sequence of the exogenous insert of GhFPF1 transgenic cotton.
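The positions given in Examples 1 and 2 are 1-based and inclusive. As a minimal sketch (the dictionary layout and the helper names `segment_lengths` and `is_contiguous` are illustrative assumptions, not part of the described method), the segment lengths can be checked against the stated total lengths as follows:

```python
# Illustrative check of the 1-based, inclusive segment coordinates stated in
# Examples 1 and 2. All names here are hypothetical.

FLANKS = {
    "left (SEQ ID NO. 2)": {
        "total": 937,
        "segments": [("genome", 1, 501), ("insert", 502, 937)],
    },
    "right (SEQ ID NO. 3)": {
        "total": 967,
        "segments": [("insert", 1, 466), ("genome", 467, 967)],
    },
}


def segment_lengths(segments):
    """Return {origin: length} for 1-based inclusive (origin, start, end) tuples."""
    return {origin: end - start + 1 for origin, start, end in segments}


def is_contiguous(segments, total):
    """True if the segments tile positions 1..total with no gap or overlap."""
    expected_start = 1
    for _, start, end in segments:
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start == total + 1


for name, flank in FLANKS.items():
    lengths = segment_lengths(flank["segments"])
    ok = is_contiguous(flank["segments"], flank["total"])
    print(name, lengths, "contiguous:", ok)
```

With these coordinates, each flanking sequence contains 501 bp of genomic sequence, joined to 436 bp (left) or 466 bp (right) of insert sequence.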
In addition, the left flanking sequence and the right flanking sequence provided in Examples 1 and 2 above were obtained as follows:
(1) Extracting genomic DNA from GhFPF1 transgenic cotton plants: young leaves of a GhFPF1 transgenic cotton plant are placed in a centrifuge tube, frozen in liquid nitrogen, and ground by shaking in a sample grinder to obtain a sample; 800 µL of cetyltrimethylammonium bromide (CTAB) lysis buffer preheated to 65 °C is then added to the sample, together with β-mercaptoethanol to prevent oxidation; the mixture is mixed until no obvious layering remains, placed in a 65 °C water bath for 30 minutes, and gently inverted to mix once every 10 minutes; after the water bath, 800 µL of chloroform/isoamyl alcohol (24:1, v/v) is added and the tube is inverted repeatedly until no layering remains; the mixture is then centrifuged at 12000 rpm for 10 minutes at 4 °C; about 600 µL of supernatant is drawn off with a pipette tip whose end has been cut off and transferred to a new 2 mL centrifuge tube; 0.8 volume of ice-cold isopropanol is added, the tube is gently inverted several times until flocculent DNA appears and is centrifuged at 12000 rpm for 1 minute, and the isopropanol is poured off; the DNA is washed twice with 75% ethanol and once with absolute ethanol; the absolute ethanol is poured off and, after drying, 200 µL of ddH2O is added to dissolve the DNA, giving the genomic DNA of the GhFPF1 transgenic cotton plant, which is stored at 4 °C until use. The GhFPF1 transgenic cotton is the transgenic plant ZF667 of the Cotton Research Institute of the Chinese Academy of Agricultural Sciences, obtained by introducing the cDNA sequence of the GhFPF1 gene into the cotton variety Zhongmian 24 (national approval GS 08001-1997) by an Agrobacterium-mediated method; the structural map of the transformation vector pBI121-GhFPF1 is shown in figure 1.
(2) Beijing and kang biotechnology limited company was entrusted to carry out resequencing analysis of the genomic DNA of the GhFPF1 transgenic cotton plant. Specifically, after the genomic DNA passed quality inspection, the DNA was fragmented by mechanical shearing (ultrasonication); the fragmented DNA was then purified, end-repaired, A-tailed at the 3' end and ligated to sequencing adapters; fragments were size-selected by agarose gel electrophoresis and amplified by PCR to form a sequencing library; the constructed library first underwent library quality control, and libraries that passed quality control were sequenced on the Illumina HiSeq platform. The raw sequencing data were filtered to obtain Clean Data; the reads were then aligned to the designated reference genome, the Gossypium hirsutum (AD1) NAU assembly in the cotton genome database (http://www.cottonfgd.org/), and to the insert sequence, and the resulting Mapped Data were used to search for the exogenous insert. The search proceeds as follows: first, junction reads, that is, reads containing both reference genome sequence and insert sequence, are identified; then the insertion position and direction are determined from the alignment information of the junction reads; finally, these reads are clustered to determine the insertion position and length, and whether paired junction clusters exist. Two types of paired-end reads are extracted from the alignment results: in the first type, the read at one end aligns to the reference genome sequence and the read at the other end aligns to the insert sequence; in the second type, one part of the read at either end aligns to the reference genome sequence and the other part aligns to the insert sequence.
In addition, the reads were aligned to the reference genome with BWA, and all reads that could be aligned to the exogenous insert sequence were selected and locally assembled. The assembled contigs were compared with the exogenous insert sequence and with the reference genome using Blastn, the regions where contig sequences aligned to a chromosome were selected, and the BWA alignments in these regions were verified with IGV screenshots to obtain the insertion position of the exogenous insert. The insertion position of the exogenous insert of GhFPF1 transgenic cotton was verified to be position 63373235 on chromosome A12 of Zhongmian 24, and a 1255 bp sequence spanning the insertion site, upstream and downstream, was taken from the cotton reference genome; this nucleotide sequence, positions 63372735 to 63373989 of Zhongmian 24, is shown as SEQ ID NO. 1.
(3) PCR detection primers were designed according to the exogenous insert sequence of GhFPF1 transgenic cotton and the sequences upstream and downstream of the insertion site in the cotton reference genome. The nucleotide sequences of the primer pair for amplifying the left border flanking sequence of GhFPF1 transgenic cotton are shown as SEQ ID NO. 4 and SEQ ID NO. 5, namely 5'-GCACCCAGATAATACGGGCT-3' and 5'-CCTTCAACGTTGCGGTTCTG-3'; the nucleotide sequences of the primer pair for amplifying the right border flanking sequence are shown as SEQ ID NO. 6 and SEQ ID NO. 7, namely 5'-TCAAGCTCTAAATCGGGGGC-3' and 5'-AAGCATGCACCATGAAGCAAC-3'. PCR amplification was then carried out with each of the two primer pairs, using the genomic DNA of the GhFPF1 transgenic cotton plant as the DNA template, to obtain two groups of PCR amplification products. The PCR reaction system (10 µL) is: PCR premix, 5 µL; primer pair, 0.8 µL (0.4 µL of each of the 2 primers); DNA template, 2 µL; ddH2O, 2.2 µL. The PCR reaction conditions are: pre-denaturation at 94 °C for 2 min; 30 cycles of denaturation at 94 °C for 30 s, annealing at 55 °C for 30 s and extension at 72 °C for 30 s; final extension at 72 °C for 2 min; storage at 4 °C. The two groups of PCR amplification products were each checked by 1% agarose gel electrophoresis, the PCR products were purified with a gel recovery kit and submitted to the prokaryote biotechnology company for sequencing verification, and the sequencing results were compared with the exogenous insert sequence and the reference genome sequence, finally giving the left border flanking sequence (the left flanking sequence) and the right border flanking sequence (the right flanking sequence) of the exogenous insert of GhFPF1 transgenic cotton.
It should be noted that commercially available 2 × Taq MasterMix (Dye) can be used as the PCR premix.
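The two types of paired-end reads described in step (2) can be illustrated with a small sketch. It assumes that each mate has already been aligned and summarised as a list of (target, aligned bases) tuples, where the target is "genome" or "insert"; this record layout and the function names are hypothetical simplifications, not the output format of BWA or of the pipeline actually used.

```python
from typing import List, Tuple

# Each mate is summarised as a list of (target, aligned_bases) tuples,
# where target is "genome" or "insert". Hypothetical, simplified layout.
Mate = List[Tuple[str, int]]


def targets(mate: Mate) -> set:
    """Set of references that any part of this mate aligns to."""
    return {target for target, bases in mate if bases > 0}


def classify_pair(mate1: Mate, mate2: Mate) -> str:
    """Classify a read pair into the two types described in the text.

    "type2": at least one mate is split, part on genome and part on insert.
    "type1": one mate aligns only to the genome, the other only to the insert.
    "other": the pair gives no evidence of a genome/insert junction.
    """
    t1, t2 = targets(mate1), targets(mate2)
    both = {"genome", "insert"}
    if t1 == both or t2 == both:
        return "type2"
    if (t1, t2) in (({"genome"}, {"insert"}), ({"insert"}, {"genome"})):
        return "type1"
    return "other"


# Made-up example pairs (read length 150 is an arbitrary choice).
pairs = [
    ([("genome", 150)], [("insert", 150)]),                # discordant pair
    ([("genome", 90), ("insert", 60)], [("insert", 150)]),  # split mate
    ([("genome", 150)], [("genome", 150)]),                # ordinary pair
]
for mate1, mate2 in pairs:
    print(classify_pair(mate1, mate2))
```

Split mates are tested first because a read that itself crosses the junction locates the insertion position to the base, whereas a pair of the first type only brackets it.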
Example 3
This example provides primers for the specific identification of the flanking sequence of the insertion site of GhFPF1 transgenic cotton, designed according to the left flanking sequence provided in Example 1 above. Specifically, the primers comprise a first primer pair consisting of a first forward primer designed according to positions 1 to 501 of the left flanking sequence and a first reverse primer designed according to positions 502 to 937 of the left flanking sequence. The nucleotide sequence of the first forward primer is shown as SEQ ID NO. 4 in the sequence listing: 5'-GCACCCAGATAATACGGGCT-3'; the nucleotide sequence of the first reverse primer is shown as SEQ ID NO. 5 in the sequence listing: 5'-CCTTCAACGTTGCGGTTCTG-3'.
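Because the forward primer lies in the genomic part and the reverse primer in the insert part, a product can form only on a template in which the two parts are joined. The sketch below illustrates this principle with short made-up sequences; `GENOME`, `INSERT`, the toy primers and the function names are assumptions chosen for illustration and are not cotton, vector or primer sequences from this document. Only exact matches are considered.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon(template: str, forward: str, reverse: str):
    """Return (start, end) of the predicted product, 1-based inclusive, or None.

    The forward primer must match the template strand exactly and the
    reverse primer must match the opposite strand downstream of it.
    """
    fwd_at = template.find(forward)
    rev_at = template.find(reverse_complement(reverse))
    if fwd_at == -1 or rev_at == -1 or rev_at < fwd_at:
        return None
    return fwd_at + 1, rev_at + len(reverse)


# Toy sequences, NOT real cotton or vector sequence.
GENOME = "TTGACCATGGCTAAGCTTGCAT"
INSERT = "GGATCCGTACGTTAGCCGATAA"
JUNCTION_TEMPLATE = GENOME + INSERT                      # genome joined to insert
WILD_TYPE_TEMPLATE = GENOME + "ACGTACGTACGTACGTACGTAC"   # no insert present

FORWARD = "GACCATGGCT"                       # lies in the genomic part
REVERSE = reverse_complement("GTTAGCCGAT")   # binds within the insert part

for name, template in [
    ("junction", JUNCTION_TEMPLATE),
    ("wild type", WILD_TYPE_TEMPLATE),
    ("insert only", INSERT),
]:
    print(name, amplicon(template, FORWARD, REVERSE))
```

Only the junction template yields a product; the genome alone and the insert alone each bind just one of the two primers, which is what makes such a primer pair specific to the insertion event.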
Example 4
This example provides primers for the specific identification of the flanking sequence of the insertion site of GhFPF1 transgenic cotton, designed according to the right flanking sequence provided in Example 2 above. Specifically, the primers comprise a second primer pair consisting of a second forward primer designed according to positions 1 to 466 of the right flanking sequence and a second reverse primer designed according to positions 467 to 967 of the right flanking sequence. The nucleotide sequence of the second forward primer is shown as SEQ ID NO. 6 in the sequence listing: 5'-TCAAGCTCTAAATCGGGGGC-3'; the nucleotide sequence of the second reverse primer is shown as SEQ ID NO. 7 in the sequence listing: 5'-AAGCATGCACCATGAAGCAAC-3'.
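As a simple consistency check, the lengths of the four primers can be compared with the <211> entries of the sequence listing, and their GC content computed. This is a sketch only; the function name `gc_percent` and the dictionary layout are illustrative, and GC content is not a criterion stated in this document.

```python
# Primer sequences as given in Examples 3 and 4 (SEQ ID NO. 4 to 7).
PRIMERS = {
    "SEQ ID NO. 4": "GCACCCAGATAATACGGGCT",
    "SEQ ID NO. 5": "CCTTCAACGTTGCGGTTCTG",
    "SEQ ID NO. 6": "TCAAGCTCTAAATCGGGGGC",
    "SEQ ID NO. 7": "AAGCATGCACCATGAAGCAAC",
}

# Lengths declared in the <211> fields of the sequence listing.
DECLARED_LENGTH = {
    "SEQ ID NO. 4": 20,
    "SEQ ID NO. 5": 20,
    "SEQ ID NO. 6": 20,
    "SEQ ID NO. 7": 21,
}


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in an upper-case DNA string."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


for name, seq in PRIMERS.items():
    agrees = len(seq) == DECLARED_LENGTH[name]
    print(f"{name}: {len(seq)} nt, GC {gc_percent(seq):.1f}%, length agrees: {agrees}")
```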
Example 5
This example provides a kit for the specific identification of the flanking sequence of the insertion site of GhFPF1 transgenic cotton, comprising a PCR premix and the primers provided in Example 3 above. Commercially available 2 × Taq MasterMix (Dye) is used as the PCR premix.
Example 6
This example provides a kit for the specific identification of the flanking sequence of the insertion site of GhFPF1 transgenic cotton, comprising a PCR premix and the primers provided in Example 4 above. Commercially available 2 × Taq MasterMix (Dye) is used as the PCR premix.
Example 7
This example provides a method for detecting whether a sample to be tested contains GhFPF1 transgenic cotton, using the kit provided in Example 5 above, and specifically comprises the following steps:
(1) Extracting the genomic DNA of the sample to be tested as the DNA template, following the method described above for extracting genomic DNA from GhFPF1 transgenic cotton plants.
(2) Mixing 2 µL of DNA template, 5 µL of PCR premix, 0.4 µL of the first forward primer, 0.4 µL of the first reverse primer and 2.2 µL of ddH2O to obtain a PCR reaction solution.
(3) Placing the PCR reaction solution in a PCR instrument for PCR amplification to obtain a PCR reaction product; the PCR reaction conditions are: pre-denaturation at 94 °C for 2 min; 30 cycles of denaturation at 94 °C for 30 s, annealing at 55 °C for 30 s and extension at 72 °C for 30 s; final extension at 72 °C for 2 min; storage at 4 °C.
(4) Subjecting the PCR reaction product to 1% agarose gel electrophoresis and staining with a nucleic acid dye to determine whether a specific band is present; if a specific band is present in the PCR reaction product, the sample to be tested contains a GhFPF1 transgenic cotton component. Here, GhFPF1 transgenic cotton includes the parent line and its derived lines, as well as their plants, tissues, seeds and the like.
The method was used to test samples of GhFPF1 transgenic cotton leaf, GhFPF1 transgenic cotton seed, GhFPF1 transgenic cotton root, non-transgenic Zhongmian 24, Zhongmian 50, corn and wheat, with water as the negative control. The resulting gel electrophoresis pattern is shown in figure 2, where M is the DNA molecular weight standard (DL 5000), 1 is the leaf of GhFPF1 transgenic cotton, 2 is the seed of GhFPF1 transgenic cotton, 3 is the root of GhFPF1 transgenic cotton, 4 is non-transgenic Zhongmian 24, 5 is Zhongmian 50, 6 is corn, 7 is wheat, and 8 is water. As can be seen from the figure, only the leaf, seed and root samples of GhFPF1 transgenic cotton produce a specific amplification band, 301 bp in length, and none of the other samples produces a specific amplification band. This indicates that the detection method provided by this example can accurately determine whether a sample is GhFPF1 transgenic cotton or contains GhFPF1 transgenic cotton components.
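The decision rule in step (4) and the lane layout of figure 2 can be written out as a short sketch. The band calls below simply restate what the text reports for each sample; the rule that a band in the water control invalidates the run is an added assumption reflecting common practice, not a statement from this document, and all names are hypothetical.

```python
# Samples as described for figure 2. The booleans record whether the text
# reports a specific band for that sample.
REPORTED = {
    "GhFPF1 transgenic cotton leaf": True,
    "GhFPF1 transgenic cotton seed": True,
    "GhFPF1 transgenic cotton root": True,
    "non-transgenic Zhongmian 24": False,
    "Zhongmian 50": False,
    "corn": False,
    "wheat": False,
    "water": False,
}
NEGATIVE_CONTROL = "water"


def call_sample(band_present: bool) -> str:
    """Apply the rule of step (4) to a single sample."""
    if band_present:
        return "contains GhFPF1 transgenic cotton component"
    return "not detected"


def interpret_run(observed: dict) -> dict:
    """Return a call for every sample except the negative control."""
    # Assumption (not stated in the text): a band in the water control
    # means the run cannot be interpreted.
    if observed[NEGATIVE_CONTROL]:
        raise ValueError("negative control shows a band; repeat the run")
    return {
        sample: call_sample(band)
        for sample, band in observed.items()
        if sample != NEGATIVE_CONTROL
    }


calls = interpret_run(REPORTED)
for sample, call in calls.items():
    print(f"{sample}: {call}")
```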
Example 8
This example provides a method for detecting whether a sample to be tested contains GhFPF1 transgenic cotton, using the kit provided in Example 6 above, and specifically comprises the following steps:
(1) Extracting the genomic DNA of the sample to be tested as the DNA template, following the method described above for extracting genomic DNA from GhFPF1 transgenic cotton plants.
(2) Mixing 2 µL of DNA template, 5 µL of PCR premix, 0.4 µL of the second forward primer, 0.4 µL of the second reverse primer and 2.2 µL of ddH2O to obtain a PCR reaction solution.
(3) Placing the PCR reaction solution in a PCR instrument for PCR amplification to obtain a PCR reaction product; the PCR reaction conditions are: pre-denaturation at 94 °C for 2 min; 30 cycles of denaturation at 94 °C for 30 s, annealing at 55 °C for 30 s and extension at 72 °C for 30 s; final extension at 72 °C for 2 min; storage at 4 °C.
(4) Subjecting the PCR reaction product to 1% agarose gel electrophoresis and staining with a nucleic acid dye to determine whether a specific band is present; if a specific band is present in the PCR reaction product, the sample to be tested contains a GhFPF1 transgenic cotton component. Here, GhFPF1 transgenic cotton includes the parent line and its derived lines, as well as their plants, tissues, seeds and the like.
The method was used to test samples of GhFPF1 transgenic cotton leaf, GhFPF1 transgenic cotton seed, GhFPF1 transgenic cotton root, non-transgenic Zhongmian 24, Zhongmian 50, corn and wheat, with water as the negative control. The resulting gel electrophoresis pattern is shown in figure 3, where M is the DNA molecular weight standard (DL 5000), 1 is the leaf of GhFPF1 transgenic cotton, 2 is the seed of GhFPF1 transgenic cotton, 3 is the root of GhFPF1 transgenic cotton, 4 is non-transgenic Zhongmian 24, 5 is Zhongmian 50, 6 is corn, 7 is wheat, and 8 is water. As can be seen from the figure, only the leaf, seed and root samples of GhFPF1 transgenic cotton produce a specific amplification band, 556 bp in length, and none of the other samples produces a specific amplification band. This indicates that the detection method provided by this example can accurately determine whether a sample is GhFPF1 transgenic cotton or contains GhFPF1 transgenic cotton components.
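Examples 7 and 8 use the same 10 µL reaction and the same cycling programme. The sketch below scales the shared components to a batch of reactions and estimates the programmed run time. The 10% pipetting excess and the function names are assumptions introduced for illustration; the time estimate ignores temperature ramping and the final 4 °C hold.

```python
from decimal import Decimal

# Per-reaction volumes in µL, as given in step (2) of Examples 7 and 8.
PER_REACTION_UL = {
    "PCR premix": Decimal("5"),
    "forward primer": Decimal("0.4"),
    "reverse primer": Decimal("0.4"),
    "DNA template": Decimal("2"),
    "ddH2O": Decimal("2.2"),
}


def master_mix(n_reactions: int, excess: Decimal = Decimal("0.1")) -> dict:
    """Volumes (µL) of the shared components for n reactions.

    The DNA template is left out because it differs between samples.
    The default 10% excess is an assumed allowance for pipetting loss.
    """
    factor = n_reactions * (1 + excess)
    return {
        name: volume * factor
        for name, volume in PER_REACTION_UL.items()
        if name != "DNA template"
    }


def programme_seconds(cycles: int = 30) -> int:
    """Programmed time: 2 min at 94 °C, cycles of 3 x 30 s, 2 min at 72 °C."""
    pre_denaturation = 2 * 60
    per_cycle = 30 + 30 + 30
    final_extension = 2 * 60
    return pre_denaturation + cycles * per_cycle + final_extension


print("total per reaction (µL):", sum(PER_REACTION_UL.values()))
print("master mix for 8 reactions (µL):", master_mix(8))
print("programmed time (min):", programme_seconds() / 60)
```

Decimal arithmetic is used so that the component volumes sum to exactly 10 µL rather than to a binary floating-point approximation.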
In light of the foregoing description of the preferred embodiment of the present invention, many modifications and variations will be apparent to those skilled in the art without departing from the spirit and scope of the invention. The technical scope of the present invention is not limited to the content of the specification, and must be determined according to the scope of the claims.
Sequence listing
<110> Cotton research institute of Chinese academy of agricultural sciences
<120> flanking sequence of GhFPF1 transgenic cotton insertion site and specificity identification method thereof
<141> 2019-12-27
<160> 7
<170> SIPOSequenceListing 1.0
<210> 1
<211> 1255
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 1
caaagcatat gttaggcaat tgctttgttt atattagaac aatatgatgg aaaaagattc 60
cttctatttt atctatatat ttatttattt atttcaagta gtagaaagac tattacttaa 120
ttttatttaa aaaattactt aattaggttt aggttttgaa aataaaataa aaattaatac 180
attacataag cacttgaatt ttgtcttttt ttatttgaat aagcccttat atttgtatta 240
cattcaaata aactcttaaa ttagaattga tatccgaatt agctcttaat ctttgacttt 300
atcaaaataa cttactttta caccaatggt ttagcaccca gataatacgg gctcaaaccc 360
tattatcccc cttctccctc ctcttatatc gtaatgataa aaaaagctat ttgaaactat 420
taaaagcata agggtttatt taaataaagt gataaagttg aaggacttat ttagtatttt 480
aaccccaaaa ttgaaaaaca ccatctttta gcaaattgtt tcatgtaaaa ggcaaatata 540
caaacaactt aaagctcttt gagcatctta taatgaggct tctttgacct tagcaatgtc 600
ccaatctacc tatccaaaca atatcaaaat tataaattaa ttaaaaaata tatttttaat 660
taattatata aataattgaa tgctttaaag acttacaagg ccggtcggcc tctatgtttt 720
gtttttgttt ttttttttaa tgtttaggtt gcttttttat taatttaaat ttggttctta 780
tagttacatc cttttccaca caaagggaaa tttgttttcc tatgatttgg gtttatgaaa 840
agaaatatat aattattttg taaagttgct tcatggtgca tgctttaatt ggacaccatt 900
attaatcata attgcttttg aaattttagg tagctttggt ccttctttgg aagatatgga 960
aatataatat atatatattg tttttaatat tcaaaattat catatttgtc tcttgtatgt 1020
taaaattttt attccaaatg tttaacatac aaaaaaatat acttctcata ttgattggaa 1080
atttacaaaa gataatgtat ctgaattact ataagcagaa ttagaaagta caaatcggat 1140
ataacctaaa ccaatggcag aatgtgccat tgagtttttg aggggctact taaaattttt 1200
aaaattttca ggaatttaat taaatttttt taaaaatttt gagaggatta aggag 1255
<210> 2
<211> 937
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 2
caaagcatat gttaggcaat tgctttgttt atattagaac aatatgatgg aaaaagattc 60
cttctatttt atctatatat ttatttattt atttcaagta gtagaaagac tattacttaa 120
ttttatttaa aaaattactt aattaggttt aggttttgaa aataaaataa aaattaatac 180
attacataag cacttgaatt ttgtcttttt ttatttgaat aagcccttat atttgtatta 240
cattcaaata aactcttaaa ttagaattga tatccgaatt agctcttaat ctttgacttt 300
atcaaaataa cttactttta caccaatggt ttagcaccca gataatacgg gctcaaaccc 360
tattatcccc cttctccctc ctcttatatc gtaatgataa aaaaagctat ttgaaactat 420
taaaagcata agggtttatt taaataaagt gataaagttg aaggacttat ttagtatttt 480
aaccccaaaa ttgaaaaaca cctgatagtt taaactgaag gcgggaaacg acaatctgat 540
catgagcgga gaattaaggg agtcacgtta tgacccccgc cgatgacgcg ggacaagccg 600
ttttacgttt ggaactgaca gaaccgcaac gttgaaggag ccactcagcc gcgggtttct 660
ggagtttaat gagctaagca catacgtcag aaaccattat tgcgcgttca aaagtcgcct 720
aaggtcacta tcagctagca aatatttctt gtcaaaaatg ctccactgac gttccataaa 780
ttcccctcgg tatccaatta gagtctcata ttcactctca atccaaataa tctgcaccgg 840
atctggatcg tttcgcatga ttgaacaaga tggattgcac gcaggttctc cggccgcttg 900
ggtggagagg ctattcggct atgactgggc acaacag 937
<210> 3
<211> 967
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 3
ttcttccctt cctttctcgc cacgttcgcc ggctttcccc gtcaagctct aaatcggggg 60
ctccctttag ggttccgatt tagtgcttta cggcacctcg accccaaaaa acttgatttg 120
ggtgatggtt cacgtagtgg gccatcgccc tgatagacgg tttttcgccc tttgacgttg 180
gagtccacgt tctttaatag tggactcttg ttccaaactg gaacaacact caaccctatc 240
tcgggctatt cttttgattt ataagggatt ttgccgattt cggaaccacc atcaaacagg 300
attttcgcct gctggggcaa accagcgtgg accgcttgct gcaactctct cagggccagg 360
cggtgaaggg caatcagctg ttgcccgtct cactggtgaa aagaaaaacc accccagtac 420
attaaaaacg tccgcaatgt gttattaagt tgtctaagcg tcaatttttt attaatttaa 480
atttggttct tatagttaca tccttttcca cacaaaggga aatttgtttt cctatgattt 540
gggtttatga aaagaaatat ataattattt tgtaaagttg cttcatggtg catgctttaa 600
ttggacacca ttattaatca taattgcttt tgaaatttta ggtagctttg gtccttcttt 660
ggaagatatg gaaatataat atatatatat tgtttttaat attcaaaatt atcatatttg 720
tctcttgtat gttaaaattt ttattccaaa tgtttaacat acaaaaaaat atacttctca 780
tattgattgg aaatttacaa aagataatgt atctgaatta ctataagcag aattagaaag 840
tacaaatcgg atataaccta aaccaatggc agaatgtgcc attgagtttt tgaggggcta 900
cttaaaattt ttaaaatttt caggaattta attaaatttt tttaaaaatt ttgagaggat 960
taaggag 967
<210> 4
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 4
gcacccagat aatacgggct 20
<210> 5
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 5
ccttcaacgt tgcggttctg 20
<210> 6
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 6
tcaagctcta aatcgggggc 20
<210> 7
<211> 21
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 7
aagcatgcac catgaagcaa c 21