DNA sequencing

From Wikipedia, the free encyclopedia

Process of determining the nucleic acid sequence

DNA sequencing is the process of determining the nucleic acid sequence – the order of nucleotides in DNA. It includes any method or technology that is used to determine the order of the four bases: adenine, thymine, cytosine, and guanine. The advent of rapid DNA sequencing methods has greatly accelerated biological and medical research and discovery.^[1]^[2]

Knowledge of DNA sequences has become indispensable for basic biological research, DNA Genographic Projects, and numerous applied fields such as medical diagnosis, biotechnology, forensic biology, virology, and biological systematics. Comparing healthy and mutated DNA sequences can help diagnose diseases, including various cancers,^[3] characterize antibody repertoires,^[4] and guide patient treatment.^[5] A quick way to sequence DNA allows faster and more individualized medical care, and lets more organisms be identified and cataloged.^[4]

The rapid advancements in DNA sequencing technology have played a crucial role in sequencing complete genomes of humans, as well as numerous animal, plant, and microbial species.

An example of the results of automated chain-termination DNA sequencing

The first DNA sequences were obtained in the early 1970s by academic researchers using laborious methods based on two-dimensional chromatography. Following the development of fluorescence-based sequencing methods with a DNA sequencer,^[6] DNA sequencing has become easier and orders of magnitude faster.^[7]^[8]

Method	Read length	Accuracy (single read, not consensus)	Reads per run	Time per run	Cost per 1 billion bases (US$)	Advantages	Disadvantages
Single-molecule real-time sequencing (Pacific Biosciences)	30,000 bp (N50); maximum read length >100,000 bases^[97]^[98]^[99]	87% raw-read accuracy^[100]	4,000,000 per Sequel 2 SMRT cell, 100–200 gigabases^[97]^[101]^[102]	30 minutes to 20 hours^[97]^[103]	$7.2-$43.3	Fast. Detects 4mC, 5mC, 6mA.^[104]	Moderate throughput. Equipment can be very expensive.
Ion semiconductor (Ion Torrent sequencing)	up to 600 bp^[105]	99.6%^[106]	up to 80 million	2 hours	$66.8-$950	Less expensive equipment. Fast.	Homopolymer errors.
Pyrosequencing (454)	700 bp	99.9%	1 million	24 hours	$10,000	Long read size. Fast.	Runs are expensive. Homopolymer errors.
Sequencing by synthesis (Illumina)	MiniSeq, NextSeq: 75–300 bp; MiSeq: 50–600 bp; HiSeq 2500: 50–500 bp; HiSeq 3/4000: 50–300 bp; HiSeq X: 300 bp	99.9% (Phred30)	MiniSeq/MiSeq: 1–25 Million; NextSeq: 130-00 Million; HiSeq 2500: 300 million – 2 billion; HiSeq 3/4000 2.5 billion; HiSeq X: 3 billion	1 to 11 days, depending upon sequencer and specified read length^[107]	$5 to $150	Potential for high sequence yield, depending upon sequencer model and desired application.	Equipment can be very expensive. Requires high concentrations of DNA.
Combinatorial probe anchor synthesis (cPAS; BGI/MGI)	BGISEQ-50: 35–50 bp; MGISEQ 200: 50–200 bp; BGISEQ-500, MGISEQ-2000: 50–300 bp^[108]	99.9% (Phred30)	BGISEQ-50: 160M; MGISEQ 200: 300M; BGISEQ-500: 1300M per flow cell; MGISEQ-2000: 375M per FCS flow cell, 1500M per FCL flow cell	1 to 9 days, depending on instrument, read length, and number of flow cells run at a time	$5–$120
Sequencing by ligation (SOLiD sequencing)	50+35 or 50+50 bp	99.9%	1.2 to 1.4 billion	1 to 2 weeks	$60–130	Low cost per base.	Slower than other methods. Has issues sequencing palindromic sequences.^[109]
Nanopore sequencing	Dependent on library preparation, not the device, so the user chooses the read length (up to 2,272,580 bp reported^[110]).	~92–97% single read	Dependent on read length selected by user	Data streamed in real time; run length chosen by user, from 1 min to 48 hrs	$7–100	Longest individual reads. Accessible user community. Portable (palm-sized).	Lower throughput than other machines. Single-read accuracy in the 90s (percent).
GenapSys Sequencing	Around 150 bp single-end	99.9% (Phred30)	1 to 16 million	Around 24 hours	$667	Low-cost of instrument ($10,000)
Chain termination (Sanger sequencing)	400 to 900 bp	99.9%	N/A	20 minutes to 3 hours	$2,400,000	Useful for many applications.	More expensive and impractical for larger sequencing projects. This method also requires the time-consuming step of plasmid cloning or PCR.
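
The accuracy column above mixes percentages with Phred scores such as "Phred30". As a minimal sketch, assuming the standard Phred definition Q = -10 * log10(P), where P is the probability that a base call is wrong, the two can be converted as shown below. The function names are hypothetical, and the ASCII offset of 33 for FASTQ quality strings is an assumption that holds for Sanger-style encoding but not for some older formats.

```python
import math
from typing import List


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score Q to the probability of a wrong base call."""
    return 10 ** (-q / 10)


def accuracy_to_phred(accuracy: float) -> float:
    """Convert a per-base accuracy (strictly between 0 and 1) to a Phred score."""
    return -10 * math.log10(1 - accuracy)


def decode_fastq_quality(quality_string: str, offset: int = 33) -> List[int]:
    """Decode a FASTQ quality string into integer Phred scores.

    The default offset of 33 is an assumption (Sanger-style encoding).
    """
    return [ord(char) - offset for char in quality_string]


if __name__ == "__main__":
    print(phred_to_error_prob(30))       # error probability for Phred30
    print(accuracy_to_phred(0.999))      # Phred score for 99.9% accuracy
    print(decode_fastq_quality("!5?I"))  # per-base scores of a short quality string
```

Under this definition, Phred30 corresponds to an error probability of 1 in 1,000 (99.9% accuracy), while the 87% raw-read accuracy listed for single-molecule real-time sequencing corresponds to a score a little below 9.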

Name of algorithm	Type of algorithm
Cutadapt^[169]	Running sum
ConDeTri^[170]	Window based
ERNE-FILTER^[171]	Running sum
FASTX quality trimmer	Window based
PRINSEQ^[172]	Window based
Trimmomatic^[173]	Window based
SolexaQA^[174]	Window based
SolexaQA-BWA	Running sum
Sickle	Window based
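
The table distinguishes two families of quality-trimming algorithms. The sketch below illustrates one plausible form of each, assuming per-base Phred scores are already available as a list of integers. It is a simplified illustration, not the code of any tool listed above; the function names, the window size of 4, and the cutoff of 20 are assumptions chosen for the example.

```python
from typing import List


def trim_window(quals: List[int], window: int = 4, min_mean: float = 20) -> int:
    """Window-based trimming (simplified).

    Scan from the 5' end and return the number of leading bases to keep:
    the read is cut at the start of the first window whose mean quality
    falls below min_mean. Reads with no failing window are kept whole.
    """
    for start in range(0, len(quals) - window + 1):
        if sum(quals[start:start + window]) / window < min_mean:
            return start
    return len(quals)


def trim_running_sum(quals: List[int], cutoff: int = 20) -> int:
    """Running-sum trimming (simplified).

    Walk from the 3' end, accumulating (cutoff - quality). Return the
    number of leading bases to keep: the cut is placed where the running
    sum is largest, and the walk stops once the sum turns negative.
    """
    running = 0
    best = 0
    keep = len(quals)
    for i in range(len(quals) - 1, -1, -1):
        running += cutoff - quals[i]
        if running < 0:
            break
        if running > best:
            best = running
            keep = i
    return keep


if __name__ == "__main__":
    # Hypothetical read whose quality decays toward the 3' end.
    quals = [42, 40, 26, 27, 8, 7, 11, 4, 2, 3]
    print(trim_window(quals, window=4, min_mean=20))
    print(trim_running_sum(quals, cutoff=10))
```

The two approaches can disagree on the same read: the window version reacts as soon as a local average drops, whereas the running-sum version tolerates an isolated good base inside a poor 3' tail and looks for the cut point that removes the most below-cutoff quality overall.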

Applications

Molecular biology

Evolutionary biology

Metagenomics

Virology

Medicine

Forensic investigation

The four canonical bases

History

Discovery of DNA structure and function

RNA sequencing

Early DNA sequencing methods

Sequencing of full genomes

High-throughput sequencing (HTS) methods

Basic methods

Maxam-Gilbert sequencing

Chain-termination methods

Sequencing by synthesis

Large-scale sequencing and de novo sequencing

Shotgun sequencing

High-throughput methods

Long-read sequencing methods

Single molecule real time (SMRT) sequencing

Nanopore DNA sequencing

Short-read sequencing methods

Massively parallel signature sequencing (MPSS)

Polony sequencing

454 pyrosequencing

Illumina (Solexa) sequencing

Combinatorial probe anchor synthesis (cPAS)

SOLiD sequencing

Ion Torrent semiconductor sequencing

DNA nanoball sequencing

Heliscope single molecule sequencing

Microfluidic systems

Methods in development

Tunnelling currents DNA sequencing

Sequencing by hybridization

Sequencing with mass spectrometry

Microfluidic Sanger sequencing

Microscopy-based techniques

RNAP sequencing

In vitro virus high-throughput sequencing

Market share

Sample preparation

Development initiatives

Computational challenges

Read trimming

Ethical issues

See also

Notes

References

External links