- Article
Mapping and sequencing of structural variation from eight human genomes
- Jeffrey M. Kidd1,
- Gregory M. Cooper1,
- William F. Donahue2,
- Hillary S. Hayden3,
- Nick Sampas4,
- Tina Graves5,
- Nancy Hansen6,
- Brian Teague7,
- Can Alkan1,
- Francesca Antonacci1,
- Eric Haugen3,
- Troy Zerr1,
- N. Alice Yamada4,
- Peter Tsang4,
- Tera L. Newman1,
- Eray Tüzün1,
- Ze Cheng1,
- Heather M. Ebling2,
- Nadeem Tusneem2,
- Robert David2,
- Will Gillett3,
- Karen A. Phelps3,
- Molly Weaver1,
- David Saranga2,
- Adrianne Brand2,
- Wei Tao2,
- Erik Gustafson2,
- Kevin McKernan2,
- Lin Chen1,
- Maika Malig1,
- Joshua D. Smith1,
- Joshua M. Korn8,
- Steven A. McCarroll8,
- David A. Altshuler8,
- Daniel A. Peiffer9,
- Michael Dorschner1,
- John Stamatoyannopoulos1,
- David Schwartz7,
- Deborah A. Nickerson1,
- James C. Mullikin6,
- Richard K. Wilson5,
- Laurakay Bruhn4,
- Maynard V. Olson3,
- Rajinder Kaul3,
- Douglas R. Smith2 &
- Evan E. Eichler1
Nature volume 453, pages 56–64 (2008)
Abstract
Genetic variation among individual humans occurs on many different scales, ranging from gross alterations in the human karyotype to single nucleotide changes. Here we explore variation on an intermediate scale—particularly insertions, deletions and inversions affecting from a few thousand to a few million base pairs. We employed a clone-based method to interrogate this intermediate structural variation in eight individuals of diverse geographic ancestry. Our analysis provides a comprehensive overview of the normal pattern of structural variation present in these genomes, refining the location of 1,695 structural variants. We find that 50% were seen in more than one individual and that nearly half lay outside regions of the genome previously described as structurally variant. We discover 525 new insertion sequences that are not present in the human reference genome and show that many of these are variable in copy number between individuals. Complete sequencing of 261 structural variants reveals considerable locus complexity and provides insights into the different mutational processes that have shaped the human genome. These data provide the first high-resolution sequence map of human structural variation—a standard for genotyping platforms and a prelude to future individual genome sequencing projects.
Acknowledgements
We thank the staff from the University of Washington Genome Center and the Washington University Genome Sequencing Center for technical assistance. J.M.K. is supported by a National Science Foundation Graduate Research Fellowship. G.M.C. is supported by a Merck, Jane Coffin Childs Memorial Fund Postdoctoral Fellowship. This work was supported by National Institutes of Health grants HG004120 to E.E.E., D.A.N. and M.V.O., and 3 U54 HG002043 to M.V.O. E.E.E. is an Investigator of the Howard Hughes Medical Institute.
Author Contributions
J.M.K., G.M.C., M.V.O., D.A.N. and E.E.E. contributed to the writing of this paper. The study was coordinated by L.B., M.V.O., R.K., D.R.S., J.M.K. and E.E.E. A.B., D.R.S., D.Sa., E.G., H.M.E., K.M., N.T., R.D., W.F.D. and W.T. performed library construction and end sequencing. E.H., H.S.H., K.A.P., M.V.O., R.K., R.K.W., T.G. and W.G. performed clone insert validation and sequencing. C.A., D.A.N., E.T., J.D.S., J.S., L.C., M.D., M.M., M.W., T.L.N. and Z.C. provided technical and analytical support. D.A.P., D.A.A., J.M.Ko. and S.A.M. contributed variation data. G.M.C., J.M.K., L.B., N.A.Y., N.S. and P.T. designed and analysed array CGH experiments. G.M.C. and T.Z. performed the genotype analysis. F.A. performed FISH experiments. B.T. and D.S. performed optical mapping experiments. E.E.E., J.M.K. and L.C. analysed sequenced clones. J.C.M. and N.H. identified SNPs and indels.
Author information
Authors and Affiliations
Department of Genome Sciences and Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
Jeffrey M. Kidd, Gregory M. Cooper, Can Alkan, Francesca Antonacci, Troy Zerr, Tera L. Newman, Eray Tüzün, Ze Cheng, Molly Weaver, Lin Chen, Maika Malig, Joshua D. Smith, Michael Dorschner, John Stamatoyannopoulos, Deborah A. Nickerson & Evan E. Eichler
Agencourt Bioscience Corporation, Beverly, Massachusetts 01915, USA
William F. Donahue, Heather M. Ebling, Nadeem Tusneem, Robert David, David Saranga, Adrianne Brand, Wei Tao, Erik Gustafson, Kevin McKernan & Douglas R. Smith
Division of Medical Genetics, Department of Medicine, and University of Washington Genome Center, University of Washington, Seattle, Washington 98195, USA
Hillary S. Hayden, Eric Haugen, Will Gillett, Karen A. Phelps, Maynard V. Olson & Rajinder Kaul
Agilent Technologies, Santa Clara, California 95051, USA
Nick Sampas, N. Alice Yamada, Peter Tsang & Laurakay Bruhn
Washington University Genome Sequencing Center, School of Medicine, St Louis, Missouri 63108, USA
Tina Graves & Richard K. Wilson
Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
Nancy Hansen & James C. Mullikin
Laboratory of Genetics, University of Wisconsin, Madison, Wisconsin 53706, USA
Brian Teague & David Schwartz
Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02114, USA
Joshua M. Korn, Steven A. McCarroll & David A. Altshuler
Illumina, Inc., 9885 Towne Centre Drive, San Diego, California 92121, USA
Daniel A. Peiffer
Corresponding author
Correspondence to Evan E. Eichler.
Ethics declarations
Competing interests
Daniel A. Peiffer is currently an employee of Illumina, Inc.; Kevin McKernan and Robert David are currently employed by Applied Biosystems, a manufacturer of DNA-sequencing reagents and instruments; and Laurakay Bruhn, Nick Sampas, Peter Tsang and N. Alice Yamada are employees of Agilent Technologies, Inc.
Supplementary information
Supplementary Information
The file contains extensive Supplementary Information with Supplementary Figures S1-S2, S4-S9. Supplementary Figures S3 and S10 are included in separate files. (PDF 9427 kb)
Supplementary Figure S2
The file contains Supplementary Figure S2 with end-sequence mapping of fosmids against the human genome. All discordant fosmids mapping to the human genome are displayed individually for each library using the following color scheme: ABC7=green, ABC8=forestgreen, ABC10=blue, ABC13=cyan, G248=black, ABC9=purple, ABC11=red, ABC12=orange, and ABC14=hotpink. The end-sequence placements are mapped in the context of gaps within the assembly (purple) and segmental duplications (grey bars). (PDF 4463 kb)
Supplementary Figure S3
The file contains Supplementary Figure S3 with end-sequence mapping of fosmids against the human genome. All discordant fosmids mapping to the human genome are displayed individually for each library using the following color scheme: ABC7=green, ABC8=forestgreen, ABC10=blue, ABC13=cyan, G248=black, ABC9=purple, ABC11=red, ABC12=orange, and ABC14=hotpink. The end-sequence placements are mapped in the context of gaps within the assembly (purple) and segmental duplications (grey bars). (PDF 6227 kb)
Supplementary Figure S4
The file contains Supplementary Figure S4 with end-sequence mapping of fosmids against the human genome. All discordant fosmids mapping to the human genome are displayed individually for each library using the following color scheme: ABC7=green, ABC8=forestgreen, ABC10=blue, ABC13=cyan, G248=black, ABC9=purple, ABC11=red, ABC12=orange, and ABC14=hotpink. The end-sequence placements are mapped in the context of gaps within the assembly (purple) and segmental duplications (grey bars). (PDF 3489 kb)
Supplementary Figure S10
The file contains Supplementary Figure S10 with sequenced Structural Variation and Gene Structure. A graphical representation for sequenced sites (n=266) of structural variation (miropeats view) is provided. Each alignment compares the human reference genome (top) with the sequenced structure of the fosmid clone. (PDF 3571 kb)
Supplementary Table S1
The file contains Supplementary Table S1 showing concordant vs. discordant clone placement summary statistics. (XLS 22 kb)
Supplementary Table S2
The file contains Supplementary Table S2 showing one-end anchored (OEA) clone statistics. (XLS 14 kb)
Supplementary Table S3
The file contains Supplementary Table S3 with all ESP-predicted sites of insertions and deletions, with associated experimental validation (see Supplementary Material Section 12 for a description of column headers). (XLS 5072 kb)
Supplementary Table S4
The file contains Supplementary Table S4 with ESP-predicted insertion and deletion loci (non-redundant) across the fosmid libraries (see Supplementary Material Section 12 for a description of column headers). (XLS 4106 kb)
Supplementary Table S5
The file contains Supplementary Table S5 with genotyping results for a subset of ESP deletion variants based on analysis of genotypes from the Illumina Human1M BeadChip. (XLS 38 kb)
Supplementary Table S6
The file contains Supplementary Table S6 with ESP predicted inversion breakpoints (XLS 308 kb)
Supplementary Table S7
The file contains Supplementary Table S7 with merged inversion loci (non-redundant). (XLS 63 kb)
Supplementary Table S8
The file contains Supplementary Table S8 with large insertions of novel sequence confirmed by optical mapping. (XLS 16 kb)
Supplementary Table S9
The file contains Supplementary Table S9 with GenBank accession IDs of sequenced clones. (XLS 73 kb)
Supplementary Table S10
The file contains Supplementary Table S10 with sequenced structural variants that affect exons of genes. (XLS 26 kb)
Supplementary Table S11
The file contains Supplementary Table S11 with summary statistics of fosmid end sequences. (XLS 17 kb)
Supplementary Table S12
The file contains Supplementary Table S12 with genotypes based on custom GoldenGate Assay and qPCR. (XLS 78 kb)
About this article
Cite this article
Kidd, J., Cooper, G., Donahue, W. et al. Mapping and sequencing of structural variation from eight human genomes. Nature 453, 56–64 (2008). https://doi.org/10.1038/nature06862