Efficient Coalescent Simulation and Genealogical Analysis for Large Sample Sizes
- PMID: 27145223
- PMCID: PMC4856371
- DOI: 10.1371/journal.pcbi.1004842
Abstract
A central challenge in the analysis of genetic variation is to provide realistic genome simulation across millions of samples. Present-day coalescent simulations either do not scale well or use approximations that fail to capture important long-range linkage properties. Analysing the results of simulations also presents a substantial challenge, as current methods to store genealogies consume a great deal of space, are slow to parse, and do not take advantage of shared structure in correlated trees. We solve these problems by introducing sparse trees and coalescence records as the key units of genealogical analysis. Using these tools, exact simulation of the coalescent with recombination for chromosome-sized regions over hundreds of thousands of samples is possible, and is substantially faster than present-day approximate methods. We can also analyse the results orders of magnitude more quickly than with existing methods.
Conflict of interest statement
The authors have declared that no competing interests exist.