Estimating and interpreting FST: the impact of rare variants
- PMID: 23861382
- PMCID: PMC3759727
- DOI: 10.1101/gr.154831.113
Abstract
In a pair of seminal papers, Sewall Wright and Gustave Malécot introduced FST as a measure of structure in natural populations. In the decades that followed, a number of papers provided differing definitions, estimation methods, and interpretations beyond Wright's. While this diversity in methods has enabled many studies in genetics, it has also introduced confusion regarding how to estimate FST from available data. Given this confusion, the wide variation in published estimates of FST for pairs of HapMap populations is a cause for concern. These estimates changed, in some cases more than twofold, when estimates from genotyping arrays were compared with those from sequence data. Indeed, changes in FST from sequencing data might be expected due to population genetic factors affecting rare variants. While rare variants do influence the result, we show that this is largely through differences in estimation methods. Correcting for this yields estimates of FST that are much more concordant between sequence and genotype data. These differences relate to three specific issues: (1) estimating FST for a single SNP, (2) combining estimates of FST across multiple SNPs, and (3) selecting the set of SNPs used in the computation. Changes in each of these aspects of estimation may result in FST estimates that are highly divergent from one another. Here, we clarify these issues and propose solutions.
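To make issues (1) and (2) concrete, the following is a minimal sketch under stated assumptions, not the article's own procedure. The abstract does not name an estimator, so the sketch assumes Hudson's per-SNP estimator in its commonly used sample-size-corrected form. It compares two ways of combining SNPs: a ratio of averages and an average of ratios. The function names and the toy allele frequencies are hypothetical and chosen only for illustration.

```python
# Sketch only: Hudson's estimator is an assumption here; the abstract
# does not specify which single-SNP estimator is used.

def hudson_components(p1, n1, p2, n2):
    """Return (numerator, denominator) of Hudson's FST estimator for one SNP.

    p1, p2: sample allele frequencies in populations 1 and 2.
    n1, n2: number of sampled chromosomes in each population (must be > 1).
    """
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    den = p1 * (1 - p2) + p2 * (1 - p1)
    return num, den


def ratio_of_averages(snps):
    """Combine SNPs by summing numerators and denominators before dividing."""
    parts = [hudson_components(*snp) for snp in snps]
    total_num = sum(num for num, _ in parts)
    total_den = sum(den for _, den in parts)
    if total_den == 0:
        raise ValueError("no SNP is polymorphic across the two samples")
    return total_num / total_den


def average_of_ratios(snps):
    """Combine SNPs by averaging per-SNP FST values (undefined SNPs skipped)."""
    parts = [hudson_components(*snp) for snp in snps]
    ratios = [num / den for num, den in parts if den > 0]
    if not ratios:
        raise ValueError("no SNP is polymorphic across the two samples")
    return sum(ratios) / len(ratios)


# Hypothetical toy data: (p1, n1, p2, n2) for one common and one rare variant.
toy_snps = [
    (0.50, 200, 0.30, 200),  # common variant
    (0.01, 200, 0.00, 200),  # rare variant, absent from population 2
]

if __name__ == "__main__":
    print("ratio of averages:", ratio_of_averages(toy_snps))
    print("average of ratios:", average_of_ratios(toy_snps))
```

With these toy values, the rare variant contributes a small per-SNP ratio that carries equal weight in the average of ratios but very little weight in the ratio of averages. The two combined estimates therefore differ even though the single-SNP estimator is the same, which is the kind of method-driven divergence the abstract describes.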