Am J Hum Genet. 2003 Dec;73(6):1402-22.
doi: 10.1086/380416. Epub 2003 Nov 20.

Informativeness of genetic markers for inference of ancestry

Noah A Rosenberg et al. Am J Hum Genet. 2003 Dec.

Abstract

Inference of individual ancestry is useful in various applications, such as admixture mapping and structured-association mapping. Using information-theoretic principles, we introduce a general measure, the informativeness for assignment (I_n), applicable to any number of potential source populations, for determining the amount of information that multiallelic markers provide about individual ancestry. In a worldwide human microsatellite data set, we identify markers of highest informativeness for inference of regional ancestry and for inference of population ancestry within regions; these markers, which are listed in online-only tables in our article, can be useful both in testing for and in controlling the influence of ancestry on case-control genetic association studies. Markers that are informative in one collection of source populations are generally informative in others. Informativeness of random dinucleotides, the most informative class of microsatellites, is five to eight times that of random single-nucleotide polymorphisms (SNPs), but 2%-12% of SNPs have higher informativeness than the median for dinucleotides. Our results can aid in decisions about the type, quantity, and specific choice of markers for use in studies of ancestry.
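The abstract does not state the formula for the informativeness for assignment. As a minimal sketch, the code below assumes it is the mutual information between the source population (with all K populations weighted equally) and the allele observed at a locus, measured with natural logarithms. The function name and the input layout (one list of allele frequencies per population) are hypothetical choices made for illustration and are not taken from the article.

```python
import math


def informativeness_for_assignment(freqs):
    """Mutual information between source population and observed allele.

    ASSUMPTION: each of the K source populations is weighted equally and
    natural logarithms are used; the abstract does not give the formula.
    freqs[i][j] is the frequency of allele j in source population i.
    """
    k = len(freqs)
    n_alleles = len(freqs[0])
    total = 0.0
    for j in range(n_alleles):
        # Average frequency of allele j across the K populations.
        p_bar = sum(pop[j] for pop in freqs) / k
        if p_bar > 0:
            total -= p_bar * math.log(p_bar)
        # Terms with zero frequency contribute nothing (0 * log 0 := 0).
        for pop in freqs:
            if pop[j] > 0:
                total += (pop[j] / k) * math.log(pop[j])
    return total


# A locus with a fixed difference between two populations is maximally
# informative; identical frequencies carry no information about ancestry.
fixed_difference = informativeness_for_assignment([[1.0, 0.0], [0.0, 1.0]])
no_difference = informativeness_for_assignment([[0.3, 0.7], [0.3, 0.7]])
```

Under these assumptions the value ranges from 0 (identical allele frequencies in every population) to log K (every population fixed for its own private allele), so the two calls above bracket the possible values for two populations.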

Figures

Figure 1
Relationship of informativeness for assignment (I_n) to δ (A) and F_st (B). The plots are based on two alleles and two source populations, and they use equations (7) and (8).
Figure 2
Informativeness for assignment (I_n), δ, and F_st for 8,714 SNPs, based on allele frequency estimates in African Americans and European Americans. A, I_n vs. δ. B, I_n vs. F_st. C, F_st vs. δ. Upper and lower bounds for the dependent variable, given the independent variable, are taken from table 2. A red vertical line marks the point of greatest difference between the upper and lower curves. Mean differences between upper and lower curves are $3/4 - \log 2 \approx 0.0569$ (A), $[16\log 2 + 2 - 6(\log 2)^2 - \pi^2]/12 \approx 0.0282$ (B), and $2\log 2 - 4/3 \approx 0.0530$ (C). Spearman rank correlation coefficients between the variables are 0.921, 0.998, and 0.943 in A, B, and C, respectively.
Figure 3
Similarity coefficients for runs based on reduced sets of markers and runs based on the full data. Sets of markers were chosen with each of four methods: highest informativeness, highest expected heterozygosity, random, and lowest informativeness.
Figure 4
Inferred population structure with five clusters, based on markers of highest and lowest informativeness and plotted using distruct (available from Noah Rosenberg's Homepage). Each individual is represented by a thin vertical line, which is partitioned into five colored segments that represent the individual’s estimated ancestry coefficients in the five clusters. Black lines separate individuals of different populations, which are labeled below the figure. The left-right order of individuals is the same in all plots. The bottom plot is the same as is shown in figure 1 of Rosenberg et al. (2002); each of the other graphs is based on the highest-likelihood run among five runs with the relevant set of loci.
Figure 5
Correlations of informativeness for pairs of regional data sets.
Figure 6
Informativeness quantiles for microsatellites and SNPs. For each set of populations, curves follow the same relative order over most of the domain (from top to bottom: dinucleotides, trinucleotides, tetranucleotides, and SNPs). SNPs were genotyped in African Americans and European Americans rather than in Africans and Europeans.

References

Electronic-Database Information

    1. Human Diversity Panel Genotypes, Center for Medical Genetics, http://research.marshfieldclinic.org/genetics/Freq/FreqInfo.htm (for microsatellite genotypes)
    2. Human STRP Screening Sets, Center for Medical Genetics, http://research.marshfieldclinic.org/genetics/sets/combo.html (for Marshfield panel 10)
    3. Joshua Akey’s Homepage, http://cgi.uc.edu/~jakey/ (for SNP allele frequencies)
    4. Noah Rosenberg’s Homepage, http://www.cmb.usc.edu/~noahr/distruct.html (for distruct software)
    5. Pritchard Lab, http://pritch.bsd.uchicago.edu/ (for structure software)

References

    1. Abramowitz MA, Stegun IA (1965) Handbook of mathematical functions. Dover, New York
    2. Akey JM, Zhang G, Zhang K, Jin L, Shriver MD (2002) Interrogating a high-density SNP map for signatures of natural selection. Genome Res 12:1805–1814. doi: 10.1101/gr.631202
    3. Anderson EC, Thompson EA (2002) A model-based method for identifying species hybrids using multilocus genetic data. Genetics 160:1217–1229
    4. Bamshad MJ, Wooding S, Watkins WS, Ostler CT, Batzer MA, Jorde LB (2003) Human population genetic structure and inference of group membership. Am J Hum Genet 72:578–589. doi: 10.1086/368061
    5. Banks MA, Eichert W, Olsen JB (2003) Which genetic loci have greater population assignment power? Bioinformatics 19:1436–1438. doi: 10.1093/bioinformatics/btg172
