Ewens's sampling formula

From Wikipedia, the free encyclopedia
Sampling formula which describes the probabilities of alleles in a sample

In population genetics, Ewens's sampling formula describes the probabilities associated with counts of how many different alleles are observed a given number of times in the sample.

Definition


Ewens's sampling formula, introduced by Warren Ewens, states that under certain conditions (specified below), if a random sample of n gametes is taken from a population and classified according to the gene at a particular locus, then the probability that there are $a_1$ alleles represented once in the sample, $a_2$ alleles represented twice, and so on, is

$$\operatorname{Pr}(a_1,\dots,a_n;\theta) = \frac{n!}{\theta(\theta+1)\cdots(\theta+n-1)} \prod_{j=1}^{n} \frac{\theta^{a_j}}{j^{a_j}\,a_j!},$$

for some positive number θ representing the population mutation rate, whenever $a_1, \ldots, a_n$ is a sequence of nonnegative integers such that

$$a_1 + 2a_2 + 3a_3 + \cdots + na_n = \sum_{i=1}^{n} i\,a_i = n.$$
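As a small worked illustration (added here as a sketch, not taken from the cited sources), take n = 3. The constraint admits exactly three count vectors: (3, 0, 0), three distinct alleles; (1, 1, 0), one allele seen once and one seen twice; and (0, 0, 1), a single allele seen three times. Substituting each into the formula gives

```latex
\begin{aligned}
\operatorname{Pr}(3,0,0;\theta) &= \frac{3!}{\theta(\theta+1)(\theta+2)}\cdot\frac{\theta^{3}}{1^{3}\,3!} = \frac{\theta^{2}}{(\theta+1)(\theta+2)},\\
\operatorname{Pr}(1,1,0;\theta) &= \frac{3!}{\theta(\theta+1)(\theta+2)}\cdot\frac{\theta}{1}\cdot\frac{\theta}{2} = \frac{3\theta}{(\theta+1)(\theta+2)},\\
\operatorname{Pr}(0,0,1;\theta) &= \frac{3!}{\theta(\theta+1)(\theta+2)}\cdot\frac{\theta}{3} = \frac{2}{(\theta+1)(\theta+2)}.
\end{aligned}
```

The three numerators add up to θ² + 3θ + 2 = (θ + 1)(θ + 2), so the probabilities sum to 1 for every θ > 0.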

The phrase "under certain conditions" used above is made precise by the following assumptions:

  • The sample size n is small by comparison to the size of the whole population; and
  • The population is in statistical equilibrium under mutation and genetic drift, and the role of selection at the locus in question is negligible; and
  • Every mutant allele is novel.

This is a probability distribution on the set of all partitions of the integer n. Among probabilists and statisticians it is often called the multivariate Ewens distribution.
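The claim that the formula defines a probability distribution on the partitions of n can be checked numerically. The following is a minimal sketch using exact rational arithmetic; the names `ewens_pmf` and `count_vectors` are illustrative choices made here and do not come from the cited sources.

```python
from fractions import Fraction
from math import factorial


def ewens_pmf(a, theta):
    """Ewens probability of the count vector a = (a_1, ..., a_n).

    a[j - 1] is the number of alleles seen exactly j times; theta must be > 0.
    """
    n = len(a)
    if sum(j * aj for j, aj in enumerate(a, start=1)) != n:
        raise ValueError("counts must satisfy a_1 + 2*a_2 + ... + n*a_n = n")
    theta = Fraction(theta)
    # Denominator theta * (theta + 1) * ... * (theta + n - 1)
    rising = Fraction(1)
    for i in range(n):
        rising *= theta + i
    prob = Fraction(factorial(n)) / rising
    # Product over j of theta^a_j / (j^a_j * a_j!)
    for j, aj in enumerate(a, start=1):
        prob *= theta ** aj / (Fraction(j) ** aj * factorial(aj))
    return prob


def count_vectors(n):
    """Yield every (a_1, ..., a_n) with sum_j j*a_j = n (one per partition of n)."""
    def rec(j, remaining, acc):
        if j == 0:
            if remaining == 0:
                yield tuple(reversed(acc))
            return
        for aj in range(remaining // j + 1):
            yield from rec(j - 1, remaining - j * aj, acc + [aj])
    yield from rec(n, n, [])


theta = Fraction(3, 2)  # arbitrary positive value chosen for the check
total = sum(ewens_pmf(a, theta) for a in count_vectors(5))
print(total)  # exact sum over all partitions of 5
```

Because `Fraction` is used throughout, the sum is compared exactly rather than up to floating-point error.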

Mathematical properties


When θ = 0 (understood as a limiting case, since the formula requires θ > 0), the probability is 1 that all n genes are the same. When θ = 1, the distribution is precisely that of the integer partition induced by the cycle lengths of a uniformly distributed random permutation. As θ → ∞, the probability that no two of the n genes are the same approaches 1.
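These three special cases can be checked directly for a small sample. The sketch below, with n = 4 chosen arbitrarily, enumerates all permutations to compare their cycle types against the formula at θ = 1, then evaluates the formula at a very small and a very large θ. The helper names are illustrative and are not part of any library.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
from math import factorial


def ewens_pmf(a, theta):
    """Ewens probability of a = (a_1, ..., a_n); a[j - 1] alleles are seen j times."""
    n = len(a)
    theta = Fraction(theta)
    rising = Fraction(1)
    for i in range(n):
        rising *= theta + i
    prob = Fraction(factorial(n)) / rising
    for j, aj in enumerate(a, start=1):
        prob *= theta ** aj / (Fraction(j) ** aj * factorial(aj))
    return prob


def cycle_type(perm):
    """Return (a_1, ..., a_n), where a_j counts the cycles of length j in perm."""
    n = len(perm)
    seen = [False] * n
    a = [0] * n
    for start in range(n):
        if seen[start]:
            continue
        length, i = 0, start
        while not seen[i]:
            seen[i] = True
            i = perm[i]
            length += 1
        a[length - 1] += 1
    return tuple(a)


n = 4
type_counts = Counter(cycle_type(p) for p in permutations(range(n)))
# Law of the cycle type of a uniformly random permutation of n items
uniform = {a: Fraction(c, factorial(n)) for a, c in type_counts.items()}
# Ewens probabilities of the same count vectors at theta = 1
ewens_theta_one = {a: ewens_pmf(a, 1) for a in type_counts}
print(uniform == ewens_theta_one)

all_same = (0, 0, 0, 1)      # one allele carried by all four genes
all_distinct = (4, 0, 0, 0)  # four alleles, each seen once
print(float(ewens_pmf(all_same, Fraction(1, 1000))))  # theta close to 0
print(float(ewens_pmf(all_distinct, 10 ** 6)))        # theta very large
```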

This family of probability distributions enjoys the property that if, after the sample of n is taken, m of the n gametes are chosen without replacement, then the resulting probability distribution on the set of all partitions of the smaller integer m is just what the formula above would give if m were put in place of n.
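This consistency under subsampling can be verified exactly for small n. The sketch below relies on one assumption worth stating: choosing m of the n gametes without replacement is treated as deleting n − m gametes one at a time, each uniformly at random, which yields the same subsample distribution. All function names are illustrative.

```python
from fractions import Fraction
from math import factorial


def ewens_pmf(a, theta):
    """Ewens probability of a = (a_1, ..., a_n); a[j - 1] alleles are seen j times."""
    n = len(a)
    theta = Fraction(theta)
    rising = Fraction(1)
    for i in range(n):
        rising *= theta + i
    prob = Fraction(factorial(n)) / rising
    for j, aj in enumerate(a, start=1):
        prob *= theta ** aj / (Fraction(j) ** aj * factorial(aj))
    return prob


def count_vectors(n):
    """Yield every (a_1, ..., a_n) with sum_j j*a_j = n."""
    def rec(j, remaining, acc):
        if j == 0:
            if remaining == 0:
                yield tuple(reversed(acc))
            return
        for aj in range(remaining // j + 1):
            yield from rec(j - 1, remaining - j * aj, acc + [aj])
    yield from rec(n, n, [])


def ewens_distribution(n, theta):
    """Map each count vector for sample size n to its Ewens probability."""
    return {a: ewens_pmf(a, theta) for a in count_vectors(n)}


def remove_one_gamete(dist_n):
    """Law of the count vector after deleting one uniformly chosen gamete."""
    n = len(next(iter(dist_n)))
    out = {}
    for a, p in dist_n.items():
        for j in range(1, n + 1):
            if a[j - 1] == 0:
                continue
            b = list(a)
            b[j - 1] -= 1      # an allele seen j times loses one copy
            if j > 1:
                b[j - 2] += 1  # and is now seen j - 1 times
            b = tuple(b[: n - 1])  # the last entry is 0 once the sample has n - 1 gametes
            # The deleted gamete carries an allele seen j times with chance j*a_j/n
            out[b] = out.get(b, 0) + p * Fraction(j * a[j - 1], n)
    return out


theta = Fraction(3, 2)  # arbitrary positive value chosen for the check
from_five = ewens_distribution(5, theta)
down_to_four = remove_one_gamete(from_five)
down_to_three = remove_one_gamete(down_to_four)
print(down_to_four == ewens_distribution(4, theta))
print(down_to_three == ewens_distribution(3, theta))
```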

The Ewens distribution arises naturally from the Chinese restaurant process.
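The connection can be illustrated with an exact computation for small n. The sketch assumes the usual seating rule of the Chinese restaurant process with parameter θ, which this article does not spell out: when i customers are already seated, the next one joins an existing table of size s with probability s/(θ + i) and opens a new table with probability θ/(θ + i). Table sizes play the role of allele multiplicities. Function names are illustrative.

```python
from fractions import Fraction
from math import factorial


def ewens_pmf(a, theta):
    """Ewens probability of a = (a_1, ..., a_n); a[j - 1] alleles are seen j times."""
    n = len(a)
    theta = Fraction(theta)
    rising = Fraction(1)
    for i in range(n):
        rising *= theta + i
    prob = Fraction(factorial(n)) / rising
    for j, aj in enumerate(a, start=1):
        prob *= theta ** aj / (Fraction(j) ** aj * factorial(aj))
    return prob


def crp_distribution(n, theta):
    """Exact law of the table-size count vector after n customers are seated."""
    theta = Fraction(theta)
    # State: sorted tuple of table sizes, mapped to its probability
    states = {(1,): Fraction(1)}
    for i in range(1, n):  # i customers are already seated
        nxt = {}
        for sizes, p in states.items():
            # Open a new table with probability theta / (theta + i)
            opened = tuple(sorted(sizes + (1,)))
            nxt[opened] = nxt.get(opened, 0) + p * theta / (theta + i)
            # Join table k, of size s, with probability s / (theta + i)
            for k, s in enumerate(sizes):
                joined = list(sizes)
                joined[k] += 1
                joined = tuple(sorted(joined))
                nxt[joined] = nxt.get(joined, 0) + p * s / (theta + i)
        states = nxt
    # Convert table sizes to count vectors: a_j = number of tables of size j
    result = {}
    for sizes, p in states.items():
        a = [0] * n
        for s in sizes:
            a[s - 1] += 1
        result[tuple(a)] = p
    return result


theta = Fraction(3, 2)  # arbitrary positive value chosen for the check
crp = crp_distribution(5, theta)
matches = all(p == ewens_pmf(a, theta) for a, p in crp.items())
print(matches, sum(crp.values()))
```

Tracking sorted table sizes rather than individual seatings keeps the state space to one entry per partition, which is enough here because the formula depends only on the multiset of table sizes.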


Notes

  • W. J. Ewens, "The sampling theory of selectively neutral alleles", Theoretical Population Biology, volume 3, pages 87–112, 1972.
  • H. Crane, "The Ubiquitous Ewens Sampling Formula", Statistical Science, 31:1 (Feb 2016). This article introduces a series of seven articles about Ewens sampling in a special issue of the journal.
  • J. F. C. Kingman, "Random partitions in population genetics", Proceedings of the Royal Society of London, Series A, Mathematical and Physical Sciences, volume 361, number 1704, 1978.
  • S. Tavaré and W. J. Ewens, "The Multivariate Ewens distribution" (1997, Chapter 41 from the reference below).
  • N. L. Johnson, S. Kotz, and N. Balakrishnan (1997), Discrete Multivariate Distributions, Wiley. ISBN 0-471-12844-9.