Bernoulli distribution

From Wikipedia, the free encyclopedia
Probability distribution modeling a coin toss which need not be fair

Bernoulli distribution
Probability mass function
[Figure: three examples of the Bernoulli probability mass function]

Parameters: $0 \leq p \leq 1$; $q = 1 - p$
Support: $k \in \{0, 1\}$
PMF: $\begin{cases} q = 1 - p & \text{if } k = 0 \\ p & \text{if } k = 1 \end{cases}$
CDF: $\begin{cases} 0 & \text{if } k < 0 \\ 1 - p & \text{if } 0 \leq k < 1 \\ 1 & \text{if } k \geq 1 \end{cases}$
Mean: $p$
Median: $\begin{cases} 0 & \text{if } p < 1/2 \\ [0, 1] & \text{if } p = 1/2 \\ 1 & \text{if } p > 1/2 \end{cases}$
Mode: $\begin{cases} 0 & \text{if } p < 1/2 \\ 0, 1 & \text{if } p = 1/2 \\ 1 & \text{if } p > 1/2 \end{cases}$
Variance: $p(1 - p) = pq$
MAD: $2p(1 - p) = 2pq$
Skewness: $\dfrac{q - p}{\sqrt{pq}}$
Excess kurtosis: $\dfrac{1 - 6pq}{pq}$
Entropy: $-q \ln q - p \ln p$
MGF: $q + p e^{t}$
CF: $q + p e^{it}$
PGF: $q + pz$
Fisher information: $\dfrac{1}{pq}$

In probability theory and statistics, the Bernoulli distribution, named after Swiss mathematician Jacob Bernoulli,[1] is the discrete probability distribution of a random variable which takes the value 1 with probability $p$ and the value 0 with probability $q = 1 - p$. Less formally, it can be thought of as a model for the set of possible outcomes of any single experiment that asks a yes–no question. Such questions lead to outcomes that are Boolean-valued: a single bit whose value is success/yes/true/one with probability $p$ and failure/no/false/zero with probability $q$. It can be used to represent a (possibly biased) coin toss where 1 and 0 would represent "heads" and "tails", respectively, and $p$ would be the probability of the coin landing on heads (or vice versa, where 1 would represent tails and $p$ would be the probability of tails). In particular, unfair coins would have $p \neq 1/2$.
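As a concrete illustration, the following minimal Python sketch (the names `p`, `n_flips`, and `flips` are illustrative, not from the article) simulates a biased coin and checks that the empirical frequency of heads approaches $p$:

    import random

    p = 0.3          # probability of "heads" (success); an example value
    n_flips = 100_000

    # Each draw is 1 ("heads") with probability p, else 0 ("tails").
    flips = [1 if random.random() < p else 0 for _ in range(n_flips)]

    # By the law of large numbers, the sample frequency approaches p.
    print(sum(flips) / n_flips)  # prints a value close to 0.3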

The Bernoulli distribution is a special case of the binomial distribution where a single trial is conducted (so $n$ would be 1 for such a binomial distribution). It is also a special case of the two-point distribution, for which the possible outcomes need not be 0 and 1.[2]

Properties

If $X$ is a random variable with a Bernoulli distribution, then:

$$\Pr(X = 1) = p, \qquad \Pr(X = 0) = q = 1 - p.$$

The probability mass function $f$ of this distribution, over possible outcomes $k$, is[3]

$$f(k;p) = \begin{cases} p & \text{if } k = 1, \\ q = 1 - p & \text{if } k = 0. \end{cases}$$

This can also be expressed as

$$f(k;p) = p^{k}(1 - p)^{1 - k} \quad \text{for } k \in \{0, 1\}$$

or as

$$f(k;p) = pk + (1 - p)(1 - k) \quad \text{for } k \in \{0, 1\}.$$
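As a quick sanity check, here is a minimal Python sketch (function names are illustrative) confirming that the two closed forms above agree on $k \in \{0, 1\}$:

    def pmf_power_form(k, p):
        # f(k; p) = p^k * (1 - p)^(1 - k)
        return p**k * (1 - p)**(1 - k)

    def pmf_linear_form(k, p):
        # f(k; p) = p*k + (1 - p)*(1 - k)
        return p * k + (1 - p) * (1 - k)

    for p in (0.1, 0.5, 0.9):
        for k in (0, 1):
            assert pmf_power_form(k, p) == pmf_linear_form(k, p)

Both forms reduce to $p$ at $k = 1$ and to $q$ at $k = 0$.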

The Bernoulli distribution is a special case of the binomial distribution with $n = 1$.[4]

The kurtosis goes to infinity for high and low values of $p$, but for $p = 1/2$ the two-point distributions, including the Bernoulli distribution, have a lower excess kurtosis, namely −2, than any other probability distribution.

The Bernoulli distributions for $0 \leq p \leq 1$ form an exponential family.

The maximum likelihood estimator of $p$ based on a random sample is the sample mean.
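A minimal sketch of this estimator, assuming NumPy is available (the seed and sample size are arbitrary choices for illustration):

    import numpy as np

    rng = np.random.default_rng(seed=0)
    true_p = 0.7
    sample = rng.binomial(n=1, p=true_p, size=10_000)  # Bernoulli(p) draws

    # The maximum likelihood estimate of p is simply the sample mean.
    p_hat = sample.mean()
    print(p_hat)  # close to 0.7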

[Figure: the probability mass function of a Bernoulli experiment along with its corresponding cumulative distribution function.]

Mean

The expected value of a Bernoulli random variable $X$ is

$$\operatorname{E}[X] = p.$$

This is because for a Bernoulli distributed random variable $X$ with $\Pr(X = 1) = p$ and $\Pr(X = 0) = q$ we find[3]

$$\begin{aligned}\operatorname{E}[X] &= \Pr(X = 1)\cdot 1 + \Pr(X = 0)\cdot 0 \\ &= p \cdot 1 + q \cdot 0 \\ &= p.\end{aligned}$$

Variance

The variance of a Bernoulli distributed $X$ is

$$\operatorname{Var}[X] = pq = p(1 - p).$$

We first find

$$\begin{aligned}\operatorname{E}[X^{2}] &= \Pr(X = 1)\cdot 1^{2} + \Pr(X = 0)\cdot 0^{2} \\ &= p \cdot 1^{2} + q \cdot 0^{2} \\ &= p = \operatorname{E}[X].\end{aligned}$$

From this follows[3]

$$\operatorname{Var}[X] = \operatorname{E}[X^{2}] - \operatorname{E}[X]^{2} = \operatorname{E}[X] - \operatorname{E}[X]^{2} = p - p^{2} = p(1 - p) = pq.$$

With this result it is easy to prove that, for any Bernoulli distribution, the variance lies in $[0, 1/4]$: $p(1 - p)$ attains its maximum value $1/4$ at $p = 1/2$ and its minimum value $0$ at $p \in \{0, 1\}$.
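A small numeric check of this bound, sweeping $p$ over a grid (a sketch with illustrative names):

    import numpy as np

    p_grid = np.linspace(0.0, 1.0, 1001)
    variances = p_grid * (1 - p_grid)

    assert variances.max() <= 0.25 + 1e-12  # variance never exceeds 1/4
    print(p_grid[variances.argmax()])       # 0.5, where the maximum is attained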

Skewness

The skewness is $\frac{q - p}{\sqrt{pq}} = \frac{1 - 2p}{\sqrt{pq}}$. When we take the standardized Bernoulli distributed random variable $\frac{X - \operatorname{E}[X]}{\sqrt{\operatorname{Var}[X]}}$, we find that this random variable attains $\frac{q}{\sqrt{pq}}$ with probability $p$ and attains $-\frac{p}{\sqrt{pq}}$ with probability $q$. Thus we get

$$\begin{aligned}\gamma_{1} &= \operatorname{E}\left[\left(\frac{X - \operatorname{E}[X]}{\sqrt{\operatorname{Var}[X]}}\right)^{3}\right] \\ &= p \cdot \left(\frac{q}{\sqrt{pq}}\right)^{3} + q \cdot \left(-\frac{p}{\sqrt{pq}}\right)^{3} \\ &= \frac{1}{\sqrt{pq}^{3}}\left(pq^{3} - qp^{3}\right) \\ &= \frac{pq}{\sqrt{pq}^{3}}\left(q^{2} - p^{2}\right) \\ &= \frac{(1 - p)^{2} - p^{2}}{\sqrt{pq}} \\ &= \frac{1 - 2p}{\sqrt{pq}} = \frac{q - p}{\sqrt{pq}}.\end{aligned}$$
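This closed form can be checked against SciPy, assuming `scipy` is installed (a sketch; the values of $p$ are arbitrary):

    from math import sqrt
    from scipy.stats import bernoulli

    for p in (0.2, 0.5, 0.8):
        q = 1 - p
        # SciPy returns mean, variance, skewness, and excess kurtosis.
        mean, var, skew, kurt = bernoulli.stats(p, moments='mvsk')
        assert abs(skew - (q - p) / sqrt(p * q)) < 1e-12
        assert abs(kurt - (1 - 6 * p * q) / (p * q)) < 1e-12

At $p = 1/2$ this also confirms the excess kurtosis of $(1 - 1.5)/0.25 = -2$ noted above.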

Higher moments and cumulants

The raw moments are all equal because $1^{k} = 1$ and $0^{k} = 0$ for $k \geq 1$:

$$\operatorname{E}[X^{k}] = \Pr(X = 1)\cdot 1^{k} + \Pr(X = 0)\cdot 0^{k} = p \cdot 1 + q \cdot 0 = p = \operatorname{E}[X].$$

The central moment of order $k$ is given by

$$\mu_{k} = (1 - p)(-p)^{k} + p(1 - p)^{k}.$$

The first six central moments are

$$\begin{aligned}\mu_{1} &= 0, \\ \mu_{2} &= p(1 - p), \\ \mu_{3} &= p(1 - p)(1 - 2p), \\ \mu_{4} &= p(1 - p)(1 - 3p(1 - p)), \\ \mu_{5} &= p(1 - p)(1 - 2p)(1 - 2p(1 - p)), \\ \mu_{6} &= p(1 - p)(1 - 5p(1 - p)(1 - p(1 - p))).\end{aligned}$$

The higher central moments can be expressed more compactly in terms of $\mu_{2}$ and $\mu_{3}$:

$$\begin{aligned}\mu_{4} &= \mu_{2}(1 - 3\mu_{2}), \\ \mu_{5} &= \mu_{3}(1 - 2\mu_{2}), \\ \mu_{6} &= \mu_{2}(1 - 5\mu_{2}(1 - \mu_{2})).\end{aligned}$$

The first six cumulants are

$$\begin{aligned}\kappa_{1} &= p, \\ \kappa_{2} &= \mu_{2}, \\ \kappa_{3} &= \mu_{3}, \\ \kappa_{4} &= \mu_{2}(1 - 6\mu_{2}), \\ \kappa_{5} &= \mu_{3}(1 - 12\mu_{2}), \\ \kappa_{6} &= \mu_{2}(1 - 30\mu_{2}(1 - 4\mu_{2})).\end{aligned}$$
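A short Python check of the factored expressions against the closed form for $\mu_{k}$ (an illustrative sketch; `mu` is a hypothetical helper):

    p = 0.3
    q = 1 - p

    def mu(k):
        # Central moment of order k: E[(X - p)^k] for X ~ Bernoulli(p).
        return q * (-p)**k + p * q**k

    m2, m3 = mu(2), mu(3)
    assert abs(mu(4) - m2 * (1 - 3 * m2)) < 1e-12
    assert abs(mu(5) - m3 * (1 - 2 * m2)) < 1e-12
    assert abs(mu(6) - m2 * (1 - 5 * m2 * (1 - m2))) < 1e-12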

Entropy and Fisher's information

Entropy

Entropy is a measure of uncertainty or randomness in a probability distribution. For a Bernoulli random variable $X$ with success probability $p$ and failure probability $q = 1 - p$, the entropy $H(X)$ is defined as:

$$\begin{aligned}H(X) &= \operatorname{E}_{p}\!\left[\ln \frac{1}{\Pr(X)}\right] \\ &= -\Pr(X = 0)\ln \Pr(X = 0) - \Pr(X = 1)\ln \Pr(X = 1) \\ &= -(q \ln q + p \ln p).\end{aligned}$$

The entropy is maximized when $p = 0.5$, indicating the highest level of uncertainty when both outcomes are equally likely. The entropy is zero when $p = 0$ or $p = 1$, where one outcome is certain.
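The following sketch computes this entropy over a grid and confirms the maximum at $p = 0.5$ (names are illustrative; the convention $0 \ln 0 = 0$ is applied):

    import numpy as np

    def bernoulli_entropy(p):
        # H(p) = -q ln q - p ln p, with the convention 0 ln 0 = 0.
        q = 1 - p
        return -sum(x * np.log(x) for x in (p, q) if x > 0)

    p_grid = np.linspace(0.0, 1.0, 101)
    entropies = np.array([bernoulli_entropy(p) for p in p_grid])
    print(p_grid[entropies.argmax()])  # 0.5, where H = ln 2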

Fisher's information

Fisher information measures the amount of information that an observable random variable $X$ carries about an unknown parameter $p$ upon which the probability of $X$ depends. For the Bernoulli distribution, the Fisher information with respect to the parameter $p$ is given by:

$$I(p) = \frac{1}{pq}.$$

Proof: the log-likelihood of a single observation $k \in \{0, 1\}$ is $\ln f(k;p) = k \ln p + (1 - k)\ln(1 - p)$, so the score is

$$\frac{\partial}{\partial p}\ln f(k;p) = \frac{k}{p} - \frac{1 - k}{1 - p}.$$

The Fisher information is the expected square of the score:

$$I(p) = p\left(\frac{1}{p}\right)^{2} + q\left(-\frac{1}{q}\right)^{2} = \frac{1}{p} + \frac{1}{q} = \frac{p + q}{pq} = \frac{1}{pq}.$$

It is minimized when $p = 0.5$, where the outcome is most uncertain and a single observation is least informative about $p$; the Fisher information grows without bound as $p$ approaches 0 or 1.
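A numeric check that $1/(pq)$ equals the expected squared score (an illustrative sketch):

    p = 0.3
    q = 1 - p

    def score(k):
        # Derivative in p of the log-likelihood k*ln(p) + (1 - k)*ln(1 - p).
        return k / p - (1 - k) / q

    # Fisher information is the expected square of the score.
    fisher_info = p * score(1)**2 + q * score(0)**2
    assert abs(fisher_info - 1 / (p * q)) < 1e-12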

Related distributions
The Bernoulli distribution is simply $\operatorname{B}(1, p)$, also written as $\mathrm{Bernoulli}(p)$.
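This identity is easy to verify with SciPy, assuming it is installed: the Bernoulli pmf coincides with the binomial pmf at $n = 1$ (a sketch; the value of $p$ is arbitrary):

    from scipy.stats import bernoulli, binom

    p = 0.4
    for k in (0, 1):
        assert abs(bernoulli.pmf(k, p) - binom.pmf(k, 1, p)) < 1e-12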


References
  1. ^ Uspensky, James Victor (1937). Introduction to Mathematical Probability. New York: McGraw-Hill. p. 45. OCLC 996937.
  2. ^ Dekking, Frederik; Kraaikamp, Cornelis; Lopuhaä, Hendrik; Meester, Ludolf (9 October 2010). A Modern Introduction to Probability and Statistics (1st ed.). Springer London. pp. 43–48. ISBN 9781849969529.
  3. ^ a b c d Bertsekas, Dimitri P. (2002). Introduction to Probability. Tsitsiklis, John N., Τσιτσικλής, Γιάννης Ν. Belmont, Mass.: Athena Scientific. ISBN 188652940X. OCLC 51441829.
  4. ^ McCullagh, Peter; Nelder, John (1989). Generalized Linear Models (2nd ed.). Boca Raton: Chapman and Hall/CRC. Section 4.2.2. ISBN 0-412-31760-5.
  5. ^ Orloff, Jeremy; Bloom, Jonathan. "Conjugate priors: Beta and normal" (PDF). math.mit.edu. Retrieved October 20, 2023.


External links

Wikimedia Commons has media related to Bernoulli distribution.