
Conditional probability distribution

From Wikipedia, the free encyclopedia
Probability theory and statistics concept

In probability theory and statistics, the conditional probability distribution is a probability distribution that describes the probability of an outcome given the occurrence of a particular event. Given two jointly distributed random variables X and Y, the conditional probability distribution of Y given X is the probability distribution of Y when X is known to be a particular value; in some cases the conditional probabilities may be expressed as functions containing the unspecified value x of X as a parameter. When both X and Y are categorical variables, a conditional probability table is typically used to represent the conditional probability. The conditional distribution contrasts with the marginal distribution of a random variable, which is its distribution without reference to the value of the other variable.

If the conditional distribution of Y given X is a continuous distribution, then its probability density function is known as the conditional density function.[1] The properties of a conditional distribution, such as the moments, are often referred to by corresponding names such as the conditional mean and conditional variance.

More generally, one can refer to the conditional distribution of a subset of a set of more than two variables; this conditional distribution is contingent on the values of all the remaining variables, and if more than one variable is included in the subset then this conditional distribution is the conditional joint distribution of the included variables.

Conditional discrete distributions


For discrete random variables, the conditional probability mass function of Y given X = x can be written according to its definition as:

p_{Y\mid X}(y\mid x)\triangleq P(Y=y\mid X=x)={\frac {P(\{X=x\}\cap \{Y=y\})}{P(X=x)}}

Due to the occurrence of P(X = x) in the denominator, this is defined only for non-zero (hence strictly positive) P(X = x).

The relation with the probability distribution of X given Y is:

P(Y=y\mid X=x)\,P(X=x)=P(\{X=x\}\cap \{Y=y\})=P(X=x\mid Y=y)\,P(Y=y).
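The definition and the product relation above can be sketched numerically; the joint table below is hypothetical, chosen only for illustration:

```python
# Conditional pmf computed from a small hypothetical joint pmf table.
# Keys are (x, y) pairs; values are P(X=x, Y=y).
joint = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.4, (1, 1): 0.2}

def p_x(joint, x):
    """Marginal P(X=x) obtained by summing the joint over y."""
    return sum(p for (xv, _), p in joint.items() if xv == x)

def p_y_given_x(joint, y, x):
    """P(Y=y | X=x) = P(X=x, Y=y) / P(X=x); defined only when P(X=x) > 0."""
    px = p_x(joint, x)
    if px == 0:
        raise ValueError("conditional pmf undefined when P(X=x) = 0")
    return joint.get((x, y), 0.0) / px

print(p_y_given_x(joint, 1, 0))  # 0.3 / 0.4 = 0.75
```

The product relation holds by construction: multiplying the conditional back by the marginal recovers the joint entry.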

Example


Consider the roll of a fair die and let X = 1 if the number is even (i.e., 2, 4, or 6) and X = 0 otherwise. Furthermore, let Y = 1 if the number is prime (i.e., 2, 3, or 5) and Y = 0 otherwise.

D  1  2  3  4  5  6
X  0  1  0  1  0  1
Y  0  1  1  0  1  0

Then the unconditional probability that X = 1 is 3/6 = 1/2 (since there are six possible rolls of the die, of which three are even), whereas the probability that X = 1 conditional on Y = 1 is 1/3 (since there are three possible prime number rolls, namely 2, 3, and 5, of which one is even).
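The counting argument above can be reproduced directly by enumeration, a minimal sketch of the die example:

```python
# Enumerate the six equally likely die outcomes and verify both the
# unconditional probability P(X=1) and the conditional P(X=1 | Y=1).
outcomes = [1, 2, 3, 4, 5, 6]
X = {d: 1 if d % 2 == 0 else 0 for d in outcomes}       # even indicator
Y = {d: 1 if d in (2, 3, 5) else 0 for d in outcomes}   # prime indicator

p_x1 = sum(1 for d in outcomes if X[d] == 1) / 6        # P(X=1) = 1/2
p_x1_given_y1 = (sum(1 for d in outcomes if X[d] == 1 and Y[d] == 1)
                 / sum(1 for d in outcomes if Y[d] == 1))  # P(X=1 | Y=1) = 1/3
print(p_x1, p_x1_given_y1)
```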

Conditional continuous distributions


Similarly for continuous random variables, the conditional probability density function of Y given the occurrence of the value x of X can be written as[2]

f_{Y\mid X}(y\mid x)={\frac {f_{X,Y}(x,y)}{f_{X}(x)}}

where f_{X,Y}(x, y) gives the joint density of X and Y, while f_X(x) gives the marginal density for X. Also in this case it is necessary that f_X(x) > 0.

The relation with the probability distribution of X given Y is given by:

f_{Y\mid X}(y\mid x)\,f_{X}(x)=f_{X,Y}(x,y)=f_{X\mid Y}(x\mid y)\,f_{Y}(y).

The concept of the conditional distribution of a continuous random variable is not as intuitive as it might seem: Borel's paradox shows that conditional probability density functions need not be invariant under coordinate transformations.
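The defining ratio f_{Y|X} = f_{X,Y} / f_X can be sketched numerically; the standard bivariate normal joint (correlation rho = 0.5) and the integration grid below are assumptions chosen only for illustration:

```python
import math

# Numerical sketch: f_{Y|X}(y|x) = f_{X,Y}(x,y) / f_X(x), evaluated for an
# assumed standard bivariate normal with correlation rho = 0.5, at x = 1.
rho = 0.5

def f_xy(x, y):
    """Joint density of a standard bivariate normal with correlation rho."""
    norm = 1.0 / (2.0 * math.pi * math.sqrt(1.0 - rho ** 2))
    q = (x ** 2 - 2.0 * rho * x * y + y ** 2) / (1.0 - rho ** 2)
    return norm * math.exp(-q / 2.0)

dy = 0.01
ys = [-8.0 + i * dy for i in range(1601)]        # grid covering the y-axis
x0 = 1.0
f_x0 = sum(f_xy(x0, y) for y in ys) * dy          # marginal f_X(x0) by quadrature
cond = [f_xy(x0, y) / f_x0 for y in ys]           # conditional density values

cond_mean = sum(y * c for y, c in zip(ys, cond)) * dy
print(cond_mean)  # for this joint, the conditional mean is rho * x0 = 0.5
```

For this particular joint the marginal of X is standard normal and the conditional mean of Y given X = x is rho * x, which the quadrature recovers.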

Example

Figure: bivariate normal joint density.

The graph shows a bivariate normal joint density for random variables X and Y. To see the distribution of Y conditional on X = 70, one can first visualize the line X = 70 in the X,Y plane, and then visualize the plane containing that line and perpendicular to the X,Y plane. The intersection of that plane with the joint normal density, once rescaled to give unit area under the intersection, is the relevant conditional density of Y.

Y\mid X=70\ \sim \ {\mathcal {N}}\left(\mu _{Y}+{\frac {\sigma _{Y}}{\sigma _{X}}}\rho (70-\mu _{X}),\,(1-\rho ^{2})\sigma _{Y}^{2}\right).
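A Monte Carlo sketch can check the conditional mean in this formula; all parameter values below (mu_X = 65, sigma_X = 5, mu_Y = 170, sigma_Y = 10, rho = 0.6) are hypothetical, and conditioning on X = 70 is approximated crudely by a narrow bin:

```python
import math
import random

# Monte Carlo sketch of the conditional-normal mean, with assumed parameters.
mu_x, sig_x, mu_y, sig_y, rho = 65.0, 5.0, 170.0, 10.0, 0.6

random.seed(0)
ys_near_70 = []
for _ in range(400_000):
    # Generate a correlated pair via two independent standard normals.
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    x = mu_x + sig_x * z1
    y = mu_y + sig_y * (rho * z1 + math.sqrt(1 - rho ** 2) * z2)
    if abs(x - 70.0) < 0.25:      # crude stand-in for conditioning on X = 70
        ys_near_70.append(y)

empirical_mean = sum(ys_near_70) / len(ys_near_70)
formula_mean = mu_y + (sig_y / sig_x) * rho * (70.0 - mu_x)  # = 176
print(empirical_mean, formula_mean)
```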

Relation to independence


Random variables X and Y are independent if and only if the conditional distribution of Y given X is, for all possible realizations of X, equal to the unconditional distribution of Y. For discrete random variables this means P(Y = y | X = x) = P(Y = y) for all possible y and x with P(X = x) > 0. For continuous random variables X and Y, having a joint density function, it means f_Y(y | X = x) = f_Y(y) for all possible y and x with f_X(x) > 0.
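The discrete criterion can be sketched in code; the joint pmf below is a hypothetical product distribution, so the check passes:

```python
# Independence check for a small discrete joint: X and Y are independent
# iff P(Y=y | X=x) = P(Y=y) wherever P(X=x) > 0.
px = {0: 0.3, 1: 0.7}
py = {0: 0.6, 1: 0.4}
joint = {(x, y): px[x] * py[y] for x in px for y in py}  # product => independent

def independent(joint):
    """True iff P(Y=y | X=x) equals P(Y=y) wherever P(X=x) > 0."""
    xs = {x for x, _ in joint}
    ys = {y for _, y in joint}
    marg_x = {x: sum(joint.get((x, y), 0.0) for y in ys) for x in xs}
    marg_y = {y: sum(joint.get((x, y), 0.0) for x in xs) for y in ys}
    return all(
        abs(joint.get((x, y), 0.0) / marg_x[x] - marg_y[y]) < 1e-12
        for x in xs if marg_x[x] > 0
        for y in ys
    )

print(independent(joint))  # True for a product joint
```

For contrast, a joint concentrated on the diagonal, such as {(0,0): 0.5, (1,1): 0.5}, fails the same check.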

Properties


Seen as a function of y for given x, P(Y = y | X = x) is a probability mass function and so the sum over all y (or integral, if it is a conditional probability density) is 1. Seen as a function of x for given y, it is a likelihood function, so that the sum (or integral) over all x need not be 1.

Additionally, a marginal of a joint distribution can be expressed as the expectation of the corresponding conditional distribution. For instance, p_X(x) = E_Y[\,p_{X\mid Y}(x\mid Y)\,].
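A minimal numerical check of this identity, on an assumed joint pmf (in the discrete case the expectation is the sum over y of p_{X|Y}(x|y) p_Y(y)):

```python
# Check p_X(x) = E_Y[ p_{X|Y}(x | Y) ] on a small hypothetical joint pmf.
joint = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.4, (1, 1): 0.2}
xs = {x for x, _ in joint}
ys = {y for _, y in joint}
p_y = {y: sum(joint.get((x, y), 0.0) for x in xs) for y in ys}

def p_x_marginal(x):
    """Marginal P(X=x) summed directly from the joint."""
    return sum(joint.get((x, y), 0.0) for y in ys)

def p_x_given_y(x, y):
    """Conditional pmf P(X=x | Y=y); requires p_y[y] > 0."""
    return joint.get((x, y), 0.0) / p_y[y]

for x in sorted(xs):
    ev = sum(p_x_given_y(x, y) * p_y[y] for y in ys)  # E_Y[ p_{X|Y}(x | Y) ]
    print(x, p_x_marginal(x), ev)  # the two values agree for each x
```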

Measure-theoretic formulation


Let (\Omega, \mathcal{F}, P) be a probability space, and let \mathcal{G} \subseteq \mathcal{F} be a \sigma-field in \mathcal{F}. Given A \in \mathcal{F}, the Radon–Nikodym theorem implies that there is[3] a \mathcal{G}-measurable random variable P(A \mid \mathcal{G}) : \Omega \to \mathbb{R}, called the conditional probability, such that

\int _{G}P(A\mid {\mathcal {G}})(\omega )\,dP(\omega )=P(A\cap G)

for every G \in \mathcal{G}; such a random variable is uniquely defined up to sets of probability zero. A conditional probability is called regular if P(\cdot \mid \mathcal{G})(\omega) is a probability measure on (\Omega, \mathcal{F}) for almost every \omega \in \Omega.

Special cases:

Let X : \Omega \to E be an (E, \mathcal{E})-valued random variable. For each B \in \mathcal{E}, define

\mu _{X\mid {\mathcal {G}}}(B\mid {\mathcal {G}})=P(X^{-1}(B)\mid {\mathcal {G}}).

For any \omega \in \Omega, the function \mu_{X \mid \mathcal{G}}(\cdot \mid \mathcal{G})(\omega) : \mathcal{E} \to \mathbb{R} is called the conditional probability distribution of X given \mathcal{G}. If it is a probability measure on (E, \mathcal{E}), then it is called regular.

For a real-valued random variable (with respect to the Borel \sigma-field \mathcal{R}^1 on \mathbb{R}), every conditional probability distribution is regular.[4] In this case, E[X \mid \mathcal{G}] = \int_{-\infty}^{\infty} x\,\mu_{X \mid \mathcal{G}}(dx, \cdot) almost surely.

Relation to conditional expectation


For any event A \in \mathcal{F}, define the indicator function:

\mathbf {1} _{A}(\omega )={\begin{cases}1&{\text{if }}\omega \in A,\\0&{\text{if }}\omega \notin A,\end{cases}}

which is a random variable. Note that the expectation of this random variable is equal to the probability of A itself:

\operatorname {E} (\mathbf {1} _{A})=\operatorname {P} (A).
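This identity can be checked by simulation; the event A below (a uniform draw falling below 0.3) is a hypothetical choice for illustration:

```python
import random

# Monte Carlo sketch: the expectation of the indicator 1_A equals P(A).
# A is the event that a uniform draw on [0, 1) falls below 0.3.
random.seed(1)
n = 200_000
hits = sum(1 for _ in range(n) if random.random() < 0.3)  # sum of 1_A values
estimate = hits / n   # sample mean of the indicator, approximately P(A) = 0.3
print(estimate)
```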

Given a \sigma-field \mathcal{G} \subseteq \mathcal{F}, the conditional probability \operatorname{P}(A \mid \mathcal{G}) is a version of the conditional expectation of the indicator function for A:

\operatorname {P} (A\mid {\mathcal {G}})=\operatorname {E} (\mathbf {1} _{A}\mid {\mathcal {G}})

An expectation of a random variable with respect to a regular conditional probability is equal to its conditional expectation.

Interpretation of conditioning on a sigma field


Consider the probability space (\Omega, \mathcal{F}, \mathbb{P}) and a sub-\sigma-field \mathcal{A} \subset \mathcal{F}. The sub-\sigma-field \mathcal{A} can be loosely interpreted as containing a subset of the information in \mathcal{F}. For example, we might think of \mathbb{P}(B \mid \mathcal{A}) as the probability of the event B given the information in \mathcal{A}.

Also recall that an event B is independent of a sub-\sigma-field \mathcal{A} if \mathbb{P}(B \mid A) = \mathbb{P}(B) for all A \in \mathcal{A}. It is incorrect to conclude in general that the information in \mathcal{A} tells us nothing about the probability of the event B occurring. This can be shown with a counter-example:

Consider a probability space on the unit interval, \Omega = [0, 1], with Lebesgue measure. Let \mathcal{G} be the \sigma-field of all countable sets and sets whose complement is countable. Then each set in \mathcal{G} has measure 0 or 1, and so is independent of every event in \mathcal{F}. However, \mathcal{G} also contains all the singleton events in \mathcal{F} (those sets which contain only a single \omega \in \Omega). So knowing which of the events in \mathcal{G} occurred is equivalent to knowing exactly which \omega \in \Omega occurred. Thus in one sense \mathcal{G} contains no information about \mathcal{F} (it is independent of it), and in another sense it contains all the information in \mathcal{F}.[5][page needed]


References


Citations

  1. Ross (1993), pp. 88–91.
  2. Park (2018), p. 99.
  3. Billingsley (1995), p. 430.
  4. Billingsley (1995), p. 439.
  5. Billingsley (2012).
