Independence (probability theory)

When the occurrence of one event does not affect the likelihood of another

Independence is a fundamental notion in probability theory, as in statistics and the theory of stochastic processes. Two events are independent, statistically independent, or stochastically independent[1] if, informally speaking, the occurrence of one does not affect the probability of occurrence of the other or, equivalently, does not affect the odds. Similarly, two random variables are independent if the realization of one does not affect the probability distribution of the other. Conversely, dependence is when the occurrence of one event does affect the likelihood of another.

When dealing with collections of more than two events, two notions of independence need to be distinguished. The events are called pairwise independent if any two events in the collection are independent of each other, while mutual independence (or collective independence) of events means, informally speaking, that each event is independent of any combination of other events in the collection. A similar notion exists for collections of random variables. Mutual independence implies pairwise independence, but not the other way around. In the standard literature of probability theory, statistics, and stochastic processes, independence without further qualification usually refers to mutual independence.

Definition

For events

Two events

Two events $A$ and $B$ are independent (often written as $A \perp B$ or $A \perp\!\!\!\perp B$, where the latter symbol is often also used for conditional independence) if and only if their joint probability equals the product of their probabilities:[2]: p. 29 [3]: p. 10 

$$\mathrm{P}(A \cap B) = \mathrm{P}(A)\,\mathrm{P}(B) \qquad \text{(Eq.1)}$$

If $\mathrm{P}(A) > 0$ and $\mathrm{P}(B) > 0$, Eq.1 implies $A \cap B \neq \emptyset$: two independent events of positive probability have common elements in their sample space, so they are not mutually exclusive (events are mutually exclusive if and only if $A \cap B = \emptyset$). Why this defines independence is made clear by rewriting with the conditional probability $\mathrm{P}(A \mid B) = \frac{\mathrm{P}(A \cap B)}{\mathrm{P}(B)}$, the probability that the event $A$ occurs given that the event $B$ has occurred or is assumed to have occurred. For $\mathrm{P}(B) > 0$:

$$\mathrm{P}(A \cap B) = \mathrm{P}(A)\,\mathrm{P}(B) \iff \mathrm{P}(A \mid B) = \frac{\mathrm{P}(A \cap B)}{\mathrm{P}(B)} = \mathrm{P}(A),$$

and similarly, for $\mathrm{P}(A) > 0$:

$$\mathrm{P}(A \cap B) = \mathrm{P}(A)\,\mathrm{P}(B) \iff \mathrm{P}(B \mid A) = \frac{\mathrm{P}(A \cap B)}{\mathrm{P}(A)} = \mathrm{P}(B).$$

Thus, the occurrence of $B$ does not affect the probability of $A$, and vice versa. In other words, $A$ and $B$ are independent of each other. Although the derived expressions may seem more intuitive, they are not the preferred definition, as the conditional probabilities may be undefined if $\mathrm{P}(A)$ or $\mathrm{P}(B)$ is 0. Furthermore, the preferred definition makes clear by symmetry that when $A$ is independent of $B$, $B$ is also independent of $A$.
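
As an illustration of the product definition, the following minimal sketch checks Eq.1 by exact enumeration. The sample space (one roll of a fair six-sided die) and the helper names `prob` and `independent` are assumptions chosen for this example; they do not come from the cited sources.

```python
from fractions import Fraction

# Assumed sample space: one roll of a fair six-sided die (uniform measure).
omega = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event, given as a subset of omega."""
    return Fraction(len(event & omega), len(omega))

def independent(a, b):
    """Check Eq.1: P(A and B) == P(A) * P(B), using exact fractions."""
    return prob(a & b) == prob(a) * prob(b)

even = {2, 4, 6}              # P = 1/2
at_most_four = {1, 2, 3, 4}   # P = 2/3
at_most_three = {1, 2, 3}     # P = 1/2

print(independent(even, at_most_four))   # True:  P({2, 4}) = 1/3 = 1/2 * 2/3
print(independent(even, at_most_three))  # False: P({2}) = 1/6, but 1/2 * 1/2 = 1/4
print(independent(even, set()))          # True:  works even when P(B) = 0
```

The last call shows why the product form is preferred: it remains well defined for an event of probability zero, where the conditional probability is undefined.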

Odds

Stated in terms of odds, two events are independent if and only if the odds ratio of $A$ and $B$ is unity (1). Analogously with probability, this is equivalent to the conditional odds being equal to the unconditional odds:

$$O(A \mid B) = O(A) \text{ and } O(B \mid A) = O(B),$$

or to the odds of one event, given the other event, being the same as the odds of the event, given the other event not occurring:

$$O(A \mid B) = O(A \mid \neg B) \text{ and } O(B \mid A) = O(B \mid \neg A).$$

The odds ratio can be defined as

$$O(A \mid B) : O(A \mid \neg B),$$

or symmetrically for odds of $B$ given $A$, and thus is 1 if and only if the events are independent.
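
The odds-ratio criterion can be sketched numerically as follows. The function names and the two 2×2 probability tables are hypothetical, chosen only to illustrate the definition $O(A \mid B) : O(A \mid \neg B)$; the sketch assumes all four cell probabilities are positive.

```python
from fractions import Fraction

def odds(p):
    """Odds corresponding to a probability p (assumes p != 1)."""
    return p / (1 - p)

def odds_ratio(p_ab, p_a_notb, p_nota_b, p_nota_notb):
    """O(A | B) : O(A | not B), from the four cell probabilities of a 2x2 table."""
    p_b = p_ab + p_nota_b
    p_notb = p_a_notb + p_nota_notb
    return odds(p_ab / p_b) / odds(p_a_notb / p_notb)

F = Fraction
# Independent table: P(A) = 1/2, P(B) = 1/3, every cell is a product of marginals.
ratio_indep = odds_ratio(F(1, 6), F(1, 3), F(1, 6), F(1, 3))
# Dependent table: A is more likely when B occurs.
ratio_dep = odds_ratio(F(1, 4), F(1, 4), F(1, 8), F(3, 8))

print(ratio_indep)  # 1
print(ratio_dep)    # 3
```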

More than two events

A finite set of events $\{A_i\}_{i=1}^{n}$ is pairwise independent if every pair of events is independent;[4] that is, if and only if for all distinct pairs of indices $m, k$,

$$\mathrm{P}(A_m \cap A_k) = \mathrm{P}(A_m)\,\mathrm{P}(A_k) \qquad \text{(Eq.2)}$$

A finite set of events is mutually independent if every event is independent of any intersection of the other events;[4][3]: p. 11 that is, if and only if for every $k \leq n$ and for every $k$ indices $1 \leq i_1 < \dots < i_k \leq n$,

$$\mathrm{P}\left(\bigcap_{j=1}^{k} A_{i_j}\right) = \prod_{j=1}^{k} \mathrm{P}(A_{i_j}) \qquad \text{(Eq.3)}$$

This is called the multiplication rule for independent events. It is not a single condition involving only the product of all the probabilities of all single events; it must hold true for all subsets of events.

For more than two events, a mutually independent set of events is (by definition) pairwise independent; but the converse is not necessarily true.[2]: p. 30 
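
The difference between Eq.2 and Eq.3 can be sketched in code by testing every subset of events. The example below is an assumption made for illustration (two fair coin tosses, with the third event "both tosses agree"); it is a standard textbook construction, not one taken from the cited references.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

# Assumed sample space: two tosses of a fair coin, all four outcomes equally likely.
omega = set(product("HT", repeat=2))

def prob(event):
    return Fraction(len(event), len(omega))

def pairwise_independent(events):
    """Eq.2: the product rule holds for every pair of events."""
    return all(prob(a & b) == prob(a) * prob(b) for a, b in combinations(events, 2))

def mutually_independent(events):
    """Eq.3: the product rule holds for every subset of two or more events."""
    for k in range(2, len(events) + 1):
        for subset in combinations(events, k):
            if prob(set.intersection(*subset)) != prod(prob(e) for e in subset):
                return False
    return True

A = {w for w in omega if w[0] == "H"}   # first toss is heads
B = {w for w in omega if w[1] == "H"}   # second toss is heads
C = {w for w in omega if w[0] == w[1]}  # both tosses agree

print(pairwise_independent([A, B, C]))  # True
print(mutually_independent([A, B, C]))  # False: P(A and B and C) = 1/4, not 1/8
print(mutually_independent([A, B]))     # True
```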

Log probability and information content

Stated in terms of log probability, two events are independent if and only if the log probability of the joint event is the sum of the log probabilities of the individual events:

$$\log \mathrm{P}(A \cap B) = \log \mathrm{P}(A) + \log \mathrm{P}(B)$$

In information theory, negative log probability is interpreted as information content, and thus two events are independent if and only if the information content of the combined event equals the sum of the information content of the individual events:

$$\mathrm{I}(A \cap B) = \mathrm{I}(A) + \mathrm{I}(B)$$

See Information content § Additivity of independent events for details.
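
A small numerical sketch of this additivity, under the assumption that all probabilities involved are strictly positive (so the logarithms exist). The probabilities and the helper names `information` and `additive` are hypothetical; floating-point comparison uses a tolerance.

```python
import math

def information(p):
    """Information content in bits, -log2(p); assumes p > 0."""
    return -math.log2(p)

def additive(p_a, p_b, p_ab):
    """True if I(A and B) equals I(A) + I(B) up to floating-point tolerance."""
    return math.isclose(information(p_ab), information(p_a) + information(p_b))

# Independent case: P(A) = 1/2, P(B) = 2/3, P(A and B) = 1/3.
print(additive(1 / 2, 2 / 3, 1 / 3))  # True
# Dependent case: P(A) = P(B) = 1/2, but P(A and B) = 1/6.
print(additive(1 / 2, 1 / 2, 1 / 6))  # False
```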

For real valued random variables

Two random variables

Two random variables $X$ and $Y$ are independent if and only if (iff) the elements of the π-system generated by them are independent; that is to say, for every $x$ and $y$, the events $\{X \leq x\}$ and $\{Y \leq y\}$ are independent events (as defined above in Eq.1). That is, $X$ and $Y$ with cumulative distribution functions $F_X(x)$ and $F_Y(y)$ are independent iff the combined random variable $(X, Y)$ has a joint cumulative distribution function[3]: p. 15 

$$F_{X,Y}(x, y) = F_X(x)\,F_Y(y) \quad \text{for all } x, y \qquad \text{(Eq.4)}$$

More generally (and equivalently when $X$ and $Y$ are real valued), if the pair of random variables $(X, Y)$ takes values in $\mathcal{X} \times \mathcal{Y}$ with joint probability distribution $P_{X,Y}$ and marginals $P_X$ and $P_Y$, then $X$ and $Y$ are independent iff we have the equality of measures

$$P_{X,Y}(d(x, y)) = P_X(dx)\,P_Y(dy),$$

i.e. for every Borel set $A \subseteq \mathcal{X} \times \mathcal{Y}$ we have

$$P_{X,Y}(A) = \int_A P_{X,Y}(d(x, y)) = \int_A P_X(dx)\,P_Y(dy),$$

where $P_{X,Y}(A) = P\left((X, Y) \in A\right)$. If $X$ and $Y$ are discrete valued, this simplifies to

$$P_{X,Y}(x_i, y_j) = P_X(x_i)\,P_Y(y_j) \text{ for all } i = 1, \ldots, |\mathcal{X}|,\ j = 1, \ldots, |\mathcal{Y}|,$$

while if $X$ and $Y$ are real valued and have probability densities $p_X(x)$ and $p_Y(y)$ and joint probability density $p_{X,Y}(x, y)$, it becomes

$$p_{X,Y}(x, y) = p_X(x)\,p_Y(y) \quad \text{for almost all } (x, y) \in \mathbb{R}^2,$$

where "almost all" means all except for a set of measure zero.
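
For the discrete case, the factorization can be checked directly from a joint probability table. The sketch below is illustrative only: the dictionary representation and the names `marginals` and `is_independent` are assumptions, and exact fractions are used so that the equality test is not affected by rounding.

```python
from fractions import Fraction

def marginals(joint):
    """Marginal pmfs P_X and P_Y from a joint pmf given as {(x, y): probability}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0) + p
        py[y] = py.get(y, 0) + p
    return px, py

def is_independent(joint):
    """Check P_{X,Y}(x, y) == P_X(x) * P_Y(y) for every pair of values."""
    px, py = marginals(joint)
    return all(joint.get((x, y), 0) == px[x] * py[y] for x in px for y in py)

F = Fraction
# Product of marginals (1/3, 2/3) and (1/4, 3/4): independent by construction.
product_joint = {(0, 0): F(1, 12), (0, 1): F(3, 12), (1, 0): F(2, 12), (1, 1): F(6, 12)}
# Y is a copy of a fair bit X: as dependent as possible.
copy_joint = {(0, 0): F(1, 2), (1, 1): F(1, 2)}

print(is_independent(product_joint))  # True
print(is_independent(copy_joint))     # False
```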

More than two random variables

A finite set of $n$ random variables $\{X_1, \ldots, X_n\}$ is pairwise independent if and only if every pair of random variables is independent. Even if the set of random variables is pairwise independent, it is not necessarily mutually independent as defined next.

A finite set of $n$ random variables $\{X_1, \ldots, X_n\}$ is mutually independent if and only if for any sequence of numbers $\{x_1, \ldots, x_n\}$, the events $\{X_1 \leq x_1\}, \ldots, \{X_n \leq x_n\}$ are mutually independent events (as defined above in Eq.3). This is equivalent to the following condition on the joint cumulative distribution function $F_{X_1, \ldots, X_n}(x_1, \ldots, x_n)$. A finite set of $n$ random variables $\{X_1, \ldots, X_n\}$ is mutually independent if and only if[3]: p. 16 

$$F_{X_1, \ldots, X_n}(x_1, \ldots, x_n) = F_{X_1}(x_1) \cdot \ldots \cdot F_{X_n}(x_n) \quad \text{for all } x_1, \ldots, x_n \qquad \text{(Eq.5)}$$

It is not necessary here to require that the probability distribution factorizes for all possible $k$-element subsets, as in the case for $n$ events. This is not required because, for example, $F_{X_1, X_2, X_3}(x_1, x_2, x_3) = F_{X_1}(x_1) \cdot F_{X_2}(x_2) \cdot F_{X_3}(x_3)$ implies $F_{X_1, X_3}(x_1, x_3) = F_{X_1}(x_1) \cdot F_{X_3}(x_3)$: letting $x_2 \to \infty$ turns the left-hand side into the joint distribution function of $(X_1, X_3)$, while $F_{X_2}(x_2) \to 1$ on the right.

Similarly to the case of two random variables, the general formulation of independence can be done measure-theoretically. Random variables $(X_1, \ldots, X_n)$ with values in measurable spaces $\mathcal{X}_1 \times \ldots \times \mathcal{X}_n$, with joint distribution $P_{X_1, \ldots, X_n}$ and marginals $P_{X_1}, \ldots, P_{X_n}$, are independent if and only if

$$P_{X_1, \ldots, X_n}(d(x_1, \ldots, x_n)) = P_{X_1}(dx_1) \cdots P_{X_n}(dx_n),$$

which again means that for every Borel set $A \subseteq \mathcal{X}_1 \times \cdots \times \mathcal{X}_n$, we have

$$P_{X_1, \ldots, X_n}(A) = \int_A P_{X_1, \ldots, X_n}(d(x_1, \ldots, x_n)) = \int_A P_{X_1}(dx_1) \cdots P_{X_n}(dx_n).$$

The definition is exactly equivalent to the one above when the values of the random variables are real numbers. It has the advantage of working also for complex-valued random variables or for random variables taking values in other measurable spaces (which include topological spaces endowed with appropriate σ-algebras).

For real valued random vectors

Two random vectors $\mathbf{X} = (X_1, \ldots, X_m)^{\mathrm{T}}$ and $\mathbf{Y} = (Y_1, \ldots, Y_n)^{\mathrm{T}}$ are called independent if[5]: p. 187 

$$F_{\mathbf{X},\mathbf{Y}}(\mathbf{x}, \mathbf{y}) = F_{\mathbf{X}}(\mathbf{x}) \cdot F_{\mathbf{Y}}(\mathbf{y}) \quad \text{for all } \mathbf{x}, \mathbf{y} \qquad \text{(Eq.6)}$$

where $F_{\mathbf{X}}(\mathbf{x})$ and $F_{\mathbf{Y}}(\mathbf{y})$ denote the cumulative distribution functions of $\mathbf{X}$ and $\mathbf{Y}$, and $F_{\mathbf{X},\mathbf{Y}}(\mathbf{x}, \mathbf{y})$ denotes their joint cumulative distribution function. Independence of $\mathbf{X}$ and $\mathbf{Y}$ is often denoted by $\mathbf{X} \perp\!\!\!\perp \mathbf{Y}$. Written component-wise, $\mathbf{X}$ and $\mathbf{Y}$ are called independent if

$$F_{X_1, \ldots, X_m, Y_1, \ldots, Y_n}(x_1, \ldots, x_m, y_1, \ldots, y_n) = F_{X_1, \ldots, X_m}(x_1, \ldots, x_m) \cdot F_{Y_1, \ldots, Y_n}(y_1, \ldots, y_n) \quad \text{for all } x_1, \ldots, x_m, y_1, \ldots, y_n.$$

For stochastic processes

For one stochastic process

The definition of independence may be extended from random vectors to a stochastic process. An independent stochastic process is required to have the property that the random variables obtained by sampling the process at any $n$ times $t_1, \ldots, t_n$ are independent random variables, for any $n$.[6]: p. 163 

Formally, a stochastic process $\left\{X_t\right\}_{t \in \mathcal{T}}$ is called independent if and only if for all $n \in \mathbb{N}$ and for all $t_1, \ldots, t_n \in \mathcal{T}$

$$F_{X_{t_1}, \ldots, X_{t_n}}(x_1, \ldots, x_n) = F_{X_{t_1}}(x_1) \cdot \ldots \cdot F_{X_{t_n}}(x_n) \quad \text{for all } x_1, \ldots, x_n \qquad \text{(Eq.7)}$$

where $F_{X_{t_1}, \ldots, X_{t_n}}(x_1, \ldots, x_n) = \mathrm{P}(X(t_1) \leq x_1, \ldots, X(t_n) \leq x_n)$. Independence of a stochastic process is a property within a stochastic process, not between two stochastic processes.

For two stochastic processes

Independence of two stochastic processes is a property between two stochastic processes $\left\{X_t\right\}_{t \in \mathcal{T}}$ and $\left\{Y_t\right\}_{t \in \mathcal{T}}$ that are defined on the same probability space $(\Omega, \mathcal{F}, P)$. Formally, two stochastic processes $\left\{X_t\right\}_{t \in \mathcal{T}}$ and $\left\{Y_t\right\}_{t \in \mathcal{T}}$ are said to be independent if for all $n \in \mathbb{N}$ and for all $t_1, \ldots, t_n \in \mathcal{T}$, the random vectors $(X(t_1), \ldots, X(t_n))$ and $(Y(t_1), \ldots, Y(t_n))$ are independent,[7]: p. 515  i.e. if

$$F_{X_{t_1}, \ldots, X_{t_n}, Y_{t_1}, \ldots, Y_{t_n}}(x_1, \ldots, x_n, y_1, \ldots, y_n) = F_{X_{t_1}, \ldots, X_{t_n}}(x_1, \ldots, x_n) \cdot F_{Y_{t_1}, \ldots, Y_{t_n}}(y_1, \ldots, y_n) \quad \text{for all } x_1, \ldots, x_n, y_1, \ldots, y_n \qquad \text{(Eq.8)}$$

Independent σ-algebras

The definitions above (Eq.1 and Eq.2) are both generalized by the following definition of independence for σ-algebras. Let $(\Omega, \Sigma, \mathrm{P})$ be a probability space and let $\mathcal{A}$ and $\mathcal{B}$ be two sub-σ-algebras of $\Sigma$. $\mathcal{A}$ and $\mathcal{B}$ are said to be independent if, whenever $A \in \mathcal{A}$ and $B \in \mathcal{B}$,

$$\mathrm{P}(A \cap B) = \mathrm{P}(A)\,\mathrm{P}(B).$$

Likewise, a finite family of σ-algebras $(\tau_i)_{i \in I}$, where $I$ is an index set, is said to be independent if and only if

$$\forall \left(A_i\right)_{i \in I} \in \prod\nolimits_{i \in I} \tau_i \ :\ \mathrm{P}\left(\bigcap\nolimits_{i \in I} A_i\right) = \prod\nolimits_{i \in I} \mathrm{P}\left(A_i\right)$$

and an infinite family of σ-algebras is said to be independent if all its finite subfamilies are independent.

The new definition relates to the previous ones very directly:

  • Two events are independent (in the old sense) if and only if the σ-algebras that they generate are independent (in the new sense). The σ-algebra generated by an event $E \in \Sigma$ is, by definition,
$$\sigma(\{E\}) = \{\emptyset, E, \Omega \setminus E, \Omega\}.$$

Using this definition, it is easy to show that if $X$ and $Y$ are random variables and $Y$ is constant, then $X$ and $Y$ are independent, since the σ-algebra generated by a constant random variable is the trivial σ-algebra $\{\varnothing, \Omega\}$. Probability zero events cannot affect independence, so independence also holds if $Y$ is only $\mathrm{P}$-almost surely constant.

Properties

Self-independence

Note that an event is independent of itself if and only if

$$\mathrm{P}(A) = \mathrm{P}(A \cap A) = \mathrm{P}(A) \cdot \mathrm{P}(A) \iff \mathrm{P}(A) = 0 \text{ or } \mathrm{P}(A) = 1.$$

Thus an event is independent of itself if and only if it almost surely occurs or its complement almost surely occurs; this fact is useful when proving zero–one laws.[8]

Similarly, a random variable is independent of itself if and only if it is almost surely constant.

Expectation, covariance, variance, and correlation

Main article: Correlation and dependence

If $X$ and $Y$ are statistically independent random variables, then:

- The expected value of the product is the product of the expected values[9]: p. 10 :

$$\operatorname{E}[X^n Y^m] = \operatorname{E}[X^n]\,\operatorname{E}[Y^m]$$

- The covariance $\operatorname{Cov}[X, Y]$ is zero:

$$\operatorname{Cov}[X, Y] = \operatorname{E}[XY] - \operatorname{E}[X]\,\operatorname{E}[Y] = 0$$

- The variance of the sum is the sum of the variances:

$$\operatorname{V}[X + Y] = \operatorname{V}[X] + \operatorname{V}[Y] + 2\operatorname{Cov}[X, Y] = \operatorname{V}[X] + \operatorname{V}[Y]$$

- The correlation is zero:

$$\rho_{X,Y} = \dfrac{\operatorname{Cov}[X, Y]}{\sigma_X \sigma_Y} = 0$$

The converse does not hold: none of these properties implies independence. For instance, two random variables with a covariance of 0 may still be dependent.
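
A standard counterexample can be sketched as follows; the specific distribution is an assumption chosen for illustration and is not taken from the cited sources. Let $X$ be uniform on $\{-1, 0, 1\}$ and $Y = X^2$. The covariance is exactly zero, yet $Y$ is a function of $X$.

```python
from fractions import Fraction

third = Fraction(1, 3)
# Joint pmf of (X, Y) with X uniform on {-1, 0, 1} and Y = X**2.
joint = {(-1, 1): third, (0, 0): third, (1, 1): third}

def expect(f):
    """Expected value of f(X, Y) under the joint pmf."""
    return sum(p * f(x, y) for (x, y), p in joint.items())

# Cov[X, Y] = E[XY] - E[X] E[Y]
cov = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)

p_x0 = sum(p for (x, y), p in joint.items() if x == 0)  # P(X = 0) = 1/3
p_y0 = sum(p for (x, y), p in joint.items() if y == 0)  # P(Y = 0) = 1/3
p_x0_y0 = joint[(0, 0)]                                 # P(X = 0, Y = 0) = 1/3

print(cov)                     # 0
print(p_x0_y0 == p_x0 * p_y0)  # False: 1/3 != 1/9, so X and Y are dependent
```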

See also: Uncorrelatedness (probability theory)

Similarly for two stochastic processes $\left\{X_t\right\}_{t \in \mathcal{T}}$ and $\left\{Y_t\right\}_{t \in \mathcal{T}}$: if they are independent, then they are uncorrelated.[10]: p. 151 

Characteristic function

Two random variables $X$ and $Y$ are independent if and only if the characteristic function of the random vector $(X, Y)$ satisfies

$$\varphi_{(X,Y)}(t, s) = \varphi_X(t) \cdot \varphi_Y(s).$$

In particular, the characteristic function of their sum is the product of their marginal characteristic functions:

$$\varphi_{X+Y}(t) = \varphi_X(t) \cdot \varphi_Y(t),$$

though the reverse implication is not true. Random variables that satisfy the latter condition are called subindependent.

Examples

Rolling dice

The event of getting a 6 the first time a die is rolled and the event of getting a 6 the second time are independent. By contrast, the event of getting a 6 the first time a die is rolled and the event that the sum of the numbers seen on the first and second trial is 8 are not independent.
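
Both claims can be verified by enumerating the 36 equally likely outcomes of two rolls. The sketch assumes a fair six-sided die; the predicate names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# All ordered outcomes of two rolls of a fair die, assumed equally likely.
omega = list(product(range(1, 7), repeat=2))

def prob(pred):
    """Probability that the predicate holds for a uniformly random outcome."""
    return Fraction(sum(1 for w in omega if pred(w)), len(omega))

def independent(p, q):
    return prob(lambda w: p(w) and q(w)) == prob(p) * prob(q)

first_six = lambda w: w[0] == 6
second_six = lambda w: w[1] == 6
sum_eight = lambda w: w[0] + w[1] == 8

print(independent(first_six, second_six))  # True:  1/36 == 1/6 * 1/6
print(independent(first_six, sum_eight))   # False: 1/36 != 1/6 * 5/36
```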

Drawing cards

If two cards are drawn with replacement from a deck of cards, the event of drawing a red card on the first trial and that of drawing a red card on the second trial are independent. By contrast, if two cards are drawn without replacement from a deck of cards, the event of drawing a red card on the first trial and that of drawing a red card on the second trial are not independent, because a deck that has had a red card removed has proportionately fewer red cards.
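
The two sampling schemes can be compared with exact arithmetic. The sketch assumes a standard 52-card deck with 26 red cards, which the text does not state explicitly; the function name is hypothetical.

```python
from fractions import Fraction

RED, TOTAL = 26, 52  # assumed: standard deck, half of the cards red

def red_red_probabilities(replace):
    """Return (P(both red), P(first red), P(second red)) for two draws."""
    p_red1 = Fraction(RED, TOTAL)
    if replace:
        p_red2_given_red1 = Fraction(RED, TOTAL)
        p_red2_given_black1 = Fraction(RED, TOTAL)
    else:
        p_red2_given_red1 = Fraction(RED - 1, TOTAL - 1)
        p_red2_given_black1 = Fraction(RED, TOTAL - 1)
    p_both = p_red1 * p_red2_given_red1
    # Law of total probability over the colour of the first card.
    p_red2 = p_both + (1 - p_red1) * p_red2_given_black1
    return p_both, p_red1, p_red2

for replace in (True, False):
    p_both, p_red1, p_red2 = red_red_probabilities(replace)
    print(replace, p_both, p_red1 * p_red2, p_both == p_red1 * p_red2)
# True  1/4    1/4 True
# False 25/102 1/4 False
```

In both schemes the second card is red with probability 1/2; only the joint probability differs, which is exactly what the product rule detects.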

Pairwise and mutual independence

[Figure: two probability spaces. First: pairwise independent, but not mutually independent, events. Second: mutually independent events.]

Consider the two probability spaces shown. In both cases, $\mathrm{P}(A) = \mathrm{P}(B) = 1/2$ and $\mathrm{P}(C) = 1/4$. The events in the first space are pairwise independent because $\mathrm{P}(A|B) = \mathrm{P}(A|C) = 1/2 = \mathrm{P}(A)$, $\mathrm{P}(B|A) = \mathrm{P}(B|C) = 1/2 = \mathrm{P}(B)$, and $\mathrm{P}(C|A) = \mathrm{P}(C|B) = 1/4 = \mathrm{P}(C)$; but the three events are not mutually independent. The events in the second space are both pairwise independent and mutually independent. To illustrate the difference, consider conditioning on two events. In the pairwise independent case, although any one event is independent of each of the other two individually, it is not independent of the intersection of the other two:

$$\mathrm{P}(A|BC) = \frac{\frac{4}{40}}{\frac{4}{40} + \frac{1}{40}} = \tfrac{4}{5} \neq \mathrm{P}(A)$$
$$\mathrm{P}(B|AC) = \frac{\frac{4}{40}}{\frac{4}{40} + \frac{1}{40}} = \tfrac{4}{5} \neq \mathrm{P}(B)$$
$$\mathrm{P}(C|AB) = \frac{\frac{4}{40}}{\frac{4}{40} + \frac{6}{40}} = \tfrac{2}{5} \neq \mathrm{P}(C)$$

In the mutually independent case, however,

$$\mathrm{P}(A|BC) = \frac{\frac{1}{16}}{\frac{1}{16} + \frac{1}{16}} = \tfrac{1}{2} = \mathrm{P}(A)$$
$$\mathrm{P}(B|AC) = \frac{\frac{1}{16}}{\frac{1}{16} + \frac{1}{16}} = \tfrac{1}{2} = \mathrm{P}(B)$$
$$\mathrm{P}(C|AB) = \frac{\frac{1}{16}}{\frac{1}{16} + \frac{3}{16}} = \tfrac{1}{4} = \mathrm{P}(C)$$
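
The first (pairwise independent) space can be checked numerically. The region masses below are not given in the text as a table; they are reconstructed here, as an assumption, from the stated marginals $\mathrm{P}(A) = \mathrm{P}(B) = 1/2$, $\mathrm{P}(C) = 1/4$ and the three conditional probabilities above, with every mass expressed in 40ths.

```python
from fractions import Fraction

# Mass (in 40ths) of each region, keyed by membership flags (in A, in B, in C).
# Reconstructed from the marginals and conditional probabilities quoted above.
atoms = {
    (1, 1, 1): 4, (0, 1, 1): 1, (1, 0, 1): 1, (1, 1, 0): 6,
    (1, 0, 0): 9, (0, 1, 0): 9, (0, 0, 1): 4, (0, 0, 0): 6,
}
A, B, C = 0, 1, 2  # positions of the events inside the key tuples

def prob(*events):
    """Probability of the intersection of the listed events."""
    mass = sum(m for key, m in atoms.items() if all(key[e] for e in events))
    return Fraction(mass, 40)

pairwise = all(prob(i, j) == prob(i) * prob(j) for i, j in [(A, B), (A, C), (B, C)])
triple = prob(A, B, C) == prob(A) * prob(B) * prob(C)

print(pairwise)                      # True
print(triple)                        # False: 1/10 != 1/16
print(prob(A, B, C) / prob(B, C))    # 4/5, matching P(A|BC) above
print(prob(A, B, C) / prob(A, B))    # 2/5, matching P(C|AB) above
```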

Triple-independence but no pairwise-independence

It is possible to create a three-event example in which

$$\mathrm{P}(A \cap B \cap C) = \mathrm{P}(A)\,\mathrm{P}(B)\,\mathrm{P}(C),$$

and yet no two of the three events are pairwise independent (and hence the set of events is not mutually independent).[11] This example shows that mutual independence involves requirements on the products of probabilities of all combinations of events, not just the product over all the single events, as in this example.
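
One such construction can be sketched as follows. It is a hypothetical example built for illustration, not necessarily the one in the cited reference: eight equally likely outcomes and three events of probability 1/2 whose only common outcome is 1.

```python
from fractions import Fraction
from itertools import combinations

# Assumed sample space: outcomes 1..8, all equally likely.
omega = set(range(1, 9))
A = {1, 2, 3, 4}
B = {1, 2, 3, 5}
C = {1, 6, 7, 8}

def prob(event):
    return Fraction(len(event), len(omega))

# The triple product rule holds: P({1}) = 1/8 = (1/2)**3.
triple_ok = prob(A & B & C) == prob(A) * prob(B) * prob(C)

# Collect the names of every pair that satisfies the product rule.
named = [("A", A), ("B", B), ("C", C)]
independent_pairs = [
    (name_a, name_b)
    for (name_a, a), (name_b, b) in combinations(named, 2)
    if prob(a & b) == prob(a) * prob(b)
]

print(triple_ok)          # True
print(independent_pairs)  # []: no pair is independent
```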

Conditional independence

Main article: Conditional independence

For events

The events $A$ and $B$ are conditionally independent given an event $C$ when

$$\mathrm{P}(A \cap B \mid C) = \mathrm{P}(A \mid C) \cdot \mathrm{P}(B \mid C).$$

For random variables

Intuitively, two random variables $X$ and $Y$ are conditionally independent given $Z$ if, once $Z$ is known, the value of $Y$ does not add any additional information about $X$. For instance, two measurements $X$ and $Y$ of the same underlying quantity $Z$ are not independent, but they are conditionally independent given $Z$ (unless the errors in the two measurements are somehow connected).

The formal definition of conditional independence is based on the idea of conditional distributions. If $X$, $Y$, and $Z$ are discrete random variables, then we define $X$ and $Y$ to be conditionally independent given $Z$ if

$$\mathrm{P}(X \leq x, Y \leq y \mid Z = z) = \mathrm{P}(X \leq x \mid Z = z) \cdot \mathrm{P}(Y \leq y \mid Z = z)$$

for all $x$, $y$ and $z$ such that $\mathrm{P}(Z = z) > 0$. On the other hand, if the random variables are continuous and have a joint probability density function $f_{XYZ}(x, y, z)$, then $X$ and $Y$ are conditionally independent given $Z$ if

$$f_{XY|Z}(x, y \mid z) = f_{X|Z}(x \mid z) \cdot f_{Y|Z}(y \mid z)$$

for all real numbers $x$, $y$ and $z$ such that $f_Z(z) > 0$.

If discrete $X$ and $Y$ are conditionally independent given $Z$, then

$$\mathrm{P}(X = x \mid Y = y, Z = z) = \mathrm{P}(X = x \mid Z = z)$$

for any $x$, $y$ and $z$ with $\mathrm{P}(Y = y, Z = z) > 0$. That is, the conditional distribution for $X$ given $Y$ and $Z$ is the same as that given $Z$ alone. A similar equation holds for the conditional probability density functions in the continuous case.
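
The measurement intuition can be sketched with a small discrete model. Everything in it is an assumption made for illustration: $Z$ is a fair bit, and given $Z = z$ the variables $X$ and $Y$ are independent bits that equal 1 with probability $q_z$. The check uses probability mass functions, which for discrete variables is equivalent to the distribution-function form given above.

```python
from fractions import Fraction
from itertools import product

# Assumed model: P(X = 1 | Z = z) = P(Y = 1 | Z = z) = q[z].
q = {0: Fraction(1, 4), 1: Fraction(3, 4)}

def bern(value, p):
    """Probability that a Bernoulli(p) variable takes the given value (0 or 1)."""
    return p if value == 1 else 1 - p

# Joint pmf of (X, Y, Z) with Z a fair bit.
joint = {
    (x, y, z): Fraction(1, 2) * bern(x, q[z]) * bern(y, q[z])
    for x, y, z in product((0, 1), repeat=3)
}

def prob(pred):
    return sum(p for key, p in joint.items() if pred(*key))

def conditionally_independent():
    """Check P(X=x, Y=y | Z=z) == P(X=x | Z=z) * P(Y=y | Z=z) for all x, y, z."""
    for z0 in (0, 1):
        pz = prob(lambda x, y, z: z == z0)
        for x0, y0 in product((0, 1), repeat=2):
            pxy = prob(lambda x, y, z: (x, y, z) == (x0, y0, z0)) / pz
            px = prob(lambda x, y, z: (x, z) == (x0, z0)) / pz
            py = prob(lambda x, y, z: (y, z) == (y0, z0)) / pz
            if pxy != px * py:
                return False
    return True

p_x1 = prob(lambda x, y, z: x == 1)                 # 1/2
p_y1 = prob(lambda x, y, z: y == 1)                 # 1/2
p_x1_y1 = prob(lambda x, y, z: x == 1 and y == 1)   # 5/16

print(conditionally_independent())  # True
print(p_x1_y1 == p_x1 * p_y1)       # False: marginally, X and Y are dependent
```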

Independence can be seen as a special kind of conditional independence, since probability can be seen as a kind of conditional probability given no events.

History

Before 1933, independence in probability theory was defined verbally. For example, de Moivre gave the following definition: “Two events are independent, when they have no connexion one with the other, and that the happening of one neither forwards nor obstructs the happening of the other”.[12] If there are n independent events, the probability of the event that all of them happen was computed as the product of the probabilities of these n events. Apparently, there was the conviction that this formula was a consequence of the above definition. (Sometimes this was called the Multiplication Theorem.) Of course, a proof of this assertion cannot work without further, more formal, tacit assumptions.

The definition of independence given in this article became the standard definition (now used in all books) after it appeared in 1933 as part of Kolmogorov's axiomatization of probability.[13] Kolmogorov credited it to S.N. Bernstein, and cited a publication which had appeared in Russian in 1927.[14]

Unfortunately, neither Bernstein nor Kolmogorov had been aware of the work of Georg Bohlmann. Bohlmann had given the same definition for two events in 1901[15] and for n events in 1908.[16] In the latter paper, he studied his notion in detail. For example, he gave the first example showing that pairwise independence does not imply mutual independence. Even today, Bohlmann is rarely quoted. More about his work can be found in On the contributions of Georg Bohlmann to probability theory by Ulrich Krengel.[17]

References

  1. Russell, Stuart; Norvig, Peter (2002). Artificial Intelligence: A Modern Approach. Prentice Hall. p. 478. ISBN 0-13-790395-2.
  2. Florescu, Ionut (2014). Probability and Stochastic Processes. Wiley. ISBN 978-0-470-62455-5.
  3. Gallager, Robert G. (2013). Stochastic Processes Theory for Applications. Cambridge University Press. ISBN 978-1-107-03975-9.
  4. Feller, W (1971). "Stochastic Independence". An Introduction to Probability Theory and Its Applications. Wiley.
  5. Papoulis, Athanasios (1991). Probability, Random Variables and Stochastic Processes. McGraw Hill. ISBN 0-07-048477-5.
  6. Hwei, Piao (1997). Theory and Problems of Probability, Random Variables, and Random Processes. McGraw-Hill. ISBN 0-07-030644-3.
  7. Lapidoth, Amos (8 February 2017). A Foundation in Digital Communication. Cambridge University Press. ISBN 978-1-107-17732-1.
  8. Durrett, Richard (1996). Probability: theory and examples (Second ed.). p. 62.
  9. Jakeman, E. Modeling Fluctuations in Scattered Waves. ISBN 978-0-7503-1005-5.
  10. Park, Kun Il (2018). Fundamentals of Probability and Stochastic Processes with Applications to Communications. Springer. ISBN 978-3-319-68074-3.
  11. George, Glyn. "Testing for the independence of three events". Mathematical Gazette 88, November 2004, 568.
  12. Cited according to: Grinstead and Snell's Introduction to Probability. In: The CHANCE Project. Version of July 4, 2006.
  13. Kolmogorov, Andrey (1933). Grundbegriffe der Wahrscheinlichkeitsrechnung (in German). Berlin: Julius Springer. Translation: Kolmogorov, Andrey (1956). Foundations of the Theory of Probability (2nd ed.). New York: Chelsea. ISBN 978-0-8284-0023-7.
  14. Bernstein, S.N. Probability Theory (Russian). Moscow, 1927 (4 editions, latest 1946).
  15. Bohlmann, Georg. Lebensversicherungsmathematik. Encyklopädie der mathematischen Wissenschaften, Bd I, Teil 2, Artikel I D 4b (1901), 852–917.
  16. Bohlmann, Georg. Die Grundbegriffe der Wahrscheinlichkeitsrechnung in ihrer Anwendung auf die Lebensversicherung. Atti del IV. Congr. Int. dei Matem. Rom, Bd. III (1908), 244–278.
  17. Krengel, Ulrich. On the contributions of Georg Bohlmann to probability theory. Electronic Journal for History of Probability and Statistics, 2011.

Retrieved from "https://en.wikipedia.org/w/index.php?title=Independence_(probability_theory)&oldid=1337466346"