Borel–Kolmogorov paradox

From Wikipedia, the free encyclopedia
Conditional probability paradox

In probability theory, the Borel–Kolmogorov paradox (sometimes known as Borel's paradox) is a paradox relating to conditional probability with respect to an event of probability zero (also known as a null set). It is named after Émile Borel and Andrey Kolmogorov.

A great circle puzzle

Suppose that a random variable has a uniform distribution on a unit sphere. What is its conditional distribution on a great circle? Because of the symmetry of the sphere, one might expect that the distribution is uniform and independent of the choice of coordinates. However, two analyses give contradictory results. First, note that choosing a point uniformly on the sphere is equivalent to choosing the longitude $\lambda$ uniformly from $[-\pi, \pi]$ and choosing the latitude $\varphi$ from $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$ with density $\frac{1}{2}\cos\varphi$.[1] Then we can look at two different great circles:

  1. If the coordinates are chosen so that the great circle is an equator (latitude $\varphi = 0$), the conditional density for a longitude $\lambda$ defined on the interval $[-\pi, \pi]$ is $f(\lambda \mid \varphi = 0) = \frac{1}{2\pi}.$
  2. If the great circle is a line of longitude with $\lambda = 0$, the conditional density for $\varphi$ on the interval $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$ is $f(\varphi \mid \lambda = 0) = \frac{1}{2}\cos\varphi.$

One distribution is uniform on the circle, the other is not. Yet both seem to be referring to the same great circle in different coordinate systems.
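
The discrepancy can be observed empirically. The following is a minimal Monte Carlo sketch, not part of the cited sources: it draws uniform points on the sphere by normalizing Gaussian vectors, then conditions either on a thin ring around the equator or on a thin lune around the meridian $\lambda = 0$. The function names, the width `eps`, and the choice of "middle third of the arc" as the test event are all assumptions made for illustration.

```python
import math
import random


def sample_sphere(rng):
    """Uniform point on the unit sphere via a normalized Gaussian vector."""
    while True:
        x, y, z = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        n = math.sqrt(x * x + y * y + z * z)
        if n > 1e-12:
            return x / n, y / n, z / n


def conditional_fractions(n_samples=200_000, eps=0.05, seed=0):
    """Fraction of conditioned points that fall in the middle third of the arc.

    Returns (ring_fraction, lune_fraction):
      ring: given |latitude| < eps, fraction with |longitude| < pi/3
      lune: given |longitude| < eps, fraction with |latitude| < pi/6
    """
    rng = random.Random(seed)
    ring_total = ring_hits = 0
    lune_total = lune_hits = 0
    for _ in range(n_samples):
        x, y, z = sample_sphere(rng)
        lat = math.asin(max(-1.0, min(1.0, z)))  # clamp against rounding
        lon = math.atan2(y, x)
        if abs(lat) < eps:  # thin ring around the equator
            ring_total += 1
            if abs(lon) < math.pi / 3:
                ring_hits += 1
        if abs(lon) < eps:  # thin lune around the meridian lon = 0
            lune_total += 1
            if abs(lat) < math.pi / 6:
                lune_hits += 1
    return ring_hits / ring_total, lune_hits / lune_total


if __name__ == "__main__":
    ring, lune = conditional_fractions()
    print(f"ring: {ring:.3f}  lune: {lune:.3f}")
```

Under a uniform distribution along the arc, both fractions would be close to 1/3. The ring estimate should land near 1/3, while the lune estimate should land near $\sin(\pi/6) = 1/2$, in line with the density $\frac{1}{2}\cos\varphi$.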

Many quite futile arguments have raged — between otherwise competent probabilists — over which of these results is 'correct'.

— E.T. Jaynes[1]

Explanation and implications

In case (1) above, the conditional probability that the longitude $\lambda$ lies in a set $E$ given that $\varphi = 0$ can be written $P(\lambda \in E \mid \varphi = 0)$. Elementary probability theory suggests this can be computed as $P(\lambda \in E \text{ and } \varphi = 0)/P(\varphi = 0)$, but that expression is not well-defined since $P(\varphi = 0) = 0$. Measure theory provides a way to define a conditional probability, using the limit of events $R_{ab} = \{\varphi : a < \varphi < b\}$, which are horizontal rings (curved surface zones of spherical segments) consisting of all points with latitude between $a$ and $b$.

The resolution of the paradox is to notice that in case (2), $P(\varphi \in F \mid \lambda = 0)$ is defined using a limit of the events $L_{cd} = \{\lambda : c < \lambda < d\}$, which are lunes (vertical wedges), consisting of all points whose longitude varies between $c$ and $d$. So although $P(\lambda \in E \mid \varphi = 0)$ and $P(\varphi \in F \mid \lambda = 0)$ each provide a probability distribution on a great circle, one of them is defined using limits of rings, and the other using limits of lunes. Since rings and lunes have different shapes, it should be less surprising that $P(\lambda \in E \mid \varphi = 0)$ and $P(\varphi \in F \mid \lambda = 0)$ have different distributions.
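
As a worked check of the two limiting procedures, the conditional probabilities given a ring (latitude between $a$ and $b$) and given a lune (longitude between $c$ and $d$) can be computed directly from the densities stated earlier: uniform longitude and latitude density $\frac{1}{2}\cos\varphi$. In this sketch, $|E|$ denotes the length of $E \subseteq [-\pi, \pi]$, a notation assumed here rather than taken from the sources. Because longitude and latitude are independent, neither result depends on how narrow the ring or lune is, so the limits are immediate:

```latex
\begin{aligned}
P(\lambda \in E \mid a < \varphi < b)
  &= \frac{\frac{|E|}{2\pi} \int_a^b \frac{1}{2}\cos\varphi \, d\varphi}
          {\int_a^b \frac{1}{2}\cos\varphi \, d\varphi}
   = \frac{|E|}{2\pi}, \\
P(\varphi \in F \mid c < \lambda < d)
  &= \frac{\frac{d-c}{2\pi} \int_F \frac{1}{2}\cos\varphi \, d\varphi}
          {\frac{d-c}{2\pi}}
   = \int_F \frac{1}{2}\cos\varphi \, d\varphi .
\end{aligned}
```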

The concept of a conditional probability with regard to an isolated hypothesis whose probability equals 0 is inadmissible. For we can obtain a probability distribution for [the latitude] on the meridian circle only if we regard this circle as an element of the decomposition of the entire spherical surface onto meridian circles with the given poles

— Andrey Kolmogorov[2]

… the term 'great circle' is ambiguous until we specify what limiting operation is to produce it. The intuitive symmetry argument presupposes the equatorial limit; yet one eating slices of an orange might presuppose the other.

— E.T. Jaynes[1]

Mathematical explication

Measure theoretic perspective

To understand the problem we need to recognize that a distribution on a continuous random variable is described by a density $f$ only with respect to some measure $\mu$. Both are important for the full description of the probability distribution. Or, equivalently, we need to fully define the space on which we want to define $f$.

Let $\Phi$ and $\Lambda$ denote two random variables taking values in $\Omega_1 = \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$ and $\Omega_2 = [-\pi, \pi]$, respectively. An event $\{\Phi = \varphi, \Lambda = \lambda\}$ gives a point on the sphere $S(r)$ with radius $r$. We define the coordinate transform

$$
\begin{aligned}
x &= r\cos\varphi\cos\lambda \\
y &= r\cos\varphi\sin\lambda \\
z &= r\sin\varphi
\end{aligned}
$$

for which we obtain the volume element

$$
\omega_r(\varphi,\lambda) = \left\| \frac{\partial(x,y,z)}{\partial\varphi} \times \frac{\partial(x,y,z)}{\partial\lambda} \right\| = r^2\cos\varphi .
$$

Furthermore, if either $\varphi$ or $\lambda$ is fixed, we get the volume elements

$$
\begin{aligned}
\omega_r(\lambda) &= \left\| \frac{\partial(x,y,z)}{\partial\varphi} \right\| = r , \quad \text{respectively} \\
\omega_r(\varphi) &= \left\| \frac{\partial(x,y,z)}{\partial\lambda} \right\| = r\cos\varphi .
\end{aligned}
$$
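
The three volume elements can be checked numerically. The sketch below is an illustration only, with helper names chosen here: it differentiates the coordinate transform by central finite differences and compares the resulting norms with $r^2\cos\varphi$, $r$, and $r\cos\varphi$.

```python
import math


def position(phi, lam, r=1.0):
    """Coordinate transform (phi = latitude, lam = longitude) -> (x, y, z)."""
    return (
        r * math.cos(phi) * math.cos(lam),
        r * math.cos(phi) * math.sin(lam),
        r * math.sin(phi),
    )


def cross(u, v):
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def norm(u):
    return math.sqrt(sum(c * c for c in u))


def volume_elements(phi, lam, r=1.0, h=1e-6):
    """Return (surface element, norm of d/dphi, norm of d/dlam) numerically."""
    # Central differences of the position vector in each coordinate.
    d_phi = tuple(
        (a - b) / (2 * h)
        for a, b in zip(position(phi + h, lam, r), position(phi - h, lam, r))
    )
    d_lam = tuple(
        (a - b) / (2 * h)
        for a, b in zip(position(phi, lam + h, r), position(phi, lam - h, r))
    )
    return norm(cross(d_phi, d_lam)), norm(d_phi), norm(d_lam)


if __name__ == "__main__":
    r, phi, lam = 2.0, 0.7, -1.3
    print(volume_elements(phi, lam, r))
    print(r * r * math.cos(phi), r, r * math.cos(phi))
```

The arc-length element along a meridian (fixed $\lambda$) is constant, whereas the one along a circle of latitude (fixed $\varphi$) shrinks with $\cos\varphi$. This asymmetry is what the conditional measures below inherit.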

Let

$$
\mu_{\Phi,\Lambda}(d\varphi, d\lambda) = f_{\Phi,\Lambda}(\varphi,\lambda)\,\omega_r(\varphi,\lambda)\,d\varphi\,d\lambda
$$

denote the joint measure on $\mathcal{B}(\Omega_1 \times \Omega_2)$, which has a density $f_{\Phi,\Lambda}$ with respect to $\omega_r(\varphi,\lambda)\,d\varphi\,d\lambda$, and let

$$
\begin{aligned}
\mu_\Phi(d\varphi) &= \int_{\lambda \in \Omega_2} \mu_{\Phi,\Lambda}(d\varphi, d\lambda) , \\
\mu_\Lambda(d\lambda) &= \int_{\varphi \in \Omega_1} \mu_{\Phi,\Lambda}(d\varphi, d\lambda) .
\end{aligned}
$$

If we assume that the density $f_{\Phi,\Lambda}$ is uniform, then

$$
\begin{aligned}
\mu_{\Phi \mid \Lambda}(d\varphi \mid \lambda) &= \frac{\mu_{\Phi,\Lambda}(d\varphi, d\lambda)}{\mu_\Lambda(d\lambda)} = \frac{1}{2r}\,\omega_r(\varphi)\,d\varphi , \quad \text{and} \\
\mu_{\Lambda \mid \Phi}(d\lambda \mid \varphi) &= \frac{\mu_{\Phi,\Lambda}(d\varphi, d\lambda)}{\mu_\Phi(d\varphi)} = \frac{1}{2r\pi}\,\omega_r(\lambda)\,d\lambda .
\end{aligned}
$$

Hence, $\mu_{\Phi \mid \Lambda}$ has a uniform density with respect to $\omega_r(\varphi)\,d\varphi$ but not with respect to the Lebesgue measure. On the other hand, $\mu_{\Lambda \mid \Phi}$ has a uniform density with respect to both $\omega_r(\lambda)\,d\lambda$ and the Lebesgue measure.

Proof of contradiction

Consider a random vector $(X, Y, Z)$ that is uniformly distributed on the unit sphere $S^2$.

We begin by parametrizing the sphere with the usual spherical polar coordinates:

$$
\begin{aligned}
x &= \cos(\varphi)\cos(\theta) \\
y &= \cos(\varphi)\sin(\theta) \\
z &= \sin(\varphi)
\end{aligned}
$$

where $-\frac{\pi}{2} \leq \varphi \leq \frac{\pi}{2}$ and $-\pi \leq \theta \leq \pi$.

We can define random variables $\Phi$, $\Theta$ as the values of $(X, Y, Z)$ under the inverse of this parametrization, or more formally using the arctan2 function:

$$
\begin{aligned}
\Phi &= \arcsin(Z) \\
\Theta &= \arctan_2\left(\frac{Y}{\sqrt{1 - Z^2}}, \frac{X}{\sqrt{1 - Z^2}}\right)
\end{aligned}
$$

Using the formulas for the surface area of a spherical cap and of a spherical wedge, the surface area of a spherical cap wedge is given by

$$
\operatorname{Area}(\Theta \leq \theta, \Phi \leq \varphi) = (1 + \sin(\varphi))(\theta + \pi)
$$

Since $(X, Y, Z)$ is uniformly distributed, the probability is proportional to the surface area, giving the joint cumulative distribution function

$$
F_{\Phi,\Theta}(\varphi,\theta) = P(\Theta \leq \theta, \Phi \leq \varphi) = \frac{1}{4\pi}(1 + \sin(\varphi))(\theta + \pi)
$$

The joint probability density function is then given by

$$
f_{\Phi,\Theta}(\varphi,\theta) = \frac{\partial^2}{\partial\varphi\,\partial\theta} F_{\Phi,\Theta}(\varphi,\theta) = \frac{1}{4\pi}\cos(\varphi)
$$

Note that $\Phi$ and $\Theta$ are independent random variables, since this density factors into a function of $\varphi$ times a function of $\theta$.
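
The step from the cumulative distribution function to the density, and the factorization behind the independence claim, can be verified numerically. This is a small sketch with function names chosen here: it takes a mixed finite difference of the CDF and compares it with $\frac{1}{4\pi}\cos\varphi$ and with the product of the marginal densities $\frac{1}{2}\cos\varphi$ and $\frac{1}{2\pi}$.

```python
import math


def joint_cdf(phi, theta):
    """F(phi, theta) = (1 + sin(phi)) * (theta + pi) / (4 * pi)."""
    return (1.0 + math.sin(phi)) * (theta + math.pi) / (4.0 * math.pi)


def joint_pdf_numeric(phi, theta, h=1e-4):
    """Mixed second partial derivative of the CDF by central differences."""
    return (
        joint_cdf(phi + h, theta + h)
        - joint_cdf(phi + h, theta - h)
        - joint_cdf(phi - h, theta + h)
        + joint_cdf(phi - h, theta - h)
    ) / (4.0 * h * h)


def marginal_product(phi):
    """Product of the marginals: (1/2) cos(phi) for Phi, 1/(2 pi) for Theta."""
    return (0.5 * math.cos(phi)) * (1.0 / (2.0 * math.pi))


if __name__ == "__main__":
    phi, theta = 0.4, 1.0
    print(joint_pdf_numeric(phi, theta))
    print(math.cos(phi) / (4.0 * math.pi))
    print(marginal_product(phi))
```

All three printed values should agree to within the finite-difference error.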

For simplicity, we won't calculate the full conditional distribution on a great circle, only the probability that the random vector lies in the first octant of longitude, $0 < \Theta < \frac{\pi}{4}$. That is to say, we will attempt to calculate the conditional probability $P(A \mid B)$ with

$$
\begin{aligned}
A &= \left\{0 < \Theta < \frac{\pi}{4}\right\} && = \{0 < X < 1,\ 0 < Y < X\} \\
B &= \{\Phi = 0\} && = \{Z = 0\}
\end{aligned}
$$

We attempt to evaluate the conditional probability as a limit of conditioning on the events

$$
B_\varepsilon = \{|\Phi| < \varepsilon\}
$$

Since $\Phi$ and $\Theta$ are independent, so are the events $A$ and $B_\varepsilon$; therefore

$$
P(A \mid B) \mathrel{\stackrel{?}{=}} \lim_{\varepsilon \to 0} \frac{P(A \cap B_\varepsilon)}{P(B_\varepsilon)} = \lim_{\varepsilon \to 0} P(A) = P\left(0 < \Theta < \frac{\pi}{4}\right) = \frac{1}{8}.
$$

Now we repeat the process with a different parametrization of the sphere:

$$
\begin{aligned}
x &= \sin(\varphi) \\
y &= \cos(\varphi)\sin(\theta) \\
z &= -\cos(\varphi)\cos(\theta)
\end{aligned}
$$

This is equivalent to the previous parametrization rotated by 90 degrees around the $y$ axis.

Define new random variables

$$
\begin{aligned}
\Phi' &= \arcsin(X) \\
\Theta' &= \arctan_2\left(\frac{Y}{\sqrt{1 - X^2}}, \frac{-Z}{\sqrt{1 - X^2}}\right).
\end{aligned}
$$

Rotation is measure preserving, so the density of $\Phi'$ and $\Theta'$ is the same:

$$
f_{\Phi',\Theta'}(\varphi,\theta) = \frac{1}{4\pi}\cos(\varphi).
$$

The expressions for $A$ and $B$ are:

$$
\begin{aligned}
A &= \left\{0 < \Theta < \frac{\pi}{4}\right\} && = \{0 < X < 1,\ 0 < Y < X\} && = \left\{0 < \Theta' < \pi,\ 0 < \Phi' < \frac{\pi}{2},\ \sin(\Theta') < \tan(\Phi')\right\} \\
B &= \{\Phi = 0\} && = \{Z = 0\} && = \left\{\Theta' = -\frac{\pi}{2}\right\} \cup \left\{\Theta' = \frac{\pi}{2}\right\}.
\end{aligned}
$$

We attempt again to evaluate the conditional probability as a limit of conditioning on the events

$$
B'_\varepsilon = \left\{\left|\Theta' + \frac{\pi}{2}\right| < \varepsilon\right\} \cup \left\{\left|\Theta' - \frac{\pi}{2}\right| < \varepsilon\right\}.
$$

Using L'Hôpital's rule and differentiation under the integral sign:

$$
\begin{aligned}
P(A \mid B) &\mathrel{\stackrel{?}{=}} \lim_{\varepsilon \to 0} \frac{P(A \cap B'_\varepsilon)}{P(B'_\varepsilon)} \\
&= \lim_{\varepsilon \to 0} \frac{1}{\frac{4\varepsilon}{2\pi}} P\left(\frac{\pi}{2} - \varepsilon < \Theta' < \frac{\pi}{2} + \varepsilon,\ 0 < \Phi' < \frac{\pi}{2},\ \sin(\Theta') < \tan(\Phi')\right) \\
&= \frac{\pi}{2} \lim_{\varepsilon \to 0} \frac{\partial}{\partial\varepsilon} \int_{\pi/2 - \varepsilon}^{\pi/2 + \varepsilon} \int_0^{\pi/2} 1_{\sin(\theta) < \tan(\varphi)}\, f_{\Phi',\Theta'}(\varphi,\theta)\, \mathrm{d}\varphi\, \mathrm{d}\theta \\
&= \pi \int_0^{\pi/2} 1_{1 < \tan(\varphi)}\, f_{\Phi',\Theta'}\left(\varphi, \frac{\pi}{2}\right) \mathrm{d}\varphi \\
&= \pi \int_{\pi/4}^{\pi/2} \frac{1}{4\pi}\cos(\varphi)\, \mathrm{d}\varphi \\
&= \frac{1}{4}\left(1 - \frac{1}{\sqrt{2}}\right) \neq \frac{1}{8}
\end{aligned}
$$
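
Both limits can be approximated numerically for a small but finite $\varepsilon$. The sketch below is illustrative only. It uses a plain midpoint rule (a choice made here, not something the derivation prescribes) to integrate the density $\frac{1}{4\pi}\cos\varphi$ over the relevant regions; the function names and the value `eps = 1e-3` are assumptions.

```python
import math


def density(phi):
    """Joint density cos(phi) / (4 pi); it does not depend on the longitude."""
    return math.cos(phi) / (4.0 * math.pi)


def midpoint(f, a, b, n=400):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def ratio_rings(eps):
    """P(A and B_eps) / P(B_eps), conditioning on the ring |Phi| < eps."""
    ring_mass = midpoint(density, -eps, eps)  # integral over the latitude band
    numerator = (math.pi / 4.0) * ring_mass   # longitudes 0 < Theta < pi/4
    denominator = (2.0 * math.pi) * ring_mass  # all longitudes
    return numerator / denominator


def ratio_lunes(eps):
    """P(A and B'_eps) / P(B'_eps), conditioning on lunes around Theta' = +-pi/2."""

    def inner(theta):
        # Indicator sin(theta) < tan(phi) holds for phi above atan(sin(theta)).
        lower = math.atan(math.sin(theta))
        return midpoint(density, lower, math.pi / 2.0)

    numerator = midpoint(inner, math.pi / 2.0 - eps, math.pi / 2.0 + eps)
    # Two lunes, each of longitude width 2 * eps, over all latitudes.
    denominator = 2.0 * (2.0 * eps) * midpoint(density, -math.pi / 2.0, math.pi / 2.0)
    return numerator / denominator


if __name__ == "__main__":
    eps = 1e-3
    print("rings:", ratio_rings(eps))
    print("lunes:", ratio_lunes(eps))
    print("closed form for lunes:", 0.25 * (1.0 - 1.0 / math.sqrt(2.0)))
```

With a small `eps`, the ring ratio should stay at $\frac{1}{8}$ and the lune ratio should approach $\frac{1}{4}\left(1 - \frac{1}{\sqrt{2}}\right)$, which is roughly 0.073.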

This shows that the conditional density cannot be treated as conditioning on an event of probability zero: the limit depends on which family of shrinking events is used to approach that event, as explained in Conditional probability § Conditioning on an event of probability zero.

Notes

  1. Jaynes 2003, pp. 1514–1517
  2. Originally Kolmogorov (1933), translated in Kolmogorov (1956). Sourced from Pollard (2002)
