Sum of normally distributed random variables

From Wikipedia, the free encyclopedia
Aspect of probability theory

In probability theory, calculation of the sum of normally distributed random variables is an instance of the arithmetic of random variables.

This is not to be confused with the sum of normal distributions, which forms a mixture distribution.

Independent random variables


Let X and Y be independent random variables that are normally distributed (and therefore also jointly so). Then their sum is also normally distributed; i.e., if

$$X \sim N(\mu_X, \sigma_X^2)$$
$$Y \sim N(\mu_Y, \sigma_Y^2)$$
$$Z = X + Y,$$

then

$$Z \sim N(\mu_X + \mu_Y, \sigma_X^2 + \sigma_Y^2).$$

This means that the sum of two independent normally distributed random variables is normal, with its mean being the sum of the two means, and its variance being the sum of the two variances (i.e., the square of the standard deviation is the sum of the squares of the standard deviations).[1]

In order for this result to hold, the assumption that X and Y are independent cannot be dropped, although it can be weakened to the assumption that X and Y are jointly, rather than separately, normally distributed.[2] (The section on correlated random variables below describes how the result can fail when X and Y are only separately normal.)

The result about the mean holds in all cases, while the result for the variance requires uncorrelatedness, but not independence.
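
As a rough numerical illustration of the statement above, the following sketch draws independent normal samples and compares the sample mean and variance of their sum with the predicted values. The function name `simulate_sum`, the parameter values, and the sample size are illustrative choices, not part of the article.

```python
import random
import statistics

def simulate_sum(mu_x, sigma_x, mu_y, sigma_y, n=200_000, seed=0):
    """Draw n independent pairs (X, Y) and return the sample mean and variance of X + Y."""
    rng = random.Random(seed)
    z = [rng.gauss(mu_x, sigma_x) + rng.gauss(mu_y, sigma_y) for _ in range(n)]
    mean = statistics.fmean(z)
    variance = statistics.fmean((v - mean) ** 2 for v in z)
    return mean, variance

# Hypothetical parameters: X ~ N(1, 2^2), Y ~ N(-3, 1.5^2)
mean_z, var_z = simulate_sum(1.0, 2.0, -3.0, 1.5)

# Predicted by the result above: mean 1 + (-3) = -2, variance 2^2 + 1.5^2 = 6.25
print(mean_z, var_z)
```

The sample values should land close to the predicted ones, up to Monte Carlo error; the check says nothing about normality of the sum, only about its first two moments.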

Proofs


Proof using characteristic functions


The characteristic function

$$\varphi_{X+Y}(t) = \operatorname{E}\left(e^{it(X+Y)}\right)$$

of the sum of two independent random variables X and Y is just the product of the two separate characteristic functions

$$\varphi_X(t) = \operatorname{E}\left(e^{itX}\right), \qquad \varphi_Y(t) = \operatorname{E}\left(e^{itY}\right)$$

of X and Y.

The characteristic function of the normal distribution with expected value μ and variance σ² is

$$\varphi(t) = \exp\left(it\mu - \frac{\sigma^2 t^2}{2}\right).$$

So

$$\begin{aligned}
\varphi_{X+Y}(t) = \varphi_X(t)\varphi_Y(t) &= \exp\left(it\mu_X - \frac{\sigma_X^2 t^2}{2}\right) \exp\left(it\mu_Y - \frac{\sigma_Y^2 t^2}{2}\right) \\[6pt]
&= \exp\left(it(\mu_X + \mu_Y) - \frac{(\sigma_X^2 + \sigma_Y^2) t^2}{2}\right).
\end{aligned}$$

This is the characteristic function of the normal distribution with expected value $\mu_X + \mu_Y$ and variance $\sigma_X^2 + \sigma_Y^2$.

Finally, recall that no two distinct distributions can both have the same characteristic function, so the distribution of X + Y must be just this normal distribution.
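
The product rule used in this proof can be checked numerically. The sketch below compares the product of the two normal characteristic functions with a Monte Carlo estimate of E(e^{it(X+Y)}) at a single value of t. The helper names `normal_cf` and `empirical_cf`, the parameters, and the choice t = 0.4 are assumptions made for illustration.

```python
import cmath
import random

def normal_cf(t, mu, sigma):
    """Characteristic function of N(mu, sigma^2) evaluated at t."""
    return cmath.exp(1j * t * mu - 0.5 * (sigma * t) ** 2)

def empirical_cf(samples, t):
    """Monte Carlo estimate of E[exp(i t Z)] from a list of samples of Z."""
    return sum(cmath.exp(1j * t * s) for s in samples) / len(samples)

rng = random.Random(1)
mu_x, sigma_x, mu_y, sigma_y = 1.0, 2.0, -3.0, 1.5  # hypothetical parameters
z = [rng.gauss(mu_x, sigma_x) + rng.gauss(mu_y, sigma_y) for _ in range(100_000)]

t = 0.4
product = normal_cf(t, mu_x, sigma_x) * normal_cf(t, mu_y, sigma_y)
estimate = empirical_cf(z, t)

# The gap should be of the order of the Monte Carlo error, roughly 1/sqrt(n)
print(abs(product - estimate))
```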

Proof using convolutions


For independent random variables X and Y, the density $f_Z$ of Z = X + Y equals the convolution of $f_X$ and $f_Y$:

$$f_Z(z) = \int_{-\infty}^{\infty} f_Y(z - x) f_X(x)\,dx$$

Given that $f_X$ and $f_Y$ are normal densities,

$$\begin{aligned}
f_X(x) &= \mathcal{N}(x; \mu_X, \sigma_X^2) = \frac{1}{\sqrt{2\pi}\sigma_X} e^{-(x-\mu_X)^2/(2\sigma_X^2)} \\[5pt]
f_Y(y) &= \mathcal{N}(y; \mu_Y, \sigma_Y^2) = \frac{1}{\sqrt{2\pi}\sigma_Y} e^{-(y-\mu_Y)^2/(2\sigma_Y^2)}
\end{aligned}$$

Substituting into the convolution:

$$\begin{aligned}
f_Z(z) &= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\sigma_Y} \exp\left[-\frac{(z-x-\mu_Y)^2}{2\sigma_Y^2}\right] \frac{1}{\sqrt{2\pi}\sigma_X} \exp\left[-\frac{(x-\mu_X)^2}{2\sigma_X^2}\right] dx \\[6pt]
&= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\sqrt{2\pi}\sigma_X\sigma_Y} \exp\left[-\frac{\sigma_X^2(z-x-\mu_Y)^2 + \sigma_Y^2(x-\mu_X)^2}{2\sigma_X^2\sigma_Y^2}\right] dx \\[6pt]
&= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\sqrt{2\pi}\sigma_X\sigma_Y} \exp\left[-\frac{\sigma_X^2(z^2 + x^2 + \mu_Y^2 - 2xz - 2z\mu_Y + 2x\mu_Y) + \sigma_Y^2(x^2 + \mu_X^2 - 2x\mu_X)}{2\sigma_Y^2\sigma_X^2}\right] dx \\[6pt]
&= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\sqrt{2\pi}\sigma_X\sigma_Y} \exp\left[-\frac{x^2(\sigma_X^2 + \sigma_Y^2) - 2x\left(\sigma_X^2(z-\mu_Y) + \sigma_Y^2\mu_X\right) + \sigma_X^2(z^2 + \mu_Y^2 - 2z\mu_Y) + \sigma_Y^2\mu_X^2}{2\sigma_Y^2\sigma_X^2}\right] dx
\end{aligned}$$

Defining $\sigma_Z = \sqrt{\sigma_X^2 + \sigma_Y^2}$, and completing the square:

$$\begin{aligned}
f_Z(z) &= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\sigma_Z} \frac{1}{\sqrt{2\pi}\frac{\sigma_X\sigma_Y}{\sigma_Z}} \exp\left[-\frac{x^2 - 2x\frac{\sigma_X^2(z-\mu_Y) + \sigma_Y^2\mu_X}{\sigma_Z^2} + \frac{\sigma_X^2(z^2 + \mu_Y^2 - 2z\mu_Y) + \sigma_Y^2\mu_X^2}{\sigma_Z^2}}{2\left(\frac{\sigma_X\sigma_Y}{\sigma_Z}\right)^2}\right] dx \\[6pt]
&= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\sigma_Z} \frac{1}{\sqrt{2\pi}\frac{\sigma_X\sigma_Y}{\sigma_Z}} \exp\left[-\frac{\left(x - \frac{\sigma_X^2(z-\mu_Y) + \sigma_Y^2\mu_X}{\sigma_Z^2}\right)^2 - \left(\frac{\sigma_X^2(z-\mu_Y) + \sigma_Y^2\mu_X}{\sigma_Z^2}\right)^2 + \frac{\sigma_X^2(z-\mu_Y)^2 + \sigma_Y^2\mu_X^2}{\sigma_Z^2}}{2\left(\frac{\sigma_X\sigma_Y}{\sigma_Z}\right)^2}\right] dx \\[6pt]
&= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\sigma_Z} \exp\left[-\frac{\sigma_Z^2\left(\sigma_X^2(z-\mu_Y)^2 + \sigma_Y^2\mu_X^2\right) - \left(\sigma_X^2(z-\mu_Y) + \sigma_Y^2\mu_X\right)^2}{2\sigma_Z^2\left(\sigma_X\sigma_Y\right)^2}\right] \frac{1}{\sqrt{2\pi}\frac{\sigma_X\sigma_Y}{\sigma_Z}} \exp\left[-\frac{\left(x - \frac{\sigma_X^2(z-\mu_Y) + \sigma_Y^2\mu_X}{\sigma_Z^2}\right)^2}{2\left(\frac{\sigma_X\sigma_Y}{\sigma_Z}\right)^2}\right] dx \\[6pt]
&= \frac{1}{\sqrt{2\pi}\sigma_Z} \exp\left[-\frac{(z - (\mu_X + \mu_Y))^2}{2\sigma_Z^2}\right] \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\frac{\sigma_X\sigma_Y}{\sigma_Z}} \exp\left[-\frac{\left(x - \frac{\sigma_X^2(z-\mu_Y) + \sigma_Y^2\mu_X}{\sigma_Z^2}\right)^2}{2\left(\frac{\sigma_X\sigma_Y}{\sigma_Z}\right)^2}\right] dx
\end{aligned}$$

The integrand is a normal density in x, so the integral evaluates to 1. The desired result follows:

$$f_Z(z) = \frac{1}{\sqrt{2\pi}\sigma_Z} \exp\left[-\frac{(z - (\mu_X + \mu_Y))^2}{2\sigma_Z^2}\right]$$
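
The convolution integral can also be evaluated numerically and compared with the closed form just derived. The following is a minimal sketch using the trapezoidal rule; the helper names `normal_pdf` and `convolved_pdf`, the integration window of ten standard deviations around the mean of X, and the parameter values are assumptions chosen for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def convolved_pdf(z, mu_x, sigma_x, mu_y, sigma_y, steps=4000):
    """Trapezoidal approximation of the integral of f_Y(z - x) f_X(x) over x."""
    lo, hi = mu_x - 10 * sigma_x, mu_x + 10 * sigma_x  # f_X is negligible outside
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # half weight at the endpoints
        total += weight * normal_pdf(z - x, mu_y, sigma_y) * normal_pdf(x, mu_x, sigma_x)
    return total * h

mu_x, sigma_x, mu_y, sigma_y = 1.0, 2.0, -3.0, 1.5  # hypothetical parameters
sigma_z = math.hypot(sigma_x, sigma_y)               # sqrt(sigma_x^2 + sigma_y^2)

for z in (-4.0, -2.0, 0.5):
    numeric = convolved_pdf(z, mu_x, sigma_x, mu_y, sigma_y)
    exact = normal_pdf(z, mu_x + mu_y, sigma_z)
    print(z, numeric, exact)
```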

Using the convolution theorem

It can be shown that the Fourier transform of a Gaussian, $f_X(x) = \mathcal{N}(x; \mu_X, \sigma_X^2)$, is[3]

$$\mathcal{F}\{f_X\} = F_X(\omega) = \exp\left[-j\omega\mu_X\right] \exp\left[-\tfrac{\sigma_X^2\omega^2}{2}\right]$$

By the convolution theorem:

$$\begin{aligned}
f_Z(z) &= (f_X * f_Y)(z) \\[5pt]
&= \mathcal{F}^{-1}\big\{\mathcal{F}\{f_X\} \cdot \mathcal{F}\{f_Y\}\big\} \\[5pt]
&= \mathcal{F}^{-1}\big\{\exp\left[-j\omega\mu_X\right] \exp\left[-\tfrac{\sigma_X^2\omega^2}{2}\right] \exp\left[-j\omega\mu_Y\right] \exp\left[-\tfrac{\sigma_Y^2\omega^2}{2}\right]\big\} \\[5pt]
&= \mathcal{F}^{-1}\big\{\exp\left[-j\omega(\mu_X + \mu_Y)\right] \exp\left[-\tfrac{(\sigma_X^2 + \sigma_Y^2)\omega^2}{2}\right]\big\} \\[5pt]
&= \mathcal{N}(z; \mu_X + \mu_Y, \sigma_X^2 + \sigma_Y^2)
\end{aligned}$$
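
The Fourier transform quoted above can be sanity-checked by direct numerical integration of f_X(x) e^{-jωx}. This is only a sketch: the function names `fourier_transform` and `closed_form`, the truncation of the integral to ten standard deviations, and the test frequencies are illustrative assumptions.

```python
import cmath
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def fourier_transform(omega, mu, sigma, steps=4000):
    """Trapezoidal approximation of the integral of f(x) exp(-j omega x) over x."""
    lo, hi = mu - 10 * sigma, mu + 10 * sigma
    h = (hi - lo) / steps
    total = 0j
    for k in range(steps + 1):
        x = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * normal_pdf(x, mu, sigma) * cmath.exp(-1j * omega * x)
    return total * h

def closed_form(omega, mu, sigma):
    """exp(-j omega mu) * exp(-sigma^2 omega^2 / 2), as stated in the text."""
    return cmath.exp(-1j * omega * mu) * math.exp(-0.5 * (sigma * omega) ** 2)

for omega in (-1.3, 0.0, 0.7):
    print(omega, abs(fourier_transform(omega, 1.0, 2.0) - closed_form(omega, 1.0, 2.0)))
```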

Geometric proof


First consider the normalized case when X, Y ~ N(0, 1), so that their PDFs are

$$f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$$

and

$$g(y) = \frac{1}{\sqrt{2\pi}} e^{-y^2/2}.$$

Let Z = X + Y. Then the CDF for Z will be

$$z \mapsto \int_{x+y \leq z} f(x)g(y)\,dx\,dy.$$

This integral is over the half-plane which lies under the line x + y = z.

The key observation is that the function

$$f(x)g(y) = \frac{1}{2\pi} e^{-(x^2 + y^2)/2}$$

is radially symmetric. So we rotate the coordinate plane about the origin, choosing new coordinates $x', y'$ such that the line x + y = z is described by the equation $x' = c$, where $c = c(z)$ is determined geometrically. Because of the radial symmetry, we have $f(x)g(y) = f(x')g(y')$, and the CDF for Z is

$$\int_{x' \leq c,\ y' \in \mathbb{R}} f(x')g(y')\,dx'\,dy'.$$

This is easy to integrate; we find that the CDF for Z is

$$\int_{-\infty}^{c(z)} f(x')\,dx' = \Phi(c(z)).$$

To determine the value $c(z)$, note that we rotated the plane so that the line x + y = z now runs vertically with x′-intercept equal to c. So c is just the signed distance from the origin to the line x + y = z, measured along the perpendicular through the origin, which meets the line at its nearest point to the origin, in this case $(z/2, z/2)$. That point lies at distance $\sqrt{(z/2)^2 + (z/2)^2} = |z|/\sqrt{2}$ from the origin, so with the sign of z restored $c = z/\sqrt{2}$, and the CDF for Z is $\Phi(z/\sqrt{2})$, i.e., $Z = X + Y \sim N(0, 2)$.
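
The conclusion of the rotation argument, that the CDF of Z is Φ(z/√2), can be compared against a simulated frequency. The sketch below uses `statistics.NormalDist` from the Python standard library for Φ; the function name `cdf_of_sum`, the threshold `z0`, and the sample size are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist

def cdf_of_sum(z):
    """P(X + Y <= z) for independent standard normal X and Y, via Phi(z / sqrt(2))."""
    return NormalDist().cdf(z / math.sqrt(2))

rng = random.Random(2)
n = 200_000
z0 = 0.8  # hypothetical threshold

# Fraction of simulated pairs that fall in the half-plane x + y <= z0
hits = sum(rng.gauss(0, 1) + rng.gauss(0, 1) <= z0 for _ in range(n))
empirical = hits / n

print(empirical, cdf_of_sum(z0))
```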

Now, if a, b are any real constants (not both zero), then the probability that $aX + bY \leq z$ is found by the same integral as above, but with the bounding line $ax + by = z$. The same rotation method works, and in this more general case we find that the closest point on the line to the origin is located a (signed) distance

$$\frac{z}{\sqrt{a^2 + b^2}}$$

away, so that

$$aX + bY \sim N(0, a^2 + b^2).$$

The same argument in higher dimensions shows that if the independent variables

$$X_i \sim N(0, \sigma_i^2), \qquad i = 1, \dots, n,$$

then

$$X_1 + \cdots + X_n \sim N(0, \sigma_1^2 + \cdots + \sigma_n^2).$$

Now we are essentially done, because

$$X \sim N(\mu, \sigma^2) \Leftrightarrow \frac{1}{\sigma}(X - \mu) \sim N(0, 1).$$

So in general, if the independent variables

$$X_i \sim N(\mu_i, \sigma_i^2), \qquad i = 1, \dots, n,$$

then

$$\sum_{i=1}^{n} a_i X_i \sim N\left(\sum_{i=1}^{n} a_i\mu_i, \sum_{i=1}^{n} (a_i\sigma_i)^2\right).$$
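
The general formula translates directly into a few lines of code. The sketch below computes the distribution of a linear combination of independent normal variables and cross-checks it against the arithmetic that Python's `statistics.NormalDist` provides for independent normals. The function name `linear_combination` and the example parameters are illustrative, not from the article.

```python
from statistics import NormalDist

def linear_combination(coeffs, dists):
    """Distribution of sum(a_i * X_i) for independent normal X_i given as NormalDist objects."""
    mean = sum(a * d.mean for a, d in zip(coeffs, dists))
    variance = sum((a * d.stdev) ** 2 for a, d in zip(coeffs, dists))
    return NormalDist(mean, variance ** 0.5)

# Hypothetical inputs: X1 ~ N(1, 2^2), X2 ~ N(-3, 1.5^2), combination 2*X1 - X2
x1 = NormalDist(1.0, 2.0)
x2 = NormalDist(-3.0, 1.5)
combo = linear_combination([2.0, -1.0], [x1, x2])

# NormalDist supports scaling by a constant and addition of independent normals
builtin = x1 * 2.0 + x2 * -1.0

print(combo.mean, combo.variance)      # mean 2*1 + (-1)*(-3), variance (2*2)^2 + 1.5^2
print(builtin.mean, builtin.variance)
```

Both routes assume independence; neither applies to correlated variables.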

Correlated random variables

See also: Markov chain central limit theorem

If the variables X and Y are jointly normally distributed, then X + Y is still normally distributed (see Multivariate normal distribution) and the mean is the sum of the means. However, the variances are not additive due to the correlation. Indeed,

$$\sigma_{X+Y} = \sqrt{\sigma_X^2 + \sigma_Y^2 + 2\rho\sigma_X\sigma_Y},$$

where ρ is the correlation. In particular, whenever ρ < 0, the variance is less than the sum of the variances of X and Y.
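
The effect of correlation on the standard deviation of the sum can be illustrated by simulation. The sketch below builds a zero-mean jointly normal pair with correlation ρ from two independent standard normals (a standard construction, assumed here rather than taken from the article) and compares the sample standard deviation of X + Y with the formula. The names `correlated_pair` and `sum_std_formula` and the parameter values are hypothetical.

```python
import math
import random

def correlated_pair(rng, sigma_x, sigma_y, rho):
    """One zero-mean jointly normal pair (X, Y) with standard deviations sigma_x, sigma_y and correlation rho."""
    u, v = rng.gauss(0, 1), rng.gauss(0, 1)
    return sigma_x * u, sigma_y * (rho * u + math.sqrt(1 - rho ** 2) * v)

def sum_std_formula(sigma_x, sigma_y, rho):
    """Standard deviation of X + Y according to the formula in the text."""
    return math.sqrt(sigma_x ** 2 + sigma_y ** 2 + 2 * rho * sigma_x * sigma_y)

rng = random.Random(3)
sigma_x, sigma_y, rho = 2.0, 1.5, -0.6  # hypothetical parameters with negative correlation
sums = [sum(correlated_pair(rng, sigma_x, sigma_y, rho)) for _ in range(200_000)]

# The mean is zero by construction, so the root mean square estimates the standard deviation
sample_std = math.sqrt(sum(s * s for s in sums) / len(sums))

print(sample_std, sum_std_formula(sigma_x, sigma_y, rho))
```

With ρ < 0 the result is smaller than the independent-case value √(σ_X² + σ_Y²), in line with the remark above.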

Extensions of this result can be made for more than two random variables, using the covariance matrix.

Note that the condition that X and Y are jointly normally distributed is necessary for the conclusion that their sum is normally distributed. It is possible to have variables X and Y which are individually normally distributed, but have a more complicated joint distribution. In that instance, X + Y may have a complicated, non-normal distribution. In some cases, this situation can be treated using copulas.
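
A standard textbook construction, used here as an assumed example rather than one given in the article, makes this concrete: take X ~ N(0, 1) and set Y = X or Y = −X with equal probability, independently of X. By symmetry Y is also N(0, 1), yet X + Y equals exactly zero about half the time, which no normal distribution with positive variance allows.

```python
import random

rng = random.Random(4)
n = 100_000
zeros = 0

for _ in range(n):
    x = rng.gauss(0, 1)
    # Y has the same N(0, 1) marginal as X, but (X, Y) is not jointly normal
    y = x if rng.random() < 0.5 else -x
    if x + y == 0.0:
        zeros += 1

fraction_zero = zeros / n
print(fraction_zero)  # close to one half, so X + Y has a point mass at 0
```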

Proof


In this case (with X and Y having zero means), one needs to consider

$$\frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}} \iint_{x\,y} \exp\left[-\frac{1}{2(1-\rho^2)}\left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2} - \frac{2\rho xy}{\sigma_x\sigma_y}\right)\right] \delta(z - (x+y))\,\mathrm{d}x\,\mathrm{d}y.$$

As above, one makes the substitution $y \rightarrow z - x$.

This integral is more complicated to simplify analytically, but can be done easily using a symbolic mathematics program. The probability density $f_Z(z)$ is given in this case by

$$f_Z(z) = \frac{1}{\sqrt{2\pi}\sigma_+} \exp\left(-\frac{z^2}{2\sigma_+^2}\right)$$

where

$$\sigma_+ = \sqrt{\sigma_x^2 + \sigma_y^2 + 2\rho\sigma_x\sigma_y}.$$
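
For readers who prefer to see the integral done by hand, one possible route is to complete the square in x after the substitution. The sketch below assumes |ρ| < 1 and σ₊ > 0; the symbols Q, A and x₀ are shorthand introduced here and are not notation from the article.

```latex
% After y = z - x the delta function removes the y-integral.
% Q(x) is the quadratic form in the exponent; A and x_0 are shorthand (assumptions).
\begin{aligned}
Q(x) &= \frac{x^2}{\sigma_x^2} + \frac{(z-x)^2}{\sigma_y^2} - \frac{2\rho x(z-x)}{\sigma_x\sigma_y}
      = A(x - x_0)^2 + \frac{(1-\rho^2)\,z^2}{\sigma_+^2}, \\[6pt]
A &= \frac{\sigma_x^2 + \sigma_y^2 + 2\rho\sigma_x\sigma_y}{\sigma_x^2\sigma_y^2}
   = \frac{\sigma_+^2}{\sigma_x^2\sigma_y^2}, \qquad
x_0 = \frac{\sigma_x(\sigma_x + \rho\sigma_y)}{\sigma_+^2}\,z, \\[6pt]
f_Z(z) &= \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}}
          \exp\left(-\frac{z^2}{2\sigma_+^2}\right)
          \int_{-\infty}^{\infty} \exp\left[-\frac{A(x - x_0)^2}{2(1-\rho^2)}\right] dx \\[6pt]
&= \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}}
   \exp\left(-\frac{z^2}{2\sigma_+^2}\right) \sqrt{\frac{2\pi(1-\rho^2)}{A}}
 = \frac{1}{\sqrt{2\pi}\,\sigma_+} \exp\left(-\frac{z^2}{2\sigma_+^2}\right).
\end{aligned}
```

The last step uses the Gaussian integral and the identity √(1/A) = σ_xσ_y/σ₊, which recovers the density stated above.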

If one considers instead Z = X − Y, then one obtains

$$f_Z(z) = \frac{1}{\sqrt{2\pi(\sigma_x^2 + \sigma_y^2 - 2\rho\sigma_x\sigma_y)}} \exp\left(-\frac{z^2}{2(\sigma_x^2 + \sigma_y^2 - 2\rho\sigma_x\sigma_y)}\right)$$

which also can be rewritten with

$$\sigma_{X-Y} = \sqrt{\sigma_x^2 + \sigma_y^2 - 2\rho\sigma_x\sigma_y}.$$

The standard deviation of each distribution can be read off by comparing these expressions with the general form of the normal density.

References

  1. Lemons, Don S. (2002), An Introduction to Stochastic Processes in Physics, The Johns Hopkins University Press, p. 34, ISBN 0-8018-6866-1
  2. Lemons (2002), pp. 35–36
  3. Derpanis, Konstantinos G. (October 20, 2005). "Fourier Transform of the Gaussian" (PDF).
