
Truncated normal distribution

From Wikipedia, the free encyclopedia
Type of probability distribution
Not to be confused with the rectified normal distribution, where negative elements are reset to zero, nor a censored normal distribution, where some elements are known to be outside of a specific range.
Figure: Probability density function for the truncated normal distribution for different sets of parameters. In all cases, a = −10 and b = 10. Black: μ = −8, σ = 2; blue: μ = 0, σ = 2; red: μ = 9, σ = 10; orange: μ = 0, σ = 10.
Figure: Cumulative distribution function for the truncated normal distribution for the same sets of parameters. In all cases, a = −10 and b = 10. Black: μ = −8, σ = 2; blue: μ = 0, σ = 2; red: μ = 9, σ = 10; orange: μ = 0, σ = 10.
Notation: $\xi = \frac{x-\mu}{\sigma},\ \alpha = \frac{a-\mu}{\sigma},\ \beta = \frac{b-\mu}{\sigma},\ Z = \Phi(\beta) - \Phi(\alpha)$
Parameters: $\mu \in \mathbb{R}$; $\sigma^2 \geq 0$ (but see definition); $a \in \mathbb{R}$, the minimum value of $x$; $b \in \mathbb{R}$, the maximum value of $x$ ($b > a$)
Support: $x \in [a, b]$
PDF: $f(x;\mu,\sigma,a,b) = \frac{\varphi(\xi)}{\sigma Z}$ [1]
CDF: $F(x;\mu,\sigma,a,b) = \frac{\Phi(\xi) - \Phi(\alpha)}{Z}$
Mean: $\mu + \frac{\varphi(\alpha) - \varphi(\beta)}{Z}\sigma$
Median: $\mu + \Phi^{-1}\left(\frac{\Phi(\alpha) + \Phi(\beta)}{2}\right)\sigma$
Mode: $a$ if $\mu < a$; $\mu$ if $a \leq \mu \leq b$; $b$ if $\mu > b$
Variance: $\sigma^2\left[1 - \frac{\beta\varphi(\beta) - \alpha\varphi(\alpha)}{Z} - \left(\frac{\varphi(\alpha) - \varphi(\beta)}{Z}\right)^2\right]$
Entropy: $\ln\left(\sqrt{2\pi e}\,\sigma Z\right) + \frac{\alpha\varphi(\alpha) - \beta\varphi(\beta)}{2Z}$
MGF: $e^{\mu t + \sigma^2 t^2/2}\left[\frac{\Phi(\beta - \sigma t) - \Phi(\alpha - \sigma t)}{\Phi(\beta) - \Phi(\alpha)}\right]$

In probability and statistics, the truncated normal distribution is the probability distribution derived from that of a normally distributed random variable by bounding the random variable from either below or above (or both). The truncated normal distribution has wide applications in statistics and econometrics.

Definitions


Suppose $X$ has a normal distribution with mean $\mu$ and variance $\sigma^2$, and consider an interval $(a, b)$ with $-\infty \leq a < b \leq \infty$. Then $X$ conditional on $a < X < b$ has a truncated normal distribution.

Its probability density function, $f$, for $a \leq x \leq b$, is given by

$$f(x;\mu,\sigma,a,b) = \frac{1}{\sigma}\,\frac{\varphi\left(\frac{x-\mu}{\sigma}\right)}{\Phi\left(\frac{b-\mu}{\sigma}\right) - \Phi\left(\frac{a-\mu}{\sigma}\right)}$$

and by $f = 0$ otherwise.

Here, $\varphi(\xi) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{1}{2}\xi^2\right)$ is the probability density function of the standard normal distribution and $\Phi(\cdot)$ is its cumulative distribution function, $\Phi(x) = \frac{1}{2}\left(1 + \operatorname{erf}(x/\sqrt{2})\right)$. By definition, if $b = \infty$, then $\Phi\left(\tfrac{b-\mu}{\sigma}\right) = 1$, and similarly, if $a = -\infty$, then $\Phi\left(\tfrac{a-\mu}{\sigma}\right) = 0$.
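The density and distribution function defined above can be evaluated directly with the error function. The following is a minimal sketch using only the Python standard library; the function names (`phi`, `Phi`, `truncnorm_pdf`, `truncnorm_cdf`) are illustrative choices, not part of any particular package, and the code assumes a real, positive $\sigma$.

```python
import math


def phi(z):
    """Standard normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def Phi(z):
    """Standard normal CDF via erf; returns 0 or 1 for z = -inf or +inf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def truncnorm_pdf(x, mu, sigma, a, b):
    """Density of N(mu, sigma^2) conditioned on a <= X <= b (sigma > 0)."""
    if x < a or x > b:
        return 0.0
    alpha = (a - mu) / sigma
    beta = (b - mu) / sigma
    Z = Phi(beta) - Phi(alpha)  # normalizing constant
    return phi((x - mu) / sigma) / (sigma * Z)


def truncnorm_cdf(x, mu, sigma, a, b):
    """CDF of N(mu, sigma^2) conditioned on a <= X <= b (sigma > 0)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    alpha = (a - mu) / sigma
    beta = (b - mu) / sigma
    Z = Phi(beta) - Phi(alpha)
    return (Phi((x - mu) / sigma) - Phi(alpha)) / Z
```

Infinite bounds can be passed as `math.inf` or `-math.inf`. When the interval lies far in a tail, the difference `Phi(beta) - Phi(alpha)` may underflow to zero in floating point, so this direct form is only suitable when the interval carries non-negligible probability.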

The above formulae show that when $-\infty < a < b < +\infty$ the scale parameter $\sigma^2$ of the truncated normal distribution is allowed to assume negative values. The parameter $\sigma$ is in this case imaginary, but the function $f$ is nevertheless real, positive, and normalizable. The scale parameter $\sigma^2$ of the untruncated normal distribution must be positive because the distribution would not be normalizable otherwise. The doubly truncated normal distribution, on the other hand, can in principle have a negative scale parameter (which is different from the variance, see summary formulae), because no such integrability problems arise on a bounded domain. In this case the distribution cannot be interpreted as an untruncated normal conditional on $a < X < b$, of course, but can still be interpreted as a maximum-entropy distribution with first and second moments as constraints, and has an additional peculiar feature: it presents two local maxima instead of one, located at $x = a$ and $x = b$.

Properties


The truncated normal is one of two possible maximum entropy probability distributions for a fixed mean and variance constrained to the interval [a,b], the other being the truncated U.[2] Truncated normals with fixed support form an exponential family. Nielsen[3] reported closed-form formulas for calculating the Kullback-Leibler divergence and the Bhattacharyya distance between two truncated normal distributions with the support of the first distribution nested into the support of the second distribution.

Moments


If the random variable has been truncated only from below, some probability mass has been shifted to higher values, giving a first-order stochastically dominating distribution and hence increasing the mean to a value higher than the mean $\mu$ of the original normal distribution. Likewise, if the random variable has been truncated only from above, the truncated distribution has a mean less than $\mu$.

Regardless of whether the random variable is bounded above, below, or both, the truncation is a mean-preserving contraction combined with a mean-changing rigid shift, and hence the variance of the truncated distribution is less than the variance $\sigma^2$ of the original normal distribution.

Two sided truncation


Source:[4]

Let $\alpha = (a-\mu)/\sigma$ and $\beta = (b-\mu)/\sigma$. Then:

$$\operatorname{E}(X \mid a < X < b) = \mu - \sigma\,\frac{\varphi(\beta) - \varphi(\alpha)}{\Phi(\beta) - \Phi(\alpha)}$$

and

$$\operatorname{Var}(X \mid a < X < b) = \sigma^2\left[1 - \frac{\beta\varphi(\beta) - \alpha\varphi(\alpha)}{\Phi(\beta) - \Phi(\alpha)} - \left(\frac{\varphi(\beta) - \varphi(\alpha)}{\Phi(\beta) - \Phi(\alpha)}\right)^2\right]$$

Care must be taken in the numerical evaluation of these formulas, which can result in catastrophic cancellation when the interval $[a,b]$ does not include $\mu$. There are better ways to rewrite them that avoid this issue.[5]
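As an illustration, the two-sided mean and variance formulas can be transcribed almost literally. This is a naive sketch with a hypothetical function name (`truncnorm_mean_var`); it deliberately makes no attempt at the more stable rewriting mentioned above, and it assumes a real, positive $\sigma$.

```python
import math


def _phi(z):
    """Standard normal PDF; evaluates to 0.0 at +/-inf."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def _z_phi(z):
    """z * phi(z), with the limit 0 taken at +/-inf to avoid inf * 0 = nan."""
    return 0.0 if math.isinf(z) else z * _phi(z)


def truncnorm_mean_var(mu, sigma, a, b):
    """Mean and variance of N(mu, sigma^2) conditioned on a < X < b.

    Direct transcription of the closed-form expressions; accuracy degrades
    (and Z may underflow to 0) when [a, b] lies far from mu.
    """
    alpha = (a - mu) / sigma
    beta = (b - mu) / sigma
    Z = _Phi(beta) - _Phi(alpha)
    ratio = (_phi(beta) - _phi(alpha)) / Z
    mean = mu - sigma * ratio
    var = sigma ** 2 * (1.0 - (_z_phi(beta) - _z_phi(alpha)) / Z - ratio ** 2)
    return mean, var
```

With `a = -math.inf` and `b = math.inf` the function returns the untruncated values `(mu, sigma ** 2)`, which is a convenient sanity check. For an interval such as $[10, 11]$ with $\mu = 0$, $\sigma = 1$, the denominator evaluates to zero in double precision, which is the kind of failure the cancellation warning refers to.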

One sided truncation (of lower tail)


Sources:[6][7]

In this case $b = \infty$, $\varphi(\beta) = 0$, $\Phi(\beta) = 1$, so that

$$\operatorname{E}(X \mid X > a) = \mu + \sigma\varphi(\alpha)/Z,$$

and

$$\operatorname{Var}(X \mid X > a) = \sigma^2\left[1 + \alpha\varphi(\alpha)/Z - \left(\varphi(\alpha)/Z\right)^2\right],$$

where $Z = 1 - \Phi(\alpha)$.
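A short sketch of the lower-tail case follows. It assumes a finite truncation point $a$ and uses the identity $1 - \Phi(\alpha) = \tfrac{1}{2}\operatorname{erfc}(\alpha/\sqrt{2})$ so that $Z$ keeps its relative accuracy for large $\alpha$; the function name `lower_truncated_mean_var` is an illustrative choice.

```python
import math


def lower_truncated_mean_var(mu, sigma, a):
    """Mean and variance of N(mu, sigma^2) conditioned on X > a (finite a)."""
    alpha = (a - mu) / sigma
    # Z = 1 - Phi(alpha), computed with erfc to avoid 1 - (something near 1).
    Z = 0.5 * math.erfc(alpha / math.sqrt(2.0))
    # lam = phi(alpha) / Z is the ratio that appears in both formulas.
    lam = math.exp(-0.5 * alpha * alpha) / math.sqrt(2.0 * math.pi) / Z
    mean = mu + sigma * lam
    var = sigma ** 2 * (1.0 + alpha * lam - lam ** 2)
    return mean, var
```

For $a = \mu$ this reduces to the half-normal case, with mean $\mu + \sigma\sqrt{2/\pi}$ and variance $\sigma^2(1 - 2/\pi)$.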

One sided truncation (of upper tail)


In this case $a = \alpha = -\infty$, $\varphi(\alpha) = 0$, $\Phi(\alpha) = 0$, so that

$$\operatorname{E}(X \mid X < b) = \mu - \sigma\,\frac{\varphi(\beta)}{\Phi(\beta)},$$

$$\operatorname{Var}(X \mid X < b) = \sigma^2\left[1 - \beta\,\frac{\varphi(\beta)}{\Phi(\beta)} - \left(\frac{\varphi(\beta)}{\Phi(\beta)}\right)^2\right].$$

Barr & Sherrill (1999) give a simpler expression for the variance of one sided truncations. Their formula is in terms of the chi-square CDF, which is implemented in standard software libraries. Bebu & Mathew (2009) provide formulas for (generalized) confidence intervals around the truncated moments.

A recursive formula

As for the non-truncated case, there is a recursive formula for the truncated moments.[8]

In particular, for $n \geq 0$, we have

$$\operatorname{E}\left[\left(\frac{x-\mu}{\sigma}\right)^{n+2}\right] = \frac{\alpha^{n+1}\varphi(\alpha) - \beta^{n+1}\varphi(\beta)}{\Phi(\beta) - \Phi(\alpha)} + (n+1)\operatorname{E}\left[\left(\frac{x-\mu}{\sigma}\right)^{n}\right].$$

Proof

By the change of variables $\xi = (x-\mu)/\sigma$, one obtains

$$\operatorname{E}\left[\left(\frac{x-\mu}{\sigma}\right)^{n+2}\right] = \int_{\alpha}^{\beta}\frac{\xi^{n+2}\varphi(\xi)}{\Phi(\beta) - \Phi(\alpha)}\,d\xi.$$

Using $\varphi'(\xi) = -\xi\varphi(\xi)$, integration by parts yields

$$\operatorname{E}\left[\left(\frac{x-\mu}{\sigma}\right)^{n+2}\right] = \left[\frac{-\xi^{n+1}\varphi(\xi)}{\Phi(\beta) - \Phi(\alpha)}\right]_{\alpha}^{\beta} + (n+1)\int_{\alpha}^{\beta}\frac{\xi^{n}\varphi(\xi)}{\Phi(\beta) - \Phi(\alpha)}\,d\xi,$$

which gives the equation to be proven.
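The recursion lends itself to a short loop. The sketch below computes the standardized moments $m_k = \operatorname{E}[\xi^k \mid \alpha < \xi < \beta]$ for finite $\alpha$ and $\beta$, starting from $m_0 = 1$ and $m_1 = (\varphi(\alpha) - \varphi(\beta))/Z$ (the standardized form of the mean given earlier). The function name `standardized_moments` is hypothetical.

```python
import math


def phi(z):
    """Standard normal PDF."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def standardized_moments(alpha, beta, order):
    """Return [m_0, ..., m_order] with m_k = E[xi^k | alpha < xi < beta].

    Assumes finite alpha < beta. Uses
    m_{n+2} = (alpha^(n+1) phi(alpha) - beta^(n+1) phi(beta)) / Z + (n+1) m_n.
    """
    Z = Phi(beta) - Phi(alpha)
    m = [1.0, (phi(alpha) - phi(beta)) / Z]
    for n in range(order - 1):
        boundary = (alpha ** (n + 1) * phi(alpha) - beta ** (n + 1) * phi(beta)) / Z
        m.append(boundary + (n + 1) * m[n])
    return m[: order + 1]
```

Raw moments of $X = \mu + \sigma\xi$ then follow from the binomial expansion of $(\mu + \sigma\xi)^k$. For a symmetric interval such as $(-1, 1)$ all odd entries vanish, as expected.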

Multivariate

Computing the moments of a multivariate truncated normal is harder.

Generating values from the truncated normal distribution

Further information: Pseudo-random number sampling

A random variate $x$ defined as $x = \Phi^{-1}\left(\Phi(\alpha) + U\cdot(\Phi(\beta) - \Phi(\alpha))\right)\sigma + \mu$, with $\Phi$ the cumulative distribution function of the standard normal distribution, $\Phi^{-1}$ its inverse, and $U$ a uniform random number on $(0,1)$, follows the normal distribution with mean $\mu$ and variance $\sigma^2$ truncated to the range $(a,b)$. This is simply the inverse transform method for simulating random variables. Although one of the simplest, this method can either fail when sampling in the tail of the normal distribution,[9] or be much too slow.[10] Thus, in practice, one has to find alternative methods of simulation.
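The inverse transform method can be sketched with the standard library's `statistics.NormalDist`, which provides `cdf` and `inv_cdf` for the standard normal. The function name `sample_truncnorm` and its `rng` parameter are illustrative; the sketch inherits the tail limitation described above, because `cdf` saturates at 0 or 1 far from the mean.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def sample_truncnorm(mu, sigma, a, b, rng=random):
    """Draw one value from N(mu, sigma^2) truncated to (a, b) by inversion.

    rng is anything with a random() method returning a float in [0, 1).
    """
    alpha = (a - mu) / sigma
    beta = (b - mu) / sigma
    lo = _STD_NORMAL.cdf(alpha)
    hi = _STD_NORMAL.cdf(beta)
    u = rng.random()
    # Map u into (Phi(alpha), Phi(beta)) and invert the standard normal CDF.
    # inv_cdf raises StatisticsError if its argument is exactly 0 or 1.
    return mu + sigma * _STD_NORMAL.inv_cdf(lo + u * (hi - lo))
```

A seeded generator such as `random.Random(0)` can be passed as `rng` for reproducible draws.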

One such truncated normal generator (implemented in Matlab and in R (programming language) as trandn.R) is based on an acceptance-rejection idea due to Marsaglia.[11] Despite the slightly suboptimal acceptance rate of Marsaglia (1964) in comparison with Robert (1995), Marsaglia's method is typically faster,[10] because it does not require the costly numerical evaluation of the exponential function.
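For orientation, the basic tail-rejection idea usually attributed to Marsaglia (1964) can be sketched as follows for the one-sided case $X > a$ with $a > 0$ and a standard normal: propose from a Rayleigh-type tail density and accept with probability $a/x$. This is a simplified illustration under those assumptions, not the code of the trandn implementations mentioned above, and the function name `sample_normal_tail` is hypothetical.

```python
import math
import random


def sample_normal_tail(a, rng=random):
    """Draw from the standard normal conditioned on X > a, for a > 0.

    Proposal: x = sqrt(a^2 - 2 ln U1), whose density on (a, inf) is
    proportional to x * exp(-x^2 / 2). Accepting with probability a / x
    leaves a density proportional to exp(-x^2 / 2) on (a, inf).
    """
    if a <= 0.0:
        raise ValueError("this sketch requires a > 0")
    while True:
        u1 = 1.0 - rng.random()  # in (0, 1], so log(u1) is finite
        u2 = rng.random()
        x = math.sqrt(a * a - 2.0 * math.log(u1))
        if u2 * x <= a:
            return x
```

A draw from $N(\mu, \sigma^2)$ conditioned on $X > c$ with $c > \mu$ is then `mu + sigma * sample_normal_tail((c - mu) / sigma)`. Each trial costs one logarithm and one square root and needs no evaluation of the exponential function.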

For more on simulating a draw from the truncated normal distribution, see Robert (1995), Lynch (2007, Section 8.1.3 (pages 200–206)), and Devroye (1986). The msm package in R has a function, rtnorm, that generates draws from a truncated normal. The truncnorm package in R also has functions to draw from a truncated normal.

Chopin (2011) proposed (arXiv) an algorithm inspired by the Ziggurat algorithm of Marsaglia and Tsang (1984, 2000), which is usually considered the fastest Gaussian sampler, and is also very close to Ahrens's algorithm (1995). Implementations can be found in C, C++, Matlab and Python.

Sampling from the multivariate truncated normal distribution is considerably more difficult.[12] Exact or perfect simulation is only feasible in the case of truncation of the normal distribution to a polytope region.[12][13] In more general cases, Damien & Walker (2001) introduce a general methodology for sampling truncated densities within a Gibbs sampling framework. Their algorithm introduces one latent variable and is more computationally efficient than the algorithm of Robert (1995).
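To give a feel for the latent-variable idea, here is a univariate sketch for a standard normal truncated to $(a, b)$. It is one plausible reading of a single-latent-variable Gibbs scheme (equivalent to a slice sampler) and is not claimed to reproduce the published algorithm; the function name `gibbs_truncated_std_normal` is hypothetical. The joint density of $(x, y)$ is taken to be uniform on $\{a < x < b,\ 0 < y < e^{-x^2/2}\}$, so both full conditionals are uniform.

```python
import math
import random


def gibbs_truncated_std_normal(a, b, n, rng=random):
    """Return n successive Gibbs draws targeting N(0, 1) truncated to (a, b).

    Draws are correlated; early values depend on the starting point.
    """
    x = min(max(0.0, a), b)  # start at the point of [a, b] closest to 0
    draws = []
    for _ in range(n):
        # y | x is uniform on (0, exp(-x^2 / 2)]
        y = (1.0 - rng.random()) * math.exp(-0.5 * x * x)
        # x | y is uniform on (a, b) intersected with (-r, r), r = sqrt(-2 ln y)
        r = math.sqrt(-2.0 * math.log(y))
        x = rng.uniform(max(a, -r), min(b, r))
        draws.append(x)
    return draws
```

Because only uniform draws are needed, no evaluation of $\Phi$ or $\Phi^{-1}$ is required, which is what makes the approach attractive as one step inside a larger Gibbs sampler. The sketch assumes $|x|$ stays small enough that `exp(-0.5 * x * x)` does not underflow.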


Notes

  1. ^ "Lecture 4: Selection" (PDF). web.ist.utl.pt. Instituto Superior Técnico. November 11, 2002. p. 1. Retrieved 14 July 2015.
  2. ^ Dowson, D.; Wragg, A. (September 1973). "Maximum-entropy distributions having prescribed first and second moments (Corresp.)". IEEE Transactions on Information Theory. 19 (5): 689–693. doi:10.1109/TIT.1973.1055060. ISSN 1557-9654.
  3. ^ Nielsen, Frank (2022). "Statistical Divergences between Densities of Truncated Exponential Families with Nested Supports: Duo Bregman and Duo Jensen Divergences". Entropy. 24 (3). MDPI: 421. Bibcode:2022Entrp..24..421N. doi:10.3390/e24030421. PMC 8947456. PMID 35327931.
  4. ^ Johnson, Norman Lloyd; Kotz, Samuel; Balakrishnan, N. (1994). Continuous Univariate Distributions. Vol. 1 (2nd ed.). New York: Wiley. Section 10.1. ISBN 0-471-58495-9. OCLC 29428092.
  5. ^ Fernandez-de-Cossio-Diaz, Jorge (2017-12-06), TruncatedNormal.jl: Compute mean and variance of the univariate truncated normal distribution (works far from the peak), retrieved 2017-12-06.
  6. ^ Greene, William H. (2003). Econometric Analysis (5th ed.). Prentice Hall. ISBN 978-0-13-066189-0.
  7. ^ del Castillo, Joan (March 1994). "The singly truncated normal distribution: A non-steep exponential family" (PDF). Annals of the Institute of Statistical Mathematics. 46 (1): 57–66. doi:10.1007/BF00773592.
  8. ^ Document by Eric Orjebin, "https://people.smp.uq.edu.au/YoniNazarathy/teaching_projects/studentWork/EricOrjebin_TruncatedNormalMoments.pdf"
  9. ^ Kroese, D. P.; Taimre, T.; Botev, Z. I. (2011). Handbook of Monte Carlo methods. John Wiley & Sons.
  10. ^ a b Botev, Z. I.; L'Ecuyer, P. (2017). "Simulation from the Normal Distribution Truncated to an Interval in the Tail". 10th EAI International Conference on Performance Evaluation Methodologies and Tools. 25th–28th Oct 2016 Taormina, Italy: ACM. pp. 23–29. doi:10.4108/eai.25-10-2016.2266879. ISBN 978-1-63190-141-6.
  11. ^ Marsaglia, George (1964). "Generating a variable from the tail of the normal distribution". Technometrics. 6 (1): 101–102. doi:10.2307/1266749. JSTOR 1266749.
  12. ^ a b Botev, Z. I. (2016). "The normal law under linear restrictions: simulation and estimation via minimax tilting". Journal of the Royal Statistical Society, Series B. 79: 125–148. arXiv:1603.04166. doi:10.1111/rssb.12162. S2CID 88515228.
  13. ^ Botev, Zdravko & L'Ecuyer, Pierre (2018). "Chapter 8: Simulation from the Tail of the Univariate and Multivariate Normal Distribution". In Puliafito, Antonio (ed.). Systems Modeling: Methodologies and Tools. EAI/Springer Innovations in Communication and Computing. Springer, Cham. pp. 115–132. doi:10.1007/978-3-319-92378-9_8. ISBN 978-3-319-92377-2. S2CID 125554530.
  14. ^ Sun, Jingchao; Kong, Maiying; Pal, Subhadip (22 June 2021). "The Modified-Half-Normal distribution: Properties and an efficient sampling scheme". Communications in Statistics - Theory and Methods. 52 (5): 1591–1613. doi:10.1080/03610926.2021.1934700. ISSN 0361-0926. S2CID 237919587.
