Folded normal distribution

From Wikipedia, the free encyclopedia
Probability density function for the folded normal distribution (μ = 1, σ = 1)
Cumulative distribution function for the folded normal distribution (μ = 1, σ = 1)

Parameters: μ ∈ R (location), σ² > 0 (scale)
Support: x ∈ [0, ∞)
PDF: {\displaystyle {\frac {1}{\sigma {\sqrt {2\pi }}}}\,e^{-{\frac {(x-\mu )^{2}}{2\sigma ^{2}}}}+{\frac {1}{\sigma {\sqrt {2\pi }}}}\,e^{-{\frac {(x+\mu )^{2}}{2\sigma ^{2}}}}}
CDF: {\displaystyle {\frac {1}{2}}\left[{\mbox{erf}}\left({\frac {x+\mu }{\sigma {\sqrt {2}}}}\right)+{\mbox{erf}}\left({\frac {x-\mu }{\sigma {\sqrt {2}}}}\right)\right]}
Mean: {\displaystyle \mu _{Y}=\sigma {\sqrt {\tfrac {2}{\pi }}}\,e^{(-\mu ^{2}/2\sigma ^{2})}+\mu \left(1-2\,\Phi (-{\tfrac {\mu }{\sigma }})\right)}
Variance: {\displaystyle \sigma _{Y}^{2}=\mu ^{2}+\sigma ^{2}-\mu _{Y}^{2}}

The folded normal distribution is a probability distribution related to the normal distribution. Given a normally distributed random variable X with mean μ and variance σ², the random variable Y = |X| has a folded normal distribution. Such a case may be encountered if only the magnitude of some variable is recorded, but not its sign. The distribution is called "folded" because probability mass to the left of x = 0 is folded over by taking the absolute value. In the physics of heat conduction, the folded normal distribution is a fundamental solution of the heat equation on the half space; it corresponds to having a perfect insulator on a hyperplane through the origin.
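The folding can be illustrated directly by simulation. The following sketch (Python, for illustration only; the sample size and parameter values are arbitrary choices) draws from N(μ, σ²) and takes absolute values:

```python
import random
import statistics

# Draw X ~ N(mu, sigma^2) and fold it: Y = |X|
random.seed(0)
mu, sigma = 1.0, 1.0
x = [random.gauss(mu, sigma) for _ in range(100_000)]
y = [abs(v) for v in x]

print(min(y) >= 0)         # True: all mass of Y lies on [0, infinity)
print(statistics.mean(y))  # roughly 1.167, the folded-normal mean for mu = sigma = 1
```

The empirical mean of |X| matches the closed-form mean given in the summary box above to within Monte Carlo error.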

Definitions


Density


The probability density function (PDF) is given by

{\displaystyle f_{Y}(x;\mu ,\sigma ^{2})={\frac {1}{\sqrt {2\pi \sigma ^{2}}}}\,e^{-{\frac {(x-\mu )^{2}}{2\sigma ^{2}}}}+{\frac {1}{\sqrt {2\pi \sigma ^{2}}}}\,e^{-{\frac {(x+\mu )^{2}}{2\sigma ^{2}}}}}

for x ≥ 0, and 0 everywhere else. An alternative formulation is given by

{\displaystyle f\left(x\right)={\sqrt {\frac {2}{\pi \sigma ^{2}}}}e^{-{\frac {\left(x^{2}+\mu ^{2}\right)}{2\sigma ^{2}}}}\cosh {\left({\frac {\mu x}{\sigma ^{2}}}\right)}},

where cosh is the hyperbolic cosine function. It follows that the cumulative distribution function (CDF) is given by:

{\displaystyle F_{Y}(x;\mu ,\sigma ^{2})={\frac {1}{2}}\left[{\mbox{erf}}\left({\frac {x+\mu }{\sqrt {2\sigma ^{2}}}}\right)+{\mbox{erf}}\left({\frac {x-\mu }{\sqrt {2\sigma ^{2}}}}\right)\right]}

for x ≥ 0, where erf() is the error function. This expression reduces to the CDF of the half-normal distribution when μ = 0.
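A quick numerical check (an illustrative Python sketch using only the standard library; the function names are our own) confirms that the two density forms agree and that the CDF reduces to the half-normal CDF erf(x/(σ√2)) when μ = 0:

```python
from math import erf, exp, pi, sqrt, cosh

def pdf(x, mu, sigma):
    """Two-branch form of the folded-normal density."""
    c = 1.0 / (sigma * sqrt(2.0 * pi))
    return (c * exp(-(x - mu) ** 2 / (2 * sigma ** 2))
            + c * exp(-(x + mu) ** 2 / (2 * sigma ** 2)))

def pdf_cosh(x, mu, sigma):
    """Equivalent cosh form of the density."""
    return (sqrt(2.0 / (pi * sigma ** 2))
            * exp(-(x ** 2 + mu ** 2) / (2 * sigma ** 2))
            * cosh(mu * x / sigma ** 2))

def cdf(x, mu, sigma):
    """CDF written with the error function."""
    s = sigma * sqrt(2.0)
    return 0.5 * (erf((x + mu) / s) + erf((x - mu) / s))

# The two density forms agree, and for mu = 0 the CDF reduces to
# the half-normal CDF erf(x / (sigma * sqrt(2))).
print(abs(pdf(1.3, 1.0, 2.0) - pdf_cosh(1.3, 1.0, 2.0)) < 1e-12)       # True
print(abs(cdf(1.3, 0.0, 2.0) - erf(1.3 / (2.0 * sqrt(2.0)))) < 1e-12)  # True
```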

The mean of the folded distribution is then

{\displaystyle \mu _{Y}=\sigma {\sqrt {\frac {2}{\pi }}}\,\,\exp \left({\frac {-\mu ^{2}}{2\sigma ^{2}}}\right)+\mu \,{\mbox{erf}}\left({\frac {\mu }{\sqrt {2\sigma ^{2}}}}\right)}

or

{\displaystyle \mu _{Y}={\sqrt {\frac {2}{\pi }}}\sigma e^{-{\frac {\mu ^{2}}{2\sigma ^{2}}}}+\mu \left[1-2\Phi \left(-{\frac {\mu }{\sigma }}\right)\right]}

where {\displaystyle \Phi } is the normal cumulative distribution function:

{\displaystyle \Phi (x)\;=\;{\frac {1}{2}}\left[1+\operatorname {erf} \left({\frac {x}{\sqrt {2}}}\right)\right].}

The variance is then easily expressed in terms of the mean:

{\displaystyle \sigma _{Y}^{2}=\mu ^{2}+\sigma ^{2}-\mu _{Y}^{2}.}

Both the mean (μ) and variance (σ²) of X in the original normal distribution can be interpreted as the location and scale parameters of Y in the folded distribution.
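These moment formulas are straightforward to evaluate numerically. The sketch below (illustrative Python, with our own function names) checks the half-normal special case μ = 0, where the mean is σ√(2/π) and the variance is σ²(1 − 2/π):

```python
from math import erf, exp, pi, sqrt

def folded_mean(mu, sigma):
    """mu_Y = sigma sqrt(2/pi) exp(-mu^2 / (2 sigma^2)) + mu erf(mu / sqrt(2 sigma^2))"""
    return (sigma * sqrt(2.0 / pi) * exp(-mu ** 2 / (2 * sigma ** 2))
            + mu * erf(mu / (sigma * sqrt(2.0))))

def folded_var(mu, sigma):
    """sigma_Y^2 = mu^2 + sigma^2 - mu_Y^2"""
    return mu ** 2 + sigma ** 2 - folded_mean(mu, sigma) ** 2

# Half-normal special case mu = 0.
s = 2.0
print(abs(folded_mean(0.0, s) - s * sqrt(2.0 / pi)) < 1e-12)        # True
print(abs(folded_var(0.0, s) - s ** 2 * (1.0 - 2.0 / pi)) < 1e-12)  # True
```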

Properties


Mode


The mode of the distribution is the value of {\displaystyle x} for which the density is maximised. To find it, we take the first derivative of the density with respect to {\displaystyle x} and set it equal to zero. Unfortunately, there is no closed-form solution. We can, however, write the derivative in a better way and end up with a non-linear equation

{\displaystyle {\frac {df(x)}{dx}}=0\Rightarrow -{\frac {\left(x-\mu \right)}{\sigma ^{2}}}e^{-{\frac {1}{2}}{\frac {\left(x-\mu \right)^{2}}{\sigma ^{2}}}}-{\frac {\left(x+\mu \right)}{\sigma ^{2}}}e^{-{\frac {1}{2}}{\frac {\left(x+\mu \right)^{2}}{\sigma ^{2}}}}=0}

{\displaystyle x\left[e^{-{\frac {1}{2}}{\frac {\left(x-\mu \right)^{2}}{\sigma ^{2}}}}+e^{-{\frac {1}{2}}{\frac {\left(x+\mu \right)^{2}}{\sigma ^{2}}}}\right]-\mu \left[e^{-{\frac {1}{2}}{\frac {\left(x-\mu \right)^{2}}{\sigma ^{2}}}}-e^{-{\frac {1}{2}}{\frac {\left(x+\mu \right)^{2}}{\sigma ^{2}}}}\right]=0}

{\displaystyle x\left(1+e^{-{\frac {2\mu x}{\sigma ^{2}}}}\right)-\mu \left(1-e^{-{\frac {2\mu x}{\sigma ^{2}}}}\right)=0}

{\displaystyle \left(\mu +x\right)e^{-{\frac {2\mu x}{\sigma ^{2}}}}=\mu -x}

{\displaystyle x=-{\frac {\sigma ^{2}}{2\mu }}\log {\frac {\mu -x}{\mu +x}}}.

Tsagris et al. (2014) found by numerical investigation that when {\displaystyle \mu <\sigma }, the maximum occurs at {\displaystyle x=0}, and that when {\displaystyle \mu } becomes greater than {\displaystyle 3\sigma }, the maximum approaches {\displaystyle \mu }. This is to be expected, since in that case the folded normal converges to the normal distribution. To avoid any trouble with negative variances during numerical maximisation, exponentiation of the scale parameter is suggested. Alternatively, a constraint can be added so that, if the optimiser proposes a negative variance, the log-likelihood is returned as NA or a very small value.
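The non-linear equation above can also be rearranged (starting from the form with the two exponential factors) into the fixed point x = μ tanh(μx/σ²), which is convenient to iterate. The sketch below (illustrative Python; the iteration count is an arbitrary choice) shows the two regimes just described:

```python
from math import tanh

def folded_mode(mu, sigma, iters=200):
    """Iterate x <- mu * tanh(mu * x / sigma^2), a fixed-point form of the
    mode equation. x = 0 is always a solution; for mu sufficiently larger
    than sigma the iteration settles on the nonzero root, the mode."""
    x = mu  # start away from the trivial root at 0
    for _ in range(iters):
        x = mu * tanh(mu * x / sigma ** 2)
    return x

# mu < sigma: the iteration collapses to 0, which is the mode.
print(folded_mode(0.5, 1.0) < 1e-6)             # True
# mu = 3 sigma: the mode is already very close to mu.
print(abs(folded_mode(3.0, 1.0) - 3.0) < 0.01)  # True
```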

Characteristic function and other related functions

  • The characteristic function is given by

{\displaystyle \varphi _{x}\left(t\right)=e^{{\frac {-\sigma ^{2}t^{2}}{2}}+i\mu t}\Phi \left({\frac {\mu }{\sigma }}+i\sigma t\right)+e^{-{\frac {\sigma ^{2}t^{2}}{2}}-i\mu t}\Phi \left(-{\frac {\mu }{\sigma }}+i\sigma t\right)}.

  • The moment generating function is given by

{\displaystyle M_{x}\left(t\right)=\varphi _{x}\left(-it\right)=e^{{\frac {\sigma ^{2}t^{2}}{2}}+\mu t}\Phi \left({\frac {\mu }{\sigma }}+\sigma t\right)+e^{{\frac {\sigma ^{2}t^{2}}{2}}-\mu t}\Phi \left(-{\frac {\mu }{\sigma }}+\sigma t\right)}.

  • The cumulant generating function is given by

{\displaystyle K_{x}\left(t\right)=\log {M_{x}\left(t\right)}=\left({\frac {\sigma ^{2}t^{2}}{2}}+\mu t\right)+\log {\left\lbrace 1-\Phi \left(-{\frac {\mu }{\sigma }}-\sigma t\right)+e^{-2\mu t}\left[1-\Phi \left({\frac {\mu }{\sigma }}-\sigma t\right)\right]\right\rbrace }}.

  • The Laplace transformation is given by

{\displaystyle E\left(e^{-tx}\right)=e^{{\frac {\sigma ^{2}t^{2}}{2}}-\mu t}\left[1-\Phi \left(-{\frac {\mu }{\sigma }}+\sigma t\right)\right]+e^{{\frac {\sigma ^{2}t^{2}}{2}}+\mu t}\left[1-\Phi \left({\frac {\mu }{\sigma }}+\sigma t\right)\right]}.

  • The Fourier transform is given by

{\displaystyle {\hat {f}}\left(t\right)=\varphi _{x}\left(-2\pi t\right)=e^{{\frac {-4\pi ^{2}\sigma ^{2}t^{2}}{2}}-i2\pi \mu t}\left[1-\Phi \left(-{\frac {\mu }{\sigma }}-i2\pi \sigma t\right)\right]+e^{-{\frac {4\pi ^{2}\sigma ^{2}t^{2}}{2}}+i2\pi \mu t}\left[1-\Phi \left({\frac {\mu }{\sigma }}-i2\pi \sigma t\right)\right]}.
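For real t, the moment generating function can be checked against direct numerical integration of e^{tx} f(x). A sketch (illustrative Python, standard library only; the integration limits and step count are ad-hoc choices):

```python
from math import erf, exp, pi, sqrt

def Phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def mgf_closed(t, mu, sigma):
    """Closed-form moment generating function of the folded normal."""
    return (exp(sigma ** 2 * t ** 2 / 2 + mu * t) * Phi(mu / sigma + sigma * t)
            + exp(sigma ** 2 * t ** 2 / 2 - mu * t) * Phi(-mu / sigma + sigma * t))

def mgf_numeric(t, mu, sigma, upper=30.0, n=200_000):
    """Trapezoid-rule approximation of E[e^{tY}] on [0, upper]."""
    c = 1.0 / (sigma * sqrt(2.0 * pi))
    def integrand(x):
        f = c * (exp(-(x - mu) ** 2 / (2 * sigma ** 2))
                 + exp(-(x + mu) ** 2 / (2 * sigma ** 2)))
        return exp(t * x) * f
    h = upper / n
    total = 0.5 * (integrand(0.0) + integrand(upper))
    total += sum(integrand(i * h) for i in range(1, n))
    return total * h

mu, sigma, t = 1.0, 1.0, 0.3
print(abs(mgf_closed(t, mu, sigma) - mgf_numeric(t, mu, sigma)) < 1e-6)  # True
```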


Statistical Inference


Estimation of parameters


There are a few ways of estimating the parameters of the folded normal. All of them are essentially maximum likelihood estimation procedures, but in some cases a numerical maximisation is performed, whereas in other cases the root of an equation is sought. The log-likelihood of the folded normal, when a sample {\displaystyle x_{i}} of size {\displaystyle n} is available, can be written in the following way

{\displaystyle l=-{\frac {n}{2}}\log {2\pi \sigma ^{2}}+\sum _{i=1}^{n}\log {\left[e^{-{\frac {\left(x_{i}-\mu \right)^{2}}{2\sigma ^{2}}}}+e^{-{\frac {\left(x_{i}+\mu \right)^{2}}{2\sigma ^{2}}}}\right]}}

{\displaystyle l=-{\frac {n}{2}}\log {2\pi \sigma ^{2}}+\sum _{i=1}^{n}\log {\left[e^{-{\frac {\left(x_{i}-\mu \right)^{2}}{2\sigma ^{2}}}}\left(1+e^{-{\frac {\left(x_{i}+\mu \right)^{2}}{2\sigma ^{2}}}}e^{\frac {\left(x_{i}-\mu \right)^{2}}{2\sigma ^{2}}}\right)\right]}}

{\displaystyle l=-{\frac {n}{2}}\log {2\pi \sigma ^{2}}-\sum _{i=1}^{n}{\frac {\left(x_{i}-\mu \right)^{2}}{2\sigma ^{2}}}+\sum _{i=1}^{n}\log {\left(1+e^{-{\frac {2\mu x_{i}}{\sigma ^{2}}}}\right)}}
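The equivalence of the first and the last form of the log-likelihood can be verified numerically; a sketch (illustrative Python; the data vector and parameter values are arbitrary):

```python
from math import exp, log, pi

def loglik_direct(x, mu, sigma2):
    """First form: sum of logs of the two-branch density."""
    out = -len(x) / 2 * log(2 * pi * sigma2)
    for xi in x:
        out += log(exp(-(xi - mu) ** 2 / (2 * sigma2))
                   + exp(-(xi + mu) ** 2 / (2 * sigma2)))
    return out

def loglik_rearranged(x, mu, sigma2):
    """Last form: Gaussian log-likelihood plus a log(1 + e^{-2 mu x / sigma^2}) term."""
    out = -len(x) / 2 * log(2 * pi * sigma2)
    out -= sum((xi - mu) ** 2 for xi in x) / (2 * sigma2)
    out += sum(log(1 + exp(-2 * mu * xi / sigma2)) for xi in x)
    return out

data = [0.3, 1.2, 0.7, 2.5, 0.05]
print(abs(loglik_direct(data, 1.0, 1.5) - loglik_rearranged(data, 1.0, 1.5)) < 1e-10)  # True
```

The rearranged form is also the one used for the derivatives below, since the awkward sum of two exponentials has been reduced to a single logistic-type term.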

In R, the package Rfast provides a fast MLE (command foldnorm.mle). Alternatively, the commands optim or nlm will fit this distribution. The maximisation is easy, since only two parameters ({\displaystyle \mu } and {\displaystyle \sigma ^{2}}) are involved. Note that both positive and negative values for {\displaystyle \mu } are acceptable, since {\displaystyle \mu } belongs to the real line of numbers; the sign is not important, because the distribution is symmetric with respect to it. The next code is written in R:

folded <- function(y) {
  ## y is a vector with positive data
  n <- length(y)    ## sample size
  sy2 <- sum(y^2)
  ## minus log-likelihood; the variance is parameterised as exp(para[2])
  ## so that the optimiser cannot propose a negative variance
  sam <- function(para, n, sy2) {
    me <- para[1]
    se <- exp(para[2])
    -n / 2 * log(2 / pi / se) + n * me^2 / 2 / se + sy2 / 2 / se -
      sum(log(cosh(me * y / se)))
  }
  ## starting values: sample mean and log of the sample variance;
  ## two rounds of optimisation, the second starting from the first solution
  mod <- optim(c(mean(y), log(var(y))), sam, n = n, sy2 = sy2,
               control = list(maxit = 2000))
  mod <- optim(mod$par, sam, n = n, sy2 = sy2, control = list(maxit = 20000))
  result <- c(-mod$value, mod$par[1], exp(mod$par[2]))
  names(result) <- c("log-likelihood", "mu", "sigma squared")
  result
}

The partial derivatives of the log-likelihood are written as

{\displaystyle {\frac {\partial l}{\partial \mu }}={\frac {\sum _{i=1}^{n}\left(x_{i}-\mu \right)}{\sigma ^{2}}}-{\frac {2}{\sigma ^{2}}}\sum _{i=1}^{n}{\frac {x_{i}e^{\frac {-2\mu x_{i}}{\sigma ^{2}}}}{1+e^{\frac {-2\mu x_{i}}{\sigma ^{2}}}}}}

{\displaystyle {\frac {\partial l}{\partial \mu }}={\frac {\sum _{i=1}^{n}\left(x_{i}-\mu \right)}{\sigma ^{2}}}-{\frac {2}{\sigma ^{2}}}\sum _{i=1}^{n}{\frac {x_{i}}{1+e^{\frac {2\mu x_{i}}{\sigma ^{2}}}}}\ \ {\text{and}}}

{\displaystyle {\frac {\partial l}{\partial \sigma ^{2}}}=-{\frac {n}{2\sigma ^{2}}}+{\frac {\sum _{i=1}^{n}\left(x_{i}-\mu \right)^{2}}{2\sigma ^{4}}}+{\frac {2\mu }{\sigma ^{4}}}\sum _{i=1}^{n}{\frac {x_{i}e^{-{\frac {2\mu x_{i}}{\sigma ^{2}}}}}{1+e^{-{\frac {2\mu x_{i}}{\sigma ^{2}}}}}}}

{\displaystyle {\frac {\partial l}{\partial \sigma ^{2}}}=-{\frac {n}{2\sigma ^{2}}}+{\frac {\sum _{i=1}^{n}\left(x_{i}-\mu \right)^{2}}{2\sigma ^{4}}}+{\frac {2\mu }{\sigma ^{4}}}\sum _{i=1}^{n}{\frac {x_{i}}{1+e^{\frac {2\mu x_{i}}{\sigma ^{2}}}}}}.

By equating the first partial derivative of the log-likelihood to zero, we obtain a useful relationship:

{\displaystyle \sum _{i=1}^{n}{\frac {x_{i}}{1+e^{\frac {2\mu x_{i}}{\sigma ^{2}}}}}={\frac {\sum _{i=1}^{n}\left(x_{i}-\mu \right)}{2}}}.

Note that the above equation has three solutions, one at zero and two more with opposite signs. By substituting it into the partial derivative of the log-likelihood with respect to {\displaystyle \sigma ^{2}} and equating that to zero, we get the following expression for the variance

{\displaystyle \sigma ^{2}={\frac {\sum _{i=1}^{n}\left(x_{i}-\mu \right)^{2}}{n}}+{\frac {2\mu \sum _{i=1}^{n}\left(x_{i}-\mu \right)}{n}}={\frac {\sum _{i=1}^{n}\left(x_{i}^{2}-\mu ^{2}\right)}{n}}={\frac {\sum _{i=1}^{n}x_{i}^{2}}{n}}-\mu ^{2}},

which is the same formula as in the normal distribution. A main difference here is that {\displaystyle \mu } and {\displaystyle \sigma ^{2}} are not statistically independent. The above relationships can be used to obtain maximum likelihood estimates in an efficient recursive way: we start with an initial value for {\displaystyle \sigma ^{2}} and find the positive root ({\displaystyle \mu }) of the last equation; then we get an updated value of {\displaystyle \sigma ^{2}}. The procedure is repeated until the change in the log-likelihood value is negligible. A simpler and more efficient alternative is to perform a search algorithm. Let us write the last equation in a more convenient way:

{\displaystyle 2\sum _{i=1}^{n}{\frac {x_{i}}{1+e^{\frac {2\mu x_{i}}{\sigma ^{2}}}}}-\sum _{i=1}^{n}{\frac {x_{i}\left(1+e^{\frac {2\mu x_{i}}{\sigma ^{2}}}\right)}{1+e^{\frac {2\mu x_{i}}{\sigma ^{2}}}}}+n\mu =0}

{\displaystyle \sum _{i=1}^{n}{\frac {x_{i}\left(1-e^{\frac {2\mu x_{i}}{\sigma ^{2}}}\right)}{1+e^{\frac {2\mu x_{i}}{\sigma ^{2}}}}}+n\mu =0}.

It becomes clear that the optimisation of the log-likelihood with respect to the two parameters has turned into a root search for a single function. This is, of course, identical to the previous root search. Tsagris et al. (2014) observed that this equation has three roots for {\displaystyle \mu }: {\displaystyle -\mu } and {\displaystyle +\mu }, which are the maximum likelihood estimates, and 0, which corresponds to the minimum log-likelihood.
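The recursive scheme described above (fix σ², find the positive root μ of the score equation by bisection, update σ², repeat) can be sketched as follows (illustrative Python; the bisection bracket, overflow guard, and iteration counts are our own choices, not from the source):

```python
import random
from math import exp

def folded_mle(x, tol=1e-8, max_iter=50):
    """Recursive MLE for the folded normal: alternate between the positive
    root of  sum x_i/(1 + e^{2 mu x_i / s2}) = sum (x_i - mu)/2  (found by
    bisection) and the closed-form update  s2 = mean(x^2) - mu^2."""
    n = len(x)
    sx = sum(x)
    sx2 = sum(v * v for v in x)

    def g(mu, s2):
        # score equation rearranged so that g(mu) = 0 at the MLE
        tot = 0.0
        for v in x:
            a = 2.0 * mu * v / s2
            if a < 700:  # guard against exp overflow; the term is ~0 beyond this
                tot += v / (1.0 + exp(a))
        return tot - (sx - n * mu) / 2.0

    mu = sx / n               # initial values: sample mean ...
    s2 = sx2 / n - mu ** 2    # ... and the matching variance update
    for _ in range(max_iter):
        # g < 0 just above 0 and g > 0 for large mu, so bisection brackets
        # the positive root
        lo, hi = 1e-12, max(x) + 1.0
        for _ in range(60):
            mid = (lo + hi) / 2.0
            if g(mid, s2) < 0:
                lo = mid
            else:
                hi = mid
        new_mu = (lo + hi) / 2.0
        new_s2 = sx2 / n - new_mu ** 2
        done = abs(new_mu - mu) < tol and abs(new_s2 - s2) < tol
        mu, s2 = new_mu, new_s2
        if done:
            break
    return mu, s2

# Illustrative check on simulated data with mu = 2, sigma^2 = 1.
random.seed(1)
sample = [abs(random.gauss(2.0, 1.0)) for _ in range(2000)]
mu_hat, s2_hat = folded_mle(sample)
print(mu_hat, s2_hat)  # both should be close to the true values
```

Since the equation is symmetric in μ, the bisection over positive values recovers the +μ root; the −μ root is its mirror image.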


References

  1. Sun, Jingchao; Kong, Maiying; Pal, Subhadip (22 June 2021). "The Modified-Half-Normal distribution: Properties and an efficient sampling scheme". Communications in Statistics - Theory and Methods. 52 (5): 1591–1613. doi:10.1080/03610926.2021.1934700. ISSN 0361-0926. S2CID 237919587.
  2. Tsagris, Michail; Beneki, Christina; Hassani, Hossein (2014). "On the folded normal distribution". Mathematics. 2 (1): 12–28.
