Family of multivariate continuous probability distributions
Normal-inverse-gamma

Parameters: $\mu$ location (real); $\lambda > 0$ (real); $\alpha > 0$ (real); $\beta > 0$ (real)

Support: $x \in (-\infty, \infty)$, $\sigma^2 \in (0, \infty)$

PDF: $\dfrac{\sqrt{\lambda}}{\sqrt{2\pi\sigma^2}}\,\dfrac{\beta^\alpha}{\Gamma(\alpha)}\left(\dfrac{1}{\sigma^2}\right)^{\alpha+1}\exp\left(-\dfrac{2\beta+\lambda(x-\mu)^2}{2\sigma^2}\right)$

Mean: $\operatorname{E}[x] = \mu$; $\operatorname{E}[\sigma^2] = \dfrac{\beta}{\alpha-1}$, for $\alpha > 1$

Mode: $x = \mu$ (univariate), $\mathbf{x} = \boldsymbol{\mu}$ (multivariate); $\sigma^2 = \dfrac{\beta}{\alpha+3/2}$ (univariate), $\sigma^2 = \dfrac{\beta}{\alpha+1+k/2}$ (multivariate)

Variance: $\operatorname{Var}[x] = \dfrac{\beta}{(\alpha-1)\lambda}$, for $\alpha > 1$; $\operatorname{Var}[\sigma^2] = \dfrac{\beta^2}{(\alpha-1)^2(\alpha-2)}$, for $\alpha > 2$; $\operatorname{Cov}[x,\sigma^2] = 0$, for $\alpha > 1$
In probability theory and statistics, the normal-inverse-gamma distribution (or Gaussian-inverse-gamma distribution) is a four-parameter family of multivariate continuous probability distributions. It is the conjugate prior of a normal distribution with unknown mean and variance.
Definition

Suppose

$$x \mid \sigma^2, \mu, \lambda \sim \mathrm{N}(\mu, \sigma^2/\lambda)$$

has a normal distribution with mean $\mu$ and variance $\sigma^2/\lambda$, where

$$\sigma^2 \mid \alpha, \beta \sim \Gamma^{-1}(\alpha, \beta)$$

has an inverse-gamma distribution. Then $(x, \sigma^2)$ has a normal-inverse-gamma distribution, denoted as

$$(x, \sigma^2) \sim \text{N-}\Gamma^{-1}(\mu, \lambda, \alpha, \beta).$$

($\text{NIG}$ is also used instead of $\text{N-}\Gamma^{-1}$.)
The normal-inverse-Wishart distribution is a generalization of the normal-inverse-gamma distribution that is defined over multivariate random variables.
Probability density function

$$f(x,\sigma^2 \mid \mu,\lambda,\alpha,\beta) = \frac{\sqrt{\lambda}}{\sigma\sqrt{2\pi}}\,\frac{\beta^\alpha}{\Gamma(\alpha)}\,\left(\frac{1}{\sigma^2}\right)^{\alpha+1}\exp\left(-\frac{2\beta+\lambda(x-\mu)^2}{2\sigma^2}\right)$$

For the multivariate form where $\mathbf{x}$ is a $k \times 1$ random vector,

$$f(\mathbf{x},\sigma^2 \mid \boldsymbol{\mu},\mathbf{V}^{-1},\alpha,\beta) = |\mathbf{V}|^{-1/2}(2\pi)^{-k/2}\,\frac{\beta^\alpha}{\Gamma(\alpha)}\,\left(\frac{1}{\sigma^2}\right)^{\alpha+1+k/2}\exp\left(-\frac{2\beta+(\mathbf{x}-\boldsymbol{\mu})^{T}\mathbf{V}^{-1}(\mathbf{x}-\boldsymbol{\mu})}{2\sigma^2}\right),$$

where $|\mathbf{V}|$ is the determinant of the $k \times k$ matrix $\mathbf{V}$. Note how this last equation reduces to the first form if $k = 1$, so that $\mathbf{x}, \mathbf{V}, \boldsymbol{\mu}$ are scalars.
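As a numerical sanity check, the univariate density can be evaluated either from the closed form above or as the product of the normal and inverse-gamma factors from the definition. A minimal Python sketch (the function names are mine, not from the article; SciPy's invgamma matches the $(\alpha, \beta)$ convention here via its a and scale arguments):

```python
import numpy as np
from scipy import stats
from scipy.special import gammaln

def nig_logpdf(x, sigma2, mu, lam, alpha, beta):
    """Closed-form log-density of the normal-inverse-gamma distribution."""
    return (0.5 * np.log(lam) - 0.5 * np.log(2 * np.pi * sigma2)
            + alpha * np.log(beta) - gammaln(alpha)
            - (alpha + 1) * np.log(sigma2)
            - (2 * beta + lam * (x - mu) ** 2) / (2 * sigma2))

def nig_logpdf_factored(x, sigma2, mu, lam, alpha, beta):
    """Same density via the defining factorization:
    N(x | mu, sigma2/lam) * InvGamma(sigma2 | alpha, beta)."""
    return (stats.norm.logpdf(x, loc=mu, scale=np.sqrt(sigma2 / lam))
            + stats.invgamma.logpdf(sigma2, a=alpha, scale=beta))

print(nig_logpdf(0.3, 1.2, 0.0, 2.0, 3.0, 1.5))
print(nig_logpdf_factored(0.3, 1.2, 0.0, 2.0, 3.0, 1.5))  # the two should agree
```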
Alternative parameterization

It is also possible to let $\gamma = 1/\lambda$, in which case the PDF becomes

$$f(x,\sigma^2 \mid \mu,\gamma,\alpha,\beta) = \frac{1}{\sigma\sqrt{2\pi\gamma}}\,\frac{\beta^\alpha}{\Gamma(\alpha)}\,\left(\frac{1}{\sigma^2}\right)^{\alpha+1}\exp\left(-\frac{2\gamma\beta+(x-\mu)^2}{2\gamma\sigma^2}\right)$$

In the multivariate form, the corresponding change would be to regard the covariance matrix $\mathbf{V}$ instead of its inverse $\mathbf{V}^{-1}$ as a parameter.
Cumulative distribution function

$$F(x,\sigma^2 \mid \mu,\lambda,\alpha,\beta) = \frac{e^{-\frac{\beta}{\sigma^2}}\left(\frac{\beta}{\sigma^2}\right)^{\alpha}\left(\operatorname{erf}\left(\frac{\sqrt{\lambda}(x-\mu)}{\sqrt{2}\,\sigma}\right)+1\right)}{2\sigma^2\,\Gamma(\alpha)}$$

Marginal distributions

Given $(x,\sigma^2) \sim \text{N-}\Gamma^{-1}(\mu,\lambda,\alpha,\beta)$ as above, $\sigma^2$ by itself follows an inverse-gamma distribution:
$$\sigma^2 \sim \Gamma^{-1}(\alpha,\beta),$$

while $\sqrt{\frac{\alpha\lambda}{\beta}}\,(x-\mu)$ follows a t distribution with $2\alpha$ degrees of freedom.[1]
Proof for $\lambda = 1$

For $\lambda = 1$ the probability density function is

$$f(x,\sigma^2 \mid \mu,\alpha,\beta) = \frac{1}{\sigma\sqrt{2\pi}}\,\frac{\beta^\alpha}{\Gamma(\alpha)}\,\left(\frac{1}{\sigma^2}\right)^{\alpha+1}\exp\left(-\frac{2\beta+(x-\mu)^2}{2\sigma^2}\right)$$

The marginal distribution over $x$ is

$$f(x \mid \mu,\alpha,\beta) = \int_0^\infty f(x,\sigma^2 \mid \mu,\alpha,\beta)\,d\sigma^2 = \frac{1}{\sqrt{2\pi}}\,\frac{\beta^\alpha}{\Gamma(\alpha)}\int_0^\infty \left(\frac{1}{\sigma^2}\right)^{\alpha+1/2+1}\exp\left(-\frac{2\beta+(x-\mu)^2}{2\sigma^2}\right)d\sigma^2$$

Except for the normalization factor, the expression under the integral coincides with the inverse-gamma density

$$\Gamma^{-1}(x;a,b) = \frac{b^a}{\Gamma(a)}\,\frac{e^{-b/x}}{x^{a+1}},$$

with $x = \sigma^2$, $a = \alpha + 1/2$, $b = \frac{2\beta+(x-\mu)^2}{2}$.

Since $\int_0^\infty \Gamma^{-1}(x;a,b)\,dx = 1$, it follows that $\int_0^\infty x^{-(a+1)}e^{-b/x}\,dx = \Gamma(a)\,b^{-a}$, and therefore

$$\int_0^\infty \left(\frac{1}{\sigma^2}\right)^{\alpha+1/2+1}\exp\left(-\frac{2\beta+(x-\mu)^2}{2\sigma^2}\right)d\sigma^2 = \Gamma(\alpha+1/2)\left(\frac{2\beta+(x-\mu)^2}{2}\right)^{-(\alpha+1/2)}$$

Substituting this expression and factoring out the dependence on $x$,

$$f(x \mid \mu,\alpha,\beta) \propto_x \left(1+\frac{(x-\mu)^2}{2\beta}\right)^{-(\alpha+1/2)}.$$

The shape of the generalized Student's t-distribution is

$$t(x \mid \nu,\hat{\mu},\hat{\sigma}^2) \propto_x \left(1+\frac{1}{\nu}\,\frac{(x-\hat{\mu})^2}{\hat{\sigma}^2}\right)^{-(\nu+1)/2}.$$

Matching the two shapes, the marginal distribution $f(x \mid \mu,\alpha,\beta)$ therefore follows a t distribution with $2\alpha$ degrees of freedom:

$$f(x \mid \mu,\alpha,\beta) = t(x \mid \nu = 2\alpha,\ \hat{\mu} = \mu,\ \hat{\sigma}^2 = \beta/\alpha).$$
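This marginal is easy to check by simulation: draw $(x, \sigma^2)$ pairs, standardize $x$, and compare against a t distribution with $2\alpha$ degrees of freedom. A minimal Python sketch (parameter values are arbitrary choices of mine):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mu, lam, alpha, beta = 1.0, 2.0, 3.0, 1.5
n = 200_000

# Draw (x, sigma^2) from the normal-inverse-gamma distribution
sigma2 = stats.invgamma.rvs(a=alpha, scale=beta, size=n, random_state=rng)
x = rng.normal(loc=mu, scale=np.sqrt(sigma2 / lam))

# sqrt(alpha * lam / beta) * (x - mu) should follow a t distribution
# with 2 * alpha degrees of freedom
z = np.sqrt(alpha * lam / beta) * (x - mu)
print(stats.kstest(z, stats.t(df=2 * alpha).cdf))  # small p-values are unexpected
```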
In the multivariate case, the marginal distribution of $\mathbf{x}$ is a multivariate t distribution:

$$\mathbf{x} \sim t_{2\alpha}\!\left(\boldsymbol{\mu},\ \frac{\beta}{\alpha}\mathbf{V}\right)$$

Scaling

Suppose
$$(x,\sigma^2) \sim \text{N-}\Gamma^{-1}(\mu,\lambda,\alpha,\beta).$$

Then for $c > 0$,

$$(cx, c\sigma^2) \sim \text{N-}\Gamma^{-1}(c\mu, \lambda/c, \alpha, c\beta).$$

Proof: Let $(x,\sigma^2) \sim \text{N-}\Gamma^{-1}(\mu,\lambda,\alpha,\beta)$ and fix $c > 0$. Defining $Y = (Y_1, Y_2) = (cx, c\sigma^2)$, observe that the PDF of the random variable $Y$ evaluated at $(y_1, y_2)$ is given by $1/c^2$ times the PDF of a $\text{N-}\Gamma^{-1}(\mu,\lambda,\alpha,\beta)$ random variable evaluated at $(y_1/c, y_2/c)$. Hence the PDF of $Y$ evaluated at $(y_1, y_2)$ is

$$f_Y(y_1,y_2) = \frac{1}{c^2}\,\frac{\sqrt{\lambda}}{\sqrt{2\pi y_2/c}}\,\frac{\beta^\alpha}{\Gamma(\alpha)}\,\left(\frac{1}{y_2/c}\right)^{\alpha+1}\exp\left(-\frac{2\beta+\lambda(y_1/c-\mu)^2}{2y_2/c}\right) = \frac{\sqrt{\lambda/c}}{\sqrt{2\pi y_2}}\,\frac{(c\beta)^\alpha}{\Gamma(\alpha)}\,\left(\frac{1}{y_2}\right)^{\alpha+1}\exp\left(-\frac{2c\beta+(\lambda/c)\,(y_1-c\mu)^2}{2y_2}\right).$$

The right-hand expression is the PDF of a $\text{N-}\Gamma^{-1}(c\mu, \lambda/c, \alpha, c\beta)$ random variable evaluated at $(y_1, y_2)$, which completes the proof.
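The scaling property also lends itself to a quick Monte Carlo check: scaled draws from one normal-inverse-gamma distribution should be indistinguishable from direct draws of the rescaled one. A sketch under the same SciPy conventions as above (the helper sample_nig is my own):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
mu, lam, alpha, beta, c, n = 0.5, 2.0, 4.0, 1.5, 3.0, 200_000

def sample_nig(mu, lam, alpha, beta, n, rng):
    """Draw n (x, sigma^2) pairs from N-InvGamma(mu, lam, alpha, beta)."""
    sigma2 = stats.invgamma.rvs(a=alpha, scale=beta, size=n, random_state=rng)
    x = rng.normal(loc=mu, scale=np.sqrt(sigma2 / lam))
    return x, sigma2

# Scaled draws from N-InvGamma(mu, lam, alpha, beta) ...
x, s2 = sample_nig(mu, lam, alpha, beta, n, rng)
# ... versus direct draws from N-InvGamma(c*mu, lam/c, alpha, c*beta)
x_ref, s2_ref = sample_nig(c * mu, lam / c, alpha, c * beta, n, rng)

# Two-sample KS tests on each marginal; small p-values are unexpected
print(stats.ks_2samp(c * x, x_ref).pvalue)
print(stats.ks_2samp(c * s2, s2_ref).pvalue)
```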
Exponential family

Normal-inverse-gamma distributions form an exponential family with natural parameters $\theta_1 = -\frac{\lambda}{2}$, $\theta_2 = \lambda\mu$, $\theta_3 = \alpha$, and $\theta_4 = -\beta - \frac{\lambda\mu^2}{2}$, and sufficient statistics $T_1 = \frac{x^2}{\sigma^2}$, $T_2 = \frac{x}{\sigma^2}$, $T_3 = \log\left(\frac{1}{\sigma^2}\right)$, and $T_4 = \frac{1}{\sigma^2}$.
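To make this explicit, one can expand the log of the density given above; the following arrangement is my own bookkeeping (the leftover $\tfrac{3}{2}\log\tfrac{1}{\sigma^2}$ term, not matched by $\theta_3 T_3$, is absorbed into the base measure):

$$\log f(x,\sigma^2 \mid \mu,\lambda,\alpha,\beta) = \underbrace{-\frac{\lambda}{2}}_{\theta_1}\underbrace{\frac{x^2}{\sigma^2}}_{T_1} + \underbrace{\lambda\mu}_{\theta_2}\underbrace{\frac{x}{\sigma^2}}_{T_2} + \underbrace{\alpha}_{\theta_3}\underbrace{\log\frac{1}{\sigma^2}}_{T_3} + \underbrace{\left(-\beta-\frac{\lambda\mu^2}{2}\right)}_{\theta_4}\underbrace{\frac{1}{\sigma^2}}_{T_4} + \frac{3}{2}\log\frac{1}{\sigma^2} + \frac{1}{2}\log\frac{\lambda}{2\pi} + \alpha\log\beta - \log\Gamma(\alpha).$$

The last three terms depend only on $\sigma^2$ (the base measure) or only on the parameters (the negative log-normalizer), as the exponential-family form requires.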
Information entropy

Kullback–Leibler divergence

The Kullback–Leibler divergence measures the difference between two distributions.
Maximum likelihood estimation
Posterior distribution of the parameters

See the articles on normal-gamma distribution and conjugate prior.
Interpretation of the parameters

See the articles on normal-gamma distribution and conjugate prior.
Generating normal-inverse-gamma random variates

Generation of random variates is straightforward:
1. Sample $\sigma^2$ from an inverse-gamma distribution with parameters $\alpha$ and $\beta$.
2. Sample $x$ from a normal distribution with mean $\mu$ and variance $\sigma^2/\lambda$.

A code sketch of these two steps is given below.
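A minimal Python sketch of the two-step sampler, assuming SciPy's invgamma parameterization (a $= \alpha$, scale $= \beta$); the function name sample_nig is my own:

```python
import numpy as np
from scipy import stats

def sample_nig(mu, lam, alpha, beta, size, rng):
    """Draw (x, sigma^2) pairs from N-InvGamma(mu, lam, alpha, beta)."""
    # Step 1: sigma^2 ~ InvGamma(alpha, beta)
    sigma2 = stats.invgamma.rvs(a=alpha, scale=beta, size=size, random_state=rng)
    # Step 2: x | sigma^2 ~ Normal(mu, sigma^2 / lam)
    x = rng.normal(loc=mu, scale=np.sqrt(sigma2 / lam))
    return x, sigma2

rng = np.random.default_rng(42)
x, sigma2 = sample_nig(mu=0.0, lam=2.0, alpha=3.0, beta=1.5, size=100_000, rng=rng)
# Sample moments should be close to mu and beta / (alpha - 1) respectively
print(x.mean(), sigma2.mean())
```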
Related distributions

The normal-gamma distribution is the same family parameterized in terms of the precision $\tau = 1/\sigma^2$ rather than the variance; the normal-inverse-Wishart distribution is its multivariate generalization.

References

- Denison, David G. T.; et al. (2002). Bayesian Methods for Nonlinear Classification and Regression. Wiley. ISBN 0471490369.
- Koch, Karl-Rudolf (2007). Introduction to Bayesian Statistics (2nd ed.). Springer. ISBN 354072723X.