
Wishart distribution

From Wikipedia, the free encyclopedia
Generalization of gamma distribution to multiple dimensions
Wishart
Notation: $X \sim W_p(\mathbf{V}, n)$
Parameters: $n$ degrees of freedom (real); $\mathbf{V} > 0$ scale matrix ($p \times p$, positive definite)
Support: $\mathbf{X}$ is a $p \times p$ positive definite matrix
PDF: $f_{\mathbf{X}}(\mathbf{X}) = \dfrac{|\mathbf{X}|^{(n-p-1)/2} e^{-\operatorname{tr}(\mathbf{V}^{-1}\mathbf{X})/2}}{2^{np/2}\,|\mathbf{V}|^{n/2}\,\Gamma_p\!\left(\frac{n}{2}\right)}$
Mean: $\operatorname{E}[\mathbf{X}] = n\mathbf{V}$
Mode: $(n - p - 1)\mathbf{V}$ for $n \geq p + 1$
Variance: $\operatorname{Var}(\mathbf{X}_{ij}) = n\left(v_{ij}^2 + v_{ii}v_{jj}\right)$
Entropy: see below
CF: $\Theta \mapsto \left|\mathbf{I} - 2i\,\Theta\mathbf{V}\right|^{-\frac{n}{2}}$

In statistics, the Wishart distribution is a generalization of the gamma distribution to multiple dimensions. It is named in honor of John Wishart, who first formulated the distribution in 1928.[1] Other names include Wishart ensemble (in random matrix theory, probability distributions over matrices are usually called "ensembles"), Wishart–Laguerre ensemble (since its eigenvalue distribution involves Laguerre polynomials), or LOE, LUE, LSE (in analogy with GOE, GUE, GSE).[2]

It is a family of probability distributions defined over symmetric, positive-definite random matrices (i.e. matrix-valued random variables). These distributions are of great importance in the estimation of covariance matrices in multivariate statistics. In Bayesian statistics, the Wishart distribution is the conjugate prior of the inverse covariance matrix of a multivariate normal random vector.[3]

Definition


Suppose G is a p × n matrix, each column of which is independently drawn from a p-variate normal distribution with zero mean:

$$G = (g_1, \dots, g_n) \sim \mathcal{N}_p(0, V).$$

That is, $g_i = (g_{i,1}, \dots, g_{i,p})^T \overset{\text{iid}}{\sim} \mathcal{N}_p(0, V)$ for all $i \in \{1, \dots, n\}$.

Then the Wishart distribution is the probability distribution of the p × p random matrix[4]

$$S = GG^T = \sum_{i=1}^n g_i g_i^T,$$

known as the scatter matrix. One indicates that S has that probability distribution by writing

$$S \sim W_p(V, n).$$

The positive integer n is the number of degrees of freedom. Sometimes this is written W(V, p, n). For n ≥ p the matrix S is invertible with probability 1 if V is invertible.

If p = V = 1 then this distribution is a chi-squared distribution with n degrees of freedom.
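The definition translates directly into a sampler. The sketch below (NumPy; the particular V, n, and sample count are arbitrary illustrative choices) draws scatter matrices $S = GG^T$ and checks the mean $\operatorname{E}[S] = nV$ by Monte Carlo:

```python
import numpy as np

def sample_wishart_scatter(L, n, rng):
    """Draw S ~ W_p(V, n) as the scatter matrix of n iid N_p(0, V) columns,
    where L is the Cholesky factor of V."""
    p = L.shape[0]
    G = L @ rng.standard_normal((p, n))  # columns g_i ~ N_p(0, V)
    return G @ G.T

rng = np.random.default_rng(0)
p, n = 3, 10
V = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.3],
              [0.0, 0.3, 1.5]])
L = np.linalg.cholesky(V)

# Monte Carlo check of the mean E[S] = n V
S_bar = np.mean([sample_wishart_scatter(L, n, rng) for _ in range(20000)], axis=0)
```

Each draw is symmetric and positive definite by construction, and the sample average of the draws approaches nV.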

Occurrence


The Wishart distribution arises as the distribution of the sample covariance matrix for a sample from a multivariate normal distribution. It occurs frequently in likelihood-ratio tests in multivariate statistical analysis. It also arises in the spectral theory of random matrices[citation needed] and in multidimensional Bayesian analysis.[5] It is also encountered in wireless communications, while analyzing the performance of Rayleigh fading MIMO wireless channels.[6]

Probability density function

[Figure: Spectral density of the Wishart–Laguerre ensemble with dimensions (8, 15); a reconstruction of Figure 1 of [7].]

The Wishart distribution can be characterized by its probability density function as follows:

Let X be a p × p symmetric matrix of random variables that is positive semi-definite. Let V be a (fixed) symmetric positive definite matrix of size p × p.

Then, if n ≥ p, X has a Wishart distribution with n degrees of freedom if it has the probability density function

$$f_{\mathbf{X}}(\mathbf{X}) = \frac{1}{2^{np/2}\left|\mathbf{V}\right|^{n/2}\Gamma_p\left(\frac{n}{2}\right)} \left|\mathbf{X}\right|^{(n-p-1)/2} e^{-\frac{1}{2}\operatorname{tr}(\mathbf{V}^{-1}\mathbf{X})}$$

where $\left|\mathbf{X}\right|$ is the determinant of $\mathbf{X}$ and $\Gamma_p$ is the multivariate gamma function defined as

$$\Gamma_p\left(\frac{n}{2}\right) = \pi^{p(p-1)/4} \prod_{j=1}^p \Gamma\left(\frac{n}{2} - \frac{j-1}{2}\right).$$

The density above is not the joint density of all $p^2$ elements of the random matrix X (such a $p^2$-dimensional density does not exist because of the symmetry constraints $X_{ij} = X_{ji}$); rather, it is the joint density of the $p(p+1)/2$ elements $X_{ij}$ for $i \leq j$ ([1], page 38). Also, the density formula above applies only to positive definite matrices $\mathbf{x}$; for other matrices the density is equal to zero.

In fact the above definition can be extended to any real n > p − 1. If n ≤ p − 1, then the Wishart no longer has a density; instead it represents a singular distribution that takes values in a lower-dimensional subspace of the space of p × p matrices.[8]
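As a numerical sketch of the density formula (the specific X, V, and n below are arbitrary), a hand-coded log-density can be checked against `scipy.stats.wishart`:

```python
import numpy as np
from scipy.stats import wishart
from scipy.special import multigammaln

def wishart_logpdf(X, V, n):
    """Log-density of W_p(V, n) at a positive definite X, per the formula above."""
    p = V.shape[0]
    _, logdet_X = np.linalg.slogdet(X)
    _, logdet_V = np.linalg.slogdet(V)
    tr_term = np.trace(np.linalg.solve(V, X))  # tr(V^{-1} X)
    return (0.5 * (n - p - 1) * logdet_X
            - 0.5 * tr_term
            - 0.5 * n * p * np.log(2)
            - 0.5 * n * logdet_V
            - multigammaln(0.5 * n, p))

V = np.array([[1.0, 0.3], [0.3, 2.0]])
X = np.array([[2.0, 0.4], [0.4, 3.0]])
n = 5
ours = wishart_logpdf(X, V, n)
ref = wishart.logpdf(X, df=n, scale=V)
```

The two values agree to floating-point precision, confirming the normalization constant term by term.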

Spectral density


The joint-eigenvalue density for the eigenvalues $\lambda_1, \dots, \lambda_p \geq 0$ of a random matrix $\mathbf{X} \sim W_p(\mathbf{I}, n)$ is[9][10]

$$c_{n,p}\, e^{-\frac{1}{2}\sum_i \lambda_i} \prod_i \lambda_i^{(n-p-1)/2} \prod_{i<j} \left|\lambda_i - \lambda_j\right|,$$

where $c_{n,p}$ is a constant. The spectral density can be marginalized to yield the density of a single eigenvalue by evaluating a Selberg integral.

The spectral density can be integrated to give the probability that all eigenvalues of a Wishart random matrix lie within an interval.[11]

Use in Bayesian statistics


In Bayesian statistics, in the context of the multivariate normal distribution, the Wishart distribution is the conjugate prior to the precision matrix $\Omega = \Sigma^{-1}$, where $\Sigma$ is the covariance matrix.[12]: 135 [13] The use of Normal-Wishart conjugate priors (for mean and precision) is particularly common for vector autoregression models.[14]

Choice of parameters


The least informative, proper Wishart prior is obtained by setting n = p.[citation needed]

A common choice for V leverages the fact that the mean of X ~ W_p(V, n) is nV. Then V is chosen so that nV equals an initial guess for X. For instance, when estimating a precision matrix $\Sigma^{-1} \sim W_p(V, n)$, a reasonable choice for V would be $n^{-1}\Sigma_0^{-1}$, where $\Sigma_0$ is some prior estimate for the covariance matrix $\Sigma$.
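A minimal sketch of this parameter choice (the prior guess $\Sigma_0$ and the degrees of freedom are hypothetical values chosen for illustration):

```python
import numpy as np

# Hypothetical prior guess Sigma0 for the covariance matrix
Sigma0 = np.array([[1.0, 0.2],
                   [0.2, 0.5]])
p = Sigma0.shape[0]
n = p + 2                       # illustrative degrees of freedom (must exceed p - 1)
V = np.linalg.inv(Sigma0) / n   # scale chosen so that E[X] = n V = Sigma0^{-1}

prior_mean = n * V              # the implied prior mean of the precision matrix
```

By construction the prior mean of the precision matrix equals the inverse of the prior covariance guess.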

Properties


Log-expectation


The following formula plays a role in variational Bayes derivations for Bayes networks involving the Wishart distribution. From equation (2.63),[15]

$$\operatorname{E}[\ln\left|\mathbf{X}\right|] = \psi_p\left(\frac{n}{2}\right) + p\ln 2 + \ln\left|\mathbf{V}\right|$$

where $\psi_p$ is the multivariate digamma function (the derivative of the log of the multivariate gamma function).
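The identity can be sanity-checked by Monte Carlo (SciPy; the V, n, and sample count below are arbitrary choices, and the tolerance reflects Monte Carlo noise):

```python
import numpy as np
from scipy.special import psi
from scipy.stats import wishart

def multi_digamma(a, p):
    """Multivariate digamma: psi_p(a) = sum_{j=1}^p psi(a + (1 - j)/2)."""
    return sum(psi(a + (1 - j) / 2) for j in range(1, p + 1))

p, n = 3, 7
V = np.diag([1.0, 2.0, 0.5])
formula = multi_digamma(n / 2, p) + p * np.log(2) + np.linalg.slogdet(V)[1]

rng = np.random.default_rng(1)
samples = wishart.rvs(df=n, scale=V, size=20000, random_state=rng)
mc = np.mean([np.linalg.slogdet(S)[1] for S in samples])
```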

Log-variance


The following variance computation could be of help in Bayesian statistics:

$$\operatorname{Var}\left[\ln\left|\mathbf{X}\right|\right] = \sum_{i=1}^p \psi_1\left(\frac{n+1-i}{2}\right)$$

where $\psi_1$ is the trigamma function. This comes up when computing the Fisher information of the Wishart random variable.
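A matching Monte Carlo check of the variance formula (trigamma via `scipy.special.polygamma`; parameters and tolerance are arbitrary illustrative choices):

```python
import numpy as np
from scipy.special import polygamma
from scipy.stats import wishart

p, n = 2, 6
V = np.array([[1.0, 0.4], [0.4, 1.5]])
# Formula above: sum of trigamma((n + 1 - i)/2) over i = 1..p
var_formula = sum(polygamma(1, (n + 1 - i) / 2) for i in range(1, p + 1))

rng = np.random.default_rng(2)
samples = wishart.rvs(df=n, scale=V, size=40000, random_state=rng)
logdets = np.array([np.linalg.slogdet(S)[1] for S in samples])
var_mc = logdets.var()
```

Note that, consistent with the formula, the variance does not depend on V.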

Entropy


The information entropy of the distribution has the following formula:[12]: 693

$$\operatorname{H}\left[\mathbf{X}\right] = -\ln\left(B(\mathbf{V}, n)\right) - \frac{n-p-1}{2}\operatorname{E}\left[\ln\left|\mathbf{X}\right|\right] + \frac{np}{2}$$

where B(V, n) is the normalizing constant of the distribution:

$$B(\mathbf{V}, n) = \frac{1}{\left|\mathbf{V}\right|^{n/2}\, 2^{np/2}\, \Gamma_p\left(\frac{n}{2}\right)}.$$

This can be expanded as follows:

$$\begin{aligned}
\operatorname{H}\left[\mathbf{X}\right] &= \frac{n}{2}\ln\left|\mathbf{V}\right| + \frac{np}{2}\ln 2 + \ln\Gamma_p\left(\frac{n}{2}\right) - \frac{n-p-1}{2}\operatorname{E}\left[\ln\left|\mathbf{X}\right|\right] + \frac{np}{2} \\
&= \frac{n}{2}\ln\left|\mathbf{V}\right| + \frac{np}{2}\ln 2 + \ln\Gamma_p\left(\frac{n}{2}\right) - \frac{n-p-1}{2}\left(\psi_p\left(\frac{n}{2}\right) + p\ln 2 + \ln\left|\mathbf{V}\right|\right) + \frac{np}{2} \\
&= \frac{n}{2}\ln\left|\mathbf{V}\right| + \frac{np}{2}\ln 2 + \ln\Gamma_p\left(\frac{n}{2}\right) - \frac{n-p-1}{2}\psi_p\left(\frac{n}{2}\right) - \frac{n-p-1}{2}\left(p\ln 2 + \ln\left|\mathbf{V}\right|\right) + \frac{np}{2} \\
&= \frac{p+1}{2}\ln\left|\mathbf{V}\right| + \frac{1}{2}p(p+1)\ln 2 + \ln\Gamma_p\left(\frac{n}{2}\right) - \frac{n-p-1}{2}\psi_p\left(\frac{n}{2}\right) + \frac{np}{2}
\end{aligned}$$
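The closed form can be compared against a Monte Carlo estimate of $-\operatorname{E}[\log f(\mathbf{X})]$ (arbitrary parameters; `multigammaln` supplies $\ln\Gamma_p$):

```python
import numpy as np
from scipy.special import multigammaln, psi
from scipy.stats import wishart

def multi_digamma(a, p):
    return sum(psi(a + (1 - j) / 2) for j in range(1, p + 1))

p, n = 2, 6
V = np.array([[1.0, 0.3], [0.3, 0.8]])
logdet_V = np.linalg.slogdet(V)[1]

# Final line of the expansion above
H_formula = ((p + 1) / 2 * logdet_V + 0.5 * p * (p + 1) * np.log(2)
             + multigammaln(n / 2, p)
             - (n - p - 1) / 2 * multi_digamma(n / 2, p)
             + n * p / 2)

rng = np.random.default_rng(7)
samples = wishart.rvs(df=n, scale=V, size=20000, random_state=rng)
H_mc = -np.mean([wishart.logpdf(S, df=n, scale=V) for S in samples])
```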

Cross-entropy


The cross-entropy of two Wishart distributions $p_0$ with parameters $n_0, V_0$ and $p_1$ with parameters $n_1, V_1$ is

$$\begin{aligned}
H(p_0, p_1) &= \operatorname{E}_{p_0}\left[-\log p_1\right] \\
&= \operatorname{E}_{p_0}\left[-\log \frac{\left|\mathbf{X}\right|^{(n_1-p_1-1)/2} e^{-\operatorname{tr}(\mathbf{V}_1^{-1}\mathbf{X})/2}}{2^{n_1 p_1/2}\left|\mathbf{V}_1\right|^{n_1/2}\Gamma_{p_1}\left(\tfrac{n_1}{2}\right)}\right] \\
&= \tfrac{n_1 p_1}{2}\log 2 + \tfrac{n_1}{2}\log\left|\mathbf{V}_1\right| + \log\Gamma_{p_1}\left(\tfrac{n_1}{2}\right) - \tfrac{n_1-p_1-1}{2}\operatorname{E}_{p_0}\left[\log\left|\mathbf{X}\right|\right] + \tfrac{1}{2}\operatorname{E}_{p_0}\left[\operatorname{tr}\left(\mathbf{V}_1^{-1}\mathbf{X}\right)\right] \\
&= \tfrac{n_1 p_1}{2}\log 2 + \tfrac{n_1}{2}\log\left|\mathbf{V}_1\right| + \log\Gamma_{p_1}\left(\tfrac{n_1}{2}\right) - \tfrac{n_1-p_1-1}{2}\left(\psi_{p_0}\left(\tfrac{n_0}{2}\right) + p_0\log 2 + \log\left|\mathbf{V}_0\right|\right) + \tfrac{1}{2}\operatorname{tr}\left(\mathbf{V}_1^{-1} n_0 \mathbf{V}_0\right) \\
&= -\tfrac{n_1}{2}\log\left|\mathbf{V}_1^{-1}\mathbf{V}_0\right| + \tfrac{p_1+1}{2}\log\left|\mathbf{V}_0\right| + \tfrac{n_0}{2}\operatorname{tr}\left(\mathbf{V}_1^{-1}\mathbf{V}_0\right) + \log\Gamma_{p_1}\left(\tfrac{n_1}{2}\right) - \tfrac{n_1-p_1-1}{2}\psi_{p_0}\left(\tfrac{n_0}{2}\right) + \tfrac{n_1(p_1-p_0) + p_0(p_1+1)}{2}\log 2
\end{aligned}$$

Note that when the two distributions coincide (in particular $p_0 = p_1$, $n_0 = n_1$, and $V_0 = V_1$) we recover the entropy.

KL-divergence


The Kullback–Leibler divergence of $p_1$ from $p_0$ is

$$\begin{aligned}
D_{KL}(p_0 \| p_1) &= H(p_0, p_1) - H(p_0) \\
&= -\frac{n_1}{2}\log\left|\mathbf{V}_1^{-1}\mathbf{V}_0\right| + \frac{n_0}{2}\left(\operatorname{tr}\left(\mathbf{V}_1^{-1}\mathbf{V}_0\right) - p\right) + \log\frac{\Gamma_p\left(\frac{n_1}{2}\right)}{\Gamma_p\left(\frac{n_0}{2}\right)} + \frac{n_0 - n_1}{2}\psi_p\left(\frac{n_0}{2}\right)
\end{aligned}$$
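The closed form is easy to implement and check: it vanishes when the two distributions coincide and is positive otherwise (the matrices and degrees of freedom below are arbitrary):

```python
import numpy as np
from scipy.special import multigammaln, psi

def multi_digamma(a, p):
    return sum(psi(a + (1 - j) / 2) for j in range(1, p + 1))

def wishart_kl(V0, n0, V1, n1):
    """KL(W_p(V0, n0) || W_p(V1, n1)) from the closed form above."""
    p = V0.shape[0]
    M = np.linalg.solve(V1, V0)                # V1^{-1} V0
    _, logdet_M = np.linalg.slogdet(M)
    return (-0.5 * n1 * logdet_M
            + 0.5 * n0 * (np.trace(M) - p)
            + multigammaln(0.5 * n1, p) - multigammaln(0.5 * n0, p)
            + 0.5 * (n0 - n1) * multi_digamma(0.5 * n0, p))

V = np.array([[1.0, 0.2], [0.2, 0.8]])
kl_same = wishart_kl(V, 5, V, 5)               # identical distributions -> 0
kl_diff = wishart_kl(V, 5, np.eye(2), 7)       # distinct distributions -> positive
```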

Characteristic function


The characteristic function of the Wishart distribution is

$$\Theta \mapsto \operatorname{E}\left[\exp\left(i\operatorname{tr}\left(\mathbf{X}\Theta\right)\right)\right] = \left|\mathbf{1} - 2i\,\Theta\mathbf{V}\right|^{-n/2}$$

where E[⋅] denotes expectation. (Here Θ is any matrix with the same dimensions as V, 1 indicates the identity matrix, and i is a square root of −1.)[10] Properly interpreting this formula requires a little care, because noninteger complex powers are multivalued; when n is noninteger, the correct branch must be determined via analytic continuation.[16]

Theorem


If a p × p random matrix X has a Wishart distribution with m degrees of freedom and variance matrix V (written $\mathbf{X} \sim \mathcal{W}_p(\mathbf{V}, m)$) and C is a q × p matrix of rank q, then[17]

$$\mathbf{C}\mathbf{X}\mathbf{C}^T \sim \mathcal{W}_q\left(\mathbf{C}\mathbf{V}\mathbf{C}^T, m\right).$$

Corollary 1


If z is a nonzero p × 1 constant vector, then[17]

$$\sigma_z^{-2}\,\mathbf{z}^T\mathbf{X}\mathbf{z} \sim \chi_m^2.$$

In this case, $\chi_m^2$ is the chi-squared distribution and $\sigma_z^2 = \mathbf{z}^T\mathbf{V}\mathbf{z}$ (note that $\sigma_z^2$ is a constant; it is positive because V is positive definite).
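A quick Monte Carlo sketch of this corollary (arbitrary V, z, and m; the first two moments of the scaled quadratic form are compared against the chi-squared values m and 2m):

```python
import numpy as np
from scipy.stats import wishart

rng = np.random.default_rng(3)
p, m = 3, 8
V = np.array([[1.0, 0.3, 0.1],
              [0.3, 2.0, 0.0],
              [0.1, 0.0, 0.5]])
z = np.array([1.0, -2.0, 0.5])
sigma_z2 = z @ V @ z                     # z^T V z > 0 since V is positive definite

samples = wishart.rvs(df=m, scale=V, size=30000, random_state=rng)
q = np.array([z @ S @ z for S in samples]) / sigma_z2

mean_q, var_q = q.mean(), q.var()        # chi^2_m has mean m and variance 2m
```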

Corollary 2


Consider the case where $\mathbf{z}^T = (0, \dots, 0, 1, 0, \dots, 0)$ (that is, the j-th element is one and all others zero). Then corollary 1 above shows that

$$\sigma_{jj}^{-1}\,w_{jj} \sim \chi_m^2$$

gives the marginal distribution of each of the elements on the matrix's diagonal.

George Seber points out that the Wishart distribution is not called the "multivariate chi-squared distribution" because the marginal distribution of the off-diagonal elements is not chi-squared. Seber prefers to reserve the term multivariate for the case when all univariate marginals belong to the same family.[18]

Estimator of the multivariate normal distribution


The Wishart distribution is the sampling distribution of the maximum-likelihood estimator (MLE) of the covariance matrix of a multivariate normal distribution.[19] A derivation of the MLE uses the spectral theorem.

Bartlett decomposition


The Bartlett decomposition of a matrix X from a p-variate Wishart distribution with scale matrix V and n degrees of freedom is the factorization:

$$\mathbf{X} = \mathbf{L}\mathbf{A}\mathbf{A}^T\mathbf{L}^T,$$

where L is the Cholesky factor of V, and:

$$\mathbf{A} = \begin{pmatrix}
c_1 & 0 & 0 & \cdots & 0 \\
n_{21} & c_2 & 0 & \cdots & 0 \\
n_{31} & n_{32} & c_3 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
n_{p1} & n_{p2} & n_{p3} & \cdots & c_p
\end{pmatrix}$$

where $c_i^2 \sim \chi_{n-i+1}^2$ and $n_{ij} \sim N(0, 1)$ independently.[20] This provides a useful method for obtaining random samples from a Wishart distribution.[21]
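The decomposition gives a sampler that needs only p chi-squared draws and p(p−1)/2 normal draws per matrix. A sketch (NumPy; V, n, and the Monte Carlo check of the mean nV are illustrative):

```python
import numpy as np

def wishart_bartlett(V, n, rng):
    """Sample X ~ W_p(V, n) via the Bartlett decomposition X = L A A^T L^T."""
    p = V.shape[0]
    L = np.linalg.cholesky(V)
    A = np.tril(rng.standard_normal((p, p)), k=-1)   # n_ij ~ N(0, 1) below the diagonal
    A[np.diag_indices(p)] = np.sqrt(
        rng.chisquare(df=n - np.arange(p)))          # c_i^2 ~ chi^2_{n-i+1}
    LA = L @ A
    return LA @ LA.T

rng = np.random.default_rng(4)
p, n = 2, 9
V = np.array([[1.0, 0.5], [0.5, 2.0]])
X_bar = np.mean([wishart_bartlett(V, n, rng) for _ in range(20000)], axis=0)
```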

Marginal distribution of matrix elements


Let V be a 2 × 2 variance matrix characterized by correlation coefficient −1 < ρ < 1 and L its lower Cholesky factor:

$$\mathbf{V} = \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix}, \qquad \mathbf{L} = \begin{pmatrix} \sigma_1 & 0 \\ \rho\sigma_2 & \sqrt{1-\rho^2}\,\sigma_2 \end{pmatrix}$$

Multiplying through the Bartlett decomposition above, we find that a random sample from the 2 × 2 Wishart distribution is

$$\mathbf{X} = \begin{pmatrix} \sigma_1^2 c_1^2 & \sigma_1\sigma_2\left(\rho c_1^2 + \sqrt{1-\rho^2}\, c_1 n_{21}\right) \\ \sigma_1\sigma_2\left(\rho c_1^2 + \sqrt{1-\rho^2}\, c_1 n_{21}\right) & \sigma_2^2\left(\left(1-\rho^2\right) c_2^2 + \left(\sqrt{1-\rho^2}\, n_{21} + \rho c_1\right)^2\right) \end{pmatrix}$$

The diagonal elements, most evidently in the first element, follow the χ² distribution with n degrees of freedom (scaled by σ²) as expected. The off-diagonal element is less familiar but can be identified as a normal variance-mean mixture where the mixing density is a χ² distribution. The corresponding marginal probability density for the off-diagonal element is therefore the variance-gamma distribution

$$f(x_{12}) = \frac{\left|x_{12}\right|^{\frac{n-1}{2}}}{\Gamma\left(\frac{n}{2}\right)\sqrt{2^{n-1}\pi\left(1-\rho^2\right)\left(\sigma_1\sigma_2\right)^{n+1}}}\cdot K_{\frac{n-1}{2}}\left(\frac{\left|x_{12}\right|}{\sigma_1\sigma_2\left(1-\rho^2\right)}\right)\exp\left(\frac{\rho x_{12}}{\sigma_1\sigma_2(1-\rho^2)}\right)$$

where $K_\nu(z)$ is the modified Bessel function of the second kind.[22] Similar results may be found for higher dimensions. In general, if $X$ follows a Wishart distribution with parameters $\Sigma, n$, then for $i \neq j$, the off-diagonal elements

$$X_{ij} \sim \text{VG}\left(n, \Sigma_{ij}, \left(\Sigma_{ii}\Sigma_{jj} - \Sigma_{ij}^2\right)^{1/2}, 0\right).$$[23]

It is also possible to write down the moment-generating function even in the noncentral case (essentially the nth power of Craig (1936),[24] equation 10), although the probability density becomes an infinite sum of Bessel functions.

The range of the shape parameter


It can be shown[25] that the Wishart distribution can be defined if and only if the shape parameter n belongs to the set

$$\Lambda_p := \{0, \ldots, p-1\} \cup (p-1, \infty).$$

This set is named after Simon Gindikin, who introduced it[26] in the 1970s in the context of gamma distributions on homogeneous cones. However, for the new parameters in the discrete spectrum of the Gindikin ensemble, namely,

$$\Lambda_p^* := \{0, \ldots, p-1\},$$

the corresponding Wishart distribution has no Lebesgue density.

Wishart–Laguerre ensembles and β-extensions


In random matrix theory, the Wishart family is often studied through its Laguerre ensembles. For the real case (orthogonal symmetry, $\beta = 1$), the joint density of the eigenvalues $\lambda_1, \dots, \lambda_p \geq 0$ of $\mathbf{X} \sim W_p(\mathbf{I}, n)$ is

$$c_{n,p}\,\exp\left(-\tfrac{1}{2}\sum_{i=1}^p \lambda_i\right)\prod_{i=1}^p \lambda_i^{\frac{n-p-1}{2}}\prod_{1\leq i<j\leq p}\left|\lambda_i - \lambda_j\right|,$$

which is the Laguerre orthogonal ensemble (LOE). The complex and quaternion analogues are the Laguerre unitary (LUE, $\beta = 2$) and Laguerre symplectic (LSE, $\beta = 4$) ensembles, respectively.[27]

β-Laguerre ensemble (general β > 0)


A further generalization, the β-Laguerre ensemble, allows the Dyson index $\beta > 0$ to vary continuously. Its joint eigenvalue density has the Coulomb gas form

$$p(\lambda_1, \ldots, \lambda_p) \propto \left(\prod_{i=1}^p \lambda_i^{\alpha}\, e^{-\frac{\beta}{2}\lambda_i}\right)\left(\prod_{i<j}\left|\lambda_i - \lambda_j\right|^{\beta}\right), \qquad \alpha > -1,$$

which reduces to LOE/LUE/LSE for $\beta = 1, 2, 4$. For the classical Gaussian Wishart case (scale $\mathbf{I}$), one has the identification

$$\alpha = \frac{\beta}{2}(n - p + 1) - 1.$$

A concrete probabilistic construction for any $\beta > 0$ is provided by the Dumitriu–Edelman bidiagonal model. One samples a random bidiagonal matrix $\mathbf{B}$ with independent chi variables and sets $\mathbf{L} = \mathbf{B}\mathbf{B}^T$; the eigenvalues of $\mathbf{L}$ follow the β-Laguerre law with parameter $\alpha$.[28][29]

Sampling

One samples

$$B_{ii} \sim \tfrac{1}{\sqrt{\beta}}\,\chi_{\beta(\alpha+p-i+1)}, \qquad B_{i,i-1} \sim \tfrac{1}{\sqrt{\beta}}\,\chi_{\beta(p-i+1)} \quad (i = 2, \dots, p),$$

and sets $\mathbf{L} = \mathbf{B}\mathbf{B}^T$. Then the eigenvalues of $\mathbf{L}$ have the β-Laguerre joint density above.[28]
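The bidiagonal recipe above can be sketched as follows (NumPy; chi variates obtained as square roots of chi-squared draws, with the p, α, β values chosen arbitrarily for illustration):

```python
import numpy as np

def beta_laguerre_eigs(p, alpha, beta, rng):
    """Eigenvalues of L = B B^T for the Dumitriu-Edelman bidiagonal model."""
    B = np.zeros((p, p))
    i = np.arange(1, p + 1)
    # Diagonal: (1/sqrt(beta)) * chi with beta*(alpha + p - i + 1) degrees of freedom
    B[i - 1, i - 1] = np.sqrt(rng.chisquare(beta * (alpha + p - i + 1))) / np.sqrt(beta)
    j = np.arange(2, p + 1)
    # Subdiagonal: (1/sqrt(beta)) * chi with beta*(p - i + 1) degrees of freedom
    B[j - 1, j - 2] = np.sqrt(rng.chisquare(beta * (p - j + 1))) / np.sqrt(beta)
    return np.linalg.eigvalsh(B @ B.T)

rng = np.random.default_rng(5)
eigs = beta_laguerre_eigs(p=5, alpha=0.5, beta=2.0, rng=rng)
```

Since B is lower bidiagonal with positive diagonal, L = BBᵀ is positive definite and all p eigenvalues are positive.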

Rectangular data matrices

If one wishes to realize these spectra as the singular values of a rectangular $M \times N$ matrix (with $M \geq N = p$), draw independent Haar-distributed $\mathbf{U} \in O(M)$ (or $U(M)$) and $\mathbf{V} \in O(N)$, and set

$$X = U \begin{bmatrix} \operatorname{diag}\left(\sqrt{\lambda_1}, \dots, \sqrt{\lambda_p}\right) \\ 0 \end{bmatrix} V^{*},$$

so that $X^{*}X$ has eigenvalues $\{\lambda_i\}$. For i.i.d. Gaussian entries (true Wishart), $\beta$ is fixed by the field (real, complex, quaternion) and $\alpha$ reduces to $\frac{\beta}{2}(n-p+1)-1$.[29][31]

Hard-edge behavior and universality


At the "hard edge" (near $\lambda = 0$), β-Laguerre ensembles exhibit Bessel-kernel correlations and level repulsion of order $\beta$. In particular, with $p \to \infty$ and fixed $\alpha$, the distribution of the smallest eigenvalue converges to a universal hard-edge (Bessel) law that depends only on $\alpha$ and $\beta$.[32][29]

Macroscopic limit (Marchenko–Pastur law)


Under proportional growth $p/n \to \gamma \in (0, \infty)$, the empirical spectral distribution of $n^{-1}\mathbf{G}\mathbf{G}^T$ (with i.i.d. entries of variance $\sigma^2$) converges almost surely to the Marchenko–Pastur law, supported on $\left[\sigma^2(1-\sqrt{\gamma})^2,\ \sigma^2(1+\sqrt{\gamma})^2\right]$. When $\gamma = 1$, the density diverges as $x^{-1/2}$ at the origin (a hard-edge singularity).[33]
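A small numerical sketch of the limit (the dimensions p = 200, n = 400 are arbitrary; at this size the empirical eigenvalues already cluster inside the Marchenko–Pastur support, up to finite-size fluctuations at the edges):

```python
import numpy as np

rng = np.random.default_rng(6)
p, n = 200, 400                   # aspect ratio gamma = p/n = 0.5
sigma2 = 1.0
G = rng.standard_normal((p, n))   # i.i.d. entries with variance sigma2 = 1
eigs = np.linalg.eigvalsh(G @ G.T / n)

gamma = p / n
edge_lo = sigma2 * (1 - np.sqrt(gamma)) ** 2   # lower Marchenko-Pastur edge
edge_hi = sigma2 * (1 + np.sqrt(gamma)) ** 2   # upper Marchenko-Pastur edge
```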


References

  1. Wishart, J. (1928). "The generalised product moment distribution in samples from a normal multivariate population". Biometrika. 20A (1–2): 32–52. doi:10.1093/biomet/20A.1-2.32. JSTOR 2331939.
  2. Livan, Giacomo; Novaes, Marcel; Vivo, Pierpaolo (2018). "Classical Ensembles: Wishart-Laguerre". Introduction to Random Matrices: Theory and Practice. SpringerBriefs in Mathematical Physics. Cham: Springer International Publishing. pp. 89–95. doi:10.1007/978-3-319-70885-0_13. ISBN 978-3-319-70885-0.
  3. Koop, Gary; Korobilis, Dimitris (2010). "Bayesian Multivariate Time Series Methods for Empirical Macroeconomics". Foundations and Trends in Econometrics. 3 (4): 267–358. doi:10.1561/0800000013.
  4. Gupta, A. K.; Nagar, D. K. (2000). Matrix Variate Distributions. Chapman & Hall/CRC. ISBN 1584880465.
  5. Gelman, Andrew (2003). Bayesian Data Analysis (2nd ed.). Boca Raton, Fla.: Chapman & Hall. p. 582. ISBN 158488388X.
  6. Zanella, A.; Chiani, M.; Win, M. Z. (April 2009). "On the marginal distribution of the eigenvalues of Wishart matrices". IEEE Transactions on Communications. 57 (4): 1050–1060. doi:10.1109/TCOMM.2009.04.070143. hdl:1721.1/66900.
  7. Livan, Giacomo; Vivo, Pierpaolo (2011). "Moments of Wishart-Laguerre and Jacobi ensembles of random matrices: application to the quantum transport problem in chaotic cavities". Acta Physica Polonica B. 42 (5): 1081. arXiv:1103.2638. doi:10.5506/APhysPolB.42.1081.
  8. Uhlig, H. (1994). "On Singular Wishart and Singular Multivariate Beta Distributions". The Annals of Statistics. 22: 395–405. doi:10.1214/aos/1176325375.
  9. Muirhead, Robb J. (2005). Aspects of Multivariate Statistical Theory (2nd ed.). Wiley Interscience. ISBN 0471769851.
  10. Anderson, T. W. (2003). An Introduction to Multivariate Statistical Analysis (3rd ed.). Hoboken, N.J.: Wiley Interscience. p. 259. ISBN 0-471-36091-0.
  11. Chiani, M. (2017). "On the probability that all eigenvalues of Gaussian, Wishart, and double Wishart random matrices lie within an interval". IEEE Transactions on Information Theory. 63 (7): 4521–4531. arXiv:1502.04189. doi:10.1109/TIT.2017.2694846.
  12. Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer.
  13. Hoff, Peter D. (2009). A First Course in Bayesian Statistical Methods. New York: Springer. pp. 109–111. ISBN 978-0-387-92299-7.
  14. Kilian, Lutz; Lütkepohl, Helmut (2017). "Bayesian VAR Analysis". Structural Vector Autoregressive Analysis. Cambridge University Press. pp. 140–170. doi:10.1017/9781108164818.006. ISBN 978-1-107-19657-5.
  15. Nguyen, Duy (15 August 2023). "An In Depth Introduction to Variational Bayes Note". SSRN 4541076.
  16. Mayerhofer, Eberhard (2019). "Reforming the Wishart characteristic function". arXiv:1901.09347 [math.PR].
  17. Rao, C. R. (1965). Linear Statistical Inference and its Applications. Wiley. p. 535.
  18. Seber, George A. F. (2004). Multivariate Observations. Wiley. ISBN 978-0471691211.
  19. Chatfield, C.; Collins, A. J. (1980). Introduction to Multivariate Analysis. London: Chapman and Hall. pp. 103–108. ISBN 0-412-16030-7.
  20. Anderson, T. W. (2003). An Introduction to Multivariate Statistical Analysis (3rd ed.). Hoboken, N.J.: Wiley Interscience. p. 257. ISBN 0-471-36091-0.
  21. Smith, W. B.; Hocking, R. R. (1972). "Algorithm AS 53: Wishart Variate Generator". Journal of the Royal Statistical Society, Series C. 21 (3): 341–345. JSTOR 2346290.
  22. Pearson, Karl; Jeffery, G. B.; Elderton, Ethel M. (December 1929). "On the Distribution of the First Product Moment-Coefficient, in Samples Drawn from an Indefinitely Large Normal Population". Biometrika. 21 (1/4): 164–201. doi:10.2307/2332556. JSTOR 2332556.
  23. Fischer, Adrian; Gaunt, Robert E.; Sarantsev, Andrey (2023). "The Variance-Gamma Distribution: A Review". arXiv:2303.05615 [math.ST].
  24. Craig, Cecil C. (1936). "On the Frequency Function of xy". Ann. Math. Statist. 7: 1–15. doi:10.1214/aoms/1177732541.
  25. Peddada, Shyamal Das; Richards, Donald St. P. (1991). "Proof of a Conjecture of M. L. Eaton on the Characteristic Function of the Wishart Distribution". Annals of Probability. 19 (2): 868–874. doi:10.1214/aop/1176990455.
  26. Gindikin, S. G. (1975). "Invariant generalized functions in homogeneous domains". Funct. Anal. Appl. 9 (1): 50–52. doi:10.1007/BF01078179.
  27. Muirhead, Robb J. (2005). Aspects of Multivariate Statistical Theory (2nd ed.). Wiley-Interscience. ISBN 978-0471769859.
  28. Dumitriu, Ioana; Edelman, Alan (2002). "Matrix models for beta ensembles". Journal of Mathematical Physics. 43 (11): 5830–5847. arXiv:math-ph/0206043. doi:10.1063/1.1507823.
  29. Forrester, Peter J. (2010). Log-Gases and Random Matrices. London Mathematical Society Monographs. Princeton University Press. ISBN 978-0-691-12829-0.
  30. Anderson, T. W. (2003). An Introduction to Multivariate Statistical Analysis (3rd ed.). Wiley-Interscience. p. 257. ISBN 0-471-36091-0.
  31. Edelman, Alan; Sutton, Brian D. (2005). "The beta-Jacobi matrix model, the CS decomposition, and generalized singular value problems". Foundations of Computational Mathematics. 5 (2): 173–202. doi:10.1007/s10208-004-0143-4.
  32. Tracy, Craig A.; Widom, Harold (1994). "Level spacing distributions and the Bessel kernel". Communications in Mathematical Physics. 161 (2): 289–309. arXiv:hep-th/9304063. doi:10.1007/BF02099779.
  33. Marčenko, V. A.; Pastur, L. A. (1967). "Distribution of eigenvalues for some sets of random matrices". Math. USSR-Sbornik. 1 (4): 457–483. doi:10.1070/SM1967v001n04ABEH001994.
  34. Dwyer, Paul S. (1967). "Some Applications of Matrix Derivatives in Multivariate Analysis". J. Amer. Statist. Assoc. 62 (318): 607–625. doi:10.1080/01621459.1967.10482934. JSTOR 2283988.

