Definite matrix

From Wikipedia, the free encyclopedia
Property of a mathematical matrix
Not to be confused with Positive matrix and Totally positive matrix.

In mathematics, a symmetric matrix $M$ with real entries is positive-definite if the real number $\mathbf{x}^\top M \mathbf{x}$ is positive for every nonzero real column vector $\mathbf{x},$ where $\mathbf{x}^\top$ is the row vector transpose of $\mathbf{x}.$[1] More generally, a Hermitian matrix (that is, a complex matrix equal to its conjugate transpose) is positive-definite if the real number $\mathbf{z}^* M \mathbf{z}$ is positive for every nonzero complex column vector $\mathbf{z},$ where $\mathbf{z}^*$ denotes the conjugate transpose of $\mathbf{z}.$

Positive semi-definite matrices are defined similarly, except that the scalars $\mathbf{x}^\top M \mathbf{x}$ and $\mathbf{z}^* M \mathbf{z}$ are required to be positive or zero (that is, nonnegative). Negative-definite and negative semi-definite matrices are defined analogously. A matrix that is neither positive semi-definite nor negative semi-definite is sometimes called indefinite.

Some authors use more general definitions of definiteness, permitting the matrices to be non-symmetric or non-Hermitian. The properties of these generalized definite matrices are explored in § Extension for non-Hermitian square matrices below, but are not the main focus of this article.

Ramifications


It follows from the above definitions that a matrix is positive-definite if and only if it is the matrix of a positive-definite quadratic form or Hermitian form. In other words, a matrix is positive-definite if and only if it defines an inner product.

Positive-definite and positive-semidefinite matrices can be characterized in many ways, which may explain the importance of the concept in various parts of mathematics. A matrix $M$ is positive-definite if and only if it satisfies any of the following equivalent conditions.

A matrix is positive semi-definite if it satisfies similar equivalent conditions where "positive" is replaced by "nonnegative", "invertible matrix" is replaced by "matrix", and the word "leading" is removed.

Positive-definite and positive-semidefinite real matrices are at the basis of convex optimization: given a function of several real variables that is twice differentiable, if its Hessian matrix (the matrix of its second partial derivatives) is positive-definite at a point $p,$ then the function is convex near $p$; conversely, if the function is convex near $p,$ then the Hessian matrix is positive-semidefinite at $p.$

The set of positive definite matrices is an open convex cone, while the set of positive semi-definite matrices is a closed convex cone.[2]

Definitions


In the following definitions, $\mathbf{x}^\top$ is the transpose of $\mathbf{x},$ $\mathbf{z}^*$ is the conjugate transpose of $\mathbf{z},$ and $\mathbf{0}$ denotes the $n$-dimensional zero vector.

Definitions for real matrices


An $n \times n$ symmetric real matrix $M$ is said to be positive-definite if $\mathbf{x}^\top M \mathbf{x} > 0$ for all non-zero $\mathbf{x}$ in $\mathbb{R}^n.$ Formally,

$$M \text{ positive-definite} \quad \iff \quad \mathbf{x}^\top M \mathbf{x} > 0 \text{ for all } \mathbf{x} \in \mathbb{R}^n \setminus \{\mathbf{0}\}$$

An $n \times n$ symmetric real matrix $M$ is said to be positive-semidefinite or non-negative-definite if $\mathbf{x}^\top M \mathbf{x} \geq 0$ for all $\mathbf{x}$ in $\mathbb{R}^n.$ Formally,

$$M \text{ positive semi-definite} \quad \iff \quad \mathbf{x}^\top M \mathbf{x} \geq 0 \text{ for all } \mathbf{x} \in \mathbb{R}^n$$

An $n \times n$ symmetric real matrix $M$ is said to be negative-definite if $\mathbf{x}^\top M \mathbf{x} < 0$ for all non-zero $\mathbf{x}$ in $\mathbb{R}^n.$ Formally,

$$M \text{ negative-definite} \quad \iff \quad \mathbf{x}^\top M \mathbf{x} < 0 \text{ for all } \mathbf{x} \in \mathbb{R}^n \setminus \{\mathbf{0}\}$$

An $n \times n$ symmetric real matrix $M$ is said to be negative-semidefinite or non-positive-definite if $\mathbf{x}^\top M \mathbf{x} \leq 0$ for all $\mathbf{x}$ in $\mathbb{R}^n.$ Formally,

$$M \text{ negative semi-definite} \quad \iff \quad \mathbf{x}^\top M \mathbf{x} \leq 0 \text{ for all } \mathbf{x} \in \mathbb{R}^n$$

An $n \times n$ symmetric real matrix which is neither positive semidefinite nor negative semidefinite is called indefinite.
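The real definitions above can be illustrated numerically. The following sketch (using NumPy; the example matrix is mine, not from the article) evaluates the quadratic form $\mathbf{x}^\top M \mathbf{x}$ for a few sample vectors. Note that sampling can only illustrate, not prove, definiteness.

```python
import numpy as np

# Evaluate the quadratic form x^T M x for a symmetric example matrix.
# M has eigenvalues 1 and 3, so it is positive-definite and every
# nonzero vector should give a positive value.
M = np.array([[2.0, -1.0], [-1.0, 2.0]])

def quad_form(M, x):
    """The quadratic form x^T M x from the definitions above."""
    return x @ M @ x

for x in [np.array([1.0, 0.0]), np.array([1.0, 1.0]), np.array([1.0, -1.0])]:
    print(x, quad_form(M, x))   # all printed values are positive
```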

Definitions for complex matrices


The following definitions all involve the term $\mathbf{z}^* M \mathbf{z}.$ Notice that this is always a real number for any Hermitian square matrix $M.$

An $n \times n$ Hermitian complex matrix $M$ is said to be positive-definite if $\mathbf{z}^* M \mathbf{z} > 0$ for all non-zero $\mathbf{z}$ in $\mathbb{C}^n.$ Formally,

$$M \text{ positive-definite} \quad \iff \quad \mathbf{z}^* M \mathbf{z} > 0 \text{ for all } \mathbf{z} \in \mathbb{C}^n \setminus \{\mathbf{0}\}$$

An $n \times n$ Hermitian complex matrix $M$ is said to be positive semi-definite or non-negative-definite if $\mathbf{z}^* M \mathbf{z} \geq 0$ for all $\mathbf{z}$ in $\mathbb{C}^n.$ Formally,

$$M \text{ positive semi-definite} \quad \iff \quad \mathbf{z}^* M \mathbf{z} \geq 0 \text{ for all } \mathbf{z} \in \mathbb{C}^n$$

An $n \times n$ Hermitian complex matrix $M$ is said to be negative-definite if $\mathbf{z}^* M \mathbf{z} < 0$ for all non-zero $\mathbf{z}$ in $\mathbb{C}^n.$ Formally,

$$M \text{ negative-definite} \quad \iff \quad \mathbf{z}^* M \mathbf{z} < 0 \text{ for all } \mathbf{z} \in \mathbb{C}^n \setminus \{\mathbf{0}\}$$

An $n \times n$ Hermitian complex matrix $M$ is said to be negative semi-definite or non-positive-definite if $\mathbf{z}^* M \mathbf{z} \leq 0$ for all $\mathbf{z}$ in $\mathbb{C}^n.$ Formally,

$$M \text{ negative semi-definite} \quad \iff \quad \mathbf{z}^* M \mathbf{z} \leq 0 \text{ for all } \mathbf{z} \in \mathbb{C}^n$$

An $n \times n$ Hermitian complex matrix which is neither positive semidefinite nor negative semidefinite is called indefinite.

Consistency between real and complex definitions


Since every real matrix is also a complex matrix, the definitions of "definiteness" for the two classes must agree.

For complex matrices, the most common definition says that $M$ is positive-definite if and only if $\mathbf{z}^* M \mathbf{z}$ is real and positive for every non-zero complex column vector $\mathbf{z}.$ This condition implies that $M$ is Hermitian (i.e. its transpose is equal to its conjugate): since $\mathbf{z}^* M \mathbf{z}$ is real, it equals its conjugate transpose $\mathbf{z}^* M^* \mathbf{z}$ for every $\mathbf{z},$ which implies $M = M^*.$

By this definition, a positive-definite real matrix $M$ is Hermitian, hence symmetric; and $\mathbf{z}^\top M \mathbf{z}$ is positive for all non-zero real column vectors $\mathbf{z}.$ However, the last condition alone is not sufficient for $M$ to be positive-definite. For example, if
$$M = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix},$$
then for any real vector $\mathbf{z}$ with entries $a$ and $b$ we have $\mathbf{z}^\top M \mathbf{z} = (a+b)a + (-a+b)b = a^2 + b^2,$ which is always positive if $\mathbf{z}$ is not zero. However, if $\mathbf{z}$ is the complex vector with entries $1$ and $i,$ one gets
$$\mathbf{z}^* M \mathbf{z} = \begin{bmatrix} 1 & -i \end{bmatrix} M \begin{bmatrix} 1 \\ i \end{bmatrix} = \begin{bmatrix} 1+i & 1-i \end{bmatrix} \begin{bmatrix} 1 \\ i \end{bmatrix} = 2 + 2i,$$
which is not real. Therefore, $M$ is not positive-definite.

On the other hand, for a symmetric real matrix $M,$ the condition "$\mathbf{z}^\top M \mathbf{z} > 0$ for all nonzero real vectors $\mathbf{z}$" does imply that $M$ is positive-definite in the complex sense.
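The counterexample above can be reproduced directly. This sketch (NumPy assumed) checks that the real quadratic form of the non-symmetric matrix $M$ is positive on random real vectors, yet the complex form $\mathbf{z}^* M \mathbf{z}$ fails to be real:

```python
import numpy as np

M = np.array([[1.0, 1.0], [-1.0, 1.0]])   # the non-symmetric example above

# The real quadratic form x^T M x equals a^2 + b^2, hence is positive
# for every nonzero real vector:
rng = np.random.default_rng(0)
for _ in range(1000):
    x = rng.standard_normal(2)
    assert x @ M @ x > 0

# ...but with the complex vector z = (1, i) the form is not even real:
z = np.array([1.0, 1.0j])
val = z.conj() @ M @ z
print(val)   # (2+2j), matching the hand computation in the text
```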

Notation


If a Hermitian matrix $M$ is positive semi-definite, one sometimes writes $M \succeq 0,$ and if $M$ is positive-definite one writes $M \succ 0.$ To denote that $M$ is negative semi-definite one writes $M \preceq 0,$ and to denote that $M$ is negative-definite one writes $M \prec 0.$

The notion comes from functional analysis, where positive semidefinite matrices define positive operators. If two matrices $A$ and $B$ satisfy $B - A \succeq 0,$ we can define a non-strict partial order $B \succeq A$ that is reflexive, antisymmetric, and transitive; it is not a total order, however, as $B - A$ may in general be indefinite.

A common alternative notation is $M \geq 0,$ $M > 0,$ $M \leq 0,$ and $M < 0$ for positive semi-definite, positive-definite, negative semi-definite, and negative-definite matrices, respectively. This may be confusing, as sometimes nonnegative matrices (respectively, nonpositive matrices) are also denoted in this way.

Examples


Eigenvalues


Let $M$ be an $n \times n$ Hermitian matrix (this includes real symmetric matrices). All eigenvalues of $M$ are real, and their signs characterize its definiteness:

  • $M$ is positive definite if and only if all of its eigenvalues are positive.
  • $M$ is positive semi-definite if and only if all of its eigenvalues are non-negative.
  • $M$ is negative definite if and only if all of its eigenvalues are negative.
  • $M$ is negative semi-definite if and only if all of its eigenvalues are non-positive.
  • $M$ is indefinite if and only if it has both positive and negative eigenvalues.
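The eigenvalue characterization above is also the most practical computational test. A minimal sketch (NumPy assumed; the tolerance and example matrices are my choices) classifying a Hermitian matrix by the signs of its eigenvalues:

```python
import numpy as np

def classify(M, tol=1e-12):
    """Classify a Hermitian/symmetric matrix by the sign pattern of
    its (real) eigenvalues, per the five bullet points above."""
    w = np.linalg.eigvalsh(M)          # real eigenvalues, ascending order
    if np.all(w > tol):
        return "positive-definite"
    if np.all(w >= -tol):
        return "positive semi-definite"
    if np.all(w < -tol):
        return "negative-definite"
    if np.all(w <= tol):
        return "negative semi-definite"
    return "indefinite"

print(classify(np.array([[2.0, 1.0], [1.0, 2.0]])))   # eigenvalues 1, 3
print(classify(np.array([[1.0, 0.0], [0.0, -1.0]])))  # eigenvalues -1, 1
```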

Let $PDP^{-1}$ be an eigendecomposition of $M,$ where $P$ is a unitary complex matrix whose columns comprise an orthonormal basis of eigenvectors of $M,$ and $D$ is a real diagonal matrix whose main diagonal contains the corresponding eigenvalues. The matrix $M$ may be regarded as a diagonal matrix $D$ that has been re-expressed in coordinates of the (eigenvectors) basis $P.$ Put differently, applying $M$ to some vector $\mathbf{z},$ giving $M\mathbf{z},$ is the same as changing the basis to the eigenvector coordinate system using $P^{-1},$ giving $P^{-1}\mathbf{z},$ applying the stretching transformation $D$ to the result, giving $DP^{-1}\mathbf{z},$ and then changing the basis back using $P,$ giving $PDP^{-1}\mathbf{z}.$

With this in mind, the one-to-one change of variable $\mathbf{y} = P^{-1}\mathbf{z}$ shows that $\mathbf{z}^* M \mathbf{z}$ is real and positive for any complex vector $\mathbf{z}$ if and only if $\mathbf{y}^* D \mathbf{y}$ is real and positive for any $\mathbf{y};$ in other words, if $D$ is positive definite. For a diagonal matrix, this is true only if each element of the main diagonal, that is, every eigenvalue of $M,$ is positive. Since the spectral theorem guarantees all eigenvalues of a Hermitian matrix to be real, the positivity of the eigenvalues can be checked using Descartes' rule of signs when the characteristic polynomial of a real, symmetric matrix $M$ is available.

Decomposition

See also: Gram matrix

Let $M$ be an $n \times n$ Hermitian matrix. $M$ is positive semidefinite if and only if it can be decomposed as a product $M = B^* B$ of a matrix $B$ with its conjugate transpose.

When $M$ is real, $B$ can be real as well and the decomposition can be written as $M = B^\top B.$

$M$ is positive definite if and only if such a decomposition exists with $B$ invertible. More generally, $M$ is positive semidefinite with rank $k$ if and only if a decomposition exists with a $k \times n$ matrix $B$ of full row rank (i.e. of rank $k$). Moreover, for any decomposition $M = B^* B,$ $\operatorname{rank}(M) = \operatorname{rank}(B).$[3]

Proof

If $M = B^* B,$ then $\mathbf{x}^* M \mathbf{x} = (\mathbf{x}^* B^*)(B\mathbf{x}) = \|B\mathbf{x}\|^2 \geq 0,$ so $M$ is positive semidefinite. If moreover $B$ is invertible then the inequality is strict for $\mathbf{x} \neq 0,$ so $M$ is positive definite. If $B$ is $k \times n$ of rank $k,$ then $\operatorname{rank}(M) = \operatorname{rank}(B^*) = k.$

In the other direction, suppose $M$ is positive semidefinite. Since $M$ is Hermitian, it has an eigendecomposition $M = Q^{-1} D Q$ where $Q$ is unitary and $D$ is a diagonal matrix whose entries are the eigenvalues of $M.$ Since $M$ is positive semidefinite, the eigenvalues are non-negative real numbers, so one can define $D^{\frac{1}{2}}$ as the diagonal matrix whose entries are the non-negative square roots of the eigenvalues. Then $M = Q^{-1} D Q = Q^* D Q = Q^* D^{\frac{1}{2}} D^{\frac{1}{2}} Q = Q^* D^{{\frac{1}{2}}*} D^{\frac{1}{2}} Q = B^* B$ for $B = D^{\frac{1}{2}} Q.$ If moreover $M$ is positive definite, then the eigenvalues are (strictly) positive, so $D^{\frac{1}{2}}$ is invertible, and hence $B = D^{\frac{1}{2}} Q$ is invertible as well. If $M$ has rank $k,$ then it has exactly $k$ positive eigenvalues and the others are zero, hence in $B = D^{\frac{1}{2}} Q$ all but $k$ rows are zero. Cutting the zero rows gives a $k \times n$ matrix $B'$ such that $B'^* B' = B^* B = M.$

The columns $b_1, \dots, b_n$ of $B$ can be seen as vectors in the complex or real vector space $\mathbb{C}^k$ or $\mathbb{R}^k,$ respectively. Then the entries of $M$ are inner products (that is, dot products in the real case) of these vectors: $M_{ij} = \langle b_i, b_j \rangle.$ In other words, a Hermitian matrix $M$ is positive semidefinite if and only if it is the Gram matrix of some vectors $b_1, \dots, b_n.$ It is positive definite if and only if it is the Gram matrix of some linearly independent vectors. In general, the rank of the Gram matrix of vectors $b_1, \dots, b_n$ equals the dimension of the space spanned by these vectors.[4]
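The Gram-matrix characterization is easy to exercise numerically. A sketch (NumPy assumed; the random factor $B$ is an arbitrary example) building $M = B^\top B$ and checking the rank and inner-product claims above:

```python
import numpy as np

# Any k x n matrix B yields a positive semidefinite Gram matrix
# M = B^T B with rank(M) = rank(B) and entries M[i, j] = <b_i, b_j>.
rng = np.random.default_rng(1)
B = rng.standard_normal((2, 4))        # k = 2 rows, n = 4 columns
M = B.T @ B                            # 4 x 4 Gram matrix of the columns of B

assert np.all(np.linalg.eigvalsh(M) >= -1e-8)                 # M is PSD
assert np.linalg.matrix_rank(M) == np.linalg.matrix_rank(B) == 2
assert np.isclose(M[0, 1], B[:, 0] @ B[:, 1])                 # <b_1, b_2>
```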

Uniqueness up to unitary transformations


The decomposition is not unique: if $M = B^* B$ for some $k \times n$ matrix $B$ and if $Q$ is any unitary $k \times k$ matrix (meaning $Q^* Q = Q Q^* = I$), then $M = B^* B = B^* Q^* Q B = A^* A$ for $A = Q B.$

However, this is the only way in which two decompositions can differ: the decomposition is unique up to unitary transformations. More formally, if $A$ is a $k \times n$ matrix and $B$ is an $\ell \times n$ matrix such that $A^* A = B^* B,$ then there is an $\ell \times k$ matrix $Q$ with orthonormal columns (meaning $Q^* Q = I_{k \times k}$) such that $B = Q A.$[5] When $\ell = k$ this means $Q$ is unitary.

This statement has an intuitive geometric interpretation in the real case: let the columns of $A$ and $B$ be the vectors $a_1, \dots, a_n$ and $b_1, \dots, b_n$ in $\mathbb{R}^k.$ A real unitary matrix is an orthogonal matrix, which describes a rigid transformation (an isometry of Euclidean space $\mathbb{R}^k$) preserving the $0$ point (i.e. rotations and reflections, without translations). Therefore, the dot products $a_i \cdot a_j$ and $b_i \cdot b_j$ are equal if and only if some rigid transformation of $\mathbb{R}^k$ transforms the vectors $a_1, \dots, a_n$ to $b_1, \dots, b_n$ (and $0$ to $0$).

Square root

Main article: Square root of a matrix

A Hermitian matrix $M$ is positive semidefinite if and only if there is a positive semidefinite matrix $B$ (in particular $B$ is Hermitian, so $B^* = B$) satisfying $M = B B.$ This matrix $B$ is unique,[6] is called the non-negative square root of $M,$ and is denoted with $B = M^{\frac{1}{2}}.$ When $M$ is positive definite, so is $M^{\frac{1}{2}},$ hence it is also called the positive square root of $M.$

The non-negative square root should not be confused with other decompositions $M = B^* B.$ Some authors use the name square root and $M^{\frac{1}{2}}$ for any such decomposition, or specifically for the Cholesky decomposition, or any decomposition of the form $M = B B;$ others only use it for the non-negative square root.

If $M \succ N \succ 0$ then $M^{\frac{1}{2}} \succ N^{\frac{1}{2}} \succ 0.$
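The eigendecomposition construction from the proof above gives a direct recipe for the non-negative square root. A sketch (NumPy assumed; the example matrix and round-off guard are my choices):

```python
import numpy as np

def psd_sqrt(M):
    """Non-negative square root of a PSD matrix:
    M = Q diag(w) Q^T  ->  M^(1/2) = Q diag(sqrt(w)) Q^T."""
    w, Q = np.linalg.eigh(M)
    w = np.clip(w, 0.0, None)          # guard against tiny negative round-off
    return Q @ np.diag(np.sqrt(w)) @ Q.T

M = np.array([[2.0, 1.0], [1.0, 2.0]])   # positive definite example
B = psd_sqrt(M)
assert np.allclose(B @ B, M)                    # B really is a square root
assert np.all(np.linalg.eigvalsh(B) >= 0)       # and is itself PSD
```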

Cholesky decomposition


A Hermitian positive semidefinite matrix $M$ can be written as $M = L L^*,$ where $L$ is lower triangular with non-negative diagonal (equivalently $M = B^* B$ where $B = L^*$ is upper triangular); this is the Cholesky decomposition. If $M$ is positive definite, then the diagonal of $L$ is positive and the Cholesky decomposition is unique. Conversely, if $L$ is lower triangular with nonnegative diagonal then $L L^*$ is positive semidefinite. The Cholesky decomposition is especially useful for efficient numerical calculations. A closely related decomposition is the LDL decomposition, $M = L D L^*,$ where $D$ is diagonal and $L$ is lower unitriangular.
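In NumPy the Cholesky factor is available directly; a minimal sketch on a positive definite example of mine (for a matrix that is not positive definite, `np.linalg.cholesky` raises `LinAlgError`, which itself is a common positive-definiteness test):

```python
import numpy as np

# Cholesky decomposition M = L L^T of a positive definite real matrix.
M = np.array([[4.0, 2.0], [2.0, 3.0]])
L = np.linalg.cholesky(M)              # lower triangular, positive diagonal

assert np.allclose(L @ L.T, M)
assert np.all(np.diag(L) > 0)
print(L)   # [[2, 0], [1, sqrt(2)]]
```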

Williamson theorem


Any $2n \times 2n$ positive definite real symmetric matrix $M$ can be diagonalized via symplectic (real) matrices. More precisely, Williamson's theorem ensures the existence of a symplectic $S \in \mathbf{Sp}(2n, \mathbb{R})$ and a diagonal real positive $D \in \mathbb{R}^{n \times n}$ such that $S M S^\top = D \oplus D.$

Other characterizations


Let $M$ be an $n \times n$ real symmetric matrix, and let $B_1(M) \equiv \{\mathbf{x} \in \mathbb{R}^n : \mathbf{x}^\top M \mathbf{x} \leq 1\}$ be the "unit ball" defined by $M.$ Then we have the following

Let $M$ be an $n \times n$ Hermitian matrix. The following properties are equivalent to $M$ being positive definite:

The associated sesquilinear form is an inner product
The sesquilinear form defined by $M$ is the function $\langle \cdot, \cdot \rangle$ from $\mathbb{C}^n \times \mathbb{C}^n$ to $\mathbb{C}$ such that $\langle \mathbf{x}, \mathbf{y} \rangle \equiv \mathbf{y}^* M \mathbf{x}$ for all $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{C}^n,$ where $\mathbf{y}^*$ is the conjugate transpose of $\mathbf{y}.$ For any complex matrix $M,$ this form is linear in $\mathbf{x}$ and semilinear in $\mathbf{y}.$ Therefore, the form is an inner product on $\mathbb{C}^n$ if and only if $\langle \mathbf{z}, \mathbf{z} \rangle$ is real and positive for all nonzero $\mathbf{z};$ that is, if and only if $M$ is positive definite. (In fact, every inner product on $\mathbb{C}^n$ arises in this fashion from a Hermitian positive definite matrix.)
Its leading principal minors are all positive
The $k$th leading principal minor of a matrix $M$ is the determinant of its upper-left $k \times k$ sub-matrix. It turns out that a matrix is positive definite if and only if all these determinants are positive. This condition is known as Sylvester's criterion, and provides an efficient test of positive definiteness of a symmetric real matrix. Namely, the matrix is reduced to an upper triangular matrix by using elementary row operations, as in the first part of the Gaussian elimination method, taking care to preserve the sign of its determinant during the pivoting process. Since the $k$th leading principal minor of a triangular matrix is the product of its diagonal elements up to row $k,$ Sylvester's criterion is equivalent to checking whether its diagonal elements are all positive. This condition can be checked each time a new row $k$ of the triangular matrix is obtained.
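Sylvester's criterion can be sketched directly with determinants of the leading principal submatrices (fine for illustration; the Gaussian-elimination formulation described above is preferred numerically). The example matrices are mine:

```python
import numpy as np

def sylvester_positive_definite(M):
    """Sylvester's criterion: all leading principal minors positive."""
    n = M.shape[0]
    return all(np.linalg.det(M[:k, :k]) > 0 for k in range(1, n + 1))

# Minors 2 and 3: positive definite.
assert sylvester_positive_definite(np.array([[2.0, 1.0], [1.0, 2.0]]))
# Minors 1 and -3: not positive definite (indefinite, in fact).
assert not sylvester_positive_definite(np.array([[1.0, 2.0], [2.0, 1.0]]))
```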

A positive semidefinite matrix is positive definite if and only if it is invertible.[7] A matrix $M$ is negative (semi)definite if and only if $-M$ is positive (semi)definite.

Quadratic forms

Main article: Definite quadratic form

The (purely) quadratic form associated with a real $n \times n$ matrix $M$ is the function $Q : \mathbb{R}^n \to \mathbb{R}$ such that $Q(\mathbf{x}) = \mathbf{x}^\top M \mathbf{x}$ for all $\mathbf{x}.$ $M$ can be assumed symmetric by replacing it with $\tfrac{1}{2}(M + M^\top),$ since any asymmetric part will be zeroed out in the double-sided product.

A symmetric matrix $M$ is positive definite if and only if its quadratic form is a strictly convex function.

More generally, any quadratic function from $\mathbb{R}^n$ to $\mathbb{R}$ can be written as $\mathbf{x}^\top M \mathbf{x} + \mathbf{b}^\top \mathbf{x} + c$ where $M$ is a symmetric $n \times n$ matrix, $\mathbf{b}$ is a real $n$-vector, and $c$ a real constant. In the $n = 1$ case this is a parabola, and just as in that case we have:

Theorem: This quadratic function is strictly convex, and hence has a unique finite global minimum, if and only if $M$ is positive definite.

Proof: If $M$ is positive definite, then the function is strictly convex. Its gradient, $2M\mathbf{x} + \mathbf{b},$ is zero at the unique point $-\tfrac{1}{2}M^{-1}\mathbf{b},$ which must be the global minimum since the function is strictly convex. If $M$ is not positive definite, then there exists some vector $\mathbf{v}$ such that $\mathbf{v}^\top M \mathbf{v} \leq 0,$ so the function $f(t) \equiv (t\mathbf{v})^\top M (t\mathbf{v}) + \mathbf{b}^\top (t\mathbf{v}) + c$ is a line or a downward parabola, thus not strictly convex and not having a global minimum.
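The minimizer in the proof can be verified numerically. A sketch (NumPy assumed; $M,$ $\mathbf{b},$ $c$ are arbitrary example data) checking that $-\tfrac{1}{2}M^{-1}\mathbf{b}$ beats random perturbations:

```python
import numpy as np

# f(x) = x^T M x + b^T x + c with M positive definite; the gradient
# 2 M x + b vanishes at the unique global minimizer x* = -(1/2) M^{-1} b.
M = np.array([[2.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -1.0])
c = 3.0

f = lambda x: x @ M @ x + b @ x + c
x_star = -0.5 * np.linalg.solve(M, b)

assert np.allclose(2 * M @ x_star + b, 0)   # gradient is zero at x*

rng = np.random.default_rng(2)
for _ in range(1000):
    x = x_star + rng.standard_normal(2)
    assert f(x) >= f(x_star) - 1e-9         # x* is a global minimum
```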

For this reason, positive definite matrices play an important role inoptimization problems.

Simultaneous diagonalization


One symmetric matrix and another matrix that is both symmetric and positive definite can be simultaneously diagonalized. This is so although the simultaneous diagonalization is not necessarily performed with a similarity transformation. This result does not extend to the case of three or more matrices. In this section we write for the real case; extension to the complex case is immediate.

Let $M$ be a symmetric and $N$ a symmetric and positive definite matrix. Write the generalized eigenvalue equation as $(M - \lambda N)\mathbf{x} = 0,$ where we impose that $\mathbf{x}$ be normalized, i.e. $\mathbf{x}^\top N \mathbf{x} = 1.$ Now we use Cholesky decomposition to write the inverse of $N$ as $Q^\top Q.$ Multiplying by $Q$ and letting $\mathbf{x} = Q^\top \mathbf{y},$ we get $Q(M - \lambda N)Q^\top \mathbf{y} = 0,$ which can be rewritten as $(Q M Q^\top)\mathbf{y} = \lambda \mathbf{y}$ where $\mathbf{y}^\top \mathbf{y} = 1.$ Manipulation now yields $M X = N X \Lambda$ where $X$ is a matrix having as columns the generalized eigenvectors and $\Lambda$ is a diagonal matrix of the generalized eigenvalues. Now premultiplication with $X^\top$ gives the final result: $X^\top M X = \Lambda$ and $X^\top N X = I,$ but note that this is no longer an orthogonal diagonalization with respect to the inner product where $\mathbf{y}^\top \mathbf{y} = 1.$ In fact, we diagonalized $M$ with respect to the inner product induced by $N.$[8]

Note that this result does not contradict what is said on simultaneous diagonalization in the article Diagonalizable matrix, which refers to simultaneous diagonalization by a similarity transformation. Our result here is more akin to a simultaneous diagonalization of two quadratic forms, and is useful for optimization of one form under conditions on the other.
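The construction above can be sketched with a Cholesky factor of $N$ (rather than of its inverse, an equivalent reduction; NumPy assumed, example matrices mine):

```python
import numpy as np

def simultaneous_diagonalize(M, N):
    """With N symmetric positive definite and M symmetric, return
    (lam, X) with X^T M X = diag(lam) and X^T N X = I, by reducing
    (M - lam N) x = 0 to a standard symmetric eigenproblem."""
    C = np.linalg.cholesky(N)                           # N = C C^T
    A = np.linalg.solve(C, np.linalg.solve(C, M.T).T)   # A = C^-1 M C^-T
    lam, Y = np.linalg.eigh(A)                          # standard eigenproblem
    X = np.linalg.solve(C.T, Y)                         # X = C^-T Y
    return lam, X

M = np.array([[1.0, 2.0], [2.0, -1.0]])   # symmetric (indefinite is fine)
N = np.array([[2.0, 0.0], [0.0, 1.0]])    # symmetric positive definite
lam, X = simultaneous_diagonalize(M, N)

assert np.allclose(X.T @ M @ X, np.diag(lam))   # M diagonalized
assert np.allclose(X.T @ N @ X, np.eye(2))      # N becomes the identity
```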

Properties


Induced partial ordering


For arbitrary square matrices $M,$ $N$ we write $M \geq N$ if $M - N \geq 0,$ i.e., $M - N$ is positive semi-definite. This defines a partial ordering on the set of all square matrices. One can similarly define a strict partial ordering $M > N.$ The ordering is called the Loewner order.

Inverse of positive definite matrix


Every positive definite matrix is invertible and its inverse is also positive definite.[9] If $M \geq N > 0$ then $N^{-1} \geq M^{-1} > 0.$[10] Moreover, by the min-max theorem, the $k$th largest eigenvalue of $M$ is greater than or equal to the $k$th largest eigenvalue of $N.$
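Both claims are easy to spot-check. A sketch (NumPy assumed; the matrices are example data satisfying $M \geq N > 0$):

```python
import numpy as np

# The inverse of a positive definite matrix is positive definite, and
# inversion reverses the Loewner order: M >= N > 0 implies N^-1 >= M^-1.
M = np.array([[4.0, 1.0], [1.0, 3.0]])
N = np.array([[2.0, 0.0], [0.0, 1.0]])

assert np.all(np.linalg.eigvalsh(M - N) >= 0)        # M >= N (M - N is PSD)
Minv, Ninv = np.linalg.inv(M), np.linalg.inv(N)
assert np.all(np.linalg.eigvalsh(Minv) > 0)          # M^-1 positive definite
assert np.all(np.linalg.eigvalsh(Ninv - Minv) > 0)   # N^-1 >= M^-1
```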

Scaling


If $M$ is positive definite and $r > 0$ is a real number, then $rM$ is positive definite.[11]

Addition


Multiplication


Trace


The diagonal entries $m_{ii}$ of a positive-semidefinite matrix are real and non-negative. As a consequence, the trace satisfies $\operatorname{tr}(M) \geq 0.$ Furthermore,[13] since every principal sub-matrix (in particular, 2-by-2) is positive semidefinite,
$$\left|m_{ij}\right| \leq \sqrt{m_{ii} m_{jj}} \quad \forall i, j$$
and thus, when $n \geq 1,$
$$\max_{i,j} \left|m_{ij}\right| \leq \max_i m_{ii}$$

An $n \times n$ Hermitian matrix $M$ is positive definite if it satisfies the following trace inequalities:[14]
$$\operatorname{tr}(M) > 0 \quad \text{and} \quad \frac{(\operatorname{tr}(M))^2}{\operatorname{tr}(M^2)} > n - 1.$$

Another important result is that for any $M$ and $N$ positive-semidefinite matrices, $\operatorname{tr}(MN) \geq 0.$ This follows by writing $\operatorname{tr}(MN) = \operatorname{tr}(M^{\frac{1}{2}} N M^{\frac{1}{2}}).$ The matrix $M^{\frac{1}{2}} N M^{\frac{1}{2}}$ is positive-semidefinite and thus has non-negative eigenvalues, whose sum, the trace, is therefore also non-negative.
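The trace inequality $\operatorname{tr}(MN) \geq 0$ can be spot-checked on random PSD matrices built as Gram matrices (a sketch with NumPy; the sampling is illustrative, not a proof):

```python
import numpy as np

# tr(MN) >= 0 for positive semidefinite M and N, checked on random
# Gram matrices (PSD by construction as A^T A).
rng = np.random.default_rng(3)
for _ in range(100):
    A = rng.standard_normal((3, 3))
    B = rng.standard_normal((3, 3))
    M, N = A.T @ A, B.T @ B
    assert np.trace(M @ N) >= -1e-10
```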

Hadamard product


If $M, N \geq 0,$ then although $MN$ is not necessarily positive semidefinite, the Hadamard product is: $M \circ N \geq 0$ (this result is often called the Schur product theorem).[15]

Regarding the Hadamard product of two positive semidefinite matrices $M = (m_{ij}) \geq 0,$ $N \geq 0,$ there are two notable inequalities:
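The Schur product theorem is easy to exercise numerically. A sketch (NumPy assumed; `*` is NumPy's entrywise product, `@` the ordinary matrix product, and the random factors are example data):

```python
import numpy as np

# Schur product theorem: the entrywise (Hadamard) product M * N of two
# PSD matrices is again PSD. (The ordinary product M @ N need not even
# be symmetric, let alone PSD.)
rng = np.random.default_rng(4)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))
M, N = A.T @ A, B.T @ B                 # PSD by construction
H = M * N                               # Hadamard (entrywise) product

assert np.allclose(H, H.T)                      # symmetric
assert np.all(np.linalg.eigvalsh(H) >= -1e-10)  # positive semidefinite
```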

Kronecker product


If $M, N \geq 0,$ then although $MN$ is not necessarily positive semidefinite, the Kronecker product satisfies $M \otimes N \geq 0.$

Frobenius product


If $M, N \geq 0,$ then although $MN$ is not necessarily positive semidefinite, the Frobenius inner product satisfies $M : N \geq 0$ (Lancaster–Tismenetsky, The Theory of Matrices, p. 218).

Convexity


The set of positive semidefinite symmetric matrices is convex. That is, if $M$ and $N$ are positive semidefinite, then for any $\alpha$ between $0$ and $1,$ $\alpha M + (1 - \alpha)N$ is also positive semidefinite. For any vector $\mathbf{x}$:
$$\mathbf{x}^\top \left(\alpha M + (1 - \alpha)N\right)\mathbf{x} = \alpha \mathbf{x}^\top M \mathbf{x} + (1 - \alpha)\mathbf{x}^\top N \mathbf{x} \geq 0.$$
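The convexity argument above translates directly into a numerical check (a sketch with NumPy; the random Gram matrices are example data):

```python
import numpy as np

# Convex combinations alpha*M + (1-alpha)*N of PSD matrices stay PSD.
rng = np.random.default_rng(5)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))
M, N = A.T @ A, B.T @ B                 # PSD by construction

for alpha in np.linspace(0.0, 1.0, 11):
    C = alpha * M + (1.0 - alpha) * N
    assert np.all(np.linalg.eigvalsh(C) >= -1e-10)
```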

This property guarantees thatsemidefinite programming problems converge to a globally optimal solution.

Relation with cosine


The positive-definiteness of a matrixA{\displaystyle A} expresses that the angleθ{\displaystyle \theta } between any vectorx{\displaystyle \mathbf {x} } and its imageAx{\displaystyle A\mathbf {x} } is alwaysπ/2<θ<+π/2:{\displaystyle -\pi /2<\theta <+\pi /2:}

cosθ=xAxxAx=x,AxxAx,θ=θ(x,Ax)(x,Ax)^{\displaystyle \cos \theta ={\frac {\mathbf {x} ^{\top }A\mathbf {x} }{\lVert \mathbf {x} \rVert \lVert A\mathbf {x} \rVert }}={\frac {\langle \mathbf {x} ,A\mathbf {x} \rangle }{\lVert \mathbf {x} \rVert \lVert A\mathbf {x} \rVert }},\theta =\theta (\mathbf {x} ,A\mathbf {x} )\equiv {\widehat {\left(\mathbf {x} ,A\mathbf {x} \right)}}\equiv } the angle betweenx{\displaystyle \mathbf {x} } andAx.{\displaystyle A\mathbf {x} .}

Further properties

  1. IfM{\displaystyle M} is a symmetricToeplitz matrix, i.e. the entriesmij{\displaystyle m_{ij}} are given as a function of their absolute index differences:mij=h(|ij|),{\displaystyle m_{ij}=h(|i-j|),} and thestrict inequalityj0|h(j)|<h(0){\textstyle \sum _{j\neq 0}\left|h(j)\right|<h(0)} holds, thenM{\displaystyle M} isstrictly positive definite.
  2. LetM>0{\displaystyle M>0} and letN{\displaystyle N} be Hermitian. IfMN+NM0{\displaystyle MN+NM\geq 0} (resp.,MN+NM>0{\displaystyle MN+NM>0}) thenN0{\displaystyle N\geq 0} (resp.,N>0{\displaystyle N>0}).[18]
  3. IfM>0{\displaystyle M>0} is real, then there is aδ>0{\displaystyle \delta >0} such thatM>δI,{\displaystyle M>\delta I,} whereI{\displaystyle I} is theidentity matrix.
  4. IfMk{\displaystyle M_{k}} denotes the leadingk×k{\displaystyle k\times k} principal submatrix,det(Mk)/det(Mk1){\displaystyle \det \left(M_{k}\right)/\det \left(M_{k-1}\right)} is thekth pivot duringLU decomposition.
  5. A Hermitian matrix is negative definite if itskth order leadingprincipal minor is negative whenk{\displaystyle k} is odd, and positive whenk{\displaystyle k} is even.
  6. IfM{\displaystyle M} is a real positive definite matrix, then there exists a positive real numberm{\displaystyle m} such that for every vectorv,{\displaystyle \mathbf {v} ,}vMvmv22.{\displaystyle \mathbf {v} ^{\top }M\mathbf {v} \geq m\|\mathbf {v} \|_{2}^{2}.}
  7. A Hermitian matrix is positive semidefinite if and only if all of its principal minors are nonnegative. It is, however, not enough to consider the leading principal minors only, as can be checked on the diagonal matrix with entries0 and−1 .
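The counterexample in the last property above is easy to verify numerically:

```python
# diag(0, -1) has nonnegative *leading* principal minors (both are 0),
# yet it is not positive semidefinite: the principal minor obtained from
# row/column 2 alone is -1, and the matrix has a negative eigenvalue.
import numpy as np

M = np.diag([0.0, -1.0])
leading = [np.linalg.det(M[:k, :k]) for k in (1, 2)]  # leading principal minors
assert all(m >= 0 for m in leading)                   # both equal 0
assert np.linalg.eigvalsh(M).min() < 0                # not PSD nonetheless
```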

Block matrices and submatrices


A positive2n×2n{\displaystyle 2n\times 2n} matrix may also be defined byblocks:M=[ABCD]{\displaystyle M={\begin{bmatrix}A&B\\C&D\end{bmatrix}}}

where each block isn×n.{\displaystyle n\times n.} By applying the positivity condition, it immediately follows thatA{\displaystyle A} andD{\displaystyle D} are Hermitian, andC=B.{\displaystyle C=B^{*}.}

We have thatzMz0{\displaystyle \mathbf {z} ^{*}M\mathbf {z} \geq 0} for all complexz,{\displaystyle \mathbf {z} ,} and in particular forz=[v,0].{\displaystyle \mathbf {z} =[\mathbf {v} ,0]^{\top }.} Then[v0][ABBD][v0]=vAv0.{\displaystyle {\begin{bmatrix}\mathbf {v} ^{*}&0\end{bmatrix}}{\begin{bmatrix}A&B\\B^{*}&D\end{bmatrix}}{\begin{bmatrix}\mathbf {v} \\0\end{bmatrix}}=\mathbf {v} ^{*}A\mathbf {v} \geq 0.}

A similar argument can be applied toD,{\displaystyle D,} and thus we conclude that bothA{\displaystyle A} andD{\displaystyle D} must be positive definite. The argument can be extended to show that anyprincipal submatrix ofM{\displaystyle M} is itself positive definite.
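A sketch of this fact, checking every principal submatrix of an assumed 4×4 positive definite matrix:

```python
# Every principal submatrix of a positive definite matrix is itself
# positive definite.
import numpy as np
from itertools import combinations

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
M = A @ A.T + 4 * np.eye(4)          # comfortably positive definite
for r in range(1, 5):
    for idx in combinations(range(4), r):
        sub = M[np.ix_(idx, idx)]    # principal submatrix on rows/cols idx
        assert np.all(np.linalg.eigvalsh(sub) > 0)
```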

Converse results can be proved with stronger conditions on the blocks, for instance, using theSchur complement.

Local extrema


A generalquadratic formf(x){\displaystyle f(\mathbf {x} )} onn{\displaystyle n} real variablesx1,,xn{\displaystyle x_{1},\ldots ,x_{n}} can always be written asxMx{\displaystyle \mathbf {x} ^{\top }M\mathbf {x} } wherex{\displaystyle \mathbf {x} } is the column vector with those variables, andM{\displaystyle M} is a symmetric real matrix. Therefore, the matrix being positive definite means thatf{\displaystyle f} has a unique minimum (zero) whenx{\displaystyle \mathbf {x} } is zero, and is strictly positive for any otherx.{\displaystyle \mathbf {x} .}

More generally, a twice-differentiable real functionf{\displaystyle f} onn{\displaystyle n} real variables has a local minimum at a point only if itsgradient is zero and itsHessian (the matrix of all second derivatives) is positive semi-definite there; if the gradient is zero and the Hessian is positive definite, the point is a strict local minimum. Similar statements can be made for negative definite and semi-definite matrices.
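A small worked example of this second-order test, for the assumed function f(x, y) = x² + 3y² − 2xy:

```python
# For f(x, y) = x^2 + 3y^2 - 2xy the gradient vanishes at (0, 0) and the
# (constant) Hessian is positive definite, so (0, 0) is a strict minimum.
import numpy as np

H = np.array([[2.0, -2.0],    # [d2f/dx2,  d2f/dxdy]
              [-2.0, 6.0]])   # [d2f/dydx, d2f/dy2 ]
assert np.all(np.linalg.eigvalsh(H) > 0)   # positive definite Hessian

f = lambda x, y: x**2 + 3 * y**2 - 2 * x * y
rng = np.random.default_rng(4)
pts = rng.standard_normal((100, 2))
assert all(f(x, y) > 0 for x, y in pts)    # f > 0 away from the minimum
```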

Covariance


Instatistics, thecovariance matrix of amultivariate probability distribution is always positive semi-definite; and it is positive definite unless one variable is an exact linear function of the others. Conversely, every positive semi-definite matrix is the covariance matrix of some multivariate distribution.
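Both halves of this statement can be illustrated with an empirical covariance matrix (sample data assumed):

```python
# An empirical covariance matrix is always PSD; with an exact linear
# dependence among the variables it becomes singular (PSD but not PD).
import numpy as np

rng = np.random.default_rng(5)
data = rng.standard_normal((3, 500))          # 3 variables, 500 observations
C = np.cov(data)
assert np.all(np.linalg.eigvalsh(C) >= -1e-10)

data[2] = 2 * data[0] - data[1]               # make variable 3 exactly linear
C = np.cov(data)                              # in the other two
assert np.isclose(np.linalg.eigvalsh(C).min(), 0, atol=1e-10)
```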

Extension for non-Hermitian square matrices


The definition of positive definite can be generalized by designating any complex matrixM{\displaystyle M} (e.g. real non-symmetric) as positive definite ifRe{zMz}>0{\displaystyle {\mathcal {R_{e}}}\left\{\mathbf {z} ^{*}M\mathbf {z} \right\}>0} for all non-zero complex vectorsz,{\displaystyle \mathbf {z} ,} whereRe{c}{\displaystyle {\mathcal {R_{e}}}\{c\}} denotes the real part of acomplex numberc.{\displaystyle c.}[19] Only the Hermitian part12(M+M){\textstyle {\frac {1}{2}}\left(M+M^{*}\right)} determines whether the matrix is positive definite, and is assessed in the narrower sense above. Similarly, ifx{\displaystyle \mathbf {x} } andM{\displaystyle M} are real, we havexMx>0{\displaystyle \mathbf {x} ^{\top }M\mathbf {x} >0} for all real nonzero vectorsx{\displaystyle \mathbf {x} } if and only if the symmetric part12(M+M){\textstyle {\frac {1}{2}}\left(M+M^{\top }\right)} is positive definite in the narrower sense. It is immediately clear thatxMx=ijxiMijxj{\textstyle \mathbf {x} ^{\top }M\mathbf {x} =\sum _{ij}x_{i}M_{ij}x_{j}}is insensitive to transposition ofM.{\displaystyle M.}

A non-symmetric real matrix with only positive eigenvalues may have a symmetric part with negative eigenvalues, in which case it will not be positive (semi)definite. For example, the matrixM=[4914]{\textstyle M=\left[{\begin{matrix}4&9\\1&4\end{matrix}}\right]} has positive eigenvalues 1 and 7, yetxMx=2{\displaystyle \mathbf {x} ^{\top }M\mathbf {x} =-2} with the choicex=[11]{\displaystyle \mathbf {x} =\left[{\begin{smallmatrix}-1\\1\end{smallmatrix}}\right]}.
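The worked example above can be verified directly:

```python
# M has positive eigenvalues 1 and 7, yet its symmetric part has a
# negative eigenvalue, so the quadratic form x^T M x can go negative.
import numpy as np

M = np.array([[4.0, 9.0], [1.0, 4.0]])
assert np.allclose(sorted(np.linalg.eigvals(M).real), [1.0, 7.0])

x = np.array([-1.0, 1.0])
assert x @ M @ x == -2.0                 # the quadratic form is negative

S = (M + M.T) / 2                        # symmetric part of M
assert np.linalg.eigvalsh(S).min() < 0   # which explains the sign above
```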

In summary, the distinguishing feature between the real and the complex case is that a bounded positive operator on a complex Hilbert space is necessarily Hermitian, or self-adjoint; the general claim can be argued using thepolarization identity. That is no longer true in the real case.

Applications


Heat conductivity matrix


Fourier's law of heat conduction, giving heat fluxq{\displaystyle \mathbf {q} } in terms of the temperature gradientg=T{\displaystyle \mathbf {g} =\nabla T} is written for anisotropic media asq=Kg,{\displaystyle \mathbf {q} =-K\mathbf {g} ,} in whichK{\displaystyle K} is thethermal conductivity matrix. The negative is inserted in Fourier's law to reflect the expectation that heat will always flow from hot to cold. In other words, since the temperature gradientg{\displaystyle \mathbf {g} } always points from cold to hot, the heat fluxq{\displaystyle \mathbf {q} } is expected to have a negative inner product withg{\displaystyle \mathbf {g} } so thatqg<0.{\displaystyle \mathbf {q} ^{\top }\mathbf {g} <0.} Substituting Fourier's law then gives this expectation asgKg>0,{\displaystyle \mathbf {g} ^{\top }K\mathbf {g} >0,} implying that the conductivity matrix should be positive definite. OrdinarilyK{\displaystyle K} should be symmetric, however it becomes nonsymmetric in the presence of a magnetic field as in athermal Hall effect.
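The sign argument can be sketched numerically for a hypothetical positive definite conductivity matrix:

```python
# Under Fourier's law q = -K g with positive definite K, the heat flux
# always has negative inner product with the temperature gradient,
# i.e. heat flows from hot to cold. Example K assumed.
import numpy as np

K = np.array([[2.0, 0.5], [0.5, 1.0]])       # PD conductivity matrix
assert np.all(np.linalg.eigvalsh(K) > 0)

rng = np.random.default_rng(6)
for _ in range(100):
    g = rng.standard_normal(2)               # temperature gradient
    q = -K @ g                               # Fourier's law
    assert q @ g < 0                         # flux opposes the gradient
```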

More generally in thermodynamics, the flow of heat and particles is a fully coupled system as described by theOnsager reciprocal relations, and the coupling matrix is required to be positive semi-definite (possibly non-symmetric) in order that entropy production be nonnegative.


References

  1. ^van den Bos, Adriaan (March 2007)."Appendix C: Positive semidefinite and positive definite matrices".Parameter Estimation for Scientists and Engineers (.pdf) (online ed.). John Wiley & Sons. pp. 259–263.doi:10.1002/9780470173862.ISBN 978-047-017386-2. Print ed.ISBN 9780470147818
  2. ^Boyd, Stephen; Vandenberghe, Lieven (8 March 2004).Convex Optimization. Cambridge University Press.doi:10.1017/cbo9780511804441.ISBN 978-0-521-83378-3.
  3. ^Horn & Johnson (2013), p. 440, Theorem 7.2.7
  4. ^Horn & Johnson (2013), p. 441, Theorem 7.2.10
  5. ^Horn & Johnson (2013), p. 452, Theorem 7.3.11
  6. ^Horn & Johnson (2013), p. 439, Theorem 7.2.6 withk=2{\displaystyle k=2}
  7. ^Horn & Johnson (2013), p. 431, Corollary 7.1.7
  8. ^Horn & Johnson (2013), p. 485, Theorem 7.6.1
  9. ^Horn & Johnson (2013), p. 438, Theorem 7.2.1
  10. ^Horn & Johnson (2013), p. 495, Corollary 7.7.4(a)
  11. ^abHorn & Johnson (2013), p. 430, Observation 7.1.3
  12. ^Horn & Johnson (2013), p. 431, Observation 7.1.8
  13. ^Horn & Johnson (2013), p. 430
  14. ^Wolkowicz, Henry; Styan, George P.H. (1980). "Bounds for Eigenvalues using Traces".Linear Algebra and Its Applications.29 (29). Elsevier:471–506.doi:10.1016/0024-3795(80)90258-X.
  15. ^Horn & Johnson (2013), p. 479, Theorem 7.5.3
  16. ^Horn & Johnson (2013), p. 509, Theorem 7.8.16
  17. ^Styan, G.P. (1973). "Hadamard products and multivariate statistical analysis".Linear Algebra and Its Applications.6:217–240.doi:10.1016/0024-3795(73)90023-2., Corollary 3.6, p. 227
  18. ^Bhatia, Rajendra (2007).Positive Definite Matrices. Princeton, New Jersey: Princeton University Press. p. 8.ISBN 978-0-691-12918-1.
  19. ^Weisstein, Eric W."Positive definite matrix".MathWorld. Wolfram Research. Retrieved26 July 2012.
