Projection (linear algebra)

"Orthogonal projection" redirects here. For the technical drawing concept, see Orthographic projection. For a concrete discussion of orthogonal projections in finite-dimensional linear spaces, see Vector projection.

In linear algebra and functional analysis, a projection is a linear transformation $P$ from a vector space to itself (an endomorphism) such that $P \circ P = P$. That is, whenever $P$ is applied twice to any vector, it gives the same result as if it were applied once (i.e. $P$ is idempotent). It leaves its image unchanged.[1] This definition of "projection" formalizes and generalizes the idea of graphical projection. One can also consider the effect of a projection on a geometrical object by examining the effect of the projection on points in the object.

The transformation P is the orthogonal projection onto the line m.

Definitions


A projection on a vector space $V$ is a linear operator $P\colon V \to V$ such that $P^2 = P$.

When $V$ has an inner product and is complete, i.e. when $V$ is a Hilbert space, the concept of orthogonality can be used. A projection $P$ on a Hilbert space $V$ is called an orthogonal projection if it satisfies $\langle P\mathbf{x}, \mathbf{y}\rangle = \langle \mathbf{x}, P\mathbf{y}\rangle$ for all $\mathbf{x}, \mathbf{y} \in V$. A projection on a Hilbert space that is not orthogonal is called an oblique projection.

Projection matrix


The eigenvalues of a projection matrix must be 0 or 1.

Examples


Orthogonal projection


For example, the function which maps the point $(x, y, z)$ in three-dimensional space $\mathbb{R}^3$ to the point $(x, y, 0)$ is an orthogonal projection onto the $xy$-plane. This function is represented by the matrix
$$P = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$

The action of this matrix on an arbitrary vector is
$$P \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x \\ y \\ 0 \end{bmatrix}.$$

To see that $P$ is indeed a projection, i.e., $P = P^2$, we compute
$$P^2 \begin{bmatrix} x \\ y \\ z \end{bmatrix} = P \begin{bmatrix} x \\ y \\ 0 \end{bmatrix} = \begin{bmatrix} x \\ y \\ 0 \end{bmatrix} = P \begin{bmatrix} x \\ y \\ z \end{bmatrix}.$$

Observing that $P^{\mathrm{T}} = P$ shows that the projection is an orthogonal projection.
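As a concrete check, the two defining properties can be verified numerically; the following NumPy sketch (illustrative, not part of the cited sources) confirms idempotence and symmetry for this matrix:

```python
# Minimal check that the xy-plane map is an orthogonal projection:
# P is idempotent (P @ P == P) and symmetric (P.T == P).
import numpy as np

P = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0]])

x = np.array([3.0, -2.0, 5.0])

print(P @ x)                   # [ 3. -2.  0.]  -> drops the z-component
print(np.allclose(P @ P, P))   # True: idempotent, so P is a projection
print(np.allclose(P.T, P))     # True: symmetric, so the projection is orthogonal
```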

Oblique projection


A simple example of a non-orthogonal (oblique) projection is
$$P = \begin{bmatrix} 0 & 0 \\ \alpha & 1 \end{bmatrix}.$$

Via matrix multiplication, one sees that
$$P^2 = \begin{bmatrix} 0 & 0 \\ \alpha & 1 \end{bmatrix} \begin{bmatrix} 0 & 0 \\ \alpha & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ \alpha & 1 \end{bmatrix} = P,$$
showing that $P$ is indeed a projection.

The projection $P$ is orthogonal if and only if $\alpha = 0$, because only then $P^{\mathrm{T}} = P$.
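A similar sketch (again purely illustrative) shows that this family of matrices is idempotent for every value of $\alpha$ but symmetric only for $\alpha = 0$:

```python
# For any alpha, P is idempotent; P.T == P only when alpha == 0.
import numpy as np

alpha = 0.5
P = np.array([[0.0,   0.0],
              [alpha, 1.0]])

print(np.allclose(P @ P, P))   # True for every alpha: P is a projection
print(np.allclose(P.T, P))     # False for alpha != 0: the projection is oblique
```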

Properties and classification

 
The transformation T is the projection along k onto m. The range of T is m and the kernel is k.

Idempotence


By definition, a projection $P$ is idempotent (i.e. $P^2 = P$).

Open map


Every projection is an open map onto its image, meaning that it maps each open set in the domain to an open set in the subspace topology of the image.[citation needed] That is, for any vector $\mathbf{x}$ and any ball $B_{\mathbf{x}}$ (with positive radius) centered on $\mathbf{x}$, there exists a ball $B_{P\mathbf{x}}$ (with positive radius) centered on $P\mathbf{x}$ that is wholly contained in the image $P(B_{\mathbf{x}})$.

Complementarity of image and kernel


Let $W$ be a finite-dimensional vector space and $P$ be a projection on $W$. Suppose the subspaces $U$ and $V$ are the image and kernel of $P$ respectively. Then $P$ has the following properties:

  1. $P$ is the identity operator $I$ on $U$: $\forall \mathbf{x} \in U: P\mathbf{x} = \mathbf{x}$.
  2. We have a direct sum $W = U \oplus V$. Every vector $\mathbf{x} \in W$ may be decomposed uniquely as $\mathbf{x} = \mathbf{u} + \mathbf{v}$ with $\mathbf{u} = P\mathbf{x}$ and $\mathbf{v} = \mathbf{x} - P\mathbf{x} = (I - P)\mathbf{x}$, where $\mathbf{u} \in U$ and $\mathbf{v} \in V$.

The image and kernel of a projection are complementary, as are $P$ and $Q = I - P$. The operator $Q$ is also a projection, as the image and kernel of $P$ become the kernel and image of $Q$ and vice versa. We say $P$ is a projection along $V$ onto $U$ (kernel/image) and $Q$ is a projection along $U$ onto $V$.
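The following NumPy sketch (an assumed, illustrative example) checks this complementarity for the oblique matrix from the earlier example: $Q = I - P$ is again a projection, and $P$ annihilates the range of $Q$:

```python
# Q = I - P is itself a projection; P maps the range of Q to zero
# because P(I - P) = P - P^2 = 0.
import numpy as np

alpha = 0.5
P = np.array([[0.0, 0.0], [alpha, 1.0]])
Q = np.eye(2) - P

print(np.allclose(Q @ Q, Q))   # True: Q is also a projection
print(P @ Q)                   # zero matrix: range of Q lies in the kernel of P
```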

Spectrum


In infinite-dimensional vector spaces, the spectrum of a projection is contained in $\{0, 1\}$, since for $\lambda \neq 0, 1$
$$(\lambda I - P)^{-1} = \frac{1}{\lambda} I + \frac{1}{\lambda(\lambda - 1)} P.$$
Only 0 or 1 can be an eigenvalue of a projection. This implies that an orthogonal projection $P$ is always a positive semi-definite matrix. In general, the corresponding eigenspaces are (respectively) the kernel and range of the projection. Decomposition of a vector space into direct sums is not unique. Therefore, given a subspace $V$, there may be many projections whose range (or kernel) is $V$.

If a projection is nontrivial it has minimal polynomial $x^2 - x = x(x - 1)$, which factors into distinct linear factors, and thus $P$ is diagonalizable.
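A quick numerical illustration (assumed example, not from the sources): the eigenvalues of the oblique projection above are exactly 0 and 1:

```python
# The eigenvalues of any projection matrix are 0 or 1
# (here for the oblique example with alpha = 0.5).
import numpy as np

P = np.array([[0.0, 0.0], [0.5, 1.0]])
print(np.sort(np.linalg.eigvals(P)))   # [0. 1.]
```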

Product of projections


The product of projections is not in general a projection, even if they are orthogonal. If two projections commute then their product is a projection, but the converse is false: the product of two non-commuting projections may be a projection.

If two orthogonal projections commute then their product is an orthogonal projection. If the product of two orthogonal projections is an orthogonal projection, then the two orthogonal projections commute (more generally: two self-adjoint endomorphisms commute if and only if their product is self-adjoint).
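As an illustration, the following sketch builds two commuting orthogonal projections, onto the xy- and xz-planes, and checks that their product is the orthogonal projection onto the x-axis (example matrices assumed):

```python
import numpy as np

P_xy = np.diag([1.0, 1.0, 0.0])   # orthogonal projection onto the xy-plane
P_xz = np.diag([1.0, 0.0, 1.0])   # orthogonal projection onto the xz-plane

prod = P_xy @ P_xz
print(np.allclose(prod, P_xz @ P_xy))   # True: they commute
print(np.allclose(prod @ prod, prod))   # True: the product is a projection
print(np.allclose(prod.T, prod))        # True: and it is orthogonal
```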

Orthogonal projections


When the vector space $W$ has an inner product and is complete (is a Hilbert space) the concept of orthogonality can be used. An orthogonal projection is a projection for which the range $U$ and the kernel $V$ are orthogonal subspaces. Thus, for every $\mathbf{x}$ and $\mathbf{y}$ in $W$, $\langle P\mathbf{x}, (\mathbf{y} - P\mathbf{y})\rangle = \langle (\mathbf{x} - P\mathbf{x}), P\mathbf{y}\rangle = 0$. Equivalently:
$$\langle \mathbf{x}, P\mathbf{y}\rangle = \langle P\mathbf{x}, P\mathbf{y}\rangle = \langle P\mathbf{x}, \mathbf{y}\rangle.$$

A projection is orthogonal if and only if it is self-adjoint. Using the self-adjoint and idempotent properties of $P$, for any $\mathbf{x}$ and $\mathbf{y}$ in $W$ we have $P\mathbf{x} \in U$, $\mathbf{y} - P\mathbf{y} \in V$, and
$$\langle P\mathbf{x}, \mathbf{y} - P\mathbf{y}\rangle = \langle \mathbf{x}, (P - P^2)\mathbf{y}\rangle = 0,$$
where $\langle \cdot, \cdot \rangle$ is the inner product associated with $W$. Therefore, $P$ and $I - P$ are orthogonal projections.[3] The other direction, namely that if $P$ is orthogonal then it is self-adjoint, follows from the implication from $\langle (\mathbf{x} - P\mathbf{x}), P\mathbf{y}\rangle = \langle P\mathbf{x}, (\mathbf{y} - P\mathbf{y})\rangle = 0$ to
$$\langle \mathbf{x}, P\mathbf{y}\rangle = \langle P\mathbf{x}, P\mathbf{y}\rangle = \langle P\mathbf{x}, \mathbf{y}\rangle = \langle \mathbf{x}, P^{*}\mathbf{y}\rangle$$
for every $\mathbf{x}$ and $\mathbf{y}$ in $W$; thus $P = P^{*}$.

The existence of an orthogonal projection onto a closed subspace follows from the Hilbert projection theorem.

Properties and special cases


An orthogonal projection is a bounded operator. This is because for every $\mathbf{v}$ in the vector space we have, by the Cauchy–Schwarz inequality:
$$\left\|P\mathbf{v}\right\|^2 = \langle P\mathbf{v}, P\mathbf{v}\rangle = \langle P\mathbf{v}, \mathbf{v}\rangle \leq \left\|P\mathbf{v}\right\| \cdot \left\|\mathbf{v}\right\|.$$
Thus $\left\|P\mathbf{v}\right\| \leq \left\|\mathbf{v}\right\|$.

For finite-dimensional complex or real vector spaces, the standard inner product can be substituted for $\langle \cdot, \cdot \rangle$.

Formulas

A simple case occurs when the orthogonal projection is onto a line. If $\mathbf{u}$ is a unit vector on the line, then the projection is given by the outer product
$$P_{\mathbf{u}} = \mathbf{u}\mathbf{u}^{\mathsf{T}}.$$
(If $\mathbf{u}$ is complex-valued, the transpose in the above equation is replaced by a Hermitian transpose.) This operator leaves $\mathbf{u}$ invariant, and it annihilates all vectors orthogonal to $\mathbf{u}$, proving that it is indeed the orthogonal projection onto the line containing $\mathbf{u}$.[4] A simple way to see this is to consider an arbitrary vector $\mathbf{x}$ as the sum of a component on the line (i.e. the projected vector we seek) and another perpendicular to it, $\mathbf{x} = \mathbf{x}_{\parallel} + \mathbf{x}_{\perp}$. Applying projection, we get
$$P_{\mathbf{u}}\mathbf{x} = \mathbf{u}\mathbf{u}^{\mathsf{T}}\mathbf{x}_{\parallel} + \mathbf{u}\mathbf{u}^{\mathsf{T}}\mathbf{x}_{\perp} = \mathbf{u}\left(\operatorname{sgn}\left(\mathbf{u}^{\mathsf{T}}\mathbf{x}_{\parallel}\right)\left\|\mathbf{x}_{\parallel}\right\|\right) + \mathbf{u} \cdot \mathbf{0} = \mathbf{x}_{\parallel}$$
by the properties of the dot product of parallel and perpendicular vectors.
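The following NumPy sketch (illustrative; the vectors are assumed) implements the rank-1 formula for the real case:

```python
# P_u = u u^T for a unit vector u (real case; for complex u the
# transpose would be a conjugate transpose).
import numpy as np

u = np.array([1.0, 2.0, 2.0])
u = u / np.linalg.norm(u)          # normalize so that ||u|| = 1

P_u = np.outer(u, u)               # u u^T

x = np.array([4.0, 0.0, 1.0])
print(P_u @ x)                     # component of x along u
print(np.allclose(P_u @ u, u))     # True: u itself is left invariant
print(np.allclose(P_u @ P_u, P_u)) # True: idempotent
```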

This formula can be generalized to orthogonal projections on a subspace of arbitrary dimension. Let $\mathbf{u}_1, \ldots, \mathbf{u}_k$ be an orthonormal basis of the subspace $U$, with the assumption that the integer $k \geq 1$, and let $A$ denote the $n \times k$ matrix whose columns are $\mathbf{u}_1, \ldots, \mathbf{u}_k$, i.e., $A = \begin{bmatrix} \mathbf{u}_1 & \cdots & \mathbf{u}_k \end{bmatrix}$. Then the projection is given by:[5]
$$P_A = AA^{\mathsf{T}},$$
which can be rewritten as
$$P_A = \sum_i \langle \mathbf{u}_i, \cdot \rangle \mathbf{u}_i.$$

The matrix $A^{\mathsf{T}}$ is the partial isometry that vanishes on the orthogonal complement of $U$, and $A$ is the isometry that embeds $U$ into the underlying vector space. The range of $P_A$ is therefore the final space of $A$. It is also clear that $AA^{\mathsf{T}}$ is the identity operator on $U$.

The orthonormality condition can also be dropped. If $\mathbf{u}_1, \ldots, \mathbf{u}_k$ is a (not necessarily orthonormal) basis with $k \geq 1$, and $A$ is the matrix with these vectors as columns, then the projection is:[6][7]
$$P_A = A\left(A^{\mathsf{T}}A\right)^{-1}A^{\mathsf{T}}.$$

The matrix $A$ still embeds $U$ into the underlying vector space but is no longer an isometry in general. The matrix $\left(A^{\mathsf{T}}A\right)^{-1}$ is a "normalizing factor" that recovers the norm. For example, the rank-1 operator $\mathbf{u}\mathbf{u}^{\mathsf{T}}$ is not a projection if $\left\|\mathbf{u}\right\| \neq 1$. After dividing by $\mathbf{u}^{\mathsf{T}}\mathbf{u} = \left\|\mathbf{u}\right\|^2$, we obtain the projection $\mathbf{u}\left(\mathbf{u}^{\mathsf{T}}\mathbf{u}\right)^{-1}\mathbf{u}^{\mathsf{T}}$ onto the subspace spanned by $\mathbf{u}$.
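A minimal sketch of the general formula, with an assumed non-orthonormal basis and using a linear solve in place of an explicit inverse:

```python
# P_A = A (A^T A)^{-1} A^T for a (not necessarily orthonormal) basis.
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [1.0, 0.0]])                 # columns span a 2D subspace of R^3

P_A = A @ np.linalg.solve(A.T @ A, A.T)    # A (A^T A)^{-1} A^T

print(np.allclose(P_A @ P_A, P_A))         # True: projection
print(np.allclose(P_A.T, P_A))             # True: orthogonal
print(np.allclose(P_A @ A, A))             # True: acts as identity on the subspace
```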

In the general case, we can have an arbitrary positive definite matrix $D$ defining an inner product $\langle x, y \rangle_D = y^{\dagger}Dx$, and the projection $P_A$ is given by $P_A x = \operatorname{argmin}_{y \in \operatorname{range}(A)} \left\|x - y\right\|_D^2$. Then
$$P_A = A\left(A^{\mathsf{T}}DA\right)^{-1}A^{\mathsf{T}}D.$$

When the range space of the projection is generated by a frame (i.e. the number of generators is greater than its dimension), the formula for the projection takes the form $P_A = AA^{+}$. Here $A^{+}$ stands for the Moore–Penrose pseudoinverse. This is just one of many ways to construct the projection operator.
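A short sketch (assumed generators) of the frame case, using NumPy's np.linalg.pinv for the Moore–Penrose pseudoinverse:

```python
# P_A = A A^+ with a frame: three generators of a 2D subspace of R^3.
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 0.0]])            # 3 generators, but rank(A) = 2

P_A = A @ np.linalg.pinv(A)

print(np.allclose(P_A @ P_A, P_A))                  # True: projection
print(np.allclose(P_A, np.diag([1.0, 1.0, 0.0])))   # True: projects onto the xy-plane
```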

If $\begin{bmatrix} A & B \end{bmatrix}$ is a non-singular matrix and $A^{\mathsf{T}}B = 0$ (i.e., $B$ is the null space matrix of $A$),[8] the following holds:
$$\begin{aligned}
I &= \begin{bmatrix} A & B \end{bmatrix} \begin{bmatrix} A & B \end{bmatrix}^{-1} \begin{bmatrix} A^{\mathsf{T}} \\ B^{\mathsf{T}} \end{bmatrix}^{-1} \begin{bmatrix} A^{\mathsf{T}} \\ B^{\mathsf{T}} \end{bmatrix} \\
&= \begin{bmatrix} A & B \end{bmatrix} \left( \begin{bmatrix} A^{\mathsf{T}} \\ B^{\mathsf{T}} \end{bmatrix} \begin{bmatrix} A & B \end{bmatrix} \right)^{-1} \begin{bmatrix} A^{\mathsf{T}} \\ B^{\mathsf{T}} \end{bmatrix} \\
&= \begin{bmatrix} A & B \end{bmatrix} \begin{bmatrix} A^{\mathsf{T}}A & O \\ O & B^{\mathsf{T}}B \end{bmatrix}^{-1} \begin{bmatrix} A^{\mathsf{T}} \\ B^{\mathsf{T}} \end{bmatrix} \\
&= A\left(A^{\mathsf{T}}A\right)^{-1}A^{\mathsf{T}} + B\left(B^{\mathsf{T}}B\right)^{-1}B^{\mathsf{T}}.
\end{aligned}$$

If the orthogonal condition is enhanced to $A^{\mathsf{T}}WB = A^{\mathsf{T}}W^{\mathsf{T}}B = 0$ with $W$ non-singular, the following holds:
$$I = \begin{bmatrix} A & B \end{bmatrix} \begin{bmatrix} \left(A^{\mathsf{T}}WA\right)^{-1}A^{\mathsf{T}} \\ \left(B^{\mathsf{T}}WB\right)^{-1}B^{\mathsf{T}} \end{bmatrix} W.$$

All these formulas also hold for complex inner product spaces, provided that the conjugate transpose is used instead of the transpose. Further details on sums of projectors can be found in Banerjee and Roy (2014).[9] Also see Banerjee (2004)[10] for application of sums of projectors in basic spherical trigonometry.

Oblique projections


The term oblique projections is sometimes used to refer to non-orthogonal projections. These projections are also used to represent spatial figures in two-dimensional drawings (see oblique projection), though not as frequently as orthogonal projections. Whereas calculating the fitted value of an ordinary least squares regression requires an orthogonal projection, calculating the fitted value of an instrumental variables regression requires an oblique projection.

A projection is defined by its kernel and the basis vectors used to characterize its range (which is a complement of the kernel). When these basis vectors are orthogonal to the kernel, the projection is an orthogonal projection. When these basis vectors are not orthogonal to the kernel, the projection is an oblique projection, or just a projection.

A matrix representation formula for a nonzero projection operator


Let $P\colon V \to V$ be a linear operator such that $P^2 = P$ and assume that $P$ is not the zero operator. Let the vectors $\mathbf{u}_1, \ldots, \mathbf{u}_k$ form a basis for the range of $P$, and assemble these vectors in the $n \times k$ matrix $A$. Then $k \geq 1$, for otherwise $k = 0$ and $P$ would be the zero operator. The range and the kernel are complementary spaces, so the kernel has dimension $n - k$. It follows that the orthogonal complement of the kernel has dimension $k$. Let $\mathbf{v}_1, \ldots, \mathbf{v}_k$ form a basis for the orthogonal complement of the kernel of the projection, and assemble these vectors in the matrix $B$. Then the projection $P$ (with the condition $k \geq 1$) is given by
$$P = A\left(B^{\mathsf{T}}A\right)^{-1}B^{\mathsf{T}}.$$

This expression generalizes the formula for orthogonal projections given above.[11][12] A standard proof of this expression is the following. For any vector $\mathbf{x}$ in the vector space $V$, we can decompose $\mathbf{x} = \mathbf{x}_1 + \mathbf{x}_2$, where the vector $\mathbf{x}_1 = P(\mathbf{x})$ is in the image of $P$, and the vector $\mathbf{x}_2 = \mathbf{x} - P(\mathbf{x})$. So $P(\mathbf{x}_2) = P(\mathbf{x}) - P^2(\mathbf{x}) = \mathbf{0}$, and hence $\mathbf{x}_2$ is in the kernel of $P$. The vector $\mathbf{x}_1$ is in the column space of $A$, so $\mathbf{x}_1 = A\mathbf{w}$ for some $k$-dimensional vector $\mathbf{w}$, and the vector $\mathbf{x}_2$ satisfies $B^{\mathsf{T}}\mathbf{x}_2 = \mathbf{0}$ by the construction of $B$. Putting these conditions together, we find a vector $\mathbf{w}$ so that $B^{\mathsf{T}}(\mathbf{x} - A\mathbf{w}) = \mathbf{0}$. Since the matrices $A$ and $B$ are of full rank $k$ by their construction, the $k \times k$ matrix $B^{\mathsf{T}}A$ is invertible. So the equation $B^{\mathsf{T}}(\mathbf{x} - A\mathbf{w}) = \mathbf{0}$ gives the vector $\mathbf{w} = (B^{\mathsf{T}}A)^{-1}B^{\mathsf{T}}\mathbf{x}$. In this way, $P\mathbf{x} = \mathbf{x}_1 = A\mathbf{w} = A(B^{\mathsf{T}}A)^{-1}B^{\mathsf{T}}\mathbf{x}$ for any vector $\mathbf{x} \in V$, and hence $P = A(B^{\mathsf{T}}A)^{-1}B^{\mathsf{T}}$.

In the case that $P$ is an orthogonal projection, we can take $A = B$, and it follows that $P = A\left(A^{\mathsf{T}}A\right)^{-1}A^{\mathsf{T}}$. By using this formula, one can easily check that $P = P^{\mathsf{T}}$. In general, if the vector space is over the complex number field, one uses the Hermitian transpose $A^{*}$ and has the formula $P = A\left(A^{*}A\right)^{-1}A^{*}$. Recall that one can express the Moore–Penrose inverse of the matrix $A$ by $A^{+} = (A^{*}A)^{-1}A^{*}$ since $A$ has full column rank, so $P = AA^{+}$.
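A minimal numerical sketch of this representation, with assumed one-column matrices $A$ (spanning the range) and $B$ (spanning the orthogonal complement of the kernel):

```python
# P = A (B^T A)^{-1} B^T: range = span{(1, 1)}, kernel = the y-axis
# (the orthogonal complement of B = span{(1, 0)}).
import numpy as np

A = np.array([[1.0], [1.0]])       # basis of the desired range
B = np.array([[1.0], [0.0]])       # basis of (kernel)^perp

P = A @ np.linalg.solve(B.T @ A, B.T)

print(P)                           # [[1. 0.], [1. 0.]]
print(np.allclose(P @ P, P))       # True: projection
print(P @ np.array([0.0, 1.0]))    # [0. 0.]: the y-axis is the kernel
```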

Singular values


$I - P$ is also an oblique projection. The singular values of $P$ and $I - P$ can be computed from an orthonormal basis of the range of $A$. Let $Q_A$ be an orthonormal basis of the range of $A$ and let $Q_A^{\perp}$ be an orthonormal basis of its orthogonal complement. Denote the singular values of the matrix $Q_A^{\mathsf{T}} A (B^{\mathsf{T}}A)^{-1} B^{\mathsf{T}} Q_A^{\perp}$ by the positive values $\gamma_1 \geq \gamma_2 \geq \ldots \geq \gamma_k$. With this, the singular values for $P$ are:[13]
$$\sigma_i = \begin{cases} \sqrt{1 + \gamma_i^2} & 1 \leq i \leq k \\ 0 & \text{otherwise} \end{cases}$$
and the singular values for $I - P$ are
$$\sigma_i = \begin{cases} \sqrt{1 + \gamma_i^2} & 1 \leq i \leq k \\ 1 & k + 1 \leq i \leq n - k \\ 0 & \text{otherwise.} \end{cases}$$
This implies that the largest singular values of $P$ and $I - P$ are equal, and thus that the matrix norms of the two oblique projections are the same. However, the condition numbers satisfy the relation $\kappa(I - P) = \frac{\sigma_1}{1} \geq \frac{\sigma_1}{\sigma_k} = \kappa(P)$, and are therefore not necessarily equal.
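The equality of the largest singular values can be illustrated numerically (using the oblique projection constructed in the previous sketch; purely illustrative):

```python
# P and I - P share their largest singular value, hence the same matrix norm.
import numpy as np

P = np.array([[1.0, 0.0],
              [1.0, 0.0]])                    # oblique projection from the previous sketch
s_P  = np.linalg.svd(P, compute_uv=False)
s_IP = np.linalg.svd(np.eye(2) - P, compute_uv=False)

print(s_P)                           # [1.414... 0.]
print(s_IP)                          # [1.414... 0.]
print(np.isclose(s_P[0], s_IP[0]))   # True: equal spectral norms
```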

Finding projection with an inner product


Let $V$ be a vector space (in this case a plane) spanned by orthogonal vectors $\mathbf{u}_1, \mathbf{u}_2, \dots, \mathbf{u}_p$. Let $\mathbf{y}$ be a vector. One can define a projection of $\mathbf{y}$ onto $V$ as
$$\operatorname{proj}_V \mathbf{y} = \frac{\mathbf{y} \cdot \mathbf{u}^i}{\mathbf{u}^i \cdot \mathbf{u}^i} \mathbf{u}^i,$$
where repeated indices are summed over (Einstein summation notation). The vector $\mathbf{y}$ can be written as an orthogonal sum such that $\mathbf{y} = \operatorname{proj}_V \mathbf{y} + \mathbf{z}$. Here $\operatorname{proj}_V \mathbf{y}$ is sometimes denoted $\hat{\mathbf{y}}$. There is a theorem in linear algebra stating that this $\mathbf{z}$ is orthogonal to $V$ and that its norm is the smallest distance (the orthogonal distance) from $\mathbf{y}$ to $V$; this fact is commonly used in areas such as machine learning.

 
y is being projected onto the vector space V.
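A short sketch of this construction with assumed orthogonal (but not unit) spanning vectors:

```python
# proj_V y = sum_i ((y . u_i) / (u_i . u_i)) u_i for orthogonal u_i spanning V.
import numpy as np

u1 = np.array([1.0, 1.0, 0.0])
u2 = np.array([1.0, -1.0, 0.0])     # orthogonal to u1; together they span the xy-plane
y  = np.array([2.0, 3.0, 4.0])

proj = sum((y @ u) / (u @ u) * u for u in (u1, u2))
z = y - proj                        # the orthogonal residual

print(proj)                         # [2. 3. 0.]
print(z @ u1, z @ u2)               # 0.0 0.0: z is orthogonal to V
```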

Canonical forms


Any projection $P = P^2$ on a vector space of dimension $d$ over a field is a diagonalizable matrix, since its minimal polynomial divides $x^2 - x$, which splits into distinct linear factors. Thus there exists a basis in which $P$ has the form

$$P = I_r \oplus 0_{d-r},$$

where $r$ is the rank of $P$. Here $I_r$ is the identity matrix of size $r$, $0_{d-r}$ is the zero matrix of size $d - r$, and $\oplus$ is the direct sum operator. If the vector space is complex and equipped with an inner product, then there is an orthonormal basis in which the matrix of $P$ is[14]

$$P = \begin{bmatrix} 1 & \sigma_1 \\ 0 & 0 \end{bmatrix} \oplus \cdots \oplus \begin{bmatrix} 1 & \sigma_k \\ 0 & 0 \end{bmatrix} \oplus I_m \oplus 0_s,$$

where $\sigma_1 \geq \sigma_2 \geq \dots \geq \sigma_k > 0$. The integers $k, s, m$ and the real numbers $\sigma_i$ are uniquely determined, and $2k + s + m = d$. The factor $I_m \oplus 0_s$ corresponds to the maximal invariant subspace on which $P$ acts as an orthogonal projection (so that $P$ itself is orthogonal if and only if $k = 0$) and the $\sigma_i$-blocks correspond to the oblique components.

Projections on normed vector spaces


When the underlying vector space $X$ is a (not necessarily finite-dimensional) normed vector space, analytic questions, irrelevant in the finite-dimensional case, need to be considered. Assume now $X$ is a Banach space.

Many of the algebraic results discussed above survive the passage to this context. A given direct sum decomposition of $X$ into complementary subspaces still specifies a projection, and vice versa. If $X$ is the direct sum $X = U \oplus V$, then the operator defined by $P(u + v) = u$ is still a projection with range $U$ and kernel $V$. It is also clear that $P^2 = P$. Conversely, if $P$ is a projection on $X$, i.e. $P^2 = P$, then it is easily verified that $(1 - P)^2 = (1 - P)$. In other words, $1 - P$ is also a projection. The relation $P^2 = P$ implies $1 = P + (1 - P)$, and $X$ is the direct sum $\operatorname{rg}(P) \oplus \operatorname{rg}(1 - P)$.

However, in contrast to the finite-dimensional case, projections need not be continuous in general. If a subspace $U$ of $X$ is not closed in the norm topology, then the projection onto $U$ is not continuous. In other words, the range of a continuous projection $P$ must be a closed subspace. Furthermore, the kernel of a continuous projection (in fact, a continuous linear operator in general) is closed. Thus a continuous projection $P$ gives a decomposition of $X$ into two complementary closed subspaces: $X = \operatorname{rg}(P) \oplus \ker(P) = \ker(1 - P) \oplus \ker(P)$.

The converse holds also, with an additional assumption. Suppose $U$ is a closed subspace of $X$. If there exists a closed subspace $V$ such that $X = U \oplus V$, then the projection $P$ with range $U$ and kernel $V$ is continuous. This follows from the closed graph theorem. Suppose $x_n \to x$ and $Px_n \to y$. One needs to show that $Px = y$. Since $U$ is closed and $\{Px_n\} \subset U$, $y$ lies in $U$, i.e. $Py = y$. Also, $x_n - Px_n = (I - P)x_n \to x - y$. Because $V$ is closed and $\{(I - P)x_n\} \subset V$, we have $x - y \in V$, i.e. $P(x - y) = Px - Py = Px - y = 0$, which proves the claim.

The above argument makes use of the assumption that both $U$ and $V$ are closed. In general, given a closed subspace $U$, there need not exist a complementary closed subspace $V$, although for Hilbert spaces this can always be done by taking the orthogonal complement. For Banach spaces, a one-dimensional subspace always has a closed complementary subspace. This is an immediate consequence of the Hahn–Banach theorem. Let $U$ be the linear span of $u$. By Hahn–Banach, there exists a bounded linear functional $\varphi$ such that $\varphi(u) = 1$. The operator $P(x) = \varphi(x)u$ satisfies $P^2 = P$, i.e. it is a projection. Boundedness of $\varphi$ implies continuity of $P$, and therefore $\ker(P) = \operatorname{rg}(I - P)$ is a closed complementary subspace of $U$.

Applications and further considerations


Projections (orthogonal and otherwise) play a major role in algorithms for certain linear algebra problems, such as QR decomposition, singular value decomposition, and linear least squares.

As stated above, projections are a special case of idempotents. Analytically, orthogonal projections are non-commutative generalizations of characteristic functions. Idempotents are used in classifying, for instance, semisimple algebras, while measure theory begins with considering characteristic functions of measurable sets. Therefore, as one can imagine, projections are very often encountered in the context of operator algebras. In particular, a von Neumann algebra is generated by its complete lattice of projections.

Generalizations


More generally, given a map between normed vector spaces $T\colon V \to W$, one can analogously ask for this map to be an isometry on the orthogonal complement of the kernel: that $(\ker T)^{\perp} \to W$ be an isometry (compare partial isometry); in particular it must be onto. The case of an orthogonal projection is when $W$ is a subspace of $V$. In Riemannian geometry, this is used in the definition of a Riemannian submersion.


Notes

  1. Meyer, pp. 386–387.
  2. Horn, Roger A.; Johnson, Charles R. (2013). Matrix Analysis (2nd ed.). Cambridge University Press. ISBN 9780521839402.
  3. Meyer, p. 433.
  4. Meyer, p. 431.
  5. Meyer, equation (5.13.4).
  6. Banerjee, Sudipto; Roy, Anindya (2014), Linear Algebra and Matrix Analysis for Statistics, Texts in Statistical Science (1st ed.), Chapman and Hall/CRC, ISBN 978-1420095388.
  7. Meyer, equation (5.13.3).
  8. See also Linear least squares (mathematics) § Properties of the least-squares estimators.
  9. Banerjee, Sudipto; Roy, Anindya (2014), Linear Algebra and Matrix Analysis for Statistics, Texts in Statistical Science (1st ed.), Chapman and Hall/CRC, ISBN 978-1420095388.
  10. Banerjee, Sudipto (2004), "Revisiting Spherical Trigonometry with Orthogonal Projectors", The College Mathematics Journal, 35 (5): 375–381, doi:10.1080/07468342.2004.11922099, S2CID 122277398.
  11. Banerjee, Sudipto; Roy, Anindya (2014), Linear Algebra and Matrix Analysis for Statistics, Texts in Statistical Science (1st ed.), Chapman and Hall/CRC, ISBN 978-1420095388.
  12. Meyer, equation (7.10.39).
  13. Brust, J. J.; Marcia, R. F.; Petra, C. G. (2020), "Computationally Efficient Decompositions of Oblique Projection Matrices", SIAM Journal on Matrix Analysis and Applications, 41 (2): 852–870, doi:10.1137/19M1288115, OSTI 1680061, S2CID 219921214.
  14. Doković, D. Ž. (August 1991). "Unitary similarity of projectors". Aequationes Mathematicae. 42 (1): 220–224. doi:10.1007/BF01818492. S2CID 122704926.

References

  • Banerjee, Sudipto; Roy, Anindya (2014), Linear Algebra and Matrix Analysis for Statistics, Texts in Statistical Science (1st ed.), Chapman and Hall/CRC, ISBN 978-1420095388.
  • Dunford, N.; Schwartz, J. T. (1958). Linear Operators, Part I: General Theory. Interscience.
  • Meyer, Carl D. (2000). Matrix Analysis and Applied Linear Algebra. Society for Industrial and Applied Mathematics. ISBN 978-0-89871-454-8.
  • Brezinski, Claude (1997). Projection Methods for Systems of Equations. North-Holland. ISBN 0-444-82777-3.
