Projection (linear algebra)

"Orthogonal projection" redirects here. For the technical drawing concept, see Orthographic projection. For a concrete discussion of orthogonal projections in finite-dimensional linear spaces, see Vector projection.

In linear algebra and functional analysis, a projection is a linear transformation $P$ from a vector space to itself (an endomorphism) such that $P \circ P = P$. That is, whenever $P$ is applied twice to any vector, it gives the same result as if it were applied once (i.e. $P$ is idempotent). It leaves its image unchanged.[1] This definition of "projection" formalizes and generalizes the idea of graphical projection. One can also consider the effect of a projection on a geometrical object by examining the effect of the projection on points in the object.

The transformation P is the orthogonal projection onto the line m.

Definitions


A projection on a vector space $V$ is a linear operator $P\colon V \to V$ such that $P^2 = P$.

When $V$ has an inner product and is complete, i.e. when $V$ is a Hilbert space, the concept of orthogonality can be used. A projection $P$ on a Hilbert space $V$ is called an orthogonal projection if it satisfies $\langle P\mathbf{x}, \mathbf{y}\rangle = \langle \mathbf{x}, P\mathbf{y}\rangle$ for all $\mathbf{x}, \mathbf{y} \in V$. A projection on a Hilbert space that is not orthogonal is called an oblique projection.

Projection matrix


The eigenvalues of a projection matrix must be 0 or 1.

Examples


Orthogonal projection


For example, the function which maps the point $(x, y, z)$ in three-dimensional space $\mathbb{R}^3$ to the point $(x, y, 0)$ is an orthogonal projection onto the $xy$-plane. This function is represented by the matrix
$$P = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$

The action of this matrix on an arbitrary vector is
$$P \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x \\ y \\ 0 \end{bmatrix}.$$

To see that $P$ is indeed a projection, i.e., $P = P^2$, we compute
$$P^2 \begin{bmatrix} x \\ y \\ z \end{bmatrix} = P \begin{bmatrix} x \\ y \\ 0 \end{bmatrix} = \begin{bmatrix} x \\ y \\ 0 \end{bmatrix} = P \begin{bmatrix} x \\ y \\ z \end{bmatrix}.$$

Observing that $P^{\mathrm{T}} = P$ shows that the projection is an orthogonal projection.
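As a concrete check, the two defining properties can be verified numerically; the following NumPy sketch (illustrative, not part of the cited sources) confirms idempotence and symmetry for this matrix:

```python
# Minimal check that the xy-plane map is an orthogonal projection:
# P is idempotent (P @ P == P) and symmetric (P.T == P).
import numpy as np

P = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0]])

x = np.array([3.0, -2.0, 5.0])

print(P @ x)                   # [ 3. -2.  0.]  -> drops the z-component
print(np.allclose(P @ P, P))   # True: idempotent, so P is a projection
print(np.allclose(P.T, P))     # True: symmetric, so the projection is orthogonal
```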

Oblique projection


A simple example of a non-orthogonal (oblique) projection is
$$P = \begin{bmatrix} 0 & 0 \\ \alpha & 1 \end{bmatrix}.$$

Via matrix multiplication, one sees that
$$P^2 = \begin{bmatrix} 0 & 0 \\ \alpha & 1 \end{bmatrix} \begin{bmatrix} 0 & 0 \\ \alpha & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ \alpha & 1 \end{bmatrix} = P,$$
showing that $P$ is indeed a projection.

The projection $P$ is orthogonal if and only if $\alpha = 0$, because only then $P^{\mathrm{T}} = P$.
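A similar sketch (again purely illustrative) shows that this family of matrices is idempotent for every value of $\alpha$ but symmetric only for $\alpha = 0$:

```python
# For any alpha, P is idempotent; P.T == P only when alpha == 0.
import numpy as np

alpha = 0.5
P = np.array([[0.0,   0.0],
              [alpha, 1.0]])

print(np.allclose(P @ P, P))   # True for every alpha: P is a projection
print(np.allclose(P.T, P))     # False for alpha != 0: the projection is oblique
```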

Properties and classification

 
The transformation T is the projection along k onto m. The range of T is m and the kernel is k.

Idempotence


By definition, a projection $P$ is idempotent (i.e. $P^2 = P$).

Open map


Every projection is an open map onto its image, meaning that it maps each open set in the domain to an open set in the subspace topology of the image.[citation needed] That is, for any vector $\mathbf{x}$ and any ball $B_{\mathbf{x}}$ (with positive radius) centered on $\mathbf{x}$, there exists a ball $B_{P\mathbf{x}}$ (with positive radius) centered on $P\mathbf{x}$ that is wholly contained in the image $P(B_{\mathbf{x}})$.

Complementarity of image and kernel


Let $W$ be a finite-dimensional vector space and $P$ be a projection on $W$. Suppose the subspaces $U$ and $V$ are the image and kernel of $P$ respectively. Then $P$ has the following properties:

  1. $P$ is the identity operator $I$ on $U$: $\forall \mathbf{x} \in U: P\mathbf{x} = \mathbf{x}$.
  2. We have a direct sum $W = U \oplus V$. Every vector $\mathbf{x} \in W$ may be decomposed uniquely as $\mathbf{x} = \mathbf{u} + \mathbf{v}$ with $\mathbf{u} = P\mathbf{x}$ and $\mathbf{v} = \mathbf{x} - P\mathbf{x} = (I - P)\mathbf{x}$, where $\mathbf{u} \in U$ and $\mathbf{v} \in V$.

The image and kernel of a projection are complementary, as are $P$ and $Q = I - P$. The operator $Q$ is also a projection, as the image and kernel of $P$ become the kernel and image of $Q$ and vice versa. We say $P$ is a projection along $V$ onto $U$ (kernel/image) and $Q$ is a projection along $U$ onto $V$.
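The following NumPy sketch (an assumed, illustrative example) checks this complementarity for the oblique matrix from the earlier example: $Q = I - P$ is again a projection, and $P$ annihilates the range of $Q$:

```python
# Q = I - P is itself a projection; P maps the range of Q to zero
# because P(I - P) = P - P^2 = 0.
import numpy as np

alpha = 0.5
P = np.array([[0.0, 0.0], [alpha, 1.0]])
Q = np.eye(2) - P

print(np.allclose(Q @ Q, Q))   # True: Q is also a projection
print(P @ Q)                   # zero matrix: range of Q lies in the kernel of P
```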

Spectrum


In infinite-dimensional vector spaces, the spectrum of a projection is contained in $\{0, 1\}$, since for $\lambda \neq 0, 1$
$$(\lambda I - P)^{-1} = \frac{1}{\lambda} I + \frac{1}{\lambda(\lambda - 1)} P.$$
Only 0 or 1 can be an eigenvalue of a projection. This implies that an orthogonal projection $P$ is always a positive semi-definite matrix. In general, the corresponding eigenspaces are (respectively) the kernel and range of the projection. Decomposition of a vector space into direct sums is not unique. Therefore, given a subspace $V$, there may be many projections whose range (or kernel) is $V$.

If a projection is nontrivial it has minimal polynomial $x^2 - x = x(x - 1)$, which factors into distinct linear factors, and thus $P$ is diagonalizable.
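A quick numerical illustration (assumed example, not from the sources): the eigenvalues of the oblique projection above are exactly 0 and 1:

```python
# The eigenvalues of any projection matrix are 0 or 1
# (here for the oblique example with alpha = 0.5).
import numpy as np

P = np.array([[0.0, 0.0], [0.5, 1.0]])
print(np.sort(np.linalg.eigvals(P)))   # [0. 1.]
```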

Product of projections


The product of projections is not in general a projection, even if they are orthogonal. If two projections commute then their product is a projection, but the converse is false: the product of two non-commuting projections may be a projection.

If two orthogonal projections commute then their product is an orthogonal projection. If the product of two orthogonal projections is an orthogonal projection, then the two orthogonal projections commute (more generally: two self-adjoint endomorphisms commute if and only if their product is self-adjoint).
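As an illustration, the following sketch builds two commuting orthogonal projections, onto the xy- and xz-planes, and checks that their product is the orthogonal projection onto the x-axis (example matrices assumed):

```python
import numpy as np

P_xy = np.diag([1.0, 1.0, 0.0])   # orthogonal projection onto the xy-plane
P_xz = np.diag([1.0, 0.0, 1.0])   # orthogonal projection onto the xz-plane

prod = P_xy @ P_xz
print(np.allclose(prod, P_xz @ P_xy))   # True: they commute
print(np.allclose(prod @ prod, prod))   # True: the product is a projection
print(np.allclose(prod.T, prod))        # True: and it is orthogonal
```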

Orthogonal projections


When the vector space $W$ has an inner product and is complete (is a Hilbert space) the concept of orthogonality can be used. An orthogonal projection is a projection for which the range $U$ and the kernel $V$ are orthogonal subspaces. Thus, for every $\mathbf{x}$ and $\mathbf{y}$ in $W$, $\langle P\mathbf{x}, (\mathbf{y} - P\mathbf{y})\rangle = \langle (\mathbf{x} - P\mathbf{x}), P\mathbf{y}\rangle = 0$. Equivalently:
$$\langle \mathbf{x}, P\mathbf{y}\rangle = \langle P\mathbf{x}, P\mathbf{y}\rangle = \langle P\mathbf{x}, \mathbf{y}\rangle.$$

A projection is orthogonal if and only if it is self-adjoint. Using the self-adjoint and idempotent properties of $P$, for any $\mathbf{x}$ and $\mathbf{y}$ in $W$ we have $P\mathbf{x} \in U$, $\mathbf{y} - P\mathbf{y} \in V$, and
$$\langle P\mathbf{x}, \mathbf{y} - P\mathbf{y}\rangle = \langle \mathbf{x}, (P - P^2)\mathbf{y}\rangle = 0,$$
where $\langle \cdot, \cdot \rangle$ is the inner product associated with $W$. Therefore, $P$ and $I - P$ are orthogonal projections.[3] The other direction, namely that if $P$ is orthogonal then it is self-adjoint, follows from the implication from $\langle (\mathbf{x} - P\mathbf{x}), P\mathbf{y}\rangle = \langle P\mathbf{x}, (\mathbf{y} - P\mathbf{y})\rangle = 0$ to
$$\langle \mathbf{x}, P\mathbf{y}\rangle = \langle P\mathbf{x}, P\mathbf{y}\rangle = \langle P\mathbf{x}, \mathbf{y}\rangle = \langle \mathbf{x}, P^{*}\mathbf{y}\rangle$$
for every $\mathbf{x}$ and $\mathbf{y}$ in $W$; thus $P = P^{*}$.

The existence of an orthogonal projection onto a closed subspace follows from the Hilbert projection theorem.

Properties and special cases


An orthogonal projection is a bounded operator. This is because for every $\mathbf{v}$ in the vector space we have, by the Cauchy–Schwarz inequality:
$$\left\|P\mathbf{v}\right\|^2 = \langle P\mathbf{v}, P\mathbf{v}\rangle = \langle P\mathbf{v}, \mathbf{v}\rangle \leq \left\|P\mathbf{v}\right\| \cdot \left\|\mathbf{v}\right\|.$$
Thus $\left\|P\mathbf{v}\right\| \leq \left\|\mathbf{v}\right\|$.

For finite-dimensional complex or real vector spaces, the standard inner product can be substituted for $\langle \cdot, \cdot \rangle$.

Formulas

A simple case occurs when the orthogonal projection is onto a line. If $\mathbf{u}$ is a unit vector on the line, then the projection is given by the outer product
$$P_{\mathbf{u}} = \mathbf{u}\mathbf{u}^{\mathsf{T}}.$$
(If $\mathbf{u}$ is complex-valued, the transpose in the above equation is replaced by a Hermitian transpose.) This operator leaves $\mathbf{u}$ invariant, and it annihilates all vectors orthogonal to $\mathbf{u}$, proving that it is indeed the orthogonal projection onto the line containing $\mathbf{u}$.[4] A simple way to see this is to consider an arbitrary vector $\mathbf{x}$ as the sum of a component on the line (i.e. the projected vector we seek) and another perpendicular to it, $\mathbf{x} = \mathbf{x}_{\parallel} + \mathbf{x}_{\perp}$. Applying projection, we get
$$P_{\mathbf{u}}\mathbf{x} = \mathbf{u}\mathbf{u}^{\mathsf{T}}\mathbf{x}_{\parallel} + \mathbf{u}\mathbf{u}^{\mathsf{T}}\mathbf{x}_{\perp} = \mathbf{u}\left(\operatorname{sgn}\left(\mathbf{u}^{\mathsf{T}}\mathbf{x}_{\parallel}\right)\left\|\mathbf{x}_{\parallel}\right\|\right) + \mathbf{u} \cdot \mathbf{0} = \mathbf{x}_{\parallel}$$
by the properties of the dot product of parallel and perpendicular vectors.
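The following NumPy sketch (illustrative; the vectors are assumed) implements the rank-1 formula for the real case:

```python
# P_u = u u^T for a unit vector u (real case; for complex u the
# transpose would be a conjugate transpose).
import numpy as np

u = np.array([1.0, 2.0, 2.0])
u = u / np.linalg.norm(u)          # normalize so that ||u|| = 1

P_u = np.outer(u, u)               # u u^T

x = np.array([4.0, 0.0, 1.0])
print(P_u @ x)                     # component of x along u
print(np.allclose(P_u @ u, u))     # True: u itself is left invariant
print(np.allclose(P_u @ P_u, P_u)) # True: idempotent
```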

This formula can be generalized to orthogonal projections on a subspace of arbitrary dimension. Let $\mathbf{u}_1, \ldots, \mathbf{u}_k$ be an orthonormal basis of the subspace $U$, with the assumption that the integer $k \geq 1$, and let $A$ denote the $n \times k$ matrix whose columns are $\mathbf{u}_1, \ldots, \mathbf{u}_k$, i.e., $A = \begin{bmatrix} \mathbf{u}_1 & \cdots & \mathbf{u}_k \end{bmatrix}$. Then the projection is given by:[5]
$$P_A = AA^{\mathsf{T}},$$
which can be rewritten as
$$P_A = \sum_i \langle \mathbf{u}_i, \cdot \rangle \mathbf{u}_i.$$

The matrix $A^{\mathsf{T}}$ is the partial isometry that vanishes on the orthogonal complement of $U$, and $A$ is the isometry that embeds $U$ into the underlying vector space. The range of $P_A$ is therefore the final space of $A$. It is also clear that $AA^{\mathsf{T}}$ is the identity operator on $U$.

The orthonormality condition can also be dropped. If $\mathbf{u}_1, \ldots, \mathbf{u}_k$ is a (not necessarily orthonormal) basis with $k \geq 1$, and $A$ is the matrix with these vectors as columns, then the projection is:[6][7]
$$P_A = A\left(A^{\mathsf{T}}A\right)^{-1}A^{\mathsf{T}}.$$

The matrix $A$ still embeds $U$ into the underlying vector space but is no longer an isometry in general. The matrix $\left(A^{\mathsf{T}}A\right)^{-1}$ is a "normalizing factor" that recovers the norm. For example, the rank-1 operator $\mathbf{u}\mathbf{u}^{\mathsf{T}}$ is not a projection if $\left\|\mathbf{u}\right\| \neq 1$. After dividing by $\mathbf{u}^{\mathsf{T}}\mathbf{u} = \left\|\mathbf{u}\right\|^2$, we obtain the projection $\mathbf{u}\left(\mathbf{u}^{\mathsf{T}}\mathbf{u}\right)^{-1}\mathbf{u}^{\mathsf{T}}$ onto the subspace spanned by $\mathbf{u}$.
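A minimal sketch of the general formula, with an assumed non-orthonormal basis and using a linear solve in place of an explicit inverse:

```python
# P_A = A (A^T A)^{-1} A^T for a (not necessarily orthonormal) basis.
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [1.0, 0.0]])                 # columns span a 2D subspace of R^3

P_A = A @ np.linalg.solve(A.T @ A, A.T)    # A (A^T A)^{-1} A^T

print(np.allclose(P_A @ P_A, P_A))         # True: projection
print(np.allclose(P_A.T, P_A))             # True: orthogonal
print(np.allclose(P_A @ A, A))             # True: acts as identity on the subspace
```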

In the general case, we can have an arbitrary positive definite matrix $D$ defining an inner product $\langle x, y \rangle_D = y^{\dagger}Dx$, and the projection $P_A$ is given by $P_A x = \operatorname{argmin}_{y \in \operatorname{range}(A)} \left\|x - y\right\|_D^2$. Then
$$P_A = A\left(A^{\mathsf{T}}DA\right)^{-1}A^{\mathsf{T}}D.$$

When the range space of the projection is generated by a frame (i.e. the number of generators is greater than its dimension), the formula for the projection takes the form $P_A = AA^{+}$. Here $A^{+}$ stands for the Moore–Penrose pseudoinverse. This is just one of many ways to construct the projection operator.
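A short sketch (assumed generators) of the frame case, using NumPy's np.linalg.pinv for the Moore–Penrose pseudoinverse:

```python
# P_A = A A^+ with a frame: three generators of a 2D subspace of R^3.
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 0.0]])            # 3 generators, but rank(A) = 2

P_A = A @ np.linalg.pinv(A)

print(np.allclose(P_A @ P_A, P_A))                  # True: projection
print(np.allclose(P_A, np.diag([1.0, 1.0, 0.0])))   # True: projects onto the xy-plane
```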

If $\begin{bmatrix} A & B \end{bmatrix}$ is a non-singular matrix and $A^{\mathsf{T}}B = 0$ (i.e., $B$ is the null space matrix of $A$),[8] the following holds:
$$\begin{aligned}
I &= \begin{bmatrix} A & B \end{bmatrix} \begin{bmatrix} A & B \end{bmatrix}^{-1} \begin{bmatrix} A^{\mathsf{T}} \\ B^{\mathsf{T}} \end{bmatrix}^{-1} \begin{bmatrix} A^{\mathsf{T}} \\ B^{\mathsf{T}} \end{bmatrix} \\
&= \begin{bmatrix} A & B \end{bmatrix} \left( \begin{bmatrix} A^{\mathsf{T}} \\ B^{\mathsf{T}} \end{bmatrix} \begin{bmatrix} A & B \end{bmatrix} \right)^{-1} \begin{bmatrix} A^{\mathsf{T}} \\ B^{\mathsf{T}} \end{bmatrix} \\
&= \begin{bmatrix} A & B \end{bmatrix} \begin{bmatrix} A^{\mathsf{T}}A & O \\ O & B^{\mathsf{T}}B \end{bmatrix}^{-1} \begin{bmatrix} A^{\mathsf{T}} \\ B^{\mathsf{T}} \end{bmatrix} \\
&= A\left(A^{\mathsf{T}}A\right)^{-1}A^{\mathsf{T}} + B\left(B^{\mathsf{T}}B\right)^{-1}B^{\mathsf{T}}.
\end{aligned}$$

If the orthogonal condition is enhanced to $A^{\mathsf{T}}WB = A^{\mathsf{T}}W^{\mathsf{T}}B = 0$ with $W$ non-singular, the following holds:
$$I = \begin{bmatrix} A & B \end{bmatrix} \begin{bmatrix} \left(A^{\mathsf{T}}WA\right)^{-1}A^{\mathsf{T}} \\ \left(B^{\mathsf{T}}WB\right)^{-1}B^{\mathsf{T}} \end{bmatrix} W.$$

All these formulas also hold for complex inner product spaces, provided that the conjugate transpose is used instead of the transpose. Further details on sums of projectors can be found in Banerjee and Roy (2014).[9] Also see Banerjee (2004)[10] for application of sums of projectors in basic spherical trigonometry.

Oblique projections


The term oblique projections is sometimes used to refer to non-orthogonal projections. These projections are also used to represent spatial figures in two-dimensional drawings (see oblique projection), though not as frequently as orthogonal projections. Whereas calculating the fitted value of an ordinary least squares regression requires an orthogonal projection, calculating the fitted value of an instrumental variables regression requires an oblique projection.

A projection is defined by its kernel and the basis vectors used to characterize its range (which is a complement of the kernel). When these basis vectors are orthogonal to the kernel, the projection is an orthogonal projection. When these basis vectors are not orthogonal to the kernel, the projection is an oblique projection, or just a projection.

A matrix representation formula for a nonzero projection operator


Let $P\colon V \to V$ be a linear operator such that $P^2 = P$ and assume that $P$ is not the zero operator. Let the vectors $\mathbf{u}_1, \ldots, \mathbf{u}_k$ form a basis for the range of $P$, and assemble these vectors in the $n \times k$ matrix $A$. Then $k \geq 1$, for otherwise $k = 0$ and $P$ would be the zero operator. The range and the kernel are complementary spaces, so the kernel has dimension $n - k$. It follows that the orthogonal complement of the kernel has dimension $k$. Let $\mathbf{v}_1, \ldots, \mathbf{v}_k$ form a basis for the orthogonal complement of the kernel of the projection, and assemble these vectors in the matrix $B$. Then the projection $P$ (with the condition $k \geq 1$) is given by
$$P = A\left(B^{\mathsf{T}}A\right)^{-1}B^{\mathsf{T}}.$$

This expression generalizes the formula for orthogonal projections given above.[11][12] A standard proof of this expression is the following. For any vector $\mathbf{x}$ in the vector space $V$, we can decompose $\mathbf{x} = \mathbf{x}_1 + \mathbf{x}_2$, where the vector $\mathbf{x}_1 = P(\mathbf{x})$ is in the image of $P$, and the vector $\mathbf{x}_2 = \mathbf{x} - P(\mathbf{x})$. So $P(\mathbf{x}_2) = P(\mathbf{x}) - P^2(\mathbf{x}) = \mathbf{0}$, and hence $\mathbf{x}_2$ is in the kernel of $P$. The vector $\mathbf{x}_1$ is in the column space of $A$, so $\mathbf{x}_1 = A\mathbf{w}$ for some $k$-dimensional vector $\mathbf{w}$, and the vector $\mathbf{x}_2$ satisfies $B^{\mathsf{T}}\mathbf{x}_2 = \mathbf{0}$ by the construction of $B$. Putting these conditions together, we find a vector $\mathbf{w}$ so that $B^{\mathsf{T}}(\mathbf{x} - A\mathbf{w}) = \mathbf{0}$. Since the matrices $A$ and $B$ are of full rank $k$ by their construction, the $k \times k$ matrix $B^{\mathsf{T}}A$ is invertible. So the equation $B^{\mathsf{T}}(\mathbf{x} - A\mathbf{w}) = \mathbf{0}$ gives the vector $\mathbf{w} = (B^{\mathsf{T}}A)^{-1}B^{\mathsf{T}}\mathbf{x}$. In this way, $P\mathbf{x} = \mathbf{x}_1 = A\mathbf{w} = A(B^{\mathsf{T}}A)^{-1}B^{\mathsf{T}}\mathbf{x}$ for any vector $\mathbf{x} \in V$, and hence $P = A(B^{\mathsf{T}}A)^{-1}B^{\mathsf{T}}$.

In the case that $P$ is an orthogonal projection, we can take $A = B$, and it follows that $P = A\left(A^{\mathsf{T}}A\right)^{-1}A^{\mathsf{T}}$. By using this formula, one can easily check that $P = P^{\mathsf{T}}$. In general, if the vector space is over the complex number field, one uses the Hermitian transpose $A^{*}$ and has the formula $P = A\left(A^{*}A\right)^{-1}A^{*}$. Recall that one can express the Moore–Penrose inverse of the matrix $A$ by $A^{+} = (A^{*}A)^{-1}A^{*}$ since $A$ has full column rank, so $P = AA^{+}$.
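A minimal numerical sketch of this representation, with assumed one-column matrices $A$ (spanning the range) and $B$ (spanning the orthogonal complement of the kernel):

```python
# P = A (B^T A)^{-1} B^T: range = span{(1, 1)}, kernel = the y-axis
# (the orthogonal complement of B = span{(1, 0)}).
import numpy as np

A = np.array([[1.0], [1.0]])       # basis of the desired range
B = np.array([[1.0], [0.0]])       # basis of (kernel)^perp

P = A @ np.linalg.solve(B.T @ A, B.T)

print(P)                           # [[1. 0.], [1. 0.]]
print(np.allclose(P @ P, P))       # True: projection
print(P @ np.array([0.0, 1.0]))    # [0. 0.]: the y-axis is the kernel
```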

Singular values


$I - P$ is also an oblique projection. The singular values of $P$ and $I - P$ can be computed from an orthonormal basis of the range of $A$. Let $Q_A$ be an orthonormal basis of the range of $A$ and let $Q_A^{\perp}$ be an orthonormal basis of its orthogonal complement. Denote the singular values of the matrix $Q_A^{\mathsf{T}} A (B^{\mathsf{T}}A)^{-1} B^{\mathsf{T}} Q_A^{\perp}$ by the positive values $\gamma_1 \geq \gamma_2 \geq \ldots \geq \gamma_k$. With this, the singular values for $P$ are:[13]
$$\sigma_i = \begin{cases} \sqrt{1 + \gamma_i^2} & 1 \leq i \leq k \\ 0 & \text{otherwise} \end{cases}$$
and the singular values for $I - P$ are
$$\sigma_i = \begin{cases} \sqrt{1 + \gamma_i^2} & 1 \leq i \leq k \\ 1 & k + 1 \leq i \leq n - k \\ 0 & \text{otherwise.} \end{cases}$$
This implies that the largest singular values of $P$ and $I - P$ are equal, and thus that the matrix norms of the two oblique projections are the same. However, the condition numbers satisfy the relation $\kappa(I - P) = \frac{\sigma_1}{1} \geq \frac{\sigma_1}{\sigma_k} = \kappa(P)$, and are therefore not necessarily equal.
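The equality of the largest singular values can be illustrated numerically (using the oblique projection constructed in the previous sketch; purely illustrative):

```python
# P and I - P share their largest singular value, hence the same matrix norm.
import numpy as np

P = np.array([[1.0, 0.0],
              [1.0, 0.0]])                    # oblique projection from the previous sketch
s_P  = np.linalg.svd(P, compute_uv=False)
s_IP = np.linalg.svd(np.eye(2) - P, compute_uv=False)

print(s_P)                           # [1.414... 0.]
print(s_IP)                          # [1.414... 0.]
print(np.isclose(s_P[0], s_IP[0]))   # True: equal spectral norms
```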

Finding projection with an inner product


Let $V$ be a vector space (in this case a plane) spanned by orthogonal vectors $\mathbf{u}_1, \mathbf{u}_2, \dots, \mathbf{u}_p$. Let $\mathbf{y}$ be a vector. One can define a projection of $\mathbf{y}$ onto $V$ as
$$\operatorname{proj}_V \mathbf{y} = \frac{\mathbf{y} \cdot \mathbf{u}^i}{\mathbf{u}^i \cdot \mathbf{u}^i} \mathbf{u}^i,$$
where repeated indices are summed over (Einstein summation notation). The vector $\mathbf{y}$ can be written as an orthogonal sum such that $\mathbf{y} = \operatorname{proj}_V \mathbf{y} + \mathbf{z}$. Here $\operatorname{proj}_V \mathbf{y}$ is sometimes denoted $\hat{\mathbf{y}}$. There is a theorem in linear algebra stating that this $\mathbf{z}$ is orthogonal to $V$ and that its norm is the smallest distance (the orthogonal distance) from $\mathbf{y}$ to $V$; this fact is commonly used in areas such as machine learning.

 
y is being projected onto the vector space V.
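A short sketch of this construction with assumed orthogonal (but not unit) spanning vectors:

```python
# proj_V y = sum_i ((y . u_i) / (u_i . u_i)) u_i for orthogonal u_i spanning V.
import numpy as np

u1 = np.array([1.0, 1.0, 0.0])
u2 = np.array([1.0, -1.0, 0.0])     # orthogonal to u1; together they span the xy-plane
y  = np.array([2.0, 3.0, 4.0])

proj = sum((y @ u) / (u @ u) * u for u in (u1, u2))
z = y - proj                        # the orthogonal residual

print(proj)                         # [2. 3. 0.]
print(z @ u1, z @ u2)               # 0.0 0.0: z is orthogonal to V
```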

Canonical forms


Any projection $P = P^2$ on a vector space of dimension $d$ over a field is a diagonalizable matrix, since its minimal polynomial divides $x^2 - x$, which splits into distinct linear factors. Thus there exists a basis in which $P$ has the form

$$P = I_r \oplus 0_{d-r},$$

where $r$ is the rank of $P$. Here $I_r$ is the identity matrix of size $r$, $0_{d-r}$ is the zero matrix of size $d - r$, and $\oplus$ is the direct sum operator. If the vector space is complex and equipped with an inner product, then there is an orthonormal basis in which the matrix of $P$ is[14]

$$P = \begin{bmatrix} 1 & \sigma_1 \\ 0 & 0 \end{bmatrix} \oplus \cdots \oplus \begin{bmatrix} 1 & \sigma_k \\ 0 & 0 \end{bmatrix} \oplus I_m \oplus 0_s,$$

where $\sigma_1 \geq \sigma_2 \geq \dots \geq \sigma_k > 0$. The integers $k, s, m$ and the real numbers $\sigma_i$ are uniquely determined, and $2k + s + m = d$. The factor $I_m \oplus 0_s$ corresponds to the maximal invariant subspace on which $P$ acts as an orthogonal projection (so that $P$ itself is orthogonal if and only if $k = 0$) and the $\sigma_i$-blocks correspond to the oblique components.

Projections on normed vector spaces


When the underlying vector space $X$ is a (not necessarily finite-dimensional) normed vector space, analytic questions, irrelevant in the finite-dimensional case, need to be considered. Assume now $X$ is a Banach space.

Many of the algebraic results discussed above survive the passage to this context. A given direct sum decomposition of $X$ into complementary subspaces still specifies a projection, and vice versa. If $X$ is the direct sum $X = U \oplus V$, then the operator defined by $P(u + v) = u$ is still a projection with range $U$ and kernel $V$. It is also clear that $P^2 = P$. Conversely, if $P$ is a projection on $X$, i.e. $P^2 = P$, then it is easily verified that $(1 - P)^2 = (1 - P)$. In other words, $1 - P$ is also a projection. The relation $P^2 = P$ implies $1 = P + (1 - P)$, and $X$ is the direct sum $\operatorname{rg}(P) \oplus \operatorname{rg}(1 - P)$.

However, in contrast to the finite-dimensional case, projections need not be continuous in general. If a subspace $U$ of $X$ is not closed in the norm topology, then the projection onto $U$ is not continuous. In other words, the range of a continuous projection $P$ must be a closed subspace. Furthermore, the kernel of a continuous projection (in fact, a continuous linear operator in general) is closed. Thus a continuous projection $P$ gives a decomposition of $X$ into two complementary closed subspaces: $X = \operatorname{rg}(P) \oplus \ker(P) = \ker(1 - P) \oplus \ker(P)$.

The converse holds also, with an additional assumption. Suppose $U$ is a closed subspace of $X$. If there exists a closed subspace $V$ such that $X = U \oplus V$, then the projection $P$ with range $U$ and kernel $V$ is continuous. This follows from the closed graph theorem. Suppose $x_n \to x$ and $Px_n \to y$. One needs to show that $Px = y$. Since $U$ is closed and $\{Px_n\} \subset U$, $y$ lies in $U$, i.e. $Py = y$. Also, $x_n - Px_n = (I - P)x_n \to x - y$. Because $V$ is closed and $\{(I - P)x_n\} \subset V$, we have $x - y \in V$, i.e. $P(x - y) = Px - Py = Px - y = 0$, which proves the claim.

The above argument makes use of the assumption that both $U$ and $V$ are closed. In general, given a closed subspace $U$, there need not exist a complementary closed subspace $V$, although for Hilbert spaces this can always be done by taking the orthogonal complement. For Banach spaces, a one-dimensional subspace always has a closed complementary subspace. This is an immediate consequence of the Hahn–Banach theorem. Let $U$ be the linear span of $u$. By Hahn–Banach, there exists a bounded linear functional $\varphi$ such that $\varphi(u) = 1$. The operator $P(x) = \varphi(x)u$ satisfies $P^2 = P$, i.e. it is a projection. Boundedness of $\varphi$ implies continuity of $P$, and therefore $\ker(P) = \operatorname{rg}(I - P)$ is a closed complementary subspace of $U$.

Applications and further considerations


Projections (orthogonal and otherwise) play a major role in algorithms for certain linear algebra problems, such as QR decomposition, singular value decomposition, and linear least squares.

As stated above, projections are a special case of idempotents. Analytically, orthogonal projections are non-commutative generalizations of characteristic functions. Idempotents are used in classifying, for instance, semisimple algebras, while measure theory begins with considering characteristic functions of measurable sets. Therefore, as one can imagine, projections are very often encountered in the context of operator algebras. In particular, a von Neumann algebra is generated by its complete lattice of projections.

Generalizations


More generally, given a map between normed vector spaces $T\colon V \to W$, one can analogously ask for this map to be an isometry on the orthogonal complement of the kernel: that $(\ker T)^{\perp} \to W$ be an isometry (compare partial isometry); in particular it must be onto. The case of an orthogonal projection is when $W$ is a subspace of $V$. In Riemannian geometry, this is used in the definition of a Riemannian submersion.


Notes

  1. Meyer, pp. 386–387.
  2. Horn, Roger A.; Johnson, Charles R. (2013). Matrix Analysis (2nd ed.). Cambridge University Press. ISBN 9780521839402.
  3. Meyer, p. 433.
  4. Meyer, p. 431.
  5. Meyer, equation (5.13.4).
  6. Banerjee, Sudipto; Roy, Anindya (2014), Linear Algebra and Matrix Analysis for Statistics, Texts in Statistical Science (1st ed.), Chapman and Hall/CRC, ISBN 978-1420095388.
  7. Meyer, equation (5.13.3).
  8. See also Linear least squares (mathematics) § Properties of the least-squares estimators.
  9. Banerjee, Sudipto; Roy, Anindya (2014), Linear Algebra and Matrix Analysis for Statistics, Texts in Statistical Science (1st ed.), Chapman and Hall/CRC, ISBN 978-1420095388.
  10. Banerjee, Sudipto (2004), "Revisiting Spherical Trigonometry with Orthogonal Projectors", The College Mathematics Journal, 35 (5): 375–381, doi:10.1080/07468342.2004.11922099, S2CID 122277398.
  11. Banerjee, Sudipto; Roy, Anindya (2014), Linear Algebra and Matrix Analysis for Statistics, Texts in Statistical Science (1st ed.), Chapman and Hall/CRC, ISBN 978-1420095388.
  12. Meyer, equation (7.10.39).
  13. Brust, J. J.; Marcia, R. F.; Petra, C. G. (2020), "Computationally Efficient Decompositions of Oblique Projection Matrices", SIAM Journal on Matrix Analysis and Applications, 41 (2): 852–870, doi:10.1137/19M1288115, OSTI 1680061, S2CID 219921214.
  14. Doković, D. Ž. (August 1991). "Unitary similarity of projectors". Aequationes Mathematicae. 42 (1): 220–224. doi:10.1007/BF01818492. S2CID 122704926.

References

  • Banerjee, Sudipto; Roy, Anindya (2014), Linear Algebra and Matrix Analysis for Statistics, Texts in Statistical Science (1st ed.), Chapman and Hall/CRC, ISBN 978-1420095388.
  • Dunford, N.; Schwartz, J. T. (1958). Linear Operators, Part I: General Theory. Interscience.
  • Meyer, Carl D. (2000). Matrix Analysis and Applied Linear Algebra. Society for Industrial and Applied Mathematics. ISBN 978-0-89871-454-8.
  • Brezinski, Claude (1997). Projection Methods for Systems of Equations. North-Holland. ISBN 0-444-82777-3.
