Projection (linear algebra)

Figure: The transformation P is the orthogonal projection onto the line m.

In linear algebra and functional analysis, a projection is a linear transformation $P$ from a vector space to itself (an endomorphism) such that $P \circ P = P$. That is, whenever $P$ is applied twice to any vector, it gives the same result as if it were applied once (i.e. $P$ is idempotent). It leaves its image unchanged.[1] This definition of "projection" formalizes and generalizes the idea of graphical projection. One can also consider the effect of a projection on a geometrical object by examining the effect of the projection on points in the object.

Definitions

A projection on a vector space $V$ is a linear operator $P\colon V \to V$ such that $P^2 = P$. When $V$ has an inner product and is complete, i.e. when $V$ is a Hilbert space, the concept of orthogonality can be used. A projection $P$ on a Hilbert space $V$ is called an orthogonal projection if it satisfies $\langle Px, y\rangle = \langle x, Py\rangle$ for all $x, y \in V$. A projection on a Hilbert space that is not orthogonal is called an oblique projection.

Projection matrix

  • A square matrix $P$ is called a projection matrix if it is equal to its square, i.e. if $P^2 = P$.[2]: p. 38 
  • A square matrix $P$ is called an orthogonal projection matrix if $P^2 = P = P^{\mathsf T}$ for a real matrix, and respectively $P^2 = P = P^*$ for a complex matrix, where $P^{\mathsf T}$ denotes the transpose of $P$ and $P^*$ denotes the adjoint or Hermitian transpose of $P$.[2]: p. 223 
  • A projection matrix that is not an orthogonal projection matrix is called an oblique projection matrix.

The eigenvalues of a projection matrix must be 0 or 1.

Examples

Orthogonal projection

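For example, the function which maps the point $(x, y, z)$ in three-dimensional space $\mathbb{R}^3$ to the point $(x, y, 0)$ is an orthogonal projection onto the xy-plane. This function is represented by the matrix

$$P = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$

The action of this matrix on an arbitrary vector is

$$P \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x \\ y \\ 0 \end{bmatrix}.$$

To see that $P$ is indeed a projection, i.e., $P = P^2$, we compute

$$P^2 \begin{bmatrix} x \\ y \\ z \end{bmatrix} = P \begin{bmatrix} x \\ y \\ 0 \end{bmatrix} = \begin{bmatrix} x \\ y \\ 0 \end{bmatrix} = P \begin{bmatrix} x \\ y \\ z \end{bmatrix}.$$

Observing that $P^{\mathsf T} = P$ shows that the projection is an orthogonal projection.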
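The two defining checks for this example (idempotence and symmetry) can be reproduced numerically. The following is a minimal sketch in plain Python; the helper names `matmul` and `transpose` and the sample vector are illustrative choices, not part of the article.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


# Orthogonal projection of R^3 onto the xy-plane.
P = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 0]]

is_idempotent = matmul(P, P) == P   # P^2 = P, so P is a projection
is_symmetric = transpose(P) == P    # P^T = P, so the projection is orthogonal

# Apply P to the sample column vector (3, -2, 7); the z-component is dropped.
v = [[3], [-2], [7]]
Pv = matmul(P, v)
```

Here `Pv` holds the column vector (3, -2, 0), and both boolean flags evaluate to `True`.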

Oblique projection

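A simple example of a non-orthogonal (oblique) projection is

$$P = \begin{bmatrix} 0 & 0 \\ \alpha & 1 \end{bmatrix}.$$

Via matrix multiplication, one sees that

$$P^2 = \begin{bmatrix} 0 & 0 \\ \alpha & 1 \end{bmatrix} \begin{bmatrix} 0 & 0 \\ \alpha & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ \alpha & 1 \end{bmatrix} = P,$$

showing that $P$ is indeed a projection. The projection $P$ is orthogonal if and only if $\alpha = 0$, because only then $P^{\mathsf T} = P$.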
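As a quick sanity check, the sketch below builds this 2×2 matrix for a few values of α and tests idempotence and symmetry. The function names and the sampled values of α are assumptions made for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def oblique(alpha):
    """The 2x2 matrix [[0, 0], [alpha, 1]] from the example."""
    return [[0, 0], [alpha, 1]]


def is_projection(p):
    """True when p is idempotent (p squared equals p)."""
    return matmul(p, p) == p


def is_symmetric(p):
    """True when p equals its transpose."""
    return [list(col) for col in zip(*p)] == p


# For each sampled alpha, record (is a projection, is an orthogonal projection).
results = {alpha: (is_projection(oblique(alpha)), is_symmetric(oblique(alpha)))
           for alpha in (0, 1, -3)}
```

Every sampled matrix is idempotent, but only the α = 0 case is also symmetric, in line with the statement above.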

Properties and classification

Figure: The transformation T is the projection along k onto m. The range of T is m and the kernel is k.

Idempotence

By definition, a projection $P$ is idempotent (i.e. $P^2 = P$).

Open map

Every projection is an open map, meaning that it maps each open set in the domain to an open set in the subspace topology of the image.[citation needed] That is, for any vector $x$ and any ball $B_x$ (with positive radius) centered on $x$, there exists a ball $B_{Px}$ (with positive radius) centered on $Px$ that is wholly contained in the image $P(B_x)$.

Complementarity of image and kernel

Let $W$ be a finite-dimensional vector space and $P$ be a projection on $W$. Suppose the subspaces $U$ and $V$ are the image and kernel of $P$ respectively. Then $P$ has the following properties:

  1. $P$ is the identity operator $I$ on $U$: $\forall x \in U\colon Px = x$.
  2. We have a direct sum $W = U \oplus V$. Every vector $x \in W$ may be decomposed uniquely as $x = u + v$ with $u = Px$ and $v = x - Px = (I - P)x$, and where $u \in U$, $v \in V$.

The image and kernel of a projection are complementary, as are $P$ and $Q = I - P$. The operator $Q$ is also a projection, as the image and kernel of $P$ become the kernel and image of $Q$ and vice versa. We say $P$ is a projection along $V$ onto $U$ (kernel/image) and $Q$ is a projection along $U$ onto $V$.
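The decomposition into an image part and a kernel part can be illustrated on a small example. The sketch below assumes the 2×2 oblique projection with rows (0, 0) and (2, 1) and the sample vector (3, 4); both are illustrative choices.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def matvec(m, x):
    """Apply the matrix m (list of rows) to the vector x (flat list)."""
    return [sum(m[i][j] * x[j] for j in range(len(x))) for i in range(len(m))]


P = [[0, 0],
     [2, 1]]                       # an oblique projection on R^2
I = [[1, 0],
     [0, 1]]
Q = [[I[i][j] - P[i][j] for j in range(2)] for i in range(2)]   # Q = I - P

x = [3, 4]
u = matvec(P, x)   # component in the image of P
v = matvec(Q, x)   # component in the kernel of P

recombined = [ui + vi for ui, vi in zip(u, v)]   # should reproduce x
```

In this example `u` is (0, 10) and `v` is (3, -6). Applying `P` to `u` leaves it fixed, applying `P` to `v` gives the zero vector, and `Q` is itself idempotent.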

Spectrum

In infinite-dimensional vector spaces, the spectrum of a projection is contained in $\{0, 1\}$ as

$$(\lambda I - P)^{-1} = \frac{1}{\lambda} I + \frac{1}{\lambda(\lambda - 1)} P.$$

Only 0 or 1 can be an eigenvalue of a projection. This implies that an orthogonal projection $P$ is always a positive semi-definite matrix. In general, the corresponding eigenspaces are (respectively) the kernel and range of the projection. Decomposition of a vector space into direct sums is not unique. Therefore, given a subspace $V$, there may be many projections whose range (or kernel) is $V$. If a projection is nontrivial it has minimal polynomial $x^2 - x = x(x - 1)$, which factors into distinct linear factors, and thus $P$ is diagonalizable.
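The resolvent formula can be checked exactly on a finite-dimensional example. The sketch below uses rational arithmetic from Python's `fractions` module; the 2×2 projection and the sampled values of λ are assumptions chosen for illustration, and a finite-dimensional check is of course not a proof of the general statement.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def shifted(P, lam):
    """Return lam*I - P."""
    n = len(P)
    return [[(lam if i == j else 0) - P[i][j] for j in range(n)]
            for i in range(n)]


def resolvent(P, lam):
    """Evaluate (1/lam) I + P / (lam (lam - 1)) for lam not in {0, 1}."""
    lam = Fraction(lam)
    n = len(P)
    return [[(1 / lam if i == j else 0) + P[i][j] / (lam * (lam - 1))
             for j in range(n)]
            for i in range(n)]


P = [[0, 0],
     [2, 1]]                      # a projection: P times P equals P
identity = [[1, 0], [0, 1]]

# For each sampled lam, (lam*I - P) times the closed form should be the identity.
checks = [matmul(shifted(P, lam), resolvent(P, lam)) == identity
          for lam in (3, -2, Fraction(1, 2))]
```

All entries of `checks` are `True`, since the product telescopes to the identity whenever $P^2 = P$ and λ is neither 0 nor 1.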

Product of projections

The product of projections is not in general a projection, even if they are orthogonal. If two projections commute then their product is a projection, but the converse is false: the product of two non-commuting projections may be a projection. If two orthogonal projections commute then their product is an orthogonal projection. If the product of two orthogonal projections is an orthogonal projection, then the two orthogonal projections commute (more generally: two self-adjoint endomorphisms commute if and only if their product is self-adjoint).
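A small numerical illustration of these statements, under assumed example matrices: two orthogonal projections in the plane that do not commute (and whose product fails to be idempotent), followed by a commuting pair in three dimensions whose product is again an orthogonal projection.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def is_projection(p):
    """True when p is idempotent."""
    return matmul(p, p) == p


half = Fraction(1, 2)
P_x = [[1, 0], [0, 0]]                   # orthogonal projection onto the x-axis
P_diag = [[half, half], [half, half]]    # orthogonal projection onto the line y = x

noncommuting = matmul(P_x, P_diag) != matmul(P_diag, P_x)
product_fails = not is_projection(matmul(P_x, P_diag))

# A commuting pair: projections onto the xy-plane and the yz-plane in R^3.
P_xy = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
P_yz = [[0, 0, 0], [0, 1, 0], [0, 0, 1]]
commuting = matmul(P_xy, P_yz) == matmul(P_yz, P_xy)
product_ok = is_projection(matmul(P_xy, P_yz))   # projection onto the y-axis
```

All four flags evaluate to `True` for these matrices.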

Orthogonal projections

When the vector space $W$ has an inner product and is complete (is a Hilbert space) the concept of orthogonality can be used. An orthogonal projection is a projection for which the range $U$ and the kernel $V$ are orthogonal subspaces. Thus, for every $x$ and $y$ in $W$, $\langle Px, (y - Py)\rangle = \langle (x - Px), Py\rangle = 0$. Equivalently: $\langle x, Py\rangle = \langle Px, Py\rangle = \langle Px, y\rangle$.

A projection is orthogonal if and only if it is self-adjoint. Using the self-adjoint and idempotent properties of $P$, for any $x$ and $y$ in $W$ we have $Px \in U$, $y - Py \in V$, and $\langle Px, y - Py\rangle = \langle x, (P - P^2)y\rangle = 0$, where $\langle\cdot,\cdot\rangle$ is the inner product associated with $W$. Therefore, $P$ and $I - P$ are orthogonal projections.[3] The other direction, namely that if $P$ is orthogonal then it is self-adjoint, follows from the implication from $\langle (x - Px), Py\rangle = \langle Px, (y - Py)\rangle = 0$ to $\langle x, Py\rangle = \langle Px, Py\rangle = \langle Px, y\rangle = \langle x, P^*y\rangle$ for every $x$ and $y$ in $W$; thus $P = P^*$.

The existence of an orthogonal projection onto a closed subspace follows from the Hilbert projection theorem.

Properties and special cases

An orthogonal projection is a bounded operator. This is because for every $v$ in the vector space we have, by the Cauchy–Schwarz inequality:

$$\|Pv\|^2 = \langle Pv, Pv\rangle = \langle Pv, v\rangle \leq \|Pv\| \cdot \|v\|.$$

Thus $\|Pv\| \leq \|v\|$. For finite-dimensional complex or real vector spaces, the standard inner product can be substituted for $\langle\cdot,\cdot\rangle$.

Formulas

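A simple case occurs when the orthogonal projection is onto a line. If $u$ is a unit vector on the line, then the projection is given by the outer product $P_u = u u^{\mathsf T}$. (If $u$ is complex-valued, the transpose in the above equation is replaced by a Hermitian transpose.) This operator leaves $u$ invariant, and it annihilates all vectors orthogonal to $u$, proving that it is indeed the orthogonal projection onto the line containing $u$.[4] A simple way to see this is to consider an arbitrary vector $x$ as the sum of a component on the line (i.e. the projected vector we seek) and another perpendicular to it, $x = x_\parallel + x_\perp$. Applying the projection, we get

$$P_u x = u u^{\mathsf T} x_\parallel + u u^{\mathsf T} x_\perp = u\left(\operatorname{sgn}(u^{\mathsf T} x_\parallel)\,\|x_\parallel\|\right) + u \cdot 0 = x_\parallel$$

by the properties of the dot product of parallel and perpendicular vectors.

This formula can be generalized to orthogonal projections on a subspace of arbitrary dimension. Let $u_1, \ldots, u_k$ be an orthonormal basis of the subspace $U$, with the assumption that the integer $k \geq 1$, and let $A$ denote the $n \times k$ matrix whose columns are $u_1, \ldots, u_k$, i.e., $A = \begin{bmatrix} u_1 & \cdots & u_k \end{bmatrix}$. Then the projection is given by:[5]

$$P_A = A A^{\mathsf T},$$

which can be rewritten as

$$P_A = \sum_i \langle u_i, \cdot\rangle\, u_i.$$

The matrix $A^{\mathsf T}$ is the partial isometry that vanishes on the orthogonal complement of $U$, and $A$ is the isometry that embeds $U$ into the underlying vector space. The range of $P_A$ is therefore the final space of $A$. It is also clear that $A A^{\mathsf T}$ is the identity operator on $U$.

The orthonormality condition can also be dropped. If $u_1, \ldots, u_k$ is a (not necessarily orthonormal) basis with $k \geq 1$, and $A$ is the matrix with these vectors as columns, then the projection is:[6][7]

$$P_A = A (A^{\mathsf T} A)^{-1} A^{\mathsf T}.$$

The matrix $A$ still embeds $U$ into the underlying vector space but is no longer an isometry in general. The matrix $(A^{\mathsf T} A)^{-1}$ is a "normalizing factor" that recovers the norm. For example, the rank-1 operator $u u^{\mathsf T}$ is not a projection if $\|u\| \neq 1$. After dividing by $u^{\mathsf T} u = \|u\|^2$, we obtain the projection $u (u^{\mathsf T} u)^{-1} u^{\mathsf T}$ onto the subspace spanned by $u$.

In the general case, we can have an arbitrary positive definite matrix $D$ defining an inner product $\langle x, y\rangle_D = y^\dagger D x$, and the projection $P_A$ is given by $P_A x = \operatorname{argmin}_{y \in \operatorname{range}(A)} \|x - y\|_D^2$. Then

$$P_A = A (A^{\mathsf T} D A)^{-1} A^{\mathsf T} D.$$

When the range space of the projection is generated by a frame (i.e. the number of generators is greater than its dimension), the formula for the projection takes the form $P_A = A A^+$. Here $A^+$ stands for the Moore–Penrose pseudoinverse. This is just one of many ways to construct the projection operator.

If $\begin{bmatrix} A & B \end{bmatrix}$ is a non-singular matrix and $A^{\mathsf T} B = 0$ (i.e., $B$ is the null space matrix of $A$),[8] the following holds:

$$\begin{aligned}
I &= \begin{bmatrix} A & B \end{bmatrix} \begin{bmatrix} A & B \end{bmatrix}^{-1} \begin{bmatrix} A^{\mathsf T} \\ B^{\mathsf T} \end{bmatrix}^{-1} \begin{bmatrix} A^{\mathsf T} \\ B^{\mathsf T} \end{bmatrix} \\
&= \begin{bmatrix} A & B \end{bmatrix} \left( \begin{bmatrix} A^{\mathsf T} \\ B^{\mathsf T} \end{bmatrix} \begin{bmatrix} A & B \end{bmatrix} \right)^{-1} \begin{bmatrix} A^{\mathsf T} \\ B^{\mathsf T} \end{bmatrix} \\
&= \begin{bmatrix} A & B \end{bmatrix} \begin{bmatrix} A^{\mathsf T} A & O \\ O & B^{\mathsf T} B \end{bmatrix}^{-1} \begin{bmatrix} A^{\mathsf T} \\ B^{\mathsf T} \end{bmatrix} \\
&= A (A^{\mathsf T} A)^{-1} A^{\mathsf T} + B (B^{\mathsf T} B)^{-1} B^{\mathsf T}.
\end{aligned}$$

If the orthogonal condition is enhanced to $A^{\mathsf T} W B = A^{\mathsf T} W^{\mathsf T} B = 0$ with $W$ non-singular, the following holds:

$$I = \begin{bmatrix} A & B \end{bmatrix} \begin{bmatrix} (A^{\mathsf T} W A)^{-1} A^{\mathsf T} \\ (B^{\mathsf T} W B)^{-1} B^{\mathsf T} \end{bmatrix} W.$$

All these formulas also hold for complex inner product spaces, provided that the conjugate transpose is used instead of the transpose. Further details on sums of projectors can be found in Banerjee and Roy (2014).[9] Also see Banerjee (2004)[10] for application of sums of projectors in basic spherical trigonometry.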
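The formula for a non-orthonormal basis, A(AᵀA)⁻¹Aᵀ, can be tried out on a small case. The sketch below is restricted, as a simplifying assumption, to a real matrix A with exactly two linearly independent columns, so that AᵀA is 2×2 and can be inverted in closed form; the helper names and the example matrix are illustrative.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def inverse_2x2(m):
    """Exact inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    det = Fraction(det)
    return [[d / det, -b / det], [-c / det, a / det]]


def projection_onto_columns(A):
    """Return A (A^T A)^{-1} A^T for an n x 2 matrix A of full column rank."""
    At = transpose(A)
    return matmul(matmul(A, inverse_2x2(matmul(At, A))), At)


# Columns (1, 0, 1) and (0, 1, 1) span a plane in R^3; they are not orthonormal.
A = [[1, 0],
     [0, 1],
     [1, 1]]
P = projection_onto_columns(A)

# (1, 1, -1) is orthogonal to both columns, so P should send it to zero.
normal_image = matmul(P, [[1], [1], [-1]])
```

For this `A`, the result is one third of the matrix with rows (2, -1, 1), (-1, 2, 1), (1, 1, 2); it is symmetric and idempotent, fixes the columns of `A`, and annihilates the normal vector.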

Oblique projections

The term oblique projections is sometimes used to refer to non-orthogonal projections. These projections are also used to represent spatial figures in two-dimensional drawings (see oblique projection), though not as frequently as orthogonal projections. Whereas calculating the fitted value of an ordinary least squares regression requires an orthogonal projection, calculating the fitted value of an instrumental variables regression requires an oblique projection. A projection is defined by its kernel and the basis vectors used to characterize its range (which is a complement of the kernel). When these basis vectors are orthogonal to the kernel, then the projection is an orthogonal projection. When these basis vectors are not orthogonal to the kernel, the projection is an oblique projection, or just a projection.

A matrix representation formula for a nonzero projection operator

Let $P\colon V \to V$ be a linear operator such that $P^2 = P$ and assume that $P$ is not the zero operator. Let the vectors $u_1, \ldots, u_k$ form a basis for the range of $P$, and assemble these vectors in the $n \times k$ matrix $A$. Then $k \geq 1$, otherwise $k = 0$ and $P$ is the zero operator. The range and the kernel are complementary spaces, so the kernel has dimension $n - k$. It follows that the orthogonal complement of the kernel has dimension $k$. Let $v_1, \ldots, v_k$ form a basis for the orthogonal complement of the kernel of the projection, and assemble these vectors in the matrix $B$. Then the projection $P$ (with the condition $k \geq 1$) is given by

$$P = A (B^{\mathsf T} A)^{-1} B^{\mathsf T}.$$

This expression generalizes the formula for orthogonal projections given above.[11][12]

A standard proof of this expression is the following. For any vector $x$ in the vector space $V$, we can decompose $x = x_1 + x_2$, where the vector $x_1 = P(x)$ is in the image of $P$, and the vector $x_2 = x - P(x)$. So $P(x_2) = P(x) - P^2(x) = 0$, and hence $x_2$ is in the kernel of $P$. In other words, the vector $x_1$ is in the column space of $A$, so $x_1 = Aw$ for some $k$-dimensional vector $w$, and the vector $x_2$ satisfies $B^{\mathsf T} x_2 = 0$ because the columns of $B$ are orthogonal to the kernel. Putting these conditions together, we seek a vector $w$ such that $B^{\mathsf T}(x - Aw) = 0$. Since the matrices $A$ and $B$ have full rank $k$ by construction, and the range and kernel of $P$ intersect only in the zero vector, the $k \times k$ matrix $B^{\mathsf T} A$ is invertible. So the equation $B^{\mathsf T}(x - Aw) = 0$ gives the vector $w = (B^{\mathsf T} A)^{-1} B^{\mathsf T} x$. In this way, $Px = x_1 = Aw = A (B^{\mathsf T} A)^{-1} B^{\mathsf T} x$ for any vector $x \in V$, and hence $P = A (B^{\mathsf T} A)^{-1} B^{\mathsf T}$.

In the case that $P$ is an orthogonal projection, we can take $A = B$, and it follows that $P = A (A^{\mathsf T} A)^{-1} A^{\mathsf T}$. By using this formula, one can easily check that $P = P^{\mathsf T}$. In general, if the vector space is over the complex number field, one then uses the Hermitian transpose $A^*$ and has the formula $P = A (A^* A)^{-1} A^*$. Recall that one can express the Moore–Penrose inverse of the matrix $A$ by $A^+ = (A^* A)^{-1} A^*$ since $A$ has full column rank, so $P = A A^+$.
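For a one-dimensional range (k = 1), the matrices A and B reduce to single column vectors a and b, and the formula becomes the rank-one matrix a bᵀ / (bᵀ a). The sketch below illustrates this special case only; the function name and the chosen vectors are assumptions. With a = (0, 1) and b = (2, 1) it reproduces the 2×2 oblique example from earlier in the article with α = 2.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def rank_one_projection(a, b):
    """Return a b^T / (b^T a) for integer vectors a and b.

    The result projects onto span{a} along the hyperplane orthogonal to b.
    """
    denom = sum(ai * bi for ai, bi in zip(a, b))
    if denom == 0:
        raise ValueError("b^T a must be nonzero")
    return [[Fraction(ai * bj, denom) for bj in b] for ai in a]


# Range spanned by a = (0, 1); kernel spanned by (1, -2), whose
# orthogonal complement is spanned by b = (2, 1).
P_oblique = rank_one_projection([0, 1], [2, 1])

# Taking b = a gives the orthogonal projection onto span{a}.
P_orthogonal = rank_one_projection([1, 1], [1, 1])
```

`P_oblique` has rows (0, 0) and (2, 1), and `P_orthogonal` has all four entries equal to 1/2; both are idempotent.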

Singular values

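$I - P$ is also an oblique projection. The singular values of $P$ and $I - P$ can be computed by an orthonormal basis of $A$. Let $Q_A$ be an orthonormal basis of $A$ and let $Q_A^\perp$ be the orthogonal complement of $Q_A$. Denote the singular values of the matrix $Q_A^{\mathsf T} A (B^{\mathsf T} A)^{-1} B^{\mathsf T} Q_A^\perp$ by the positive values $\gamma_1 \geq \gamma_2 \geq \ldots \geq \gamma_k$. With this, the singular values for $P$ are:[13]

$$\sigma_i = \begin{cases} \sqrt{1 + \gamma_i^2} & 1 \leq i \leq k \\ 0 & \text{otherwise} \end{cases}$$

and the singular values for $I - P$ are

$$\sigma_i = \begin{cases} \sqrt{1 + \gamma_i^2} & 1 \leq i \leq k \\ 1 & k + 1 \leq i \leq n - k \\ 0 & \text{otherwise.} \end{cases}$$

This implies that the largest singular values of $P$ and $I - P$ are equal, and thus that the matrix norms of the two oblique projections are the same. However, the condition number satisfies the relation $\kappa(I - P) = \frac{\sigma_1}{1} \geq \frac{\sigma_1}{\sigma_k} = \kappa(P)$, and is therefore not necessarily equal.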

Finding projection with an inner product

Let $V$ be a vector space (in this case a plane) spanned by orthogonal vectors $u_1, u_2, \ldots, u_p$. Let $y$ be a vector. One can define a projection of $y$ onto $V$ as

$$\operatorname{proj}_V y = \frac{y \cdot u^i}{u^i \cdot u^i}\, u^i,$$

where repeated indices are summed over (Einstein sum notation). The vector $y$ can be written as an orthogonal sum such that $y = \operatorname{proj}_V y + z$. $\operatorname{proj}_V y$ is sometimes denoted as $\hat{y}$. There is a theorem in linear algebra that states that the norm of this $z$ is the smallest distance (the orthogonal distance) from $y$ to $V$; this fact is commonly used in areas such as machine learning.

Figure: y is being projected onto the vector space V.
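The inner-product formula for projecting onto a span of mutually orthogonal vectors translates directly into code. The sketch below uses exact rational arithmetic; the function names `dot` and `project` and the sample vectors are illustrative assumptions, and the routine is only valid when the supplied basis vectors are nonzero and pairwise orthogonal.

```python
from fractions import Fraction


def dot(a, b):
    """Standard dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project(y, basis):
    """Orthogonal projection of y onto span(basis).

    Assumes the integer basis vectors are nonzero and mutually orthogonal.
    """
    result = [Fraction(0)] * len(y)
    for u in basis:
        coeff = Fraction(dot(y, u), dot(u, u))      # (y . u) / (u . u)
        result = [r + coeff * ui for r, ui in zip(result, u)]
    return result


# Two orthogonal vectors spanning the xy-plane in R^3.
basis = [[1, 1, 0], [1, -1, 0]]
y = [2, 3, 4]

y_hat = project(y, basis)                    # the projection of y onto V
z = [yi - hi for yi, hi in zip(y, y_hat)]    # the residual y - proj_V(y)
```

Here `y_hat` equals (2, 3, 0) and the residual `z` equals (0, 0, 4), which is orthogonal to both basis vectors.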

Canonical forms

Any projection $P = P^2$ on a vector space of dimension $d$ over a field is a diagonalizable matrix, since its minimal polynomial divides $x^2 - x$, which splits into distinct linear factors. Thus there exists a basis in which $P$ has the form

$$P = I_r \oplus 0_{d-r}$$

where $r$ is the rank of $P$. Here $I_r$ is the identity matrix of size $r$, $0_{d-r}$ is the zero matrix of size $d - r$, and $\oplus$ is the direct sum operator. If the vector space is complex and equipped with an inner product, then there is an orthonormal basis in which the matrix of $P$ is[14]

$$P = \begin{bmatrix} 1 & \sigma_1 \\ 0 & 0 \end{bmatrix} \oplus \cdots \oplus \begin{bmatrix} 1 & \sigma_k \\ 0 & 0 \end{bmatrix} \oplus I_m \oplus 0_s,$$

where $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_k > 0$. The integers $k, s, m$ and the real numbers $\sigma_i$ are uniquely determined, and $2k + s + m = d$. The factor $I_m \oplus 0_s$ corresponds to the maximal invariant subspace on which $P$ acts as an orthogonal projection (so that $P$ itself is orthogonal if and only if $k = 0$) and the $\sigma_i$-blocks correspond to the oblique components.

Projections on normed vector spaces

When the underlying vector space $X$ is a (not necessarily finite-dimensional) normed vector space, analytic questions, irrelevant in the finite-dimensional case, need to be considered. Assume now $X$ is a Banach space. Many of the algebraic results discussed above survive the passage to this context. A given direct sum decomposition of $X$ into complementary subspaces still specifies a projection, and vice versa. If $X$ is the direct sum $X = U \oplus V$, then the operator defined by $P(u + v) = u$ is still a projection with range $U$ and kernel $V$. It is also clear that $P^2 = P$. Conversely, if $P$ is a projection on $X$, i.e. $P^2 = P$, then it is easily verified that $(1 - P)^2 = (1 - P)$. In other words, $1 - P$ is also a projection. The relation $P^2 = P$ implies $1 = P + (1 - P)$, and $X$ is the direct sum $\operatorname{rg}(P) \oplus \operatorname{rg}(1 - P)$.

However, in contrast to the finite-dimensional case, projections need not be continuous in general. If a subspace $U$ of $X$ is not closed in the norm topology, then the projection onto $U$ is not continuous. In other words, the range of a continuous projection $P$ must be a closed subspace. Furthermore, the kernel of a continuous projection (in fact, a continuous linear operator in general) is closed. Thus a continuous projection $P$ gives a decomposition of $X$ into two complementary closed subspaces: $X = \operatorname{rg}(P) \oplus \ker(P) = \ker(1 - P) \oplus \ker(P)$.

The converse holds also, with an additional assumption. Suppose $U$ is a closed subspace of $X$. If there exists a closed subspace $V$ such that $X = U \oplus V$, then the projection $P$ with range $U$ and kernel $V$ is continuous. This follows from the closed graph theorem. Suppose $x_n \to x$ and $Px_n \to y$. One needs to show that $Px = y$. Since $U$ is closed and $\{Px_n\} \subset U$, $y$ lies in $U$, i.e. $Py = y$. Also, $x_n - Px_n = (I - P)x_n \to x - y$. Because $V$ is closed and $\{(I - P)x_n\} \subset V$, we have $x - y \in V$, i.e. $P(x - y) = Px - Py = Px - y = 0$, which proves the claim. The above argument makes use of the assumption that both $U$ and $V$ are closed.

In general, given a closed subspace $U$, there need not exist a complementary closed subspace $V$, although for Hilbert spaces this can always be done by taking the orthogonal complement. For Banach spaces, a one-dimensional subspace always has a closed complementary subspace. This is an immediate consequence of the Hahn–Banach theorem. Let $U$ be the linear span of $u$. By Hahn–Banach, there exists a bounded linear functional $\varphi$ such that $\varphi(u) = 1$. The operator $P(x) = \varphi(x)u$ satisfies $P^2 = P$, i.e. it is a projection. Boundedness of $\varphi$ implies continuity of $P$, and therefore $\ker(P) = \operatorname{rg}(I - P)$ is a closed complementary subspace of $U$.

Applications and further considerations

Projections (orthogonal and otherwise) play a major role in algorithms for certain linear algebra problems.

As stated above, projections are a special case of idempotents. Analytically, orthogonal projections are non-commutative generalizations of characteristic functions. Idempotents are used in classifying, for instance, semisimple algebras, while measure theory begins with considering characteristic functions of measurable sets. Therefore, as one can imagine, projections are very often encountered in the context of operator algebras. In particular, a von Neumann algebra is generated by its complete lattice of projections.

Generalizations

More generally, given a map between normed vector spaces $T\colon V \to W$, one can analogously ask for this map to be an isometry on the orthogonal complement of the kernel: that $(\ker T)^\perp \to W$ be an isometry (compare Partial isometry); in particular it must be onto. The case of an orthogonal projection is when $W$ is a subspace of $V$. In Riemannian geometry, this is used in the definition of a Riemannian submersion.

Notes

  1. Meyer, pp. 386–387
  2. Horn, Roger A.; Johnson, Charles R. (2013). Matrix Analysis, second edition. Cambridge University Press. ISBN 9780521839402.
  3. Meyer, p. 433
  4. Meyer, p. 431
  5. Meyer, equation (5.13.4)
  6. Banerjee, Sudipto; Roy, Anindya (2014), Linear Algebra and Matrix Analysis for Statistics, Texts in Statistical Science (1st ed.), Chapman and Hall/CRC, ISBN 978-1420095388
  7. Meyer, equation (5.13.3)
  8. See also Linear least squares (mathematics) § Properties of the least-squares estimators.
  9. Banerjee, Sudipto; Roy, Anindya (2014), Linear Algebra and Matrix Analysis for Statistics, Texts in Statistical Science (1st ed.), Chapman and Hall/CRC, ISBN 978-1420095388
  10. Banerjee, Sudipto (2004), "Revisiting Spherical Trigonometry with Orthogonal Projectors", The College Mathematics Journal, 35 (5): 375–381, doi:10.1080/07468342.2004.11922099, S2CID 122277398
  11. Banerjee, Sudipto; Roy, Anindya (2014), Linear Algebra and Matrix Analysis for Statistics, Texts in Statistical Science (1st ed.), Chapman and Hall/CRC, ISBN 978-1420095388
  12. Meyer, equation (7.10.39)
  13. Brust, J. J.; Marcia, R. F.; Petra, C. G. (2020), "Computationally Efficient Decompositions of Oblique Projection Matrices", SIAM Journal on Matrix Analysis and Applications, 41 (2): 852–870, doi:10.1137/19M1288115, OSTI 1680061, S2CID 219921214
  14. Doković, D. Ž. (August 1991). "Unitary similarity of projectors". Aequationes Mathematicae. 42 (1): 220–224. doi:10.1007/BF01818492. S2CID 122704926.

References

  • Banerjee, Sudipto; Roy, Anindya (2014), Linear Algebra and Matrix Analysis for Statistics, Texts in Statistical Science (1st ed.), Chapman and Hall/CRC, ISBN 978-1420095388
  • Dunford, N.; Schwartz, J. T. (1958). Linear Operators, Part I: General Theory. Interscience.
  • Meyer, Carl D. (2000). Matrix Analysis and Applied Linear Algebra. Society for Industrial and Applied Mathematics. ISBN 978-0-89871-454-8.
  • Brezinski, Claude: Projection Methods for Systems of Equations, North-Holland, ISBN 0-444-82777-3 (1997).
