In the field of mathematics, norms are defined for elements within a vector space. Specifically, when the vector space comprises matrices, such norms are referred to as matrix norms. Matrix norms differ from vector norms in that they must also interact with matrix multiplication.
Given a field $K$ of either real or complex numbers (or any complete subset thereof), let $K^{m\times n}$ be the $K$-vector space of matrices with $m$ rows and $n$ columns and entries in the field $K$. A matrix norm is a norm on $K^{m\times n}$.
Norms are often expressed with double vertical bars (like so: $\|A\|$). Thus, the matrix norm is a function $\|\cdot\| : K^{m\times n} \to \mathbb{R}$ that must satisfy the following properties:[1][2]
For all scalars $\alpha \in K$ and matrices $A, B \in K^{m\times n}$,
$\|A\| \ge 0$ (positive-valued)
$\|A\| = 0 \iff A = 0_{m,n}$ (definite)
$\|\alpha A\| = |\alpha| \, \|A\|$ (absolutely homogeneous)
$\|A + B\| \le \|A\| + \|B\|$ (sub-additive or satisfying the triangle inequality)
The only feature distinguishing matrices from rearranged vectors is multiplication. Matrix norms are particularly useful if they are also sub-multiplicative:[1][2][3]

$$\|AB\| \le \|A\| \, \|B\|.$$
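As a quick numerical illustration (a minimal sketch assuming NumPy; the random test matrices and tolerance are ours), the snippet below spot-checks sub-multiplicativity for the Frobenius norm:

```python
import numpy as np

rng = np.random.default_rng(0)

# Spot-check ||AB|| <= ||A|| ||B|| for the Frobenius norm on random matrices.
for _ in range(1000):
    A = rng.standard_normal((4, 4))
    B = rng.standard_normal((4, 4))
    assert np.linalg.norm(A @ B) <= np.linalg.norm(A) * np.linalg.norm(B) + 1e-12
```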
Suppose a vector norm $\|\cdot\|_\alpha$ on $K^n$ and a vector norm $\|\cdot\|_\beta$ on $K^m$ are given. Any matrix $A$ induces a linear operator from $K^n$ to $K^m$ with respect to the standard basis, and one defines the corresponding induced norm or operator norm or subordinate norm on the space of all $m \times n$ matrices as follows:

$$\|A\|_{\alpha,\beta} = \sup\{\|Ax\|_\beta : x \in K^n,\ \|x\|_\alpha = 1\} = \sup\left\{\frac{\|Ax\|_\beta}{\|x\|_\alpha} : x \in K^n,\ x \ne 0\right\},$$

where $\sup$ denotes the supremum. This norm measures how much the mapping induced by $A$ can stretch vectors. Depending on the vector norms $\|\cdot\|_\alpha$, $\|\cdot\|_\beta$ used, notation other than $\|\cdot\|_{\alpha,\beta}$ can be used for the operator norm.
If the p-norm for vectors ($1 \le p \le \infty$) is used for both spaces $K^n$ and $K^m$, then the corresponding operator norm is:[2]

$$\|A\|_p = \sup_{x \ne 0} \frac{\|Ax\|_p}{\|x\|_p}.$$

These induced norms are different from the "entry-wise" p-norms and the Schatten p-norms for matrices treated below, which are also usually denoted by $\|A\|_p$.
Geometrically speaking, one can imagine a p-norm unit ball $V_{p,n} = \{x \in K^n : \|x\|_p \le 1\}$ in $K^n$, then apply the linear map $A$ to the ball. It would end up becoming a distorted convex shape $A V_{p,n} \subset K^m$, and $\|A\|_p$ measures the longest "radius" of the distorted convex shape. In other words, we must take a p-norm unit ball $V_{p,m}$ in $K^m$, then multiply it by at least $\|A\|_p$, in order for it to be large enough to contain $A V_{p,n}$.
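To make the supremum concrete, the sketch below (assuming NumPy; the sampling approach and the helper name `sampled_operator_norm` are ours) lower-bounds the induced p-norm by sampling random vectors, and compares it against NumPy's built-in induced matrix norms for $p = 1, 2, \infty$:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 5))

def sampled_operator_norm(A, p, trials=200_000):
    """Lower-bound sup_{x != 0} ||Ax||_p / ||x||_p by random sampling."""
    x = rng.standard_normal((trials, A.shape[1]))
    num = np.linalg.norm(x @ A.T, ord=p, axis=1)
    den = np.linalg.norm(x, ord=p, axis=1)
    return (num / den).max()

for p in (1, 2, np.inf):
    # The sampled value never exceeds the exact induced norm.
    print(p, sampled_operator_norm(A, p), np.linalg.norm(A, ord=p))
```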
When $p = 2$ (the Euclidean norm or $\ell_2$-norm for vectors), the induced matrix norm is the spectral norm. (The two values do not coincide in infinite dimensions; see Spectral radius for further discussion.) The spectral radius should not be confused with the spectral norm. The spectral norm of a matrix $A$ is the largest singular value of $A$, i.e., the square root of the largest eigenvalue of the matrix $A^*A$, where $A^*$ denotes the conjugate transpose of $A$:[5]

$$\|A\|_2 = \sqrt{\lambda_{\max}(A^*A)} = \sigma_{\max}(A),$$

where $\sigma_{\max}(A)$ represents the largest singular value of matrix $A$.
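The stated equalities can be checked directly (a minimal sketch assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3))

sigma_max = np.linalg.svd(A, compute_uv=False)[0]       # largest singular value
lam_max = np.linalg.eigvalsh(A.conj().T @ A)[-1]        # largest eigenvalue of A*A
assert np.isclose(sigma_max, np.sqrt(lam_max))
assert np.isclose(sigma_max, np.linalg.norm(A, ord=2))  # induced 2-norm
```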
We can generalize the above definition. Suppose we have vector norms $\|\cdot\|_\alpha$ and $\|\cdot\|_\beta$ for spaces $K^n$ and $K^m$ respectively; the corresponding operator norm is

$$\|A\|_{\alpha,\beta} = \sup\{\|Ax\|_\beta : x \in K^n,\ \|x\|_\alpha = 1\}.$$

In particular, the $\|A\|_p$ defined previously is the special case of $\|A\|_{p,p}$.
In the special cases of $\alpha = 2$ and $\beta = \infty$, the induced matrix norm can be computed by

$$\|A\|_{2,\infty} = \max_{1 \le i \le m} \|A_{i:}\|_2,$$

where $A_{i:}$ is the $i$-th row of matrix $A$.
In the special cases of $\alpha = 1$ and $\beta = 2$, the induced matrix norm can be computed by

$$\|A\|_{1,2} = \max_{1 \le j \le n} \|A_{:j}\|_2,$$

where $A_{:j}$ is the $j$-th column of matrix $A$.
Hence, $\|A\|_{2,\infty}$ and $\|A\|_{1,2}$ are the maximum row and column 2-norms of the matrix, respectively.
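A short sketch of these two formulas, assuming NumPy (the variable names are ours):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 6))

norm_2_inf = np.linalg.norm(A, axis=1).max()  # maximum row 2-norm    = ||A||_{2,inf}
norm_1_2   = np.linalg.norm(A, axis=0).max()  # maximum column 2-norm = ||A||_{1,2}

# Consistency with the definition: ||Ax||_inf <= ||A||_{2,inf} ||x||_2.
x = rng.standard_normal(6)
assert np.linalg.norm(A @ x, np.inf) <= norm_2_inf * np.linalg.norm(x) + 1e-12
```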
Suppose $\|\cdot\|_{\alpha,\alpha}$ is an operator norm on the space of square matrices $K^{n\times n}$ induced by vector norms $\|\cdot\|_\alpha$ and $\|\cdot\|_\alpha$. Then, the operator norm is a sub-multiplicative matrix norm:

$$\|AB\|_{\alpha,\alpha} \le \|A\|_{\alpha,\alpha} \, \|B\|_{\alpha,\alpha}.$$
Moreover, any such norm satisfies the inequality

$$\|A^r\|^{1/r} \ge \rho(A) \qquad (1)$$

for all positive integers $r$, where $\rho(A)$ is the spectral radius of $A$. For symmetric or hermitian $A$, we have equality in (1) for the 2-norm, since in this case the 2-norm is precisely the spectral radius of $A$. For an arbitrary matrix, we may not have equality for any norm; a counterexample would be

$$A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix},$$

which has vanishing spectral radius. In any case, for any matrix norm, we have the spectral radius formula:

$$\lim_{r \to \infty} \|A^r\|^{1/r} = \rho(A).$$
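The spectral radius formula and the counterexample can be illustrated numerically (a sketch assuming NumPy; the Frobenius norm stands in for "any matrix norm"):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 5))
rho = np.abs(np.linalg.eigvals(A)).max()  # spectral radius

# Spectral radius formula: ||A^r||^(1/r) -> rho(A) as r grows.
for r in (1, 10, 100):
    print(r, np.linalg.norm(np.linalg.matrix_power(A, r)) ** (1.0 / r), rho)

# Counterexample to equality at finite r: nilpotent matrix, rho = 0 but norm > 0.
N = np.array([[0.0, 1.0], [0.0, 0.0]])
print(np.abs(np.linalg.eigvals(N)).max(), np.linalg.norm(N, 2))  # 0.0 vs 1.0
```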
If the vector norms $\|\cdot\|_\alpha$ and $\|\cdot\|_\beta$ are given in terms of energy norms based on symmetric positive definite matrices $P$ and $Q$ respectively, the resulting operator norm is given as

$$\|A\|_{P,Q} = \sup_{x \ne 0} \frac{\|Ax\|_Q}{\|x\|_P}, \qquad \text{where } \|x\|_P = \sqrt{x^* P x} \text{ and } \|y\|_Q = \sqrt{y^* Q y}.$$
Using the symmetric matrix square roots $P^{1/2}$ and $Q^{1/2}$ of $P$ and $Q$ respectively, the operator norm can be expressed as the spectral norm of a modified matrix:

$$\|A\|_{P,Q} = \left\|Q^{1/2} A P^{-1/2}\right\|_2.$$
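A sketch of this computation, assuming NumPy; the helper `spd_sqrt` and the random SPD construction are ours, not from any library:

```python
import numpy as np

def spd_sqrt(M):
    """Symmetric square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.sqrt(w)) @ V.T

rng = np.random.default_rng(5)
n, m = 4, 3
A = rng.standard_normal((m, n))
# Random SPD matrices P (on the domain) and Q (on the codomain).
X = rng.standard_normal((n, n)); P = X @ X.T + n * np.eye(n)
Y = rng.standard_normal((m, m)); Q = Y @ Y.T + m * np.eye(m)

op_norm = np.linalg.norm(spd_sqrt(Q) @ A @ np.linalg.inv(spd_sqrt(P)), ord=2)

# Sanity check against the definition sup ||Ax||_Q / ||x||_P on sampled vectors.
x = rng.standard_normal((100_000, n))
num = np.sqrt(np.einsum('ij,jk,ik->i', x @ A.T, Q, x @ A.T))
den = np.sqrt(np.einsum('ij,jk,ik->i', x, P, x))
assert (num / den).max() <= op_norm + 1e-9
```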
A matrix norm $\|\cdot\|$ on $K^{m\times n}$ is called consistent with a vector norm $\|\cdot\|_\alpha$ on $K^n$ and a vector norm $\|\cdot\|_\beta$ on $K^m$, if:

$$\|Ax\|_\beta \le \|A\| \, \|x\|_\alpha$$

for all $A \in K^{m\times n}$ and all $x \in K^n$. In the special case of $m = n$ and $\alpha = \beta$, $\|\cdot\|$ is also called compatible with $\|\cdot\|_\alpha$.
All induced norms are consistent by definition. Also, any sub-multiplicative matrix norm on $K^{n\times n}$ induces a compatible vector norm on $K^n$ by defining $\|v\| := \|(v, v, \dots, v)\|$.
Let $(a_1, \ldots, a_n)$ be the columns of matrix $A$, each a vector of dimension $m$. From the original definition, the matrix presents $n$ data points in an $m$-dimensional space. The $L_{2,1}$ norm[6] is the sum of the Euclidean norms of the columns of the matrix:

$$\|A\|_{2,1} = \sum_{j=1}^n \|a_j\|_2 = \sum_{j=1}^n \left(\sum_{i=1}^m |a_{ij}|^2\right)^{1/2}.$$
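A small worked example, assuming NumPy:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 0.0],
              [2.0, 2.0]])  # columns a_1 = (1, 2, 2) and a_2 = (2, 0, 2)

# ||a_1||_2 + ||a_2||_2 = 3 + sqrt(8)
l21 = np.linalg.norm(A, axis=0).sum()
print(l21)  # 5.828...
```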
When $p = q = 2$ for the $L_{p,q}$ norm, it is called the Frobenius norm or the Hilbert–Schmidt norm, though the latter term is used more frequently in the context of operators on (possibly infinite-dimensional) Hilbert space. This norm can be defined in various ways:

$$\|A\|_F = \sqrt{\sum_{i=1}^m \sum_{j=1}^n |a_{ij}|^2} = \sqrt{\operatorname{trace}(A^* A)} = \sqrt{\sum_{i=1}^{\min\{m,n\}} \sigma_i^2(A)},$$
where the trace is the sum of diagonal entries, and $\sigma_i(A)$ are the singular values of $A$. The second equality is proven by explicit computation of $\operatorname{trace}(A^*A)$. The third equality is proven by singular value decomposition of $A$, and the fact that the trace is invariant under circular shifts.
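The three definitions can be compared numerically (a minimal sketch assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((4, 3))

f1 = np.sqrt((np.abs(A) ** 2).sum())                           # entry-wise sum
f2 = np.sqrt(np.trace(A.conj().T @ A))                         # trace of A*A
f3 = np.sqrt((np.linalg.svd(A, compute_uv=False) ** 2).sum())  # singular values
assert np.allclose([f1, f2], [f3, f3])
```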
The Frobenius norm is an extension of the Euclidean norm to $K^{m\times n}$ and comes from the Frobenius inner product on the space of all matrices.
The Frobenius norm is sub-multiplicative and is very useful for numerical linear algebra. The sub-multiplicativity of the Frobenius norm can be proved using the Cauchy–Schwarz inequality. In fact, it is more than sub-multiplicative, as

$$\|AB\|_F \le \|A\|_2 \, \|B\|_F,$$

where $\|A\|_2$ is the operator 2-norm (spectral norm).
The Frobenius norm is often easier to compute than induced norms, and has the useful property of being invariant under rotations (and unitary operations in general). That is, $\|A\|_F = \|UA\|_F = \|AV\|_F$ for any unitary matrices $U$ and $V$. This property follows from the cyclic nature of the trace ($\operatorname{trace}(XYZ) = \operatorname{trace}(ZXY)$):

$$\|AV\|_F^2 = \operatorname{trace}\big((AV)^* AV\big) = \operatorname{trace}\big(V^* A^* A V\big) = \operatorname{trace}\big(A^* A V V^*\big) = \operatorname{trace}(A^* A) = \|A\|_F^2,$$

and analogously:

$$\|UA\|_F^2 = \operatorname{trace}\big((UA)^* UA\big) = \operatorname{trace}\big(A^* U^* U A\big) = \operatorname{trace}(A^* A) = \|A\|_F^2,$$

where we have used the unitary nature of $U$ and $V$ (that is, $U^* U = U U^* = \mathbf{I}$ and $V^* V = V V^* = \mathbf{I}$).
It also satisfies

$$\|A^* A\|_F = \|A A^*\|_F \le \|A\|_F^2$$

and

$$\|A + B\|_F^2 = \|A\|_F^2 + \|B\|_F^2 + 2 \operatorname{Re}\big(\langle A, B\rangle_F\big),$$

where $\langle A, B\rangle_F$ is the Frobenius inner product, and Re is the real part of a complex number (irrelevant for real matrices).
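A numerical check of unitary invariance and the inner-product identity (a sketch assuming NumPy; generating a random orthogonal matrix via QR is our choice of technique):

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))
U, _ = np.linalg.qr(rng.standard_normal((4, 4)))  # random orthogonal (unitary) matrix

# Unitary invariance of the Frobenius norm.
assert np.isclose(np.linalg.norm(U @ A), np.linalg.norm(A))
assert np.isclose(np.linalg.norm(A @ U), np.linalg.norm(A))

# ||A + B||_F^2 = ||A||_F^2 + ||B||_F^2 + 2 Re<A, B>_F, with <A, B>_F = trace(A* B).
lhs = np.linalg.norm(A + B) ** 2
rhs = np.linalg.norm(A) ** 2 + np.linalg.norm(B) ** 2 + 2 * np.trace(A.T @ B)
assert np.isclose(lhs, rhs)
```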
The max norm is the elementwise norm in the limit as $p = q$ goes to infinity:

$$\|A\|_{\max} = \max_{i,j} |a_{ij}|.$$
This norm is not sub-multiplicative (see the counterexample below); but modifying the right-hand side to $\sqrt{mn} \, \max_{i,j} |a_{ij}|$ makes it so.
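A concrete counterexample with the $2 \times 2$ all-ones matrix, as a NumPy sketch:

```python
import numpy as np

J = np.ones((2, 2))
max_norm = lambda M: np.abs(M).max()

print(max_norm(J @ J), max_norm(J) * max_norm(J))  # 2.0 > 1.0: not sub-multiplicative

# Rescaling by sqrt(mn) restores sub-multiplicativity.
scaled = lambda M: np.sqrt(M.shape[0] * M.shape[1]) * np.abs(M).max()
print(scaled(J @ J) <= scaled(J) * scaled(J))      # True: 4.0 <= 4.0
```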
Note that in some literature (such as Communication complexity), an alternative definition of max-norm, also called the $\gamma_2$-norm, refers to the factorization norm:

$$\gamma_2(A) = \min_{U,V:\ A = UV^T} \|U\|_{2,\infty} \, \|V\|_{2,\infty} = \min_{U,V:\ A = UV^T} \max_{i,j} \|U_{i:}\|_2 \, \|V_{j:}\|_2.$$
The Schatten p-norms arise when applying the p-norm to the vector of singular values of a matrix.[2] If the singular values of the $m \times n$ matrix $A$ are denoted by $\sigma_i$, then the Schatten p-norm is defined by

$$\|A\|_p = \left(\sum_{i=1}^{\min\{m,n\}} \sigma_i^p(A)\right)^{1/p}.$$
These norms again share the notation with the induced and entry-wisep-norms, but they are different.
All Schatten norms are sub-multiplicative. They are also unitarily invariant, which means that $\|A\| = \|UAV\|$ for all matrices $A$ and all unitary matrices $U$ and $V$.
The most familiar cases are $p = 1, 2, \infty$. The case $p = 2$ yields the Frobenius norm, introduced before. The case $p = \infty$ yields the spectral norm, which is the operator norm induced by the vector 2-norm (see above). Finally, $p = 1$ yields the nuclear norm (also known as the trace norm, or the Ky Fan 'n'-norm[7]), defined as:

$$\|A\|_* = \operatorname{trace}\left(\sqrt{A^* A}\right) = \sum_{i=1}^{\min\{m,n\}} \sigma_i(A),$$

where $\sqrt{A^* A}$ denotes the positive semidefinite square root of $A^* A$.
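All three cases reduce to the singular values, as the following NumPy sketch shows (the 'fro' and 'nuc' orders are NumPy's built-in Frobenius and nuclear norms):

```python
import numpy as np

rng = np.random.default_rng(8)
A = rng.standard_normal((4, 3))
sigma = np.linalg.svd(A, compute_uv=False)

nuclear   = sigma.sum()                  # Schatten p = 1 (nuclear / trace norm)
frobenius = np.sqrt((sigma ** 2).sum())  # Schatten p = 2 (Frobenius norm)
spectral  = sigma.max()                  # Schatten p = inf (spectral norm)

assert np.isclose(nuclear, np.linalg.norm(A, 'nuc'))
assert np.isclose(frobenius, np.linalg.norm(A, 'fro'))
assert np.isclose(spectral, np.linalg.norm(A, 2))
```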
Another source of inspiration for matrix norms arises from considering a matrix as the adjacency matrix of a weighted, directed graph.[9] The so-called "cut norm" measures how close the associated graph is to being bipartite:

$$\|A\|_\square = \max_{S \subseteq [n],\, T \subseteq [m]} \left|\sum_{s \in S,\, t \in T} a_{t,s}\right|,$$

where $A \in K^{m\times n}$.[9][10][11] Equivalent definitions (up to a constant factor) impose the conditions $2|S| > n$ and $2|T| > m$; $S = T$; or $S \cap T = \emptyset$.[10]
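For small matrices, the cut norm (in the unnormalized form defined above) can be computed by brute force over all subset pairs; a sketch assuming NumPy, with a helper `cut_norm` of ours:

```python
import itertools
import numpy as np

def cut_norm(A):
    """Brute-force (unnormalized) cut norm: max over row subsets T and
    column subsets S of |sum_{t in T, s in S} A[t, s]|. Exponential cost;
    intended only for small matrices."""
    m, n = A.shape
    best = 0.0
    for r in range(m + 1):
        for T in itertools.combinations(range(m), r):
            col_sums = A[list(T), :].sum(axis=0) if T else np.zeros(n)
            for c in range(n + 1):
                for S in itertools.combinations(range(n), c):
                    best = max(best, abs(col_sums[list(S)].sum()))
    return best

A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])
print(cut_norm(A))  # 1.0
```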
The cut-norm is equivalent to the induced operator norm $\|\cdot\|_{\infty \to 1}$, which is itself equivalent to another norm, called the Grothendieck norm.[11]
To define the Grothendieck norm, first note that a linear operator $K^1 \to K^1$ is just a scalar, and thus extends to a linear operator on any $K^k \to K^k$. Moreover, given any choice of basis for $K^n$ and $K^m$, any linear operator $K^n \to K^m$ extends to a linear operator $(K^k)^n \to (K^k)^m$, by letting each matrix element act on elements of $K^k$ via scalar multiplication. The Grothendieck norm is the norm of that extended operator; in symbols:[11]

$$\|A\|_{G,k} = \sup_{x_j,\, y_i \in K^k;\ \|x_j\|_2 = \|y_i\|_2 = 1} \left|\sum_{i,j} a_{ij} \, \langle x_j, y_i \rangle\right|.$$
The Grothendieck norm depends on the choice of basis (usually taken to be the standard basis) and $k$.
For any two matrix norms $\|\cdot\|_\alpha$ and $\|\cdot\|_\beta$, we have that:

$$r \, \|A\|_\alpha \le \|A\|_\beta \le s \, \|A\|_\alpha$$

for some positive numbers $r$ and $s$, for all matrices $A \in K^{m\times n}$. In other words, all norms on $K^{m\times n}$ are equivalent; they induce the same topology on $K^{m\times n}$. This is true because the vector space $K^{m\times n}$ has the finite dimension $m \times n$.
Moreover, for every matrix norm $\|\cdot\|$ on $K^{n\times n}$ there exists a unique positive real number $k$ such that $\ell \, \|\cdot\|$ is a sub-multiplicative matrix norm for every $\ell \ge k$; to wit,

$$k = \sup\{\|AB\| : \|A\| \le 1,\ \|B\| \le 1\}.$$
A sub-multiplicative matrix norm $\|\cdot\|_\alpha$ is said to be minimal if there exists no other sub-multiplicative matrix norm $\|\cdot\|_\beta$ satisfying $\|\cdot\|_\beta < \|\cdot\|_\alpha$.
^ The condition only applies when the product is defined, such as the case of square matrices ($m = n$). More generally, multiplication of the matrices must be possible ($A \in K^{m\times n}$ and $B \in K^{n\times k}$), and further, the two norms $\|A\|$ and $\|B\|$ must either have the same definition, only differing in the matrix dimensions, or be two different types of norms that are nonetheless "consistent" (see the discussion of consistent and compatible norms above).
^ Malek-Shahmirzadi, Massoud (1983). "A characterization of certain classes of matrix norms". Linear and Multilinear Algebra. 13 (2): 97–99. doi:10.1080/03081088308817508. ISSN 0308-1087.
^ a b Horn, Roger A.; Johnson, Charles R. (2012). Matrix Analysis (2nd ed.). Cambridge, UK: Cambridge University Press. pp. 340–341. ISBN 978-1-139-77600-4. OCLC 817236655.
^ Carl D. Meyer, Matrix Analysis and Applied Linear Algebra, §5.2, p. 281, Society for Industrial & Applied Mathematics, June 2000.
^ Ding, Chris; Zhou, Ding; He, Xiaofeng; Zha, Hongyuan (June 2006). R1-PCA: Rotational invariant L1-norm principal component analysis for robust subspace factorization. 23rd International Conference on Machine Learning. ICML '06. Pittsburgh, PA: Association for Computing Machinery. pp. 281–288. doi:10.1145/1143844.1143880. ISBN 1-59593-383-2.
^ Ciarlet, Philippe G. (1989). Introduction to Numerical Linear Algebra and Optimisation. Cambridge, England: Cambridge University Press. p. 57. ISBN 0521327881.
^ a b Lovász, László (2012). "The cut distance". Large Networks and Graph Limits. AMS Colloquium Publications. Vol. 60. Providence, RI: American Mathematical Society. pp. 127–131. ISBN 978-0-8218-9085-1. Note that Lovász rescales ‖A‖□ to lie in [0, 1].