Equivalently, the matrix exponential e^(tX) is provided by the solution of the (matrix) differential equation
(d/dt) Y(t) = X Y(t), Y(0) = I.
When X is an n × n diagonal matrix then exp(X) will be an n × n diagonal matrix with each diagonal element equal to the ordinary exponential applied to the corresponding diagonal element of X.
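As a quick numerical illustration of the diagonal case, here is a minimal sketch using NumPy and SciPy's expm; the example matrix is arbitrary:

```python
import numpy as np
from scipy.linalg import expm

# Diagonal case: the matrix exponential acts entrywise on the diagonal.
X = np.diag([1.0, 2.0, 3.0])             # arbitrary example
direct = np.diag(np.exp(np.diag(X)))     # ordinary exp of each diagonal entry
assert np.allclose(expm(X), direct)
```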
Let X and Y be n × n complex matrices and let a and b be arbitrary complex numbers. We denote the n × n identity matrix by I and the zero matrix by 0. The matrix exponential satisfies the following properties.[2]
We begin with the properties that are immediate consequences of the definition as a power series:
- e^0 = I
- exp(X^T) = (exp X)^T, where X^T denotes the transpose of X
- exp(X*) = (exp X)*, where X* denotes the conjugate transpose of X
- If Y is invertible, then e^(YXY^(−1)) = Y e^X Y^(−1)
The next key result is the following: if XY = YX, then e^X e^Y = e^(X+Y).
The proof of this identity is the same as the standard power-series argument for the corresponding identity for the exponential of real numbers. That is to say, as long as X and Y commute, it makes no difference to the argument whether X and Y are numbers or matrices. It is important to note that this identity typically does not hold if X and Y do not commute (see the Golden–Thompson inequality below).
Consequences of the preceding identity are the following:
- e^(aX) e^(bX) = e^((a+b)X)
- e^X e^(−X) = I
One of the reasons for the importance of the matrix exponential is that it can be used to solve systems of linear ordinary differential equations. The solution of
(d/dt) y(t) = A y(t), y(0) = y_0,
where A is a constant matrix and y is a column vector, is given by
y(t) = e^(At) y_0.
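A short sketch of this solution formula in Python, assuming SciPy is available; the system A and initial condition are arbitrary example values, cross-checked against a generic integrator:

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

# y'(t) = A y(t), y(0) = y0  has solution  y(t) = e^{At} y0.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])   # arbitrary example system
y0 = np.array([1.0, 0.0])
t = 0.5

y_t = expm(A * t) @ y0

# Cross-check against a general-purpose ODE integrator.
sol = solve_ivp(lambda s, y: A @ y, (0.0, t), y0, rtol=1e-10, atol=1e-12)
assert np.allclose(y_t, sol.y[:, -1], atol=1e-6)
```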
The matrix exponential can also be used to solve the inhomogeneous equation
(d/dt) y(t) = A y(t) + z(t), y(0) = y_0.
See the section on applications below for examples.
There is no closed-form solution for differential equations of the form
(d/dt) y(t) = A(t) y(t), y(0) = y_0,
where A is not constant, but the Magnus series gives the solution as an infinite sum.
By Jacobi's formula, the determinant of the matrix exponential satisfies det(e^A) = e^(tr A). In addition to providing a computational tool, this formula demonstrates that a matrix exponential is always an invertible matrix. This follows from the fact that the right-hand side of the above equation is always non-zero, and so det(e^A) ≠ 0, which implies that e^A must be invertible.
In the real-valued case, the formula also shows that the map exp : M_n(R) → GL(n, R) is not surjective, in contrast to the complex case mentioned earlier. This follows from the fact that, for real-valued matrices, the right-hand side of the formula is always positive, while there exist invertible matrices with a negative determinant.
The matrix exponential of a real symmetric matrix is positive definite. Let X be an n × n real symmetric matrix and v a column vector. Using the elementary properties of the matrix exponential and of symmetric matrices, we have
v^T e^X v = v^T e^(X/2) e^(X/2) v = v^T (e^(X/2))^T e^(X/2) v = ‖e^(X/2) v‖^2 ≥ 0.
Since e^(X/2) is invertible, equality holds only for v = 0, and we have v^T e^X v > 0 for all non-zero v. Hence e^X is positive definite.
For any real numbers (scalars) x and y we know that the exponential function satisfies e^(x+y) = e^x e^y. The same is true for commuting matrices. If the matrices X and Y commute (meaning that XY = YX), then
e^(X+Y) = e^X e^Y.
However, for matrices that do not commute the above equality does not necessarily hold.
In the other direction, if X and Y are sufficiently small (but not necessarily commuting) matrices, we have
e^X e^Y = e^Z,
where Z may be computed as a series in commutators of X and Y by means of the Baker–Campbell–Hausdorff formula:[5]
Z = X + Y + (1/2)[X, Y] + (1/12)[X, [X, Y]] − (1/12)[Y, [X, Y]] + ⋯,
where the remaining terms are all iterated commutators involving X and Y. If X and Y commute, then all the commutators are zero and we have simply Z = X + Y.
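The failure of e^X e^Y = e^(X+Y) for non-commuting matrices, and the improvement from the first few BCH terms, can be seen numerically; the matrices below are arbitrary small examples:

```python
import numpy as np
from scipy.linalg import expm

def comm(A, B):
    return A @ B - B @ A

# Two small non-commuting matrices (arbitrary example values).
X = np.array([[0.0, 0.1], [0.0, 0.0]])
Y = np.array([[0.0, 0.0], [0.1, 0.0]])

lhs = expm(X) @ expm(Y)
print(np.max(np.abs(lhs - expm(X + Y))))   # nonzero: e^X e^Y != e^{X+Y}

# First few terms of the Baker-Campbell-Hausdorff series for Z.
Z = X + Y + comm(X, Y) / 2 \
    + (comm(X, comm(X, Y)) - comm(Y, comm(X, Y))) / 12
print(np.max(np.abs(lhs - expm(Z))))       # far smaller residual
```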
Inequalities for exponentials of Hermitian matrices
For Hermitian matrices A and B, the Golden–Thompson inequality states that
tr e^(A+B) ≤ tr(e^A e^B),
and there is no requirement of commutativity. There are counterexamples to show that the Golden–Thompson inequality cannot be extended to three matrices, and, in any event, tr(exp(A) exp(B) exp(C)) is not guaranteed to be real for Hermitian A, B, C. However, Lieb proved[7][8] that it can be generalized to three matrices if we modify the expression as follows:
tr e^(A+B+C) ≤ ∫_0^∞ tr[e^A (e^(−B) + t)^(−1) e^C (e^(−B) + t)^(−1)] dt.
The exponential of a matrix is always an invertible matrix. The inverse of e^X is given by e^(−X). This is analogous to the fact that the exponential of a complex number is always nonzero. The matrix exponential then gives us a map
exp : M_n(C) → GL(n, C)
from the space of all n × n matrices to the general linear group of degree n, i.e. the group of all n × n invertible matrices. In fact, this map is surjective, which means that every invertible matrix can be written as the exponential of some other matrix[9] (for this, it is essential to consider the field C of complex numbers and not R).
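A quick numerical check of invertibility, using an arbitrary random example; the determinant identity det(e^X) = e^(tr X) from above is printed for comparison:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 4))            # arbitrary real matrix

# (e^X)^{-1} = e^{-X}, so e^X is always invertible.
assert np.allclose(expm(X) @ expm(-X), np.eye(4))

# For real X, det(e^X) = e^{tr X} > 0: real matrices with negative
# determinant are invertible but are not exponentials of real matrices.
print(np.linalg.det(expm(X)), np.exp(np.trace(X)))
```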
The map t ↦ e^(tX) for real t defines a smooth curve in the general linear group that passes through the identity element at t = 0. The derivative of this curve (or tangent vector) at a point t is given by
(d/dt) e^(tX) = X e^(tX) = e^(tX) X. (1)
The derivative at t = 0 is just the matrix X, which is to say that X generates this one-parameter subgroup.
More generally,[10] for a generic t-dependent exponent X(t),
(d/dt) e^(X(t)) = ∫_0^1 e^(α X(t)) (dX(t)/dt) e^((1−α) X(t)) dα.
Taking the above expression e^(X(t)) outside the integral sign and expanding the integrand with the help of the Hadamard lemma, one can obtain the following useful expression for the derivative of the matrix exponent:[11]
(d/dt e^(X(t))) e^(−X(t)) = dX/dt + (1/2!)[X, dX/dt] + (1/3!)[X, [X, dX/dt]] + ⋯.
The coefficients in the expression above are different from what appears in the exponential. For a closed form, seederivative of the exponential map.
Directional derivatives when restricted to Hermitian matrices
Let A be an n × n Hermitian matrix with distinct eigenvalues. Let A = U diag(λ_1, …, λ_n) U* be its eigendecomposition, where U is a unitary matrix whose columns are the eigenvectors of A, U* is its conjugate transpose, and λ_1, …, λ_n are the corresponding eigenvalues. Then, for any Hermitian matrix V, the directional derivative of exp at A in the direction V is[12][13]
D exp(A)[V] = U (G ⊙ (U* V U)) U*,
where the operator ⊙ denotes the Hadamard product and, for all 1 ≤ i, j ≤ n, the matrix G is defined by
G_ij = (e^(λ_i) − e^(λ_j)) / (λ_i − λ_j) for i ≠ j, and G_ii = e^(λ_i).
In addition, for any Hermitian matrices V_1 and V_2, the second directional derivative of exp at A in the directions V_1 and V_2 admits an analogous closed form in terms of the second divided differences of the exponential at the eigenvalues.[13]
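The first-derivative formula can be checked numerically. The sketch below builds the divided-difference kernel G defined above; random_hermitian is a hypothetical helper, and the eigenvalues of a random Hermitian matrix are almost surely distinct:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)

def random_hermitian(n):
    M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (M + M.conj().T) / 2

A, V = random_hermitian(4), random_hermitian(4)
lam, U = np.linalg.eigh(A)                 # A = U diag(lam) U*

# Kernel of first divided differences of exp at the eigenvalues.
n = len(lam)
G = np.empty((n, n))
for i in range(n):
    for j in range(n):
        if i == j:
            G[i, j] = np.exp(lam[i])
        else:
            G[i, j] = (np.exp(lam[i]) - np.exp(lam[j])) / (lam[i] - lam[j])

deriv = U @ (G * (U.conj().T @ V @ U)) @ U.conj().T  # * is the Hadamard product

# Compare with a symmetric finite difference of expm.
h = 1e-6
fd = (expm(A + h * V) - expm(A - h * V)) / (2 * h)
print(np.max(np.abs(deriv - fd)))          # should be tiny
```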
Finding reliable and accurate methods to compute the matrix exponential is difficult, and this is still a topic of considerable current research in mathematics and numerical analysis. Matlab, GNU Octave, R, and SciPy all use the Padé approximant.[14][15][16][17] In this section, we discuss methods that are applicable in principle to any matrix, and which can be carried out explicitly for small matrices.[18] Subsequent sections describe methods suitable for numerical evaluation on large matrices.
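For illustration, SciPy's expm (which implements such a Padé-based method) can be compared with a naive truncation of the defining series; this is a sketch for small, well-scaled matrices only, not a robust algorithm:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 2.0], [-1.0, 3.0]])   # arbitrary small example
E = expm(A)                                # Pade-based scaling and squaring

# Naive truncated power series, adequate only for small matrices.
S = np.eye(2)
term = np.eye(2)
for k in range(1, 30):
    term = term @ A / k
    S = S + term
print(np.max(np.abs(E - S)))              # agreement near machine precision
```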
If a matrix A can be diagonalized, A = PDP^(−1), then
e^A = P e^D P^(−1),
which is especially easy to compute when D is diagonal.
Application of Sylvester's formula yields the same result. (To see this, note that addition and multiplication of diagonal matrices are equivalent to element-wise addition and multiplication, and hence so is exponentiation; in particular, the "one-dimensional" exponentiation is applied element-wise in the diagonal case.)
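A sketch of the diagonalization route, using a symmetric example so that the eigendecomposition is numerically safe:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[2.0, 1.0], [1.0, 2.0]])    # symmetric, hence diagonalizable
lam, P = np.linalg.eigh(A)                # A = P diag(lam) P^{-1}

E = P @ np.diag(np.exp(lam)) @ P.T        # e^A = P e^D P^{-1}; here P^{-1} = P^T
assert np.allclose(E, expm(A))
```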
A matrix N is nilpotent if N^q = 0 for some integer q. In this case, the matrix exponential e^N can be computed directly from the series expansion, as the series terminates after a finite number of terms:
e^N = I + N + N^2/2! + N^3/3! + ⋯ + N^(q−1)/(q−1)!.
Since the series has a finite number of terms, it is a matrix polynomial, which can be computed efficiently.
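A minimal sketch for a nilpotent example (the entries are arbitrary; the nilpotency index here is 3):

```python
import numpy as np
from math import factorial
from scipy.linalg import expm

# Strictly upper triangular, so N^3 = 0 (nilpotent of index 3).
N = np.array([[0.0, 1.0, 2.0],
              [0.0, 0.0, 3.0],
              [0.0, 0.0, 0.0]])

# The exponential series terminates: e^N = I + N + N^2/2!.
E = sum(np.linalg.matrix_power(N, k) / factorial(k) for k in range(3))
assert np.allclose(E, expm(N))
```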
A closely related method is, if the field is algebraically closed, to work with the Jordan form of X. Suppose that X = PJP^(−1) where J is the Jordan form of X. Then
e^X = P e^J P^(−1).
Also, since J is a direct sum of Jordan blocks,
J = J_{a_1}(λ_1) ⊕ J_{a_2}(λ_2) ⊕ ⋯ ⊕ J_{a_k}(λ_k),
we have
e^J = e^(J_{a_1}(λ_1)) ⊕ e^(J_{a_2}(λ_2)) ⊕ ⋯ ⊕ e^(J_{a_k}(λ_k)).
Therefore, we need only know how to compute the matrix exponential of a Jordan block. But each Jordan block is of the form
J_a(λ) = λI + N,
where N is a special nilpotent matrix. The matrix exponential of the block is then given by
e^(λI + N) = e^λ e^N.
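A short check of this factorization on a single Jordan block (the eigenvalue and block size are arbitrary example values):

```python
import numpy as np
from scipy.linalg import expm

lam, m = 2.0, 3
N = np.eye(m, k=1)                        # nilpotent shift: ones on superdiagonal
J = lam * np.eye(m) + N                   # Jordan block J_3(2)

eN = np.eye(m) + N + N @ N / 2            # terminating series for e^N
assert np.allclose(expm(J), np.exp(lam) * eN)   # e^{lam I + N} = e^lam e^N
```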
For a simple rotation in which the perpendicular unit vectors a and b specify a plane,[20] the rotation matrix R can be expressed in terms of a similar exponential function involving a generator G and angle θ:[21][22]
G = ba^T − ab^T, R(θ) = e^(Gθ) = I + G sin θ + G^2 (1 − cos θ).
The formula for the exponential results from reducing the powers of G in the series expansion and identifying the respective series coefficients of G^2 and G with −cos(θ) and sin(θ) respectively. The second expression here for e^(Gθ) is the same as the expression for R(θ) in the article containing the derivation of the generator, R(θ) = e^(Gθ).
In two dimensions, if a = (1, 0)^T and b = (0, 1)^T, then G = [[0, −1], [1, 0]], G^2 = −I, and
e^(Gθ) = I cos θ + G sin θ = [[cos θ, −sin θ], [sin θ, cos θ]]
reduces to the standard matrix for a plane rotation.
The matrix P = −G^2 projects a vector onto the ab-plane, and the rotation only affects this part of the vector. An example illustrating this is a rotation of 30° = π/6 in the plane spanned by a and b:
Let N = I − P, so N^2 = N and its products with P and G are zero. This will allow us to evaluate powers of R.
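A numerical sketch of the rotation construction, following the generator convention of the text; the plane and the 30-degree angle are the example values above:

```python
import numpy as np
from scipy.linalg import expm

# Perpendicular unit vectors spanning the rotation plane (example in R^3).
a = np.array([1.0, 0.0, 0.0])
b = np.array([0.0, 1.0, 0.0])
G = np.outer(b, a) - np.outer(a, b)       # generator G = b a^T - a b^T
theta = np.pi / 6                         # the 30-degree rotation above

P = -G @ G                                # projector onto the ab-plane
R = expm(G * theta)
assert np.allclose(R, np.eye(3) - P + P * np.cos(theta) + G * np.sin(theta))
```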
By virtue of the Cayley–Hamilton theorem, the matrix exponential of an n × n matrix is expressible as a polynomial of degree n − 1 in that matrix.
If P and Q_t are nonzero polynomials in one variable, such that P(A) = 0, and if the meromorphic function
f(z) = (e^(tz) − Q_t(z)) / P(z)
is entire, then
e^(tA) = Q_t(A).
To prove this, multiply the first of the two above equalities by P(z) and replace z by A.
Such a polynomial Q_t(z) can be found as follows (see Sylvester's formula). Letting a be a root of P, Q_{a,t}(z) is solved from the product of P by the principal part of the Laurent series of f at a: it is proportional to the relevant Frobenius covariant. Then the sum S_t of the Q_{a,t}, where a runs over all the roots of P, can be taken as a particular Q_t. All the other Q_t will be obtained by adding a multiple of P to S_t(z). In particular, S_t(z), the Lagrange–Sylvester polynomial, is the only Q_t whose degree is less than that of P.
Example: Consider the case of an arbitrary 2 × 2 matrix
A = [[a, b], [c, d]].
Thus, as indicated above, having decomposed the matrix A into the sum of two mutually commuting pieces, the traceful piece and the traceless piece,
A = (tr A / 2) I + (A − (tr A / 2) I),
the matrix exponential reduces to a plain product of the exponentials of the two respective pieces. This is a formula often used in physics, as it amounts to the analog of Euler's formula for the Pauli spin matrices, that is, rotations of the doublet representation of the group SU(2).
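A sketch of this splitting for an arbitrary 2 × 2 example; the closed form below uses B^2 = −det(B) I for a traceless 2 × 2 matrix B, and assumes −det(B) > 0 (otherwise cosh/sinh become cos/sin):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[2.0, 2.0], [3.0, 0.0]])    # arbitrary 2x2 example
s = np.trace(A) / 2                       # traceful piece is s*I
B = A - s * np.eye(2)                     # traceless piece; B^2 = -det(B) I
q = np.sqrt(-np.linalg.det(B))            # assumes -det(B) > 0

# Since sI and B commute, e^A = e^s (cosh(q) I + (sinh(q)/q) B).
E = np.exp(s) * (np.cosh(q) * np.eye(2) + (np.sinh(q) / q) * B)
assert np.allclose(E, expm(A))
```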
The polynomial S_t can also be given the following "interpolation" characterization. Define e_t(z) ≡ e^(tz) and n ≡ deg P. Then S_t(z) is the unique polynomial of degree < n which satisfies S_t^(k)(a) = e_t^(k)(a) whenever k is less than the multiplicity of a as a root of P. We assume, as we obviously can, that P is the minimal polynomial of A. We further assume that A is a diagonalizable matrix. In particular, the roots of P are simple, and the "interpolation" characterization indicates that S_t is given by the Lagrange interpolation formula, so it is the Lagrange–Sylvester polynomial.
At the other extreme, if P = (z − a)^n, then
S_t(z) = e^(at) Σ_{k=0}^{n−1} (t^k / k!) (z − a)^k.
The simplest case not covered by the above observations is when P = (z − a)(z − b) with a ≠ b, which yields
S_t(z) = (e^(at)(z − b) − e^(bt)(z − a)) / (a − b).
A practical, expedited computation of the above reduces to the following rapid steps. Recall from above that, by the Cayley–Hamilton theorem, an n × n matrix exp(tA) amounts to a linear combination of the powers I, A, …, A^(n−1). For diagonalizable matrices, as illustrated above, e.g. in the 2 × 2 case, Sylvester's formula yields
exp(tA) = B_α exp(tα) + B_β exp(tβ),
where the Bs are the Frobenius covariants of A.
It is easiest, however, to simply solve for these Bs directly, by evaluating this expression and its first derivative at t = 0, in terms of A and I, to find the same answer as above.
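A sketch of this direct approach for a diagonalizable 2 × 2 example; the two equations at t = 0 determine the Frobenius covariants:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 4.0], [1.0, 1.0]])    # arbitrary diagonalizable example
alpha, beta = np.linalg.eigvals(A)        # distinct eigenvalues 3 and -1

# Evaluating exp(tA) = B_a e^{alpha t} + B_b e^{beta t} and its t-derivative
# at t = 0 gives B_a + B_b = I and alpha B_a + beta B_b = A, hence:
B_a = (A - beta * np.eye(2)) / (alpha - beta)
B_b = (A - alpha * np.eye(2)) / (beta - alpha)

t = 0.7
E = B_a * np.exp(alpha * t) + B_b * np.exp(beta * t)
assert np.allclose(E, expm(t * A))
```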
But this simple procedure also works for defective matrices, in a generalization due to Buchheim.[23] This is illustrated here for a 4 × 4 example of a matrix which is not diagonalizable, and the Bs are not projection matrices.
Consider a 4 × 4 matrix A with eigenvalues λ_1 = 3/4 and λ_2 = 1, each with an algebraic multiplicity of two.
Consider the exponential of each eigenvalue multiplied by t, exp(λ_i t). Multiply each exponentiated eigenvalue by the corresponding undetermined coefficient matrix B_i. If the eigenvalues have an algebraic multiplicity greater than 1, then repeat the process, but now multiplying by an extra factor of t for each repetition, to ensure linear independence.
(If one eigenvalue had a multiplicity of three, then there would be the three terms B_{i1} e^(λ_i t), B_{i2} t e^(λ_i t), B_{i3} t^2 e^(λ_i t). By contrast, when all eigenvalues are distinct, the Bs are just the Frobenius covariants, and solving for them as below just amounts to the inversion of the Vandermonde matrix of these 4 eigenvalues.)
Sum all such terms, here four such:
e^(tA) = B_{11} e^(λ_1 t) + B_{12} t e^(λ_1 t) + B_{21} e^(λ_2 t) + B_{22} t e^(λ_2 t).
To solve for all of the unknown matrices B in terms of the first three powers of A and the identity, one needs four equations, the above one providing one such at t = 0. Further, differentiate it with respect to t,
A e^(tA) = λ_1 B_{11} e^(λ_1 t) + (λ_1 t + 1) B_{12} e^(λ_1 t) + λ_2 B_{21} e^(λ_2 t) + (λ_2 t + 1) B_{22} e^(λ_2 t),
and again,
A^2 e^(tA) = λ_1^2 B_{11} e^(λ_1 t) + (λ_1^2 t + 2λ_1) B_{12} e^(λ_1 t) + λ_2^2 B_{21} e^(λ_2 t) + (λ_2^2 t + 2λ_2) B_{22} e^(λ_2 t),
and once more,
A^3 e^(tA) = λ_1^3 B_{11} e^(λ_1 t) + (λ_1^3 t + 3λ_1^2) B_{12} e^(λ_1 t) + λ_2^3 B_{21} e^(λ_2 t) + (λ_2^3 t + 3λ_2^2) B_{22} e^(λ_2 t).
(In the general case, n − 1 derivatives need be taken.)
Setting t = 0 in these four equations, the four coefficient matrices Bs may now be solved for:
I = B_{11} + B_{21},
A = λ_1 B_{11} + B_{12} + λ_2 B_{21} + B_{22},
A^2 = λ_1^2 B_{11} + 2λ_1 B_{12} + λ_2^2 B_{21} + 2λ_2 B_{22},
A^3 = λ_1^3 B_{11} + 3λ_1^2 B_{12} + λ_2^3 B_{21} + 3λ_2^2 B_{22},
to yield the Bs as linear combinations of I, A, A^2, and A^3. Substituting the value for A then yields the explicit coefficient matrices B_{11}, B_{12}, B_{21}, B_{22}, and hence the final answer for e^(tA).
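Since the article's explicit 4 × 4 matrix is not reproduced here, the sketch below uses a hypothetical stand-in with the same spectrum (two Jordan blocks; not the article's matrix) to illustrate solving the four equations for the Bs numerically:

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical stand-in with eigenvalues 3/4 and 1, each of multiplicity two:
# two Jordan blocks J_2(3/4) and J_2(1). NOT the article's matrix.
l1, l2 = 0.75, 1.0
J2 = lambda lam: np.array([[lam, 1.0], [0.0, lam]])
A = np.block([[J2(l1), np.zeros((2, 2))],
              [np.zeros((2, 2)), J2(l2)]])

# The four equations at t = 0 form a linear system with right-hand
# sides I, A, A^2, A^3 and a confluent Vandermonde coefficient matrix.
M = np.array([[1,      0,       1,      0],
              [l1,     1,       l2,     1],
              [l1**2,  2*l1,    l2**2,  2*l2],
              [l1**3,  3*l1**2, l2**3,  3*l2**2]])
rhs = np.stack([np.linalg.matrix_power(A, k) for k in range(4)])
B11, B12, B21, B22 = np.linalg.solve(M, rhs.reshape(4, -1)).reshape(4, 4, 4)

t = 0.3
E = (B11 + B12 * t) * np.exp(l1 * t) + (B21 + B22 * t) * np.exp(l2 * t)
assert np.allclose(E, expm(t * A))
```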
The procedure is much shorter than Putzer's algorithm, sometimes utilized in such cases.
The exponential of a 1 × 1 matrix is just the exponential of the one entry of the matrix, so exp(J_1(4)) = [e^4]. The exponential of J_2(16) can be calculated by the formula e^(λI + N) = e^λ e^N mentioned above; this yields[24]
exp(J_2(16)) = e^16 [[1, 1], [0, 1]].
Therefore, the exponential of the original matrix B is
e^B = P exp(J) P^(−1),
with exp(J) the block-diagonal matrix assembled from exp(J_1(4)) and exp(J_2(16)) above.
The matrix exponential has applications to systems of linear differential equations. (See also matrix differential equation.) Recall from earlier in this article that a homogeneous differential equation of the form
y′ = Ay
has solution e^(At) y(0).
If we consider the vector
y(t) = (y_1(t), …, y_n(t))^T,
we can express a system of inhomogeneous coupled linear differential equations as
y′(t) = A y(t) + b(t).
Making an ansatz to use an integrating factor of e^(−At) and multiplying throughout yields
e^(−At) y′ − e^(−At) A y = e^(−At) b
e^(−At) y′ − A e^(−At) y = e^(−At) b
(d/dt)(e^(−At) y) = e^(−At) b.
The second step is possible due to the fact that, if AB = BA, then e^(At) B = B e^(At). So, calculating e^(At) leads to the solution to the system, by simply integrating the third step with respect to t.
A solution to this can be obtained by integrating and multiplying by e^(At) to eliminate the exponent in the LHS. Notice that while e^(At) is a matrix, given that it is a matrix exponential, we can say that e^(At) e^(−At) = I. In other words, (e^(At))^(−1) = e^(−At).
From before, we already have the general solution to the homogeneous equation. Since the sum of the homogeneous and particular solutions gives the general solution to the inhomogeneous problem, we now only need to find the particular solution.
We have, by the above,
y_p(t) = e^(At) ∫_0^t e^(−Aτ) b(τ) dτ + e^(At) c,
which could be further simplified to get the requisite particular solution determined through variation of parameters. Note c = y_p(0). For more rigor, see the following generalization.
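A numerical sketch of this variation-of-parameters formula, with arbitrary example A, forcing b, and initial condition, cross-checked against a direct ODE solve:

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import quad_vec, solve_ivp

A = np.array([[0.0, 1.0], [-2.0, -3.0]])  # arbitrary example system
b = lambda s: np.array([np.sin(s), 0.0])  # arbitrary forcing term
y0 = np.array([1.0, 0.0])
t = 1.0

# y(t) = e^{At} ( y(0) + integral_0^t e^{-A tau} b(tau) d tau )
integral, _ = quad_vec(lambda tau: expm(-A * tau) @ b(tau), 0.0, t)
y_t = expm(A * t) @ (y0 + integral)

# Cross-check with a direct numerical solve of y' = Ay + b.
sol = solve_ivp(lambda s, y: A @ y + b(s), (0.0, t), y0, rtol=1e-10, atol=1e-12)
assert np.allclose(y_t, sol.y[:, -1], atol=1e-6)
```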
Inhomogeneous case generalization: variation of parameters
Wilcox, R. M. (1967). "Exponential Operators and Parameter Differentiation in Quantum Physics". Journal of Mathematical Physics. 8 (4): 962–982. Bibcode:1967JMP.....8..962W. doi:10.1063/1.1705306.
This can be generalized; in general, the exponential of J_n(a) is an upper triangular matrix with e^a/0! on the main diagonal, e^a/1! on the one above, e^a/2! on the next one, and so on.
Hall, Brian C. (2015). Lie Groups, Lie Algebras, and Representations: An Elementary Introduction. Graduate Texts in Mathematics. Vol. 222 (2nd ed.). Springer. ISBN 978-3-319-13466-6.
Suzuki, Masuo (1985). "Decomposition formulas of exponential operators and Lie exponentials with some applications to quantum mechanics and statistical physics". Journal of Mathematical Physics. 26 (4): 601–612. Bibcode:1985JMP....26..601S. doi:10.1063/1.526596.