In mathematics, Ricci calculus constitutes the rules of index notation and manipulation for tensors and tensor fields on a differentiable manifold, with or without a metric tensor or connection.[a][1][2][3] It is also the modern name for what used to be called the absolute differential calculus (the foundation of tensor calculus), tensor calculus or tensor analysis, developed by Gregorio Ricci-Curbastro in 1887–1896 and subsequently popularized in a paper written with his pupil Tullio Levi-Civita in 1900.[4] Jan Arnoldus Schouten developed the modern notation and formalism for this mathematical framework, and made contributions to the theory, during its applications to general relativity and differential geometry in the early twentieth century.[5] The basis of modern tensor analysis was developed by Bernhard Riemann in a paper from 1861.[6]
A component of a tensor is a real number that is used as a coefficient of a basis element for the tensor space. The tensor is the sum of its components multiplied by their corresponding basis elements. Tensors and tensor fields can be expressed in terms of their components, and operations on tensors and tensor fields can be expressed in terms of operations on their components. The description of tensor fields and operations on them in terms of their components is the focus of the Ricci calculus. This notation allows an efficient expression of such tensor fields and operations. While much of the notation may be applied with any tensors, operations relating to a differential structure are only applicable to tensor fields. Where needed, the notation extends to components of non-tensors, particularly multidimensional arrays.
A tensor may be expressed as a linear sum of the tensor product of vector and covector basis elements. The resulting tensor components are labelled by indices of the basis. Each index has one possible value per dimension of the underlying vector space. The number of indices equals the degree (or order) of the tensor.
For compactness and convenience, the Ricci calculus incorporates Einstein notation, which implies summation over indices repeated within a term and universal quantification over free indices. Expressions in the notation of the Ricci calculus may generally be interpreted as a set of simultaneous equations relating the components as functions over a manifold, usually more specifically as functions of the coordinates on the manifold. This allows intuitive manipulation of expressions with familiarity of only a limited set of rules.
Tensor calculus has many applications in physics, engineering and computer science, including elasticity, continuum mechanics, electromagnetism (see mathematical descriptions of the electromagnetic field), general relativity (see mathematics of general relativity), quantum field theory, and machine learning.
Working with a main proponent of the exterior calculus, Élie Cartan, the influential geometer Shiing-Shen Chern summarizes the role of tensor calculus:[7]
In our subject of differential geometry, where you talk about manifolds, one difficulty is that the geometry is described by coordinates, but the coordinates do not have meaning. They are allowed to undergo transformation. And in order to handle this kind of situation, an important tool is the so-called tensor analysis, or Ricci calculus, which was new to mathematicians. In mathematics you have a function, you write down the function, you calculate, or you add, or you multiply, or you can differentiate. You have something very concrete. In geometry the geometric situation is described by numbers, but you can change your numbers arbitrarily. So to handle this, you need the Ricci calculus.
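Where a distinction is to be made between the space-like basis elements and a time-like element in the four-dimensional spacetime of classical physics, this is conventionally done through indices as follows:[8] lowercase Latin letters a, b, c, ... indicate restriction to 3-dimensional Euclidean space and take the values 1, 2, 3 for the spatial components, with the time-like element, indicated by 0, shown separately; lowercase Greek letters α, β, γ, ... are used for 4-dimensional spacetime and typically take the value 0 for the time component and 1, 2, 3 for the spatial components.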
Some sources use 4 instead of 0 as the index value corresponding to time; in this article, 0 is used. Otherwise, in general mathematical contexts, any symbols can be used for the indices, generally running over all dimensions of the vector space.
The author(s) will usually make it clear whether a subscript is intended as an index or as a label.
For example, in 3-D Euclidean space and using Cartesian coordinates, the coordinate vector $\mathbf{A} = (A_1, A_2, A_3) = (A_x, A_y, A_z)$ shows a direct correspondence between the subscripts 1, 2, 3 and the labels x, y, z. In the expression $A_i$, $i$ is interpreted as an index ranging over the values 1, 2, 3, while the x, y, z subscripts are only labels, not variables. In the context of spacetime, the index value 0 conventionally corresponds to the label t.
Indices themselves may be labelled using diacritic-like symbols, such as a hat (ˆ), bar (¯), tilde (˜), or prime (′), as in:
$$X_{\hat{\phi}},\quad Y_{\bar{\lambda}},\quad Z_{\tilde{\eta}},\quad T_{\mu'}$$
to denote a possibly different basis for that index. An example is in Lorentz transformations from one frame of reference to another, where one frame could be unprimed and the other primed, as in:
$$v^{\mu'} = v^{\nu} L_{\nu}{}^{\mu'}.$$
This is not to be confused with van der Waerden notation for spinors, which uses hats and overdots on indices to reflect the chirality of a spinor.
Ricci calculus, and index notation more generally, distinguishes between lower indices (subscripts) and upper indices (superscripts); the latter are not exponents, even though they may look like exponents to a reader familiar only with other parts of mathematics.
In the special case that the metric tensor is everywhere equal to the identity matrix, it is possible to drop the distinction between upper and lower indices, and then all indices could be written in the lower position. Coordinate formulae in linear algebra such as for the product of matrices may be examples of this. But in general, the distinction between upper and lower indices should be maintained.
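A lower index (subscript) indicates covariance of the components with respect to that index:
$$A_{\alpha\beta\gamma\cdots}$$
An upper index (superscript) indicates contravariance of the components with respect to that index:
$$A^{\alpha\beta\gamma\cdots}$$
A tensor may have both upper and lower indices:
$$A_{\alpha}{}^{\beta}{}_{\gamma}{}^{\delta\cdots}$$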
Ordering of indices is significant, even when of differing variance. However, when it is understood that no indices will be raised or lowered while retaining the base symbol, covariant indices are sometimes placed below contravariant indices for notational convenience (e.g. with the generalized Kronecker delta).
The number of upper and of lower indices of a tensor gives its type: a tensor with p upper and q lower indices is said to be of type (p, q), or to be a type-(p, q) tensor.
The number of indices of a tensor, regardless of variance, is called the degree of the tensor (alternatively, its valence, order or rank, although rank is ambiguous). Thus, a tensor of type (p, q) has degree p + q.
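The same symbol occurring twice (one upper and one lower) within a term indicates a pair of indices that are summed over:
$$A_{\alpha} B^{\alpha} \equiv \sum_{\alpha} A_{\alpha} B^{\alpha}$$
The operation implied by such a summation is called tensor contraction:
$$A_{\alpha} B^{\beta} \rightarrow A_{\alpha} B^{\alpha} \equiv \sum_{\alpha} A_{\alpha} B^{\alpha}$$
This summation may occur more than once within a term with a distinct symbol per pair of indices, for example:
$$A_{\alpha}{}^{\gamma} B^{\alpha} C_{\gamma}{}^{\beta} \equiv \sum_{\alpha} \sum_{\gamma} A_{\alpha}{}^{\gamma} B^{\alpha} C_{\gamma}{}^{\beta}$$
Other combinations of repeated indices within a term are considered to be ill-formed, such as
$$A_{\alpha\alpha}{}^{\gamma}$$
(both occurrences of $\alpha$ are lower; $A_{\alpha}{}^{\alpha\gamma}$ would be fine) and
$$A_{\alpha\gamma}{}^{\gamma} B^{\alpha} C_{\gamma}{}^{\beta}$$
($\gamma$ occurs twice as a lower index; $A_{\alpha\gamma}{}^{\gamma} B^{\alpha}$ or $A_{\alpha\delta}{}^{\gamma} B^{\alpha} C_{\gamma}{}^{\beta}$ would be fine).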
The reason for excluding such formulae is that although these quantities could be computed as arrays of numbers, they would not in general transform as tensors under a change of basis.
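As an illustration of the summation convention, the following minimal sketch evaluates two contractions with explicit sums in plain Python. The component values and the names `A_lower`, `B_upper`, `A_mixed` and `C_mixed` are hypothetical and chosen only for the example; the second expression has two dummy pairs and one free index.

```python
# Components of a covector A_alpha and a vector B^alpha in 3 dimensions
# (illustrative values, not taken from the article).
A_lower = [1.0, 2.0, 3.0]
B_upper = [4.0, 5.0, 6.0]
DIM = 3

# A_alpha B^alpha: the index alpha appears once lower and once upper,
# so it is a dummy index and is summed over.
scalar = sum(A_lower[a] * B_upper[a] for a in range(DIM))

# A_alpha^gamma B^alpha C_gamma^beta: two dummy pairs (alpha and gamma)
# and one free index (beta), so the result has one index.
A_mixed = [[1, 0, 2], [0, 1, 0], [3, 0, 1]]  # A_mixed[alpha][gamma]
C_mixed = [[1, 2, 0], [0, 1, 0], [1, 0, 1]]  # C_mixed[gamma][beta]
result = [
    sum(A_mixed[a][g] * B_upper[a] * C_mixed[g][b]
        for a in range(DIM) for g in range(DIM))
    for b in range(DIM)
]
```

Each free index becomes an axis of the result, while each dummy index becomes a loop variable inside the sum.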
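If a tensor has a list of all upper or lower indices, one shorthand is to use a capital letter for the list:[9]
$$A_{i_1 \cdots i_n} B^{i_1 \cdots i_n j_1 \cdots j_m} C_{j_1 \cdots j_m} \equiv A_{I} B^{IJ} C_{J},$$
where $I = i_1 i_2 \cdots i_n$ and $J = j_1 j_2 \cdots j_m$.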
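A pair of vertical bars | ⋅ | around a set of all-upper or all-lower indices (but not both), used in a contraction with another set of indices where the expression is completely antisymmetric in each of the two sets, denotes a restricted sum over index values in which each index is constrained to be strictly less than the next:[10]
$$A_{|\alpha\beta\gamma|\cdots} B^{\alpha\beta\gamma\cdots} = A_{\alpha\beta\gamma\cdots} B^{|\alpha\beta\gamma|\cdots} = \sum_{\alpha<\beta<\gamma} A_{\alpha\beta\gamma\cdots} B^{\alpha\beta\gamma\cdots}$$
More than one group can be summed in this way, for example:
$$A_{|\alpha\beta\gamma|}{}^{|\delta\epsilon\cdots\lambda|} B^{\alpha\beta\gamma}{}_{\delta\epsilon\cdots\lambda|\mu\nu\cdots\zeta|} C^{\mu\nu\cdots\zeta} = \sum_{\alpha<\beta<\gamma}\ \sum_{\delta<\epsilon<\cdots<\lambda}\ \sum_{\mu<\nu<\cdots<\zeta} A_{\alpha\beta\gamma}{}^{\delta\epsilon\cdots\lambda} B^{\alpha\beta\gamma}{}_{\delta\epsilon\cdots\lambda\mu\nu\cdots\zeta} C^{\mu\nu\cdots\zeta}$$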
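A restricted sum over strictly increasing index values can be sketched in plain Python as follows. The antisymmetric arrays `A` and `B` are hypothetical examples. For two indices in which both factors are antisymmetric, the unrestricted double sum counts every unordered pair twice, so it should equal 2! times the restricted sum.

```python
from itertools import combinations, product

DIM = 4

# Hypothetical antisymmetric components A_{ab} and B^{ab}.
A = [[i - j for j in range(DIM)] for i in range(DIM)]
B = [[(i - j) * (i + j + 1) for j in range(DIM)] for i in range(DIM)]

# Restricted sum: only index values with a < b contribute.
restricted = sum(A[a][b] * B[a][b] for a, b in combinations(range(DIM), 2))

# Unrestricted Einstein sum over all values of both indices.
full = sum(A[a][b] * B[a][b] for a, b in product(range(DIM), repeat=2))
```

`combinations` yields index tuples in strictly increasing order, which is exactly the constraint expressed by the vertical bars.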
When using multi-index notation, an underarrow is placed underneath the block of indices:[11]
where
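By contracting an index with a non-singular metric tensor, the type of a tensor can be changed, converting a lower index to an upper index or vice versa:
$$B^{\gamma}{}_{\beta\cdots} = g^{\gamma\alpha} A_{\alpha\beta\cdots} \quad\text{and}\quad A_{\alpha\beta\cdots} = g_{\alpha\gamma} B^{\gamma}{}_{\beta\cdots}$$
The base symbol in many cases is retained (e.g. using A where B appears here), and when there is no ambiguity, repositioning an index may be taken to imply this operation.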
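Index lowering and raising can be sketched numerically as below. The Minkowski metric with signature (−, +, +, +) is an assumption made only for this example, as are the names `g_lower`, `g_upper`, `lower_index` and `raise_index`; any non-singular metric and its inverse could be substituted.

```python
DIM = 4

# Assumed metric g_{ab}: Minkowski, signature (-, +, +, +).
g_lower = [[(-1.0 if a == 0 else 1.0) if a == b else 0.0 for b in range(DIM)]
           for a in range(DIM)]
# For this particular metric the inverse g^{ab} has the same components.
g_upper = [row[:] for row in g_lower]

def lower_index(v_upper):
    """v_a = g_{ab} v^b (contraction over b)."""
    return [sum(g_lower[a][b] * v_upper[b] for b in range(DIM)) for a in range(DIM)]

def raise_index(v_lower):
    """v^a = g^{ab} v_b (contraction over b)."""
    return [sum(g_upper[a][b] * v_lower[b] for b in range(DIM)) for a in range(DIM)]

v_upper = [2.0, 1.0, 0.0, 3.0]
v_lower = lower_index(v_upper)    # only the time component changes sign here
round_trip = raise_index(v_lower)  # raising after lowering recovers v^a
```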
This table summarizes how the manipulation of covariant and contravariant indices fits in with invariance under a passive transformation between bases, with the components of each basis set in terms of the other reflected in the first column. The barred indices refer to the final coordinate system after the transformation.[12]
The Kronecker delta is used; see also below.
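| | Basis transformation | Component transformation | Invariance |
|---|---|---|---|
| Covector, covariant vector, 1-form | $\omega^{\bar\alpha} = L_{\beta}{}^{\bar\alpha}\,\omega^{\beta}$ | $a_{\bar\alpha} = a_{\gamma} L^{\gamma}{}_{\bar\alpha}$ | $a_{\bar\alpha}\omega^{\bar\alpha} = a_{\gamma} L^{\gamma}{}_{\bar\alpha} L_{\beta}{}^{\bar\alpha}\omega^{\beta} = a_{\gamma}\delta^{\gamma}{}_{\beta}\omega^{\beta} = a_{\beta}\omega^{\beta}$ |
| Vector, contravariant vector | $e_{\bar\alpha} = e_{\gamma} L_{\bar\alpha}{}^{\gamma}$ | $u^{\bar\alpha} = L^{\bar\alpha}{}_{\beta} u^{\beta}$ | $e_{\bar\alpha} u^{\bar\alpha} = e_{\gamma} L_{\bar\alpha}{}^{\gamma} L^{\bar\alpha}{}_{\beta} u^{\beta} = e_{\gamma}\delta^{\gamma}{}_{\beta} u^{\beta} = e_{\gamma} u^{\gamma}$ |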
Tensors are equal if and only if every corresponding component is equal; e.g., tensor A equals tensor B if and only if
$$A^{\alpha}{}_{\beta\gamma} = B^{\alpha}{}_{\beta\gamma}$$
for all α, β, γ. Consequently, there are facets of the notation that are useful in checking that an equation makes sense (an analogous procedure to dimensional analysis).
Indices not involved in contractions are called free indices. Indices used in contractions are termed dummy indices, or summation indices.
The components of tensors (like $A^{\alpha}$, $B_{\beta}{}^{\gamma}$, etc.) are just real numbers. Since the indices take various integer values to select specific components of the tensors, a single tensor equation represents many ordinary equations. If a tensor equality has n free indices, and if the dimensionality of the underlying vector space is m, the equality represents $m^n$ equations: each index takes on every value of a specific set of values.
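For instance, if
$$A^{\alpha} B_{\beta}{}^{\gamma} C_{\gamma\delta} + D^{\alpha}{}_{\beta} E_{\delta} = T^{\alpha}{}_{\beta\delta}$$
is in four dimensions (that is, each index runs from 0 to 3 or from 1 to 4), then because there are three free indices (α, β, δ), there are $4^3 = 64$ equations. Three of these are:
$$A^{0} B_{1}{}^{0} C_{00} + A^{0} B_{1}{}^{1} C_{10} + A^{0} B_{1}{}^{2} C_{20} + A^{0} B_{1}{}^{3} C_{30} + D^{0}{}_{1} E_{0} = T^{0}{}_{10}$$
$$A^{1} B_{0}{}^{0} C_{00} + A^{1} B_{0}{}^{1} C_{10} + A^{1} B_{0}{}^{2} C_{20} + A^{1} B_{0}{}^{3} C_{30} + D^{1}{}_{0} E_{0} = T^{1}{}_{00}$$
$$A^{1} B_{2}{}^{0} C_{02} + A^{1} B_{2}{}^{1} C_{12} + A^{1} B_{2}{}^{2} C_{22} + A^{1} B_{2}{}^{3} C_{32} + D^{1}{}_{2} E_{2} = T^{1}{}_{22}$$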
This illustrates the compactness and efficiency of using index notation: many equations which all share a similar structure can be collected into one simple tensor equation.
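The count of component equations can be checked by enumerating every assignment of values to the free indices. This is a minimal sketch; the function name `component_equations` and its parameters are hypothetical.

```python
from itertools import product

def component_equations(n_free, dim, start=0):
    """Return every assignment of values to n_free free indices,
    each running over `dim` consecutive values beginning at `start`."""
    return list(product(range(start, start + dim), repeat=n_free))

# Three free indices in four dimensions, with index values 0..3.
assignments = component_equations(n_free=3, dim=4)
count = len(assignments)  # one ordinary equation per assignment
```

Each tuple in `assignments` fixes the free indices of one ordinary equation, for example `(0, 1, 0)`.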
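Replacing any index symbol throughout by another leaves the tensor equation unchanged (provided there is no conflict with other symbols already used). This can be useful when manipulating indices, such as using index notation to verify vector calculus identities or identities of the Kronecker delta and Levi-Civita symbol (see also below). An example of a correct change is:
$$A^{\alpha} B_{\beta}{}^{\gamma} C_{\gamma\delta} + D^{\alpha}{}_{\beta} E_{\delta} \rightarrow A^{\lambda} B_{\beta}{}^{\mu} C_{\mu\delta} + D^{\lambda}{}_{\beta} E_{\delta},$$
whereas an erroneous change is:
$$A^{\alpha} B_{\beta}{}^{\gamma} C_{\gamma\delta} + D^{\alpha}{}_{\beta} E_{\delta} \nrightarrow A^{\lambda} B_{\beta}{}^{\gamma} C_{\mu\delta} + D^{\alpha}{}_{\beta} E_{\delta}.$$
In the first replacement, λ replaced α and μ replaced γ everywhere, so the expression still has the same meaning. In the second, λ did not fully replace α, and μ did not fully replace γ (incidentally, the contraction on the γ index became a tensor product), which is entirely inconsistent for reasons shown next.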
The free indices in a tensor expression always appear in the same (upper or lower) position throughout every term, and in a tensor equation the free indices are the same on each side. Dummy indices (each of which implies a summation over that index) need not be the same, for example:
$$A^{\alpha} B_{\beta}{}^{\gamma} C_{\gamma\delta} + D^{\alpha}{}_{\delta} E_{\beta} = T^{\alpha}{}_{\beta\delta}$$
as for an erroneous expression:
$$A^{\alpha} B_{\beta}{}^{\gamma} C_{\gamma\delta} + D_{\alpha\beta}{}^{\gamma} E^{\delta}.$$
In other words, non-repeated indices must be of the same type in every term of the equation. In the above identity, α, β, δ line up throughout and γ occurs twice in one term due to a contraction (once as an upper index and once as a lower index), and thus it is a valid expression. In the invalid expression, while β lines up, α and δ do not, and γ appears twice in one term (contraction) and once in another term, which is inconsistent.
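The consistency rules for free and dummy indices lend themselves to a mechanical check. The sketch below is one hypothetical way to encode them: each term is a list of `(symbol, position)` pairs, and the helper names `free_indices` and `is_consistent` are assumptions made for the example rather than an established API.

```python
from collections import defaultdict

def free_indices(term):
    """Return the set of free (symbol, position) pairs of one term.

    `term` is a list of (symbol, position) pairs with position "up" or "down".
    A symbol appearing once is free; a symbol appearing exactly once up and
    once down is a dummy index; anything else is ill-formed.
    """
    positions = defaultdict(list)
    for symbol, position in term:
        positions[symbol].append(position)
    free = set()
    for symbol, found in positions.items():
        if len(found) == 1:
            free.add((symbol, found[0]))
        elif sorted(found) != ["down", "up"]:
            raise ValueError(f"ill-formed repeated index {symbol!r}: {found}")
    return free

def is_consistent(terms):
    """True if every term has the same free indices in the same positions."""
    free_sets = [free_indices(term) for term in terms]
    return all(s == free_sets[0] for s in free_sets)

# A^a B_b^g C_{g d}, D^a_d E_b and T^a_{b d}: free indices a (up), b, d (down).
term_abc = [("a", "up"), ("b", "down"), ("g", "up"), ("g", "down"), ("d", "down")]
term_de = [("a", "up"), ("d", "down"), ("b", "down")]
term_t = [("a", "up"), ("b", "down"), ("d", "down")]
# D_{a b}^g E^d: a and d sit in the wrong position and g is left uncontracted.
term_bad = [("a", "down"), ("b", "down"), ("g", "up"), ("d", "up")]

valid = is_consistent([term_abc, term_de, term_t])
invalid = is_consistent([term_abc, term_bad])
```

Here `valid` is true and `invalid` is false, while a term with the same index twice in the lower position raises `ValueError`.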
When applying a rule to a number of indices (differentiation, symmetrization etc., shown next), the bracket or punctuation symbols denoting the rules are only shown on one group of the indices to which they apply.
If the brackets enclose covariant indices, the rule applies only to all covariant indices enclosed in the brackets, not to any contravariant indices which happen to be placed intermediately between the brackets.
Similarly, if the brackets enclose contravariant indices, the rule applies only to all enclosed contravariant indices, not to intermediately placed covariant indices.
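Parentheses, ( ), around multiple indices denote the symmetrized part of the tensor. When symmetrizing p indices using σ to range over permutations of the numbers 1 to p, one takes a sum over the permutations of those indices $\alpha_{\sigma(i)}$ for i = 1, 2, 3, ..., p, and then divides by the number of permutations:
$$A_{(\alpha_1\alpha_2\cdots\alpha_p)\alpha_{p+1}\cdots\alpha_q} = \frac{1}{p!} \sum_{\sigma} A_{\alpha_{\sigma(1)}\cdots\alpha_{\sigma(p)}\alpha_{p+1}\cdots\alpha_q}$$
For example, two symmetrizing indices mean there are two indices to permute and sum over:
$$A_{(\alpha\beta)\gamma\cdots} = \frac{1}{2!}\left(A_{\alpha\beta\gamma\cdots} + A_{\beta\alpha\gamma\cdots}\right)$$
while for three symmetrizing indices, there are three indices to sum over and permute:
$$A_{(\alpha\beta\gamma)\delta\cdots} = \frac{1}{3!}\left(A_{\alpha\beta\gamma\delta\cdots} + A_{\gamma\alpha\beta\delta\cdots} + A_{\beta\gamma\alpha\delta\cdots} + A_{\alpha\gamma\beta\delta\cdots} + A_{\gamma\beta\alpha\delta\cdots} + A_{\beta\alpha\gamma\delta\cdots}\right)$$
The symmetrization is distributive over addition:
$$A_{(\alpha}\left(B_{\beta)\gamma\cdots} + C_{\beta)\gamma\cdots}\right) = A_{(\alpha}B_{\beta)\gamma\cdots} + A_{(\alpha}C_{\beta)\gamma\cdots}$$
Indices are not part of the symmetrization when they are not on the same level, for example
$$A_{(\alpha}B^{\beta}{}_{\gamma)} = \frac{1}{2!}\left(A_{\alpha}B^{\beta}{}_{\gamma} + A_{\gamma}B^{\beta}{}_{\alpha}\right),$$
or when they lie within the parentheses but between vertical bars (i.e. |⋅⋅⋅|), as in this modification of the previous example:
$$A_{(\alpha}B_{|\beta|}{}_{\gamma)} = \frac{1}{2!}\left(A_{\alpha}B_{\beta\gamma} + A_{\gamma}B_{\beta\alpha}\right).$$
Here the α and γ indices are symmetrized, β is not.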
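Square brackets, [ ], around multiple indices denote the antisymmetrized part of the tensor. For p antisymmetrizing indices, the sum over the permutations of those indices $\alpha_{\sigma(i)}$, each multiplied by the signature of the permutation $\operatorname{sgn}(\sigma)$, is taken and then divided by the number of permutations:
$$A_{[\alpha_1\cdots\alpha_p]\alpha_{p+1}\cdots\alpha_q} = \frac{1}{p!} \sum_{\sigma} \operatorname{sgn}(\sigma)\, A_{\alpha_{\sigma(1)}\cdots\alpha_{\sigma(p)}\alpha_{p+1}\cdots\alpha_q} = \delta^{\beta_1\cdots\beta_p}_{\alpha_1\cdots\alpha_p} A_{\beta_1\cdots\beta_p\alpha_{p+1}\cdots\alpha_q}$$
where $\delta^{\beta_1\cdots\beta_p}_{\alpha_1\cdots\alpha_p}$ is the generalized Kronecker delta of degree 2p, with scaling as defined below.
For example, two antisymmetrizing indices imply:
$$A_{[\alpha\beta]\gamma\cdots} = \frac{1}{2!}\left(A_{\alpha\beta\gamma\cdots} - A_{\beta\alpha\gamma\cdots}\right)$$
while three antisymmetrizing indices imply:
$$A_{[\alpha\beta\gamma]\delta\cdots} = \frac{1}{3!}\left(A_{\alpha\beta\gamma\delta\cdots} + A_{\gamma\alpha\beta\delta\cdots} + A_{\beta\gamma\alpha\delta\cdots} - A_{\alpha\gamma\beta\delta\cdots} - A_{\gamma\beta\alpha\delta\cdots} - A_{\beta\alpha\gamma\delta\cdots}\right)$$
As a more specific example, if F represents the electromagnetic tensor, then the equation
$$0 = F_{[\alpha\beta,\gamma]} = \frac{1}{3!}\left(F_{\alpha\beta,\gamma} + F_{\gamma\alpha,\beta} + F_{\beta\gamma,\alpha} - F_{\beta\alpha,\gamma} - F_{\alpha\gamma,\beta} - F_{\gamma\beta,\alpha}\right)$$
represents Gauss's law for magnetism and Faraday's law of induction.
As before, the antisymmetrization is distributive over addition:
$$A_{[\alpha}\left(B_{\beta]\gamma\cdots} + C_{\beta]\gamma\cdots}\right) = A_{[\alpha}B_{\beta]\gamma\cdots} + A_{[\alpha}C_{\beta]\gamma\cdots}$$
As with symmetrization, indices are not antisymmetrized when they are not on the same level, for example
$$A_{[\alpha}B^{\beta}{}_{\gamma]} = \frac{1}{2!}\left(A_{\alpha}B^{\beta}{}_{\gamma} - A_{\gamma}B^{\beta}{}_{\alpha}\right),$$
or when they lie within the square brackets but between vertical bars (i.e. |⋅⋅⋅|), as in this modification of the previous example:
$$A_{[\alpha}B_{|\beta|}{}_{\gamma]} = \frac{1}{2!}\left(A_{\alpha}B_{\beta\gamma} - A_{\gamma}B_{\beta\alpha}\right).$$
Here the α and γ indices are antisymmetrized, β is not.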
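Any tensor can be written as the sum of its symmetric and antisymmetric parts on two indices:
$$A_{\alpha\beta\gamma\cdots} = A_{(\alpha\beta)\gamma\cdots} + A_{[\alpha\beta]\gamma\cdots}$$
as can be seen by adding the above expressions for $A_{(\alpha\beta)\gamma\cdots}$ and $A_{[\alpha\beta]\gamma\cdots}$. This does not hold for other than two indices.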
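The symmetrized and antisymmetrized parts can be computed directly from their permutation-sum definitions. The following is a minimal sketch in which a tensor is stored as a dictionary from index tuples to values; the storage choice and the names `permutation_sign` and `project` are assumptions made for the example.

```python
from itertools import permutations, product
from math import factorial

def permutation_sign(perm):
    """Signature of a permutation of 0..p-1, computed by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def project(tensor, p, antisymmetric):
    """(Anti)symmetrize the first p index slots of `tensor`.

    `tensor` maps every index tuple to a component value.
    """
    out = {}
    for idx in tensor:
        total = 0.0
        for perm in permutations(range(p)):
            permuted = tuple(idx[perm[k]] for k in range(p)) + idx[p:]
            weight = permutation_sign(perm) if antisymmetric else 1
            total += weight * tensor[permuted]
        out[idx] = total / factorial(p)  # divide by the number of permutations
    return out

# Hypothetical two-index tensor A_{ab} in 3 dimensions.
A = {(a, b): float(3 * a + b + 1) for a, b in product(range(3), repeat=2)}
sym = project(A, 2, antisymmetric=False)   # A_(ab)
anti = project(A, 2, antisymmetric=True)   # A_[ab]

# On two indices the two parts add back up to the original tensor.
recombined = {idx: sym[idx] + anti[idx] for idx in A}
```

The same `project` function accepts `p = 3` or more, where the two parts no longer add up to the original tensor in general.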
For compactness, derivatives may be indicated by adding indices after a comma or semicolon.[13][14]
While most of the expressions of the Ricci calculus are valid for arbitrary bases, the expressions involving partial derivatives of tensor components with respect to coordinates apply only with a coordinate basis: a basis that is defined through differentiation with respect to the coordinates. Coordinates are typically denoted by $x^{\mu}$, but do not in general form the components of a vector. In flat spacetime with linear coordinatization, a tuple of differences in coordinates, $\Delta x^{\mu}$, can be treated as a contravariant vector. With the same constraints on the space and on the choice of coordinate system, the partial derivatives with respect to the coordinates yield a result that is effectively covariant. Aside from use in this special case, the partial derivatives of components of tensors do not in general transform covariantly, but are useful in building expressions that are covariant, albeit still with a coordinate basis if the partial derivatives are explicitly used, as with the covariant, exterior and Lie derivatives below.
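To indicate partial differentiation of the components of a tensor field with respect to a coordinate variable $x^{\gamma}$, a comma is placed before an appended lower index of the coordinate variable:
$$A_{\alpha\beta\cdots,\gamma} = \frac{\partial}{\partial x^{\gamma}} A_{\alpha\beta\cdots}$$
This may be repeated (without adding further commas):
$$A_{\alpha_1\alpha_2\cdots\alpha_p,\alpha_{p+1}\cdots\alpha_q} = \frac{\partial}{\partial x^{\alpha_q}} \cdots \frac{\partial}{\partial x^{\alpha_{p+2}}} \frac{\partial}{\partial x^{\alpha_{p+1}}} A_{\alpha_1\alpha_2\cdots\alpha_p}$$
These components do not transform covariantly, unless the expression being differentiated is a scalar. This derivative is characterized by the product rule and the derivatives of the coordinates
$$x^{\alpha}{}_{,\gamma} = \delta^{\alpha}_{\gamma},$$
where δ is the Kronecker delta.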
The covariant derivative is only defined if a connection is defined. For any tensor field, a semicolon ( ; ) placed before an appended lower (covariant) index indicates covariant differentiation. Less common alternatives to the semicolon include a forward slash ( / )[15] or, in three-dimensional curved space, a single vertical bar ( | ).[16]
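The covariant derivatives of a scalar function, a contravariant vector and a covariant vector are:
$$f_{;\beta} = f_{,\beta}$$
$$A^{\alpha}{}_{;\beta} = A^{\alpha}{}_{,\beta} + \Gamma^{\alpha}{}_{\gamma\beta} A^{\gamma}$$
$$A_{\alpha;\beta} = A_{\alpha,\beta} - \Gamma^{\gamma}{}_{\alpha\beta} A_{\gamma}$$
where $\Gamma^{\alpha}{}_{\gamma\beta}$ are the connection coefficients.
For an arbitrary tensor:[17]
$$T^{\alpha_1\cdots\alpha_r}{}_{\beta_1\cdots\beta_s;\gamma} = T^{\alpha_1\cdots\alpha_r}{}_{\beta_1\cdots\beta_s,\gamma} + \Gamma^{\alpha_1}{}_{\delta\gamma} T^{\delta\alpha_2\cdots\alpha_r}{}_{\beta_1\cdots\beta_s} + \cdots + \Gamma^{\alpha_r}{}_{\delta\gamma} T^{\alpha_1\cdots\alpha_{r-1}\delta}{}_{\beta_1\cdots\beta_s} - \Gamma^{\delta}{}_{\beta_1\gamma} T^{\alpha_1\cdots\alpha_r}{}_{\delta\beta_2\cdots\beta_s} - \cdots - \Gamma^{\delta}{}_{\beta_s\gamma} T^{\alpha_1\cdots\alpha_r}{}_{\beta_1\cdots\beta_{s-1}\delta}$$
An alternative notation for the covariant derivative of any tensor is the subscripted nabla symbol $\nabla_{\beta}$. For the case of a vector field $A^{\alpha}$:[18]
$$\nabla_{\beta} A^{\alpha} = A^{\alpha}{}_{;\beta}$$
The covariant formulation of the directional derivative of any tensor field along a vector $v^{\gamma}$ may be expressed as its contraction with the covariant derivative, e.g.:
$$v^{\gamma} A_{\alpha;\gamma}$$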
The components of this derivative of a tensor field transform covariantly, and hence form another tensor field, despite subexpressions (the partial derivative and the connection coefficients) separately not transforming covariantly.
This derivative is characterized by the product rule:
$$(A_{\alpha\beta\cdots} B_{\gamma\delta\cdots})_{;\epsilon} = A_{\alpha\beta\cdots;\epsilon} B_{\gamma\delta\cdots} + A_{\alpha\beta\cdots} B_{\gamma\delta\cdots;\epsilon}$$
A Koszul connection on the tangent bundle of a differentiable manifold is called an affine connection.
A connection is a metric connection when the covariant derivative of the metric tensor vanishes:
$$g_{\mu\nu;\xi} = 0$$
An affine connection that is also a metric connection is called a Riemannian connection. A Riemannian connection that is torsion-free (i.e., for which the torsion tensor vanishes: $T^{\alpha}{}_{\beta\gamma} = 0$) is a Levi-Civita connection.
The $\Gamma^{\alpha}{}_{\beta\gamma}$ for a Levi-Civita connection in a coordinate basis are called Christoffel symbols of the second kind.
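As a concrete illustration, the sketch below computes Christoffel symbols of the second kind for the Euclidean plane in polar coordinates (r, θ) and checks that the covariant derivative of the metric vanishes. It assumes the standard Levi-Civita formula Γ^a_{bc} = ½ g^{ad}(g_{db,c} + g_{dc,b} − g_{bc,d}), which is not stated in the text above, and all function names are hypothetical.

```python
def polar_metric(r):
    """Metric, inverse metric and partial derivatives for polar coordinates."""
    g = [[1.0, 0.0], [0.0, r * r]]
    g_inv = [[1.0, 0.0], [0.0, 1.0 / (r * r)]]
    # dg[c][a][b] = partial_c g_{ab}; only d/dr of g_{theta theta} is non-zero.
    dg = [[[0.0, 0.0], [0.0, 2.0 * r]], [[0.0, 0.0], [0.0, 0.0]]]
    return g, g_inv, dg

def christoffel(g_inv, dg):
    """gamma[a][b][c] = Gamma^a_{bc} from the assumed Levi-Civita formula."""
    n = len(g_inv)
    return [[[0.5 * sum(g_inv[a][d] * (dg[c][d][b] + dg[b][d][c] - dg[d][b][c])
                        for d in range(n))
              for c in range(n)] for b in range(n)] for a in range(n)]

def metric_covariant_derivative(g, dg, gamma):
    """g_{ab;c} = g_{ab,c} - Gamma^d_{ac} g_{db} - Gamma^d_{bc} g_{ad}."""
    n = len(g)
    return [[[dg[c][a][b]
              - sum(gamma[d][a][c] * g[d][b] for d in range(n))
              - sum(gamma[d][b][c] * g[a][d] for d in range(n))
              for c in range(n)] for b in range(n)] for a in range(n)]

g, g_inv, dg = polar_metric(2.0)
gamma = christoffel(g_inv, dg)
residual = metric_covariant_derivative(g, dg, gamma)  # expected to vanish
```

With index 0 for r and index 1 for θ, the only non-zero symbols are `gamma[0][1][1]` (equal to −r) and `gamma[1][0][1] = gamma[1][1][0]` (equal to 1/r).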
The exterior derivative of a totally antisymmetric type (0, s) tensor field with components $A_{\alpha_1\cdots\alpha_s}$ (also called a differential form) is a derivative that is covariant under basis transformations. It does not depend on either a metric tensor or a connection: it requires only the structure of a differentiable manifold. In a coordinate basis, it may be expressed as the antisymmetrization of the partial derivatives of the tensor components (the overall normalization factor depends on convention):[19]: 232–233
$$(\mathrm{d}A)_{\gamma\alpha_1\cdots\alpha_s} = \frac{\partial}{\partial x^{[\gamma}} A_{\alpha_1\cdots\alpha_s]}$$
This derivative is not defined on any tensor field with contravariant indices or that is not totally antisymmetric. It is characterized by a graded product rule.
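The Lie derivative is another derivative that is covariant under basis transformations. Like the exterior derivative, it does not depend on either a metric tensor or a connection. The Lie derivative of a type (r, s) tensor field T along (the flow of) a contravariant vector field $X^{\rho}$ may be expressed using a coordinate basis as[20]
$$(\mathcal{L}_X T)^{\alpha_1\cdots\alpha_r}{}_{\beta_1\cdots\beta_s} = X^{\gamma} T^{\alpha_1\cdots\alpha_r}{}_{\beta_1\cdots\beta_s,\gamma} - X^{\alpha_1}{}_{,\gamma} T^{\gamma\alpha_2\cdots\alpha_r}{}_{\beta_1\cdots\beta_s} - \cdots - X^{\alpha_r}{}_{,\gamma} T^{\alpha_1\cdots\alpha_{r-1}\gamma}{}_{\beta_1\cdots\beta_s} + X^{\gamma}{}_{,\beta_1} T^{\alpha_1\cdots\alpha_r}{}_{\gamma\beta_2\cdots\beta_s} + \cdots + X^{\gamma}{}_{,\beta_s} T^{\alpha_1\cdots\alpha_r}{}_{\beta_1\cdots\beta_{s-1}\gamma}$$
This derivative is characterized by the product rule and the fact that the Lie derivative of a contravariant vector field along itself is zero:
$$(\mathcal{L}_X X)^{\alpha} = X^{\gamma} X^{\alpha}{}_{,\gamma} - X^{\alpha}{}_{,\gamma} X^{\gamma} = 0$$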
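The Kronecker delta is like the identity matrix when multiplied and contracted:
$$\delta^{\alpha}_{\beta} A^{\beta} = A^{\alpha}, \qquad \delta^{\mu}_{\nu} B_{\mu} = B_{\nu}$$
The components $\delta^{\alpha}_{\beta}$ are the same in any basis and form an invariant tensor of type (1, 1), i.e. the identity of the tangent bundle over the identity mapping of the base manifold, and so its trace is an invariant.[21] Its trace is the dimensionality of the space; for example, in four-dimensional spacetime,
$$\delta^{\rho}_{\rho} = \delta^{0}_{0} + \delta^{1}_{1} + \delta^{2}_{2} + \delta^{3}_{3} = 4.$$
The Kronecker delta is one of the family of generalized Kronecker deltas. The generalized Kronecker delta of degree 2p may be defined in terms of the Kronecker delta by (a common definition includes an additional multiplier of p! on the right):
$$\delta^{\alpha_1\cdots\alpha_p}_{\beta_1\cdots\beta_p} = \delta^{[\alpha_1}_{\beta_1} \cdots \delta^{\alpha_p]}_{\beta_p}$$
and acts as an antisymmetrizer on p indices:
$$\delta^{\alpha_1\cdots\alpha_p}_{\beta_1\cdots\beta_p} A^{\beta_1\cdots\beta_p} = A^{[\alpha_1\cdots\alpha_p]}$$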
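The trace of the Kronecker delta and the antisymmetrizing action of the generalized delta can be checked numerically for p = 2. This sketch uses the scaling without the extra p! factor; the array `A` and the helper names `delta` and `gen_delta2` are hypothetical.

```python
from itertools import product

DIM = 4  # four-dimensional example, index values 0..3

def delta(a, b):
    """Kronecker delta with one upper and one lower index."""
    return 1.0 if a == b else 0.0

# Contracting the delta with itself gives the dimension of the space.
trace = sum(delta(r, r) for r in range(DIM))

def gen_delta2(a, b, m, n):
    """Generalized delta of degree 4: upper indices a, b and lower indices m, n,
    built by antisymmetrizing the upper indices of delta(a, m) * delta(b, n)."""
    return 0.5 * (delta(a, m) * delta(b, n) - delta(b, m) * delta(a, n))

# Hypothetical components A^{mn} with no particular symmetry.
A = [[float(DIM * m + n) for n in range(DIM)] for m in range(DIM)]

# Contract the generalized delta with A^{mn} over m and n ...
via_delta = [[sum(gen_delta2(a, b, m, n) * A[m][n]
                  for m, n in product(range(DIM), repeat=2))
              for b in range(DIM)] for a in range(DIM)]
# ... and compare with the antisymmetrized part computed directly.
direct = [[0.5 * (A[a][b] - A[b][a]) for b in range(DIM)] for a in range(DIM)]
```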
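An affine connection has a torsion tensor $T^{\alpha}{}_{\beta\gamma}$:
$$T^{\alpha}{}_{\beta\gamma} = \Gamma^{\alpha}{}_{\beta\gamma} - \Gamma^{\alpha}{}_{\gamma\beta} - \gamma^{\alpha}{}_{\beta\gamma},$$
where $\gamma^{\alpha}{}_{\beta\gamma}$ are given by the components of the Lie bracket of the local basis, which vanish when it is a coordinate basis.
For a Levi-Civita connection this tensor is defined to be zero, which for a coordinate basis gives the equations
$$\Gamma^{\alpha}{}_{\beta\gamma} = \Gamma^{\alpha}{}_{\gamma\beta}.$$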
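If the Riemann curvature tensor is defined as
$$R^{\rho}{}_{\sigma\mu\nu} = \Gamma^{\rho}{}_{\nu\sigma,\mu} - \Gamma^{\rho}{}_{\mu\sigma,\nu} + \Gamma^{\rho}{}_{\mu\lambda}\Gamma^{\lambda}{}_{\nu\sigma} - \Gamma^{\rho}{}_{\nu\lambda}\Gamma^{\lambda}{}_{\mu\sigma},$$
then it is the commutator of the covariant derivative with itself:[22][23]
$$A_{\nu;\rho\sigma} - A_{\nu;\sigma\rho} = A_{\beta} R^{\beta}{}_{\nu\rho\sigma},$$
since the connection is torsionless, which means that the torsion tensor vanishes.
This can be generalized to get the commutator for two covariant derivatives of an arbitrary tensor as follows:
$$T^{\alpha_1\cdots\alpha_r}{}_{\beta_1\cdots\beta_s;\gamma\delta} - T^{\alpha_1\cdots\alpha_r}{}_{\beta_1\cdots\beta_s;\delta\gamma} = -R^{\alpha_1}{}_{\rho\gamma\delta} T^{\rho\alpha_2\cdots\alpha_r}{}_{\beta_1\cdots\beta_s} - \cdots - R^{\alpha_r}{}_{\rho\gamma\delta} T^{\alpha_1\cdots\alpha_{r-1}\rho}{}_{\beta_1\cdots\beta_s} + R^{\sigma}{}_{\beta_1\gamma\delta} T^{\alpha_1\cdots\alpha_r}{}_{\sigma\beta_2\cdots\beta_s} + \cdots + R^{\sigma}{}_{\beta_s\gamma\delta} T^{\alpha_1\cdots\alpha_r}{}_{\beta_1\cdots\beta_{s-1}\sigma}$$
which are often referred to as the Ricci identities.[24]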
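The metric tensor $g_{\alpha\beta}$ is used for lowering indices and gives the length of any space-like curve
$$\text{length} = \int_{y_1}^{y_2} \sqrt{g_{\alpha\beta} \frac{dx^{\alpha}}{d\gamma} \frac{dx^{\beta}}{d\gamma}}\ d\gamma,$$
where γ is any smooth strictly monotone parameterization of the path. It also gives the duration of any time-like curve
$$\text{duration} = \int_{t_1}^{t_2} \sqrt{\frac{-1}{c^2}\, g_{\alpha\beta} \frac{dx^{\alpha}}{d\gamma} \frac{dx^{\beta}}{d\gamma}}\ d\gamma,$$
where γ is any smooth strictly monotone parameterization of the trajectory. See also Line element.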
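The length integral can be approximated numerically. The sketch below applies the midpoint rule to curves in the Euclidean plane described in polar coordinates (r, θ), a setting chosen only for illustration; the function names and the two example curves are hypothetical.

```python
import math

def curve_length(metric, curve, velocity, start, end, steps=1000):
    """Midpoint-rule approximation of the integral of
    sqrt(g_ab dx^a/dt dx^b/dt) dt along a parameterized curve."""
    h = (end - start) / steps
    total = 0.0
    for k in range(steps):
        t = start + (k + 0.5) * h
        g = metric(curve(t))
        v = velocity(t)
        n = len(v)
        ds2 = sum(g[a][b] * v[a] * v[b] for a in range(n) for b in range(n))
        total += math.sqrt(ds2) * h
    return total

def polar_metric(x):
    """Euclidean plane in polar coordinates: g = diag(1, r^2)."""
    r, _theta = x
    return [[1.0, 0.0], [0.0, r * r]]

# Circle of radius 2: r = 2, theta = t for t in [0, 2*pi].
circle = curve_length(polar_metric, lambda t: (2.0, t), lambda t: (0.0, 1.0),
                      0.0, 2.0 * math.pi)

# Spiral r = t, theta = t for t in [0, 1].
spiral = curve_length(polar_metric, lambda t: (t, t), lambda t: (1.0, 1.0),
                      0.0, 1.0)
```

The circle should come out close to its circumference 4π, and the spiral close to the closed-form value ½(√2 + arsinh 1); reparameterizing either curve monotonically should leave the result unchanged up to discretization error.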
The inverse matrix $g^{\alpha\beta}$ of the metric tensor is another important tensor, used for raising indices:
$$g^{\alpha\beta} g_{\beta\gamma} = \delta^{\alpha}_{\gamma}.$$