In calculus, integration by substitution, also known as u-substitution, reverse chain rule or change of variables,[1] is a method for evaluating integrals and antiderivatives. It is the counterpart to the chain rule for differentiation, and can loosely be thought of as using the chain rule "backwards." This involves differential forms.
Before stating the result rigorously, consider a simple case using indefinite integrals.
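Compute[2]
$$\int (2x^3+1)^7 (x^2)\,dx.$$
Set $u = 2x^3+1.$ This means $\frac{du}{dx} = 6x^2,$ or as a differential form, $du = 6x^2\,dx.$ Now:
$$\int (2x^3+1)^7 (x^2)\,dx = \frac{1}{6}\int (2x^3+1)^7 (6x^2)\,dx = \frac{1}{6}\int u^7\,du = \frac{1}{48}u^8 + C = \frac{1}{48}(2x^3+1)^8 + C,$$
where $C$ is an arbitrary constant of integration.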
This procedure is frequently used, but not all integrals are of a form that permits its use. In any event, the result should be verified by differentiating and comparing to the original integrand. For definite integrals, the limits of integration must also be adjusted, but the procedure is mostly the same.
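The verification step can be sketched numerically. The block below is a minimal illustration, assuming the integrand $(2x^3+1)^7 x^2$ and the candidate antiderivative $\tfrac{1}{48}(2x^3+1)^8$ as the example; the function names, the sample points, and the step size `h` are illustrative choices, not part of the method itself.

```python
def integrand(x):
    """Example integrand (2x^3 + 1)^7 * x^2 (assumed for illustration)."""
    return (2 * x**3 + 1) ** 7 * x**2


def antiderivative(x):
    """Candidate antiderivative (2x^3 + 1)^8 / 48 obtained with u = 2x^3 + 1."""
    return (2 * x**3 + 1) ** 8 / 48


def central_difference(func, x, h=1e-6):
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2 * h)


def max_relative_error(points):
    """Largest mismatch between d/dx antiderivative and the integrand."""
    worst = 0.0
    for x in points:
        exact = integrand(x)
        approx = central_difference(antiderivative, x)
        scale = max(1.0, abs(exact))  # avoid dividing by values near zero
        worst = max(worst, abs(approx - exact) / scale)
    return worst


sample_points = [-1.0, -0.5, 0.3, 0.7, 1.2]
print(max_relative_error(sample_points))
```

A small printed value suggests that the derivative of the candidate matches the integrand at the sampled points; it is a sanity check, not a proof.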
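Let $g : [a,b] \to I$ be a differentiable function with a continuous derivative, where $I \subset \mathbb{R}$ is an interval. Suppose that $f : I \to \mathbb{R}$ is a continuous function. Then:[3]
$$\int_a^b f(g(x))\,g'(x)\,dx = \int_{g(a)}^{g(b)} f(u)\,du.$$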
In Leibniz notation, the substitution $u = g(x)$ yields:
$$\frac{du}{dx} = g'(x).$$
Working heuristically with infinitesimals yields the equation
$$du = g'(x)\,dx,$$
which suggests the substitution formula above. (This equation may be put on a rigorous foundation by interpreting it as a statement about differential forms.) One may view the method of integration by substitution as a partial justification of Leibniz's notation for integrals and derivatives.
The formula is used to transform one integral into another integral that is easier to compute. Thus, the formula can be read from left to right or from right to left in order to simplify a given integral. When used in the former manner, it is sometimes known as u-substitution or w-substitution, in which a new variable is defined to be a function of the original variable found inside the composite function multiplied by the derivative of the inner function. The latter manner is commonly used in trigonometric substitution, replacing the original variable with a trigonometric function of a new variable and the original differential with the differential of the trigonometric function.
Integration by substitution can be derived from the fundamental theorem of calculus as follows. Let $f$ and $g$ be two functions satisfying the above hypothesis that $f$ is continuous on $I$ and $g'$ is integrable on the closed interval $[a,b].$ Then the function $f(g(x))\,g'(x)$ is also integrable on $[a,b].$ Hence the integrals
$$\int_a^b f(g(x))\,g'(x)\,dx$$
and
$$\int_{g(a)}^{g(b)} f(u)\,du$$
in fact exist, and it remains to show that they are equal.
Since $f$ is continuous, it has an antiderivative $F.$ The composite function $F \circ g$ is then defined. Since $g$ is differentiable, combining the chain rule and the definition of an antiderivative gives:
$$(F \circ g)'(x) = F'(g(x))\,g'(x) = f(g(x))\,g'(x).$$
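Applying the fundamental theorem of calculus twice gives:
$$\int_a^b f(g(x))\,g'(x)\,dx = \int_a^b (F \circ g)'(x)\,dx = (F \circ g)(b) - (F \circ g)(a) = F(g(b)) - F(g(a)) = \int_{g(a)}^{g(b)} f(u)\,du,$$
which is the substitution rule.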
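The substitution rule can also be checked numerically. The sketch below assumes the illustrative choices $f(u) = u^2$ and $g(x) = \sin x$ on $[0, 3\pi/4]$, picked because $g$ is not monotone there, which the one-variable statement does not require. The helper names `simpson` and `substitution_sides` are hypothetical, and Simpson's rule is just one convenient quadrature.

```python
import math


def simpson(func, lower, upper, intervals=1000):
    """Composite Simpson's rule; `intervals` must be even."""
    step = (upper - lower) / intervals
    total = func(lower) + func(upper)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * func(lower + i * step)
    return total * step / 3


def substitution_sides(f, g, g_prime, a, b):
    """Return both sides of the substitution rule as numerical integrals."""
    left = simpson(lambda x: f(g(x)) * g_prime(x), a, b)  # integral in x
    right = simpson(f, g(a), g(b))                        # integral in u
    return left, right


a, b = 0.0, 0.75 * math.pi
left, right = substitution_sides(lambda u: u * u, math.sin, math.cos, a, b)
print(left, right)
```

Both numbers should agree up to quadrature error, and both approximate $\tfrac{1}{3}\sin^3(3\pi/4)$.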
Substitution can be used to determine antiderivatives. One chooses a relation between $x$ and $u,$ determines the corresponding relation between $dx$ and $du$ by differentiating, and performs the substitutions. An antiderivative for the substituted function can hopefully be determined; the original substitution between $x$ and $u$ is then undone.
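Consider the integral:
$$\int x \cos(x^2+1)\,dx.$$
Make the substitution $u = x^2+1$ to obtain $du = 2x\,dx,$ meaning $x\,dx = \tfrac{1}{2}\,du.$ Therefore:
$$\int x \cos(x^2+1)\,dx = \frac{1}{2}\int 2x \cos(x^2+1)\,dx = \frac{1}{2}\int \cos u\,du = \frac{1}{2}\sin u + C = \frac{1}{2}\sin(x^2+1) + C,$$
where $C$ is an arbitrary constant of integration.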
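The tangent function can be integrated using substitution by expressing it in terms of the sine and cosine: $\tan x = \tfrac{\sin x}{\cos x}.$
Using the substitution $u = \cos x$ gives $du = -\sin x\,dx$ and
$$\int \tan x\,dx = \int \frac{\sin x}{\cos x}\,dx = \int -\frac{du}{u} = -\ln|u| + C = -\ln|\cos x| + C = \ln|\sec x| + C.$$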
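The cotangent function can be integrated similarly by expressing it as $\cot x = \tfrac{\cos x}{\sin x}$ and using the substitution $u = \sin x,$ $du = \cos x\,dx$:
$$\int \cot x\,dx = \int \frac{\cos x}{\sin x}\,dx = \int \frac{du}{u} = \ln|u| + C = \ln|\sin x| + C.$$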
When evaluating definite integrals by substitution, one may transform the limits of integration together with the variable and evaluate the new integral directly between the transformed limits. Alternatively, one may fully evaluate the indefinite integral (see above) first, undo the substitution, and then apply the original boundary conditions. In that case, there is no need to transform the boundary terms. This becomes especially handy when multiple substitutions are used.
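Consider the integral:
$$\int_0^2 \frac{x}{\sqrt{x^2+1}}\,dx.$$
Make the substitution $u = x^2+1$ to obtain $du = 2x\,dx,$ meaning $x\,dx = \tfrac{1}{2}\,du.$ Therefore:
$$\int_{x=0}^{x=2} \frac{x}{\sqrt{x^2+1}}\,dx = \frac{1}{2}\int_{u=1}^{u=5} \frac{du}{\sqrt{u}} = \frac{1}{2}\left(2\sqrt{5} - 2\sqrt{1}\right) = \sqrt{5} - 1.$$
Since the lower limit $x = 0$ was replaced with $u = 1,$ and the upper limit $x = 2$ with $u = 2^2 + 1 = 5,$ a transformation back into terms of $x$ was unnecessary.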
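Transforming the limits can be illustrated numerically. The sketch below takes, as an assumed example, the integrand $x/\sqrt{x^2+1}$ on $[0, 2]$ with the substitution $u = x^2 + 1$, so that the limits become $u = 1$ and $u = 5$. The `simpson` helper is a hypothetical stand-in for any quadrature routine.

```python
import math


def simpson(func, lower, upper, intervals=1000):
    """Composite Simpson's rule; `intervals` must be even."""
    step = (upper - lower) / intervals
    total = func(lower) + func(upper)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * func(lower + i * step)
    return total * step / 3


# Integral in the original variable, over the original limits x = 0 .. 2.
in_x = simpson(lambda x: x / math.sqrt(x * x + 1), 0.0, 2.0)

# After u = x^2 + 1 (so x dx = du / 2), the limits become u = 1 .. 5.
in_u = simpson(lambda u: 0.5 / math.sqrt(u), 1.0, 5.0)

# Closed form from the antiderivative sqrt(u) evaluated at the new limits.
closed_form = math.sqrt(5) - 1

print(in_x, in_u, closed_form)
```

The three values should agree up to quadrature error, without ever converting the antiderivative back to $x$.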
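For the integral
$$\int_0^1 \sqrt{1-x^2}\,dx,$$
a variation of the above procedure is needed. The substitution $x = \sin u,$ implying $dx = \cos u\,du,$ is useful because $\sqrt{1-\sin^2 u} = \cos u$ for $0 \le u \le \tfrac{\pi}{2}.$ We thus have:
$$\int_0^1 \sqrt{1-x^2}\,dx = \int_0^{\pi/2} \sqrt{1-\sin^2 u}\,\cos u\,du = \int_0^{\pi/2} \cos^2 u\,du = \left[\frac{u}{2} + \frac{\sin(2u)}{4}\right]_0^{\pi/2} = \frac{\pi}{4} + 0 = \frac{\pi}{4}.$$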
The resulting integral can be computed using integration by parts or a double angle formula, followed by one more substitution. One can also note that the graph of the function being integrated is the upper right quarter of a circle with a radius of one, and hence integrating from zero to one gives the area of one quarter of the unit circle, or $\tfrac{\pi}{4}.$
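The effect of the trigonometric substitution can be seen numerically as well. The sketch below assumes the integral $\int_0^1 \sqrt{1-x^2}\,dx$ and the substitution $x = \sin u$; the `simpson` helper and the choice of 100 subintervals are illustrative. Because $\sqrt{1-x^2}$ has an unbounded derivative at $x = 1$ while $\cos^2 u$ is smooth, the same quadrature rule would be expected to do noticeably better on the substituted form.

```python
import math


def simpson(func, lower, upper, intervals=100):
    """Composite Simpson's rule; `intervals` must be even."""
    step = (upper - lower) / intervals
    total = func(lower) + func(upper)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * func(lower + i * step)
    return total * step / 3


def quarter_circle(x):
    """sqrt(1 - x^2), clamped at zero to guard against rounding near x = 1."""
    return math.sqrt(max(0.0, 1.0 - x * x))


exact = math.pi / 4

# Direct quadrature in x: the integrand has a vertical tangent at x = 1.
direct = simpson(quarter_circle, 0.0, 1.0)

# After x = sin(u), dx = cos(u) du: the integrand becomes cos(u)^2 on [0, pi/2].
substituted = simpson(lambda u: math.cos(u) ** 2, 0.0, math.pi / 2)

print(abs(direct - exact), abs(substituted - exact))
```

Both errors are measured against $\pi/4$, the area of a quarter of the unit disc.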
One may also use substitution when integrating functions of several variables.
Here, the substitution function $(v_1, \ldots, v_n) = \varphi(u_1, \ldots, u_n)$ needs to be injective and continuously differentiable, and the differentials transform as:
$$dv_1 \cdots dv_n = \left|\det(D\varphi)(u_1, \ldots, u_n)\right|\,du_1 \cdots du_n,$$
where $\det(D\varphi)(u_1, \ldots, u_n)$ denotes the determinant of the Jacobian matrix of partial derivatives of $\varphi$ at the point $(u_1, \ldots, u_n).$ This formula expresses the fact that the absolute value of the determinant of a matrix equals the volume of the parallelotope spanned by its columns or rows.
More precisely, the change of variables formula is stated in the next theorem:
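Theorem. Let $U$ be an open set in $\mathbb{R}^n$ and $\varphi : U \to \mathbb{R}^n$ an injective differentiable function with continuous partial derivatives, the Jacobian of which is nonzero for every $x$ in $U.$ Then for any real-valued, compactly supported, continuous function $f,$ with support contained in $\varphi(U)$:
$$\int_{\varphi(U)} f(\mathbf{v})\,d\mathbf{v} = \int_U f(\varphi(\mathbf{u}))\,\left|\det(D\varphi)(\mathbf{u})\right|\,d\mathbf{u}.$$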
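As a hedged illustration of the multivariable formula, the sketch below uses polar coordinates $\varphi(r, \theta) = (r\cos\theta, r\sin\theta)$, whose Jacobian determinant is $r$, to integrate $e^{-(x^2+y^2)}$ over the unit disc. The example function, the midpoint rule, and the grid sizes are assumptions made for illustration. The function is not compactly supported inside $\varphi(U)$, so this goes slightly beyond the theorem as stated; the set missed by $\varphi$ has measure zero.

```python
import math


def polar_map(r, theta):
    """phi(r, theta) = (r cos(theta), r sin(theta))."""
    return r * math.cos(theta), r * math.sin(theta)


def jacobian_determinant(r, theta):
    """det of [[cos t, -r sin t], [sin t, r cos t]], which simplifies to r."""
    return r


def integrate_over_unit_disc(f, radial_cells=200, angular_cells=200):
    """Midpoint rule on (0, 1) x (0, 2*pi) with the |det D(phi)| weight."""
    dr = 1.0 / radial_cells
    dtheta = 2.0 * math.pi / angular_cells
    total = 0.0
    for i in range(radial_cells):
        r = (i + 0.5) * dr
        for j in range(angular_cells):
            theta = (j + 0.5) * dtheta
            x, y = polar_map(r, theta)
            total += f(x, y) * abs(jacobian_determinant(r, theta)) * dr * dtheta
    return total


def gaussian_bump(x, y):
    """Example integrand exp(-(x^2 + y^2))."""
    return math.exp(-(x * x + y * y))


approx = integrate_over_unit_disc(gaussian_bump)
exact = math.pi * (1.0 - math.exp(-1.0))  # closed form via the same substitution
print(approx, exact)
```

Dropping the Jacobian factor from the sum would give a visibly different number, which is one way to see why the determinant is needed.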
The conditions on the theorem can be weakened in various ways. First, the requirement that $\varphi$ be continuously differentiable can be replaced by the weaker assumption that $\varphi$ be merely differentiable and have a continuous inverse.[4] This is guaranteed to hold if $\varphi$ is continuously differentiable by the inverse function theorem. Alternatively, the requirement that $\det(D\varphi) \neq 0$ can be eliminated by applying Sard's theorem.[5]
For Lebesgue measurable functions, the theorem can be stated in the following form:[6]
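Theorem. Let $U$ be a measurable subset of $\mathbb{R}^n$ and $\varphi : U \to \mathbb{R}^n$ an injective function, and suppose for every $x$ in $U$ there exists $\varphi'(x)$ in $\mathbb{R}^{n,n}$ such that $\varphi(y) = \varphi(x) + \varphi'(x)(y - x) + o(\|y - x\|)$ as $y \to x$ (here $o$ is little-o notation). Then $\varphi(U)$ is measurable, and for any real-valued function $f$ defined on $\varphi(U)$:
$$\int_{\varphi(U)} f(v)\,dv = \int_U f(\varphi(u))\,\left|\det \varphi'(u)\right|\,du$$
in the sense that if either integral exists (including the possibility of being properly infinite), then so does the other one, and they have the same value.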
Another very general version in measure theory is the following:[7]
Theorem. Let $X$ be a locally compact Hausdorff space equipped with a finite Radon measure $\mu,$ and let $Y$ be a σ-compact Hausdorff space with a σ-finite Radon measure $\rho.$ Let $\varphi : X \to Y$ be an absolutely continuous function (where the latter means that $\rho(\varphi(E)) = 0$ whenever $\mu(E) = 0$). Then there exists a real-valued Borel measurable function $w$ on $X$ such that for every Lebesgue integrable function $f : Y \to \mathbb{R},$ the function $(f \circ \varphi) \cdot w$ is Lebesgue integrable on $X,$ and
$$\int_Y f(y)\,d\rho(y) = \int_X (f \circ \varphi)(x)\,w(x)\,d\mu(x).$$
Furthermore, it is possible to write
$$w(x) = (g \circ \varphi)(x)$$
for some Borel measurable function $g$ on $Y.$
In geometric measure theory, integration by substitution is used with Lipschitz functions. A bi-Lipschitz function is a Lipschitz function $\varphi : U \to \mathbb{R}^n$ which is injective and whose inverse function $\varphi^{-1} : \varphi(U) \to U$ is also Lipschitz. By Rademacher's theorem, a bi-Lipschitz mapping is differentiable almost everywhere. In particular, the Jacobian determinant of a bi-Lipschitz mapping $\det D\varphi$ is well-defined almost everywhere. The following result then holds:
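Theorem. Let $U$ be an open subset of $\mathbb{R}^n$ and $\varphi : U \to \mathbb{R}^n$ be a bi-Lipschitz mapping. Let $f : \varphi(U) \to \mathbb{R}$ be measurable. Then
$$\int_{\varphi(U)} f(x)\,dx = \int_U (f \circ \varphi)(x)\,\left|\det D\varphi(x)\right|\,dx$$
in the sense that if either integral exists (or is properly infinite), then so does the other one, and they have the same value.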
The above theorem was first proposed by Euler when he developed the notion of double integrals in 1769. Although generalized to triple integrals by Lagrange in 1773, and used by Legendre, Laplace, and Gauss, and first generalized to $n$ variables by Mikhail Ostrogradsky in 1836, it resisted a fully rigorous formal proof for a surprisingly long time, and was first satisfactorily resolved 125 years later, by Élie Cartan in a series of papers beginning in the mid-1890s.[8][9]
Substitution can be used to answer the following important question in probability: given a random variable $X$ with probability density $p_X$ and another random variable $Y$ such that $Y = \phi(X)$ for injective (one-to-one) $\phi,$ what is the probability density for $Y$?
It is easiest to answer this question by first answering a slightly different question: what is the probability that $Y$ takes a value in some particular subset $S$? Denote this probability $P(Y \in S).$ Of course, if $Y$ has probability density $p_Y,$ then the answer is:
$$P(Y \in S) = \int_S p_Y(y)\,dy,$$
but this is not really useful because we do not know $p_Y$; it is what we are trying to find. We can make progress by considering the problem in the variable $X.$ $Y$ takes a value in $S$ whenever $X$ takes a value in $\phi^{-1}(S),$ so:
$$P(Y \in S) = P(X \in \phi^{-1}(S)) = \int_{\phi^{-1}(S)} p_X(x)\,dx.$$
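Changing from variable $x$ to $y$ gives:
$$P(Y \in S) = \int_{\phi^{-1}(S)} p_X(x)\,dx = \int_S p_X(\phi^{-1}(y))\,\left|\frac{d\phi^{-1}}{dy}\right|\,dy.$$
Combining this with our first equation gives:
$$\int_S p_Y(y)\,dy = \int_S p_X(\phi^{-1}(y))\,\left|\frac{d\phi^{-1}}{dy}\right|\,dy,$$
so:
$$p_Y(y) = p_X(\phi^{-1}(y))\,\left|\frac{d\phi^{-1}}{dy}\right|.$$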
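The density formula can be checked on a concrete case. The sketch below assumes, purely for illustration, that $X$ is standard normal and $\phi(x) = e^x,$ so that $\phi^{-1}(y) = \ln y$ and $\left|\tfrac{d\phi^{-1}}{dy}\right| = \tfrac{1}{y}.$ It compares the probability that $Y$ lies in an interval, computed by integrating the transformed density, with the same probability computed directly in terms of $X.$ The helper names and the interval $[0.5, 3]$ are hypothetical choices.

```python
import math


def normal_pdf(x):
    """Density p_X of a standard normal variable (assumed example)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Cumulative distribution of the standard normal, via math.erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def transformed_pdf(y):
    """p_Y(y) = p_X(ln y) * |d(ln y)/dy| for Y = exp(X), valid for y > 0."""
    return normal_pdf(math.log(y)) * abs(1.0 / y)


def midpoint_integral(func, lower, upper, cells=20000):
    """Composite midpoint rule on [lower, upper]."""
    step = (upper - lower) / cells
    return step * sum(func(lower + (i + 0.5) * step) for i in range(cells))


lower, upper = 0.5, 3.0

# P(Y in [lower, upper]) from the transformed density in the variable y.
prob_via_y = midpoint_integral(transformed_pdf, lower, upper)

# The same probability in the variable x: P(ln(lower) <= X <= ln(upper)).
prob_via_x = normal_cdf(math.log(upper)) - normal_cdf(math.log(lower))

print(prob_via_y, prob_via_x)
```

Agreement of the two numbers is consistent with the formula for $p_Y$; omitting the factor $1/y$ would break it.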
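In the case where $X$ and $Y$ depend on several uncorrelated variables (i.e., $p_X = p_X(x_1, \ldots, x_n)$ and $y = \phi(x)$), $p_Y$ can be found by substitution in several variables discussed above. The result is:
$$p_Y(y) = p_X(\phi^{-1}(y))\,\left|\det D\phi^{-1}(y)\right|.$$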