Inverse function theorem

From Wikipedia, the free encyclopedia
Theorem in mathematics

In real analysis, a branch of mathematics, the inverse function theorem is a theorem that asserts that, if a real function $f$ has a continuous derivative near a point where its derivative is nonzero, then, near this point, $f$ has an inverse function. The inverse function is also continuously differentiable, and the inverse function rule expresses its derivative as the multiplicative inverse of the derivative of $f$.

The theorem applies verbatim to complex-valued functions of a complex variable. It generalizes to functions from $n$-tuples (of real or complex numbers) to $n$-tuples, and to functions between vector spaces of the same finite dimension, by replacing "derivative" with "Jacobian matrix" and "nonzero derivative" with "nonzero Jacobian determinant".

If the function of the theorem belongs to a higher differentiability class, the same is true for the inverse function. There are also versions of the inverse function theorem for holomorphic functions, for differentiable maps between manifolds, for differentiable functions between Banach spaces, and so forth.

The theorem was first established by Picard and Goursat using an iterative scheme: the basic idea is to prove a fixed point theorem using the contraction mapping theorem.

Statements

For functions of a single variable, the theorem states that if $f$ is a continuously differentiable function with nonzero derivative at the point $a$, then $f$ is injective (or bijective onto the image) in a neighborhood of $a$, the inverse is continuously differentiable near $b=f(a)$, and the derivative of the inverse function at $b$ is the reciprocal of the derivative of $f$ at $a$:

$$\bigl(f^{-1}\bigr)'(b)=\frac{1}{f'(a)}=\frac{1}{f'(f^{-1}(b))}.$$
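The reciprocal rule above can be checked numerically. The following is a minimal sketch, assuming the illustrative function $f(x)=x^3+x$ (not taken from the article), whose inverse is computed by bisection and then differentiated by a central difference; the names `f_inverse` and `inverse_derivative_numeric` are hypothetical helpers.

```python
def f(x):
    # Illustrative choice: strictly increasing, so globally invertible.
    return x**3 + x


def f_prime(x):
    return 3 * x**2 + 1


def f_inverse(y, lo=-10.0, hi=10.0, tol=1e-13):
    # Bisection works because f is strictly increasing on [lo, hi].
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def inverse_derivative_numeric(b, step=1e-5):
    # Central difference approximation of (f^{-1})'(b).
    return (f_inverse(b + step) - f_inverse(b - step)) / (2 * step)


a = 1.0
b = f(a)                              # b = f(a)
numeric = inverse_derivative_numeric(b)
predicted = 1.0 / f_prime(a)          # the theorem's value 1 / f'(a)
print(numeric, predicted)
```

The two printed numbers should agree up to the finite-difference error, which is what the formula $(f^{-1})'(b)=1/f'(a)$ predicts.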

It can happen that a function $f$ is injective near a point $a$ while $f'(a)=0$. An example is $f(x)=(x-a)^3$. In fact, for such a function, the inverse cannot be differentiable at $b=f(a)$, since if $f^{-1}$ were differentiable at $b$, then, by the chain rule, $1=(f^{-1}\circ f)'(a)=(f^{-1})'(b)f'(a)$, which implies $f'(a)\neq 0$. (The situation is different for holomorphic functions; see Holomorphic inverse function theorem below.)

For functions of more than one variable, the theorem states that if $f$ is a continuously differentiable function from an open subset $A$ of $\mathbb{R}^n$ into $\mathbb{R}^n$, and the derivative $f'(a)$ is invertible at a point $a$ (that is, the determinant of the Jacobian matrix of $f$ at $a$ is non-zero), then there exist neighborhoods $U$ of $a$ in $A$ and $V$ of $b=f(a)$ such that $f(U)\subset V$ and $f:U\to V$ is bijective.[1] Writing $f=(f_1,\ldots,f_n)$, this means that the system of $n$ equations $y_i=f_i(x_1,\dots,x_n)$ has a unique solution for $x_1,\dots,x_n$ in terms of $y_1,\dots,y_n$ when $x\in U$, $y\in V$. Note that the theorem does not say that $f$ is bijective onto its image on the whole set where $f'$ is invertible, only that it is locally bijective near each point where $f'$ is invertible.

Moreover, the theorem says that the inverse function $f^{-1}:V\to U$ is continuously differentiable, and its derivative at $b=f(a)$ is the inverse map of $f'(a)$; i.e.,

$$(f^{-1})'(b)=f'(a)^{-1}.$$

In other words, if $Jf^{-1}(b)$ and $Jf(a)$ are the Jacobian matrices representing $(f^{-1})'(b)$ and $f'(a)$, this means:

$$Jf^{-1}(b)=Jf(a)^{-1}.$$
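The matrix identity $Jf^{-1}(b)=Jf(a)^{-1}$ can be illustrated on a map whose local inverse is known in closed form. The sketch below assumes the polar-coordinate map $(r,\theta)\mapsto(r\cos\theta,r\sin\theta)$ as the example (an illustrative choice, not from the article) and compares a finite-difference Jacobian of the inverse with the matrix inverse of the analytic Jacobian; all function names are hypothetical.

```python
import math


def polar_to_cartesian(r, theta):
    return (r * math.cos(theta), r * math.sin(theta))


def cartesian_to_polar(x, y):
    # Local inverse, valid for r > 0 and theta in (-pi, pi).
    return (math.hypot(x, y), math.atan2(y, x))


def jacobian_forward(r, theta):
    # Analytic Jacobian of polar_to_cartesian at (r, theta).
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]


def invert_2x2(m):
    (p, q), (s, t) = m
    det = p * t - q * s
    return [[t / det, -q / det], [-s / det, p / det]]


def jacobian_inverse_numeric(x, y, step=1e-6):
    # Central differences of cartesian_to_polar, one input direction at a time.
    cols = []
    for dx, dy in ((step, 0.0), (0.0, step)):
        plus = cartesian_to_polar(x + dx, y + dy)
        minus = cartesian_to_polar(x - dx, y - dy)
        cols.append([(plus[i] - minus[i]) / (2 * step) for i in range(2)])
    # cols[j][i] is d(output i)/d(input j); rearrange so rows index outputs.
    return [[cols[j][i] for j in range(2)] for i in range(2)]


a = (2.0, 0.7)                               # the point a = (r, theta)
b = polar_to_cartesian(*a)                   # b = f(a)
lhs = jacobian_inverse_numeric(*b)           # approximates J f^{-1}(b)
rhs = invert_2x2(jacobian_forward(*a))       # J f(a)^{-1}
print(lhs)
print(rhs)
```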

The hard part of the theorem is the existence and differentiability of $f^{-1}$. Assuming this, the inverse derivative formula follows from the chain rule applied to $f^{-1}\circ f=I$. (Indeed, $1=I'(a)=(f^{-1}\circ f)'(a)=(f^{-1})'(b)\circ f'(a)$.) Since taking the inverse of an invertible linear map is an infinitely differentiable operation, the formula for the derivative of the inverse shows that if $f$ is continuously $k$ times differentiable, with invertible derivative at the point $a$, then the inverse is also continuously $k$ times differentiable. Here $k$ is a positive integer or $\infty$.

There are two variants of the inverse function theorem.[1] Given a continuously differentiable map $f:U\to\mathbb{R}^m$, the first is

If the derivative $f'(a)$ is surjective, then there exists a continuously differentiable map $g$ defined on a neighborhood $V$ of $b=f(a)$ such that $f\circ g=I$ near $b$;

and the second is

If the derivative $f'(a)$ is injective, then there exists a continuously differentiable map $g$ defined on a neighborhood $V$ of $b=f(a)$ such that $g\circ f=I$ near $a$.

In the first case (when $f'(a)$ is surjective), the point $b=f(a)$ is called a regular value. When the domain also has dimension $m$, the rank-nullity identity $m=\dim\ker(f'(a))+\dim\operatorname{im}(f'(a))$ shows that the first case is equivalent to saying that $b=f(a)$ is not in the image of critical points $a$ (a critical point being a point $a$ such that the kernel of $f'(a)$ is nonzero); in general, a critical point is a point at which $f'(a)$ fails to be surjective. The statement in the first case is a special case of the submersion theorem.

These variants are restatements of the inverse function theorem. Indeed, in the first case when $f'(a)$ is surjective, we can find an (injective) linear map $T$ such that $f'(a)\circ T=I$. Define $h(x)=a+Tx$ so that we have:

$$(f\circ h)'(0)=f'(a)\circ T=I.$$

Thus, by the inverse function theorem, $f\circ h$ has an inverse near $0$; i.e., $f\circ h\circ(f\circ h)^{-1}=I$ near $b$. The second case ($f'(a)$ is injective) is seen in a similar way.

Example

Consider the vector-valued function $F:\mathbb{R}^2\to\mathbb{R}^2$ defined by:

$$F(x,y)=\begin{bmatrix}e^{x}\cos y\\ e^{x}\sin y\end{bmatrix}.$$

Its Jacobian matrix at $(x,y)$ is:

$$JF(x,y)=\begin{bmatrix}e^{x}\cos y & -e^{x}\sin y\\ e^{x}\sin y & e^{x}\cos y\end{bmatrix}$$

with the determinant:

$$\det JF(x,y)=e^{2x}\cos^{2}y+e^{2x}\sin^{2}y=e^{2x}.$$

The determinant $e^{2x}$ is nonzero everywhere. Thus the theorem guarantees that, for every point $p$ in $\mathbb{R}^2$, there exists a neighborhood about $p$ over which $F$ is invertible. This does not mean $F$ is invertible over its entire domain: in this case $F$ is not even injective since it is periodic: $F(x,y)=F(x,y+2\pi)$.
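The three claims in this example (the determinant equals $e^{2x}$, the map is $2\pi$-periodic in $y$, and a local inverse exists near each point) can be checked with a short script. This is a sketch only; `local_inverse` is a hypothetical helper that selects the branch of the inverse whose second coordinate is closest to a caller-supplied guess `y_near`.

```python
import math


def F(x, y):
    return (math.exp(x) * math.cos(y), math.exp(x) * math.sin(y))


def det_JF(x, y):
    # Determinant of the Jacobian matrix written out entry by entry.
    ex = math.exp(x)
    j11, j12 = ex * math.cos(y), -ex * math.sin(y)
    j21, j22 = ex * math.sin(y), ex * math.cos(y)
    return j11 * j22 - j12 * j21


def local_inverse(u, v, y_near):
    # Inverse of F on the branch whose second coordinate lies closest to y_near.
    x = 0.5 * math.log(u * u + v * v)
    y = math.atan2(v, u)
    y += 2 * math.pi * round((y_near - y) / (2 * math.pi))
    return (x, y)


p = (0.3, 5.0)
print(det_JF(*p), math.exp(2 * p[0]))          # both equal e^{2x}
print(F(*p), F(p[0], p[1] + 2 * math.pi))      # same image: F is not injective
print(local_inverse(*F(*p), y_near=p[1]))      # recovers p on the local branch
```

The need for `y_near` reflects the local nature of the theorem: the inverse is only determined once a neighborhood of the base point has been fixed.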

Counter-example

[Figure: the function $f(x)=x+2x^{2}\sin(\tfrac{1}{x})$ is bounded inside a quadratic envelope near the line $y=x$, so $f'(0)=1$. Nevertheless, it has local max/min points accumulating at $x=0$, so it is not one-to-one on any surrounding interval.]

If one drops the assumption that the derivative is continuous, the function is no longer necessarily locally injective. For example, the function defined by $f(x)=x+2x^{2}\sin(\tfrac{1}{x})$ for $x\neq 0$ and $f(0)=0$ has the discontinuous derivative $f'(x)=1-2\cos(\tfrac{1}{x})+4x\sin(\tfrac{1}{x})$ for $x\neq 0$, with $f'(0)=1$, and $f'$ vanishes at points arbitrarily close to $x=0$. These critical points are local max/min points of $f$, so $f$ is not one-to-one (and not invertible) on any interval containing $x=0$. Intuitively, the slope $f'(0)=1$ does not propagate to nearby points, where the slopes are governed by a weak but rapid oscillation.
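The sign changes of the derivative near $0$ can be seen by sampling $f'$ at points where $\cos(1/x)=\pm 1$. The sketch below uses the derivative formula from the text; the helper `slope_samples` and the particular sample points $1/(2\pi k)$ and $1/((2k+1)\pi)$ are choices made here for illustration.

```python
import math


def f(x):
    return 0.0 if x == 0 else x + 2 * x**2 * math.sin(1 / x)


def f_prime(x):
    # Formula valid for x != 0; the value at 0 comes from the difference quotient.
    if x == 0:
        return 1.0
    return 1 - 2 * math.cos(1 / x) + 4 * x * math.sin(1 / x)


def slope_samples(k):
    x_down = 1 / (2 * math.pi * k)        # cos(1/x) = 1, so f' is close to -1
    x_up = 1 / ((2 * k + 1) * math.pi)    # cos(1/x) = -1, so f' is close to 3
    return f_prime(x_down), f_prime(x_up)


for k in (1, 10, 1000):
    print(k, slope_samples(k))

# The difference quotient at 0 still tends to 1:
print(f(1e-6) / 1e-6)
```

Because negative and positive slopes both occur at points arbitrarily close to $0$, the derivative must vanish in between, which is where the local extrema of $f$ sit.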

If the derivative is continuous but zero at a point, the function is no longer necessarily locally injective. A real function that is locally constant at a point $x\in\mathbb{R}$ in the interior of its domain is not locally injective at $x$ but is trivially continuously differentiable at $x$.

Methods of proof

As an important result, the inverse function theorem has been given numerous proofs. The proof most commonly seen in textbooks relies on the contraction mapping principle, also known as the Banach fixed-point theorem (which can also be used as the key step in the proof of existence and uniqueness of solutions to ordinary differential equations).[2][3]

Since the fixed point theorem applies in infinite-dimensional (Banach space) settings, this proof generalizes immediately to the infinite-dimensional version of the inverse function theorem[4] (see Generalizations below).

An alternate proof in finite dimensions hinges on the extreme value theorem for functions on a compact set.[5] This approach has the advantage that the proof generalizes to a situation where there is no Cauchy completeness (see § Over a real closed field).

Yet another proof uses Newton's method, which has the advantage of providing an effective version of the theorem: bounds on the derivative of the function imply an estimate of the size of the neighborhood on which the function is invertible.[6]

Proof for single-variable functions

We want to prove the following: Let $D\subseteq\mathbb{R}$ be an open set with $x_0\in D$, $f:D\to\mathbb{R}$ a continuously differentiable function defined on $D$, and suppose that $f'(x_0)\neq 0$. Then there exists an open interval $I$ with $x_0\in I$ such that $f$ maps $I$ bijectively onto the open interval $J=f(I)$, and such that the inverse function $f^{-1}:J\to I$ is continuously differentiable, and for any $y\in J$, if $x\in I$ is such that $f(x)=y$, then $(f^{-1})'(y)=\dfrac{1}{f'(x)}$.

We may without loss of generality assume that $f'(x_0)>0$. Given that $D$ is an open set and $f'$ is continuous at $x_0$, there exists $r>0$ such that $(x_0-r,x_0+r)\subseteq D$ and

$$|f'(x)-f'(x_0)|<\dfrac{f'(x_0)}{2}\qquad\text{for all }|x-x_0|<r.$$

In particular,

$$f'(x)>\dfrac{f'(x_0)}{2}>0\qquad\text{for all }|x-x_0|<r.$$

This shows that $f$ is strictly increasing on $|x-x_0|<r$. Let $\delta>0$ be such that $\delta<r$. Then $[x_0-\delta,x_0+\delta]\subseteq(x_0-r,x_0+r)$. By the intermediate value theorem, we find that $f$ maps the interval $[x_0-\delta,x_0+\delta]$ bijectively onto $[f(x_0-\delta),f(x_0+\delta)]$. Set $I=(x_0-\delta,x_0+\delta)$ and $J=(f(x_0-\delta),f(x_0+\delta))$. Then $f:I\to J$ is a bijection and the inverse $f^{-1}:J\to I$ exists. The fact that $f^{-1}:J\to I$ is differentiable follows from the differentiability of $f$. In particular, the result follows from the fact that if $f:I\to\mathbb{R}$ is a strictly monotonic and continuous function that is differentiable at $x_0\in I$ with $f'(x_0)\neq 0$, then $f^{-1}:f(I)\to\mathbb{R}$ is differentiable at $y_0=f(x_0)$ with $(f^{-1})'(y_0)=\dfrac{1}{f'(x_0)}$ (a standard result in analysis). Applying this at every point of $I$, where $f'>0$, gives $(f^{-1})'=1/(f'\circ f^{-1})$ on $J$, which is continuous as a composition of continuous functions. This completes the proof.

A proof using successive approximation

To prove existence, it can be assumed after an affine transformation that $f(0)=0$ and $f'(0)=I$, so that $a=b=0$.

By the mean value theorem for vector-valued functions, for a differentiable function $u:[0,1]\to\mathbb{R}^m$, $\|u(1)-u(0)\|\leq\sup_{0\leq t\leq 1}\|u'(t)\|$. Setting $u(t)=f(x+t(x'-x))-x-t(x'-x)$, it follows that

$$\|f(x)-f(x')-x+x'\|\leq\|x-x'\|\,\sup_{0\leq t\leq 1}\|f'(x+t(x'-x))-I\|.$$

Now choose $\delta>0$ so that $\|f'(x)-I\|<\tfrac{1}{2}$ for $\|x\|<\delta$. Suppose that $\|y\|<\delta/2$ and define $x_n$ inductively by $x_0=0$ and $x_{n+1}=x_n+y-f(x_n)$. The assumptions show that if $\|x\|,\,\|x'\|<\delta$ then

$$\|f(x)-f(x')-x+x'\|\leq\|x-x'\|/2.$$

In particular $f(x)=f(x')$ implies $x=x'$. In the inductive scheme $\|x_n\|<\delta$ and $\|x_{n+1}-x_n\|<\delta/2^{n+1}$: the case $n=0$ is $\|x_1-x_0\|=\|y\|<\delta/2$, and each further difference $x_{n+1}-x_n=(x_n-f(x_n))-(x_{n-1}-f(x_{n-1}))$ has at most half the norm of the previous one by the inequality above. Thus $(x_n)$ is a Cauchy sequence tending to a point $x$ with $\|x\|\leq 2\|y\|<\delta$. Passing to the limit in the recursion gives $f(x)=y$ as required.
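The iteration $x_{n+1}=x_n+y-f(x_n)$ is easy to run. The sketch below assumes the illustrative map $f(x_1,x_2)=(x_1+x_2^2,\;x_2+x_1^2)$, which satisfies $f(0)=0$ and $f'(0)=I$; for it, $\|f'(x)-I\|=2\max(|x_1|,|x_2|)<\tfrac12$ when $\|x\|<\tfrac14$, so `DELTA = 0.25` is an admissible $\delta$. The map and the names are assumptions made for this example.

```python
import math

DELTA = 0.25  # radius on which ||f'(x) - I|| < 1/2 for the map below


def f(x):
    x1, x2 = x
    return (x1 + x2**2, x2 + x1**2)


def solve_by_iteration(y, steps=60):
    # Successive approximation: x_{n+1} = x_n + y - f(x_n), starting at x_0 = 0.
    x = (0.0, 0.0)
    history = [x]
    for _ in range(steps):
        fx = f(x)
        x = (x[0] + y[0] - fx[0], x[1] + y[1] - fx[1])
        history.append(x)
    return x, history


y = (0.05, -0.08)                  # ||y|| < DELTA / 2
x, history = solve_by_iteration(y)
print(x, f(x))                     # f(x) should reproduce y
print(math.hypot(*x) < DELTA)      # the solution stays inside the ball
```

The stored `history` makes it possible to inspect the step sizes and compare them with the geometric bound used in the proof.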

To check that $g=f^{-1}$ is $C^1$, write $g(y+k)=x+h$ so that $f(x+h)=f(x)+k$. By the inequalities above, $\|h-k\|<\|h\|/2$ so that $\|h\|/2<\|k\|<2\|h\|$. On the other hand, if $A=f'(x)$, then $\|A-I\|<1/2$. Using the geometric series for $B=I-A$, it follows that $\|A^{-1}\|<2$. But then

$$\frac{\|g(y+k)-g(y)-f'(g(y))^{-1}k\|}{\|k\|}=\frac{\|h-f'(x)^{-1}[f(x+h)-f(x)]\|}{\|k\|}\leq 4\,\frac{\|f(x+h)-f(x)-f'(x)h\|}{\|h\|}$$

tends to 0 as $k$ and $h$ tend to 0, proving that $g$ is $C^1$ with $g'(y)=f'(g(y))^{-1}$.

The proof above is presented for a finite-dimensional space, but applies equally well for Banach spaces. If an invertible function $f$ is $C^k$ with $k>1$, then so too is its inverse. This follows by induction using the fact that the map $F(A)=A^{-1}$ on operators is $C^k$ for any $k$ (in the finite-dimensional case this is an elementary fact because the inverse of a matrix is given as the adjugate matrix divided by its determinant).[1][7] The method of proof here can be found in the books of Henri Cartan, Jean Dieudonné, Serge Lang, Roger Godement and Lars Hörmander.

A proof using the contraction mapping principle

Here is a proof based on the contraction mapping theorem. Specifically, following T. Tao,[8] it uses the following consequence of the contraction mapping theorem.

Lemma. Let $B(0,r)$ denote an open ball of radius $r$ in $\mathbb{R}^n$ with center 0 and $g:B(0,r)\to\mathbb{R}^n$ a map with a constant $0<c<1$ such that

$$|g(y)-g(x)|\leq c|y-x|$$

for all $x,y$ in $B(0,r)$. Then for $f=I+g$ on $B(0,r)$, we have

$$(1-c)|x-y|\leq|f(x)-f(y)|,$$

in particular, $f$ is injective. If, moreover, $g(0)=0$, then

$$B(0,(1-c)r)\subset f(B(0,r))\subset B(0,(1+c)r).$$

More generally, the statement remains true if $\mathbb{R}^n$ is replaced by a Banach space. Also, the first part of the lemma is true for any normed space.

Basically, the lemma says that a small perturbation of the identity map by a contraction map is injective and preserves a ball in some sense. Assuming the lemma for a moment, we prove the theorem first. As in the above proof, it is enough to prove the special case when $a=0$, $b=f(a)=0$ and $f'(0)=I$. Let $g=f-I$. The mean value inequality applied to $t\mapsto g(x+t(y-x))$ says:

$$|g(y)-g(x)|\leq|y-x|\sup_{0<t<1}|g'(x+t(y-x))|.$$

Since $g'(0)=I-I=0$ and $g'$ is continuous, we can find an $r>0$ such that

$$|g(y)-g(x)|\leq 2^{-1}|y-x|$$

for all $x,y$ in $B(0,r)$. Then the earlier lemma says that $f=g+I$ is injective on $B(0,r)$ and $B(0,r/2)\subset f(B(0,r))$. Then

$$f:U=B(0,r)\cap f^{-1}(B(0,r/2))\to V=B(0,r/2)$$

is bijective and thus has an inverse. Next, we show that the inverse $f^{-1}$ is continuously differentiable (this part of the argument is the same as that in the previous proof). This time, let $g=f^{-1}$ denote the inverse of $f$ and $A=f'(x)$. For $x=g(y)$, we write $g(y+k)=x+h$ or $y+k=f(x+h)$. Now, by the earlier estimate, we have

$$|h-k|=|f(x+h)-f(x)-h|\leq|h|/2$$

and so $|h|/2\leq|k|$. Writing $\|\cdot\|$ for the operator norm,

$$|g(y+k)-g(y)-A^{-1}k|=|h-A^{-1}(f(x+h)-f(x))|\leq\|A^{-1}\|\,|Ah-f(x+h)+f(x)|.$$

As $k\to 0$, we have $h\to 0$ and $|h|/|k|$ is bounded. Hence, $g$ is differentiable at $y$ with the derivative $g'(y)=f'(g(y))^{-1}$. Also, $g'$ is the same as the composition $\iota\circ f'\circ g$ where $\iota:T\mapsto T^{-1}$; so $g'$ is continuous.

It remains to show the lemma. First, we have:

$$|x-y|-|f(x)-f(y)|\leq|g(x)-g(y)|\leq c|x-y|,$$

which is to say

$$(1-c)|x-y|\leq|f(x)-f(y)|.$$

This proves the first part. Next, we show $f(B(0,r))\supset B(0,(1-c)r)$. The idea is to note that this is equivalent to, given a point $y$ in $B(0,(1-c)r)$, finding a fixed point of the map

$$F:\overline{B}(0,r')\to\overline{B}(0,r'),\quad x\mapsto y-g(x)$$

where $0<r'<r$ is such that $|y|\leq(1-c)r'$ and the bar denotes a closed ball. To find a fixed point, we use the contraction mapping theorem; checking that $F$ is a well-defined strict contraction is straightforward, since $|F(x)|\leq|y|+|g(x)-g(0)|\leq(1-c)r'+cr'=r'$ and $|F(x)-F(x')|=|g(x)-g(x')|\leq c|x-x'|$. Finally, we have $f(B(0,r))\subset B(0,(1+c)r)$ since

$$|f(x)|=|x+g(x)-g(0)|\leq(1+c)|x|.\qquad\square$$

As might be clear, this proof is not substantially different from the previous one, as the proof of the contraction mapping theorem is by successive approximation.
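The lemma can be sanity-checked in one dimension. The sketch below assumes $g(x)=\tfrac12\sin x$ on $B(0,1)$, which satisfies $g(0)=0$ and the Lipschitz bound with $c=\tfrac12$; this choice and the helper `preimage` (the fixed-point iteration $x\mapsto y-g(x)$ from the proof) are illustrative assumptions.

```python
import math

C = 0.5   # contraction constant of g
R = 1.0   # radius of the ball B(0, R)


def g(x):
    # |g(x) - g(y)| <= C |x - y| because |g'(x)| = C |cos x| <= C.
    return C * math.sin(x)


def f(x):
    return x + g(x)


def preimage(y, steps=200):
    # Fixed point of x -> y - g(x), i.e. a solution of f(x) = y.
    x = 0.0
    for _ in range(steps):
        x = y - g(x)
    return x


y = 0.45                       # any y with |y| < (1 - C) * R
x = preimage(y)
print(x, f(x))                 # f(x) reproduces y
print(abs(x) < R)              # the preimage lies in B(0, R)
```

Each iteration shrinks the error by at least the factor `C`, which is the mechanism behind the contraction mapping theorem invoked in the proof.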

Applications

Implicit function theorem

The inverse function theorem can be used to solve a system of equations

$$\begin{aligned}&f_{1}(x)=y_{1}\\&\quad\vdots\\&f_{n}(x)=y_{n},\end{aligned}$$

i.e., expressing $x=(x_{1},\dots,x_{n})$ as a function of $y_{1},\dots,y_{n}$, provided the Jacobian matrix is invertible. The implicit function theorem allows one to solve a more general system of equations:

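$$\begin{aligned}&f_{1}(x,y)=0\\&\quad\vdots\\&f_{n}(x,y)=0\end{aligned}$$

for $y$ in terms of $x$. Though more general, the theorem is actually a consequence of the inverse function theorem. First, the precise statement of the implicit function theorem is as follows:[9]

Theorem. Given a map $f:\mathbb{R}^{n}\times\mathbb{R}^{m}\to\mathbb{R}^{m}$, if $f(a,b)=0$, $f$ is continuously differentiable in a neighborhood of $(a,b)$ and the derivative of $y\mapsto f(a,y)$ at $b$ is invertible, then there exists a differentiable map $g:U\to V$ for some neighborhoods $U,V$ of $a,b$ such that $f(x,g(x))=0$. Moreover, if $f(x,y)=0$, $x\in U$, $y\in V$, then $y=g(x)$; i.e., $g(x)$ is the unique solution.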

To see this, consider the map $F(x,y)=(x,f(x,y))$. By the inverse function theorem, $F:U\times V\to W$ has the inverse $G$ for some neighborhoods $U,V,W$. We then have:

$$(x,y)=F(G_{1}(x,y),G_{2}(x,y))=(G_{1}(x,y),f(G_{1}(x,y),G_{2}(x,y))),$$

implying $x=G_{1}(x,y)$ and $y=f(x,G_{2}(x,y))$. Thus $g(x)=G_{2}(x,0)$ has the required property. $\square$
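The construction $g(x)=G_2(x,0)$ can be traced on a concrete equation. The sketch below assumes the circle equation $f(x,y)=x^2+y^2-1$ near the point $(0,1)$ as an illustrative example; the local inverse $G$ of $F(x,y)=(x,f(x,y))$ is computed numerically by Newton's method in $y$, which is a convenience chosen here and not part of the proof.

```python
import math


def f(x, y):
    return x * x + y * y - 1.0


def df_dy(x, y):
    # Partial derivative in y; nonzero near (0, 1), as the theorem requires.
    return 2.0 * y


def G(x, c, y_start=1.0, steps=50):
    # Local inverse of F(x, y) = (x, f(x, y)) near (0, 1):
    # returns (x, y) with f(x, y) = c, found by Newton's method in y.
    y = y_start
    for _ in range(steps):
        y -= (f(x, y) - c) / df_dy(x, y)
    return (x, y)


def g(x):
    # The implicit function: second component of G(x, 0).
    return G(x, 0.0)[1]


for x in (-0.5, 0.0, 0.3):
    print(x, g(x), math.sqrt(1 - x * x))   # g agrees with the upper semicircle
```

The first component of `G` is returned unchanged, which mirrors the identity $x=G_1(x,y)$ in the argument above.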

Giving a manifold structure

In differential geometry, the inverse function theorem is used to show that the pre-image of a regular value under a smooth map is a manifold.[10] Indeed, let $f:U\to\mathbb{R}^{r}$ be such a smooth map from an open subset of $\mathbb{R}^{n}$ (since the result is local, there is no loss of generality in considering such a map). Fix a point $a$ in $f^{-1}(b)$ and then, by permuting the coordinates on $\mathbb{R}^{n}$, assume the matrix $\left[\frac{\partial f_{i}}{\partial x_{j}}(a)\right]_{1\leq i,j\leq r}$ has rank $r$. Then the map $F:U\to\mathbb{R}^{r}\times\mathbb{R}^{n-r}=\mathbb{R}^{n}$, $x\mapsto(f(x),x_{r+1},\dots,x_{n})$ is such that $F'(a)$ has rank $n$. Hence, by the inverse function theorem, we find the smooth inverse $G$ of $F$ defined in a neighborhood $V\times W$ of $(b,a_{r+1},\dots,a_{n})$. We then have

$$x=(F\circ G)(x)=(f(G(x)),G_{r+1}(x),\dots,G_{n}(x)),$$

which implies

$$(f\circ G)(x_{1},\dots,x_{n})=(x_{1},\dots,x_{r}).$$

That is, after the change of coordinates by $G$, $f$ is a coordinate projection (this fact is known as the submersion theorem). Moreover, since $G:V\times W\to U'=G(V\times W)$ is bijective, the map

$$g=G(b,\cdot):W\to f^{-1}(b)\cap U',\quad(x_{r+1},\dots,x_{n})\mapsto G(b,x_{r+1},\dots,x_{n})$$

is bijective with smooth inverse. That is to say, $g$ gives a local parametrization of $f^{-1}(b)$ around $a$. Hence, $f^{-1}(b)$ is a manifold. $\square$ (Note that the proof is quite similar to the proof of the implicit function theorem and, in fact, the implicit function theorem can also be used instead.)

More generally, the theorem shows that if a smooth map $f:P\to E$ is transversal to a submanifold $M\subset E$, then the pre-image $f^{-1}(M)\hookrightarrow P$ is a submanifold.[11]

Global version

The inverse function theorem is a local result; it applies to each point. A priori, the theorem thus only shows the function $f$ is locally bijective (or locally diffeomorphic of some class). The next topological lemma can be used to upgrade local injectivity to injectivity that is global to some extent.

Lemma[12][13] If $A$ is a closed subset of a (second-countable) topological manifold $X$ (or, more generally, a topological space admitting an exhaustion by compact subsets) and $f:X\to Z$, $Z$ some topological space, is a local homeomorphism that is injective on $A$, then $f$ is injective on some neighborhood of $A$.

Proof:[14] First assume $X$ is compact. If the conclusion of the lemma is false, we can find two sequences $x_{i}\neq y_{i}$ such that $f(x_{i})=f(y_{i})$ and $x_{i},y_{i}$ each converge to some points $x,y$ in $A$. Since $f$ is injective on $A$, $x=y$. Now, if $i$ is large enough, $x_{i},y_{i}$ are in a neighborhood of $x=y$ where $f$ is injective; thus, $x_{i}=y_{i}$, a contradiction.

In general, consider the set $E=\{(x,y)\in X^{2}\mid x\neq y,\ f(x)=f(y)\}$. It is disjoint from $S\times S$ for any subset $S\subset X$ where $f$ is injective. Let $X_{1}\subset X_{2}\subset\cdots$ be an increasing sequence of compact subsets with union $X$ and with $X_{i}$ contained in the interior of $X_{i+1}$. Then, by the first part of the proof, for each $i$, we can find a neighborhood $U_{i}$ of $A\cap X_{i}$ such that $U_{i}^{2}\subset X^{2}-E$. Then $U=\bigcup_{i}U_{i}$ has the required property. $\square$ (See also[15] for an alternative approach.)

The lemma implies the following (sort of) global version of the inverse function theorem:

Inverse function theorem[16] Let $f:U\to V$ be a map between open subsets of $\mathbb{R}^{n}$ or more generally of manifolds. Assume $f$ is continuously differentiable (or is $C^{k}$). If $f$ is injective on a closed subset $A\subset U$ and if the Jacobian matrix of $f$ is invertible at each point of $A$, then $f$ is injective on a neighborhood $A'$ of $A$ and $f^{-1}:f(A')\to A'$ is continuously differentiable (or is $C^{k}$).

Note that if $A$ is a point, then the above is the usual inverse function theorem.

Holomorphic inverse function theorem

There is a version of the inverse function theorem for holomorphic maps.

Theorem[17][18] Let $U,V\subset\mathbb{C}^{n}$ be open subsets such that $0\in U$ and $f:U\to V$ a holomorphic map whose Jacobian matrix in variables $z_{i},\overline{z}_{i}$ is invertible (the determinant is nonzero) at $0$. Then $f$ is injective in some neighborhood $W$ of $0$ and the inverse $f^{-1}:f(W)\to W$ is holomorphic.

The theorem follows from the usual inverse function theorem. Indeed, let $J_{\mathbb{R}}(f)$ denote the Jacobian matrix of $f$ in variables $x_{i},y_{i}$ and $J(f)$ for that in $z_{j},\overline{z}_{j}$. Then we have $\det J_{\mathbb{R}}(f)=|\det J(f)|^{2}$, which is nonzero by assumption. Hence, by the usual inverse function theorem, $f$ is injective near $0$ with continuously differentiable inverse. By the chain rule, with $w=f(z)$,

$$\frac{\partial}{\partial\overline{z}_{j}}(f_{j}^{-1}\circ f)(z)=\sum_{k}\frac{\partial f_{j}^{-1}}{\partial w_{k}}(w)\frac{\partial f_{k}}{\partial\overline{z}_{j}}(z)+\sum_{k}\frac{\partial f_{j}^{-1}}{\partial\overline{w}_{k}}(w)\frac{\partial\overline{f}_{k}}{\partial\overline{z}_{j}}(z)$$

where the left-hand side and the first term on the right vanish since $f_{j}^{-1}\circ f$ and $f_{k}$ are holomorphic. The remaining sum therefore vanishes, and since the matrix of the $\partial\overline{f}_{k}/\partial\overline{z}_{j}$ is the complex conjugate of the invertible holomorphic Jacobian, it follows that $\frac{\partial f_{j}^{-1}}{\partial\overline{w}_{k}}(w)=0$ for each $k$. $\square$
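The determinant identity $\det J_{\mathbb{R}}(f)=|\det J(f)|^{2}$ used in this proof can be checked numerically in the simplest case $n=1$. The sketch below assumes the illustrative function $f(z)=z^{2}+z$ and takes the complex Jacobian to be the single entry $f'(z)$ (an assumption about the intended convention for $n=1$); the real Jacobian of $(x,y)\mapsto(\operatorname{Re}f,\operatorname{Im}f)$ is approximated by central differences.

```python
def f(z):
    return z * z + z


def f_prime(z):
    return 2 * z + 1


def real_jacobian_det(z, step=1e-6):
    # Partial derivatives of f with respect to x and y (z = x + i y).
    d_dx = (f(z + step) - f(z - step)) / (2 * step)
    d_dy = (f(z + 1j * step) - f(z - 1j * step)) / (2 * step)
    u_x, v_x = d_dx.real, d_dx.imag
    u_y, v_y = d_dy.real, d_dy.imag
    # Determinant of [[u_x, u_y], [v_x, v_y]].
    return u_x * v_y - u_y * v_x


z = 0.4 - 1.3j
print(real_jacobian_det(z), abs(f_prime(z)) ** 2)
```

The agreement of the two printed values reflects the Cauchy-Riemann equations, which force the real Jacobian to have the form of a rotation scaled by $|f'(z)|$.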

Similarly, there is the implicit function theorem for holomorphic functions.[19]

As already noted earlier, it can happen that an injective smooth function has an inverse that is not smooth (e.g., $f(x)=x^{3}$ in a real variable). This is not the case for holomorphic functions because of:

Proposition[19] If $f:U\to V$ is an injective holomorphic map between open subsets of $\mathbb{C}^{n}$, then $f^{-1}:f(U)\to U$ is holomorphic.

Formulations for manifolds

The inverse function theorem can be rephrased in terms of differentiable maps between differentiable manifolds. In this context the theorem states that for a differentiable map $F:M\to N$ (of class $C^{1}$), if the differential of $F$,

$$dF_{p}:T_{p}M\to T_{F(p)}N$$

is a linear isomorphism at a point $p$ in $M$ then there exists an open neighborhood $U$ of $p$ such that

$$F|_{U}:U\to F(U)$$

is a diffeomorphism. Note that this implies that the connected components of $M$ and $N$ containing $p$ and $F(p)$ have the same dimension, as is already directly implied by the assumption that $dF_{p}$ is an isomorphism. If the derivative of $F$ is an isomorphism at all points $p$ in $M$ then the map $F$ is a local diffeomorphism.

Generalizations

Banach spaces

The inverse function theorem can also be generalized to differentiable maps between Banach spaces $X$ and $Y$.[20] Let $U$ be an open neighbourhood of the origin in $X$ and $F:U\to Y$ a continuously differentiable function, and assume that the Fréchet derivative $dF_{0}:X\to Y$ of $F$ at 0 is a bounded linear isomorphism of $X$ onto $Y$. Then there exists an open neighbourhood $V$ of $F(0)$ in $Y$ and a continuously differentiable map $G:V\to X$ such that $F(G(y))=y$ for all $y$ in $V$. Moreover, $G(y)$ is the only sufficiently small solution $x$ of the equation $F(x)=y$.

There is also the inverse function theorem for Banach manifolds.[21]

Constant rank theorem

The inverse function theorem (and theimplicit function theorem) can be seen as a special case of the constant rank theorem, which states that a smooth map with constantrank near a point can be put in a particular normal form near that point.[22] Specifically, ifF:MN{\displaystyle F:M\to N} has constant rank near a pointpM{\displaystyle p\in M\!}, then there are open neighborhoodsU ofp andV ofF(p){\displaystyle F(p)\!} and there are diffeomorphismsu:TpMU{\displaystyle u:T_{p}M\to U\!} andv:TF(p)NV{\displaystyle v:T_{F(p)}N\to V\!} such thatF(U)V{\displaystyle F(U)\subseteq V\!} and such that the derivativedFp:TpMTF(p)N{\displaystyle dF_{p}:T_{p}M\to T_{F(p)}N\!} is equal tov1Fu{\displaystyle v^{-1}\circ F\circ u\!}. That is,F "looks like" its derivative nearp. The set of pointspM{\displaystyle p\in M} such that the rank is constant in a neighborhood ofp{\displaystyle p} is an open dense subset ofM; this is a consequence ofsemicontinuity of the rank function. Thus the constant rank theorem applies to a generic point of the domain.

When the derivative of F is injective (resp. surjective) at a point p, it is also injective (resp. surjective) in a neighborhood of p, and hence the rank of F is constant on that neighborhood, and the constant rank theorem applies.
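A small worked example may make the normal form concrete. The map below is a hypothetical illustration, not one taken from the cited reference: it has rank 1 everywhere on R^2, and with p the origin, u the identity and v a shear, the composite $v^{-1}\circ F\circ u$ reduces to the linear map $dF_p$.

```latex
% Hypothetical example: F : \mathbb{R}^2 \to \mathbb{R}^2, p = (0,0), T_pM and T_{F(p)}N identified with \mathbb{R}^2
F(x,y) = (x,\,x^2), \qquad
dF_{(x,y)} = \begin{pmatrix} 1 & 0 \\ 2x & 0 \end{pmatrix}
\quad (\text{rank } 1 \text{ everywhere}), \qquad
dF_p(x,y) = (x,\,0).

% Take u = \mathrm{id} and v(a,b) = (a,\,b + a^2), so that v^{-1}(a,b) = (a,\,b - a^2):
(v^{-1}\circ F\circ u)(x,y) = v^{-1}(x,\,x^2) = (x,\,x^2 - x^2) = (x,\,0) = dF_p(x,y).
```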

Polynomial functions

If it is true, the Jacobian conjecture would be a variant of the inverse function theorem for polynomials. It states that if a vector-valued polynomial function has a Jacobian determinant that is an invertible polynomial (that is, a nonzero constant), then it has an inverse that is also a polynomial function. It is unknown whether this is true or false, even in the case of two variables. This is a major open problem in the theory of polynomials.
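For orientation, here is one elementary polynomial map that satisfies the hypothesis and does have a polynomial inverse. It is an illustrative example chosen for this note and says nothing about the general conjecture.

```latex
F(x,y) = (x + y^2,\; y), \qquad
J_F(x,y) = \begin{pmatrix} 1 & 2y \\ 0 & 1 \end{pmatrix}, \qquad
\det J_F = 1, \qquad
F^{-1}(u,v) = (u - v^2,\; v).
```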

Selections

When $f:\mathbb{R}^n\to\mathbb{R}^m$ with $m\leq n$, $f$ is $k$ times continuously differentiable, and the Jacobian $A=\nabla f(\overline{x})$ at a point $\overline{x}$ is of rank $m$, the inverse of $f$ may not be unique. However, there exists a local selection function $s$ such that $f(s(y))=y$ for all $y$ in a neighborhood of $\overline{y}=f(\overline{x})$, $s(\overline{y})=\overline{x}$, $s$ is $k$ times continuously differentiable in this neighborhood, and $\nabla s(\overline{y})=A^T(AA^T)^{-1}$ ($\nabla s(\overline{y})$ is the Moore–Penrose pseudoinverse of $A$).[23]
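The following sketch checks these properties numerically on a toy case. Everything in it is an assumption made for illustration: the map $f(x_1,x_2)=x_1^2+x_2$, the base point $\overline{x}=(1,0)$, and the construction of $s$ by solving $f(\overline{x}+tA^T)=y$ along the direction $A^T$, which is one possible way to build a selection and not necessarily the construction used in the cited reference.

```python
import math


def f(x1, x2):
    """Example map f: R^2 -> R (m = 1 <= n = 2)."""
    return x1 ** 2 + x2


X_BAR = (1.0, 0.0)   # base point (hypothetical choice)
Y_BAR = f(*X_BAR)    # f(1, 0) = 1
A = (2.0, 1.0)       # Jacobian of f at X_BAR: (2 * x1, 1) = (2, 1), rank 1


def s(y):
    """Local selection: solve f(X_BAR + t * A^T) = y for the root t near 0.

    Along this line f equals 1 + 5t + 4t^2, so t solves
    4t^2 + 5t + (1 - y) = 0; the root near 0 is taken.
    """
    t = (-5.0 + math.sqrt(9.0 + 16.0 * y)) / 8.0
    return (X_BAR[0] + A[0] * t, X_BAR[1] + A[1] * t)


def pseudoinverse_row(a):
    """A^T (A A^T)^{-1} for a single-row matrix A."""
    norm_sq = sum(c * c for c in a)
    return tuple(c / norm_sq for c in a)


def derivative_of_s(y, h=1e-6):
    """Central finite-difference approximation of ds/dy."""
    lo, hi = s(y - h), s(y + h)
    return tuple((b - a) / (2 * h) for a, b in zip(lo, hi))


if __name__ == "__main__":
    print("s(y_bar)         =", s(Y_BAR))
    print("f(s(1.3))        =", f(*s(1.3)))
    print("ds/dy at y_bar   ~", derivative_of_s(Y_BAR))
    print("A^T (A A^T)^-1   =", pseudoinverse_row(A))
```

Restricting f to the affine line through $\overline{x}$ in the direction $A^T$ gives a map between spaces of equal dimension, to which the ordinary inverse function theorem applies; differentiating $f(\overline{x}+t(y)A^T)=y$ at $\overline{y}$ gives $AA^T\,t'(\overline{y})=1$, which is where the pseudoinverse formula comes from in this construction.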

Over a real closed field

The inverse function theorem also holds over a real closed field k (or an o-minimal structure).[24] Precisely, the theorem holds for a semialgebraic (or definable) map between open subsets of $k^n$ that is continuously differentiable.

The usual proof of the IFT uses Banach's fixed point theorem, which relies on Cauchy completeness. That part of the argument is replaced by the use of the extreme value theorem, which does not need completeness. Explicitly, in § A proof using the contraction mapping principle, Cauchy completeness is used only to establish the inclusion $B(0,r/2)\subset f(B(0,r))$. Here, we shall directly show $B(0,r/4)\subset f(B(0,r))$ instead (which is enough). Given a point $y$ in $B(0,r/4)$, consider the function $P(x)=|f(x)-y|^2$ defined on a neighborhood of $\overline{B}(0,r)$. If $P'(x)=0$, then $0=P'(x)=2[f_1(x)-y_1\;\cdots\;f_n(x)-y_n]f'(x)$ and so $f(x)=y$, since $f'(x)$ is invertible. Now, by the extreme value theorem, $P$ attains a minimum at some point $x_0$ on the closed ball $\overline{B}(0,r)$, which can be shown to lie in $B(0,r)$ using $2^{-1}|x|\leq|f(x)|$. Since $P'(x_0)=0$, $f(x_0)=y$, which proves the claimed inclusion. $\square$
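The step showing that the minimum point lies in the open ball can be spelled out as follows. This is a sketch that assumes the normalization $f(0)=0$ from the contraction mapping proof referred to above, together with $|y|<r/4$; it compares the value of $P$ on the boundary sphere with its value at the centre.

```latex
% Assumes f(0) = 0 (normalization taken from the contraction-mapping proof) and |y| < r/4.
|x| = r \;\Longrightarrow\;
|f(x) - y| \;\ge\; |f(x)| - |y| \;>\; \tfrac{r}{2} - \tfrac{r}{4} \;=\; \tfrac{r}{4}
\;>\; |y| \;=\; |f(0) - y|,
\qquad\text{hence}\qquad P(x) > P(0) \ \text{ whenever } |x| = r.
```

So the minimum of $P$ over the closed ball cannot be attained on the boundary, and the minimizer is an interior point at which the derivative of $P$ vanishes.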

Alternatively, one can deduce the theorem from the one over the real numbers by Tarski's principle.[citation needed]

See also

Notes

  1. Theorem 1.1.7. in Hörmander, Lars (2015). The Analysis of Linear Partial Differential Operators I: Distribution Theory and Fourier Analysis. Classics in Mathematics (2nd ed.). Springer. ISBN 978-3-642-61497-2.
  2. McOwen, Robert C. (1996). "Calculus of Maps between Banach Spaces". Partial Differential Equations: Methods and Applications. Upper Saddle River, NJ: Prentice Hall. pp. 218–224. ISBN 0-13-121880-8.
  3. Tao, Terence (12 September 2011). "The inverse function theorem for everywhere differentiable maps". Retrieved 26 July 2019.
  4. Jaffe, Ethan. "Inverse Function Theorem" (PDF).
  5. Spivak 1965, pages 31–35.
  6. Hubbard, John H.; Hubbard, Barbara Burke (2001). Vector Analysis, Linear Algebra, and Differential Forms: A Unified Approach (Matrix ed.).
  7. Cartan, Henri (1971). Calcul Differentiel (in French). Hermann. pp. 55–61. ISBN 978-0-395-12033-0.
  8. Theorem 17.7.2 in Tao, Terence (2014). Analysis. II. Texts and Readings in Mathematics. Vol. 38 (Third edition of 2006 original ed.). New Delhi: Hindustan Book Agency. ISBN 978-93-80250-65-6. MR 3310023. Zbl 1300.26003.
  9. Spivak 1965, Theorem 2-12.
  10. Spivak 1965, Theorem 5-1. and Theorem 2-13.
  11. "Transversality" (PDF). northwestern.edu.
  12. One of Spivak's books (Editorial note: give the exact location).
  13. Hirsch 1976, Ch. 2, § 1., Exercise 7. NB: This one is for a $C^1$-immersion.
  14. Lemma 13.3.3. of Lectures on differential topology, utoronto.ca.
  15. Dan Ramras (https://mathoverflow.net/users/4042/dan-ramras), On a proof of the existence of tubular neighborhoods., URL (version: 2017-04-13): https://mathoverflow.net/q/58124
  16. Ch. I., § 3, Exercise 10. and § 8, Exercise 14. in V. Guillemin, A. Pollack. "Differential Topology". Prentice-Hall Inc., 1974. ISBN 0-13-212605-2.
  17. Griffiths & Harris 1978, p. 18.
  18. Fritzsche, K.; Grauert, H. (2002). From Holomorphic Functions to Complex Manifolds. Springer. pp. 33–36. ISBN 978-0-387-95395-3.
  19. Griffiths & Harris 1978, p. 19.
  20. Luenberger, David G. (1969). Optimization by Vector Space Methods. New York: John Wiley & Sons. pp. 240–242. ISBN 0-471-55359-X.
  21. Lang, Serge (1985). Differential Manifolds. New York: Springer. pp. 13–19. ISBN 0-387-96113-5.
  22. Boothby, William M. (1986). An Introduction to Differentiable Manifolds and Riemannian Geometry (Second ed.). Orlando: Academic Press. pp. 46–50. ISBN 0-12-116052-1.
  23. Dontchev, Asen L.; Rockafellar, R. Tyrrell (2014). Implicit Functions and Solution Mappings: A View from Variational Analysis (Second ed.). New York: Springer-Verlag. p. 54. ISBN 978-1-4939-1036-6.
  24. Chapter 7, Theorem 2.11. in Dries, L. P. D. van den (1998). Tame Topology and O-minimal Structures. London Mathematical Society lecture note series, no. 248. Cambridge, New York, and Oakleigh, Victoria: Cambridge University Press. doi:10.1017/CBO9780511525919. ISBN 9780521598385.

References
