Inverse function theorem

In mathematics, specifically differential calculus, the inverse function theorem gives a sufficient condition for a function to be invertible in a neighborhood of a point in its domain: namely, that its derivative is continuous and nonzero at the point. The theorem also gives a formula for the derivative of the inverse function. In multivariable calculus, the theorem generalizes to any continuously differentiable, vector-valued function whose Jacobian determinant is nonzero at a point in its domain, and it gives a formula for the Jacobian matrix of the inverse. There are also versions of the inverse function theorem for complex holomorphic functions, for differentiable maps between manifolds, and for differentiable functions between Banach spaces.

The theorem was first established by Picard and Goursat using an iterative scheme: the basic idea is to prove a fixed point theorem using the contraction mapping theorem.

Statements

For functions of a single variable, the theorem states that if $f$ is a continuously differentiable function with nonzero derivative at the point $a$, then $f$ is injective (or bijective onto its image) in a neighborhood of $a$, the inverse is continuously differentiable near $b = f(a)$, and the derivative of the inverse function at $b$ is the reciprocal of the derivative of $f$ at $a$:

$$\bigl(f^{-1}\bigr)'(b) = \frac{1}{f'(a)} = \frac{1}{f'(f^{-1}(b))}.$$

A function $f$ may be injective near a point $a$ while $f'(a) = 0$. An example is $f(x) = (x - a)^3$. In fact, for such a function, the inverse cannot be differentiable at $b = f(a)$: if $f^{-1}$ were differentiable at $b$, then, by the chain rule, $1 = (f^{-1} \circ f)'(a) = (f^{-1})'(b) f'(a)$, which implies $f'(a) \neq 0$. (The situation is different for holomorphic functions; see Holomorphic inverse function theorem below.)

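
As a rough numerical illustration of the single-variable derivative formula, the following Python sketch compares a finite-difference estimate of $(f^{-1})'(b)$ with $1/f'(a)$. The function $f(x) = x^3 + x$, the point $a = 1$, the bisection helper `f_inverse`, and the step size `h` are all assumptions chosen for illustration; they do not come from the statement itself.

```python
def f(x):
    # Illustrative choice: strictly increasing, with f'(x) = 3x^2 + 1 > 0 everywhere.
    return x**3 + x

def f_prime(x):
    return 3 * x**2 + 1

def f_inverse(y, lo=-10.0, hi=10.0, iterations=200):
    """Invert the increasing function f on [lo, hi] by bisection."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

a = 1.0
b = f(a)          # b = f(a)
h = 1e-6          # step for the central difference (assumed small enough)

# Finite-difference estimate of the derivative of the inverse at b.
numeric = (f_inverse(b + h) - f_inverse(b - h)) / (2 * h)
# Value predicted by the theorem: the reciprocal of f'(a).
predicted = 1 / f_prime(a)

print(numeric, predicted)
```

The two printed numbers should agree to several decimal places. Bisection is used only because it needs no derivative information, so the check does not presuppose the formula being tested.
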
For functions of more than one variable, the theorem states that if $f$ is a continuously differentiable function from an open set $U$ of $\mathbb{R}^n$ into $\mathbb{R}^n$, and the derivative $f'(a)$ is invertible at a point $a$ (that is, the determinant of the Jacobian matrix of $f$ at $a$ is nonzero), then there exist neighborhoods $U'$ of $a$ in $U$ and $V$ of $b = f(a)$ such that $f(U') \subset V$ and $f : U' \to V$ is bijective.[1] Writing $f = (f_1, \ldots, f_n)$, this means that the system of $n$ equations $y_i = f_i(x_1, \dots, x_n)$ has a unique solution for $x_1, \dots, x_n$ in terms of $y_1, \dots, y_n$ when $x \in U'$, $y \in V$. Note that the theorem does not say that $f$ is bijective onto its image on the whole set where $f'$ is invertible (where the determinant of the Jacobian matrix is nonzero), only that it is locally bijective there.

Moreover, the theorem says that the inverse function $f^{-1} : V \to U'$ is continuously differentiable, and its derivative at $b = f(a)$ is the inverse map of $f'(a)$; i.e.,

$$(f^{-1})'(b) = f'(a)^{-1}.$$

In other words, if $Jf^{-1}(b)$ and $Jf(a)$ are the Jacobian matrices representing $(f^{-1})'(b)$ and $f'(a)$, this means:

$$Jf^{-1}(b) = Jf(a)^{-1}.$$

The hard part of the theorem is the existence and differentiability of $f^{-1}$. Assuming this, the inverse derivative formula follows from the chain rule applied to $f^{-1} \circ f = I$. (Indeed, $I = (f^{-1} \circ f)'(a) = (f^{-1})'(b) \circ f'(a)$.) Since taking the inverse of a linear map is infinitely differentiable, the formula for the derivative of the inverse shows that if $f$ is $k$ times continuously differentiable, with invertible derivative at the point $a$, then the inverse is also $k$ times continuously differentiable. Here $k$ is a positive integer or $\infty$.
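
As a worked instance of the Jacobian formula, consider the polar-coordinate map. This example is an illustrative assumption and is not taken from the text above. The Jacobian of $f$ is inverted directly and then compared with the Jacobian of the explicit local inverse $f^{-1}(x, y) = (\sqrt{x^2 + y^2}, \arctan(y/x))$, valid for $x > 0$.

```latex
\begin{align*}
f(r,\theta) &= (r\cos\theta,\ r\sin\theta), \qquad r > 0, \\
Jf(r,\theta) &= \begin{bmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{bmatrix},
  \qquad \det Jf(r,\theta) = r \neq 0, \\
Jf(r,\theta)^{-1} &= \frac{1}{r}\begin{bmatrix} r\cos\theta & r\sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}
  = \begin{bmatrix} \cos\theta & \sin\theta \\ -\tfrac{\sin\theta}{r} & \tfrac{\cos\theta}{r} \end{bmatrix}, \\
Jf^{-1}(x,y) &= \begin{bmatrix} \tfrac{x}{r} & \tfrac{y}{r} \\ -\tfrac{y}{r^{2}} & \tfrac{x}{r^{2}} \end{bmatrix}
  = \begin{bmatrix} \cos\theta & \sin\theta \\ -\tfrac{\sin\theta}{r} & \tfrac{\cos\theta}{r} \end{bmatrix}
  \qquad (x = r\cos\theta,\ y = r\sin\theta).
\end{align*}
```

The last two lines agree, as the formula $Jf^{-1}(b) = Jf(a)^{-1}$ predicts with $a = (r, \theta)$ and $b = (x, y)$.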

There are two variants of the inverse function theorem.[1] Given a continuously differentiable map $f : U \to \mathbb{R}^m$, the first is:

The derivative $f'(a)$ is surjective (i.e., the Jacobian matrix representing it has rank $m$) if and only if there exists a continuously differentiable function $g$ on a neighborhood $V$ of $b = f(a)$ such that $f \circ g = I$ near $b$.

The second is:

The derivative $f'(a)$ is injective if and only if there exists a continuously differentiable function $g$ on a neighborhood $V$ of $b = f(a)$ such that $g \circ f = I$ near $a$.

In the first case (when $f'(a)$ is surjective), the point $b = f(a)$ is called a regular value. Since $m = \dim \ker(f'(a)) + \dim \operatorname{im}(f'(a))$, the first case is equivalent to saying that $b = f(a)$ is not in the image of critical points $a$ (a critical point is a point $a$ such that the kernel of $f'(a)$ is nonzero). The statement in the first case is sometimes also called the submersion theorem.

These variants are restatements of the inverse function theorem. Indeed, in the first case, when $f'(a)$ is surjective, we can find an (injective) linear map $T$ such that $f'(a) \circ T = I$. Define $h(x) = a + Tx$ so that we have:

$$(f \circ h)'(0) = f'(a) \circ T = I.$$

Thus, by the inverse function theorem, $f \circ h$ has an inverse near $0$; i.e., $f \circ h \circ (f \circ h)^{-1} = I$ near $b$. The second case ($f'(a)$ is injective) is handled in a similar way.

Example

Consider the vector-valued function $F : \mathbb{R}^2 \to \mathbb{R}^2$ defined by:

$$F(x, y) = \begin{bmatrix} e^x \cos y \\ e^x \sin y \end{bmatrix}.$$

The Jacobian matrix is:

$$J_F(x, y) = \begin{bmatrix} e^x \cos y & -e^x \sin y \\ e^x \sin y & e^x \cos y \end{bmatrix}$$

with Jacobian determinant:

$$\det J_F(x, y) = e^{2x} \cos^2 y + e^{2x} \sin^2 y = e^{2x}.$$

The determinant $e^{2x}$ is nonzero everywhere. Thus the theorem guarantees that, for every point $p$ in $\mathbb{R}^2$, there exists a neighborhood about $p$ over which $F$ is invertible. This does not mean $F$ is invertible over its entire domain: in this case $F$ is not even injective, since it is periodic: $F(x, y) = F(x, y + 2\pi)$.
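
The example can be checked numerically with a short Python sketch. The sample point `p`, the explicit branch `local_inverse` (valid only where $y$ lies in $(-\pi, \pi)$), the helper names, and the finite-difference step are assumptions introduced for illustration.

```python
import math

def F(x, y):
    return (math.exp(x) * math.cos(y), math.exp(x) * math.sin(y))

def jacobian_F(x, y):
    ex = math.exp(x)
    return [[ex * math.cos(y), -ex * math.sin(y)],
            [ex * math.sin(y),  ex * math.cos(y)]]

def local_inverse(u, v):
    # One branch of the inverse of F, valid where y lies in (-pi, pi).
    return (0.5 * math.log(u * u + v * v), math.atan2(v, u))

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def numeric_jacobian(func, u, v, h=1e-6):
    # Central differences; row i holds the partial derivatives of component i.
    du = [(p - m) / (2 * h) for p, m in zip(func(u + h, v), func(u - h, v))]
    dv = [(p - m) / (2 * h) for p, m in zip(func(u, v + h), func(u, v - h))]
    return [[du[0], dv[0]], [du[1], dv[1]]]

p = (0.3, 0.7)                       # arbitrary sample point
q = F(*p)
round_trip = local_inverse(*q)       # should recover p

predicted = inverse_2x2(jacobian_F(*p))          # J_F(p)^{-1}
measured = numeric_jacobian(local_inverse, *q)   # Jacobian of the inverse at F(p)
max_error = max(abs(predicted[i][j] - measured[i][j])
                for i in range(2) for j in range(2))

shifted = F(p[0], p[1] + 2 * math.pi)  # same image point: F is not globally injective
print(round_trip, max_error, shifted, q)
```

The round trip recovers `p`, the measured Jacobian of the local inverse matches $J_F(p)^{-1}$ up to finite-difference error, and `shifted` coincides with `q` up to rounding. The last observation is the periodicity that prevents a global inverse.
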

Counter-example

The function $f(x) = x + 2x^2 \sin(\tfrac{1}{x})$ is bounded inside a quadratic envelope near the line $y = x$, so $f'(0) = 1$. Nevertheless, it has local max/min points accumulating at $x = 0$, so it is not one-to-one on any surrounding interval.

If one drops the assumption that the derivative is continuous, the function no longer need be invertible. For example, $f(x) = x + 2x^2 \sin(\tfrac{1}{x})$ for $x \neq 0$, with $f(0) = 0$, has the discontinuous derivative $f'(x) = 1 - 2\cos(\tfrac{1}{x}) + 4x \sin(\tfrac{1}{x})$ for $x \neq 0$, with $f'(0) = 1$, and this derivative vanishes arbitrarily close to $x = 0$. These critical points are local max/min points of $f$, so $f$ is not one-to-one (and not invertible) on any interval containing $x = 0$. Intuitively, the slope $f'(0) = 1$ does not propagate to nearby points, where the slopes are governed by a weak but rapid oscillation.
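
The sign changes of the derivative can be seen numerically. The Python sketch below evaluates $f'$ at the points $x = 1/(2\pi k)$, where $\cos(1/x) = 1$, and at $x = 1/((2k+1)\pi)$, where $\cos(1/x) = -1$. The sample range of `k`, the index 100, and the offset `eps` are assumptions chosen for illustration.

```python
import math

def f(x):
    return 0.0 if x == 0 else x + 2 * x**2 * math.sin(1 / x)

def f_prime(x):
    # f'(0) = 1 comes from the definition of the derivative, not from this formula.
    return 1.0 if x == 0 else 1 - 2 * math.cos(1 / x) + 4 * x * math.sin(1 / x)

ks = range(1, 1001)
# At x = 1/(2*pi*k): cos(1/x) = 1 and sin(1/x) = 0, so f'(x) is close to -1.
negative_slopes = [f_prime(1 / (2 * math.pi * k)) for k in ks]
# At x = 1/((2k+1)*pi): cos(1/x) = -1 and sin(1/x) = 0, so f'(x) is close to 3.
positive_slopes = [f_prime(1 / ((2 * k + 1) * math.pi)) for k in ks]

# Near a point with negative slope, f decreases, so f is not monotonic near 0.
x_k = 1 / (2 * math.pi * 100)
eps = 1e-7
decreasing_here = f(x_k - eps) > f(x_k + eps)

print(min(positive_slopes), max(negative_slopes), decreasing_here)
```

Both families of points accumulate at $0$, so every interval around $0$ contains points where $f$ is increasing and points where it is decreasing, even though $f'(0) = 1$.
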

Methods of proof

As an important result, the inverse function theorem has been given numerous proofs. The proof most commonly seen in textbooks relies on the contraction mapping principle, also known as the Banach fixed-point theorem (which can also be used as the key step in the proof of existence and uniqueness of solutions to ordinary differential equations).[2][3] Since the fixed-point theorem applies in infinite-dimensional (Banach space) settings, this proof generalizes immediately to the infinite-dimensional version of the inverse function theorem[4] (see Generalizations below).

An alternate proof in finite dimensions hinges on the extreme value theorem for functions on a compact set.[5] Yet another proof uses Newton's method, which has the advantage of providing an effective version of the theorem: bounds on the derivative of the function imply an estimate of the size of the neighborhood on which the function is invertible.[6]

A proof using successive approximation

To prove existence, it can be assumed after an affine transformation that $f(0) = 0$ and $f'(0) = I$, so that $a = b = 0$.

By the fundamental theorem of calculus, if $u$ is a $C^1$ function, $u(1) - u(0) = \int_0^1 u'(t)\,dt$, so that $\|u(1) - u(0)\| \le \sup_{0 \le t \le 1} \|u'(t)\|$. Setting $u(t) = f(x + t(x' - x)) - x - t(x' - x)$, it follows that

$$\|f(x) - f(x') - x + x'\| \le \|x - x'\| \, \sup_{0 \le t \le 1} \|f'(x + t(x' - x)) - I\|.$$

Now choose $\delta > 0$ so that $\|f'(x) - I\| < \tfrac{1}{2}$ for $\|x\| < \delta$. […]

If an invertible function $f$ is $C^k$ with $k > 1$, then so too is its inverse. This follows by induction using the fact that the map $F(A) = A^{-1}$ on operators is $C^k$ for any $k$ (in the finite-dimensional case this is an elementary fact because the inverse of a matrix is given as the adjugate matrix divided by its determinant).[1][7] The method of proof here can be found in the books of Henri Cartan, Jean Dieudonné, Serge Lang, Roger Godement and Lars Hörmander.

A proof using the contraction mapping principle

Here is a proof based on the contraction mapping theorem. Specifically, following T. Tao,[8] it uses the following consequence of the contraction mapping theorem.

Lemma. Let $B(0, r)$ denote an open ball of radius $r$ in $\mathbb{R}^n$ with center $0$. If $g : B(0, r) \to \mathbb{R}^n$ is a map such that $g(0) = 0$ and there exists a constant $0 < c < 1$ such that $|g(y) - g(x)| \le c|y - x|$ for all $x, y$ in $B(0, r)$, then $f = I + g$ is injective on $B(0, r)$ and $B(0, (1 - c)r) \subset f(B(0, r))$.

[…] $r > 0$ such that $|g(y) - g(x)| \le 2^{-1}|y - x|$ for all $x, y$ in $B(0, r)$. Then the lemma above says that $f = g + I$ is injective on $B(0, r)$ and $B(0, r/2) \subset f(B(0, r))$. Then

$$f : U = B(0, r) \cap f^{-1}(B(0, r/2)) \to V = B(0, r/2)$$

is bijective and thus has an inverse. Next, we show that the inverse $f^{-1}$ is continuously differentiable (this part of the argument is the same as in the previous proof). This time, let $g = f^{-1}$ denote the inverse of $f$ and $A = f'(x)$. For $x = g(y)$, we write $g(y + k) = x + h$ or $y + k = f(x + h)$. Now, by the earlier estimate, we have

$$|h - k| = |f(x + h) - f(x) - h| \le |h|/2$$

and so $|h|/2 \le |k|$. Writing $\|\cdot\|$ for the operator norm,

$$|g(y + k) - g(y) - A^{-1}k| = |h - A^{-1}(f(x + h) - f(x))| \le \|A^{-1}\| \, |Ah - f(x + h) + f(x)|.$$

As $k \to 0$, we have $h \to 0$ and $|h|/|k|$ is bounded. Hence, $g$ is differentiable at $y$ with the derivative $g'(y) = f'(g(y))^{-1}$. Also, $g'$ is the same as the composition $\iota \circ f' \circ g$ where $\iota : T \mapsto T^{-1}$; so $g'$ is continuous.

It remains to show the lemma. First, the map $f$ is injective on $B(0, r)$, since if $f(x) = f(y)$, then $g(y) - g(x) = x - y$ and so $|g(y) - g(x)| = |y - x|$, which is a contradiction unless $y = x$. (This part does not need the assumption $g(0) = 0$.) Next we show $f(B(0, r)) \supset B(0, (1 - c)r)$. The idea is to note that this is equivalent to, given a point $y$ in $B(0, (1 - c)r)$, finding a fixed point of the map

$$F : \overline{B}(0, r') \to \overline{B}(0, r'), \quad x \mapsto y - g(x)$$

where $0$ […]
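
The fixed-point formulation suggests a simple iteration. The Python sketch below repeatedly applies $x \mapsto y - g(x)$, starting from $0$, to solve $f(x) = y$ for $f = I + g$. The particular map `g`, the target `y`, and the iteration count are assumptions chosen for illustration. On the unit ball the derivative of this `g` has operator norm at most $1/2$, so it plays the role of a contraction with $c = 1/2$ and $r = 1$, and the chosen `y` has norm below $(1 - c)r = 1/2$.

```python
import math

def g(x):
    # Hypothetical perturbation with g(0) = 0; its Jacobian is
    # [[0, 0.25*cos(x2)], [0.5*x1, 0]], of operator norm <= 1/2 on the unit ball.
    x1, x2 = x
    return (0.25 * math.sin(x2), 0.25 * x1 * x1)

def f(x):
    # f = I + g, a perturbation of the identity by a contraction.
    gx = g(x)
    return (x[0] + gx[0], x[1] + gx[1])

def solve(y, iterations=60):
    """Iterate x -> y - g(x) from 0; a fixed point x satisfies x + g(x) = y."""
    x = (0.0, 0.0)
    for _ in range(iterations):
        gx = g(x)
        x = (y[0] - gx[0], y[1] - gx[1])
    return x

y = (0.2, -0.3)                      # |y| is about 0.36, inside B(0, 1/2)
x = solve(y)
fx = f(x)
residual = math.hypot(fx[0] - y[0], fx[1] - y[1])
print(x, residual)
```

Each step shrinks the distance to the fixed point by at least the factor $c$, so a few dozen iterations reach rounding accuracy here. The iterates stay inside the unit ball because $|x_{n+1}| \le |y| + c|x_n|$.
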
