San José State University
Department of Economics
Thayer Watkins
Silicon Valley
& Tornado Alley

The Implicit Function Theorem
and Its Proof

The Implicit Function Theorem (IFT) is a generalization of the result that

If G(x,y)=C,
where G(x,y) is a continuous function
with continuous partial derivatives
and C is a constant,
and ∂G/∂y≠0 at some point P
then y may be expressed as a function of x
in some domain about P;
i.e., there exists a function y=f(x)
over that domain such that
G(x,f(x))=C.

Note that without any loss of generality the constant C can be taken to be 0. If G*(x,y)=C then G(x,y)=G*(x,y)-C=0.
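As a concrete illustration of the statement above, consider the unit circle. This is only a sketch: the choice G*(x,y)=x²+y², C=1 and the point P=(0.6, 0.8) are assumptions made for the example and do not come from the original text.

```python
import math

def G(x, y):
    # G(x, y) = G*(x, y) - C with the assumed G*(x, y) = x**2 + y**2 and C = 1
    return x**2 + y**2 - 1.0

def dG_dy(x, y):
    # Partial derivative of G with respect to y
    return 2.0 * y

def f(x):
    # Explicit solution branch y = f(x) passing through P = (0.6, 0.8)
    return math.sqrt(1.0 - x**2)

P = (0.6, 0.8)

# G(x, f(x)) should vanish (up to rounding) for x in a neighborhood of x0 = 0.6
residuals = [abs(G(x, f(x))) for x in (0.5, 0.55, 0.6, 0.65, 0.7)]
max_residual = max(residuals)

print("dG/dy at P:", dG_dy(*P))
print("largest |G(x, f(x))| near P:", max_residual)
```

At the point (1, 0) on the same circle ∂G/∂y is zero, and no single-valued y=f(x) exists on any interval about x=1, which is why the condition ∂G/∂y≠0 cannot be dropped.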

The IFT is a very important tool in economic analysis, so the conditions under which it holds must be carefully specified. The simplest way to do this is to give a formal, explicit proof of the theorem. First a proof of an artificially limited version of the IFT is given; this provides an understanding of, and a guide to, the proof of the full version.

This means that within R the variable z can be represented as a function of x and y; i.e., z=f(x,y).

Without any loss of generality the set S can be taken to be a parallelepiped, a box, centered on P, because within the set S there will be at least one such box. Within any such box there will be another box containing P throughout which ∂F/∂z has the same sign as at P. Let the box be given as a triple (a,b,c) such that

|x-x0| ≤ a
|y-y0| ≤ b
|z-z0| ≤ c

Without any loss of generality the sign of ∂F/∂z at P can be taken to be positive.

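Since F(x0,y0,z0)=0 and ∂F/∂z > 0 throughout the box, F is strictly increasing in z along the line through P parallel to the z-axis. This means that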

F(x0,y0,z0+c) > 0
F(x0,y0,z0-c) < 0

Now consider any point (x1,y1) such that

|x1-x0| ≤ a
|y1-y0| ≤ b

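Because F(x0,y0,z0+c) > 0 and F(x,y,z) is continuous, F(x1,y1,z0+c) > 0, provided a and b are taken small enough (the box can be shrunk in the x and y directions if necessary). Likewise F(x1,y1,z0-c) < 0. With x and y held fixed at x1 and y1, G(z)=F(x1,y1,z) is a continuous function such that G(z0+c) > 0 and G(z0-c) < 0. Therefore, by the Intermediate Value Theorem, there is some z between z0-c and z0+c such that G(z)=0; i.e., F(x1,y1,z)=0. Moreover this value of z is unique, because ∂F/∂z > 0 within the box makes G(z) strictly increasing. Since this holds for any (x,y) such that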

|x-x0| ≤ a
|y-y0| ≤ b

over this domain there is a function z=f(x,y) such that F(x,y,f(x,y))=0.
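The construction in the proof can be sketched numerically: fix (x1,y1), check the sign change of G(z)=F(x1,y1,z) at the ends of the interval, and close in on the unique root by bisection. The function F(x,y,z)=z³+z-x-y, the point P=(1,1,1), and the half-height c=1 are assumptions chosen for illustration only; for this F, ∂F/∂z=3z²+1 is positive everywhere.

```python
def F(x, y, z):
    # Assumed example function; F(1, 1, 1) = 0 and dF/dz = 3*z**2 + 1 > 0
    return z**3 + z - x - y

def implicit_f(x1, y1, z0=1.0, c=1.0, tol=1e-12):
    """Return the z in [z0 - c, z0 + c] with F(x1, y1, z) = 0, found by bisection."""
    lo, hi = z0 - c, z0 + c
    # The proof needs G(z0 - c) < 0 < G(z0 + c), where G(z) = F(x1, y1, z)
    if not (F(x1, y1, lo) < 0.0 < F(x1, y1, hi)):
        raise ValueError("sign condition fails: (x1, y1) lies outside the usable box")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(x1, y1, mid) < 0.0:
            lo = mid   # root lies above mid because G is increasing
        else:
            hi = mid   # root lies at or below mid
    return 0.5 * (lo + hi)

z_at_P = implicit_f(1.0, 1.0)
z_near = implicit_f(1.2, 0.7)
print("f(1.0, 1.0) =", z_at_P)
print("f(1.2, 0.7) =", z_near, " residual:", F(1.2, 0.7, z_near))
```

Bisection mirrors the proof directly: existence comes from the sign change, and uniqueness from the fact that G is strictly increasing, so the bracketing interval can only ever contain one root.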

Corollary 1: The function f(x,y) determined above is continuous.

Corollary 2: The partial derivatives of the function f(x,y) determined above are given by:

∂f/∂x = -(∂F/∂x)/(∂F/∂z)
∂f/∂y = -(∂F/∂y)/(∂F/∂z)

The Mean Value Theorem says that for a function z=h(x) with a continuous derivative

Δz = h(x+Δx)-h(x) = h'(x + θΔx)Δx
for some θ between 0 and 1.

This can be extended to a binary function w=G(x,z) with continuous partial derivatives so that

Δw = G(x+Δx,z+Δz)-G(x,z)=[(∂G/∂x)Δx+(∂G/∂z)Δz]
where the partial derivatives
are evaluated at (x+θΔx,z+θΔz)
for some θ between 0 and 1.
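One way to see why this extension holds is to reduce it to the one-variable theorem along the line segment joining the two points. The auxiliary function φ below is introduced here only for the derivation and is not part of the original text.

```latex
\begin{align*}
\varphi(t) &= G(x + t\,\Delta x,\; z + t\,\Delta z), \qquad 0 \le t \le 1, \\
\Delta w &= \varphi(1) - \varphi(0) = \varphi'(\theta)\cdot 1
  \quad \text{for some } \theta \in (0,1), \\
\varphi'(t) &= \frac{\partial G}{\partial x}\,\Delta x
             + \frac{\partial G}{\partial z}\,\Delta z
  \quad \text{evaluated at } (x + t\,\Delta x,\; z + t\,\Delta z), \\
\Delta w &= \left[\frac{\partial G}{\partial x}\,\Delta x
             + \frac{\partial G}{\partial z}\,\Delta z\right]
  \quad \text{evaluated at } (x + \theta\,\Delta x,\; z + \theta\,\Delta z).
\end{align*}
```

The third line is the chain rule, which is where the continuity of the partial derivatives of G is used.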

This is not the only generalization of the Mean Value Theorem but it is sufficient for the purpose here. Let G(x,z) be F(x,y,z) with y held fixed. Then

ΔF = (∂F/∂x)Δx + (∂F/∂z)Δz
where the partial derivatives
are evaluated at (x+θΔx,z+θΔz)
for some θ between 0 and 1.

Now let Δz be the change in z that keeps the point on the surface F(x,y,z)=0 when x changes by Δx. Thus ΔF=0 and hence

Δz/Δx = - (∂F/∂x)/(∂F/∂z)
where the partial derivatives
are evaluated at (x+θΔx,z+θΔz)
for some θ between 0 and 1.

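In the limit as Δx goes to zero, Δz also goes to zero because f is continuous (Corollary 1), so the partial derivatives are evaluated at (x,y,z) and this becomes

∂z/∂x = ∂f/∂x = - (∂F/∂x)/(∂F/∂z).

Likewise, by an analogous procedure,

∂z/∂y = ∂f/∂y = - (∂F/∂y)/(∂F/∂z).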
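The formula of Corollary 2 can be checked numerically by comparing it with a finite-difference slope of the implicitly defined function. The sketch below assumes the illustrative function F(x,y,z)=z³+z-x-y and the point (1,1,1), neither of which comes from the original text, and it evaluates f by bisection.

```python
def F(x, y, z):
    # Assumed example function with F(1, 1, 1) = 0
    return z**3 + z - x - y

def dF_dx(x, y, z):
    return -1.0

def dF_dz(x, y, z):
    return 3.0 * z**2 + 1.0

def f(x, y, lo=0.0, hi=2.0, tol=1e-13):
    """Solve F(x, y, z) = 0 for z in [lo, hi] by bisection (F is increasing in z)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(x, y, mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x, y, step = 1.0, 1.0, 1e-5
z = f(x, y)

# Central-difference estimate of the partial derivative of f with respect to x
numeric = (f(x + step, y) - f(x - step, y)) / (2.0 * step)

# Corollary 2: df/dx = -(dF/dx)/(dF/dz), evaluated on the surface F = 0
formula = -dF_dx(x, y, z) / dF_dz(x, y, z)

print("finite difference:", numeric)
print("Corollary 2      :", formula)
```

The two values agree only up to the finite-difference and bisection errors, so the comparison is a consistency check rather than a proof.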

The proof is essentially the same as for Theorem 1, but some unnecessary restrictions in the proof can be removed. Within the set S there will be a parallelepiped containing P and specified by its lower and upper corner points (X1L,...,XnL,zL) and (X1U,...,XnU,zU) such that:

X1L < x1 < X1U
...
XnL < xn < XnU
zL < z < zU

and ∂F/∂z has the same sign as at P.

Then F(x1(P),...,xn(P),zU) and F(x1(P),...,xn(P),zL) will have opposite signs. For any point Q within the parallelepiped, ignoring the coordinate z,

F(x1(Q),...,xn(Q),zU) and F(x1(Q),...,xn(Q),zL)

will have the same signs as for P and are thus of opposite sign. For a fixed Q then G(z)=F(x1(Q),...,xn(Q),z) has opposite signs at zU and zL and therefore there is a z between these two levels such that G(z)=0. This value of z is a function of (x1(Q),...,xn(Q)); i.e., z=f(x1(Q),...,xn(Q)).

Consider F(x,y,u,v)=0 at the point P and apply Theorem 2 to get v as a function of x, y and u; i.e., find v=h(x,y,u). Substitute h(x,y,u) for v in G(x,y,u,v) to obtain H(x,y,u)=G(x,y,u,h(x,y,u)). The condition G=0 becomes H(x,y,u)=0, so Theorem 1 applies, provided ∂H/∂u≠0 at P, and thus there exists f(x,y) such that u=f(x,y). A substitution of this function for u in v=h(x,y,u) gives v=g(x,y)=h(x,y,f(x,y)).
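The successive-elimination argument can be sketched numerically with nested root finding. The system below, F(x,y,u,v)=v³+v+u-x and G(x,y,u,v)=u³+u-v-y with P=(3,1,1,1), is an assumed example and not from the original text; the names h, H, f and g follow the proof. The helper `bisect` and the bracketing interval [0, 2] are likewise choices made for this illustration.

```python
def bisect(fn, lo, hi, tol=1e-12):
    """Root of an increasing function fn on [lo, hi], assuming fn(lo) < 0 < fn(hi)."""
    if not (fn(lo) < 0.0 < fn(hi)):
        raise ValueError("no sign change on the interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if fn(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def F(x, y, u, v):
    # Assumed first equation; increasing in v
    return v**3 + v + u - x

def G(x, y, u, v):
    # Assumed second equation
    return u**3 + u - v - y

def h(x, y, u):
    # Step 1: solve F = 0 for v as a function of (x, y, u)
    return bisect(lambda v: F(x, y, u, v), 0.0, 2.0)

def H(x, y, u):
    # Step 2: substitute v = h(x, y, u) into G
    return G(x, y, u, h(x, y, u))

def f(x, y):
    # Step 3: solve H = 0 for u as a function of (x, y); H is increasing in u here
    return bisect(lambda u: H(x, y, u), 0.0, 2.0)

def g(x, y):
    # Step 4: back-substitute u = f(x, y) to get v as a function of (x, y)
    return h(x, y, f(x, y))

u1, v1 = f(3.1, 0.9), g(3.1, 0.9)
print("u =", u1, " v =", v1)
print("residuals:", F(3.1, 0.9, u1, v1), G(3.1, 0.9, u1, v1))
```

In this example H is increasing in u because h is decreasing in u, so the sign-change argument of Theorem 1 carries over to the second step without modification.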

Proof: The procedure is the same as in the proof of Theorem 3. Theorem 2 is applied to one of the F-functions to obtain one of the u-variables as a function of the x-variables and the other u-variables. This expression for the u-variable as a function of the other variables is substituted into a second F-function, and another u-variable is obtained as a function of the remaining variables. This continues until one u-variable is found as a function of only the x-variables. This u-variable is then substituted into the preceding expression for a u-variable, and the back-substitution is continued until all of the u-variables are obtained as functions of the x-variables only.
