applet-magic.com Thayer Watkins Silicon Valley & Tornado Alley USA |
---|
Constrained and Unconstrained |
The purpose of this reading is the derivation of the second order conditions for optimization so that the reader will not only understand what those second order conditions are but also why those are the proper conditions. For the unconstrained case the conditions are stated in terms of the matrix of second derivatives called the Hessian matrix. the Hessian matrix is intuitively understandable. the conditions for the constrained case can be easily stated in terms of a matrix called the bordered Hessian. Generation after generation of applied mathematics students have accepted the bordered Hessian without a clue as to why it is the relevant entity.
In order to fulfill the goal of providing an intuitive derivation of the second order conditions the one, two and three variable cases will be given first before moving to the general n variable case.
Although this case is so simple it seems almost trivial its analysis sets the stage for the more complex cases.
Let f(x) be a differentiable function. the quadratic approximation to f(x) near some point x_{0} is given by
where dx=(x-x_{0}) and df=f(x)-f(x_{0}).
A critical point is a value of x_{0} such that f_{x}(x_{0})=0. If f_{x}(x_{0})=0 then
For a minimum df has to be positive and since (dx)^{2} is always positive it means that f_{xx}(x_{0}) must be positive. On the other hand for a maximum df has to be negative and that requires that f_{xx}(x_{0}) be negative. The Hessian matrix for this case is just the 1×1 matrix [f_{xx}(x_{0})]. (Hereafter the point at which the second derivatives are evaluated will not be expressed explicitly so the Hessian matrix for this case would be said to be [f_{xx}].
There is no corresponding constrained optimization problems for this one variable case.
The quadratic approximation about the point (x_{0}, y_{0}) for a differentiable function f(x,y) is
df = f_{x}dx + f_{y}dy + ½(dx, dy)H(dx, dy)^{T}
where (dx, dy)^{T} is the column vector which is the transpose of the row vector (dx, dy) and
| f_{xx} f_{xy} | | ||
H | = | | | |
| f_{yx} f_{yy} | |
For the unconstrained case a critical point is one such that f_{x}=0 and f_{y}=0 so
For a minimum the second order condition is that H be a positive definite matrix. The conditon for a matrix to be positive definite is that its principal minors all be positive. For a maximum, H must be a negative definite matrix which will be the case if the pincipal minors alternate in sign.
For the constrained case a critical point is defined in terms of the Lagrangian multiplier method. Suppose the constraint is
where λ, the Lagrangian multiplier, is chosen so as to have the critical values satisfy the constraint.
The quadratic approximation can now be expressed as
Since any deviations from the critical point must also satisfy the constraint
therefore
but now H does not have to be either positive or negative definite for an extreme (maximum or minimum) because dx and dy are not unrestricted; i.e., only dx and dy such that p_{x}dx + p_{y}dy = 0 are allowed. This makes specifying the conditions on H very difficult. The further analysis of the constrained case will be postponed until after the consideration of the unconstrained case for three variables.
For a trinary function f(x,y,z) the quadratic approximation of the deviations is
where now H is given by
| f_{xx} f_{xy} f_{xz} | | ||
H | = | | f_{yx} f_{yy} f_{yz} | |
| f_{zx} f_{zy} f_{zz} | |
As in the one and two variable unconstrained case the first order terms vanish and the conditions for a minimum is the positive definiteness of H and similarly negative definiteness for the maximum. Those conditions in turn can be stated in terms of the signs of the principal minors of H. There is nothing new here for this case.
The significance of this case is that the constrained two variable case can be restated in terms of a three variable case through the use of the Lagrangian multiplier method.
In the Lagrangian multiplier method the optimization problem of minimizing f(x,y) with respect to x and y subject to the constraint p_{x}x + p_{y}y = I is transformed into a three variable problem of unconstrained minimizing
with respect to x, y and λ.
The first order condition for λ is
The quadratic approximation for L(x,y,λ) is then
Because of the satisfication of the constraints the first two terms reduce to zero and likewise the third term. Thus the value of dL is given by
The second order conditons for the constrained two variable case then can be stated in terms of the Hessian H* for the corrsponding three variable case. The character of H* is given by first noting that
And thus the second derivatives are given as:
| f_{xx} f_{xy} -p_{x} | | ||
H* | = | | f_{yx} f_{yy} -p_{y} | |
| -p_{x} -p_{y} 0 | |
So this is the enigmatic bordered Hessian. The positive or negative definiteness of H* then constitutes the second order conditions for the constrained optimization problem.
HOME PAGE OF Thayer Watkins |