Chapter 3.
Optimization
Introduction
• The primary uses of mathematics in the study of production and price analysis are twofold:
(i) to find extreme values of functions, e.g., maximum values of certain functions (utility, profit, etc.) and minimum values of certain functions (costs, expenditures, etc.);
(ii) to study under which conditions economic optima (maxima and minima) hold.
Introduction
• Example of (ii):
   MU_zi / MU_zj = p_i / p_j    (consumer equilibrium for utility maximization)
   MPP_xi / MPP_xj = r_i / r_j    (producer equilibrium for profit maximization)
Optimization
There are two general types of
optimization problems:
- unconstrained optimization
- constrained optimization
3.1. Unconstrained optimization
(i) Simplest case: single argument
functions
with one explanatory variable
(ii) General case: multiple argument
functions
with n explanatory variables
3.1.1. Unconstrained optimization of
single argument
• Suppose you had these two single
argument functions:
3.1.1. Unconstrained optimization of
single argument
• What do we observe from these 2 graphs?
(i) the peaks and troughs occur where
the slope of the function is zero
(where critical points occur)
(ii) the slopes are positive and negative
to the left and right of the critical
points
3.1.1. Unconstrained optimization of
single argument
• By using derivatives, we can solve for critical
values and determine if these critical values
are relative maxima or relative minima.
3.1.1. Unconstrained optimization of
single argument
• First Derivative Test:
What is the function doing around the critical value?
If the slope changes from positive to negative as we move through the critical value, it is a relative max; if it changes from negative to positive, it is a relative min.
3.1.1. Unconstrained optimization of
single argument
• Second Derivative Test:
Another test to determine whether critical
values are relative maximum(s)/minimum(s)
(i) Relative maximum: the second derivative is negative at the critical value (the function is concave down there).
(ii) Relative minimum: the second derivative is positive at the critical value (the function is concave up there).
3.1.1. Unconstrained optimization of
single argument
• For the previous example:
Critical value is a relative max
Critical value is a relative min
3.1.1. Unconstrained optimization of
single argument
• Procedure for optimizing a single argument function y = f(x):
(1) Given y = f(x), find f′(x).
(2) Set f′(x) = 0 and solve for the critical value(s).
(3) Use either the First Derivative Test or the Second Derivative Test to verify whether the critical value(s) are relative maxima, relative minima, or neither.
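The three-step procedure can be sketched in a few lines of sympy. The cubic below is only a hypothetical stand-in (the slide's own example function did not carry over); the steps are the same for any differentiable f(x).

  # Steps (1)-(3) with sympy on a hypothetical function f(x) = x**3 - 12*x
  import sympy as sp

  x = sp.symbols('x')
  f = x**3 - 12*x

  f1 = sp.diff(f, x)               # step (1): first derivative
  critical = sp.solve(f1, x)       # step (2): solve f'(x) = 0  ->  [-2, 2]
  f2 = sp.diff(f, x, 2)            # step (3): second derivative test

  for c in critical:
      kind = 'relative max' if f2.subs(x, c) < 0 else 'relative min'
      print(c, kind)               # -2 -> relative max, 2 -> relative min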
3.1.1. Unconstrained optimization of
single argument
• Example:
For a given function y = f(x), setting f′(x) = 0 yields the critical value.
Using the second derivative test: f″ > 0 at that value, so it is a relative min.
3.1.1. Unconstrained optimization of
single argument
• Another example: here setting f′(x) = 0 yields two critical values.
3.1.1. Unconstrained optimization of
single argument
Second Derivative Test:
the critical value at which f″ < 0 is a relative max;
the critical value at which f″ > 0 is a relative min.
3.1.2. Unconstrained optimization of multiple
arguments
• Let’s take an example:
Take partial derivatives and set equal to zero:
3.1.2. Unconstrained optimization of multiple
arguments
• Solving these two equations simultaneously:
3.1.2. Unconstrained optimization of multiple
arguments
Distributing -10
Combining like terms
Subtracting −28 from both sides
Dividing both sides by -28
3.1.2. Unconstrained optimization of multiple
arguments
Solving the first-order conditions simultaneously gives the critical values.
However, we don't know yet whether these critical values represent a relative max, a relative min, or neither.
3.1.2. Unconstrained optimization of multiple
arguments
Second order direct partials:
3.1.2. Unconstrained optimization of multiple
arguments
Second order cross partials: f12 = f21, which shows that the symmetry (Young's theorem) condition holds.
3.1.2. Unconstrained optimization of multiple
arguments
Using the criteria for optimization of single argument functions, we are tempted to conclude that if f11 < 0 and f22 < 0, then the critical values represent a relative max.
Unfortunately, the second order conditions for multiple argument functions are not that simple.
The signs of the second order direct partials only ensure an extremum in the x1 direction or in the x2 direction taken separately, not in every direction through the critical point (e.g., along a diagonal in the x1–x2 plane).
Saddle Point
• If the second order conditions rested solely on the signs of the second order direct partials, you could get cases such as the saddle point.
• See the example of a saddle point:
– the surface has a minimum along one axis direction and a maximum along the other, so the critical point is neither a relative max nor a relative min.
Saddle Point
• The point being: the second order condition for multiple argument functions is not so simple.
• For this case, we have to set up and then evaluate a Hessian determinant, |H|, formed from the second order direct and cross partial derivatives.
3.1.2. Unconstrained optimization of multiple
arguments
• So the cookbook procedure for optimizing multiple argument functions is:
(i) Take first order partial derivatives
(ii) Set first order partial derivatives equal to zero
and solve simultaneously for critical values
(iii) Take second order direct and cross
partial derivatives
(iv) Evaluate the Hessian determinant
Second Order or Sufficient Condition
• Rules for second order or sufficient
condition for multiple argument
functions:
– Let y = f(x1, x2, …, xn).
– Form the Hessian determinant |H| consisting of the second order direct and cross partials:
        | f11  f12  …  f1n |
  |H| = | f21  f22  …  f2n |
        | …                |
        | fn1  fn2  …  fnn |
Second Order or Sufficient Condition
• The first principal minor, |H1|, is obtained by deleting all rows and columns except the first row and first column.
– So, the first principal minor is |H1| = f11.
Second Order or Sufficient Condition
The second principal minor, |H2|, is formed from the first and second rows and columns, deleting all other rows and columns.
So,
          | f11  f12 |
  |H2| =  | f21  f22 |
Second Order or Sufficient Condition
The third principal minor, |H3|, is the determinant of the upper-left 3×3 block of |H|.
You can use the Laplace expansion procedure or the criss-cross method shown by Chiang to solve for |H3|.
Second Order or Sufficient Condition
Fourth, fifth, etc. principal minors follow this same pattern.
The second order or sufficient condition for a relative max is:
|H1| < 0, |H2| > 0, |H3| < 0, …   (alternating signs)
The second order or sufficient condition for a relative min is:
|H1| > 0, |H2| > 0, |H3| > 0, …   (all positive)
Second Order or Sufficient Condition
Recall the example:
Earlier we found the critical values from the first-order conditions.
Are these critical values a relative max or a relative min?
Second Order or Sufficient Condition
Use the second order or sufficient condition: the principal minors alternate in sign (|H1| < 0, |H2| > 0), so the critical values represent a relative max.
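The same cookbook steps can be sketched with sympy. Because the objective function used on these slides did not survive conversion, the quadratic below is a hypothetical stand-in chosen so that its critical point is a relative max.

  # (i)-(iv): first-order partials, critical values, Hessian, principal minors
  import sympy as sp

  x1, x2 = sp.symbols('x1 x2')
  f = -2*x1**2 - x2**2 + 2*x1*x2 + 4*x1       # hypothetical objective

  foc = [sp.diff(f, v) for v in (x1, x2)]     # (i)-(ii) set partials to zero
  crit = sp.solve(foc, [x1, x2], dict=True)[0]

  H = sp.hessian(f, (x1, x2))                 # (iii)-(iv) Hessian and minors
  H1 = H[0, 0]
  H2 = H.det()
  print(crit)          # {x1: 2, x2: 2}
  print(H1, H2)        # -4, 4  ->  |H1| < 0, |H2| > 0  =>  relative max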
3.2. Constrained Optimization
Optimization with Equality Constraints
• Now, a constraint is added to the optimization
problem:
max over x, y of u(x, y)   s.t.   x·px + y·py = I,
where px, py are exogenous prices and I is exogenous income.
• Different methods to solve this problem:
– Substitution
– Total differential approach
– Lagrange Multiplier
Substitution Approach
• Easy to use in simple 2x2 systems. Using the constraint,
substitute into objective function and optimize as usual.
• Example:
U = U(x1, x2) = x1·x2 + 2·x1   s.t.   60 = 4·x1 + 2·x2
1) Solve for x2:   x2 = 30 − 2·x1
2) Substitute into U(x1, x2):   U = x1·(30 − 2·x1) + 2·x1 = 32·x1 − 2·x1²
3) F.o.c.:   dU/dx1 = 32 − 4·x1 = 0;   x1* = 8;   and x2* = 14
   Check s.o.c.:   d²U/dx1² = −4 < 0  →  maximum
4) Calculate the maximum value of U(·):   U* = 8·14 + 2·8 = 128
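A quick sympy check of the substitution steps above, using the same utility function and budget constraint (a verification sketch, not part of the original slides):

  import sympy as sp

  x1, x2 = sp.symbols('x1 x2')
  U = x1*x2 + 2*x1

  # 1) solve the constraint for x2, 2) substitute into U
  x2_sub = sp.solve(sp.Eq(60, 4*x1 + 2*x2), x2)[0]     # 30 - 2*x1
  U_sub = sp.expand(U.subs(x2, x2_sub))                # 32*x1 - 2*x1**2

  # 3) f.o.c. and s.o.c.
  x1_star = sp.solve(sp.diff(U_sub, x1), x1)[0]        # 8
  soc = sp.diff(U_sub, x1, 2)                          # -4 < 0 -> maximum
  x2_star = x2_sub.subs(x1, x1_star)                   # 14

  # 4) maximum value of U
  print(x1_star, x2_star, soc, U_sub.subs(x1, x1_star))   # 8 14 -4 128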
Total Differential Approach
• Total differentiation of the objective function and the constraint:
(1)–(2)   U = f(x, y)   s.t.   B = g(x, y)
(3)–(4)   dU = fx·dx + fy·dy = 0;   dB = gx·dx + gy·dy = 0
(5)–(6)   rearranging dU = 0 and dB = 0,
(7)–(8)   dx/dy = −fy/fx;   dx/dy = −gy/gx
(9)–(10)  fy/fx = gy/gx;   fy/gy = fx/gx
• Equation (10), along with the restriction (2), forms the basis for solving this optimization problem.
Total Differential Approach
• Example: U = x1·x2 + 2·x1   s.t.   60 = 4·x1 + 2·x2
Taking first-order differentials of U and of the budget constraint (B):
dU = x2·dx1 + x1·dx2 + 2·dx1 = 0;   dB = 4·dx1 + 2·dx2 = 0
dx1/dx2 = −x1/(x2 + 2);   dx1/dx2 = −1/2
x1/(x2 + 2) = 1/2   →   x1 = (x2 + 2)/2
60 = 4·(x2 + 2)/2 + 2·x2 = 4·x2 + 4   →   x2* = 14;   x1* = 8
U*(8, 14) = 8·14 + 2·8 = 128
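The tangency condition can also be imposed directly in sympy; this short sketch reproduces the same solution from the ratio of the differentials and the budget constraint:

  import sympy as sp

  x1, x2 = sp.symbols('x1 x2')
  U = x1*x2 + 2*x1
  B = 60 - 4*x1 - 2*x2

  # dx1/dx2 along dU = 0 is -U_x2/U_x1; along dB = 0 it is -2/4 = -1/2
  tangency = sp.Eq(sp.diff(U, x2)/sp.diff(U, x1), sp.Rational(1, 2))
  sol = sp.solve([tangency, sp.Eq(B, 0)], [x1, x2], dict=True)[0]
  print(sol)            # {x1: 8, x2: 14}
  print(U.subs(sol))    # 128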
Lagrange-multiplier Approach
• To avoid working with (possibly) zero denominators,
let λ denote the common value in (10). Rewriting
(10) and adding the budget constraint we are left
with a 3x3 system of equations:
(10′)   fx = λ·gx
(10″)   fy = λ·gy
(2)     B = g(x, y)
Lagrange-multiplier Approach
• There is a convenient function that produces
(10’), (10’’) and (2) as a set of f.o.c.: The
Lagrangian function, which includes the
objective function and the constraint:
L = f(x1, x2) + λ·[B − g(x1, x2)]
• The constraint is multiplied by a variable, λ,
called the Lagrange multiplier (LM).
Lagrange-multiplier Approach
• Once we form the Lagrangian function, the Lagrange
function becomes the new objective function.
(1)  L = f(x1, x2) + λ·[B − g(x1, x2)]
(2)  Lλ = B − g(x1, x2) = 0
(3)  Lx1 = fx1 − λ·gx1 = 0
(4)  Lx2 = fx2 − λ·gx2 = 0
(5)  Matrix of second-order partials of L:
         | Lλλ    Lλx1    Lλx2  |
         | Lx1λ   Lx1x1   Lx1x2 |
         | Lx2λ   Lx2x1   Lx2x2 |
LM Approach
• Note that
  Lλλ = 0,   Lλx1 = −gx1,   Lλx2 = −gx2
• Then
           | 0     gx1     gx2   |
(5) |H̄| = | gx1   Lx1x1   Lx1x2 |
           | gx2   Lx2x1   Lx2x2 |
• If the constraints are linear, the Hessian of the
Lagrangian can be seen as the Hessian of the original
objective function, bordered by the first derivatives of
the constraints. This new Hessian is called a bordered Hessian.
LM Approach: Example
Maximize utility U = U(x, y), where Ux, Uy > 0,
subject to the budget constraint B = x·Px + y·Py.
Z = U(x, y) + λ·(B − x·Px − y·Py)
F.o.c.:
Zλ = B − x·Px − y·Py = 0
Zx = Ux − λ·Px = 0   →   λ = Ux/Px
Zy = Uy − λ·Py = 0   →   λ = Uy/Py   →   Ux/Px = Uy/Py
Bordered Hessian:
        | Zλλ  Zλx  Zλy |   | 0     −Px   −Py |
|H̄| =  | Zxλ  Zxx  Zxy | = | −Px   Uxx   Uxy |
        | Zyλ  Zyx  Zyy |   | −Py   Uyx   Uyy |
     = 2·Px·Py·Uxy − Py²·Uxx − Px²·Uyy
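As a check, this bordered-Hessian expansion can be reproduced symbolically; the sketch below builds the 3×3 determinant for a generic U(x, y):

  import sympy as sp

  x, y, Px, Py = sp.symbols('x y P_x P_y')
  U = sp.Function('U')(x, y)

  # border = first derivatives of the constraint; block = second partials of U
  Hbar = sp.Matrix([
      [0,   -Px,              -Py],
      [-Px, sp.diff(U, x, x), sp.diff(U, x, y)],
      [-Py, sp.diff(U, y, x), sp.diff(U, y, y)],
  ])
  print(sp.simplify(Hbar.det()))
  # -> 2*Px*Py*Uxy - Py**2*Uxx - Px**2*Uyy (written with sympy Derivative objects)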
LM Approach: Interpretation
(1)  L = L(λ, x, y; B, Px, Py)
(2)  L = U(x, y) + λ·(B − x·Px − y·Py)
(3)  ∂L/∂x = Ux − λ·Px = 0   →   λ = Ux/Px
(4)  ∂L/∂y = Uy − λ·Py = 0   →   λ = Uy/Py
(5)  ∂L/∂λ = B − x·Px − y·Py = 0
From the f.o.c.:
λ = Ux/Px = Uy/Py = ∂U*/∂B   and   Px/Py = Ux/Uy
Note: from the f.o.c., at the optimal point:
– λ equals the marginal utility of each good scaled (divided) by its price, i.e., the marginal utility of income;
– these scaled marginal utilities are equal across goods.
LM Approach: Interpretation
• Now totally differentiate U(x, y); we get similar results.
U = U(x, y)
dU = Ux·dx + Uy·dy = 0   →   dy/dx = −Ux/Uy   (marginal rate of substitution, MRS)
Budget constraint and budget line:
B = x·Px + y·Py   →   y = B/Py − (Px/Py)·x   →   dy/dx = −Px/Py
At the optimum:   Ux/Uy = Px/Py
Note: again, at the optimal point the ratio of marginal utilities (the MRS) is equal to the ratio of prices.
LM Approach: Second-order conditions
• λ has no effect on the value of L* because the constraint term equals zero, but …
• a new set of second-order conditions is needed:
• the constraint changes the criterion for a relative max. or min.
(1)  H = a·x² + 2·h·x·y + b·y²   s.t.   α·x + β·y = 0
(2)  y = −(α/β)·x   (solve the constraint for y)
(3)  H = a·x² − 2·h·x·(α/β)·x + b·(α/β)²·x²
(4)  H = (a·β² − 2·α·β·h + b·α²)·x²/β²
(5)  H > 0   iff   a·β² − 2·α·β·h + b·α² > 0
LM Approach: Second-order conditions
(1)  H > 0 subject to α·x + β·y = 0   iff   a·β² − 2·α·β·h + b·α² > 0
            | 0  α  β |
(2)  |H̄| = | α  a  h | = −(a·β² − 2·α·β·h + b·α²)
            | β  h  b |
(3)  H is positive definite s.t. α·x + β·y = 0
            | 0  α  β |
     iff    | α  a  h | < 0   →   min
            | β  h  b |
LM Approach: Example
(1–2)  U = x1·x2 + 2·x1   s.t.   B = 60 − 4·x1 − 2·x2 = 0
Form the Lagrangian function:
(3)  L = x1·x2 + 2·x1 + λ·(60 − 4·x1 − 2·x2)
F.o.c.:
(4)  Lλ = 60 − 4·x1 − 2·x2 = 0
(5–6)  Lx1 = x2 + 2 − 4·λ = 0;   λ = (1/4)·x2 + 1/2
(7–8)  Lx2 = x1 − 2·λ = 0;   λ = (1/2)·x1
(9–10)  (1/4)·x2 + 1/2 = (1/2)·x1;   x2 = 2·x1 − 2
(11–12)  60 = 4·x1 + 2·(2·x1 − 2);   x1* = 8
(13–14)  60 = 4·(8) + 2·x2;   x2* = 14
(15–17)  U = 8·(14) + 2·(8);   U* = 128;   λ* = 4
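The same first-order conditions can be handed to sympy directly; a minimal sketch:

  import sympy as sp

  x1, x2, lam = sp.symbols('x1 x2 lam')
  L = x1*x2 + 2*x1 + lam*(60 - 4*x1 - 2*x2)      # the Lagrangian in (3)

  foc = [sp.diff(L, v) for v in (lam, x1, x2)]   # conditions (4)-(8)
  sol = sp.solve(foc, [lam, x1, x2], dict=True)[0]
  print(sol)                                     # {lam: 4, x1: 8, x2: 14}
  print((x1*x2 + 2*x1).subs(sol))                # U* = 128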
LM Approach: Example - Cramer’s rule
1)  U = x1·x2 + 2·x1                                Utility function
2)  B = 60 − 4·x1 − 2·x2 = 0                        Budget constraint
3)  L = x1·x2 + 2·x1 + λ·(60 − 4·x1 − 2·x2)         Lagrangian function
1st order conditions:
4)  Lλ = 60 − 4·x1 − 2·x2 = 0
5)  Lx1 = x2 + 2 − 4·λ = 0
6)  Lx2 = x1 − 2·λ = 0
7)  In matrix form:
    | 0   −4  −2 | | λ  |   | −60 |
    | −4   0   1 | | x1 | = |  −2 |
    | −2   1   0 | | x2 |   |   0 |
8)  |H̄| = 16 > 0;  d²L is negative definite (subject to the constraint), so L* is a maximum
LM Approach: Example - Cramer’s rule
             | 0   −4  −2 |
9)   |J|   = | −4   0   1 | = 8 + 8 = 16
             | −2   1   0 |
             | −60  −4  −2 |
10)  |Jλ|  = | −2    0   1 | = 60 + 4 = 64
             |  0    1   0 |
             | 0   −60  −2 |
11)  |Jx1| = | −4   −2   1 | = 120 + 8 = 128
             | −2    0   0 |
             | 0   −4  −60 |
12)  |Jx2| = | −4   0   −2 | = −16 + 240 = 224
             | −2   1    0 |
13–15)  λ* = 64/16 = 4;   x1* = 128/16 = 8;   x2* = 224/16 = 14
16)  U* = x1*·x2* + 2·x1* = 8·(14) + 2·(8) = 128
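The Cramer's-rule computation can be verified numerically with numpy; the sketch below replaces one column of the coefficient matrix at a time, exactly as in steps (9)-(15):

  import numpy as np

  J = np.array([[ 0., -4., -2.],     # rows: L_lambda, L_x1, L_x2
                [-4.,  0.,  1.],
                [-2.,  1.,  0.]])
  d = np.array([-60., -2., 0.])      # constants moved to the right-hand side

  detJ = np.linalg.det(J)            # about 16
  sol = []
  for i in range(3):
      Ji = J.copy()
      Ji[:, i] = d                   # replace column i with the constants
      sol.append(np.linalg.det(Ji) / detJ)
  print(round(detJ, 6), [round(s, 6) for s in sol])   # 16.0 [4.0, 8.0, 14.0]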
LM Approach: n-variable case
a) There is no one-variable test, because there must be at least one more variable than constraints.
b) Two-variable test of the s.o.c.:
   |H̄2| > 0   →   negative definite   →   max
   |H̄2| < 0   →   positive definite   →   min
c) Three-variable test of the s.o.c.:
   |H̄2| > 0, |H̄3| < 0   →   negative definite   →   max
   |H̄2| < 0, |H̄3| < 0   →   positive definite   →   min
d) n-variable case s.o.c. (p. 361):
   |H̄2| > 0, |H̄3| < 0, |H̄4| > 0, …, (−1)ⁿ·|H̄n| > 0   →   negative definite   →   max
   |H̄2| < 0, |H̄3| < 0, |H̄4| < 0, …, |H̄n| < 0          →   positive definite   →   min
Where
            | 0   g1   g2  |               | 0   g1   g2   g3  |
   |H̄2| =  | g1  Z11  Z12 | ;    |H̄3| =   | g1  Z11  Z12  Z13 | ;   …
            | g2  Z21  Z22 |               | g2  Z21  Z22  Z23 |
                                           | g3  Z31  Z32  Z33 |
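A small numerical helper can apply this sign test mechanically. The function below is only a sketch for the one-constraint case, with the bordered principal minors indexed as on this slide; the example at the end reuses the bordered Hessian from the utility problem above.

  import numpy as np

  def classify_bordered(Hbar):
      """Hbar is the (n+1)x(n+1) bordered Hessian [[0, g'], [g, Z]]."""
      n = Hbar.shape[0] - 1
      minors = {k: np.linalg.det(Hbar[:k + 1, :k + 1]) for k in range(2, n + 1)}
      if all((-1) ** k * d > 0 for k, d in minors.items()):
          return 'negative definite s.t. the constraint -> relative max'
      if all(d < 0 for d in minors.values()):
          return 'positive definite s.t. the constraint -> relative min'
      return 'test inconclusive'

  Hbar = np.array([[ 0., -4., -2.],
                   [-4.,  0.,  1.],
                   [-2.,  1.,  0.]])
  print(classify_bordered(Hbar))     # |H2| = 16 > 0 -> relative max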
Optimality Conditions – Unconstrained Case
• Let x* be the point that we think is the minimum of f(x).
• Necessary condition (for optimality):
  ∇f(x*) = 0
• A point that satisfies the necessary condition is a stationary point.
  – It can be a minimum, a maximum, or a saddle point.
• Q: How do we know that we have a minimum?
• Answer: sufficiency condition:
  The sufficient conditions for x* to be a strict local minimum are:
  ∇f(x*) = 0
  ∇²f(x*) (the Hessian) is positive definite
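Numerically, the two conditions can be checked at a candidate point by evaluating the gradient and the eigenvalues of the Hessian. The function and example below are a sketch using a hypothetical f(x, y) = (x − 1)² + 2y², which has its minimum at (1, 0).

  import numpy as np

  def is_strict_local_min(grad, hess, x_star, tol=1e-8):
      g = np.asarray(grad(x_star), dtype=float)
      H = np.asarray(hess(x_star), dtype=float)
      stationary = np.linalg.norm(g) < tol             # necessary: grad f(x*) = 0
      pos_def = np.all(np.linalg.eigvalsh(H) > 0)      # sufficient: Hessian pos. def.
      return stationary and pos_def

  grad = lambda v: [2*(v[0] - 1), 4*v[1]]
  hess = lambda v: [[2, 0], [0, 4]]
  print(is_strict_local_min(grad, hess, np.array([1.0, 0.0])))   # True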
Constrained Case – KKT Conditions
• To prove a claim of optimality in constrained minimization (or maximization), we have to check the candidate point x* against the Karush-Kuhn-Tucker (KKT) conditions.
• Kuhn and Tucker extended the Lagrangian theory to include the general classical single-objective nonlinear programming (NLP) problem:
  Minimize f(x)
  subject to  gj(x) ≤ 0   for j = 1, 2, ..., J
              hk(x) = 0   for k = 1, 2, ..., K
              x = (x1, x2, ..., xN)
Interior versus Exterior Solutions
• Interior: if no constraints are active and (thus) the solution lies in the interior of the feasible space, then the necessary condition for optimality is the same as in the unconstrained case:
  ∇f(x*) = 0   (∇ is the gradient operator, the vector of first partial derivatives)
• Exterior: if the solution lies on the boundary of the feasible space, then the condition ∇f(x*) = 0 does not apply, because some constraints will block movement toward this minimum.
  – Some constraints will (thus) be active.
• We cannot get any more improvement (in this case) if for x* there does not exist a vector d that is both a descent direction and a feasible direction.
  – In other words: the possible feasible directions do not intersect the possible descent directions at all.
Mathematical Form
• A vector d that is both descending and feasible cannot exist if
  −∇f = Σ μi·∇gi   (with μi ≥ 0)   for all active constraints i ∈ I.
  – This can be rewritten as 0 = ∇f + Σ μi·∇gi.
  – This condition is correct IF feasibility is defined as g(x) ≤ 0.
  – If feasibility is defined as g(x) ≥ 0, then this becomes −∇f = Σ μi·(−∇gi).
• Again, this only applies to the active constraints.
Mathematical Form
• Usually the inactive constraints are included as well, but the condition μj·gj = 0 (with μj ≥ 0) is added for all inactive constraints j ∈ J.
  – This is referred to as the complementary slackness condition.
  – Note that this condition is equivalent to stating that μj = 0 for inactive constraints, i.e., a zero price for non-binding constraints!
• Note that I + J = m, the total number of (inequality) constraints.
Necessary KKT Conditions
• For the problem:
  Min f(x)
  s.t.  g(x) ≤ 0
  (n variables, m constraints)
• The necessary conditions are:
  ∇f(x) + Σ μi·∇gi(x) = 0                       (optimality)
  gi(x) ≤ 0        for i = 1, 2, ..., m          (feasibility)
  μi·gi(x) = 0     for i = 1, 2, ..., m          (complementary slackness)
  μi ≥ 0           for i = 1, 2, ..., m          (non-negativity)
• Note that the first condition gives n equations.
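These four conditions can be checked mechanically at a candidate point. The function below is a sketch for min f(x) s.t. g(x) ≤ 0, given user-supplied gradients and multipliers; the tiny test problem at the end (min x² s.t. 1 − x ≤ 0, with x* = 1 and μ* = 2) is only an illustration.

  import numpy as np

  def kkt_satisfied(grad_f, g, grad_g, x, mu, tol=1e-6):
      x, mu = np.asarray(x, float), np.asarray(mu, float)
      gx = np.array([gi(x) for gi in g])
      stat = np.asarray(grad_f(x), float)
      for m, gg in zip(mu, grad_g):
          stat = stat + m * np.asarray(gg(x), float)
      return (np.linalg.norm(stat) < tol and        # optimality
              np.all(gx <= tol) and                 # feasibility
              np.all(np.abs(mu * gx) < tol) and     # complementary slackness
              np.all(mu >= -tol))                   # non-negativity

  print(kkt_satisfied(lambda x: [2*x[0]],           # grad f
                      [lambda x: 1 - x[0]],         # g(x) <= 0
                      [lambda x: [-1.0]],           # grad g
                      [1.0], [2.0]))                # candidate x*, mu*  -> True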
Necessary KKT Conditions: General Case
• For the general case (n variables, M inequalities, L equalities):
  Min f(x)   s.t.
  gi(x) ≤ 0   for i = 1, 2, ..., M
  hj(x) = 0   for j = 1, 2, ..., L
• In all this, the assumption is that the gradients ∇gi(x*) of the active inequality constraints and ∇hj(x*) for j = 1, ..., L are linearly independent.
  – This is referred to as constraint qualification.
Necessary KKT Conditions: General Case
• The necessary conditions are:
  ∇f(x) + Σ μi·∇gi(x) + Σ λj·∇hj(x) = 0        (optimality)
  gi(x) ≤ 0      for i = 1, 2, ..., M            (feasibility)
  hj(x) = 0      for j = 1, 2, ..., L            (feasibility)
  μi·gi(x) = 0   for i = 1, 2, ..., M            (complementary slackness)
  μi ≥ 0         for i = 1, 2, ..., M            (non-negativity)
  (Note: λj is unrestricted in sign)
Necessary KKT Conditions (if g(x) ≥ 0)
• If the definition of feasibility changes, the optimality and feasibility conditions change.
• The necessary conditions become:
  ∇f(x) − Σ μi·∇gi(x) + Σ λj·∇hj(x) = 0        (optimality)
  gi(x) ≥ 0      for i = 1, 2, ..., M            (feasibility)
  hj(x) = 0      for j = 1, 2, ..., L            (feasibility)
  μi·gi(x) = 0   for i = 1, 2, ..., M            (complementary slackness)
  μi ≥ 0         for i = 1, 2, ..., M            (non-negativity)
Restating the Optimization Problem
• Kuhn-Tucker optimization problem: find vectors x (N×1), μ (1×M) and λ (1×K) that satisfy:
  ∇f(x) + Σ μi·∇gi(x) + Σ λj·∇hj(x) = 0        (optimality)
  gi(x) ≤ 0      for i = 1, 2, ..., M            (feasibility)
  hj(x) = 0      for j = 1, 2, ..., L            (feasibility)
  μi·gi(x) = 0   for i = 1, 2, ..., M            (complementary slackness condition)
  μi ≥ 0         for i = 1, 2, ..., M            (non-negativity)
• If x* is an optimal solution to the NLP, then there exists a (μ*, λ*) such that (x*, μ*, λ*) solves the Kuhn–Tucker problem.
• The above equations not only give the necessary conditions for optimality, but also provide a way of finding the optimal point.
KKT Conditions: Limitations
• Necessity theorem helps identify points that
are not optimal. A point is not optimal if it
does not satisfy the Kuhn–Tucker conditions.
• On the other hand, not all points that satisfy
the Kuhn-Tucker conditions are optimal points.
• The Kuhn–Tucker sufficiency theorem gives
conditions under which a point becomes an
optimal solution to a single-objective NLP.
KKT Conditions: Sufficiency Condition
• Sufficient conditions for a point x* to be a strict local minimum of the classical single-objective NLP problem, where f, gj, and hk are twice-differentiable functions, are that:
  1) The necessary KKT conditions are met.
  2) The Hessian matrix of the Lagrangian, ∇²L(x*) = ∇²f(x*) + Σ μi·∇²gi(x*) + Σ λj·∇²hj(x*), is positive definite on a subspace of Rⁿ, as defined by the condition:
     yᵀ·∇²L(x*)·y > 0 for every vector y (1×N) satisfying
     ∇gj(x*)·y = 0   for j belonging to I1 = { j | gj(x*) = 0, μj* > 0 }   (active constraints)
     ∇hk(x*)·y = 0   for k = 1, ..., K
     y ≠ 0
KKT Sufficiency Theorem (Special Case)
• Consider the classical single-objective NLP problem:
  minimize f(x)
  subject to  gj(x) ≤ 0   for j = 1, 2, ..., J
              hk(x) = 0   for k = 1, 2, ..., K
              x = (x1, x2, ..., xN)
• Let the objective function f(x) be convex, the inequality constraints gj(x) all be convex functions for j = 1, ..., J, and the equality constraints hk(x) for k = 1, ..., K be linear.
KKT Sufficiency Theorem (Special Case)
• If this is true, then the necessary KKT
conditions are also sufficient.
• Therefore, in this case, if there exists a
solution x* that satisfies the KKT necessary
conditions, then x* is an optimal solution to
the NLP problem.
• In fact, it is a global optimum.
KKT Conditions: Closing Remarks
• Kuhn-Tucker Conditions are an extension of Lagrangian
function and method.
• They provide powerful means to verify solutions
• But there are limitations…
– Sufficiency conditions are difficult to verify.
– Practical problems do not have the required nice properties.
– For example, you will have problems if you do not know the explicit constraint equations.
• If you have a multi-objective (lexicographic)
formulation, then I would suggest testing each priority
level separately.
KKT Conditions: Example
Minimize C = (x1 − 4)² + (x2 − 4)²
s.t.  2·x1 + 3·x2 ≥ 6;   12 − 3·x1 − 2·x2 ≥ 0;   x1, x2 ≥ 0
Form the Lagrangian (both inequality constraints written in g(x) ≤ 0 form so that the multipliers must be non-negative):
L = (x1 − 4)² + (x2 − 4)² + λ1·(6 − 2·x1 − 3·x2) + λ2·(3·x1 + 2·x2 − 12)
F.o.c.:
1)  Lx1 = 2·(x1 − 4) − 2·λ1 + 3·λ2 = 0
2)  Lx2 = 2·(x2 − 4) − 3·λ1 + 2·λ2 = 0
3)  Lλ1 = 6 − 2·x1 − 3·x2 ≤ 0
4)  Lλ2 = 3·x1 + 2·x2 − 12 ≤ 0
Case 1: let λ2 = 0 (2nd constraint inactive) and let the 1st constraint bind:
From 1) and 2):   λ1 = x1 − 4 = (2/3)·(x2 − 4)
From 3) binding:  x1 = 3 − (3/2)·x2
3 − (3/2)·x2 − 4 = (2/3)·(x2 − 4)   →   (13/6)·x2 = 5/3
x2* = 10/13,   x1* = 24/13,   λ1 = −28/13 < 0   (violates the KKT conditions)
Case 2: let λ1 = 0 (1st constraint inactive) and let the 2nd constraint bind:
From 1) and 2):   λ2 = −(2/3)·(x1 − 4) = 4 − x2
From 4) binding:  x1 = 4 − (2/3)·x2
(4/9)·x2 = 4 − x2   →   (13/9)·x2 = 4
x2* = 36/13,   x1* = 28/13,   λ2 = 16/13 > 0   (meets the KKT conditions)
KKT Conditions: Example
Assume Lλ1 = Lx1 = Lx2 = 0 and Lλ2 < 0 (2nd constraint slack), therefore x1, x2, λ1 may be non-zero and λ2 = 0.
1)  Lλ1 = 6 − 2·x1 − 3·x2 = 0
2)  Lx1 = 2·x1 − 8 − 2·λ1 = 0
3)  Lx2 = 2·x2 − 8 − 3·λ1 = 0
Use eq.s (1), (2), and (3) to solve for λ1, x1, and x2:
    |  0   2   3 | | λ1 |   | 6 |
    | −2   2   0 | | x1 | = | 8 |
    | −3   0   2 | | x2 |   | 8 |
|J| = 26,   |Jλ1| = −56,   |Jx1| = 48,   |Jx2| = 20
λ1 = −56/26 = −28/13 < 0,   x1 = 48/26 = 24/13 > 0,   x2 = 20/26 = 10/13 > 0
Since λ1 < 0, this case (λ1, λ2, x1, x2) does not meet the KKT conditions for a minimum.
KKT Conditions: Example
Assume Lx1 = Lx2 = Lλ2 = 0 and Lλ1 < 0 (1st constraint slack), therefore x1, x2, λ2 may be non-zero and λ1 = 0.
(1)  Lλ2 = 3·x1 + 2·x2 − 12 = 0
(2)  Lx1 = 2·x1 − 8 + 3·λ2 = 0
(3)  Lx2 = 2·x2 − 8 + 2·λ2 = 0
Use eq.s (1), (2), and (3) to solve for λ2, x1, and x2:
    | 0   3   2 | | λ2 |   | 12 |
    | 3   2   0 | | x1 | = |  8 |
    | 2   0   2 | | x2 |   |  8 |
|J| = −26,   |Jλ2| = −32,   |Jx1| = −56,   |Jx2| = −72
λ2 = 32/26 = 16/13 > 0,   x1 = 56/26 = 28/13 > 0,   x2 = 72/26 = 36/13 > 0
λ1, λ2, x1, x2 meet the KKT conditions for a minimum.
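The case analysis above can be cross-checked numerically with scipy; this is only a verification sketch (note that scipy's 'ineq' constraints use the c(x) ≥ 0 convention):

  from scipy.optimize import minimize

  obj = lambda x: (x[0] - 4)**2 + (x[1] - 4)**2
  cons = [{'type': 'ineq', 'fun': lambda x: 2*x[0] + 3*x[1] - 6},    # 2x1 + 3x2 >= 6
          {'type': 'ineq', 'fun': lambda x: 12 - 3*x[0] - 2*x[1]}]   # 3x1 + 2x2 <= 12

  res = minimize(obj, x0=[2.0, 2.0], method='SLSQP',
                 bounds=[(0, None), (0, None)], constraints=cons)
  print(res.x)    # approximately [28/13, 36/13] = [2.1538, 2.7692]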