Optimization

The document discusses multivariable unconstrained optimization, detailing the use of gradients and Hessians to find local minima and maxima of functions with multiple variables. It explains methods such as Newton's Method and the Steepest Descent Method for solving optimization problems, including the importance of step size and convergence criteria. Additionally, it covers the concepts of positive/negative definiteness of matrices and provides examples to illustrate the optimization process.



Mathematical Methods of Optimization

Multivariable Unconstrained Optimization
• For functions of one variable, we use the 1st and 2nd derivatives.
• For functions of multiple variables, we use the analogous information: the gradient and the Hessian.
• The gradient is the vector of first derivatives with respect to all variables, whereas the Hessian is the equivalent of the second derivative.

The Gradient
• Review of the gradient (∇): for a function f of variables x1, x2, …, xn,

$$\nabla f = \begin{bmatrix} \dfrac{\partial f}{\partial x_1} & \dfrac{\partial f}{\partial x_2} & \cdots & \dfrac{\partial f}{\partial x_n} \end{bmatrix}$$

Example: for $f = 15x_1 + 2x_2^3 - 3x_1 x_3^2$,

$$\nabla f = \begin{bmatrix} 15 - 3x_3^2 & 6x_2^2 & -6x_1 x_3 \end{bmatrix}$$

The Hessian
• The Hessian (∇²f) of f(x1, x2, …, xn) is:

$$\nabla^2 f = \begin{bmatrix}
\dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\
\dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial^2 f}{\partial x_n \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2}
\end{bmatrix}$$


Hessian Example
• Example: for $f = 15x_1 + 2x_2^3 - 3x_1 x_3^2$,

$$\nabla f = \begin{bmatrix} 15 - 3x_3^2 & 6x_2^2 & -6x_1 x_3 \end{bmatrix}$$

$$\nabla^2 f = \begin{bmatrix} 0 & 0 & -6x_3 \\ 0 & 12x_2 & 0 \\ -6x_3 & 0 & -6x_1 \end{bmatrix}$$

Unconstrained Optimization
The optimization procedure for multivariable functions is:
1. Solve for the gradient of the function equal to zero to obtain candidate points.
2. Obtain the Hessian of the function and evaluate it at each of the candidate points.
   • If the result is "positive definite", the point is a local minimum.
   • If the result is "negative definite", the point is a local maximum.
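To connect this procedure with the definiteness test defined next, here is a minimal MATLAB/Octave sketch (not part of the original slides; the handle names gradf, hessf and x_cand are chosen for illustration) that evaluates the Hessian of the example function at a candidate point and inspects the signs of its eigenvalues:

% Example function: f = 15*x1 + 2*x2^3 - 3*x1*x3^2
gradf = @(x) [15 - 3*x(3)^2, 6*x(2)^2, -6*x(1)*x(3)];   % gradient (row vector)
hessf = @(x) [ 0,       0,        -6*x(3);
               0,       12*x(2),   0;
              -6*x(3),  0,        -6*x(1) ];             % Hessian
x_cand = [0, 0, sqrt(5)];        % a candidate point where the gradient vanishes
disp(gradf(x_cand))              % should display zeros
lam = eig(hessf(x_cand));        % eigenvalues of the Hessian at the candidate
if all(lam > 0)
    disp('positive definite -> local minimum');
elseif all(lam < 0)
    disp('negative definite -> local maximum');
else
    disp('mixed or zero eigenvalues -> no conclusion from this test');
end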

Positive/Negative Definite
• A matrix is "positive definite" if all of the eigenvalues of the matrix are positive (> 0).
• A matrix is "negative definite" if all of the eigenvalues of the matrix are negative (< 0).

Positive/Negative Semi-definite
• A matrix is "positive semi-definite" if all of the eigenvalues are non-negative (≥ 0).
• A matrix is "negative semi-definite" if all of the eigenvalues are non-positive (≤ 0).


Example Matrix
Given the matrix A:

$$A = \begin{bmatrix} 2 & 4 & 5 \\ -5 & -7 & -1 \\ 1 & 1 & 2 \end{bmatrix}$$

The eigenvalues of A are:

$$\lambda_1 = -3.702, \quad \lambda_2 = -2, \quad \lambda_3 = 2.702$$

Since the eigenvalues do not all have the same sign, this matrix is neither positive definite nor negative definite; it is indefinite.

Unconstrained NLP Example
Consider the problem:

Minimize f(x1, x2, x3) = (x1)² + x1(1 − x2) + (x2)² − x2x3 + (x3)² + x3

First, we find the gradient with respect to xi:

$$\nabla f = \begin{bmatrix} 2x_1 + 1 - x_2 \\ -x_1 + 2x_2 - x_3 \\ -x_2 + 2x_3 + 1 \end{bmatrix}$$

Unconstrained NLP Example
Next, we set the gradient equal to zero:

$$\nabla f = \begin{bmatrix} 2x_1 + 1 - x_2 \\ -x_1 + 2x_2 - x_3 \\ -x_2 + 2x_3 + 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

So, we have a system of 3 equations and 3 unknowns. When we solve, we get:

$$x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -1 \\ -1 \\ -1 \end{bmatrix}$$

So we have only one candidate point to check. Find the Hessian:

$$\nabla^2 f = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}$$

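Because the stationarity conditions for this example are linear, the candidate point and the eigenvalue check can be reproduced with a few lines of MATLAB/Octave (an illustrative sketch, not the slides' procedure; A and b are simply the coefficients of the gradient equations):

% Stationarity conditions written as A*x = b:
%   2*x1 -  x2        = -1
%   -x1  + 2*x2 -  x3 =  0
%        -  x2 + 2*x3 = -1
A = [ 2 -1  0;
     -1  2 -1;
      0 -1  2 ];
b = [-1; 0; -1];
x_cand = A \ b      % expected: [-1; -1; -1]
H = A;              % for this quadratic-plus-linear f, the Hessian equals A
lam = eig(H)        % expected: 0.586, 2, 3.414 (all > 0 -> local minimum)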

Unconstrained NLP Example
The eigenvalues of this matrix are:

$$\lambda_1 = 3.414, \quad \lambda_2 = 0.586, \quad \lambda_3 = 2$$

All of the eigenvalues are > 0, so the Hessian is positive definite.

So, the point x = [x1, x2, x3]ᵀ = [−1, −1, −1]ᵀ is a (local) minimum.

Unlike in Linear Programming, unless we know the shape of the function being minimized, or can determine whether it is convex, we cannot tell whether this point is the global minimum or whether there are function values smaller than it.

Method of Solution
• In the previous example, when we set the gradient equal to zero, we had a system of 3 linear equations in 3 unknowns.
• For other problems, these equations could be nonlinear.
• Thus, the problem can become one of solving a system of nonlinear equations, which can be very difficult.
• To avoid this difficulty, NLP problems are usually solved numerically.
• We will now look at examples of numerical methods used to find the optimum point for single-variable NLP problems. These and other methods may be found in any numerical methods reference.


Newton's Method
When solving the equation f′(x) = 0 to find a minimum or maximum, one can use the iteration step:

$$x^{k+1} = x^k - \frac{f'(x^k)}{f''(x^k)}$$

where k is the current iteration. Iteration is continued until |x^{k+1} − x^k| < ε, where ε is some specified tolerance.

Newton's Method Diagram
[Figure: the tangent of f′(x) at x^k crosses the x-axis at x^{k+1}, which lies between x^k and the solution x*.]
Newton's Method approximates f′(x) as a straight line at x^k and obtains a new point x^{k+1}, which is used to approximate the function at the next iteration. This is carried on until the new point is sufficiently close to x*.
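As a concrete illustration of this iteration (not taken from the slides), the sketch below minimizes the sample function f(x) = exp(x) − 2x, whose minimizer is ln 2:

% Minimize f(x) = exp(x) - 2*x by applying Newton's method to f'(x) = 0.
fp  = @(x) exp(x) - 2;     % first derivative
fpp = @(x) exp(x);         % second derivative
x   = 0;                   % initial guess
tol = 1e-8;
for k = 1:50
    x_new = x - fp(x)/fpp(x);      % Newton step
    if abs(x_new - x) < tol        % convergence test from the slide
        break
    end
    x = x_new;
end
x_new                       % converges to log(2) ~= 0.6931 (a minimum, since f'' > 0)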

Newton's Method Comments
• One must ensure that f(x^{k+1}) < f(x^k) when seeking a minimum and f(x^{k+1}) > f(x^k) when seeking a maximum.
• Disadvantages:
  – Both the first and second derivatives must be calculated.
  – The initial guess is very important: if it is not close enough to the solution, the method may not converge.

Regula-Falsi Method
This method requires two points, x^a and x^b, that bracket the solution to the equation f′(x) = 0:

$$x^c = x^b - \frac{f'(x^b)\,(x^b - x^a)}{f'(x^b) - f'(x^a)}$$

where x^c will lie between x^a and x^b. The next interval will be x^c and whichever of x^a or x^b has a derivative sign opposite to that at x^c.


Regula-Falsi Diagram
[Figure: f′(x) plotted against x, with the bracketing points x^a and x^b on either side of the root x*, and the interpolated point x^c between them.]
The Regula-Falsi method approximates the function f′(x) as a straight line and interpolates to find the root.

Regula-Falsi Comments
• This method requires initial knowledge of two points bounding the solution.
• However, it does not require the calculation of the second derivative.
• The Regula-Falsi Method typically requires slightly more iterations to converge than Newton's Method.
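A matching MATLAB/Octave sketch of the Regula-Falsi update for the same sample function f(x) = exp(x) − 2x used above (illustrative only):

% Regula-Falsi on f'(x) = 0 for f(x) = exp(x) - 2*x, bracketed by xa and xb.
fp = @(x) exp(x) - 2;
xa = 0;  xb = 2;                 % f'(xa) < 0 < f'(xb), so the root is bracketed
tol = 1e-8;
for k = 1:200
    xc = xb - fp(xb)*(xb - xa)/(fp(xb) - fp(xa));   % linear interpolation
    if abs(fp(xc)) < tol
        break
    end
    if sign(fp(xc)) == sign(fp(xa))   % keep the sub-interval that still brackets the root
        xa = xc;
    else
        xb = xc;
    end
end
xc                                % again approaches log(2) ~= 0.6931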

Slides: Boyd and Vandenberghe, Convex optimization


Multivariable Optimization
• Now we will consider unconstrained multivariable optimization.
• Nearly all multivariable optimization methods do the following:
1. Choose a search direction d^k.
2. Minimize along that direction to find a new point:

$$x^{k+1} = x^k + \alpha^k d^k$$

where k is the current iteration number and α^k is a positive scalar called the step size.

The Step Size
• The step size α^k is calculated in the following way:
• We want to minimize the function f(x^{k+1}) = f(x^k + α^k d^k), where the only variable is α^k because x^k and d^k are known.
• We set

$$\frac{d f(x^k + \alpha^k d^k)}{d\alpha^k} = 0$$

and solve for α^k using a single-variable solution method such as the ones shown previously.


Steepest Descent Method
• This method is very simple: it uses the gradient (for maximization) or the negative gradient (for minimization) as the search direction:

$$d^k = \pm \nabla f(x^k) \quad \text{for} \quad \begin{Bmatrix} \max \\ \min \end{Bmatrix}$$

So,

$$x^{k+1} = x^k \pm \alpha^k \nabla f(x^k)$$

• Because the gradient is the rate of change of the function at that point, using the gradient (or negative gradient) as the search direction helps reduce the number of iterations needed.
[Figure: contours of f(x) at values 5, 20, and 25, with ∇f(x^k) pointing uphill from x^k and −∇f(x^k) pointing toward lower contours.]

Steepest Descent Method Steps
The steps of the Steepest Descent Method are:
1. Choose an initial point x⁰.
2. Calculate the gradient ∇f(x^k), where k is the iteration number.
3. Calculate the search vector: d^k = −∇f(x^k).
4. Calculate the next x: x^{k+1} = x^k + α^k d^k. Use a single-variable optimization method to determine α^k.
5. To determine convergence, either use some given tolerance ε₁ and evaluate

$$\left| f(x^{k+1}) - f(x^k) \right| < \varepsilon_1$$

for convergence, or use another tolerance ε₂ and evaluate

$$\left\| \nabla f(x^k) \right\| < \varepsilon_2$$

for convergence.
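These five steps can be written out compactly for the earlier example f(x1, x2, x3) = (x1)² + x1(1 − x2) + (x2)² − x2x3 + (x3)² + x3. The sketch below is illustrative; in particular, the line search uses MATLAB's built-in fminbnd over a fixed bracket, which is just one possible single-variable method (Newton or Regula-Falsi from the previous slides would also work):

f     = @(x) x(1)^2 + x(1)*(1 - x(2)) + x(2)^2 - x(2)*x(3) + x(3)^2 + x(3);
gradf = @(x) [2*x(1) + 1 - x(2); -x(1) + 2*x(2) - x(3); -x(2) + 2*x(3) + 1];
x = [0; 0; 0];                     % step 1: initial point
for k = 1:100
    g = gradf(x);                  % step 2: gradient
    if norm(g) < 1e-4              % step 5: ||grad f|| < eps2
        break
    end
    d = -g;                        % step 3: search direction
    phi   = @(a) f(x + a*d);       % single-variable function of the step size
    alpha = fminbnd(phi, 0, 10);   % step 4: line search on a fixed bracket [0, 10]
    x = x + alpha*d;
end
x        % approaches the analytical minimizer [-1; -1; -1]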


Convergence
• These two criteria can be used for any of the multivariable optimization methods discussed here.
• Recall: the norm of a vector x, ‖x‖, is given by:

$$\|x\| = \sqrt{x^T x} = \sqrt{(x_1)^2 + (x_2)^2 + \cdots + (x_n)^2}$$

Steepest Descent Example
Let's solve the earlier problem with the Steepest Descent Method:

Minimize f(x1, x2, x3) = (x1)² + x1(1 − x2) + (x2)² − x2x3 + (x3)² + x3

Let's pick x⁰ = [0, 0, 0].

Steepest Descent Example
∇f(x) = [2x1 + (1 − x2), −x1 + 2x2 − x3, −x2 + 2x3 + 1]

d⁰ = −∇f(x⁰) = −[2(0) + 1 − 0, −0 + 0 − 0, −0 + 0 + 1] = −[1, 0, 1] = [−1, 0, −1]

x¹ = [0, 0, 0] + α⁰[−1, 0, −1]

Now, we need to determine α⁰:

f(x¹) = (α⁰)² + (−α⁰)(1) + 0 − 0 + (α⁰)² + (−α⁰) = 2(α⁰)² − 2(α⁰)

df(x¹)/dα⁰ = 4(α⁰) − 2

Now, set this equal to zero and solve: 4(α⁰) = 2  ⇒  α⁰ = 2/4 = 1/2


Steepest Descent Example
So,

x¹ = [0, 0, 0] + α⁰[−1, 0, −1] = [0, 0, 0] + [−1/2, 0, −1/2]

⇒ x¹ = [−1/2, 0, −1/2]

Take the negative gradient to find the next search direction:

d¹ = −∇f(x¹) = −[−1 + 1 + 0, 1/2 + 0 + 1/2, 0 − 1 + 1] = −[0, 1, 0]

⇒ d¹ = [0, −1, 0]

Steepest Descent Example
Update the iteration formula:

x² = [−1/2, 0, −1/2] + α¹[0, −1, 0] = [−1/2, −α¹, −1/2]

Insert this into the original function and take the derivative so that we can find α¹:

f(x²) = 1/4 + (−1/2)(1 + α¹) + (α¹)² − (α¹)(1/2) + 1/4 − 1/2 = (α¹)² − α¹ − 1/2

df(x²)/dα¹ = 2(α¹) − 1


Steepest Descent Example
Now we can set the derivative equal to zero and solve for α¹:

2(α¹) = 1  ⇒  α¹ = 1/2

Now, calculate x²:

x² = [−1/2, 0, −1/2] + α¹[0, −1, 0] = [−1/2, 0, −1/2] + [0, −1/2, 0]

⇒ x² = [−1/2, −1/2, −1/2]

The next search direction:

d² = −∇f(x²) = −[−1 + 1 + 1/2, 1/2 − 1 + 1/2, 1/2 − 1 + 1] = −[1/2, 0, 1/2] = [−1/2, 0, −1/2]

So,

x³ = [−1/2, −1/2, −1/2] + α²[−1/2, 0, −1/2] = [−(α² + 1)/2, −1/2, −(α² + 1)/2]

Find α²:

f(x³) = (1/2)(α² + 1)² − (3/2)(α² + 1) + 1/4

df(x³)/dα² = (α² + 1) − 3/2

Set the derivative equal to zero and solve: (α² + 1) = 3/2  ⇒  α² = 1/2


Steepest Descent Example
Calculate x³:

x³ = [−1/2, −1/2, −1/2] + α²[−1/2, 0, −1/2] = [−1/2, −1/2, −1/2] + [−1/4, 0, −1/4]

⇒ x³ = [−3/4, −1/2, −3/4]

Find the next search direction:

d³ = −∇f(x³) = −[0, 1/2, 0] = [0, −1/2, 0]

x⁴ = [−3/4, −1/2, −3/4] + α³[0, −1/2, 0] = [−3/4, −(α³ + 1)/2, −3/4]

Find α³: insert x⁴ into the original function, take the derivative with respect to α³, and set it equal to zero:

df(x⁴)/dα³ = (1/2)(α³ + 1) − 9/8 = 0  ⇒  α³ = 5/4

So, x⁴ becomes:

x⁴ = [−3/4, −1/2, −3/4] + [0, −5/8, 0]

⇒ x⁴ = [−3/4, −9/8, −3/4]


Steepest Descent Example
The next search direction:

d⁴ = −∇f(x⁴) = −[5/8, −3/4, 5/8] = [−5/8, 3/4, −5/8]

x⁵ = [−3/4, −9/8, −3/4] + α⁴[−5/8, 3/4, −5/8]
   = [−(1/4)(3 + (5/2)α⁴), −(3/4)(3/2 − α⁴), −(1/4)(3 + (5/2)α⁴)]

Find α⁴:

f(x⁵) = (73/32)(α⁴)² − (43/32)(α⁴) − 51/64

df(x⁵)/dα⁴ = (73/16)(α⁴) − 43/32 = 0

⇒ α⁴ = 43/146

Steepest Descent Example
Update x⁵:

x⁵ = [−3/4, −9/8, −3/4] + (43/146)[−5/8, 3/4, −5/8]

⇒ x⁵ = [−1091/1168, −66/73, −1091/1168]

Let's check whether the convergence criterion is satisfied. Evaluate ‖∇f(x⁵)‖:

∇f(x⁵) = [21/584, 35/584, 21/584]

‖∇f(x⁵)‖ = √((21/584)² + (35/584)² + (21/584)²) = 0.0786


Steepest Descent Example
So ‖∇f(x⁵)‖ = 0.0786, which is very small, and we can take it to be close enough to zero for our example.

Notice that the answer

x⁵ = [−1091/1168, −66/73, −1091/1168]

is very close to the value x* = [−1, −1, −1] that we obtained analytically.

Quadratic Functions
• Quadratic functions are important for the next method we will look at.
• A quadratic function can be written in the form xᵀQx, where x is the vector of variables and Q is a matrix of coefficients.

Example:

$$\begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix} \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = 2(x_1)^2 - 2x_1x_2 + 2(x_2)^2 - 2x_2x_3 + 2(x_3)^2$$
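A quick numerical check of this expansion (illustrative only; any test vector will do):

Q = [2 -1 0; -1 2 -1; 0 -1 2];
x = [1; 2; 3];                                 % arbitrary test vector
lhs = x' * Q * x;                              % quadratic form x'Qx
rhs = 2*x(1)^2 - 2*x(1)*x(2) + 2*x(2)^2 - 2*x(2)*x(3) + 2*x(3)^2;
disp([lhs, rhs])                               % both evaluate to 12 for this x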

Conjugate Gradient Method
• The Conjugate Gradient Method has the property that if f(x) is quadratic, it will take exactly n iterations to converge, where n is the number of variables in the x vector.
• Although it works especially well with quadratic functions, this method will also work with non-quadratic functions.

https://en.wikipedia.org/wiki/Conjugate_gradient_method

Conjugate Gradient Steps
1. Choose a starting point x⁰ and calculate ∇f(x⁰). Let d⁰ = −∇f(x⁰).
2. Calculate x¹ using x¹ = x⁰ + α⁰d⁰. Find α⁰ by performing a single-variable optimization on f(x⁰ + α⁰d⁰) using the methods discussed earlier. (See the illustration after the algorithm explanation.)


Conjugate Gradient Steps
3. Calculate ∇f(x¹) and f(x¹). The new search direction is calculated using the equation:

$$d^1 = -\nabla f(x^1) + d^0 \, \frac{\nabla^T f(x^1) \cdot \nabla f(x^1)}{\nabla^T f(x^0) \cdot \nabla f(x^0)}$$

This can be generalized for the kth iteration:

$$d^{k+1} = -\nabla f(x^{k+1}) + d^k \, \frac{\nabla^T f(x^{k+1}) \cdot \nabla f(x^{k+1})}{\nabla^T f(x^k) \cdot \nabla f(x^k)}$$

4. Use either of the two convergence criteria discussed earlier to determine convergence:

$$\left| f(x^{k+1}) - f(x^k) \right| < \varepsilon_1 \quad \text{or} \quad \left\| \nabla f(x^k) \right\| < \varepsilon_2$$
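A compact MATLAB/Octave sketch of these steps for the running quadratic example is shown below. It is illustrative only: the step size uses the exact quadratic formula derived on the "Step Size for Quadratic Functions" slides, and the variable names are chosen here for clarity.

% Conjugate gradient (Fletcher-Reeves form of step 3) for
% f(x) = x1^2 + x1*(1 - x2) + x2^2 - x2*x3 + x3^2 + x3.
gradf = @(x) [2*x(1) + 1 - x(2); -x(1) + 2*x(2) - x(3); -x(2) + 2*x(3) + 1];
H     = [2 -1 0; -1 2 -1; 0 -1 2];          % constant Hessian of this quadratic
x = [0; 0; 0];
g = gradf(x);
d = -g;                                      % step 1
for k = 1:3                                  % n = 3 variables
    alpha = -(g'*d) / (d'*H*d);              % exact step size for a quadratic
    x = x + alpha*d;                         % step 2
    g_new = gradf(x);
    if norm(g_new) < 1e-10                   % step 4
        break
    end
    beta = (g_new'*g_new) / (g'*g);          % step 3: ratio of gradient norms
    d = -g_new + beta*d;
    g = g_new;
end
x        % reaches [-1; -1; -1]

Because f here is quadratic in three variables, the loop reaches the minimizer in at most three iterations (up to rounding), matching the property stated above.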

Number of Iterations
• For quadratic functions, this method will converge in n iterations (k = n).
• For non-quadratic functions, after n iterations the algorithm cycles again, with d^{n+1} becoming d⁰.

Step Size for Quadratic Functions
• When optimizing the step size, we can approximate the function to be optimized in the following manner:

$$f(x^k + \alpha d^k) \approx f(x^k) + \nabla^T f(x^k) \cdot \alpha d^k + \tfrac{1}{2} (\alpha d^k)^T \cdot \nabla^2 f(x^k) \cdot (\alpha d^k)$$

• For a quadratic function, this is not an approximation; it is exact.


Step Size for Quadratic Functions
We take the derivative of that function with respect to α and set it equal to zero:

$$\frac{d f(x^k + \alpha d^k)}{d\alpha} = \nabla^T f(x^k) \cdot d^k + (d^k)^T \cdot \nabla^2 f(x^k) \cdot \alpha d^k = 0$$

The solution to this equation is:

$$\alpha^{k,\mathrm{opt}} = - \frac{\nabla^T f(x^k) \cdot d^k}{(d^k)^T \cdot \nabla^2 f(x^k) \cdot d^k}$$

• So, for the problem of optimizing a quadratic function, this is the optimum step size.
• For a non-quadratic function, it is an approximation of the optimum step size.
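Evaluated for the running example at x⁰ = 0 with the steepest-descent direction, this formula reproduces the step size α = 1/2 found earlier by the explicit line search (a small illustrative check):

gradf = @(x) [2*x(1) + 1 - x(2); -x(1) + 2*x(2) - x(3); -x(2) + 2*x(3) + 1];
H = [2 -1 0; -1 2 -1; 0 -1 2];         % Hessian of the quadratic example
x = [0; 0; 0];
d = -gradf(x);                          % steepest-descent direction at x0
alpha_opt = -(gradf(x)'*d) / (d'*H*d)   % returns 0.5, as found earlier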

Multivariable Newton's Method
We can approximate the gradient of f at a point x⁰ by:

$$\nabla f(x) \approx \nabla f(x^0) + \nabla^2 f(x^0) \cdot (x - x^0)$$

We can set the right-hand side equal to zero and rearrange to give:

$$x = x^0 - \left[ \nabla^2 f(x^0) \right]^{-1} \nabla f(x^0)$$

We can generalize this equation to give an iterative expression for Newton's Method:

$$x^{k+1} = x^k - \left[ \nabla^2 f(x^k) \right]^{-1} \nabla f(x^k)$$

where k is the iteration number.


Newton's Method Steps
1. Choose a starting point x⁰.
2. Calculate ∇f(x^k) and ∇²f(x^k).
3. Calculate the next x using the equation

$$x^{k+1} = x^k - \left[ \nabla^2 f(x^k) \right]^{-1} \nabla f(x^k)$$

4. Use either of the convergence criteria discussed earlier to determine convergence. If it hasn't converged, return to step 2.

Comments on Newton's Method
• Unlike the previous two methods, Newton's Method uses both the gradient and the Hessian.
• This usually reduces the number of iterations needed, but increases the computation needed for each iteration.
• So, for very complex functions, a simpler method is usually faster.
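A minimal MATLAB/Octave sketch of these steps for the running example (illustrative names; the backslash solve stands in for an explicit inverse):

gradf = @(x) [2*x(1) + 1 - x(2); -x(1) + 2*x(2) - x(3); -x(2) + 2*x(3) + 1];
hessf = @(x) [2 -1 0; -1 2 -1; 0 -1 2];     % constant for this quadratic f
x = [0; 0; 0];                               % step 1
for k = 1:20
    g = gradf(x);
    if norm(g) < 1e-10                       % step 4: convergence test
        break
    end
    x = x - hessf(x) \ g;                    % steps 2-3
end
x        % a quadratic converges in a single Newton step: [-1; -1; -1]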

Newton's Method Example
For an example, we will use the same problem as before:

Minimize f(x1, x2, x3) = (x1)² + x1(1 − x2) + (x2)² − x2x3 + (x3)² + x3

$$\nabla f(x) = \begin{bmatrix} 2x_1 - x_2 + 1 & -x_1 + 2x_2 - x_3 & -x_2 + 2x_3 + 1 \end{bmatrix}$$

The Hessian is:

$$\nabla^2 f(x) = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}$$

And we will need the inverse of the Hessian:

$$\left[ \nabla^2 f(x) \right]^{-1} = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}^{-1} = \begin{bmatrix} 3/4 & 1/2 & 1/4 \\ 1/2 & 1 & 1/2 \\ 1/4 & 1/2 & 3/4 \end{bmatrix}$$


Newton's Method Example
So, pick x⁰ = [0, 0, 0]ᵀ.

Calculate the gradient for the 1st iteration:

∇f(x⁰) = [0 − 0 + 1, −0 + 0 − 0, −0 + 0 + 1] = [1, 0, 1]

So, the new x is:

$$x^1 = x^0 - \left[ \nabla^2 f(x^0) \right]^{-1} \nabla f(x^0) = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} - \begin{bmatrix} 3/4 & 1/2 & 1/4 \\ 1/2 & 1 & 1/2 \\ 1/4 & 1/2 & 3/4 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$$

$$\Rightarrow x^1 = \begin{bmatrix} -1 \\ -1 \\ -1 \end{bmatrix}$$

Newton's Method Example
Now calculate the new gradient:

∇f(x¹) = [−2 + 1 + 1, 1 − 2 + 1, 1 − 2 + 1] = [0, 0, 0]

Since the gradient is zero, the method has converged.

Comments on Example
• Because it uses the 2nd derivative, Newton's Method models quadratic functions exactly and can find the optimum point in one iteration.
• If the function had been of higher order, the Hessian would not have been constant, and it would have been much more work to calculate the Hessian and take its inverse at each iteration.


Constrained Nonlinear Optimization
• Previously in this chapter, we solved NLP problems that had only objective functions, with no constraints.
• Now we will look at methods for solving problems that include constraints.

Convexity & Global vs. Local Optima
➔ When minimizing a function, if we want to be sure that we can get a global solution via differentiation, we need to impose some requirements on our objective function.
➔ We will also need to impose some requirements on the feasible set S (the set of possible values the solution x* may take).

The problem:   min f(x)   subject to   h(x) = c,  g(x) ≥ b   (equivalently: min f(x) subject to x ∈ S, the feasible set)

Definition: If f(x) is a convex function, and if S is a convex set, then the above problem is a convex programming problem.
Definition: If f(x) is not a convex function, or if S is not a convex set, then the above problem is a non-convex programming problem.

Convex vs. Non-convex Programming Problems
• The desirable quality of a convex programming problem is that any locally optimal solution is also a globally optimal solution. If we have a method of finding a locally optimal solution, that method also finds for us the globally optimal solution. (We address convex programming problems in addressing linear programming.)
• The undesirable quality of a non-convex programming problem is that a method which finds a locally optimal solution does not necessarily find the globally optimal solution. (We will also, later, address a special form of non-convex programming problems called integer programs.)

A Convex Programming Problem
Three forms of equality-constrained mathematical programming problems:
• Two variables with one equality constraint:   min f(x1, x2)  s.t.  h(x1, x2) = c
• Multi-variable with one equality constraint:   min f(x)  s.t.  h(x) = c
• Multi-variable with multiple equality constraints:   min f(x)  s.t.  h(x) = c
We focus on the first one, but the conclusions we derive will also apply to the other two. The benefit of focusing on this one is that we can visualize it.


Contour Maps / Constant-Cost Contours / Level Sets
Definition: A contour map is a 2-dimensional plane, i.e., a coordinate system in 2 variables, say x1, x2, that illustrates curves (contours) of constant functional value f(x1, x2).

Example: Draw the contour map for f(x1, x2) = x1² + x2².

[X,Y] = meshgrid(-2.0:.2:2.0, -2.0:.2:2.0);
Z = X.^2 + Y.^2;
[c,h] = contour(X,Y,Z);   % contour plot
clabel(c,h);
grid;
xlabel('x1');
ylabel('x2');

Contour Maps and 3-D Illustrations
Example: Draw the 3-D surface for f(x1, x2) = x1² + x2².

[X,Y] = meshgrid(-2.0:.2:2.0, -2.0:.2:2.0);
Z = X.^2 + Y.^2;
surfc(X,Y,Z)              % surface plot with contours underneath
xlabel('x1')
ylabel('x2')
zlabel('f(x1,x2)')

The height of the surface is f(x). Each contour of fixed value f is the projection onto the x1-x2 plane of a horizontal slice made of the 3-D figure at a value f above the x1-x2 plane.


Solving a Convex Program: Graphical Analysis
Example: Solve this convex program:

min f(x1, x2) = x1² + x2²
s.t. h(x1, x2) = x1 + x2 = √6

A straight line is a convex set because a line segment between any two points on it remains on it.

x1 + x2 = √6  ⇒  x2 = −x1 + √6

Superimpose this relation on top of the contour plot for f(x1, x2).
1. f(x1, x2) must be minimized, and so we would like the solution to be as close to the origin as possible;
2. The solution must be on the thick line in the right-hand corner of the plot, since this line represents the equality constraint.

Solution:

x* = (x1, x2)* ≈ (1.25, 1.25),   f(x1, x2)* = 3

• Any contour f < 3 does not intersect the equality constraint.
• Any contour f > 3 intersects the equality constraint at two points.
➔ The contour f = 3 and the equality constraint just touch each other at the point x*.

"Just touch": the two curves are tangent to one another at the solution point.


Solving a Convex Program: Graphical Analysis
The two curves are tangent to one another at the solution point.
➔ The normal (gradient) vectors of the two curves, at the solution (tangent) point, are parallel. This means the following two vectors are parallel:

$$\nabla f(x_1, x_2)^* = \nabla \left( x_1^2 + x_2^2 \right)^* = \begin{bmatrix} 2x_1 \\ 2x_2 \end{bmatrix}^*$$

$$\nabla \left( h(x_1,x_2)^* - \sqrt{6} \right) = \nabla \left( x_1 + x_2 - \sqrt{6} \right)^* = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

"Parallel" means that the two vectors have the same direction. We do not know that they have the same magnitude. To account for this, we equate them with a "multiplier" λ:

$$\nabla f(x_1, x_2)^* = \lambda \nabla \left( h(x_1,x_2)^* - c \right)$$

Moving everything to the left:

$$\nabla f(x_1, x_2)^* - \lambda \nabla \left( h(x_1,x_2)^* - c \right) = 0$$

Alternately:

$$\nabla f(x_1, x_2)^* + \lambda \nabla \left( c - h(x_1,x_2)^* \right) = 0$$

Performing the gradient operation (taking derivatives with respect to x1 and x2):

$$\begin{bmatrix} \dfrac{\partial}{\partial x_1} \left( f(x_1,x_2) - \lambda \left( h(x_1,x_2) - c \right) \right) \\ \dfrac{\partial}{\partial x_2} \left( f(x_1,x_2) - \lambda \left( h(x_1,x_2) - c \right) \right) \end{bmatrix}^* = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

In this problem, we already know the solution, but what if we did not? Then could we use the above equations to find the solution?

Solving a Convex Program: Analytical Analysis
NO! Because we only have 2 equations, yet 3 unknowns: x1, x2, λ. So we need another equation. Where do we get that equation?

Recall our equality constraint: h(x1, x2) − c = 0. This must be satisfied! Therefore:

$$\begin{bmatrix} \dfrac{\partial}{\partial x_1} \left( f(x_1,x_2) - \lambda \left( h(x_1,x_2) - c \right) \right) \\ \dfrac{\partial}{\partial x_2} \left( f(x_1,x_2) - \lambda \left( h(x_1,x_2) - c \right) \right) \\ h(x_1, x_2) - c \end{bmatrix}^* = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

Three equations, three unknowns; we can solve.

Observation: the three equations are simply the partial derivatives of the function f(x1, x2) − λ(h(x1, x2) − c) with respect to x1, x2, and λ, each set equal to zero. This is obviously true for the first two equations, but it is not so obviously true for the last one. To see it, observe:

$$\frac{\partial}{\partial \lambda} \left( f(x_1,x_2) - \lambda \left( h(x_1,x_2) - c \right) \right) = 0 \;\;\Rightarrow\;\; -h(x_1,x_2) + c = 0 \;\;\Rightarrow\;\; h(x_1,x_2) = c$$


Formal Approach to Solving Our Problem
Define the Lagrangian function:

$$L(x_1, x_2, \lambda) = f(x_1, x_2) - \lambda \left( h(x_1,x_2) - c \right)$$

In a convex programming problem, the "first-order conditions" for finding the solution are given by

$$\frac{\partial L}{\partial x_1} = 0, \quad \frac{\partial L}{\partial x_2} = 0, \quad \frac{\partial L}{\partial \lambda} = 0$$

or, more compactly,

$$\nabla_x L(x, \lambda) = 0, \quad \nabla_\lambda L(x, \lambda) = 0$$

where we have used x = (x1, x2).

Applying to Our Example
Define the Lagrangian function:

$$L(x_1, x_2, \lambda) = f(x_1, x_2) - \lambda \left( h(x_1,x_2) - c \right) = x_1^2 + x_2^2 - \lambda \left( x_1 + x_2 - \sqrt{6} \right)$$

$$\frac{\partial L}{\partial x_1} = 2x_1 - \lambda = 0$$

$$\frac{\partial L}{\partial x_2} = 2x_2 - \lambda = 0$$

$$\frac{\partial L}{\partial \lambda} = -\left( x_1 + x_2 - \sqrt{6} \right) = 0$$

This is a set of 3 linear equations in 3 unknowns; we can write it in the form Ax = b.

Applying to Our Example

$$\begin{bmatrix} 2 & 0 & -1 \\ 0 & 2 & -1 \\ 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \lambda \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \sqrt{6} \end{bmatrix}$$

$$\begin{bmatrix} x_1 \\ x_2 \\ \lambda \end{bmatrix} = \begin{bmatrix} 2 & 0 & -1 \\ 0 & 2 & -1 \\ 1 & 1 & 0 \end{bmatrix}^{-1} \begin{bmatrix} 0 \\ 0 \\ \sqrt{6} \end{bmatrix} = \begin{bmatrix} 1.2247 \\ 1.2247 \\ 2.4495 \end{bmatrix}$$

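This small system can also be checked numerically; the sketch below assumes the right-hand side √6 reconstructed above:

A = [2 0 -1; 0 2 -1; 1 1 0];
b = [0; 0; sqrt(6)];
sol = A \ b        % [x1; x2; lambda] = [1.2247; 1.2247; 2.4495]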


Now, let's go back to our example, but with a nonlinear equality constraint.

Example with Nonlinear Equality

min f(x1, x2) = x1² + x2²
s.t. h(x1, x2) = 2x1x2 = 3

This problem is non-convex because a line segment connecting two points in the feasible set does not remain in the set.

2x1x2 = 3  ⇒  x2 = 3/(2x1)

Superimpose this relation on top of the contour plot for f(x1, x2).
1. f(x1, x2) must be minimized, and so we would like the solution to be as close to the origin as possible;
2. The solution must be on the thick line in the right-hand corner of the plot, since this line represents the equality constraint.

Solution:

x* = (x1, x2)* ≈ (1.25, 1.25),   f(x1, x2)* = 3

• Any contour f < 3 does not intersect the equality constraint.
• Any contour f > 3 intersects the equality constraint at two points.
➔ The contour f = 3 and the equality constraint just touch each other at the point x*.

"Just touch": the two curves are tangent to one another at the solution point.

Example with Nonlinear Equality
The two curves are tangent to one another at the solution point.
➔ The normal (gradient) vectors of the two curves, at the solution (tangent) point, are parallel. This means the following two vectors are parallel:

$$\nabla f(x_1, x_2)^* = \nabla \left( x_1^2 + x_2^2 \right)^* = \begin{bmatrix} 2x_1 \\ 2x_2 \end{bmatrix}^*$$

$$\nabla \left( h(x_1,x_2)^* - 3 \right) = \nabla \left( 2x_1x_2 - 3 \right)^* = \begin{bmatrix} 2x_2 \\ 2x_1 \end{bmatrix}^*$$

"Parallel" means that the two vectors have the same direction. We do not know that they have the same magnitude. To account for this, we equate them with a "multiplier" λ:

$$\nabla f(x_1, x_2)^* = \lambda \nabla \left( h(x_1,x_2)^* - c \right)$$

This gives us the following two equations:

$$\begin{bmatrix} \dfrac{\partial}{\partial x_1} \left( f(x_1,x_2) - \lambda \left( h(x_1,x_2) - c \right) \right) \\ \dfrac{\partial}{\partial x_2} \left( f(x_1,x_2) - \lambda \left( h(x_1,x_2) - c \right) \right) \end{bmatrix}^* = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

And we add the equality constraint to give 3 equations in 3 unknowns:

$$\begin{bmatrix} \dfrac{\partial}{\partial x_1} \left( f(x_1,x_2) - \lambda \left( h(x_1,x_2) - c \right) \right) \\ \dfrac{\partial}{\partial x_2} \left( f(x_1,x_2) - \lambda \left( h(x_1,x_2) - c \right) \right) \\ h(x_1, x_2) - c \end{bmatrix}^* = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

Three equations, three unknowns; we can solve.

Example with Nonlinear Equality
Define the Lagrangian function:

$$L(x_1, x_2, \lambda) = f(x_1, x_2) - \lambda \left( h(x_1,x_2) - c \right) = x_1^2 + x_2^2 - \lambda \left( 2x_1x_2 - 3 \right)$$

The first-order conditions are:

$$\frac{\partial L}{\partial x_1} = 2x_1 - 2\lambda x_2 = 0$$

$$\frac{\partial L}{\partial x_2} = 2x_2 - 2\lambda x_1 = 0$$

$$\frac{\partial L}{\partial \lambda} = -\left( 2x_1x_2 - 3 \right) = 0$$

You can solve this system algebraically to obtain

$$x_1 = x_2 = \sqrt{\tfrac{3}{2}} = 1.2247 \quad \text{or} \quad x_1 = x_2 = -\sqrt{\tfrac{3}{2}} = -1.2247$$

with f = 3 in both cases.

Our approach worked in this case, i.e., we found a locally optimal point that was also a globally optimal point, but because this was not a convex programming problem, we had no guarantee that this would happen.

The conditions we established above are called first-order conditions.
• For convex programming problems, they are first-order sufficient conditions for the global optimal point.
• For non-convex programming problems, they are first-order necessary conditions for the global optimal point.
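These nonlinear first-order conditions can also be solved numerically. The sketch below is illustrative only: it uses fsolve, which requires MATLAB's Optimization Toolbox (in GNU Octave it is built in), and with the starting guess shown it should converge to the positive solution.

% First-order conditions of L(x1, x2, lambda) as a nonlinear system.
foc = @(z) [ 2*z(1) - 2*z(3)*z(2);        % dL/dx1
             2*z(2) - 2*z(3)*z(1);        % dL/dx2
            -(2*z(1)*z(2) - 3) ];         % dL/dlambda
z0  = [1; 1; 1];                          % starting guess [x1; x2; lambda]
sol = fsolve(foc, z0)                     % approaches [1.2247; 1.2247; 1]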
