Optimization
Mathematical Methods of Optimization

Multivariable Unconstrained Optimization

• For functions with one variable, we use the 1st and 2nd derivatives.
• For functions with multiple variables, we use the analogous information: the gradient and the Hessian.
[Figure: the Regula-Falsi method on f(x), showing the bracketing points xa, xb, the interpolated point xc, and the root x*]

The Regula-Falsi method approximates the function f(x) as a straight line and interpolates to find the root.

• This method requires initial knowledge of two points bounding the solution.
• However, it does not require the calculation of the second derivative.
• The Regula-Falsi method requires slightly more iterations to converge than Newton's method.
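A minimal MATLAB sketch of the interpolate-and-rebracket loop described above; the test function f(x) = x^3 - 2, the initial bracket, and the tolerance are illustrative assumptions, not values from these notes.

% Regula-Falsi: assumes f(xa) and f(xb) have opposite signs (a valid bracket).
f  = @(x) x.^3 - 2;       % example function (an assumption for illustration)
xa = 0;  xb = 2;          % two points bounding the root
for k = 1:100
    % Interpolate the straight line through (xa, f(xa)) and (xb, f(xb)):
    xc = xb - f(xb)*(xb - xa)/(f(xb) - f(xa));
    if f(xa)*f(xc) < 0    % root lies in [xa, xc]: replace the upper point
        xb = xc;
    else                  % root lies in [xc, xb]: replace the lower point
        xa = xc;
    end
    if abs(f(xc)) < 1e-12, break; end
end
fprintf('root = %.6f after %d iterations\n', xc, k)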
So,

$$x^{k+1} = x^k - \alpha^k \,\nabla f(x^k)$$

[Figure: contours of f(x) (the contour f(x) = 25 is labeled) with the step from x^k taken along the negative gradient direction]
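A minimal MATLAB sketch of this update, applied to the three-variable quadratic worked later in these notes (its gradient and Hessian appear below); the iteration cap and stopping tolerance are assumptions.

% Steepest descent: x_{k+1} = x_k - alpha_k * grad f(x_k).
gradf = @(x) [2*x(1)-x(2)+1; -x(1)+2*x(2)-x(3); -x(2)+2*x(3)+1];
Q = [2 -1 0; -1 2 -1; 0 -1 2];   % constant Hessian of the quadratic
x = [0; 0; 0];                   % starting point x0
for k = 1:100
    g = gradf(x);
    if norm(g) < 1e-8, break; end
    alpha = (g'*g)/(g'*Q*g);     % exact step along -g (derived below)
    x = x - alpha*g;             % the steepest-descent update
end
disp(x')                         % approaches the minimizer [-1 -1 -1]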
$$x^2 = \begin{bmatrix} -1/2 \\ -1/2 \\ -1/2 \end{bmatrix}$$
$$x^3 = \begin{bmatrix} -1/2 \\ -1/2 \\ -1/2 \end{bmatrix} + \alpha^2 \begin{bmatrix} -1/2 \\ 0 \\ -1/2 \end{bmatrix} = \begin{bmatrix} -1/2 \\ -1/2 \\ -1/2 \end{bmatrix} + \begin{bmatrix} -1/4 \\ 0 \\ -1/4 \end{bmatrix} = \begin{bmatrix} -3/4 \\ -1/2 \\ -3/4 \end{bmatrix}$$

$$d^3 = -\nabla f(x^3) = -\begin{bmatrix} 0 \\ 1/2 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ -1/2 \\ 0 \end{bmatrix}$$

$$x^4 = \begin{bmatrix} -3/4 \\ -1/2 \\ -3/4 \end{bmatrix} + \alpha^3 \begin{bmatrix} 0 \\ -1/2 \\ 0 \end{bmatrix} = \begin{bmatrix} -3/4 \\ -(\alpha^3 + 1)/2 \\ -3/4 \end{bmatrix}$$
With $\alpha^3 = 1/2$, $x^4 = \begin{bmatrix} -3/4 & -3/4 & -3/4 \end{bmatrix}^T$, and the next step gives

$$x^5 = \begin{bmatrix} -3/4 \\ -3/4 \\ -3/4 \end{bmatrix} + \alpha^4 \begin{bmatrix} -1/4 \\ 0 \\ -1/4 \end{bmatrix} = \begin{bmatrix} -\tfrac{1}{4}(3 + \alpha^4) \\ -3/4 \\ -\tfrac{1}{4}(3 + \alpha^4) \end{bmatrix}$$
This can be generalized for the kth iteration:

$$\beta_k = \frac{\nabla f(x^{k+1})^T \,\nabla f(x^{k+1})}{\nabla f(x^k)^T \,\nabla f(x^k)} \qquad \text{or, equivalently,} \qquad \beta_k = \frac{\left\| \nabla f(x^{k+1}) \right\|^2}{\left\| \nabla f(x^k) \right\|^2}$$

$$d^{k+1} = -\nabla f(x^{k+1}) + \beta_k \, d^k$$
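A minimal MATLAB sketch of this conjugate-direction update (the Fletcher-Reeves form shown above), run on the same quadratic example used elsewhere in these notes; the loop bound reflects the fact that an n-variable quadratic converges in at most n conjugate-gradient steps.

% Conjugate gradient with the Fletcher-Reeves multiplier beta_k.
gradf = @(x) [2*x(1)-x(2)+1; -x(1)+2*x(2)-x(3); -x(2)+2*x(3)+1];
Q = [2 -1 0; -1 2 -1; 0 -1 2];   % Hessian of the quadratic
x = [0; 0; 0];
g = gradf(x);
d = -g;                          % first direction is steepest descent
for k = 1:3                      % at most n = 3 steps for this quadratic
    if norm(g) < 1e-12, break; end
    alpha = -(g'*d)/(d'*Q*d);    % exact step size (next slide)
    x = x + alpha*d;
    gnew = gradf(x);
    beta = (gnew'*gnew)/(g'*g);  % beta_k = ||grad_{k+1}||^2 / ||grad_k||^2
    d = -gnew + beta*d;          % d_{k+1} = -grad f(x_{k+1}) + beta_k d_k
    g = gnew;
end
disp(x')                         % reaches [-1 -1 -1] (here after 2 steps)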
Step Size for Quadratic Functions

We take the derivative of that function with respect to α and set it equal to zero:

$$\frac{d f(x^k + \alpha d^k)}{d\alpha} = \nabla f(x^k)^T d^k + (d^k)^T \,\nabla^2 f(x^k)\, \alpha \, d^k = 0$$

So, for the problem of optimizing a quadratic function,

$$\alpha^{k,\text{opt}} = -\frac{\nabla f(x^k)^T \, d^k}{(d^k)^T \,\nabla^2 f(x^k)\, d^k}$$
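As a quick numeric check of this formula in MATLAB, using the gradient and Hessian computed for the example on the next slides; the choice of the steepest-descent direction here is an assumption for illustration.

% Optimal step size at x0 = 0 along the steepest-descent direction.
Q  = [2 -1 0; -1 2 -1; 0 -1 2];   % Hessian of the example quadratic
g0 = [1; 0; 1];                   % gradient at x0 = 0 (next slides)
d0 = -g0;                         % steepest-descent direction
a_opt = -(g0'*d0)/(d0'*Q*d0)      % displays 0.5000, i.e. alpha = 1/2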
$$\nabla f(x) = \begin{bmatrix} 2x_1 - x_2 + 1 \\ -x_1 + 2x_2 - x_3 \\ -x_2 + 2x_3 + 1 \end{bmatrix}, \qquad \nabla^2 f(x) = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}$$

$$\left[ \nabla^2 f(x) \right]^{-1} = \begin{bmatrix} 3/4 & 1/2 & 1/4 \\ 1/2 & 1 & 1/2 \\ 1/4 & 1/2 & 3/4 \end{bmatrix}$$
Calculate the gradient for the 1st iteration:

$$\nabla f(x^0) = \begin{bmatrix} 0 - 0 + 1 \\ -0 + 0 - 0 \\ -0 + 0 + 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$$

$$x^1 = x^0 - \left[ \nabla^2 f(x^0) \right]^{-1} \nabla f(x^0) = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} - \begin{bmatrix} 3/4 & 1/2 & 1/4 \\ 1/2 & 1 & 1/2 \\ 1/4 & 1/2 & 3/4 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -1 \\ -1 \\ -1 \end{bmatrix}$$
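A one-step MATLAB check of this Newton iteration (a sketch; the backslash solve replaces the explicit inverse shown above):

% Newton's method: for a quadratic, a single step reaches the minimizer.
gradf = @(x) [2*x(1)-x(2)+1; -x(1)+2*x(2)-x(3); -x(2)+2*x(3)+1];
Q  = [2 -1 0; -1 2 -1; 0 -1 2];  % Hessian (constant for this quadratic)
x0 = [0; 0; 0];
x1 = x0 - Q \ gradf(x0)          % displays [-1; -1; -1], as above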
Contour Maps / Constant-Cost Contours / Level Sets

Definition: A contour map is a 2-dimensional plane, i.e., a coordinate system in 2 variables, say x1, x2, that illustrates curves (contours) of constant functional value f(x1, x2).

Example: Draw the contour map for $f(x_1, x_2) = x_1^2 + x_2^2$:

[X,Y] = meshgrid(-2.0:.2:2.0, -2.0:.2:2.0);
Z = X.^2 + Y.^2;
[c,h] = contour(X,Y,Z);
clabel(c,h);
grid;
xlabel('x1');
ylabel('x2');

Contour Maps and 3-D Illustrations

Example: Draw the 3-D surface for $f(x_1, x_2) = x_1^2 + x_2^2$:

[X,Y] = meshgrid(-2.0:.2:2.0, -2.0:.2:2.0);
Z = X.^2 + Y.^2;
surfc(X,Y,Z)
xlabel('x1')
ylabel('x2')
zlabel('f(x1,x2)')

Height is f(x). Each contour of fixed value f is the projection onto the x1-x2 plane of a horizontal slice of the 3-D figure made at a value f above the x1-x2 plane.
Solving a convex program: graphical analysis

Example: Solve this convex program:

$$\min f(x_1, x_2) = x_1^2 + x_2^2 \quad \text{s.t.} \quad h(x_1, x_2) = x_1 + x_2 = \sqrt{6}$$

A straight line is a convex set because a line segment between any two points on it remains on it.

Solution (read graphically from the contour plot):

$$x^* = (x_1, x_2)^* \approx (1.25, 1.25), \qquad f(x_1, x_2)^* = 3$$
Solving a convex program: graphical analysis

The two curves are tangent to one another at the solution point.

➔ The normal (gradient) vectors of the two curves, at the solution (tangent) point, are parallel.

"Parallel" means that the two vectors have the same direction. We do not know that they have the same magnitude. To account for this, we equate with a "multiplier" λ:

$$\nabla f(x_1, x_2)^* = \lambda \, \nabla \left( h(x_1, x_2)^* - c \right)$$

Moving everything to the left:

$$\nabla f(x_1, x_2)^* - \lambda \, \nabla \left( h(x_1, x_2)^* - c \right) = 0$$

Alternately:

$$\nabla f(x_1, x_2)^* + \lambda \, \nabla \left( c - h(x_1, x_2)^* \right) = 0$$

In this problem, we already know the solution, but what if we did not? Then could we use the above equations to find the solution?
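A quick MATLAB check of this parallelism at the solution of the example program, $x^* = (\sqrt{6}/2, \sqrt{6}/2)$ (found analytically below); the elementwise division is just a way to read off λ componentwise.

% Gradients of f = x1^2 + x2^2 and h = x1 + x2 at the solution point.
xstar  = [sqrt(6)/2; sqrt(6)/2];
grad_f = 2*xstar;              % = [sqrt(6); sqrt(6)]
grad_h = [1; 1];
lambda = grad_f ./ grad_h      % both components equal sqrt(6): parallel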
Solving a convex program: analytical analysis

In this problem, we already know the solution, but what if we did not? Then could we use the above equations to find the solution?

$$\nabla_{x_1} \left( f(x_1, x_2) - \lambda \left( h(x_1, x_2) - c \right) \right) = 0$$
$$\nabla_{x_2} \left( f(x_1, x_2) - \lambda \left( h(x_1, x_2) - c \right) \right) = 0$$

NO! Because we only have 2 equations, yet 3 unknowns: x1, x2, λ. So we need another equation. Where do we get that equation?

Recall our equality constraint: h(x1, x2) − c = 0. This must be satisfied! Therefore:

$$\begin{aligned} \nabla_{x_1} \left( f(x_1, x_2) - \lambda \left( h(x_1, x_2) - c \right) \right) &= 0 \\ \nabla_{x_2} \left( f(x_1, x_2) - \lambda \left( h(x_1, x_2) - c \right) \right) &= 0 \\ h(x_1, x_2) - c &= 0 \end{aligned}$$

Three equations, three unknowns; we can solve.

Observation: The three equations are simply the partial derivatives of the function

$$f(x_1, x_2) - \lambda \left( h(x_1, x_2) - c \right)$$

with respect to x1, x2, and λ, each set equal to 0. This is obviously true for the first two equations, but it is not so obviously true for the last one. To see it, observe:

$$\frac{\partial}{\partial \lambda} \left( f(x_1, x_2) - \lambda \left( h(x_1, x_2) - c \right) \right) = -h(x_1, x_2) + c = 0 \quad\Rightarrow\quad h(x_1, x_2) = c$$
These conditions are:

$$\frac{\partial L}{\partial x_1}(x_1, x_2, \lambda) = 0, \qquad \frac{\partial L}{\partial x_2}(x_1, x_2, \lambda) = 0, \qquad \frac{\partial L}{\partial \lambda}(x_1, x_2, \lambda) = 0$$

or, more compactly,

$$\nabla L(x, \lambda) = 0$$

where we have used x = (x1, x2).

For our example:

$$\frac{\partial L}{\partial x_1} = 2x_1 - \lambda = 0, \qquad \frac{\partial L}{\partial x_2} = 2x_2 - \lambda = 0, \qquad \frac{\partial L}{\partial \lambda} = -\left( x_1 + x_2 - \sqrt{6} \right) = 0$$

A set of 3 linear equations and 3 unknowns; we can write it in the form Ax = b.
$$\begin{bmatrix} 2 & 0 & -1 \\ 0 & 2 & -1 \\ 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \lambda \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \sqrt{6} \end{bmatrix}$$

$$\begin{bmatrix} x_1 \\ x_2 \\ \lambda \end{bmatrix} = \begin{bmatrix} 2 & 0 & -1 \\ 0 & 2 & -1 \\ 1 & 1 & 0 \end{bmatrix}^{-1} \begin{bmatrix} 0 \\ 0 \\ \sqrt{6} \end{bmatrix} = \begin{bmatrix} 1.2247 \\ 1.2247 \\ 2.4495 \end{bmatrix}$$
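The same solve in one line of MATLAB (backslash rather than the explicit inverse):

% Solve A*[x1; x2; lambda] = b for the example above.
A = [2 0 -1; 0 2 -1; 1 1 0];
b = [0; 0; sqrt(6)];
sol = A \ b        % displays [1.2247; 1.2247; 2.4495]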
Now, let's go back to our example with a nonlinear equality constraint.
$$x_1 = x_2 = -\sqrt{\tfrac{3}{2}} = -1.2247$$

In both cases, $\nabla L(x, \lambda) = 0$.