Introduction and Unconstrained Optimization
Lecture Optimization, Master QFin at WU Vienna
Rüdiger Frey
ruediger.frey@wu.ac.at http://statmath.wu.ac.at/frey
Spring 2014
Admin
Dates of the lecture. 4.3., 6.3., 11.3., 18.3. (incl. mid-term),
25.3., 1.4., 8.4. (final exam at some point before Easter)
Examination. 10% home assignments, 30% midterm exam
(unit 4), 60% final exam
Tutor. Giorgo Ottonello, 2nd-year QFin. He will correct the
homework assignments.
Useful references
R. Frey, Lecture Notes "Optimization", available on learn@WU, 2014
Bertsekas, D., Nonlinear Programming, Athena Scientific Publishing, 1999
Griva, Nash, Sofer, Linear and Nonlinear Optimization, SIAM Publishing, 2009
Overview
Introduction and Unconstrained Optimization
  Introduction
  Mathematical Background
  Unconstrained Optimization: Theory
Optimization problems
In its most general form an optimization problem is

$$\text{minimize } f(x) \text{ subject to } x \in X \qquad (1)$$

Here the set of admissible points $X$ is a subset of $\mathbb{R}^n$, and the cost function $f$ is a function from $X$ to $\mathbb{R}$. Often the admissible points are further restricted by explicit inequality constraints. Note that maximization problems can be addressed by replacing $f$ with $-f$, as $\sup_{x \in X} f(x) = -\inf_{x \in X} \{-f(x)\}$.
Types of optimization problems.
Continuous problems. Here $X$ is of continuous nature, such as $X = \mathbb{R}^n$ or sets of the form $X = \{x \in \mathbb{R}^n : g(x) \le 0\}$ for some $g : \mathbb{R}^n \to \mathbb{R}^n$. These problems are usually tackled with calculus or convex analysis.

Discrete problems. Here $X$ is a (usually large) finite set (as in network optimization).

Nonlinear programming. Here $f$ is nonlinear or the constraint set $X$ is specified by nonlinear equations.

Linear programming. Here $f$ and $g$ are linear, and (1) takes the form $\min c^t x$ such that $Ax \le b$, for $c, x \in \mathbb{R}^n$, a matrix $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$ and $m \le n$.
Optimization problems in finance, economics and statistics.
a) Portfolio optimization. We give two examples.
Maximization of expected utility.

$$\max_{\pi \in \mathbb{R}^n} E\big(u(V_0 + \pi^t (S_T - S_0))\big),$$

Here $V_0$ is the initial wealth of an investor; $S_0 = (S_0^1, \ldots, S_0^n)$ is the initial asset price; $S_T(\omega) = (S_T^1(\omega), \ldots, S_T^n(\omega))$ the terminal asset price; $\pi$ represents the portfolio strategy. The increasing and concave function $u : \mathbb{R} \to \mathbb{R}$ is the utility function that is used to model the risk aversion of the investor. (A numerical sketch of this problem is given after the next item.)
Markowitz problem. Here one looks for the minimal-variance
portfolio among all portfolios with a given mean.
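The following is a minimal numerical sketch of the expected-utility problem above, assuming exponential utility $u(x) = -e^{-ax}$ and simulated lognormal terminal prices; all numbers ($V_0$, $S_0$, drift, volatility, risk aversion $a$) are illustrative and not part of the lecture.

```python
# Sketch: maximize E[u(V0 + pi'(S_T - S_0))] by Monte Carlo, assuming
# exponential utility u(x) = -exp(-a*x); all parameter values are illustrative.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
V0 = 100.0
S0 = np.array([50.0, 30.0])                        # initial asset prices S_0
mu, sigma = np.array([0.05, 0.08]), np.array([0.20, 0.30])
Z = rng.standard_normal((10_000, 2))
ST = S0 * np.exp(mu - 0.5 * sigma**2 + sigma * Z)  # simulated terminal prices S_T
a = 0.05                                           # risk-aversion parameter

def neg_expected_utility(pi):
    wealth = V0 + (ST - S0) @ pi                   # V0 + pi'(S_T - S_0) in each scenario
    return np.mean(np.exp(-a * wealth))            # minimizing this maximizes E[u(wealth)]

res = minimize(neg_expected_utility, x0=np.zeros(2))
print("approximately optimal portfolio:", res.x)
```

Since $u(x) = -e^{-ax}$, maximizing $E(u)$ is the same as minimizing $E(e^{-a \cdot \text{wealth}})$, which is what the code does.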
Optimization problems ctd.
b) Calibration problems. Denote by $g_1(\theta), \ldots, g_m(\theta)$ the model prices of $m$ financial instruments for a given parameter vector $\theta \in X \subseteq \mathbb{R}^n$ and by $\hat g_1, \ldots, \hat g_m$ the observed market prices. Model calibration leads to the optimization problem

$$\min_{\theta \in X} \ \frac{1}{2} \sum_{i=1}^m \big(g_i(\theta) - \hat g_i\big)^2.$$

If $g_i$ is linear in $\theta$ we have a standard regression problem; otherwise one speaks of a generalized regression problem. (A numerical sketch follows below.)

c) Maximum likelihood methods in statistics.

d) Financial mathematics. Duality results from convex analysis are crucial in financial mathematics (first fundamental theorem of asset pricing or superhedging duality).
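A minimal sketch of such a calibration in Python, assuming a toy two-parameter pricing function model_prices and synthetic market prices ghat; real instrument pricers would take their place.

```python
# Sketch: nonlinear least-squares calibration min_theta 0.5 * sum_i (g_i(theta) - ghat_i)^2,
# with a hypothetical pricing rule g_i(theta); data and model are purely illustrative.
import numpy as np
from scipy.optimize import least_squares

strikes = np.array([90.0, 100.0, 110.0])

def model_prices(theta):
    a, b = theta
    return a * np.exp(-b * strikes / 100.0)        # toy model prices g_i(theta)

ghat = model_prices(np.array([10.0, 1.5])) + 0.05  # synthetic "observed" market prices

def residuals(theta):
    return model_prices(theta) - ghat              # g_i(theta) - ghat_i

# least_squares minimizes 0.5 * sum(residuals**2), i.e. exactly the objective above
res = least_squares(residuals, x0=np.array([5.0, 1.0]))
print("calibrated parameters:", res.x)
```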
Overview
1. Unconstrained Optimization
   Necessary and sufficient optimality conditions
   Numerical methods
2. Lagrange multiplier theory and Karush-Kuhn-Tucker theory
3. Convex optimization and duality
   Convexity and separation; the dual problem
   Duality results and economic applications
Differentiability and partial derivatives
Consider some $f : U \subseteq \mathbb{R}^n \to \mathbb{R}$, $(x_1, \ldots, x_n)^t \mapsto f(x_1, \ldots, x_n)$, where $U$ is an open subset of $\mathbb{R}^n$, and some $x \in U$.

Definition. (1) $f$ is called continuously differentiable on $U$ ($f \in C^1(U)$) if for all $x \in U$ all partial derivatives exist and if the partial derivatives are continuous functions of $x$.

(2) More generally, a function $f : U \subseteq \mathbb{R}^n \to \mathbb{R}^m$, $(x_1, \ldots, x_n)^t \mapsto (f_1(x_1, \ldots, x_n), \ldots, f_m(x_1, \ldots, x_n))^t$, is continuously differentiable on $U$ if all components $f_1, \ldots, f_m$ belong to $C^1(U)$.
Example: Quadratic form
Consider a symmetric $2 \times 2$ matrix $A$ and let

$$f(x_1, x_2) = x^t A x = a_{11} x_1^2 + 2 a_{12} x_1 x_2 + a_{22} x_2^2.$$

Then

$$\frac{\partial f}{\partial x_1}(x) = 2 a_{11} x_1 + 2 a_{12} x_2 = (2Ax)_1 \quad \text{and} \quad \frac{\partial f}{\partial x_2}(x) = (2Ax)_2.$$
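A quick numerical sanity check of the formula $\nabla f(x) = 2Ax$ for $f(x) = x^t A x$, using finite differences; the matrix $A$ and the point $x$ are arbitrary test values.

```python
# Finite-difference check that the gradient of f(x) = x'Ax equals 2Ax (A symmetric).
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])              # symmetric test matrix
x = np.array([0.5, -1.0])

def f(x):
    return x @ A @ x

h = 1e-6
num_grad = np.array([(f(x + h * e) - f(x - h * e)) / (2 * h) for e in np.eye(2)])
print(num_grad)      # finite-difference gradient
print(2 * A @ x)     # analytical gradient 2Ax; both should agree to ~1e-8
```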
Gradient and Jacobi matrix
Suppose that $f : U \to \mathbb{R}$ is in $C^1(U)$. Then the column vector

$$\nabla f(x) = \Big(\frac{\partial f}{\partial x_1}(x), \ldots, \frac{\partial f}{\partial x_n}(x)\Big)^t$$

is the gradient of $f$.

For a $C^1$ function $g : U \to \mathbb{R}^m$ the Jacobi matrix is given by

$$Jg(x) = \begin{pmatrix} \frac{\partial g_1}{\partial x_1}(x) & \cdots & \frac{\partial g_1}{\partial x_n}(x) \\ \vdots & & \vdots \\ \frac{\partial g_m}{\partial x_1}(x) & \cdots & \frac{\partial g_m}{\partial x_n}(x) \end{pmatrix}.$$

Sometimes one also uses the gradient matrix $\nabla g(x) = Jg(x)^t = (\nabla g_1(x), \ldots, \nabla g_m(x))$.
First order (Taylor) approximation
Consider some $C^1$ function $f : U \to \mathbb{R}$. Then for any $x, y \in U$

$$f(y) - f(x) = \nabla f(x)^t (y - x) + R(x, y - x) \qquad (2)$$

where it holds that $\lim_{z \to 0} \frac{R(x, z)}{\|z\|} = 0$.

Idea. The function $f$ can be approximated locally around $x$ by the affine mapping $y \mapsto f(x) + \nabla f(x)^t (y - x)$. Similarly, we get for a $C^1$ function $g : U \to \mathbb{R}^m$ that $g(y) - g(x) = Jg(x)(y - x) + R(x, y - x)$ with $\lim_{z \to 0} \frac{R(x, z)}{\|z\|} = 0$.
Chain rule
Theorem. Consider $C^1$ functions $f : \mathbb{R}^n \to \mathbb{R}^m$ and $g : \mathbb{R}^k \to \mathbb{R}^n$ and let $h := f \circ g$. Then $h$ is $C^1$ and it holds for the Jacobi matrix that $Jh(x) = Jf(g(x)) \, Jg(x)$, i.e. the Jacobian of the composition is the product of the individual Jacobi matrices.

Example (Derivative along a vector). Consider a $C^1$ function $f : \mathbb{R}^n \to \mathbb{R}$. We want to consider the function $f$ along the straight line $\gamma(t) := x + tv$, for $t \in \mathbb{R}$ and $x, v \in \mathbb{R}^n$. We have $J\gamma(t) = v$, $Jf(x) = (\nabla f(x))^t$ and hence

$$\frac{d}{dt} f(\gamma(t)) = (\nabla f(x + tv))^t v, \quad \text{in particular} \quad \frac{d}{dt} f(\gamma(t))\Big|_{t=0} = (\nabla f(x))^t v.$$
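The directional-derivative formula can be illustrated numerically, again with the quadratic $f(x) = x^t A x$ from the earlier example (so $\nabla f(x) = 2Ax$); $A$, $x$ and $v$ are arbitrary test values.

```python
# Check that d/dt f(x + t v) at t = 0 equals grad f(x)' v for f(x) = x'Ax.
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
x = np.array([0.5, -1.0])
v = np.array([1.0, 2.0])

def f(x):
    return x @ A @ x

h = 1e-6
num = (f(x + h * v) - f(x - h * v)) / (2 * h)   # finite-difference derivative along v
print(num, (2 * A @ x) @ v)                     # both values should agree
```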
Second derivatives
Definition. Consider a $C^1$ function $f : U \subseteq \mathbb{R}^n \to \mathbb{R}$. Then the first order partial derivatives $\frac{\partial f}{\partial x_i}(x)$, $1 \le i \le n$, are themselves functions from $U$ to $\mathbb{R}$.

1. If all partial derivatives are $C^1$ functions, $f$ is called twice continuously differentiable on $U$ ($f \in C^2(U)$). Fix $i, j \in \{1, \ldots, n\}$. Then one writes

$$\frac{\partial^2 f}{\partial x_i \partial x_j}(x) := \frac{\partial}{\partial x_j}\Big(\frac{\partial f}{\partial x_i}\Big)(x)$$

for the second partial derivative in the directions $x_i$ and $x_j$.

2. For $f \in C^2(U)$ the matrix $Hf$ with $Hf_{ij}(x) = \frac{\partial^2 f}{\partial x_i \partial x_j}(x)$ is the Hessian matrix of $f$.
Theorem of Young and Schwarz
Theorem. Consider $f \in C^2(U)$. Then the Hessian matrix is symmetric, that is

$$\frac{\partial^2 f}{\partial x_i \partial x_j}(x) = \frac{\partial^2 f}{\partial x_j \partial x_i}(x), \qquad 1 \le i, j \le n.$$

In matrix notation, $Hf_{ij}(x) = Hf_{ji}(x)$, $1 \le i, j \le n$. In particular, the definiteness of $Hf$ can be checked using eigenvalues: $Hf$ is positive definite if all eigenvalues are strictly positive and positive semidefinite if all eigenvalues are non-negative.
Example
(1) Consider $f(x_1, x_2) = x_1^3 x_2 + x_1^2 x_2^2 + x_1 + x_2^2$. Then we have

$$\frac{\partial^2 f}{\partial x_1^2} = 6 x_1 x_2 + 2 x_2^2, \qquad \frac{\partial^2 f}{\partial x_2^2} = 2 x_1^2 + 2, \qquad \frac{\partial^2 f}{\partial x_1 \partial x_2} = 3 x_1^2 + 4 x_1 x_2.$$

(2) Consider $f(x) = x^t A x$ for some symmetric matrix $A$. Then $Hf(x) = 2A$.
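The partial derivatives in example (1) can be verified symbolically; the check below uses sympy purely for verification and is not part of the lecture material.

```python
# Symbolic verification of the second-order partials in example (1).
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
f = x1**3 * x2 + x1**2 * x2**2 + x1 + x2**2

print(sp.diff(f, x1, x1))   # 6*x1*x2 + 2*x2**2
print(sp.diff(f, x2, x2))   # 2*x1**2 + 2
print(sp.diff(f, x1, x2))   # 3*x1**2 + 4*x1*x2, equal to sp.diff(f, x2, x1)
```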
Second order Taylor expansion
Theorem. If $f$ is $C^2(U)$ and $x, y \in U$, the Taylor formula becomes

$$f(y) - f(x) = \nabla f(x)^t (y - x) + \tfrac{1}{2} (y - x)^t Hf(x) (y - x) + R_2(x, y - x)$$

where $\lim_{z \to 0} \frac{R_2(x, z)}{\|z\|^2} = 0$.

Idea. $f$ can be approximated locally around $x \in U$ by the quadratic function

$$y \mapsto f(x) + \nabla f(x)^t (y - x) + \tfrac{1}{2} (y - x)^t Hf(x) (y - x).$$

Locally, this is a better approximation than the first order Taylor approximation.
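The quality of the second-order approximation can be checked numerically; the test function below is an arbitrary smooth example (not from the slides), with gradient and Hessian computed by hand.

```python
# Check that the second-order Taylor error is o(||y - x||^2) for a smooth test function.
import numpy as np

def f(x):
    return np.exp(x[0]) + x[0] * x[1]**2

def grad_f(x):
    return np.array([np.exp(x[0]) + x[1]**2, 2 * x[0] * x[1]])

def hess_f(x):
    return np.array([[np.exp(x[0]), 2 * x[1]],
                     [2 * x[1],     2 * x[0]]])

x = np.array([0.2, 0.5])
d = np.array([1.0, -1.0])
for eps in [1e-1, 1e-2, 1e-3]:
    y = x + eps * d
    taylor2 = f(x) + grad_f(x) @ (y - x) + 0.5 * (y - x) @ hess_f(x) @ (y - x)
    print(eps, abs(f(y) - taylor2))   # error shrinks like eps**3, i.e. faster than eps**2
```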
Unconstrained optimization: the problem
In this section we consider problems of the form

$$\text{minimize } f(x) \text{ for } x \in X = \mathbb{R}^n \qquad (3)$$

Moreover, we assume that $f$ is once or twice continuously differentiable. Most results hold also in the case where $X$ is an open subset of $\mathbb{R}^n$.
Local and global optima
Definition. Consider the optimization problem (3).
1. $x^*$ is called an (unconstrained) local minimum of $f$ if there is some $\varepsilon > 0$ such that $f(x^*) \le f(x)$ for all $x \in \mathbb{R}^n$ with $\|x - x^*\| < \varepsilon$.
2. $x^*$ is called a global minimum of $f$ if $f(x^*) \le f(x)$ for all $x \in \mathbb{R}^n$.
3. $x^*$ is said to be a strict local/global minimum if the inequality $f(x^*) \le f(x)$ is strict for $x \ne x^*$.
4. The value of the problem is $f^* := \inf\{f(x) : x \in \mathbb{R}^n\}$.

Remark. Local and global maxima are defined analogously.
Necessary optimality conditions
Proposition. Suppose that $x^* \in U$ is a local minimum of $f$.
1. If $f$ is $C^1$ in $U$, then $\nabla f(x^*) = 0$ (First Order Necessary Condition or FONC).
2. If moreover $f \in C^2(U)$, then $Hf(x^*)$ is positive semi-definite (Second Order Necessary Condition or SONC).

Comments.
An $x \in \mathbb{R}^n$ with $\nabla f(x) = 0$ is called a stationary point of $f$.
The proof is based on the Taylor formula.
Necessary conditions for a local maximum: $\nabla f(x^*) = 0$ and $Hf(x^*)$ negative semidefinite.
Necessary conditions are in general not sufficient: consider $f(x) = x^3$, $x^* = 0$.
Sufficient optimality conditions
Proposition (Sufficient conditions). Let $f : U \subseteq \mathbb{R}^n \to \mathbb{R}$ be $C^2$ on $U$. Suppose that $x^* \in U$ satisfies the conditions

$$\nabla f(x^*) = 0, \qquad Hf(x^*) \text{ strictly positive definite.} \qquad (4)$$

Then $x^*$ is a local minimum.

Comments.
The sufficient conditions are not necessary: consider e.g. $f(x) = x^4$, $x^* = 0$.
No global statements are possible.
The case of convex functions
Definition (Convex sets and functions).
i) A set $X \subseteq \mathbb{R}^n$ is convex if for all $x_1, x_2 \in X$ and $\lambda \in [0, 1]$ the convex combination $\lambda x_1 + (1 - \lambda) x_2$ belongs to $X$.
ii) A function $f : X \subseteq \mathbb{R}^n \to \mathbb{R}$ ($X$ convex) is called convex if for all $x_1, x_2 \in X$ and $\lambda \in [0, 1]$

$$f(\lambda x_1 + (1 - \lambda) x_2) \le \lambda f(x_1) + (1 - \lambda) f(x_2); \qquad (5)$$

$f$ is strictly convex if the inequality is strict for $\lambda \in (0, 1)$ and $x_1 \ne x_2$.
iii) $f : X \subseteq \mathbb{R}^n \to \mathbb{R}$ is concave if $-f$ is convex, i.e. if "$\ge$" holds in (5). Strict concavity is defined in the same way.
Characterizations of Convexity
Lemma. Consider an open convex set $X \subseteq \mathbb{R}^n$. A $C^1$ function $f : X \to \mathbb{R}$ is convex on $X$ if and only if it holds for all $x, z \in X$ that

$$f(z) \ge f(x) + \nabla f(x)^t (z - x).$$

If $f$ is $C^2$, a necessary and sufficient condition for the convexity of $f$ on $X$ is that $Hf(x)$ is positive semi-definite for all $x \in X$.

Comments
$f$ is concave on $U$ if and only if $Hf$ is negative semidefinite on $U$.
Note that we may decide convexity or concavity by finding the eigenvalues of $Hf(x)$.
Example
Problem. Let $f(x_1, x_2) = 2x_1 - x_2 - x_1^2 + 2 x_1 x_2 - x_2^2$. Is $f$ convex, concave or neither?

Solution. The symmetric matrix representing the quadratic part of $f$ is

$$A = \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix}.$$

An easy computation gives for the Hessian that $Hf(x) = 2A$. Hence we need to check the definiteness of $A$.

Approach via eigenvalues. The characteristic polynomial of $A$ is $P(\lambda) = \lambda^2 + 2\lambda$; the equation $P(\lambda) = 0$ has the solutions (eigenvalues) $-2$ and $0$. Hence $A$ is negative semidefinite and the function is concave.
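The same conclusion follows from a short numerical eigenvalue computation (values as in the example):

```python
# Eigenvalues of A = [[-1, 1], [1, -1]]; Hf(x) = 2A has eigenvalues of the same signs.
import numpy as np

A = np.array([[-1.0, 1.0],
              [1.0, -1.0]])
print(np.linalg.eigvalsh(A))   # [-2., 0.]: all eigenvalues <= 0, so A is negative
                               # semidefinite and f is concave
```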
Optimality conditions for convex functions
Proposition. Let $f : X \to \mathbb{R}$ be a convex function on some convex set $X \subseteq \mathbb{R}^n$. Then
1. A local minimum of $f$ over $X$ is also a global minimum. If $f$ is strictly convex, there exists at most one global minimum.
2. If $X$ is open, the condition $\nabla f(x^*) = 0$ is necessary and sufficient for $x^* \in X$ to be a global minimum of $f$.
Example: Quadratic cost functions
Let $f(x) = \frac{1}{2} x^t Q x - b^t x$, $x \in \mathbb{R}^n$, for a symmetric $n \times n$ matrix $Q$ and some $b \in \mathbb{R}^n$. Then we have

$$\nabla f(x) = Qx - b \quad \text{and} \quad Hf(x) = Q.$$

a) Local minima. If $x^*$ is a local minimum we must have $\nabla f(x^*) = Qx^* - b = 0$ and $Hf(x^*) = Q$ positive semi-definite; hence if $Q$ is not positive semi-definite, $f$ has no local minima.
b) If $Q$ is positive semi-definite, $f$ is convex. In that case we need not distinguish global and local minima, and $f$ has a global minimum if and only if there is some $x^*$ with $Qx^* = b$.
c) If $Q$ is positive definite, $Q^{-1}$ exists and the unique global minimum is attained at $x^* = Q^{-1} b$.
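For case c) the minimizer can be computed by solving the linear system $Qx = b$; the matrix $Q$ and the vector $b$ below are arbitrary test data.

```python
# Unique global minimizer of f(x) = 0.5 x'Qx - b'x for positive definite Q: x* = Q^{-1} b.
import numpy as np

Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])             # symmetric positive definite test matrix
b = np.array([1.0, 2.0])

x_star = np.linalg.solve(Q, b)         # solves Qx = b without forming Q^{-1}
print(x_star)
print(Q @ x_star - b)                  # gradient at x*, should be (numerically) zero
```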
Existence results for a global minimum
Proposition (Weierstrass Theorem). Let $X \subseteq \mathbb{R}^n$ be non-empty and suppose that $f : X \to \mathbb{R}$ is continuous on $X$. Suppose moreover that one of the following three conditions holds:
(1) $X$ is compact (closed and bounded).
(2) $X$ is closed and $f$ is coercive, that is, for all $(x^k)_{k \in \mathbb{N}} \subseteq X$ with $\|x^k\| \to \infty$ one has $\lim_{k \to \infty} f(x^k) = \infty$.
(3) There is some $\gamma \in \mathbb{R}$ such that the level set $\{x \in X : f(x) \le \gamma\}$ is non-empty and compact.
Then $f$ has at least one global minimum and the set of all global minima is compact.

Remark. The result also holds for lower semicontinuous functions. ($f$ is called lower semicontinuous if for all $x \in X$ and all sequences $(x^k)_{k \in \mathbb{N}}$ with $x^k \to x$ we have $\liminf_{k \to \infty} f(x^k) \ge f(x)$.)