CS/ECE/ISyE 524 Introduction to Optimization Spring 2017–18
14. Convex programming
Convex sets and functions
Convex programs
Hierarchy of complexity
Example: geometric programming
Laurent Lessard (www.laurentlessard.com)
Convex sets
A set of points C ⊆ Rn is convex if for all points x, y ∈ C
and any real number 0 ≤ α ≤ 1, we have αx + (1 − α)y ∈ C .
all points in C can see each other.
can be closed or open (includes y
boundary or not), or some
combination where only some
boundary points are included. x C
can be bounded or unbounded.
14-2
Convex sets
Intersections preserve convexity:
If I is a collection of T
convex sets {Ci }i∈I , then
the intersection S = i∈I Ci is convex.
proof: Suppose x, y ∈ S and 0 ≤ α ≤ 1. By definition,
x, y ∈ Ci for each i ∈ I. By the convexity of Ci , we must have
αx + (1 − α)y ∈ Ci as well. Therefore αx + (1 − α)y ∈ S,
and we are done.
note: The union of convex sets C1 ∪ C2 is need not be convex!
14-3
Convex sets
Constraints can be characterized by sets!
If we define C1 := {x ∈ Rn | Ax ≤ b} then:
Ax ≤ b ⇐⇒ x ∈ C1
If we define C2 := {x ∈ Rn | Fx = g } then:
Ax ≤ b and Fx = g ⇐⇒ x ∈ C1 ∩ C2
14-4
Convex sets
Example: SOCP
Let C := x ∈ Rn kAx + bk ≤ c T x + d . To prove C is
convex, suppose x, y ∈ C and let z := αx + (1 − α)y . Then:
kAz + bk = kA(αx + (1 − α)y ) + bk
= kα(Ax + b) + (1 − α)(Ay + b)k
≤ αkAx + bk + (1 − α)kAy + bk
≤ α(c T x + d) + (1 − α)(c T y + d)
= c Tz + d
Therefore, kAz + bk ≤ c T z + d, i.e. C is convex.
14-5
Convex sets
Example: spectrahedron
( " # )
1 x1 x2
Let C := x ∈ R3 x1 1 x3 0 . To prove C is
x2 x3 1
convex, consider the set S := X ∈ R3×3 X = X T 0
Note that S is the PSD cone. It is convex because if we define
Z := αX + (1 − α)Y where X , Y ∈ S and 0 ≤ α ≤ 1, then
w T Zw = w T (αX + (1 − α)Y ) w
= αw T Xw + (1 − α)w T Yw
So if X 0 and Y 0, then Z 0. So S is convex. Now, C
is convex because it’s the intersection of two convex sets: the
PSD cone S and the affine space {X ∈ R3×3 | Xii = 1}.
14-6
Convex functions
If C ⊆ Rn , a function f : C → R is convex if C is a
convex set and for all x, y ∈ C and 0 ≤ α ≤ 1, we have:
f (αx + (1 − α)y ) ≤ αf (x) + (1 − α)f (y )
f is concave if −f is convex.
f (x)
3
y
2
1 x
x
-1 1 2 3 4
14-7
Convex and concave functions
Convex functions on R:
Affine: ax + b.
Absolute value: |x|.
Quadratic: ax 2 for any a ≥ 0.
Exponential: ax for any a > 0.
Powers: x α for x > 0, α ≥ 1 or α ≤ 0.
Negative entropy: x log x for x > 0.
Concave functions on R:
Affine: ax + b.
Quadratic: ax 2 for any a ≤ 0.
Powers: x α for x > 0, 0 ≤ α ≤ 1.
Logarithm: log x for x > 0.
14-8
Convex and concave functions
Convex functions on Rn :
Affine: aT x + b.
Norms: kxk2 , kxk1 , kxk∞
Quadratic form: x T Qx for any Q 0
14-9
Building convex functions
1. Nonnegative weighted sum: If f (x) and g (x) are convex
and α, β ≥ 0, then αf (x) + βg (x) is convex.
2. Composition with an affine function:
If f (x) is convex, so is g (x) := f (Ax + b)
3. Pointwise maximum: If f1 (x), . . . , fk (x) are convex, then
g (x) := max {f1 (x), . . . , fk (x)} is convex.
proof: Let z := αx + (1 − α)y as usual.
g (z) = f (Az + b)
= f (α(Ax + b) + (1 − α)(Ay + b))
≤ αf (Ax + b) + (1 − α)f (Ay + b)
= αg (x) + (1 − α)g (y )
14-10
Convex functions vs sets
Level set: If f is a convex function, then the set
of points satisfying f (x) ≤ a is a convex set.
Converse is false: if all level sets of f are convex, it does
not necessarily imply that f is a convex function!
14-11
Convex functions vs sets
Epigraph: f : Rn → R is a convex function if and only if
the set {(x, t) ∈ Rn+1 | f (x) ≤ t} is convex.
f (x) t
4 4
3 3
2 2
1 1
x x
-1 1 2 3 4 -1 1 2 3 4
plot of f (x) {(x, t) | f (x) ≤ t}
14-12
Convex programs
The standard form for a convex optimization problem:
minimize f0 (x)
x
subject to: fi (x) ≤ 0 for i = 1, . . . , k
Ax = b
x ∈C
f0 , f1 , . . . , fk are convex functions
C is a convex set
14-13
Convex programs
Can turn f0 (x) into a linear constraint (use epigraph)
Can characterize constraints using sets.
Minimalist form:
minimize c T x
x∈S
S is a convex set
14-14
Key properties and advantages
1. The set of optimal points x ? is itself a convex set.
I Proof: If x ? and y ? are optimal, then we must have
f ? = f0 (x ? ) = f0 (y ? ). Also, f ? ≤ f0 (z) for any z. Choose
z := αx ? + (1 − α)y ? with 0 ≤ α ≤ 1. By convexity of f0 ,
f ? ≤ f0 (αx ? + (1 − α)y ? ) ≤ αf0 (x ? ) + (1 − α)f0 (y ? ) = f ? .
Therefore, f0 (z) = f ? , i.e. z is also an optimal point.
2. If x is a locally optimal point, then it is globally optimal.
I Follows from the result above. A very powerful fact!
3. Upper an lower bounds available via duality (more later!)
4. Often numerically tractable (not always!)
14-15
Hierarchy of programs
From least general to most general model:
1. LP: linear cost and linear constraints
2. QP: convex quadratic cost and linear constraints
3. QCQP: convex quadratic cost and constraints
4. SOCP: linear cost, second order cone constraints
5. SDP: linear cost, semidefinite constraints
6. CVX: convex cost and constraints
Less general (simpler) models are typically preferable
14-16
Solving convex problems
Simpler models are usually more efficient to solve
Factors affecting solver speed:
How difficult is it to verify that x ∈ C ?
How difficult is it to project onto C ?
How difficult is it to evaluate f (x) ?
How difficult is it to compute ∇f (x) ?
Can the solver take advantage of sparsity?
14-17
Example: geometric programming
The log-sum-exp function (shown left) is convex:
n
!
X
f (x) := log exp xk
k=1
It’s a smoothed version of max{x1 , . . . , xk } (shown right)
14-18
Example: geometric programming
Suppose we have positive decision variables xi > 0, and
constraints of the form (with each cj > 0 and αjk ∈ R):
X α α α
cj x1 j1 x2 j2 · · · xn jn ≤ 1
j=1
Then by using the substitution yi := log(xi ), we have:
n
!
X
log exp (aj0 + aj1 y1 + · · · + ajn yn ) ≤ 0
j=1
(where aj0 := log cj ). This is a log-sum-exp function composed
with an affine function (convex!)
14-19
Example: geometric programming
Example: We want to design a box of height h, width w , and
depth d with maximum volume (hwd) subject to the limits:
total wall area: 2(hw + hd) ≤ Awall
total floor area: wd ≤ Aflr
height-width aspect ratio: α ≤ wh ≤ β
width-depth aspect ratio: γ ≤ wd ≤ δ
We can make some of the constraints linear, but not all of them.
This appears to be a nonconvex optimization problem...
14-20
Example: geometric programming
Example: We want to design a box of height h, width w , and
depth d with maximum volume (hwd) subject to the limits:
total wall area: 2(hw + hd) ≤ Awall
total floor area: wd ≤ Aflr
height-width aspect ratio: α ≤ wh ≤ β
width-depth aspect ratio: γ ≤ wd ≤ δ
minimize h−1 w −1 d −1
h,w ,d > 0
2 2 1
subject to: Awall
hw + Awall
hd ≤ 1, Aflr
wd ≤ 1
−1
αh w ≤ 1, 1
β
hw −1 ≤ 1
γwd −1 ≤ 1, 1 −1
δ
w d ≤1
14-21
Example: geometric programming
minimize h−1 w −1 d −1
h,w ,d > 0
2 2 1
subject to: Awall
hw + Awall
hd ≤ 1, Aflr
wd ≤ 1
−1
αh w ≤ 1, 1
β
hw −1 ≤ 1
γwd −1 ≤ 1, 1 −1
δ
w d ≤1
Define: x := log h, y := log w , and z := log d.
Express the problem in terms of the new variables x, y , z.
Note: h, w , d are positive but x, y , z are unconstrained.
14-22
Example: geometric programming
minimize log(e −x−y −z )
x,y ,z
subject to: log(e log(2/Awall )+x+y + e log(2/Awall )+x+z ) ≤ 0
log(e log(1/Aflr )+y +z ) ≤ 0
log(e log α−x+y ) ≤ 0, log(e − log β+x−y ) ≤ 0
log(e log γ+y −z ) ≤ 0, log(e − log δ−y +z ) ≤ 0
this is a convex model, but it can be simplified!
most of the constraints are actually linear.
14-23
Example: geometric programming
minimize −x −y −z
x,y ,z
subject to: log(e log(2/Awall )+x+y + e log(2/Awall )+x+z ) ≤ 0
y + z ≤ log Aflr
log α ≤ x − y ≤ log β
log γ ≤ z − y ≤ log δ
This is a convex optimization problem.
14-24