Intro to Dynamic Programming
In the next few lectures
• Two important algorithm design techniques
– Dynamic programming
– Greedy algorithm
Optimization Problems
• An important and practical class of computational
problems.
– For each problem instance, there are many feasible
solutions
– Each solution has a value (or cost)
– The goal is to find a solution with the optimal value
• For most of these, the best known algorithm runs in
exponential time.
• Industry would pay dearly to have faster algorithms.
• For some of them, there are quick dynamic
programming or greedy algorithms.
• For the rest, you may have to consider acceptable
solutions rather than optimal solutions
Dynamic programming
• The word “programming” is historical and
predates computer programming
• A very powerful, general tool for solving
certain optimization problems
• Once understood, it is relatively easy to
apply.
• It looks like magic until you have seen
enough examples
A motivating example: Fibonacci
numbers
• 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, …
F(0) = 1;
F(1) = 1;
F(n) = F(n-1) + F(n-2)
• How to compute F(n)?
A recursive algorithm
function fib(n)
if (n == 0 or n == 1) return 1;
else return fib(n-1) + fib(n-2);
The top of the recursion tree for F(9); note how often the same subproblems reappear:
F(9)
F(8) F(7)
F(7) F(6) F(6) F(5)
F(6) F(5) F(5) F(4) F(5) F(4) F(4) F(3)
An iterative algorithm
function fib(n)
F[0] = 1; F[1] = 1;
for i = 2 to n
F[i] = F[i-1] + F[i-2];
return F[n];
• Problem with the recursive Fib algorithm:
– Each subproblem is solved many times!
• Solution: avoid solving the same subproblem more than once
(1) Pre-compute all subproblems that may be needed later
(2) Compute on demand, but memoize the solution to avoid
recomputing
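Both strategies can be sketched in Python (a minimal illustration; the function names are ours, not from the slides):

```python
from functools import lru_cache

# Strategy (2): compute on demand, but memoize results so each
# subproblem is solved at most once.
@lru_cache(maxsize=None)
def fib_memo(n):
    if n == 0 or n == 1:
        return 1
    return fib_memo(n - 1) + fib_memo(n - 2)

# Strategy (1): pre-compute all subproblems bottom-up, exactly as in
# the iterative algorithm above.
def fib_iter(n):
    f = [1, 1]
    for i in range(2, n + 1):
        f.append(f[i - 1] + f[i - 2])
    return f[n]
```

Either way, computing F(n) takes O(n) additions instead of exponentially many calls.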
• Can you always speed up a recursive algorithm by
making it iterative?
– E.g., merge sort
– No, since there is no overlap between the two sub-problems
Another motivating example:
Shortest path problem
[Figure: a weighted graph with a start node and a goal node]
• Find the shortest path from start to goal
• An optimization problem
Possible approaches
• Brute-force algorithm
– Enumerate all possible paths and compare
– How many?
• Greedy algorithm
– Always use the currently shortest edge
– Does not necessarily lead to the optimal
solution
• Can you think about a recursive strategy?
Optimal subpaths
• Claim: if a path start→goal is optimal, then any sub-path x→y,
where x and y are on the optimal path, is also optimal.
[Figure: path start → x → y → goal with segment lengths a, b, c,
where a + b + c is shortest. If some path b' < b existed between x
and y, then a + b' + c < a + b + c]
• Proof by contradiction
– If the subpath b between x and y is not the shortest
– we can replace it with a shorter one, b’
– which will reduce the total length of the new path
– the optimal path from start to goal is not the shortest =>
contradiction!
– Hence, the subpath b must be the shortest among all paths from
x to y
Shortest path problem
SP(start, goal) = min { dist(start, A) + SP(A, goal),
                        dist(start, B) + SP(B, goal),
                        dist(start, C) + SP(C, goal) }
(where A, B, C are the neighbors of start)
A special shortest path problem
[Figure: an (m+1) × (n+1) grid of nodes, with S at the top-left
corner and G at the bottom-right corner]
Each edge has a length (cost). We need to get to G from S. Can only move
to the right or bottom. Aim: find a path with the minimum total length
Recursive solution
• Suppose we’ve found the shortest path
• It must use one of the two edges:
– (m, n-1) to (m, n) Case 1
– (m-1, n) to (m, n) Case 2
• If case 1
– find shortest path from (0, 0) to (m, n-1)
– SP(0, 0, m, n-1) + dist(m, n-1, m, n) is the overall shortest path
• If case 2
– find shortest path from (0, 0) to (m-1, n)
– SP(0, 0, m-1, n) + dist(m-1, n, m, n) is the overall shortest path
• We don’t know which case is true
– But if we’ve found the two paths, we can compare
– Actual shortest path is the one with shorter overall length
Recursive solution
Let F(i, j) = SP(0, 0, i, j). => F(m, n) is length of SP from (0, 0) to (m, n)
F(m, n) = min { F(m-1, n) + dist(m-1, n, m, n),
                F(m, n-1) + dist(m, n-1, m, n) }
Generalize:
F(i, j) = min { F(i-1, j) + dist(i-1, j, i, j),
                F(i, j-1) + dist(i, j-1, i, j) }
• To find the shortest path from (0, 0) to (m, n), we need the
shortest paths to both (m-1, n) and (m, n-1)
• With naive recursive calls, some subpaths are recomputed
many times
• Strategy: solve subproblems in a certain order such that
redundant computation is avoided
Dynamic Programming
Let F(i, j) = SP(0, 0, i, j). => F(m, n) is length of SP from (0, 0) to (m, n)
F(i, j) = min { F(i-1, j) + dist(i-1, j, i, j),
                F(i, j-1) + dist(i, j-1, i, j) }
for i = 1 .. m, j = 1 .. n
Boundary condition: i = 0 or j = 0. Easy to figure out.
Number of unique subproblems = m*n
F(i, j) depends only on F(i-1, j) and F(i, j-1), so the data
dependency determines the order of computation: left to right,
top to bottom.
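The recurrence and fill order can be sketched in Python. The input representation (a `right` matrix of horizontal edge lengths and a `down` matrix of vertical edge lengths) is an assumption made for this sketch:

```python
def grid_shortest_path(right, down):
    """Length of the shortest S -> G path, moving only right or down.

    right[i][j]: length of the edge (i, j) -> (i, j+1)
    down[i][j]:  length of the edge (i, j) -> (i+1, j)
    The grid has (m+1) x (n+1) nodes; this layout is our assumption.
    """
    m = len(down)        # number of rows of vertical edges
    n = len(right[0])    # number of horizontal edges per row
    F = [[0] * (n + 1) for _ in range(m + 1)]
    # Boundary (i = 0 or j = 0): only one way to reach these nodes.
    for j in range(1, n + 1):
        F[0][j] = F[0][j - 1] + right[0][j - 1]
    for i in range(1, m + 1):
        F[i][0] = F[i - 1][0] + down[i - 1][0]
    # Fill left to right, top to bottom: the data-dependency order.
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            F[i][j] = min(F[i - 1][j] + down[i - 1][j],
                          F[i][j - 1] + right[i][j - 1])
    return F[m][n]
```

Each of the m*n cells is computed in constant time, so the total running time is O(m*n).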
Dynamic programming illustration
[Figure: a 5×5 grid of nodes, S at top-left, G at bottom-right.
Horizontal edge lengths, row by row: (3 9 1 2), (3 2 5 2),
(2 4 2 3), (3 6 3 3), (1 2 3 2). Vertical edge lengths, row by row:
(5 3 3 3 3), (2 3 3 9 3), (6 2 3 7 4), (4 6 3 1 3).]
The table of F(i, j) values, filled left to right, top to bottom:
 0   3  12  13  15
 5   6   8  13  15
 7   9  11  13  16
13  11  14  17  20
17  17  17  18  20
Trace back
[Figure: the same grid and F table, with the shortest path marked by
tracing back from G to S: at each cell, follow the predecessor that
achieved the min.]
Shortest path has length 20
What is the actual shortest path?
Elements of dynamic programming
• Optimal sub-structures
– Optimal solutions to the original problem contains
optimal solutions to sub-problems
• Overlapping sub-problems
– Some sub-problems appear in many solutions
• Memoization and reuse
– Carefully choose the order in which sub-problems are
solved, such that each sub-problem is solved at most
once and its solution can be reused
Two steps to dynamic programming
• Formulate the solution as a recurrence relation of
solutions to subproblems.
• Specify an order to solve the subproblems so you
always have what you need.
– Bottom-up
• Tabulate the solutions to all subproblems before they are used
– What if you cannot determine the order easily, or if not
all subproblems are needed?
– Top-down
• Compute when needed.
• Remember the ones you’ve computed
Example problems that can be
solved by dynamic programming
• Sequence alignment problem (several different
versions)
• Shortest path in general graphs
• Knapsack (several different versions)
• Scheduling
• etc.
Sequence alignment
• Compare two strings to see if they are
similar
– We need to define similarity
– Very useful in many applications
– Comparing DNA sequences, articles, source
code, etc.
– Example: Longest Common Subsequence
problem (LCS)
Common subsequence
• A subsequence of a string is the string
with zero or more chars left out
• A common subsequence of two strings:
– A subsequence of both strings
– Ex: x = {A B C B D A B }, y = {B D C A B A}
– {B C} and {A A} are both common
subsequences of x and y
Longest Common Subsequence
• Given two sequences x[1 . . m] and y[1 . . n], find a
longest subsequence common to them both.
x: A B C B D A B
y: B D C A B A
LCS(x, y) = BCBA
(“a” longest common subsequence, not “the”: it need not be unique.
LCS(x, y) is functional notation, but not a function.)
Brute-force LCS algorithm
Check every subsequence of x[1 . . m] to see
if it is also a subsequence of y[1 . . n].
Analysis
• 2^m subsequences of x.
• Hence, the runtime would be exponential !
Towards a better algorithm: a DP strategy
• Key: optimal substructure and overlapping sub-
problems
• First we’ll find the length of LCS. Later we’ll modify
the algorithm to find LCS itself.
Optimal substructure
• Notice that the LCS problem has optimal substructure:
parts of the final solution are solutions of subproblems.
– If z = LCS(x, y), then any prefix of z is an LCS of a prefix of
x and a prefix of y.
[Figure: z = LCS(x, y); a prefix of z is an LCS of a prefix x[1..i]
and a prefix y[1..j]]
• Subproblems: “find LCS of pairs of prefixes of x and y”
Recursive thinking
[Figure: sequence x of length m and sequence y of length n]
• Case 1: x[m]=y[n]. There is an optimal LCS that matches
x[m] with y[n]. Find out LCS (x[1..m-1], y[1..n-1])
• Case 2: x[m] ≠ y[n]. At most one of them is in the LCS
– Case 2.1: x[m] not in LCS Find out LCS (x[1..m-1], y[1..n])
– Case 2.2: y[n] not in LCS Find out LCS (x[1..m], y[1..n-1])
Recursive thinking
• Case 1: x[m] = y[n]: reduce both sequences by 1 char
– LCS(x, y) = LCS(x[1..m-1], y[1..n-1]) || x[m] (|| denotes concatenation)
• Case 2: x[m] ≠ y[n]: reduce either sequence by 1 char
– LCS(x, y) = LCS(x[1..m-1], y[1..n]) or
LCS(x[1..m], y[1..n-1]), whichever is longer
Finding length of LCS
• Let c[i, j] be the length of LCS(x[1..i], y[1..j])
=> c[m, n] is the length of LCS(x, y)
• If x[m] = y[n]
c[m, n] = c[m-1, n-1] + 1
• If x[m] != y[n]
c[m, n] = max { c[m-1, n], c[m, n-1] }
Generalize: recursive formulation
c[i, j] = c[i-1, j-1] + 1                 if x[i] = y[j],
          max{ c[i-1, j], c[i, j-1] }     otherwise.
Recursive algorithm for LCS
LCS(x, y, i, j)
if x[i] = y[j]
then c[i, j] ← LCS(x, y, i-1, j-1) + 1
else c[i, j] ← max{ LCS(x, y, i-1, j),
LCS(x, y, i, j-1) }
Worst case: x[i] ≠ y[j], in which case the
algorithm evaluates two subproblems, each
with only one parameter decremented.
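A top-down version of this recursive algorithm that memoizes c[i, j], sketched in Python (the cache decorator is one possible way to remember computed subproblems):

```python
from functools import lru_cache

def lcs_length_topdown(x, y):
    # Top-down: compute c(i, j) on demand and memoize it, so each of
    # the m*n subproblems is solved at most once.
    @lru_cache(maxsize=None)
    def c(i, j):
        if i == 0 or j == 0:          # empty prefix: LCS length 0
            return 0
        if x[i - 1] == y[j - 1]:      # x[i] = y[j] in 1-based slide notation
            return c(i - 1, j - 1) + 1
        return max(c(i - 1, j), c(i, j - 1))
    return c(len(x), len(y))
```

With memoization the exponential recursion collapses to O(m*n) distinct calls.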
DP Algorithm
• Key: find out the correct order to solve the sub-problems
• Total number of sub-problems: m * n
c[i, j] = c[i-1, j-1] + 1                 if x[i] = y[j],
          max{ c[i-1, j], c[i, j-1] }     otherwise.
[Figure: the table c[0..m, 0..n], with c[i, j] computed from its
left, upper, and upper-left neighbors.]
DP Algorithm
LCS-Length(X, Y)
  m = length(X)                 // number of symbols in X
  n = length(Y)                 // number of symbols in Y
  for i = 1 to m: c[i, 0] = 0   // boundary: empty prefix of Y
  for j = 1 to n: c[0, j] = 0   // boundary: empty prefix of X
  for i = 1 to m                // for all X[i]
    for j = 1 to n              // for all Y[j]
      if X[i] == Y[j]
        c[i, j] = c[i-1, j-1] + 1
      else c[i, j] = max( c[i-1, j], c[i, j-1] )
  return c
LCS Example
We’ll see how LCS algorithm works on the following
example:
• X = ABCB
• Y = BDCAB
What is the LCS of X and Y?
LCS(X, Y) = BCB
[Figure: X and Y written out with the common subsequence B, C, B
aligned between them]
LCS Example (0): X = ABCB, Y = BDCAB
j 0 1 2 3 4 5
i Y[j] B D C A B
0 X[i]
1 A
2 B
3 C
4 B
X = ABCB; m = |X| = 4
Y = BDCAB; n = |Y| = 5
Allocate array c[5,6]
LCS Example (1)
j 0 1 2 3 4 5
i Y[j] B D C A B
0 X[i] 0 0 0 0 0 0
1 A 0
2 B 0
3 C 0
4 B 0
for i = 1 to m c[i,0] = 0
for j = 1 to n c[0,j] = 0
LCS Example (2)
j 0 1 2 3 4 5
i Y[j] B D C A B
0 X[i] 0 0 0 0 0 0
1 A 0 0
2 B 0
3 C 0
4 B 0
if ( X[i] == Y[j] )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
LCS Example (3)
j 0 1 2 3 4 5
i Y[j] B D C A B
0 X[i] 0 0 0 0 0 0
1 A 0 0 0 0
2 B 0
3 C 0
4 B 0
LCS Example (4)
j 0 1 2 3 4 5
i Y[j] B D C A B
0 X[i] 0 0 0 0 0 0
1 A 0 0 0 0 1
2 B 0
3 C 0
4 B 0
LCS Example (5)
j 0 1 2 3 4 5
i Y[j] B D C A B
0 X[i] 0 0 0 0 0 0
1 A 0 0 0 0 1 1
2 B 0
3 C 0
4 B 0
LCS Example (6)
j 0 1 2 3 4 5
i Y[j] B D C A B
0 X[i] 0 0 0 0 0 0
1 A 0 0 0 0 1 1
2 B 0 1
3 C 0
4 B 0
LCS Example (7)
j 0 1 2 3 4 5
i Y[j] B D C A B
0 X[i] 0 0 0 0 0 0
1 A 0 0 0 0 1 1
2 B 0 1 1 1 1
3 C 0
4 B 0
LCS Example (8)
j 0 1 2 3 4 5
i Y[j] B D C A B
0 X[i] 0 0 0 0 0 0
1 A 0 0 0 0 1 1
2 B 0 1 1 1 1 2
3 C 0
4 B 0
LCS Example (9)
j 0 1 2 3 4 5
i Y[j] B D C A B
0 X[i] 0 0 0 0 0 0
1 A 0 0 0 0 1 1
2 B 0 1 1 1 1 2
3 C 0 1 1
4 B 0
LCS Example (10)
j 0 1 2 3 4 5
i Y[j] B D C A B
0 X[i] 0 0 0 0 0 0
1 A 0 0 0 0 1 1
2 B 0 1 1 1 1 2
3 C 0 1 1 2
4 B 0
LCS Example (11)
j 0 1 2 3 4 5
i Y[j] B D C A B
0 X[i] 0 0 0 0 0 0
1 A 0 0 0 0 1 1
2 B 0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0
LCS Example (12)
j 0 1 2 3 4 5
i Y[j] B D C A B
0 X[i] 0 0 0 0 0 0
1 A 0 0 0 0 1 1
2 B 0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0 1
LCS Example (13)
j 0 1 2 3 4 5
i Y[j] B D C A B
0 X[i] 0 0 0 0 0 0
1 A 0 0 0 0 1 1
2 B 0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0 1 1 2 2
LCS Example (14)
j 0 1 2 3 4 5
i Y[j] B D C A B
0 X[i] 0 0 0 0 0 0
1 A 0 0 0 0 1 1
2 B 0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0 1 1 2 2 3
LCS Algorithm Running Time
• LCS algorithm calculates the values of each
entry of the array c[m,n]
• So what is the running time?
O(m*n)
since each c[i,j] is calculated in
constant time, and there are m*n
elements in the array
How to find actual LCS
• The algorithm just found the length of LCS, but not LCS itself.
• How to find the actual LCS?
• For each c[i,j] we know how it was acquired:
c[i, j] = c[i-1, j-1] + 1                 if x[i] = y[j],
          max( c[i, j-1], c[i-1, j] )     otherwise
• A match happens only when the first case is taken
• So we can start from c[m, n] and go backwards, remembering
x[i] whenever c[i, j] = c[i-1, j-1] + 1.
For example, in the sub-table
  2 2
  2 3
the entry 3 was obtained as c[i, j] = c[i-1, j-1] + 1 = 2 + 1 = 3.
Finding LCS
j 0 1 2 3 4 5
i Y[j] B D C A B
0 X[i] 0 0 0 0 0 0
1 A 0 0 0 0 1 1
2 B 0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0 1 1 2 2 3
Time for trace back: O(m+n).
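The table fill and trace back together can be sketched in Python (a minimal illustration; ties are broken arbitrarily, so it returns one LCS, which need not be unique):

```python
def lcs(x, y):
    """Return an LCS of x and y: fill the table, then trace back in O(m+n)."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    # Trace back from c[m][n], recording x[i] on each match step.
    i, j, out = m, n, []
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:       # match case: part of the LCS
            out.append(x[i - 1])
            i -= 1
            j -= 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1                     # value came from the cell above
        else:
            j -= 1                     # value came from the cell to the left
    return "".join(reversed(out))      # characters were collected backwards
```

On X = ABCB, Y = BDCAB this recovers BCB, matching the trace back on the slides.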
Finding LCS (2)
[The same table as on the previous slide, with the trace-back path
and the matched characters highlighted.]
LCS (reversed order): B C B
LCS (straight order): B C B