KEMBAR78
LCS Using DP | PDF | String (Computer Science) | Algorithms
0% found this document useful (0 votes)
46 views67 pages

LCS Using DP

The document discusses the longest common subsequence problem and provides an example to demonstrate the recursive algorithm. It defines the problem as finding the longest subsequence that is common to two sequences, describes an optimal substructure and provides a recursive solution using a 2D table to store subproblem solutions.

Uploaded by

BushraS
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
46 views67 pages

LCS Using DP

The document discusses the longest common subsequence problem and provides an example to demonstrate the recursive algorithm. It defines the problem as finding the longest subsequence that is common to two sequences, describes an optimal substructure and provides a recursive solution using a 2D table to store subproblem solutions.

Uploaded by

BushraS
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 67

Longest Common Subsequence

Analysis of Algorithms 1
Longest Common Subsequence
• Problem: Given 2 sequences, X = 〈x1,...,xm〉 and
Y = 〈y1,...,yn〉, find a common subsequence whose length is maximum.

springtime ncaa tournament basketball

printing north carolina krzyzewski

Subsequence need not be consecutive, but must be in order.

Comp 122, Spring 2004


Longest common subsequence (LCS)

For a sequence X = x1, x2, …, xn, a subsequence is


a subset of the sequence defined by a set of
increasing indices (i1, i2, …, ik) where
1 ≤ i1 < i2 < … < i k ≤ n

X=ABACDABAB

ABA?
Longest common subsequence (LCS)

For a sequence X = x1, x2, …, xn, a subsequence is a


subset of the sequence defined by a set of increasing
indices (i1, i2, …, ik) where
1 ≤ i1 < i2 < … < i k ≤ n

X=ABACDABAB

ABA
Longest common subsequence (LCS)

For a sequence X = x1, x2, …, xn, a subsequence is


a subset of the sequence defined by a set of
increasing indices (i1, i2, …, ik) where
1 ≤ i1 < i2 < … < i k ≤ n

X=ABACDABAB

ACA?
Longest common subsequence (LCS)

For a sequence X = x1, x2, …, xn, a subsequence is


a subset of the sequence defined by a set of
increasing indices (i1, i2, …, ik) where
1 ≤ i1 < i2 < … < i k ≤ n

X=ABACDABAB

ACA
Longest common subsequence (LCS)

For a sequence X = x1, x2, …, xn, a subsequence is


a subset of the sequence defined by a set of
increasing indices (i1, i2, …, ik) where
1 ≤ i1 < i2 < … < i k ≤ n

X=ABACDABAB

DCA?
Longest common subsequence (LCS)

For a sequence X = x1, x2, …, xn, a subsequence is


a subset of the sequence defined by a set of
increasing indices (i1, i2, …, ik) where
1 ≤ i1 < i2 < … < i k ≤ n

X=ABACDABAB

DCA
Longest common subsequence (LCS)

For a sequence X = x1, x2, …, xn, a subsequence is


a subset of the sequence defined by a set of
increasing indices (i1, i2, …, ik) where
1 ≤ i1 < i2 < … < i k ≤ n

X=ABACDABAB

AADAA?
Longest common subsequence (LCS)

For a sequence X = x1, x2, …, xn, a subsequence is


a subset of the sequence defined by a set of
increasing indices (i1, i2, …, ik) where
1 ≤ i1 < i2 < … < i k ≤ n

X=ABACDABAB

AADAA
LCS problem

Given two sequences X and Y, a common subsequence is a


subsequence that occurs in both X and Y
Given two sequences X = x1, x2, …, xn and
Y = y1, y2, …, yn,

What is the longest common subsequence?

X=ABCBDAB
Y=BDCABA
LCS problem

Given two sequences X and Y, a common subsequence is a


subsequence that occurs in both X and Y
Given two sequences X = x1, x2, …, xn and
Y = y1, y2, …, yn,

What is the longest common subsequence?

X=ABCBDAB
Y=BDCABA
Longest Common Subsequence

Analysis of Algorithms 13
Longest Common Subsequence

Analysis of Algorithms 14
Characterizing a Longest Common Subsequence

Analysis of Algorithms 15
Optimal Substructure
Theorem
Let Z = 〈z1, . . . , zk〉 be any LCS of X and Y .
1. If xm = yn, then zk = xm = yn and Zk-1 is an LCS of Xm-1 and Yn-1.
2. If xm ≠ yn, then either zk ≠ xm and Z is an LCS of Xm-1 and Y .
3. or zk ≠ yn and Z is an LCS of X and Yn-1.

Comp 122, Spring 2004


Naïve Solution
Let X be a sequence of length m,
and Y a sequence of length n.

Check for every subsequence of X whether it is a subsequence of Y, and


return the longest common subsequence found.

There are 2m subsequences of X. Testing a sequences whether or not it is a


subsequence of Y takes O(n) time. Thus, the naïve algorithm would
take O(n2m) time.

17
Naïve Algorithm
• For every subsequence of X, check whether it’s a subsequence of Y .
• Time: Θ(n2m).
– 2m subsequences of X to check.
– Each subsequence takes Θ(n) time to check:
scan Y for first letter, for second, and so on.

Comp 122, Spring 2004


Optimal Substructure
Theorem
Let Z = 〈z1, . . . , zk〉 be any LCS of X and Y .
1. If xm = yn, then zk = xm = yn and Zk-1 is an LCS of Xm-1 and Yn-1.
2. If xm ≠ yn, then either zk ≠ xm and Z is an LCS of Xm-1 and Y .
3. or zk ≠ yn and Z is an LCS of X and Yn-1.

Notation:
prefix Xi = 〈x1,...,xi〉 is the first i letters of X.
This says what any longest common subsequence must look like;
do you believe it?

Comp 122, Spring 2004


Optimal Substructure
Theorem
Let Z = 〈z1, . . . , zk〉 be any LCS of X and Y .
1. If xm = yn, then zk = xm = yn and Zk-1 is an LCS of Xm-1 and Yn-1.
2. If xm ≠ yn, then either zk ≠ xm and Z is an LCS of Xm-1 and Y .
3. or zk ≠ yn and Z is an LCS of X and Yn-1.

Proof: (case 1: xm = yn)


Any sequence Z’ that does not end in xm = yn can be made longer by adding xm = yn
to the end. Therefore,
(1) longest common subsequence (LCS) Z must end in xm = yn.
(2) Zk-1 is a common subsequence of Xm-1 and Yn-1, and
(3) there is no longer CS of Xm-1 and Yn-1, or Z would not be an LCS.

Comp 122, Spring 2004


Optimal Substructure
Theorem
Let Z = 〈z1, . . . , zk〉 be any LCS of X and Y .
1. If xm = yn, then zk = xm = yn and Zk-1 is an LCS of Xm-1 and Yn-1.
2. If xm ≠ yn, then either zk ≠ xm and Z is an LCS of Xm-1 and Y .
3. or zk ≠ yn and Z is an LCS of X and Yn-1.

Proof: (case 2: xm ≠ yn, and zk ≠ xm)


Since Z does not end in xm,
(1) Z is a common subsequence of Xm-1 and Y, and
(2) there is no longer CS of Xm-1 and Y, or Z would not be an LCS.

Comp 122, Spring 2004


Recursive Solution
• Define c[i, j] = length of LCS of Xi and Yj .
• We want c[m,n].

This gives a recursive algorithm and solves the problem.


But does it solve it well?

Comp 122, Spring 2004


Longest Common Subsequence Algorithm

Analysis of Algorithms 23
A Recursive Solution to Subproblems

Analysis of Algorithms 24
A Recursive Solution to Subproblems

Analysis of Algorithms 25
Example
INPUT: two strings
OUTPUT: longest common subsequence
X= ACTGAACTCTGTGCACT

Y= TGACTCAGCACAAAAAC

Z= TGACTCGCAC

Ex: X= {A B C B D A B }, Y= {B D C A B A}
Longest Common Subsequence:
X= ABCBDAB X= AB C BDAB
Y= BDCAB A Y= BDCAB A

Analysis of Algorithms 26
Example

Analysis of Algorithms 27
Computing the Length of an LCS

Analysis of Algorithms
28
LCS Example
We’ll see how LCS algorithm works on the following example:
• X = ABCB
• Y = BDCAB

What is the Longest Common Subsequence


of X and Y?

LCS(X, Y) = BCB
X=ABCB
Y=BDCAB

29
LCS Example (0)
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi
A
1
2 B

3 C
4 B

X = ABCB; m = |X| = 4
Y = BDCAB; n = |Y| = 5
Allocate array c[5,4]
30
LCS Example (1)
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi
0 0 0 0 0 0
A
1 0
2 B
0
3 C 0
4 B 0

for i = 1 to m c[i,0] = 0
for j = 1 to n c[0,j] = 0

31
LCS Example (2) ABCB
j 0 1 2 3 4 5
BDCAB
i Yj B D C A B
0 Xi
0 0 0 0 0 0
A
1 0 0
2 B
0
3 C 0
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

32
LCS Example (3) ABCB
j 0 1 2 3 4 5
BDCAB
i Yj B D C A B
0 Xi
0 0 0 0 0 0
A
1 0 0 0 0
2 B
0
3 C 0
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

33
LCS Example (4) ABCB
j 0 1 2 3 4 5
BDCAB
i Yj B D C A B
0 Xi
0 0 0 0 0 0
A
1 0 0 0 0 1
2 B
0
3 C 0
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

34
LCS Example (5) ABCB
j 0 1 2 3 4 5
BDCAB
i Yj B D C A B
0 Xi
0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0
3 C 0
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

35
LCS Example (6) ABCB
j 0 1 2 3 4 5
BDCAB
i Yj B D C A B
0 Xi
0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1
3 C 0
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

36
LCS Example (7) ABCB
j 0 1 2 3 4 5
BDCAB
i Yj B D C A B
0 Xi
0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1
3 C 0
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

37
LCS Example (8) ABCB
j 0 1 2 3 4 5
BDCAB
i Yj B D C A B
0 Xi
0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

38
LCS Example (10) ABCB
j 0 1 2 3 4 5
BDCAB
i Yj B D C A B
0 Xi
0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

39
LCS Example (11) ABCB
j 0 1 2 3 4 5
BDCAB
i Yj B D C A B
0 Xi
0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

40
LCS Example (12) ABCB
j 0 1 2 3 4 5
BDCAB
i Yj B D C A B
0 Xi
0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

41
LCS Example (13) ABCB
j 0 1 2 3 4 5
BDCAB
i Yj B D C A B
0 Xi
0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0 1

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

42
LCS Example (14) ABCB
j 0 1 2 3 4 5
BDCAB
i Yj B D C A B
0 Xi
0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0 1 1 2 2

if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

43
LCS Example (15) ABCB
j 0 1 2 3 4 5
BDCAB
i Yj B D C A B
0 Xi
0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0 1 1 2 2 3
if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

44
Finding LCS
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi
0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0 1 1 2 2 3

45
Finding LCS
j 0 1 2 3 4 5
i Yj B D C A B
0 Xi
0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0 1 1 2 2 3

46
LCS Length Algorithm
LCS-Length(X, Y)
1. m = length(X) // get the # of symbols in X
2. n = length(Y) // get the # of symbols in Y
3. for i = 1 to m c[i,0] = 0 // special case: Y0
4. for j = 1 to n c[0,j] = 0 // special case: X0
5. for i = 1 to m // for all Xi
6. for j = 1 to n // for all Yj
7. if ( Xi == Yj )
8. c[i,j] = c[i-1,j-1] + 1
9. else c[i,j] = max( c[i-1,j], c[i,j-1] )
10. return c[m,n] // return LCS length for X and Y
47
Constructing an LCS

Analysis of Algorithms 48
Constructing an LCS

Analysis of Algorithms 49
Computing the length of an LCS
LCS-LENGTH (X, Y)
1. m ← length[X]
2. n ← length[Y]
3. for i ← 1 to m
4. do c[i, 0] ← 0
5. for j ← 0 to n
6. do c[0, j ] ← 0
7. for i ← 1 to m
b[i, j ] points to table entry
8. do for j ← 1 to n whose subproblem we used
9. do if xi = yj in solving LCS of Xi
10. then c[i, j ] ← c[i−1, j−1] + 1 and Yj.
11. b[i, j ] ← “ ”
12. else if c[i−1, j ] ≥ c[i, j−1]
13. then c[i, j ] ← c[i− 1, j ] c[m,n] contains the length
14. b[i, j ] ← “↑” of an LCS of X and Y.
15. else c[i, j ] ← c[i, j−1]
16. b[i, j ] ← “←”
Time: O(mn)
17. return c and b

Comp 122, Spring 2004


Constructing an LCS
PRINT-LCS (b, X, i, j)
1. if i = 0 or j = 0
2. then return
3. if b[i, j ] = “ ”
4. then PRINT-LCS(b, X, i−1, j−1)
5. print xi
6. elseif b[i, j ] = “↑”
7. then PRINT-LCS(b, X, i−1, j)
8. else PRINT-LCS(b, X, i, j−1)

•Initial call is PRINT-LCS (b, X,m, n).


•When b[i, j ] = , we have extended LCS by one character. So LCS
= entries with in them.
•Time: O(m+n)
Comp 122, Spring 2004
Example 2
Analysis of Algorithms 53
If xi=yj i-1,j-1
k

K+1
Arrow= i,j

i-1,j
If xi!=yj
max
i,j-1

If max

If max
Arrow=
If xi=yj i-1,j-1
k

K+1
Arrow=

i-1,j
If xi!=yj
max
i,j-1

If max
Arrow=
LCS= B C B A
The algorithm
The algorithm

Base case initialization


The algorithm

Fill in the matrix


The algorithm
The algorithm
The algorithm
Running time?

Θ(nm)
Computing the Length of an LCS

Analysis of Algorithms 64
Longest Common Subsequence

This improvement works if we only need the length of an LCS


Analysis of Algorithms 65
Example 3
Example: Determine the LCS of (1,0,0,1,0,1,0,1) and (0,1,0,1,1,0,1,1,0).
Solution: let X = (1,0,0,1,0,1,0,1) and
Y = (0,1,0,1,1,0,1,1,0).

From the table we can deduct that LCS = 6.


There are several such sequences, for
instance (1,0,0,1,1,0)
(0,1,0,1,0,1) and
(0,0,1,1,0,1)

Analysis of Algorithms 66
Thanks

You might also like