Dynamic Programming, Longest Common Subsequence(动态规划,最长相同子序列)
Design technique, like divide-and-conquer.
Example: Longest Common Subsequence (LCS)
Given two sequences x[1 . . m] and y[1 . . n], find a longest subsequence common to them both.
Bruce-Force algorithm
Check every subsequence of x[1 . . m] to see if it is also a subsequence of y[1 . . n].
Analysis
- Checking = O(n) time per subsequence.
- 2^m subsequences of x (each bit-vector of length m determines a distinct subsequence of x).
- So, Worst-case running time = O(n2^m) = exponential time. Slow!
Simplification
- Look at the length of a longest-common subsequence.
- Extend the algorithm to find the LCS itself.
Notation : Denote the length of a sequence s by |s|.
Strategy : Consider prefixes of x and y.
Define: c[i, j] = | LCS(x[1 . . i], y[1 . . j])|. 最长子序列长度
那么可知,c[m,n] = |LCS(x,y)|
Theorem.
证明略
Dynamic-programming hallmark #1
Optimal substructure :An optimal solution to a problem (instance) contains optimal solutions to subproblems。
动态规划适用于子问题不是独立的情况,也就是各子问题包含公共的子子问题。
Recursion algorithm for LCS
if x[i]=y[j]
then c[i,j]=LCS(c[i-1,j-1])+1
else
c[i,j] = max{LCS(i,j-1),LCS(i-1,j)};
return c[i,j]
Worst-case: x[i]!=y[j], 对任意i,j。in which case the algorithm evaluates two subproblems, each with only one parameter decremented.
Height = m + n ⇒ work potentially exponential。but we’re solving subproblems already solved!
Dynamic-programming hallmark #2
Overlapping subproblems A recursive solution contains a “small” number of distinct subproblems repeated many times.
The number of distinct LCS subproblems for two strings of lengths m and n is only mn.
Memoization algorithm
Memoization: After computing a solution to a subproblem, store it in a table. Subsequent calls check the table to avoid redoing work.
Memoization Algorithm
c[i,j]=LCS(x,y,i,j)
if c[i,j]=nil
then if x[i]=y[j]
then c[i,j] = LCS(x,ymi-1,j-1)+1
else
c[i,j] = max{LCS(i,j-1),LCS(i-1,j)}
return c[i,j];
Time = Θ(mn) = constant work per table entry.
Space = Θ(mn).
Dynamic-programming algorithm
LCS-LENGTH(X,Y)
m = length[X]
n = length[Y]
for i=1 to m
do c[i,0] = 0
for j=0 to n
fo c[0,j] = 0
for i=1 to n
do for j=1 to n
do if x[i] = y[j]
then c[i,j] = c[i-1,j-1]+1
else if c[i-1,j] >= c[i,j-1]
then c[i,j] = c[i-1,j]
else c[i,j] = c[i,j-1]
return c
Time = Θ(mn) .
Reconstruct LCS by tracing backwards .
- Space = Θ (mn ) .