Dynamic Programming, Longest Common Subsequence(动态规划,最长相同子序列)

Design technique, like divide-and-conquer.

Example: Longest Common Subsequence (LCS)

Given two sequences x[1 . . m] and y[1 . . n], find a longest subsequence common to them both.
image.png

Bruce-Force algorithm

Check every subsequence of x[1 . . m] to see if it is also a subsequence of y[1 . . n].
Analysis

  • Checking = O(n) time per subsequence.
  • 2^m subsequences of x (each bit-vector of length m determines a distinct subsequence of x).
  • So, Worst-case running time = O(n2^m) = exponential time. Slow!

    Simplification

  1. Look at the length of a longest-common subsequence.
  2. Extend the algorithm to find the LCS itself.

Notation : Denote the length of a sequence s by |s|.
Strategy : Consider prefixes of x and y.
Define: c[i, j] = | LCS(x[1 . . i], y[1 . . j])|. 最长子序列长度
那么可知,c[m,n] = |LCS(x,y)|
Theorem.
Day 15 - 图2
证明略

Dynamic-programming hallmark #1

Optimal substructure :An optimal solution to a problem (instance) contains optimal solutions to subproblems。
动态规划适用于子问题不是独立的情况,也就是各子问题包含公共的子子问题。

  1. Recursion algorithm for LCS
  2. if x[i]=y[j]
  3. then c[i,j]=LCS(c[i-1,j-1])+1
  4. else
  5. c[i,j] = max{LCS(i,j-1),LCS(i-1,j)};
  6. return c[i,j]

Worst-case: x[i]!=y[j], 对任意i,j。in which case the algorithm evaluates two subproblems, each with only one parameter decremented.
image.png
Height = m + n ⇒ work potentially exponential。but we’re solving subproblems already solved!

Dynamic-programming hallmark #2

Overlapping subproblems A recursive solution contains a “small” number of distinct subproblems repeated many times.
The number of distinct LCS subproblems for two strings of lengths m and n is only mn.
Memoization algorithm
Memoization: After computing a solution to a subproblem, store it in a table. Subsequent calls check the table to avoid redoing work.

Memoization Algorithm
    c[i,j]=LCS(x,y,i,j)
  if c[i,j]=nil
      then if x[i]=y[j]
                then c[i,j] = LCS(x,ymi-1,j-1)+1
        else 
            c[i,j] = max{LCS(i,j-1),LCS(i-1,j)}
   return c[i,j];

Time = Θ(mn) = constant work per table entry.
Space = Θ(mn).

Dynamic-programming algorithm

image.png

LCS-LENGTH(X,Y)
m = length[X]
n = length[Y]
for i=1 to m
    do c[i,0] = 0
for j=0 to n
    fo c[0,j] = 0
for i=1 to n
    do for j=1 to n
      do if x[i] = y[j]
        then c[i,j] = c[i-1,j-1]+1
    else if c[i-1,j] >= c[i,j-1]
      then c[i,j] = c[i-1,j]
      else c[i,j] = c[i,j-1]
return c

Time = Θ(mn) .
Reconstruct LCS by tracing backwards .

  • Space = Θ (mn ) .