15. Dynamic Programming

Dynamic programming is typically applied to optimization problems. Such a problem can have many solutions; each solution has a value, and we wish to find a solution with the optimal value. Dynamic programming solves a complex problem by breaking it down into simpler subproblems. It is applicable to problems exhibiting the properties of overlapping subproblems and optimal substructure.

The development of a dynamic-programming algorithm can be broken into a sequence of four steps:
1. Characterize the structure of an optimal solution.
2. Recursively define the value of an optimal solution.
3. Compute the value of an optimal solution in a bottom-up (or top-down) fashion.
4. Construct an optimal solution from computed information.

15.1 Rod cutting

Rod-cutting problem: given a rod of length n inches and a table of prices pi for i = 1, 2, ..., n, determine the maximum revenue rn obtainable by cutting up the rod and selling the pieces.

length i | 1  2  3  4   5   6   7   8   9  10
price pi | 1  5  8  9  10  17  17  20  24  30

Consider the case n = 4: there are eight ways to cut the rod, and the best, two pieces of length 2, earns revenue 5 + 5 = 10.

[Figure: the eight ways (a)-(h) of cutting up a rod of length 4, labeled with the prices of the pieces.]

We can cut up a rod of length n in 2^(n-1) different ways, since each of the n-1 positions between inches is either cut or left intact. If an optimal solution cuts the rod into k pieces, for some 1 ≤ k ≤ n, then an optimal decomposition n = i1 + i2 + ... + ik of the rod into pieces of lengths i1, i2, ..., ik provides maximum corresponding revenue rn = pi1 + pi2 + ... + pik. Hence

    rn = max(pn, r1 + rn-1, r2 + rn-2, ..., rn-1 + r1).

The rod-cutting problem exhibits optimal substructure: optimal solutions to a problem incorporate optimal solutions to related subproblems, which we may solve independently. Viewing a solution as a first piece of length i followed by an optimal cutting of the remainder, we can rewrite the rn equation as

    rn = max over 1 ≤ i ≤ n of (pi + rn-i).

Recursive top-down implementation

CUT-ROD(p, n)
1  if n == 0
2      return 0
3  q = -∞
4  for i = 1 to n
5      q = max(q, p[i] + CUT-ROD(p, n-i))
6  return q

[Figure: the recursion tree for CUT-ROD(p, 4); each node is labeled with the size n of its subproblem, and subtrees for the same size appear repeatedly.]

Let T(n) denote the total number of calls made to CUT-ROD when called with its second parameter equal to n.
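This count can be measured directly. Below is a minimal Python sketch of CUT-ROD, instrumented to count calls; the function name `cut_rod` and the 0-indexed price list are my own conventions for this sketch, and the prices are the table above.

```python
# Naive top-down rod cutting, a direct transcription of CUT-ROD(p, n),
# instrumented to count the total number of recursive calls T(n).
# p[i] is the price of a piece of length i+1 (0-indexed list).

def cut_rod(p, n, counter):
    """Return the maximum revenue for a rod of length n; count calls."""
    counter[0] += 1
    if n == 0:
        return 0
    q = float("-inf")
    for i in range(1, n + 1):          # first piece has length i
        q = max(q, p[i - 1] + cut_rod(p, n - i, counter))
    return q

prices = [1, 5, 8, 9, 10, 17, 17, 20, 24, 30]

counter = [0]
print(cut_rod(prices, 4, counter))   # 10: two pieces of length 2
print(counter[0])                    # 16 calls, i.e. 2^4
```

Running it for several n shows the call count doubling each time, matching the analysis that follows.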
Then T(0) = 1 and

    T(n) = 1 + Σ_{j=0..n-1} T(j),

and we have T(n) = 2^n.

Dynamic-programming method

A naive recursive solution is inefficient because it solves the same subproblems repeatedly. We arrange for each subproblem to be solved only once and save its solution for later use. Dynamic programming thus uses additional memory to save computation time; it serves as an example of a time-memory trade-off. There are usually two equivalent ways to implement a dynamic-programming approach:
- Top-down with memoization
- Bottom-up method

Top-down with memoization

MEMOIZED-CUT-ROD(p, n)
1  let r[0..n] be a new array
2  for i = 0 to n
3      r[i] = -∞
4  return MEMOIZED-CUT-ROD-AUX(p, n, r)

MEMOIZED-CUT-ROD-AUX(p, n, r)
1  if r[n] ≥ 0
2      return r[n]
3  if n == 0
4      q = 0
5  else q = -∞
6      for i = 1 to n
7          q = max(q, p[i] + MEMOIZED-CUT-ROD-AUX(p, n-i, r))
8  r[n] = q
9  return q

Bottom-up method

BOTTOM-UP-CUT-ROD(p, n)
1  let r[0..n] be a new array
2  r[0] = 0
3  for j = 1 to n
4      q = -∞
5      for i = 1 to j
6          q = max(q, p[i] + r[j-i])
7      r[j] = q
8  return r[n]

Reconstructing a solution

EXTENDED-BOTTOM-UP-CUT-ROD(p, n)
1   let r[0..n] and s[1..n] be new arrays
2   r[0] = 0
3   for j = 1 to n
4       q = -∞
5       for i = 1 to j
6           if q < p[i] + r[j-i]
7               q = p[i] + r[j-i]
8               s[j] = i
9       r[j] = q
10  return r and s

For the 10-entry price table above, the procedure computes:

i    | 0  1  2  3   4   5   6   7   8   9  10
r[i] | 0  1  5  8  10  13  17  18  22  25  30
s[i] | -  1  2  3   2   2   6   1   2   3  10

PRINT-CUT-ROD-SOLUTION(p, n)
1  (r, s) = EXTENDED-BOTTOM-UP-CUT-ROD(p, n)
2  while n > 0
3      print s[n]
4      n = n - s[n]

15.2 Matrix-chain multiplication

Let A be a p × q matrix and B a q × r matrix. Then the cost of MATRIX-MULTIPLY(A, B) is p·q·r scalar multiplications.

MATRIX-MULTIPLY(A, B)
1  if A.columns ≠ B.rows
2      error "incompatible dimensions"
3  else let C be a new A.rows × B.columns matrix
4      for i = 1 to A.rows
5          for j = 1 to B.columns
6              cij = 0
7              for k = 1 to A.columns
8                  cij = cij + aik · bkj
9      return C

A product of matrices is fully parenthesized if it is either a single matrix or a product of two fully parenthesized matrix products, surrounded by parentheses. For example, we can fully parenthesize the product A1 A2 A3 A4 in five ways:

    (A1 (A2 (A3 A4)))
    (A1 ((A2 A3) A4))
    ((A1 A2) (A3 A4))
    ((A1 (A2 A3)) A4)
    (((A1 A2) A3) A4)

Matrix-chain multiplication problem

Given a chain <A1, A2, ..., An> of n matrices, where for i = 1, 2, ..., n matrix Ai has dimension pi-1 × pi, fully parenthesize the product A1 A2 ··· An in a way that minimizes the number of scalar multiplications.

To illustrate the different costs incurred by different parenthesizations of a matrix product, consider the chain <A1, A2, A3>, and assume the dimensions of A1, A2, A3 are 10 × 100, 100 × 5, and 5 × 50.

    ((A1 A2) A3) takes 10·100·5 + 10·5·50 = 5000 + 2500 = 7500 scalar multiplications.
    (A1 (A2 A3)) takes 100·5·50 + 10·100·50 = 25000 + 50000 = 75000 scalar multiplications.

Counting the number of parenthesizations

Denote the number of alternative parenthesizations of a sequence of n matrices by P(n). Then

    P(n) = 1                               if n = 1,
    P(n) = Σ_{k=1..n-1} P(k) P(n-k)        if n ≥ 2.

P(n) = C(n-1), the (n-1)st Catalan number, where (good to know, not essential)

    C(n) = (1/(n+1)) · binom(2n, n) = Ω(4^n / n^(3/2)),

so the number of parenthesizations grows exponentially and brute force is hopeless.

Four steps for the matrix-chain multiplication problem using dynamic programming

Step 1: The structure of an optimal parenthesization.
Step 2: A recursive solution.
Step 3: Computing the optimal costs.
Step 4: Constructing an optimal solution.

Step 1: An optimal parenthesization splits the product at some k, as ((A1 A2 ... Ak)(Ak+1 Ak+2 ... An)); solve the two subchains optimally and combine.

Step 2: A recursive solution. Define m[i, j] as the minimum number of scalar multiplications
needed to compute the matrix Ai..j = Ai Ai+1 ··· Aj. Then

    m[i, j] = 0                                                      if i = j,
    m[i, j] = min over i ≤ k < j of { m[i, k] + m[k+1, j] + pi-1 pk pj }   if i < j.

Goal: m[1, n].

Step 3: Computing the optimal costs

Let p = <p0, p1, ..., pn>.

MATRIX-CHAIN-ORDER(p)
1   n = p.length - 1
2   let m[1..n, 1..n] and s[1..n-1, 2..n] be new tables
3   for i = 1 to n
4       m[i, i] = 0
5   for l = 2 to n            // l is the chain length
6       for i = 1 to n - l + 1
7           j = i + l - 1
8           m[i, j] = ∞
9           for k = i to j - 1
10              q = m[i, k] + m[k+1, j] + pi-1 pk pj
11              if q < m[i, j]
12                  m[i, j] = q
13                  s[i, j] = k
14  return m and s

Complexity: O(n^3).

Example: the m and s tables computed by MATRIX-CHAIN-ORDER for n = 6 with

matrix | dimension
A1     | 30 × 35   (p0 × p1)
A2     | 35 × 15   (p1 × p2)
A3     | 15 × 5    (p2 × p3)
A4     | 5 × 10    (p3 × p4)
A5     | 10 × 20   (p4 × p5)
A6     | 20 × 25   (p5 × p6)

m[2,5] = min{ m[2,2] + m[3,5] + p1 p2 p5 = 0 + 2500 + 35·15·20 = 13000,
              m[2,3] + m[4,5] + p1 p3 p5 = 2625 + 1000 + 35·5·20 = 7125,
              m[2,4] + m[5,5] + p1 p4 p5 = 4375 + 0 + 35·10·20 = 11375 }
       = 7125

Step 4: Constructing an optimal solution

MATRIX-OPTIMAL-PARENS(s, i, j)
1  if i == j
2      print "A"i
3  else print "("
4      MATRIX-OPTIMAL-PARENS(s, i, s[i, j])
5      MATRIX-OPTIMAL-PARENS(s, s[i, j]+1, j)
6      print ")"

Example: ((A1 (A2 A3)) ((A4 A5) A6))

15.3 Elements of dynamic programming

Discovering optimal substructure follows a common pattern:
1. You show that a solution to the problem consists of making a choice. Making this choice leaves one or more subproblems to be solved.
2. You suppose that for a given problem, you are given the choice that leads to an optimal solution.
3. Given this choice, you determine which subproblems ensue and how to best characterize the resulting space of subproblems.
4. You show that the solutions to the subproblems used within the optimal solution to the problem must themselves be optimal, by using a "cut-and-paste" technique.

Optimal substructure: we say that a problem exhibits optimal substructure if an optimal solution to the problem contains within it optimal solutions to subproblems. Example: the matrix-chain multiplication problem.
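The table computation is easy to mirror in Python. Below is a sketch of MATRIX-CHAIN-ORDER and MATRIX-OPTIMAL-PARENS run on the six-matrix instance above; the function names and the 1-indexed list padding are my own conventions, chosen to follow the pseudocode closely.

```python
# MATRIX-CHAIN-ORDER and MATRIX-OPTIMAL-PARENS in Python, on the
# instance p = <30, 35, 15, 5, 10, 20, 25> from the worked example.
from math import inf

def matrix_chain_order(p):
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]   # m[i][j], 1-indexed
    s = [[0] * (n + 1) for _ in range(n + 1)]   # s[i][j]: best split k
    for l in range(2, n + 1):                   # chain length
        for i in range(1, n - l + 2):
            j = i + l - 1
            m[i][j] = inf
            for k in range(i, j):
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j] = q
                    s[i][j] = k
    return m, s

def optimal_parens(s, i, j):
    """Return the optimal parenthesization of Ai..Aj as a string."""
    if i == j:
        return f"A{i}"
    k = s[i][j]
    return f"({optimal_parens(s, i, k)}{optimal_parens(s, k + 1, j)})"

p = [30, 35, 15, 5, 10, 20, 25]
m, s = matrix_chain_order(p)
print(m[2][5])                  # 7125, as in the worked example
print(m[1][6])                  # 15125, cost of the whole chain
print(optimal_parens(s, 1, 6))  # ((A1(A2A3))((A4A5)A6))
```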
Optimal substructure varies across problem domains in two ways: how many subproblems an optimal solution to the original problem uses, and how many choices we have in determining which subproblems to use in an optimal solution.

Subtleties

One should be careful not to assume that optimal substructure applies when it does not. Consider the following two problems, in which we are given a directed graph G = (V, E) and vertices u, v ∈ V.

Unweighted shortest path: find a path from u to v consisting of the fewest edges. Good for dynamic programming: any subpath of a shortest path is itself a shortest path.

Unweighted longest simple path: find a simple path from u to v consisting of the most edges. Not good for dynamic programming: subpaths of a longest simple path need not themselves be longest simple paths, because combining them may revisit vertices.

[Figure: a four-vertex graph with vertices q, r, s, t illustrating that the longest-simple-path problem lacks optimal substructure.]

Overlapping subproblems

Example: MATRIX-CHAIN-ORDER versus the naive recursion RECURSIVE-MATRIX-CHAIN (RMC).

RMC(p, i, j)
1  if i == j
2      return 0
3  m[i, j] = ∞
4  for k = i to j - 1
5      q = RMC(p, i, k) + RMC(p, k+1, j) + pi-1 pk pj
6      if q < m[i, j]
7          m[i, j] = q
8  return m[i, j]

[Figure: the recursion tree for the computation of RMC(p, 1, 4); the same subproblems appear many times.]

Letting T(n) bound the time taken by RMC on a chain of n matrices,

    T(1) ≥ 1,
    T(n) ≥ 1 + Σ_{k=1..n-1} (T(k) + T(n-k) + 1)   for n > 1,

which simplifies to

    T(n) ≥ 2 Σ_{i=1..n-1} T(i) + n.

We can prove that T(n) = Ω(2^n) using the substitution method, guessing T(n) ≥ 2^(n-1):

    T(1) ≥ 1 = 2^0,
    T(n) ≥ 2 Σ_{i=1..n-1} 2^(i-1) + n = 2 Σ_{i=0..n-2} 2^i + n = 2(2^(n-1) - 1) + n = (2^n - 2) + n ≥ 2^(n-1).

Solutions: 1. bottom up; 2. memoization (memoize the natural but inefficient recursion).

MEMOIZED-MATRIX-CHAIN(p)
1  n = p.length - 1
2  let m[1..n, 1..n] be a new table
3  for i = 1 to n
4      for j = i to n
5          m[i, j] = ∞
6  return LOOKUP-CHAIN(m, p, 1, n)

LOOKUP-CHAIN(m, p, i, j)
1  if m[i, j] < ∞
2      return m[i, j]
3  if i == j
4      m[i, j] = 0
5  else for k = i to j - 1
6      q = LOOKUP-CHAIN(m, p, i, k) + LOOKUP-CHAIN(m, p, k+1, j) + pi-1 pk pj
7      if q < m[i, j]
8          m[i, j] = q
9  return m[i, j]

Time complexity: O(n^3).

15.4 Longest common subsequence

X = <A, B, C, B, D, A, B>
Y = <B, D, C, A, B, A>

<B, C, A> is a common subsequence of both X and Y. <B, C, B, A> and <B, C, A, B> are longest common subsequences of X and Y.
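These claims are easy to check by machine. Below is a short Python sketch of the LCS dynamic program developed in this section; the function name `lcs` is my own, and the traceback recomputes the comparisons instead of storing a separate table of arrows.

```python
# Length table c[i][j] = length of an LCS of the prefixes X[:i], Y[:j],
# filled bottom-up; then walk back through c to recover one LCS.

def lcs(X, Y):
    m, n = len(X), len(Y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j] = c[i - 1][j]
            else:
                c[i][j] = c[i][j - 1]
    # Traceback: equivalent to following the arrows of the b table.
    out, i, j = [], m, n
    while i > 0 and j > 0:
        if X[i - 1] == Y[j - 1]:
            out.append(X[i - 1])
            i, j = i - 1, j - 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))

print(lcs("ABCBDAB", "BDCABA"))   # BCBA, one LCS of length 4
```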
Longest-common-subsequence problem: we are given two sequences X = <x1, x2, ..., xm> and Y = <y1, y2, ..., yn>, and wish to find a maximum-length common subsequence of X and Y. We define the prefixes Xi = <x1, x2, ..., xi> and Yj = <y1, y2, ..., yj>.

Theorem 15.1 (Optimal substructure of an LCS). Let X = <x1, x2, ..., xm> and Y = <y1, y2, ..., yn> be sequences, and let Z = <z1, z2, ..., zk> be any LCS of X and Y.
1. If xm = yn, then zk = xm = yn and Zk-1 is an LCS of Xm-1 and Yn-1.
2. If xm ≠ yn, then zk ≠ xm implies that Z is an LCS of Xm-1 and Y.
3. If xm ≠ yn, then zk ≠ yn implies that Z is an LCS of X and Yn-1.

A recursive solution to the subproblem

Define c[i, j] as the length of an LCS of Xi and Yj. Then

    c[i, j] = 0                                if i = 0 or j = 0,
    c[i, j] = c[i-1, j-1] + 1                  if i, j > 0 and xi = yj,
    c[i, j] = max(c[i, j-1], c[i-1, j])        if i, j > 0 and xi ≠ yj.

Computing the length of an LCS

LCS-LENGTH(X, Y)
1   m = X.length
2   n = Y.length
3   let b[1..m, 1..n] and c[0..m, 0..n] be new tables
4   for i = 1 to m
5       c[i, 0] = 0
6   for j = 0 to n
7       c[0, j] = 0
8   for i = 1 to m
9       for j = 1 to n
10          if xi == yj
11              c[i, j] = c[i-1, j-1] + 1
12              b[i, j] = "↖"
13          elseif c[i-1, j] ≥ c[i, j-1]
14              c[i, j] = c[i-1, j]
15              b[i, j] = "↑"
16          else c[i, j] = c[i, j-1]
17              b[i, j] = "←"
18  return c and b

Complexity: O(mn).

PRINT-LCS(b, X, i, j)
1  if i == 0 or j == 0
2      return
3  if b[i, j] == "↖"
4      PRINT-LCS(b, X, i-1, j-1)
5      print xi
6  elseif b[i, j] == "↑"
7      PRINT-LCS(b, X, i-1, j)
8  else PRINT-LCS(b, X, i, j-1)

Complexity: O(m + n). The initial call is PRINT-LCS(b, X, m, n).

15.5 Optimal binary search trees

[Figure: two binary search trees for the same set of keys; the first has expected search cost 2.80, the second 2.75, which is optimal.]

We are given the probability pi that a search is for key ki and the probability qi that a search ends at dummy key di (an unsuccessful search), so that

    Σ_{i=1..n} pi + Σ_{i=0..n} qi = 1.

Assume that the actual cost of a search in a given binary search tree T equals the number of nodes examined. The expected cost of a search in T is

    E[search cost in T] = Σ_{i=1..n} (depth_T(ki) + 1) pi + Σ_{i=0..n} (depth_T(di) + 1) qi
                        = 1 + Σ_{i=1..n} depth_T(ki) pi + Σ_{i=0..n} depth_T(di) qi.

For a given set of probabilities, our goal is to construct a binary search tree whose expected search cost is smallest.
We call such a tree an optimal binary search tree.

Step 1: The structure of an optimal binary search tree

Consider any subtree of a binary search tree. It must contain keys in a contiguous range ki, ..., kj, for some 1 ≤ i ≤ j ≤ n. In addition, a subtree that contains keys ki, ..., kj must also have as its leaves the dummy keys di-1, ..., dj. If an optimal binary search tree T has a subtree T' containing keys ki, ..., kj, then this subtree T' must be optimal as well for the subproblem with keys ki, ..., kj and dummy keys di-1, ..., dj.

Step 2: A recursive solution

Let e[i, j] be the expected cost of searching an optimal binary search tree containing the keys ki, ..., kj, where i ≥ 1 and j ≤ n. Let w(i, j) denote the sum of the probabilities in this subproblem:

    w(i, j) = Σ_{l=i..j} pl + Σ_{l=i-1..j} ql.

Then

    e[i, j] = qi-1                                                       if j = i - 1,
    e[i, j] = min over i ≤ r ≤ j of { e[i, r-1] + e[r+1, j] + w(i, j) }  if i ≤ j.

Step 3: Computing the expected search cost of an optimal binary search tree

OPTIMAL-BST(p, q, n)
1   let e[1..n+1, 0..n], w[1..n+1, 0..n], and root[1..n, 1..n] be new tables
2   for i = 1 to n + 1
3       e[i, i-1] = qi-1
4       w[i, i-1] = qi-1
5   for l = 1 to n
6       for i = 1 to n - l + 1
7           j = i + l - 1
8           e[i, j] = ∞
9           w[i, j] = w[i, j-1] + pj + qj
10          for r = i to j
11              t = e[i, r-1] + e[r+1, j] + w[i, j]
12              if t < e[i, j]
13                  e[i, j] = t
14                  root[i, j] = r
15  return e and root

Like MATRIX-CHAIN-ORDER, the OPTIMAL-BST procedure takes Θ(n^3) time.

[Figure: the tables e[i, j], w[i, j], and root[i, j] computed by OPTIMAL-BST on the sample key distribution.]

Supplement: Assembly-line scheduling

An automobile chassis enters each assembly line, has parts added to it at a number of stations, and a finished auto exits at the end of the line. Each assembly line has n stations, numbered j = 1, 2, ..., n. We denote the jth station on line i (where i is 1 or 2) by Si,j. The jth station on line 1 (S1,j) performs the same function as the jth station on line 2 (S2,j).
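Before continuing with the assembly-line problem, the OPTIMAL-BST computation above can be checked numerically. The Python sketch below follows the pseudocode; the probability values are an assumption (a standard sample distribution consistent with the optimal expected cost 2.75 quoted earlier), since the slide's figure is not reproduced here.

```python
# A Python sketch of OPTIMAL-BST. p[1..n] are key probabilities,
# q[0..n] are dummy-key probabilities; lists are padded for 1-indexing.
from math import inf

def optimal_bst(p, q, n):
    e = [[0.0] * (n + 1) for _ in range(n + 2)]     # e[1..n+1][0..n]
    w = [[0.0] * (n + 1) for _ in range(n + 2)]
    root = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 2):
        e[i][i - 1] = q[i - 1]
        w[i][i - 1] = q[i - 1]
    for l in range(1, n + 1):                       # subproblem size
        for i in range(1, n - l + 2):
            j = i + l - 1
            e[i][j] = inf
            w[i][j] = w[i][j - 1] + p[j] + q[j]
            for r in range(i, j + 1):               # try each root kr
                t = e[i][r - 1] + e[r + 1][j] + w[i][j]
                if t < e[i][j]:
                    e[i][j] = t
                    root[i][j] = r
    return e, root

p = [0, 0.15, 0.10, 0.05, 0.10, 0.20]   # p[0] unused; assumed sample values
q = [0.05, 0.10, 0.05, 0.05, 0.05, 0.10]
e, root = optimal_bst(p, q, 5)
print(round(e[1][5], 2))   # 2.75, the optimal expected search cost
```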
The stations were built at different times and with different technologies, however, so the time required at each station varies, even between stations at the same position on the two different lines. We denote the assembly time required at station Si,j by ai,j. As the figure shows, a chassis enters station 1 of one of the assembly lines and progresses from each station to the next. There is also an entry time ei for the chassis to enter assembly line i and an exit time xi for the completed auto to exit assembly line i.

Normally, once a chassis enters an assembly line, it passes through that line only. The time to go from one station to the next within the same assembly line is negligible. Occasionally a special rush order comes in, and the customer wants the automobile to be manufactured as quickly as possible. For rush orders, the chassis still passes through the n stations in order, but the factory manager may switch the partially-completed auto from one assembly line to the other after any station. The time to transfer a chassis away from assembly line i after having gone through station Si,j is ti,j, where i = 1, 2 and j = 1, 2, ..., n-1 (since after the nth station, assembly is complete). The problem is to determine which stations to choose from line 1 and which to choose from line 2 in order to minimize the total time through the factory for one auto.

[Figure: an instance of the assembly-line problem with costs.]

Step 1: The structure of the fastest way through the factory

The fastest way through station S1,j is either the fastest way through station S1,j-1 and then directly through station S1,j, or the fastest way through station S2,j-1, a transfer from line 2 to line 1, and then through station S1,j.
Using symmetric reasoning, the fastest way through station S2,j is either the fastest way through station S2,j-1 and then directly through station S2,j, or the fastest way through station S1,j-1, a transfer from line 1 to line 2, and then through station S2,j.

Step 2: A recursive solution

    f* = min(f1[n] + x1, f2[n] + x2)

    f1[j] = e1 + a1,1                                          if j = 1,
    f1[j] = min(f1[j-1] + a1,j, f2[j-1] + t2,j-1 + a1,j)       if j ≥ 2

    f2[j] = e2 + a2,1                                          if j = 1,
    f2[j] = min(f2[j-1] + a2,j, f1[j-1] + t1,j-1 + a2,j)       if j ≥ 2

Step 3: Computing the fastest times

Let ri(j) be the number of references made to fi[j] in a recursive algorithm. Then r1(n) = r2(n) = 1 and r1(j) = r2(j) = r1(j+1) + r2(j+1), so the total number of references to all fi[1] values is Θ(2^n). We can do much better if we compute the fi[j] values in a different order from the recursive way: observe that for j ≥ 2, each value of fi[j] depends only on the values of f1[j-1] and f2[j-1].

FASTEST-WAY(a, t, e, x, n)
1   f1[1] = e1 + a1,1
2   f2[1] = e2 + a2,1
3   for j = 2 to n
4       if f1[j-1] + a1,j ≤ f2[j-1] + t2,j-1 + a1,j
5           f1[j] = f1[j-1] + a1,j
6           l1[j] = 1
7       else f1[j] = f2[j-1] + t2,j-1 + a1,j
8           l1[j] = 2
9       if f2[j-1] + a2,j ≤ f1[j-1] + t1,j-1 + a2,j
10          f2[j] = f2[j-1] + a2,j
11          l2[j] = 2
12      else f2[j] = f1[j-1] + t1,j-1 + a2,j
13          l2[j] = 1
14  if f1[n] + x1 ≤ f2[n] + x2
15      f* = f1[n] + x1
16      l* = 1
17  else f* = f2[n] + x2
18      l* = 2

Step 4: Constructing the fastest way through the factory

PRINT-STATIONS(l, n)
1  i = l*
2  print "line" i ", station" n
3  for j = n downto 2
4      i = li[j]
5      print "line" i ", station" j - 1

Example output for the sample instance:

line 1, station 6
line 2, station 5
line 2, station 4
line 1, station 3
line 2, station 2
line 1, station 1
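The whole procedure can be sketched in Python. The numeric instance below is an assumption, since the figure with the costs is not reproduced here; with these sample values the sketch reproduces the station sequence printed above, with a fastest total time of 38. Indices are 0-based, so line i of the pseudocode becomes line i+1 in the output.

```python
# FASTEST-WAY and PRINT-STATIONS in Python (0-indexed).
# a[i][j]: assembly time at station j of line i; t[i][j]: transfer time
# away from line i after station j; e, x: entry and exit times.

def fastest_way(a, t, e, x, n):
    f = [[0] * n for _ in range(2)]   # f[i][j]: fastest time through Si,j
    l = [[0] * n for _ in range(2)]   # l[i][j]: line used at station j-1
    f[0][0] = e[0] + a[0][0]
    f[1][0] = e[1] + a[1][0]
    for j in range(1, n):
        for i in range(2):
            other = 1 - i
            stay = f[i][j - 1] + a[i][j]
            switch = f[other][j - 1] + t[other][j - 1] + a[i][j]
            if stay <= switch:
                f[i][j], l[i][j] = stay, i
            else:
                f[i][j], l[i][j] = switch, other
    if f[0][n - 1] + x[0] <= f[1][n - 1] + x[1]:
        return f[0][n - 1] + x[0], 0, l
    return f[1][n - 1] + x[1], 1, l

def print_stations(lstar, l, n):
    """Return the chosen stations, last first, as in PRINT-STATIONS."""
    i = lstar
    out = [f"line {i + 1}, station {n}"]
    for j in range(n - 1, 0, -1):
        i = l[i][j]
        out.append(f"line {i + 1}, station {j}")
    return out

a = [[7, 9, 3, 4, 8, 4], [8, 5, 6, 4, 5, 7]]   # assumed sample costs
t = [[2, 3, 1, 3, 4], [2, 1, 2, 2, 1]]
e, x = [2, 4], [3, 2]
fstar, lstar, l = fastest_way(a, t, e, x, 6)
print(fstar)                                   # 38
print("\n".join(print_stations(lstar, l, 6)))
```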