Dynamic programming

  Some divide and conquer algorithms are awkward or inefficient to implement recursively.

  For example, computing Fibonacci numbers and binomial coefficients directly from their recursive definitions can take exponential time (the recursion amounts to adding 0's and 1's one at a time to build up exponentially large function values).
  The inefficiency in these cases arises from the existence of shared subproblems that are solved repeatedly.
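
  As a concrete illustration (a Python sketch of my own, not from the text), the direct recursion for Fibonacci numbers recomputes the same subproblems over and over:

    def fib(n):
        # Direct translation of the recursive definition: fib(n-2) is
        # recomputed in both branches, so the call tree (and hence the
        # running time) grows exponentially in n.
        return n if n < 2 else fib(n - 1) + fib(n - 2)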

  Another class of typical examples consists of problems that require constructing an optimal binary tree with a given (inorder) traversal.

 These examples include:
    optimal multiplication of chains of matrices
    optimal BSTs (both in Ch. 15, CLRS)
    parsing (in HMU, Sec. 7.4.4)

  In this class of problems, one can identify a small class of subproblems as the only subproblems that can occur.
  The cost and form of an optimal solution can then be determined quickly from the optimal cost and form of solutions to the possible subproblems.

  This property (cf. (ii) of K&T, p. 260) is called the principle of optimality or the optimal substructure property.
  In our three sample problems, the "form" of a solution means the shape of the binary tree.

  Other problems have solutions of different forms.
  For example, for the problem of finding the shortest string accepted by each of a given set of n DFAs, the solution has the form of a string.
  This problem fails to satisfy the principle of optimality.

  For problems that do satisfy this principle, the dynamic programming strategy can often give an efficient solution algorithm.

  Here the subproblems are identified in advance, solved bottom-up (i.e., smallest to largest), and the solutions stored in a table.
  Working bottom-up means that for each subproblem to be solved, solutions to its subproblems are already present in the table.

  In the Fibonacci case, the table is just a 1-dimensional array.
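
  A bottom-up version might look like this (a Python sketch; the function name is mine, and the table-filling order is the point):

    def fib_table(n):
        # Entry i of the 1-dimensional table F holds the i-th Fibonacci
        # number; each entry is computed exactly once, from entries
        # already filled in.
        F = [0] * (n + 1)
        if n >= 1:
            F[1] = 1
        for i in range(2, n + 1):
            F[i] = F[i - 1] + F[i - 2]
        return F[n]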

  In the binomial case, the table is the 2-dimensional Pascal's triangle.
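
  As a sketch (Python, my own naming), the table is filled row by row using Pascal's rule C(i,j) = C(i-1,j-1) + C(i-1,j):

    def binomial(n, k):
        # C[i][j] holds the binomial coefficient C(i, j); each row of
        # Pascal's triangle depends only on the row above it.
        C = [[0] * (k + 1) for _ in range(n + 1)]
        for i in range(n + 1):
            C[i][0] = 1
            for j in range(1, min(i, k) + 1):
                C[i][j] = C[i - 1][j - 1] + C[i - 1][j]
        return C[n][k]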
  In the binary tree examples, the table is also 2-dimensional.

  Here, entry (j,k) corresponds to the subproblem extending from position j to position k in the traversed tree.
  It's possible to retain the top-down approach of divide and conquer algorithms and construct the table on the fly, as needed.

 This strategy is called memoization.
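
  For example, memoized Fibonacci can be written in Python with a cache (a sketch, not from the text):

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def fib_memo(n):
        # Same top-down recursion as the naive version, but each value
        # is computed once and afterwards served from the cache.
        return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)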
 Table entries are typically numeric.

  For optimization problems, these entries are costs.
  A parallel table is typically used for solution structures.

  Its entries give links to the substructures needed to build the solution (or say that the solution is atomic and has no substructures).
  One dynamic programming algorithm that is useful in text editing and computational biology is described in Section 6.6 of Kleinberg & Tardos.

  It relates to the problem of finding an optimal matching of two strings, such as "CS 49C" and "CS47".
  By a matching we mean a mapping between indices of the first string and indices of the second string, such that if i maps to j and i' maps to j' and i < i', then j < j'.

  In order to allow for strings of different lengths, we allow indices to remain unmapped.
  So, for example, one matching of our two sample strings is given below.
    C S   4 9 C
    C S _ 4 7 _


 This matching has two unmapped indices and one mismatch, and is in fact optimal if we assign each unmapped index and each mismatch a penalty of 1.

 The algebraic condition on matchings is intended to prevent "crossing".
  That is, we don't want "CS 49" and "CS 94" to have a zero-cost matching that maps the 9's to each other and the 4's to each other.
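
  As a sketch (Python; the helper and its name are mine), the no-crossing condition can be checked directly:

    def is_noncrossing(pairs):
        # pairs: the (i, j) index pairs of a candidate matching.
        # Non-crossing means that i < i' implies j < j'.
        for (i1, j1) in pairs:
            for (i2, j2) in pairs:
                if i1 < i2 and j1 >= j2:
                    return False
        return True

For the crossing matching above (0-based indices: the 9's give the pair (4, 3), the 4's give (3, 4)), is_noncrossing([(4, 3), (3, 4)]) returns False.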

  Construction of a dynamic programming algorithm begins with the observation that in an optimal matching of strings of length m and n, either
    symbol m is mapped to symbol n, or
    symbol m of the first string is unmapped, or
    symbol n of the second string is unmapped.


  This observation induces a recurrence relation on OPT(i,j), where OPT(i,j) is defined as the minimum cost of an alignment between the prefix of length i of the first string and the prefix of length j of the second string.

  With the penalty function used above, the recurrence is just

    OPT(i,j) = min { δ(i,j) + OPT(i-1,j-1),
                     1 + OPT(i-1,j),
                     1 + OPT(i,j-1) },

where δ(i,j) = 0 if characters i and j of the appropriate strings agree, and 1 otherwise.

  Here the 2nd and 3rd terms in braces correspond to unmapped indices.
  The 1st corresponds to a match or to a mismatch.

  A more general penalty function can be used, and is used in Section 6.6.
  Computing OPT(i,j) requires O(1) time, since a bottom-up approach guarantees that function values are available as needed for smaller indices.
  So finding the minimum matching cost OPT(m,n) takes time O(mn).
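
  Putting the pieces together, a bottom-up computation of the whole table might look like this (a Python sketch of my own, with the natural base cases OPT(i,0) = i and OPT(0,j) = j for all-unmapped prefixes; Section 6.6 of K&T gives the general version):

    def opt_table(s, t):
        # OPT[i][j] = minimum alignment cost between the length-i
        # prefix of s and the length-j prefix of t.
        m, n = len(s), len(t)
        OPT = [[0] * (n + 1) for _ in range(m + 1)]
        for i in range(m + 1):
            OPT[i][0] = i                # all i symbols of s unmapped
        for j in range(n + 1):
            OPT[0][j] = j                # all j symbols of t unmapped
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                delta = 0 if s[i - 1] == t[j - 1] else 1
                OPT[i][j] = min(delta + OPT[i - 1][j - 1],  # match/mismatch
                                1 + OPT[i - 1][j],          # symbol i of s unmapped
                                1 + OPT[i][j - 1])          # symbol j of t unmapped
        return OPT

On the sample strings, opt_table("CS 49C", "CS47")[6][4] evaluates to 3, the cost of the matching shown earlier.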

 But what if the optimal matching itself is wanted, and not just its cost?
  Recall that here we use a parallel table.
  This table may also be computed in time O(mn).

  Entries in this table record which of the three terms inside the braces gave the minimum value.

  The actual matching could be constructed top-down from these table entries, beginning with entry (m,n).
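
  A sketch of that reconstruction (Python, my own code; instead of a separate parallel table it re-derives the winning term from the cost table, but the top-down traversal is the same):

    def traceback(s, t, OPT):
        # Walk from entry (m, n) back toward (0, 0), recording which
        # positions were mapped to each other (1-based positions).
        pairs = []
        i, j = len(s), len(t)
        while i > 0 and j > 0:
            delta = 0 if s[i - 1] == t[j - 1] else 1
            if OPT[i][j] == delta + OPT[i - 1][j - 1]:
                pairs.append((i, j))     # i mapped to j (match or mismatch)
                i, j = i - 1, j - 1
            elif OPT[i][j] == 1 + OPT[i - 1][j]:
                i -= 1                   # symbol i of the first string unmapped
            else:
                j -= 1                   # symbol j of the second string unmapped
        return list(reversed(pairs))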
  Sometimes dynamic programming won't give an efficient algorithm.

  Consider the zero-one version of the knapsack problem -- where we have to take all or none of each item, so that the amount x_i taken of item i is either 0 or 1.

  If there are n items and the capacity is W, then the recurrence below gives the solution V[n][W].

    V[j][k] = max { V[j-1][k], v_j + V[j-1][k-w_j] }   for j > 0 and k > 0
      (the second term applies only when w_j ≤ k)
    V[0][k] = 0
    V[j][0] = 0
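
  Filled in bottom-up, the recurrence becomes (a Python sketch of my own):

    def knapsack(values, weights, W):
        # V[j][k] = best total value achievable using items 1..j with
        # capacity k (items are 1-based in the recurrence, 0-based in
        # the Python lists).
        n = len(values)
        V = [[0] * (W + 1) for _ in range(n + 1)]
        for j in range(1, n + 1):
            vj, wj = values[j - 1], weights[j - 1]
            for k in range(1, W + 1):
                V[j][k] = V[j - 1][k]                 # take none of item j
                if wj <= k:
                    V[j][k] = max(V[j][k], vj + V[j - 1][k - wj])  # take item j
        return V[n][W]

The table has (n+1)(W+1) entries, each computed in O(1) time.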


  Unfortunately the time complexity Θ(nW) needn't be polynomial in n, since W can be exponentially large in the number of bits needed to write the input.

				