Dynamic Programming


   With an Emphasis on DNA Sequence Alignment
Why Dynamic Programming?
   The recursive method of DNA sequence
    alignment is slow, especially with larger
    strands.

   DNA strands tend to be large, which
    makes the recursive method impractical.
What is Dynamic Programming?
   Dynamic programming is a method of
    “unrolling” recursion for certain
    minimization/maximization problems.

   It eliminates the redundancy of certain
    calculations by storing the results in a
    matrix.
Redundancy in Calculations?
Three Cases:
   There can be no gap.
   There can be a gap in the first strand.
   There can be a gap in the second strand.

   Because the computer is exhaustively
    checking (checking every possible
    combination) these cases, redundancy
    arises.
Redundancy in Calculations?
Examples:
 t a . . .
 g t . . .
 t - a . . .
 - t a . . .
 g - t . . .

   The ellipses after a strand stand for the
    remainder of each of the strands.
   The same remainder, and hence the same
    calculation, has to be performed three times
    in the limited example above.
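
The redundancy described above can be made concrete with a small Python sketch. It is an illustration under assumptions, not code from the slides: the names naive_cost and memo_cost are hypothetical, and the penalties (gap 7, match 0, mismatch 1) are the ones the presentation uses for its difference matrix. The first function tries all three cases exhaustively and counts its calls. The second stores each remainder's result so it is computed only once.

```python
from functools import lru_cache

GAP, MATCH, MISMATCH = 7, 0, 1  # assumed penalties: gap, match, mismatch


def naive_cost(a, b, counter):
    """Exhaustively try all three cases; counter[0] tallies the calls made."""
    counter[0] += 1
    if not a or not b:
        # One strand is used up: the rest of the other aligns against gaps.
        return GAP * (len(a) + len(b))
    compare = MATCH if a[0] == b[0] else MISMATCH
    return min(
        compare + naive_cost(a[1:], b[1:], counter),  # no gap
        GAP + naive_cost(a, b[1:], counter),          # gap in the first strand
        GAP + naive_cost(a[1:], b, counter),          # gap in the second strand
    )


def memo_cost(a, b):
    """Same recursion, but each (i, j) remainder is computed only once."""
    calls = [0]

    @lru_cache(maxsize=None)
    def cost(i, j):
        calls[0] += 1  # this line runs only on a cache miss
        if i == len(a) or j == len(b):
            return GAP * ((len(a) - i) + (len(b) - j))
        compare = MATCH if a[i] == b[j] else MISMATCH
        return min(
            compare + cost(i + 1, j + 1),  # no gap
            GAP + cost(i, j + 1),          # gap in the first strand
            GAP + cost(i + 1, j),          # gap in the second strand
        )

    return cost(0, 0), calls[0]


naive_calls = [0]
naive_result = naive_cost("ATA", "GTAC", naive_calls)
memo_result, memo_calls = memo_cost("ATA", "GTAC")
print(naive_result, naive_calls[0], memo_result, memo_calls)
```

Both functions return the same cost. The stored version evaluates at most one call per pair of remainders, while the exhaustive version revisits the same remainders many times.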
How is it Implemented?
The Matrix

         G    T    A    C

A

T

A

Difference Matrix:
   Gap Penalty:  7
   Match:        0
   Mismatch:     1
Relation Between Matrix and Strands
   The strands are set outside the matrix.
   The strands line up with the calculation area
    in the matrix (beyond the initial row and
    column).
   Movement along a row in the matrix
    introduces a gap into the strand on the left.
   Movement along a column in the matrix
    introduces a gap in the strand at the top.
   Movement along the diagonal results in a lack
    of a gap.
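
The three movement rules above can be sketched as a small Python function that turns a sequence of moves into a pair of aligned strands. This is only an illustration: the function name apply_moves and the one-letter move codes ("R" for row movement, "C" for column movement, "D" for diagonal movement) are assumptions introduced here, not notation from the slides.

```python
def apply_moves(top, left, moves):
    """Build the aligned strands produced by a path through the matrix.

    "R": move along a row      -> gap in the strand on the left
    "C": move along a column   -> gap in the strand at the top
    "D": move along a diagonal -> no gap
    """
    top_out, left_out = [], []
    i = j = 0  # i is the position in the left strand, j in the top strand
    for move in moves:
        if move == "D":
            top_out.append(top[j])
            left_out.append(left[i])
            i += 1
            j += 1
        elif move == "R":
            top_out.append(top[j])
            left_out.append("-")
            j += 1
        elif move == "C":
            top_out.append("-")
            left_out.append(left[i])
            i += 1
        else:
            raise ValueError(f"unknown move: {move!r}")
    return "".join(top_out), "".join(left_out)


print(apply_moves("GTAC", "ATA", "DDDR"))  # ('GTAC', 'ATA-')
```

Three diagonal moves followed by one row move consume all of both strands and leave a single gap at the end of the left strand.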
Strands and the Matrix: Movement
   (Figure: row movement in the matrix.)
   In the row, the location changes along the top
    strand, but it does not in the left strand. A gap
    forms in the left strand.
Strands and the Matrix: Movement (cont.)
   (Figure: column movement in the matrix.)
   In the column, the location changes in the strand
    to the left, but not in the top strand, so a gap
    forms in the top strand.
Strands and the Matrix: Movement (cont.)
   (Figure: diagonal movement in the matrix.)
   Along the diagonal, the location changes in both
    strands, so no gap is formed.

Matrix Setup
         G    T    A    C
    0    7   14   21   28
A   7
T  14
A  21

   In the difference matrix, a penalty is applied as
    the strands slide past each other in an attempt
    to achieve the best alignment.
   A gap penalty is added to each block along the
    initial row and column.
   The arrows represent the direction that a number
    came from.
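
The setup step can be sketched in Python as follows. The function name init_matrix is hypothetical, and representing the matrix as a list of lists with None for unfilled cells is an assumption made for illustration. Only the initial row and column receive values, each one gap penalty larger than the cell before it.

```python
def init_matrix(top, left, gap=7):
    """Create the matrix with only the initial row and column filled in."""
    rows, cols = len(left) + 1, len(top) + 1
    matrix = [[None] * cols for _ in range(rows)]
    for j in range(cols):
        matrix[0][j] = j * gap  # initial row: 0, 7, 14, ...
    for i in range(rows):
        matrix[i][0] = i * gap  # initial column: 0, 7, 14, ...
    return matrix


for row in init_matrix("GTAC", "ATA"):
    print(row)
```

The first printed row is [0, 7, 14, 21, 28], and the remaining rows start with 7, 14, and 21, followed by cells that are still unfilled.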
Filling the Matrix
         G    T    A    C
    0    7   14   21   28
A   7
T  14
A  21

   The value that a square holds is based on the
    values surrounding it.
         G    T    A    C
    0    7   14   21   28
A   7    1
T  14
A  21

   The diagonal contributes its own value plus the
    result of comparing the letters that mark the
    cell.
   In this case, ‘A’ and ‘G’ don’t match, so the
    mismatch score, 1, is used: 0 + 1 = 1.
         G    T    A    C
    0    7   14   21   28
A   7   14
T  14
A  21

   The row contributes its own value plus the gap
    penalty, which is 7: 7 + 7 = 14.
         G    T    A    C
    0    7   14   21   28
A   7   14
T  14
A  21

   The column contributes its own value plus the gap
    penalty, much like the row.
Putting the Three Together
         G    T    A    C
    0    7   14   21   28
A   7    1
T  14
A  21

   The result that is put into the cell is the least
    of the contributions of the diagonal, row, and
    column.
   In this case, the diagonal yielded the smallest
    value, so it was taken.
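
The choice among the three contributions can be sketched in Python for the single cell discussed above. The function name contributions and the dictionary keys are hypothetical names chosen for illustration. The neighbor values (0 on the diagonal, 7 in the row, 7 in the column) are read from the initial row and column of the matrix.

```python
GAP, MATCH, MISMATCH = 7, 0, 1  # penalties from the difference matrix


def contributions(diagonal, row, column, top_char, left_char):
    """Return what each neighboring cell would contribute to the new cell."""
    compare = MATCH if top_char == left_char else MISMATCH
    return {
        "diagonal": diagonal + compare,  # no gap: add the comparison result
        "row": row + GAP,                # neighbor in the same row, plus a gap
        "column": column + GAP,          # neighbor in the same column, plus a gap
    }


cell = contributions(diagonal=0, row=7, column=7, top_char="G", left_char="A")
best = min(cell, key=cell.get)
print(cell, best, cell[best])
```

For the cell marked by ‘A’ and ‘G’, the diagonal gives 1 while the row and column each give 14, so the cell receives 1.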
What Does it Mean?
         G    T    A    C
    0    7   14   21   28
A   7    1    8   14   21
T  14    8    1    8   15
A  21   15    8    1    8

   The result is created as the matrix is traversed.
   The lower right square contains the end result of
    the alignment algorithm.
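
Filling the whole matrix can be sketched in Python by applying the least-of-three rule to every cell, row by row. The function name fill_matrix is hypothetical, and the default penalties are the ones given for the difference matrix. This is a minimal sketch rather than a definitive implementation.

```python
def fill_matrix(top, left, gap=7, match=0, mismatch=1):
    """Fill every cell with the least of its diagonal, row, and column options."""
    rows, cols = len(left) + 1, len(top) + 1
    matrix = [[0] * cols for _ in range(rows)]
    for j in range(1, cols):
        matrix[0][j] = matrix[0][j - 1] + gap  # initial row
    for i in range(1, rows):
        matrix[i][0] = matrix[i - 1][0] + gap  # initial column
    for i in range(1, rows):
        for j in range(1, cols):
            compare = match if left[i - 1] == top[j - 1] else mismatch
            matrix[i][j] = min(
                matrix[i - 1][j - 1] + compare,  # diagonal
                matrix[i][j - 1] + gap,          # row
                matrix[i - 1][j] + gap,          # column
            )
    return matrix


filled = fill_matrix("GTAC", "ATA")
print(filled[-1][-1])  # 8
```

For GTAC along the top and ATA on the left, the lower right cell holds 8, which is the total penalty of the best alignment.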
   To reassemble the strands, we must be aware of the
    significance of picking directions in the matrix:
      Row: gap in strand along left. (red)
      Column: gap in strand along top. (orange)
      Diagonal: there is no gap. (lavender)
Walking the Matrix
   The direction of motion is determined by
    following the arrows backward across the matrix.
   This means that as the matrix is filled in, it is a
    good idea to keep track of where a value came
    from.
   The result is obtained by building the aligned
    strands backwards.
   Starting from the lower right corner, it is:
    -ATA
   Reversing the order, it is:
    ATA-
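
The walk back through the matrix can be sketched in Python as follows. The function name align, the came_from table, and the direction labels are hypothetical names introduced for illustration. When two directions tie, this sketch prefers the diagonal, which is an assumption; the slides do not say how ties are broken.

```python
def align(top, left, gap=7, match=0, mismatch=1):
    """Fill the matrix, remember where each value came from, then walk back."""
    rows, cols = len(left) + 1, len(top) + 1
    score = [[0] * cols for _ in range(rows)]
    came_from = [[None] * cols for _ in range(rows)]
    for j in range(1, cols):
        score[0][j], came_from[0][j] = score[0][j - 1] + gap, "row"
    for i in range(1, rows):
        score[i][0], came_from[i][0] = score[i - 1][0] + gap, "column"
    for i in range(1, rows):
        for j in range(1, cols):
            compare = match if left[i - 1] == top[j - 1] else mismatch
            options = [
                (score[i - 1][j - 1] + compare, "diagonal"),
                (score[i][j - 1] + gap, "row"),
                (score[i - 1][j] + gap, "column"),
            ]
            # min keeps the first of equal values, so ties go to the diagonal
            score[i][j], came_from[i][j] = min(options, key=lambda o: o[0])

    # Start at the lower right corner and build both strands backwards.
    i, j = rows - 1, cols - 1
    top_back, left_back = [], []
    while i > 0 or j > 0:
        direction = came_from[i][j]
        if direction == "diagonal":  # no gap
            top_back.append(top[j - 1])
            left_back.append(left[i - 1])
            i, j = i - 1, j - 1
        elif direction == "row":     # gap in the strand along the left
            top_back.append(top[j - 1])
            left_back.append("-")
            j -= 1
        else:                        # column: gap in the strand along the top
            top_back.append("-")
            left_back.append(left[i - 1])
            i -= 1
    # Reverse the backwards strands to get the final alignment.
    return score[-1][-1], "".join(reversed(top_back)), "".join(reversed(left_back))


print(align("GTAC", "ATA"))  # (8, 'GTAC', 'ATA-')
```

The left strand is collected as -ATA during the walk and becomes ATA- once reversed, matching the result on the slide.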
