Analysis of Algorithms, 91(1)

UML CS                                  91.503 Midterm Exam                                           Fall, 2008


                                        MIDTERM EXAM SOLUTIONS

Stats: to be determined later (with ?? points added)
   - Minimum:
   - Maximum:
   - Average:
   - Standard Deviation:

1: (5 points) Asymptotic Growth of Functions


     (a) (1 point) List the 4 functions below in nondecreasing asymptotic order of growth:

                $(\lg 32)^n$          $n^3 \lg \lg n$          $3n^{-n}$          $(n \lg n)^2$

          1) $3n^{-n}$          2) $(n \lg n)^2$          3) $n^3 \lg \lg n$          4) $(\lg 32)^n$
              smallest                                                                 largest

         Rationale: $\lim_{n \to \infty} 3n^{-n} = 0$, so $3n^{-n}$ is the smallest. $(n \lg n)^2 = n^2 \lg^2 n = O(n^3 \lg \lg n)$
         because $\lg^2 n = O(n \lg \lg n)$. $(\lg 32)^n = 5^n$; this exponential function dominates the
         other 3 functions.

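1) $f_1(n) = \Theta(n^2)$       2) $f_2(n) = \Omega(n^3)$       3) $f_3(n) = O((n \lg n)^2)$       4) $f_4(n) = \Omega(3n^{-n})$

[Figure: the functions $(\lg 32)^n$, $n^3 \lg \lg n$, $n^3$, $(n \lg n)^2$, $n^2$, and $3n^{-n}$ arranged by order of growth, with $f_1(n)$, $f_2(n)$, $f_3(n)$, and $f_4(n)$ marked against them.]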





b) (1 point) $f_1(n) = O(f_2(n))$                          TRUE            FALSE

Proof: $f_1(n) = \Theta(n^2) \Rightarrow f_1(n) = O(n^2)$ [1] by the definition of the $\Theta$ operator. $n^2 = O(n^3)$ [2].
Applying transitivity to [1] and [2] yields $f_1(n) = O(n^3)$ [3]. Now, via transpose symmetry
from $f_2(n) = \Omega(n^3)$ we have $n^3 = O(f_2(n))$ [4]. Applying transitivity to [3] and [4] yields
$f_1(n) = O(f_2(n))$.

c) (1 point) f 3 (n)  ( f 4 (n))                         TRUE            FALSE

Counter-example: $f_3(n) = n$ and $f_4(n) = n^3$.

d) (1 point) $f_3(n) = O(n^3)$                           TRUE             FALSE

Proof: $f_3(n) = O((n \lg n)^2)$ combined transitively with $(n \lg n)^2 = O(n^3)$ yields $f_3(n) = O(n^3)$.

e) (1 point) f1 (n)  ( f 4 (n))                          TRUE            FALSE

Counter-example: $f_1(n) = n^2$ and $f_4(n) = n^3$.

2: (5 points) Recurrence


Find a tight upper bound on the closed-form solution for the following recurrence:

                                             $T(n) = 3T(n - 1) + n$

where T(n) is constant for sufficiently small n. That is, find a function $g(n)$ such that
$T(n) = O(g(n))$.

Solution: The Master Theorem does not apply here. A recursion tree can be used. The
tree has $n + 1$ levels if T(0)=1. $3^i (n - i)$ work is done at the $i$th level, except for the bottom
level, where $3^n T(0) = 3^n$ work is done (thanks to Jane for pointing out the work at the
bottom level). The total work is:

$$\sum_{i=0}^{n-1} 3^i (n - i) + 3^n = \sum_{i=0}^{n-1} 3^i n - \sum_{i=0}^{n-1} 3^i i + 3^n = n \sum_{i=0}^{n-1} 3^i - \sum_{i=0}^{n-1} 3^i i + 3^n$$

(using closed-form solutions to the summations):

$$= \frac{n(3^n - 1)}{2} - \frac{(n - 1)3^{n+1} - n 3^n + 3}{4} + 3^n = \frac{7}{4} 3^n - \frac{n}{2} - \frac{3}{4} = O(3^n).$$

Thus, $T(n) = O(3^n)$.
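
As a sanity check on the closed form, the following Python sketch (not part of the original solution) evaluates the recurrence directly, assuming T(0) = 1 as in the solution, and compares it with $\frac{7}{4} 3^n - \frac{n}{2} - \frac{3}{4}$. The function names are illustrative only.

```python
def t_recursive(n: int) -> int:
    """Evaluate T(n) = 3*T(n-1) + n bottom-up, assuming T(0) = 1."""
    value = 1
    for k in range(1, n + 1):
        value = 3 * value + k
    return value


def t_closed_form(n: int) -> int:
    """Closed form (7/4)*3^n - n/2 - 3/4, written over the common denominator 4."""
    numerator = 7 * 3**n - 2 * n - 3
    return numerator // 4  # the numerator is always a multiple of 4


# Print n, the recurrence value, and the closed-form value side by side.
for n in range(6):
    print(n, t_recursive(n), t_closed_form(n))
```

The two columns should agree for every n, and the dominant term $\frac{7}{4} 3^n$ is what gives the $O(3^n)$ bound.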



3: (5 points) Analyze Pseudocode

Mystery1 has one argument: a positive integer value n.

Mystery1(n)
      print "Mystery1 called with n = " n
      if n ≤ 1
         then return
      for i ← 1 to 3
         do Mystery2(n / 4)
      return

Mystery2(n)
      print "Mystery2 called with n = " n
      if n ≤ 1
         then return
      Mystery1(n / 4)
      return


 Derive a tight upper bound on Mystery1’s worst-case asymptotic running time as a
 function of n.

 Solution: In the worst case, let n be a power of 4.
 $T(n) = \Theta(1) + 3(\Theta(1) + T(n/16)) = 3T(n/16) + \Theta(1)$. Case 1 of the Master Theorem applies,
 yielding $T(n) = \Theta(n^{\log_{16} 3})$. So, a tight upper bound is $T(n) = O(n^{\log_{16} 3})$.
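
The bound can be checked empirically. The sketch below is an illustration, not part of the original solution: it counts every invocation made by Mystery1(n) for n a power of 16, assuming the base case is n ≤ 1 and that n / 4 is integer division. The helper name count_calls is hypothetical.

```python
def count_calls(n: int) -> int:
    """Total number of Mystery1/Mystery2 invocations triggered by Mystery1(n)."""
    calls = 0

    def mystery1(m: int) -> None:
        nonlocal calls
        calls += 1
        if m <= 1:          # assumed base case
            return
        for _ in range(3):
            mystery2(m // 4)

    def mystery2(m: int) -> None:
        nonlocal calls
        calls += 1
        if m <= 1:          # assumed base case
            return
        mystery1(m // 4)

    mystery1(n)
    return calls


# For n = 16**k, n ** log_16(3) equals 3**k; compare it with the call count.
for k in range(5):
    n = 16**k
    print(n, count_calls(n), 3**k)
```

For n = 16^k the call count works out to 3^(k+1) - 2, which is within a constant factor of 3^k = n^(log_16 3), in line with the Master Theorem result.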


4: (35 points) Design an Algorithm: Shortest Paths

This problem is from Introduction to Algorithms: A Creative Approach, by Udi Manber.

Let G = ( V, E ) be an unweighted, directed graph. Let v and w be two vertices of G.

Design an efficient algorithm that finds the number of different shortest paths (not
necessarily vertex-disjoint) between v and w.

Make sure that you provide pseudocode, correctness justification and running time analysis
for your algorithm.







a) (12 points) Pseudocode: BFS-Count(G, s) is called once with s=v.
          BFS-Count(G, s)
           1 for each vertex u ∈ V[G] − {s}
           2      do color[u] ← WHITE
           3         nShortestPaths[u] ← 0
           4         d[u] ← ∞
           5 color[s] ← GRAY
           6 d[s] ← 0
           7 nShortestPaths[s] ← 0
           8 Q ← ∅
           9 ENQUEUE(Q, s)
          10 while Q ≠ ∅
          11     do u ← DEQUEUE(Q)
          12        for each vertex v ∈ Adj[u]
          13           do if color[v] = WHITE
          14                 then color[v] ← GRAY
          15                      d[v] ← d[u] + 1
          16                      nShortestPaths[v] ← 1
          17                      ENQUEUE(Q, v)
          18                 else if d[v] = d[u] + 1
          19                      then nShortestPaths[v] ← nShortestPaths[v] + nShortestPaths[u]
          20        color[u] ← BLACK


b) (12 points) Correctness


       i) Mechanical: (4 points)

           The for loops in lines 1-4 and 12-19 terminate. In lines 1-4 the loop visits each
           vertex (except the source) once. In lines 12-19 the loop visits each element of an
           adjacency list whose length is finite. The while loop in lines 10-20 terminates
           because each vertex is ENQUEUE’d only once and is eventually DEQUEUE’d.
           Arrays color, d, and nShortestPaths stay within bounds.

      ii) “As Advertised”: (8 points)

       BFS-Count(G, s) uses a modified Breadth-First-Search starting at vertex s. (Note:
       Notationally, the vertex v inside BFS-Count should not be confused with the v in the
       high-level call BFS-Count(G, v)). It is similar to the BFS procedure on p. 532 of our
       textbook, except that the predecessor π array is not used and lines 3, 7, 16, 18, and
       19 are introduced to keep track of the number of shortest paths. Upon termination of
       BFS-Count(G, s), for each vertex x ≠ s, d[x] contains the length of the shortest path
       from s to x. That is true due to Theorem 22.5.

       We claim that, upon termination, nShortestPaths contains the number of shortest
       paths from s to x. This can be shown by induction, where the inductive hypothesis is
       that, at the end of each iteration of the while loop, nShortestPaths[v] contains the
       number of shortest paths from s to v discovered by BFS so far for each vertex v
       adjacent to u. In lines 3 and 7, each element of nShortestPaths is initialized to 0. As a
       base case, at the end of the first iteration of the while loop, each vertex v adjacent to
       s, having been WHITE, will have nShortestPaths[v]=1; this correctly represents the
       shortest path of length 1 from s to v. For the inductive step we consider what occurs
       during some iteration of the while loop. When a vertex v is first discovered, this
       means that the first shortest path in the BFS tree from s to v has been identified;
       nShortestPaths[v] is therefore set to 1 in line 16. If v has previously been encountered
       when we arrive at line 12, then we must check if we have just discovered a shortest
       path; this is done with the test in line 18. If we have discovered another shortest
       path, then we perform the assignment in line 19. Note that our inductive hypothesis
       guarantees that nShortestPaths[v] contains the number of shortest paths discovered
       so far from s to v and that nShortestPaths[u] contains the number of shortest paths
       discovered so far from s to u. Thus, the addition in line 19 yields the correctly
       updated number of shortest paths discovered so far from s to v. This completes the
       induction. Upon termination, nShortestPaths contains the number of shortest paths
       from s to each other vertex, which guarantees that we find the number of shortest
       paths from s to the original target vertex w.

c) (11 points) Analysis: Derive the tightest upper bound that you can on the
       worst-case asymptotic running time of your pseudocode.

       The worst-case asymptotic running time of BFS-Count(G, s) is in $O(|V| + |E|)$. This
       is because:
   -   the worst-case asymptotic running time of BFS(G, s) is in $O(|V| + |E|)$
   -   a constant number of operations has been removed
   -   a constant number of constant-time operations have been inserted, without creating
       new loops, function calls, or any recursion.
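
For readers who want to experiment, here is a minimal Python sketch of a BFS-based shortest-path counter in the spirit of BFS-Count. It is an illustration under stated assumptions, not a transcription: the graph is assumed to be a dict mapping each vertex to a list of out-neighbours, and the function name is hypothetical. Note one deliberate departure from the pseudocode as printed: the sketch starts with a count of 1 for s and gives a newly discovered vertex the count of the vertex that discovered it, rather than the constants 0 and 1 in lines 7 and 16. With the constant 1, a vertex first reached through a predecessor that itself has several shortest paths would be undercounted (for example, the last vertex of a diamond followed by a single edge).

```python
from collections import deque


def count_shortest_paths(adj, s, w):
    """Number of distinct shortest paths from s to w in an unweighted digraph."""
    dist = {s: 0}    # d[.] from the pseudocode; a missing key plays the role of WHITE
    count = {s: 1}   # one (empty) shortest path from s to itself
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                # First discovery: every shortest path to u extends to v.
                dist[v] = dist[u] + 1
                count[v] = count[u]
                queue.append(v)
            elif dist[v] == dist[u] + 1:
                # Another predecessor on the previous BFS level.
                count[v] += count[u]
    return count.get(w, 0)


# Diamond s -> {a, b} -> c, followed by the single edge c -> d.
graph = {"s": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["d"], "d": []}
print(count_shortest_paths(graph, "s", "c"))
print(count_shortest_paths(graph, "s", "d"))
```

Each vertex is enqueued at most once and each adjacency list is scanned once, so the sketch keeps the $O(|V| + |E|)$ running time argued above.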


5: (10 points) Amortized Analysis

This problem uses the journal paper “Fast Hierarchical Clustering and Other Applications of
Dynamic Closest Pairs” by David Eppstein. Prove that the total potential (defined on p. 6
of the paper) cannot be negative, regardless of the sequence of insertion and deletion
operations (and any resulting merge operations). That is, show that:

$$\Phi = n^2 \log n - \sum_{i=1}^{k} \Phi_i \ge 0$$

(Note: Be sure to show that the initial total potential is also non-negative.)


Solution: When all the points are initially put into $S_1$ at the beginning, then k=1. Thus,
$n |S_1| \log |S_1| = n^2 \log n$ and $\Phi = n^2 \log n - n^2 \log n = 0$. In general, we know that
$\sum_{i=1}^{k} |S_i| \le n$, so $\log |S_i| \le \log n$. This implies that

$$\sum_{i=1}^{k} |S_i| \log |S_i| \le \sum_{i=1}^{k} |S_i| \log n \le n \log n.$$

Transitively, $\sum_{i=1}^{k} |S_i| \log |S_i| \le n \log n$, which means that $n \sum_{i=1}^{k} |S_i| \log |S_i| \le n^2 \log n$. Thus,

$$\Phi = n^2 \log n - \sum_{i=1}^{k} \Phi_i = n^2 \log n - n \sum_{i=1}^{k} |S_i| \log |S_i| \ge n^2 \log n - n^2 \log n = 0.$$

Therefore the total potential is non-negative.
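
The inequality can also be spot-checked numerically. The Python sketch below is an illustration only: it assumes the subsets partition the n points, takes logarithms base 2, and uses $\Phi_i = n |S_i| \log |S_i|$ as in the derivation above. The helper names are hypothetical.

```python
import math
import random


def total_potential(sizes):
    """Phi = n^2 log n - sum_i n |S_i| log |S_i| for subset sizes |S_1|, ..., |S_k|."""
    n = sum(sizes)
    subset_potentials = [n * s * math.log2(s) for s in sizes]
    return n * n * math.log2(n) - sum(subset_potentials)


def random_partition(n, rng):
    """Split n points into randomly chosen positive subset sizes."""
    sizes = []
    remaining = n
    while remaining > 0:
        s = rng.randint(1, remaining)
        sizes.append(s)
        remaining -= s
    return sizes


rng = random.Random(0)
print(total_potential([8]))  # initial state: a single subset S_1 holding every point
for _ in range(5):
    sizes = random_partition(20, rng)
    print(sizes, total_potential(sizes) >= 0)
```

The single-subset case reproduces the initial potential of 0, and every other partition yields a strictly positive value.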

6: (10 points) Flows


This problem uses the flow network $G' = (V', E')$ described in Section 26.3 on p. 665-
666 of our textbook for finding a maximum bipartite matching in an undirected bipartite
graph $G = (V, E)$. Here we are given an integer x  1 and we modify $G'$ to form the
flow network $G''$. $G''$ is identical to $G'$ except that each edge in $G''$ has capacity
x instead of capacity 1.
Does this change the results established in Section 26.3? Discuss if, how and why.


Solution: Even though the edge capacities are all equal integers, some of the results of
Section 26.3 change or need additional explanation. In particular, the proof of Lemma
26.10 (converse direction) needs additional justification. In that proof the fact that the flow
was integer-valued and the capacity of each edge was 1 was used to argue that, for each
vertex $u \in L$, one unit of positive flow could enter on at most one edge and leave on at
most one edge. This was critical to proving that the set of edges being considered was a
matching. Section 26.3 relies on the Ford-Fulkerson method, which, in its most general
form (FORD-FULKERSON-METHOD on p. 651) just keeps finding an augmenting path and
increasing flow accordingly. This method allows the following type of situation to occur, as
illustrated in Justin’s diagram below. Here a maximum flow (although having the integrality
property) splits flow coming out of a vertex (vertex a):


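[Figure: Justin's diagram of a flow network on the vertices s, a, b, c, x, y, and t, with every edge labeled flow/capacity and capacity 2 throughout. The labels shown are 0/2, 1/2, and 2/2; the flow leaving vertex a is split across two edges carrying 1/2 each.]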

As a result, Corollary 26.12, which showed that the cardinality of a maximum matching M in G
equals the value of a maximum flow f in G', is no longer substantiated.


However, if we consider the more specialized FORD-FULKERSON procedure on p. 658,
then we can say (this is Jane’s point of view) that because FORD-FULKERSON forces flow
along an augmenting path to equal the minimum capacity on the path, and that capacity will
be x, the flow will not be split across edges emanating from a vertex of L and Lemma 26.10
will still hold. So, depending on your interpretation of “Ford Fulkerson method” one could
successfully argue either for or against the validity of this part of the converse direction of
Lemma 26.10.


Regardless of one’s interpretation, Jane correctly points out that the wording of both
Lemma 26.10 and Corollary 26.12 needs to change slightly because the value of the
maximum flow is now $|f| = x|M|$ rather than just $|f| = |M|$. The cardinality of the
matching is therefore $|M| = |f| / x$.


Note that because x is an integer the integrality theorem (Theorem 26.11) still holds.
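
The scaling relationship between the flow value and the matching size can be observed on a small example. The Python sketch below is an illustration under several assumptions: the bipartite graph is a made-up instance (not Justin's diagram), the vertex names "source" and "sink" are hypothetical, and the max-flow routine is a plain Edmonds-Karp implementation (Ford-Fulkerson with breadth-first augmenting paths), which is one choice among many and not necessarily the textbook procedure referred to above.

```python
from collections import defaultdict, deque


def max_flow_value(capacity, source, sink):
    """Edmonds-Karp: repeatedly augment along a shortest residual path."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, edges in capacity.items():
        for v, c in edges.items():
            residual[u][v] += c
            residual[v][u] += 0  # make sure the reverse residual edge exists

    total = 0
    while True:
        # Breadth-first search for an augmenting path in the residual network.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Find the bottleneck residual capacity on the path, then augment by it.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


def matching_network(left, right, edges, x):
    """Build G'': source -> L -> R -> sink, with capacity x on every edge."""
    capacity = {"source": {u: x for u in left}}
    for u in left:
        capacity[u] = {}
    for u, v in edges:
        capacity[u][v] = x
    for v in right:
        capacity[v] = {"sink": x}
    return capacity


# Hypothetical instance whose maximum matching has 2 edges, e.g. (a, p) and (b, q).
left, right = ["a", "b", "c"], ["p", "q"]
edges = [("a", "p"), ("a", "q"), ("b", "q"), ("c", "p")]
for x in (1, 2, 3):
    network = matching_network(left, right, edges, x)
    print(x, max_flow_value(network, "source", "sink"))
```

On this instance the reported flow value grows as 2x while the maximum matching stays at 2 edges, consistent with $|M| = |f| / x$.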


7: (30 points) Dynamic Programming


This problem is adapted from the book Research and Education Association Problem
Solvers: Operations Research.


The task is to plan a production schedule for expensive wireless sensors over a 4-month
period of time from November, 2008 through February, 2009. The goal is to meet demand
while minimizing cost. The company has demand forecasts for each of the 4 months, given
below:



Month          Month Index    Demand (in thousands)
November       1              4
December       2              1
January        3              3
February       4              2


A schedule is represented by (x1, x2, x3, x4), where xj represents the number of wireless
sensors (in thousands) produced during month j. A schedule is feasible if it meets demand.
That is, the following constraints must be satisfied:
                                       $x_1 \ge 4$
                                       $x_1 + x_2 \ge 5$
                                       $x_1 + x_2 + x_3 \ge 8$
                                       $x_1 + x_2 + x_3 + x_4 = 10$


Note that the last constraint is an equality to avoid over-producing.


The costs are:
-   $7,000 per sensor that is produced;
-   $40,000 for each month that has a production run (set-up cost);
-   $10,000 per month for each sensor that is produced during one month but shipped
    during a later month (carry-over or storage cost).


For example, for the schedule (5, 0, 3, 2) the total cost (in thousands of dollars) would be
5 × 7 + 0 × 7 + 3 × 7 + 2 × 7 = 70 for the units produced,
plus 3 × 40 = 120 for the 3 setups (November, January and February),
plus 1 × 10 = 10 for the sensors produced in November and shipped one month later, in December,
for a total cost of $200,000.


Formulate a minimal cost expression recursively. Justify your answer by demonstrating
optimal substructure. You do not need to solve your expression to obtain an optimal
schedule for this particular problem instance.


Solution: Pseudocode was not requested, so we just set up the recursive cost formulation
and demonstrate optimal substructure. All units here are expressed in thousands. Let $z_j$ be
the number of units on hand at the start of the $j$th month.
Then $z_1 = 0$, $z_2 = x_1 - 4$, $z_3 = z_2 + x_2 - 1$, $z_4 = z_3 + x_3 - 3$, and $z_4 + x_4 - 2 = 0$.
(Note that we use the last equation to avoid overproducing at the end.) Let $d_j$ be the
demand for month j. Let $P_j(z_j)$ be the cost, taking into account production decisions for
months j,…,4.

The recursive cost formulation is:

$$P_j(z_j) = \min\{\, 7x_j + 10(z_j + x_j - d_j) + 40\,\delta(x_j) + P_{j+1}(z_j + x_j - d_j) \,\}$$

where

$$\delta(x_j) = \begin{cases} 1 & \text{if } x_j > 0 \\ 0 & \text{otherwise} \end{cases}$$

and the following constraints are imposed for each j in order to
satisfy demand and not over-produce:

$$z_j + x_j \ge d_j \quad \text{and} \quad z_j + x_j \le d_j + d_{j+1} + \cdots + d_4.$$

(We assume that $P_5(0) = 0$.) The book does not prove optimal substructure, but we do it
here. To justify the cost formulation, we examine each part of it. Since the goal is to
minimize cost, we minimize the overall cost expression. (Note that, for $P_j(z_j)$, we must
choose a combination of values for $x_j$ and $z_j$ that minimizes the cost expression.) The $7x_j$
part is the “per-sensor” cost, and $40\,\delta(x_j)$ is the production set-up cost.
$10(z_j + x_j - d_j)$ represents the storage charge. Finally, we discuss $P_{j+1}(z_j + x_j - d_j)$, which
we claim exhibits optimal substructure. Assuming that we abide by the constraints, we can
apply a cut-and-paste proof by contradiction to establish optimality for $P_{j+1}(z_j + x_j - d_j)$. Let
$z'_j, x'_j$ be values that minimize the $P_j(z_j)$ expression, so that $P_j(z'_j)$ is optimal. By way of
contradiction, suppose that there was a better way to make decisions for months (j+1)…4
for $z'_j + x'_j - d_j$; call this $P'_{j+1}(z'_j + x'_j - d_j)$. This would yield

$$7x'_j + 10(z'_j + x'_j - d_j) + 40\,\delta(x'_j) + P'_{j+1}(z'_j + x'_j - d_j) < 7x'_j + 10(z'_j + x'_j - d_j) + 40\,\delta(x'_j) + P_{j+1}(z'_j + x'_j - d_j).$$

Since we are minimizing, this would produce a cost < $P_j(z'_j)$, contradicting the optimality
of $P_j(z'_j)$.

Note that if Pj(zj) relied on past months rather than future months, then it would not be clear
how much extra to produce in the base case for the first month.
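
Although solving the instance was not required, the recursion is small enough to evaluate directly. The Python sketch below is an illustration, not part of the original solution: it memoizes $P_j(z_j)$ exactly as formulated above, assuming production quantities are whole numbers of thousands. The names schedule_cost and min_cost are hypothetical.

```python
from functools import lru_cache

DEMAND = (4, 1, 3, 2)                             # d_1..d_4, in thousands of sensors
UNIT_COST, SETUP_COST, STORAGE_COST = 7, 40, 10   # in thousands of dollars


def schedule_cost(schedule):
    """Cost of a feasible schedule (x_1, ..., x_4), in thousands of dollars."""
    total, on_hand = 0, 0
    for x, d in zip(schedule, DEMAND):
        on_hand += x - d
        assert on_hand >= 0, "schedule does not meet demand"
        total += UNIT_COST * x + (SETUP_COST if x > 0 else 0) + STORAGE_COST * on_hand
    return total


@lru_cache(maxsize=None)
def min_cost(j=1, z=0):
    """P_j(z): cheapest way to cover months j..4 starting with z units on hand."""
    if j > len(DEMAND):
        return 0 if z == 0 else float("inf")   # P_5(0) = 0; leftovers are not allowed
    d = DEMAND[j - 1]
    remaining = sum(DEMAND[j - 1:])            # d_j + d_{j+1} + ... + d_4
    best = float("inf")
    # Enforce z + x >= d_j and z + x <= d_j + ... + d_4 (integer x assumed).
    for x in range(max(0, d - z), remaining - z + 1):
        carried = z + x - d
        cost = (UNIT_COST * x + STORAGE_COST * carried
                + (SETUP_COST if x > 0 else 0) + min_cost(j + 1, carried))
        best = min(best, cost)
    return best


print(schedule_cost((5, 0, 3, 2)))  # the worked example: 200
print(min_cost())                   # value of P_1(0) for this instance
```

Under these assumptions the sketch reports a minimum of 180 (thousand dollars), attained for example by the schedule (5, 0, 5, 0), which saves a set-up relative to the worked example at the price of extra storage.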



