Finding the max & min of a set

A naive algorithm would find the max in n-1 comparisons, then find the min in n-1 comparisons, for a total of 2n-2 comparisons. One might suspect that an improvement is possible if the max & min are found simultaneously, since the information found by the naive algorithm in its first step isn't used in its second step. A divide & conquer approach will give an improvement, although not an asymptotic one.

For n a power of 2, consider the strategy of
  1. dividing the set in half
  2. finding the max & min of each half
  3. comparing the two max values
  4. comparing the two min values

Let T(n) be the number of comparisons needed to find the max & min of a set of size n. Then
  T(n) = 2T(n/2) + 2
  T(2) = 1
An easy induction shows that T(n) >= n for n >= 4, so the algorithm gives no asymptotic improvement over the naive version.

A related algorithm isolates one pair, compares its elements, and proceeds recursively on the remaining elements. Here two additional comparisons are necessary to merge the pair's max and min with the recursive result. The resulting recurrence is
  T(n) = 1 + T(n-2) + 2
  T(2) = 1
Again we can see that T(n) >= n for n >= 4. In fact, no asymptotic improvement should be expected in either case, since any algorithm must look at all n items. But the argument suggests that T(n) = pn + q for some p and q. If this is true, then the first recurrence becomes
  pn + q = 2(pn/2 + q) + 2
from which one may conclude that q = -2. Then T(2) = 1 implies that 2p + q = 1, so that p = 3/2. The same conclusion follows from the second recurrence. That T(n) = 3n/2 - 2 can be checked by induction for even n. If n is odd, the result remains true if the right-hand side is rounded up.

Note that T(n) = 3n/2 - 2 does give an improvement over the naive version, although not an asymptotic one. It's possible to show, using an adversary argument, that no algorithm for this problem can use fewer comparisons. It will help to talk in terms of elements "winning" and "losing" comparisons.
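The pairing strategy behind the second recurrence can be sketched in code. The sketch below (a hypothetical helper, not from the notes) seeds the running max and min with one comparison on the first pair, then handles the remaining elements two at a time with three comparisons per pair; the comparison counter lets one check that it achieves 3n/2 - 2 comparisons for even n.

```python
def max_min(a):
    """Return (max, min, comparison count) for a non-empty list a,
    using the pairing strategy: about 3n/2 comparisons in all."""
    n = len(a)
    comparisons = 0
    if n % 2 == 0:
        # Seed with the first pair: one comparison.
        if a[0] > a[1]:
            hi, lo = a[0], a[1]
        else:
            hi, lo = a[1], a[0]
        comparisons = 1
        start = 2
    else:
        # Odd n: seed with a single element, no comparison needed.
        hi = lo = a[0]
        start = 1
    # Process the remaining elements two at a time:
    # 3 comparisons per pair (pair vs. itself, winner vs. max, loser vs. min).
    for i in range(start, n, 2):
        x, y = a[i], a[i + 1]
        if x > y:
            x, y = y, x          # ensure x <= y
        comparisons += 1
        if y > hi:
            hi = y
        comparisons += 1
        if x < lo:
            lo = x
        comparisons += 1
    return hi, lo, comparisons
```

For even n this makes 1 + 3(n-2)/2 = 3n/2 - 2 comparisons, matching the solution of the recurrence; for odd n it makes 3(n-1)/2, which equals 3n/2 - 2 rounded up.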
Any comparison-based algorithm for the problem implicitly manipulates 4 sets:
  N: the elements that have neither won nor lost a comparison
  W: the elements that have won but not lost
  L: the elements that have lost but not won
  B: the elements that have both won and lost

Initially, all elements are in N. The algorithm can't terminate until 1 element is in W, 1 is in L, and n-2 are in B. Elements must move from N to W or L, and then to B, one step at a time. In all, 2(n-2) + 1 + 1 = 2n-2 steps must be made. The adversary can arrange that at most one step is made per comparison, except for comparisons between elements of N; in that case, two steps must be allowed. At most n/2 such comparisons are possible, so there are at most n/2 more steps than comparisons. And so the number of comparisons is at least 2n-2 - n/2 = 3n/2 - 2.

Closest pairs of points

One problem for which a naive approach is improved by a divide-and-conquer algorithm is that of finding the closest pair of points in a set P = {(xi, yi)} of points. A naive approach would compute all C(n,2) distances between the n points. But a divide-and-conquer approach needs only time O(n log n).

In this approach, we may assume that the points are sorted both by x-coordinate and by y-coordinate, and stored in two lists Px and Py. We may also assume that no two points agree in either coordinate (perhaps after performing a rotation). With these assumptions, we may, in linear time,
  1. split Px into a "left" half Q and a "right" half R
  2. create lists Qx and Qy containing the elements of Q sorted by x-coordinate and y-coordinate
  3. create lists Rx and Ry containing the elements of R sorted by x-coordinate and y-coordinate

Now we may process Q and R recursively. Suppose that we can combine the processed solutions into an overall solution in linear time. Then an argument like that for mergesort gives an overall O(n log n) time complexity. Note that the linear-time combining step will somehow have to compare elements of Q with those of R.
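The linear-time splitting step can be sketched as follows. This is an illustrative helper (names and signature are assumptions, not from the notes): since Px is already sorted by x, slicing it in half gives Qx and Rx directly, and one pass over Py distributes the y-sorted points to Qy and Ry while preserving their order.

```python
def split_halves(Px, Py):
    """Split a point set given as Px (sorted by x) and Py (sorted by y)
    into left/right halves, each sorted both ways, in linear time.
    Points are (x, y) tuples, assumed distinct in both coordinates."""
    mid = len(Px) // 2
    Qx, Rx = Px[:mid], Px[mid:]       # halves, still sorted by x
    left = set(Qx)                    # membership test for the left half
    # One stable pass over Py keeps each half sorted by y.
    Qy = [p for p in Py if p in left]
    Ry = [p for p in Py if p not in left]
    return Qx, Qy, Rx, Ry
```

Each list is produced by a single pass or slice, so the whole step is O(n), as the argument above requires.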
In doing so, we may assume that we have found the smallest distance δ between any two points in Q or any two points in R. The question becomes whether there are two points q ∈ Q and r ∈ R whose distance is < δ. If so, both points must be closer than δ to the vertical line dividing Q from R. So we only need to search the set S of points in P whose x-coordinate is within δ of this line.

Claim: any point s in S has O(1) neighbors in S within a distance of δ; in fact, these points lie within 15 positions of s in Sy, the list of points of S sorted by y-coordinate. From the claim it follows that searching S for sufficiently close pairs of points takes time O(n). So the algorithm's "conquer" phase takes time O(n), and overall it takes time O(n log n). The proof of the claim is easiest to understand geometrically, as in Figure 5.7 of Kleinberg & Tardos.
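The strip search that the claim justifies can be sketched directly (an illustrative helper, with assumed names; delta plays the role of δ): each point of Sy is compared only with the next 15 points in y-order, so the scan is linear in the size of the strip.

```python
from math import dist  # Euclidean distance, Python 3.8+

def strip_closest(Sy, delta):
    """Given the strip's points sorted by y-coordinate and the best
    distance delta found within each half, return the smallest distance
    achievable using a pair in the strip (or delta if none is smaller).
    By the claim, comparing each point with the next 15 suffices."""
    best = delta
    for i, s in enumerate(Sy):
        for t in Sy[i + 1 : i + 16]:   # at most 15 neighbors per point
            d = dist(s, t)
            if d < best:
                best = d
    return best
```

Since the inner loop runs O(1) times per point, the whole scan is O(|S|) = O(n), which is what the combining step needs.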