I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis


  Pankaj K. Agarwal, Lars Arge, Ke Yi
           Duke University
          University of Aarhus
       The Union-Find Problem
•   A universe of N elements: x1, x2, …, xN
•   Initially N singleton sets: {x1}, {x2 }, …, {xN}
•   Each set has a representative
•   Maintain the partition under
    – Union(xi, xj) : Joins the sets containing xi and xj
    – Find(xi) : Returns the representative of the set
      containing xi
The Solution
[Figure: a forest of trees, one per set, with the representatives at the roots. Union(d, h) is performed with link-by-rank; Find(n) is performed with path compression.]
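The classic in-memory solution named on this slide can be sketched as follows. This is a minimal illustration of link-by-rank and path compression, not the I/O-efficient algorithm of the talk; the class name `UnionFind` and its method names are chosen here for illustration.

```python
class UnionFind:
    """In-memory union-find with link-by-rank and path compression."""

    def __init__(self, elements):
        # Initially every element is the representative of its own singleton set.
        self.parent = {x: x for x in elements}
        self.rank = {x: 0 for x in elements}

    def find(self, x):
        # First pass: walk up to the root (the representative).
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: path compression, point every visited node at the root.
        while self.parent[x] != root:
            next_node = self.parent[x]
            self.parent[x] = root
            x = next_node
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return rx  # redundant union: already in the same set
        # Link-by-rank: the root of lower rank becomes a child of the other.
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return rx
```

Each `find` follows parent pointers that may be scattered across memory, which is the access pattern the following slides identify as problematic once the structure no longer fits in main memory.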
                   Complexity
• O(N α(N)) for a sequence of N union and
  find operations [Tarjan 75]
  – α(•) : Inverse Ackermann function (grows very slowly!)
  – Optimal in the worst case [Tarjan 79, Fredman
    and Saks 89]
• Batched (Off-line) version
  – Entire sequence known in advance
  – Can be improved to linear on RAM [Gabow and
    Tarjan 85]
  – Not possible on a pointer machine [Tarjan 79]
Simple and Good, as long as …
• The entire data structure fits in memory
The I/O Model
• Main memory of size M
• Disk of infinite size
• One I/O transfers B items between memory and disk
    Sources of “Non-Locality”
• Two operands in a union
• Nodes on a leaf-to-root path
• Operands in consecutive operations
  – Cannot be removed in the on-line case

Need to eliminate all of them in order to
 get less than one I/O per operation!
                      Our Results
• An I/O-efficient algorithm for the batched union-find
  problem using O(sort(N)) = O((N/B) log_{M/B}(N/B)) I/Os
   – Same as sorting
   – Optimal in the worst case
• A practical algorithm using O(sort(N) log(N/M)) I/Os
   – Implemented
• Applications to terrain analysis
   – Topological persistence : O(sort(N)) I/Os
      • Implemented
   – Contour trees : O(sort(N)) I/Os
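To get a feel for the sorting bound, the sketch below evaluates (N/B) log_{M/B}(N/B) for concrete parameters. The function name `sort_io` and the sample values of N, M and B are assumptions made for illustration; constant factors are ignored, and the logarithm is clamped to at least 1 so that small inputs still cost one scan.

```python
import math


def sort_io(n, m, b):
    """Approximate sort(N) = (N/B) * log_{M/B}(N/B), ignoring constants.

    n: number of items, m: memory size in items, b: block size in items.
    """
    blocks = n / b
    passes = max(1.0, math.log(blocks, m / b))
    return blocks * passes


# Hypothetical parameters: 2^30 items, memory of 2^20 items, blocks of 2^10 items.
N, M, B = 2**30, 2**20, 2**10
sorting_bound = sort_io(N, M, B)   # roughly 2 passes over 2^20 blocks
one_io_per_operation = N           # what a pointer-chasing structure may pay
```

With these assumed values the sorting bound is about two million I/Os, while one I/O per operation would be about a billion, which is why getting below one I/O per operation matters.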
I/O-Efficient Batched Union-Find
• Assumption: No redundant unions
  – Each union must join two different sets
  – This assumption is removed later
• Two-stage algorithm
  – Convert to interval union-find
     • Compute an order on the elements s.t. each union
       joins two adjacent sets
  – Solve batched interval union-find
Union Tree
1: Union(d, g)
2: Union(a, c)
3: Union(r, b)
4: Union(a, e)
5: Union(e, i)
6: Union(r, a)
7: Union(a, d)
8: Union(d, h)
9: Union(b, f)
[Figure: two equivalent union trees rooted at r. Each union is an edge between elements, weighted by its time stamp.]
Transforming the Union Tree
[Figure: the union tree is transformed step by step into an equivalent tree rooted at r.]
Weights along root-to-leaf path decrease
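One in-memory way to obtain a union tree whose weights decrease along every root-to-leaf path is sketched below. It is only an illustration of the target shape: it uses an ordinary union-find internally, so it is not the I/O-efficient construction of the talk. The function name `transformed_union_tree` and the convention that the root of the second operand's set is hung under the root of the first operand's set are assumptions made here.

```python
def transformed_union_tree(unions):
    """Return {node: [(time_stamp, child), ...]} for a list of (x, y) unions.

    At time t the root of y's current set becomes a child of the root of
    x's current set via an edge of weight t, so every edge below a node is
    older (lighter) than the edge above it.
    """
    root_of = {}
    children = {}

    def find(x):
        root_of.setdefault(x, x)
        while root_of[x] != x:
            root_of[x] = root_of[root_of[x]]  # path halving
            x = root_of[x]
        return x

    for t, (x, y) in enumerate(unions, start=1):
        rx, ry = find(x), find(y)
        if rx == ry:
            raise ValueError("redundant union at time %d" % t)
        root_of[ry] = rx
        children.setdefault(rx, []).append((t, ry))
    return children


# The nine unions from the "Union Tree" slide.
SLIDE_UNIONS = [("d", "g"), ("a", "c"), ("r", "b"), ("a", "e"), ("e", "i"),
                ("r", "a"), ("a", "d"), ("d", "h"), ("b", "f")]
TREE = transformed_union_tree(SLIDE_UNIONS)
```

For the slide's sequence this yields r with children b, a, d, h, f (weights 3, 6, 7, 8, 9), a with children c, e, i (weights 2, 4, 5), and d with child g (weight 1).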
Formulating as a Batched Problem
[Figure: the original union tree and the transformed tree side by side.]
For each edge, find the lowest ancestor edge
with a higher weight
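A naive in-memory version of this query is sketched below. The reattachment rule (a node becomes a child of the lower endpoint of the edge found, or of the root if no ancestor edge is heavier) is an interpretation inferred from the slide's example rather than something the slides state, and the names `reattach`, `SLIDE_PARENT` and `SLIDE_WEIGHT` are hypothetical.

```python
def reattach(parent, weight, root):
    """For each non-root node v, find the lowest ancestor edge heavier than
    the edge (parent[v], v) and return the new parent of v.

    parent[v]: parent of v in the rooted union tree
    weight[v]: time stamp of the edge (parent[v], v)
    """
    new_parent = {}
    for v in parent:
        a = parent[v]
        # Climb while the edge above `a` is not heavier than the edge above v.
        while a != root and weight[a] <= weight[v]:
            a = parent[a]
        new_parent[v] = a
    return new_parent


# Union tree read off the slide's example, rooted at r.
SLIDE_PARENT = {"a": "r", "b": "r", "c": "a", "d": "a", "e": "a",
                "f": "b", "g": "d", "h": "d", "i": "e"}
SLIDE_WEIGHT = {"a": 6, "b": 3, "c": 2, "d": 7, "e": 4,
                "f": 9, "g": 1, "h": 8, "i": 5}
NEW_PARENT = reattach(SLIDE_PARENT, SLIDE_WEIGHT, "r")
```

Each query walks up the tree, so this costs time proportional to the depth per edge; the talk instead answers all queries at once as a batched geometric problem.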
Cast in a Geometry Setting
[Figure: the union tree and, next to it, one horizontal segment per edge obtained from an Euler tour. x: weight; y: positions in the tour.]
Euler Tour
In O(sort(N)) I/Os [Chiang et al. 95]
Cast in a Geometry Setting
[Figure: the same union tree and segments as on the previous slide.]
For each edge, find the lowest ancestor edge with a higher weight
  = For each segment, find the shortest segment above and containing it
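Distribution Sweeping
[Figure: the plane is cut into M/B vertical slabs. Segments spanning whole slabs are checked here; the parts inside a single slab are checked recursively.]
Total cost: O(sort(N))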
In-Order Traversal
[Figure: the transformed union tree rooted at r, in which weights along root-to-leaf path decrease.]

At u, with children u1, …, uk
   (in increasing order of weight)
1. Recursively visit subtree at u1
2. Return u
3. For i = 2, …, k
   Recursively visit subtree at ui

Resulting order: b r c a e i g d h f
Claim: this traversal produces the right order
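The traversal above can be sketched in a few lines. The tree representation (a dictionary mapping each node to a list of (weight, child) pairs) and the names `interval_order` and `joins_adjacent_intervals` are assumptions for illustration; the example tree is the transformed union tree from the slides. The second function checks the claim naively by replaying the unions on the resulting order.

```python
def interval_order(children, root):
    """Visit the lightest child's subtree, then the node, then the rest."""
    order = []

    def visit(u):
        kids = sorted(children.get(u, []))  # increasing order of weight
        if kids:
            visit(kids[0][1])
        order.append(u)
        for _, child in kids[1:]:
            visit(child)

    visit(root)
    return order


def joins_adjacent_intervals(order, unions):
    """Check that every union merges two adjacent intervals of `order`."""
    pos = {x: i for i, x in enumerate(order)}
    interval = [(i, i) for i in range(len(order))]  # interval of each position
    for x, y in unions:
        (a_lo, a_hi), (b_lo, b_hi) = sorted([interval[pos[x]], interval[pos[y]]])
        if a_hi + 1 != b_lo:
            return False
        for p in range(a_lo, b_hi + 1):
            interval[p] = (a_lo, b_hi)
    return True


SLIDE_TREE = {
    "r": [(3, "b"), (6, "a"), (7, "d"), (8, "h"), (9, "f")],
    "a": [(2, "c"), (4, "e"), (5, "i")],
    "d": [(1, "g")],
}
SLIDE_UNIONS = [("d", "g"), ("a", "c"), ("r", "b"), ("a", "e"), ("e", "i"),
                ("r", "a"), ("a", "d"), ("d", "h"), ("b", "f")]
ORDER = interval_order(SLIDE_TREE, "r")
```

On the slide's example the traversal returns b r c a e i g d h f, and every one of the nine unions joins two neighbouring intervals of that order.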
Solving Interval Union-Find
[Figure: unions and finds drawn in the plane.]
• Union:
  – x: two operands
  – y: time stamp
• Find:
  – x: operand
  – y: time stamp
Four instances of batched ray shooting: O(sort(N))
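The sketch below shows one naive in-memory reading of interval union-find, offered as an assumption rather than as the method of the talk. Elements are positions 0..n-1 in the computed order, each union removes the single remaining boundary between its two operands, and a find at time t scans left and right until it hits a boundary that is still standing; these scans loosely play the role of the rays. The names `removal_times` and `interval_at` are hypothetical.

```python
import math


def removal_times(n, unions):
    """unions: list of (time, i, j) over positions 0..n-1.

    Returns removed[p] = time at which the boundary between positions p and
    p + 1 disappears (math.inf if it never does).
    """
    removed = [math.inf] * (n - 1)
    for t, i, j in sorted(unions):
        lo, hi = min(i, j), max(i, j)
        still_open = [p for p in range(lo, hi) if removed[p] == math.inf]
        if len(still_open) != 1:
            raise ValueError("union at time %s does not join adjacent sets" % t)
        removed[still_open[0]] = t
    return removed


def interval_at(removed, x, t):
    """Interval (lo, hi) of positions in the set containing x, counting only
    unions with time stamp strictly less than t."""
    lo = x
    while lo > 0 and removed[lo - 1] < t:
        lo -= 1
    hi = x
    while hi < len(removed) and removed[hi] < t:
        hi += 1
    return lo, hi


# Slide example in the order b r c a e i g d h f (positions 0..9).
POS = {x: i for i, x in enumerate("brcaeigdhf")}
PAIRS = [("d", "g"), ("a", "c"), ("r", "b"), ("a", "e"), ("e", "i"),
         ("r", "a"), ("a", "d"), ("d", "h"), ("b", "f")]
UNIONS = [(t, POS[x], POS[y]) for t, (x, y) in enumerate(PAIRS, start=1)]
REMOVED = removal_times(10, UNIONS)
```

For example, a find on e issued just before union 6 sees the interval covering c, a, e, i, which are positions 2 through 5.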
  Handling Redundant Unions
• Union tree becomes a graph
• Compute the minimum spanning tree
  – O(sort(N)) I/Os (randomized) [Chiang et al. 95]
  – O(sort(N) log log B) I/Os (deterministic) [Arge et al. 04]
  – Deterministic O(sort(N)) I/Os if graph is planar
  – Only MST edges are non-redundant
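When each union is an edge weighted by its time stamp, a Kruskal-style scan in time order keeps exactly the minimum spanning forest edges, and those are the non-redundant unions. The sketch below shows this in memory only; it is not one of the external-memory MST algorithms cited above, and the function name `non_redundant_unions` is chosen for illustration.

```python
def non_redundant_unions(unions):
    """Return the 1-based time stamps of unions that join two different sets.

    Scanning the edges in time order and keeping those that connect two
    components is Kruskal's algorithm with time stamps as weights.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    kept = []
    for t, (x, y) in enumerate(unions, start=1):
        rx, ry = find(x), find(y)
        if rx != ry:
            parent[rx] = ry
            kept.append(t)
    return kept


# Union 3 is redundant: a and c are already connected through b.
KEPT = non_redundant_unions([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
```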
        A Practical Algorithm
• Previous algorithm too complicated
  – 2 Euler tours
  – 4 instances of batched ray shooting
  – MST
• A simple and practical algorithm
  – Divide-and-conquer
  – O(sort(N) log(N/M)) I/Os
  – Implemented
 Applications
1. Topological Persistence
2. Contour Trees
Topological Persistence
Formulated as Batched Union-Find
• Terrain represented as a triangulated mesh
• Consider minimum-saddle pairs
• When we reach
  – A minimum or maximum: do nothing
  – A regular point u: issue union(u, v) for a lower neighbor v
  – A saddle u: let v and w be nodes from the two
    connected pieces of u's lower link;
    issue find(v), find(w), union(u, v), union(u, w)
[Figure: a vertex of the mesh and its lower link.]
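The rules above can be illustrated with a small in-memory sweep. The sketch makes several assumptions that are not in the slides: the terrain is given as a plain graph (a `height` dictionary and a `neighbors` adjacency dictionary) rather than a triangulated mesh, heights are distinct, vertices are swept in increasing height, and the representative of a component is its lowest vertex. A vertex whose lower neighbors lie in more than one component is treated as a saddle and paired with the higher of the minima it merges.

```python
def minimum_saddle_pairs(height, neighbors):
    """Return (minimum, saddle) pairs found by a bottom-up sweep."""
    parent = {}   # union-find over the vertices swept so far
    lowest = {}   # component root -> lowest vertex of that component

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    pairs = []
    for u in sorted(height, key=height.get):
        # Components of the lower neighbors, oldest (lowest minimum) first.
        roots = sorted({find(v) for v in neighbors[u] if v in parent},
                       key=lambda r: height[lowest[r]])
        parent[u] = u
        if not roots:
            lowest[u] = u          # u is a minimum: it starts a new component
            continue
        survivor = roots[0]
        for r in roots[1:]:
            pairs.append((lowest[r], u))   # the younger minimum dies at u
            parent[r] = survivor
        parent[u] = survivor
    return pairs


# A path a - b - c - d - e with three minima (a, c, e) and two saddles (b, d).
HEIGHT = {"a": 0, "b": 3, "c": 1, "d": 5, "e": 2}
NEIGHBORS = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"],
             "d": ["c", "e"], "e": ["d"]}
PAIRS = minimum_saddle_pairs(HEIGHT, NEIGHBORS)
```

Because the sweep order is known in advance, the whole sequence of finds and unions is known before any of them is executed, which is what makes the batched formulation applicable.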
Contour Trees
            Previous Results
• Directly maintain contours
  – O(N log N) time [van Kreveld et al. 97]
  – Needs union-split-find for circular lists
  – Does not extend to higher dimensions
• Two sweeps by maintaining components,
  then merge
  – O(N log N) time [Carr et al. 03]
  – Extends to arbitrary dimensions
Join Tree and Split Tree
[Figure: a join tree and a split tree on nodes 1 to 9, shown twice; qualified nodes are marked.]
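A sweep that maintains components with union-find, in the spirit of the two-sweep approach cited on the previous slide, might look like the sketch below. It is an in-memory illustration on a plain graph with distinct heights, not the batched algorithm of the talk. The name `sweep_tree` and the `from_top` flag are assumptions; sweeping from the top is taken here to give the join tree, following the convention of Carr et al. as best understood, and sweeping from the bottom gives the other tree.

```python
def sweep_tree(height, neighbors, from_top=True):
    """Return the edges (older_node, new_node) of the tree built by one sweep."""
    parent = {}   # union-find over the vertices swept so far
    head = {}     # component root -> most recently swept vertex in it
    edges = []

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for v in sorted(height, key=height.get, reverse=from_top):
        parent[v] = v
        head[v] = v
        for u in neighbors[v]:
            if u not in parent:
                continue           # u has not been swept yet
            ru = find(u)
            if ru == v:
                continue           # u's component was already attached to v
            edges.append((head[ru], v))
            parent[ru] = v         # v stays the root, so head[v] stays v
    return edges


# Same path graph as a toy terrain: a - b - c - d - e.
HEIGHT = {"a": 0, "b": 3, "c": 1, "d": 5, "e": 2}
NEIGHBORS = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"],
             "d": ["c", "e"], "e": ["d"]}
TOP_DOWN = sweep_tree(HEIGHT, NEIGHBORS, from_top=True)
BOTTOM_UP = sweep_tree(HEIGHT, NEIGHBORS, from_top=False)
```

In the top-down tree the two maxima d and b are leaves that meet at c; in the bottom-up tree the three minima a, c and e are leaves.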
Final Contour Tree
[Figure: the join tree, the split tree, and the resulting contour tree on nodes 1 to 9.]
Hard to BATCH!
Another Characterization
Let w be the highest node that is a descendant of v in the join tree
and an ancestor of u in the split tree; then (u, w) is a contour tree edge
[Figure: the join tree, the split tree, and the contour tree on nodes 1 to 9, with the nodes u, v and w marked.]
Now can BATCH!
Experiment 1: Random Union-Find
[Figure: experimental results.]
Experiment 2: Topological Persistence on Terrain Data
[Figure: Neuse River Basin of NC.]
[Figure: experimental results.]
                     Summary
• An I/O-efficient algorithm for the batched union-find
  problem using O(sort(N)) = O((N/B) log_{M/B}(N/B)) I/Os
  – optimal in the worst case
• A practical algorithm using O(sort(N) log(N/M)) I/Os
• Applications to terrain analysis
  – Topological persistence : O(sort(N)) I/Os
  – Contour trees : O(sort(N)) I/Os
• Open Question: On-line case
  – Can we get below O(N α(N)) I/Os?
Thank you!

				