# A Library by qingyunliuliu

VIEWS: 7 PAGES: 21

• pg 1
```									 DGrid: A Library of Large-Scale
Distributed Spatial Data Structures

Pieter Hooimeijer, 2006-05-02
Motivation
• DGrid was designed to:
– Support very large sets of dynamic point data
(i.e. points that move unpredictably).

– Offer flexible trade-offs between the cost of
search operations and the cost of updates.

– Run on parallel and distributed systems.
2
Spatial Data Structures
• Definition—Any data structure that holds:
- points        - rectangles
- lines         - polygons
- curves        - etc.
• They are typically optimized for a particular
type of search operation.

• We’ll focus on point data.
3
• Commonly used example: The Quadtree.
– Works like a binary search tree.
– Each node in the tree has four children: NE,
SE, SW, NW.
– This is called ‘Recursive Decomposition’

4
• The quadtree implementation in DGrid works
like this:

A (1,3)          B (1,2)         C (2,0)
• This is a ‘bottom-up Matrix (MX) Quadtree.’
(section 3.1.2)                               5
• Let’s do a search on that quadtree:

• We ruled out the ‘entire’ NE quadrant at
the root level of the tree.
6
compared to other tree data structures:

– The shape of the tree does not depend on the
insertion order.
– No need to balance the tree (which would be
expensive).
– Insertion and deletion are cheaper for
clustered data.

7
C++ Templates
• They look like this:
template <typename ItemType>
class vector {
// ...
};

• In this case, a separate vector< > class is
generated for each item type.
// Strong type-checking using templates:
MyType * a = someVector.get(5);
MyType * a = (MyType)someVector.get(5);
8
• Turns out, C++ templates are a crude
functional programming language.
• Why ‘crude?’
{- Templates
fact < int
template 1 = 1 N>
fact n = n
struct fact { * fact (n - 1)
static const int value = N * fact <N - 1 >:: value;
};

template < >
struct fact <1 > {
static const int value = 1;
};

• This is ‘executed’ by the compiler!                 9
• This is called template metaprogramming.

• It’s used extensively in DGrid, to make it:
– easier to use;
– faster;
– type safe.

10
Distributed Data
• DGrid uses Message Passing Interface (MPI)
to run on distributed systems.
– MPI is a library of basic ‘send’ and ‘receive’
operations.

– Each processor gets a unique ID (‘rank’).

– Use if-statements to run different code on
different processes.

11
12
DGrid
• DGrid has these data structures:
– Two types of 2D arrays.
– A distributed data structure.
– A location class.

• Allows nesting of these data structures.
13
• Let’s see some examples of nested data
structures:
– A 2D array of quadtrees

– A quadtree of small 2D arrays.

– A 2D array of 2D arrays.
• Called ‘tiling.’
• A lot like a ‘shallow’ quadtree.
14
• DGrid uses the Composite Design Pattern:

location< >          DataStructure

15
• DGrid uses templates instead of a ‘Component’
interface.
• The result is that the user can do this:
using namespace dgrid::tags;
typedef dgrid::dgrid<MyItem,
partial_grid_tag<

bucket a(0, 0, 639, 639, tiles(64, 64) <<
tiles(1, 1));

• This is the ‘2D array of quadtrees’
example.
16
• Important: The definition of the data structure
is a type!
• Consequences:
– Can check parameters at compile time.
(Must be tiles( ) << tiles ( ) for this example,
or it won’t compile.)
– Compiler can optimize extensively (it knows which
functions are going to call each other).

– Can’t define a type at runtime, so ‘composition’
must be known at compile time.

17
• Data structure operations:
– insert(x, y, item) – add item at (x, y)
– delete(x, y, item) – remove item from (x, y)

– get(x, y, some_list) – get all items at (x, y)
– get_range(x0, y0, x1, y1) – get all items in the
range [ (x0, y0) : (x1, y1) ]

• Note: even the location class must support
these operations.
18
• In a nested data structure, operations are
passed on from level to level.

• Because the types are known at compile
time, these calls can be inlined.
– Pro: eliminates the overhead of the function
call.
– Con: code size increases (function body is
repeated).

19
20
Future Work
• Add more data structures, more search
operations.

• Separate interface further from
implementation.