Mapping nodes of a data structure on a parallel memory system in

W
Shared by: huanglianjiang1
Categories
Tags
-
Stats
views:
0
posted:
12/22/2012
language:
Unknown
pages:
22
Document Sample
scope of work template
							 c-Perfect Hashing Schemes for Arrays,
 with Applications to Parallel Memories
G. Cordasco1, A. Negro1, A. L. Rosenberg2 and
                 V. Scarano1


      1 Dipartimento    di Informatica ed Applicazioni ”R.M. Capocelli”
               Università di Salerno, 84081, Baronissi (SA) – Italy

   2 Dept.of   Computer Science University of Massachusetts at Amherst
                         Amherst, MA 01003, USA
Summary

   The Problem
   The Motivation and some examples
   Our results
   Conclusion




                  Workshop on Distributed Data and
                          Structures 2003            2
The problem
Mapping nodes of a data structure on a parallel memory system in such a
way that data can be efficiently accessed by templates.

   Data structures (D): array, trees, hypercubes…
   Parallel memory systems (PM):
                                                        PM

                                                               P0     M0

                                                               P1     M1
                                                                              D
                                                                      MM -2
                                                               PP-1
                                                                      MM-1




                            Workshop on Distributed Data and
                                    Structures 2003                               3
The problem(2)
Mapping nodes of a data structure on a parallel memory system in such a
way that data can be efficiently accessed by templates.

 Templates: distinct sets of nodes (for arrays: rows, columns,
diagonals, submatrix; for trees: subtrees, simple paths, levels; etc).
    Formally: Let GD be the graph that describes the data structure D. A
    template t, for D is defined such as a subgraph of GD and
    each subgraph of GD isomorphous to t will be called instance
    of t-template.

Efficiently: few(or no) conflicts (i.e. few processors need to access the
same memory module at the same time).



                           Workshop on Distributed Data and
                                   Structures 2003                         4
    How a mapping algorithm should be:

 Efficient: the number of conflicts for each instance of the considered
template type should be minimized.
 Versatile: should allow efficient data access by an algorithm that uses
different templates.
 Balancing Memory Load: it should balance the number of data items
stored in each memory module.
 Allow for Fast Memory Address Retrieval: algorithms for retrieving the
memory module where a given data item is stored should be simple and
fast.
 Use memory efficiently: for fixed size templates, it should use the
same number of memory modules as the size.



                            Workshop on Distributed Data and
                                    Structures 2003                         5
    Why the versatility?

 Multi-programmed parallel machines: different set of processors run
different programs and access different templates.
 Algorithms using different templates: in manipulating arrays, for
example,accessing lines (i.e. rows and columns) is common.
 Composite templates: some templates are “composite”, e.g. the
Range-query template consisting of a path with complete subtrees
attached.



                A versatile algorithm is likely to perform well on that...




                            Workshop on Distributed Data and
                                    Structures 2003                          6
    Previous results: Array

Research in this field originated with strategies for mapping two-
dimensional arrays into parallel memories.

 Euler latin square (1778): conflict-free (CF) access to lines (rows and
columns).
 Budnik, Kuck (IEEE TC 1971): skewing distance

     CF access to lines and some subblocks.

 Lawrie (IEEE TC 1975): skewing distance

     CF access to lines and main diagonals.

     requires 2N modules for N data items.




                            Workshop on Distributed Data and
                                    Structures 2003                         7
    Previous results: Array(2)

   Colbourn, Heinrich (JPDC 1992): latin squares
       CF access for arbitrary subarrays: for r > s r × s and s × r
      subarrays.
       Lower Bound: for any CF mapping algorithm for arrays, where
      templates are r × s and s × r subarrays ( r > s ) and lines requires
      n > rs + s memory modules.
       Corollary: more than one-seventh of the memory modules are
      idle when any mapping is used that is CF for lines, r × s and s × r
      subarrays ( r > s ).




                              Workshop on Distributed Data and
                                      Structures 2003                        8
Previous results: Array(3)

   Kim, Prasanna Kumar (IEEE TPDS 93): latin squares
       CF access for lines, and main squares (i.e.             N N
      where the top-left item ( i, j ) is such that i=0 mod N and
      j=0 mod N ).
       Perfect Latin squares: main diagonals are also CF.

       Every    N  N subarray has at most 4 conflicts (intersects at
      most 4 main squares)

                          0     1     2      3
                          2     3     0      1
                          3     2     1      0
                          1     0     3      2

                              Workshop on Distributed Data and
                                      Structures 2003                    9
Previous results: Array(4)
  Das, Sarkar (SPDP 94): quasi-groups (or groupoids).
  Fan, Gupta, Liu (IPPS 94): latin cubes.



 These source offer strategies that scale with the number of memory module
 so that the number of available modules changes with the problem instance.




                          Workshop on Distributed Data and
                                  Structures 2003                      10
Our results: Templates
 We study templates that can be viewed as generalizing array-blocks
 and “paths” originating from the origin vertex <0,0>:
     Chaotic array (C): A (two-dimensional) chaotic array C is an
    undirected graph whose vertex-set VC is a subset of N × N that is
    order-closed, in the sense that for each vertex <r, s> ≠ <0, 0> of C,
    the set
             Vc    r 1, s   , r, s 1   




                         Workshop on Distributed Data and
                                 Structures 2003                      11
Our results: Templates(2)
   Ragged array (R): A (two-dimensional) ragged array R is a
   chaotic array whose vertex-set VR satisfies the following:
       if <v1 , v2 >  VR , then {v1} × [v2]  VR;

       if <v1 , 0 >  VR , then [v1] × {0}  VR;




       Motivation
            pixel maps;

            lists of names where the table change shape dynamically;

            relational table in relational database.


       For each n  N, [n] denotes the set {0, 1, …, n -1}
                           Workshop on Distributed Data and
                                   Structures 2003               12
Our results: Templates(3)
    Rectangular array (S): A (two-dimensional) rectangular array S is
   a ragged array whose vertex-set has the form [a] × [b] for some a,
   b  N.




                       Workshop on Distributed Data and
                               Structures 2003                     13
      Our results: Templates
     Chaotic Arrays




     Ragged Arrays




     Rectangular Arrays

Arrays
   Rectangualar
       Ragged              Workshop on Distributed Data and
          Chaotic                  Structures 2003            14
Our results: c-contraction
 For any integer c > 0 , a c-contraction of an array A is a graph G that
is obtained from A as follows.
      Rename A as G(0). Set k = 0.

      Pick a set S of  c vertices of G(k) that were vertices of A.
     Replace these vertices by a single vertex vS; replace all edges of
     G(k) that are incident to vertices of S by edges that are incident to
     vS. The graph so obtained is G(k+1)
 Iterate step 2 some number of times; G is the final G(k).




            A=G(0)               G(1)                        G(2)=G




                          Workshop on Distributed Data and
                                  Structures 2003                       15
Our results: some definitions
 Our results are achieved by proving a (more general) result, of
independent interest, about the sizes of graphs that are “almost”
perfect universal for arrays.

A graph Gc= (Vc, Ec) is c-perfect-universal for the family An if for each
array A  An exists a c-contraction of A that is a labeled-subgraph of Gc.

    An denotes the family of arrays having n or fewer vertices.




                          Workshop on Distributed Data and
                                  Structures 2003                       16
Our results: c-perfection number
 The c-perfection number for An, denoted Perfc(An), is the size of the
 smallest graph that is c-perfect-universal for the family An

    Theorem (F.R.K. Chung, A.L.Rosenberg, L.Snyder 1983)

                     nn 
        Perf1(Cn) =       1
                     22 
                    1   n    2n  
        Perf1(Rn) = 
                         3   1 3  3   n 
                    2       
          Perf1(Sn) = n




                             Workshop on Distributed Data and
                                     Structures 2003                  17
Our results
   Theorem
      For all integers c and n, letting X ambiguously denote Cn and Rn,

                                1 n  n  
      we have Perfc(X)                               1 .
                                c   2   2(c  1)  
                                         .
                                                  

   Theorem
                                             n  1 .
                                                    2
      For all integers c and n, Perfc(Cn) 
                                                        c 
                                                          
   Theorem
                                            n   n  1
      For all integers c and n, Perfc(Rn)               n .
                                                       c   c  1

   Theorem
                                            n
      For all integers c and n, Perfc(Sn) =   .
                                            c

                                Workshop on Distributed Data and
                                        Structures 2003                   18
Our results:
A Lower Bound on Perfc(Cn) and Perfc(Rn)
                                                                    n  
   Perfc(Cn)  Perfc(Rn)  1 Size(U n )  1   n  . 
                                                                         1
                                    c                 c   2   2(c  1)  




                                        Un

                      n 
                  0,           
                      2(c  1) 



                                    0,0
                                                                               n
                                                                                2 ,0
                                                                                

                                     Workshop on Distributed Data and
                                             Structures 2003                             19
Our results: An Upper Bound on Perfc(Cn)

               n  1
                         2
 Perfc(Cn) 
               c 
                                                                     d=3

                                                                Vd
 d = (n+1)/c;
 Vd = { v = <v1, v2> | v1 < d and v2 < d }.

                                                                 0,0



   For each vertex v = <v1, v2> :
       if v  Vd then color(v) = v1 d + v2;

       if v  Vd then color(v) = color(<v1 mod d, v2 mod d>);




                             Workshop on Distributed Data and
                                     Structures 2003                         20
Our results: An Upper Bound on Perfc(Cn)
                         n  1
                                   2

 Perfc(Cn)  Size(Vd)=
                         c 
                              
                                                              n = 13
                                                              c=5
                                                              d = (n+1)/c = 3




                          Vd



                         0,0

                           Workshop on Distributed Data and
                                   Structures 2003                                21
Conclusions
   A mapping strategy for versatile templates with a given “conflict tolerance” c.




   c-Perfect Hashing Schemes for Binary Trees, with Applications to Parallel
    Memories (accepted for EuroPar 2003):

            2( n 1) /( c 1)  Perf c (Tn )  c (11/ c ) 2( n 1) /( c 1)

   Future Work: Matching Lower and Upper Bound for Binary Trees.




                                  Workshop on Distributed Data and
                                          Structures 2003                        22

						
Related docs
Other docs by huanglianjiang1
9999
Views: 577  |  Downloads: 0
99977_1_Assignment-title
Views: 349  |  Downloads: 0
97600
Views: 228  |  Downloads: 0
9711 QUIZ
Views: 573  |  Downloads: 0
91712
Views: 196  |  Downloads: 0
96132.100.01.1_8500 to 8600 Upgrade Manual
Views: 329  |  Downloads: 0
9425f2a1-8439-4e5a-98f7-8d6d853f0158
Views: 258  |  Downloads: 0
92612-SAC-Summary
Views: 183  |  Downloads: 0
9121 - Bid Tabulations
Views: 836  |  Downloads: 0
91006
Views: 165  |  Downloads: 0