PRAM Models
Advanced Algorithms & Data Structures, Lecture Theme 13
Prof. Dr. Th. Ottmann, Summer Semester 2006

Classification of the PRAM model
• In the PRAM model, processors communicate by reading from and writing to shared memory locations.
• The power of a PRAM depends on the kind of access allowed to the shared memory locations.

In every clock cycle:
• In the Exclusive Read Exclusive Write (EREW) PRAM, each memory location can be accessed by only one processor.
• In the Concurrent Read Exclusive Write (CREW) PRAM, multiple processors can read from the same memory location, but only one processor can write to it.
• In the Concurrent Read Concurrent Write (CRCW) PRAM, multiple processors can read from or write to the same memory location.

• It is easy to allow concurrent reading. Concurrent writing, however, gives rise to conflicts: if multiple processors write to the same memory location simultaneously, it is not clear what value ends up in that location.
• In the Common CRCW PRAM, all the writing processors must write the same value.
• In the Arbitrary CRCW PRAM, one of the writing processors succeeds arbitrarily.
• In the Priority CRCW PRAM, processors have priorities associated with them, and the highest-priority processor succeeds in writing.

• The EREW PRAM is the weakest and the Priority CRCW PRAM is the strongest of these models. The relative powers are:
EREW ≤ CREW ≤ Common CRCW ≤ Arbitrary CRCW ≤ Priority CRCW
• An algorithm designed for a weaker model can be executed within the same time and work complexities on a stronger model.

• We say model A is less powerful than model B if either:
• the time complexity for solving a problem is asymptotically smaller in model B than in model A,
or,
• the time complexities are the same but the processor (work) complexity is asymptotically smaller in model B than in model A.
• Conversely, an algorithm designed for a stronger PRAM model can be simulated on a weaker model either with asymptotically more processors (work) or with asymptotically more time.

Adding n numbers on a PRAM
• The n numbers are added pairwise in a balanced binary tree: in each round, every active processor adds two partial sums, so the number of partial sums halves each round and the total is obtained after log n rounds.
• This algorithm works on the EREW PRAM, as there are no read or write conflicts.
• We will use this algorithm to design a matrix multiplication algorithm on the EREW PRAM.

Matrix multiplication
For simplicity, we assume that n = 2^p for some integer p.
• Each c_{i,j}, 1 ≤ i, j ≤ n, can be computed in parallel.
• We allocate n processors for computing c_{i,j}; call them P_1, P_2, ..., P_n.
• In the first time step, processor P_m, 1 ≤ m ≤ n, computes the product a_{i,m} × b_{m,j}.
• We now have n numbers, and we use the addition algorithm to sum them in O(log n) time.
• Computing each c_{i,j}, 1 ≤ i, j ≤ n, thus takes n processors and O(log n) time.
• Since there are n^2 such c_{i,j}'s, we need O(n^3) processors and O(log n) time overall.
• The processor requirement can be reduced to O(n^3 / log n). Exercise!
• Hence, the work complexity is O(n^3).
• However, this algorithm requires concurrent-read capability: each element a_{i,j} (and b_{i,j}) participates in computing n elements of the matrix C, so n different processors will try to read each a_{i,j} (and b_{i,j}).
• Hence our algorithm runs on the CREW PRAM; to make it run on the EREW PRAM, we need to avoid these read conflicts.
• We will create n copies of each of the elements a_{i,j} (and b_{i,j}); then a separate copy can be used for computing each c_{i,j}.
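The two-phase scheme above (one parallel multiplication step followed by a log n-round tree addition) can be sketched as a sequential simulation. This is an illustrative sketch, not code from the lecture: each "round" is a Python loop over the work that the PRAM processors would do simultaneously, so the log n structure is explicit even though execution here is serial.

```python
def pram_sum(values):
    """Tree reduction as on the slides: in each round, the active
    'processors' pairwise add partial sums, halving their number,
    so log2(n) rounds suffice. n is assumed to be a power of two."""
    vals = list(values)
    n = len(vals)
    step = 1
    while step < n:
        # Round k: processor at position m adds the partial sum
        # that lies 'step' slots away; no two processors touch
        # the same location (EREW-safe).
        for m in range(0, n, 2 * step):
            vals[m] += vals[m + step]
        step *= 2
    return vals[0]

def pram_matmul(A, B):
    """CREW-style product: for each entry c[i][j], n processors
    first compute the n products a[i][m] * b[m][j] in one step,
    then the addition algorithm sums them in O(log n) rounds."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Step 1: processor P_m computes a[i][m] * b[m][j]
            # (conceptually all n products in one time step).
            products = [A[i][m] * B[m][j] for m in range(n)]
            # Step 2: sum the n products with the tree reduction.
            C[i][j] = pram_sum(products)
    return C
```

For example, `pram_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])` returns `[[19, 22], [43, 50]]`. The concurrent reads show up in the sketch as repeated indexing into `A` and `B` across different `(i, j)` pairs.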
Matrix multiplication
Creating n copies of a number in O(log n) time using O(n) processors on the EREW PRAM:
• In the first step, one processor reads the number and creates a copy; there are now two copies.
• In the second step, two processors read these two copies and create four copies.
• Since the number of copies doubles in every step, n copies are created in O(log n) steps.
• Though this uses n processors, the processor requirement can be reduced to O(n / log n). Exercise!
• Since there are n^2 elements in the matrix A (and in B), we need O(n^3 / log n) processors and O(log n) time to create n copies of each element.
• After this, there are no read conflicts in our algorithm. The overall matrix multiplication algorithm now takes O(log n) time and O(n^3 / log n) processors on the EREW PRAM.
• The memory requirement is, of course, much higher on the EREW PRAM.
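The doubling broadcast can be sketched the same way. The function below is an illustrative sequential simulation (the name `pram_broadcast` is not from the lecture): in round k there are 2^k copies, each read by a distinct processor and duplicated, which is exactly why no read conflict arises.

```python
def pram_broadcast(x, n):
    """EREW-style broadcast: create n copies of x by doubling.
    Each round, every existing copy is read by one distinct
    'processor' and duplicated, so ceil(log2 n) rounds suffice."""
    copies = [x]
    while len(copies) < n:
        # Each processor reads a distinct existing copy (no read
        # conflict) and writes a new one; the slice caps the total
        # at exactly n when n is not a power of two.
        copies = copies + copies[:n - len(copies)]
    return copies
```

In the full EREW algorithm, this broadcast would be run once per matrix element, after which processor P_m reads its own private copies of a_{i,m} and b_{m,j}, eliminating the concurrent reads at the cost of Θ(n^3) extra memory.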