Parallel Sorting On Linear Cellular Automata


									Ubiquitous Computing and Communication Journal


                             Moein Shakeri, Homa Foroughi, Hossein Deldari
                  Department of Computer Engineering, Ferdowsi University of Mashhad, Iran

                Sorting is one of the fundamental problems in computer science and is used
                widely in various domains, so many serial and parallel approaches have been
                proposed. One family of parallel sorting methods is based on the
                computational model of cellular automata. A cellular automata machine is a
                structure of interconnected elementary automata evolving in a parallel and
                synchronous way. The best-known sorting algorithm for a one-dimensional cellular
                automata machine is Gordillo and Luna's, which sorts n numbers in 2n-3
                steps. In this paper three new sorting algorithms are proposed. The first two
                proposed algorithms use a smaller neighborhood radius while keeping the number
                of sorting steps unchanged, and the third algorithm reduces the number of
                sorting steps while using the same neighborhood radius as Gordillo-Luna's
                second algorithm.

                Keywords: parallel sorting, linear cellular automata, cellular computation.

 1   INTRODUCTION

     For many years scientists have investigated natural phenomena and tried to
 classify them on the basis of human knowledge and mathematical rules. Since
 most natural events occur concurrently (e.g. the flow of water in rivers or
 the motion of steam molecules), it is essential to parallelize the
 computations performed to classify these phenomena and to compose rules for
 them. To this end, different methods and systems such as supercomputers,
 computer networks and distributed systems have been proposed. One approach
 that can be implemented on these systems is cellular automata [2].
     In this paper, we present a solution to a classical problem from the
 cellular automata programming perspective. The problem we tackle is sorting
 n elements in a linear cellular array, where each cell has a communication
 channel with its neighbors. Sorting is one of the most important problems
 and plays a significant role in solving problems in domains such as image
 processing, graph theory, computational geometry and scheduling. To validate
 and illustrate the computations on the cellular automata (CA) machine, we
 present the analysis and implementation of parallel sort algorithms, and
 then compare the results with previous models based on CA.
     Generally, sorting algorithms can be categorized as comparison-based
 algorithms and noncomparison-based algorithms.
     A comparison-based algorithm sorts an unordered sequence of elements by
 repeatedly comparing pairs of elements and, if they are out of order,
 exchanging them. This fundamental operation of comparison-based sorting is
 called compare-exchange. The lower bound on the sequential complexity of any
 comparison-based sorting algorithm is Ω(n log n), where n is the number of
 elements to be sorted. Noncomparison-based algorithms, in contrast, sort by
 using certain known properties of the elements (such as their binary
 representation or their distribution). The lower bound on the complexity of
 these algorithms is Ω(n). This classification applies to both serial and
 parallel algorithms.
     Parallel algorithms can be classified by the type of problem solving,
 by their usage, and by their dependency on the system architecture.
 Accordingly, parallel algorithms are divided into two major classes:
     1. Machine architecture based algorithms (implementation platform
        dependent)
     2. Mathematical model based algorithms
     Table 1 shows the classification of sorting algorithms
 [5,6,7,8,9,10,12]. In this classification, Sorting Networks, Bubble Sort and
 Quick Sort are regarded as implementation platform dependent,
 comparison-based algorithms. Sample Sort is classified as a non-comparison
 based, implementation platform dependent algorithm.

Volume 3 Number 3                                Page 112                                

      Table 1: Classification of sorting algorithms

   +------------------------+-----------------------------------------------+
   | Mathematical model     | Implementation platform dependent algorithms  |
   | based algorithms       |                                               |
   +------------------------+----------------------+------------------------+
   | Comparison based       | Non-comparison based | Comparison based       |
   | sorting algorithms     | sorting algorithms   | sorting algorithms     |
   +------------------------+----------------------+------------------------+
   | Sorting algorithms     | Sample Sort          | Sorting Networks       |
   | based on Systolic      |                      | (bitonic sort)         |
   | arrays                 |                      |                        |
   |                        |                      | Bubble Sort            |
   | Sorting algorithms     |                      | (Transposition),       |
   | based on Cellular      |                      | (Shell sort)           |
   | Automata               |                      |                        |
   |                        |                      | Quicksort              |
   +------------------------+----------------------+------------------------+

     Mathematical model-based algorithms, such as algorithms based on
 systolic arrays [11] and cellular automata, are independent of the
 implementation platform. In algorithms based on 1-D (linear) cellular
 automata, sorting is done according to the cellular neighborhood and the
 local rules defined on it. The best-known such algorithm is Gordillo-Luna's.
     In current parallel sorting approaches on cellular automata, sorting is
 done by a two-state cycle (q0, q1) automaton, which has two types of
 transition (from q0 to q1 and vice versa). One of the algorithms that uses
 this approach is Gordillo-Luna's algorithm [3]. It is one of the few
 algorithms that use 1-D (linear) cellular automata for sorting, and also the
 best known.
     Gordillo and Luna proposed two algorithms. In both, each cell needs
 three read-write memories: one for the numeric value and two for the
 controllers that govern replacement of the value to the left and to the
 right. Both algorithms have two steps: in the first step the left and right
 controllers are set, and in the second step the replacement decision is made
 according to the values of the controllers. The numbers of neighbor cells
 examined in Gordillo-Luna's first and second algorithms are three and five
 respectively.
     In this paper, three parallel sorting algorithms based on linear
 cellular automata are proposed, with the following distinctive features:
     The first two algorithms use a smaller neighborhood than Gordillo-Luna's
 algorithms. In the first proposed algorithm, two neighbors (one at left and
 one at right) are used for each cell, and in the second proposed algorithm,
 three neighbors (two at right and one at left) are considered. Reducing the
 neighborhood radius does not change the number of sorting steps, which
 remains the same as Gordillo-Luna's.
     The third proposed algorithm uses the same neighborhood as
 Gordillo-Luna's second algorithm [3], but its sorting time is reduced in
 comparison with that algorithm.
     In the first proposed algorithm, no extra memory is used for the
 cellular automata, in contrast with Gordillo-Luna's first algorithm. In the
 second and third proposed algorithms, only one extra memory cell is used,
 which is less than the memory used in Gordillo-Luna's.
     In the first proposed algorithm, examining the values and replacing them
 is done in just one step, unlike Gordillo-Luna's. This reduces computation
 and increases speed.
     The remainder of the paper is organized as follows: Section 2 briefly
 introduces cellular automata and their properties, Section 3 explains
 Gordillo-Luna's algorithms, Section 4 describes our proposed algorithms, and
 Section 5 draws some conclusions.

 2   CELLULAR AUTOMATA

     For many years the computational model of cellular automata has been
 used to study different natural phenomena, including communication,
 computation, reproduction, competition, growth and so on. CA is also a
 suitable tool for modeling physical phenomena by reducing them to basic
 phenomena governed by simple fundamental rules [2, 4].
     A cellular automaton is a mathematical model that is discrete in both
 space and time: time advances in constant intervals, and space is
 represented as a network of cells in one or more dimensions. The dimension
 of the network is the dimension of the cellular automaton. Each cell has
 time-variant properties; the values of a cell's variables in a given
 interval describe the status of that cell, and the overall system state is
 given by the status of all cells in that interval [1].

 2.1 The Cellular Automata Features
     The solution of a computational problem demands a data structure to
 define the input and output, and a procedure to transform the first into the
 second. Despite this, traditional CA machines do not contain memory space in
 their definition. The status and results of the process are coded in the
 states through which the automata transit. The general structure of such a
 cellular automaton is as follows:
     Definition 1: A cellular automaton is a 4-tuple CA = (Q, d, V, δ), where:
     Q is the nonempty set of states that a cell can assume. A distinguished
 state q0 ∈ Q has the property that a cell in state q0 remains unchanged if
 the directly connected cells are also in state q0. The state q0 is called
 the quiescent state.
     The dimension d of the cellular space, building a bounded subspace on
 Z^d (Z denoting the


  integers). Each cell x in the array is an element of the subspace
  ($x \in Z^d$). Note that d = 1 corresponds to the case of a linear array.
       The neighborhood array $(V_0, V_1, \ldots, V_k)$, with $V_j \in Z$ and
  $j \in [0, k]$. For an automaton location $x \in Z^d$,
  $V(x) = \{x + V_0, x + V_1, \ldots, x + V_k\}$ describes the finite list of
  k + 1 neighbors directly associated with it. In the case of a linear array,
  $x = i \in Z$, and the neighborhood array (0, -1, +1) indexes the central,
  the left and the right cells.
       The transition function (or relation) δ defines the dynamical local
  state transformations to be applied to the cells. In standard automata
  without memory registers the deterministic transitions are of the form
  $\delta: Q^{k+1} \to Q$.

  3    GORDILLO-LUNA'S SORTING

     Sorting in these algorithms is performed by cellular and Mealy automata
  in which each cell has three memories (one for the numeric value and two
  for the left and right controllers). These algorithms calculate the next
  state of each cell in two steps.
     In the first step, the transition from q0 to q1 is applied and the value
  of the local key register is compared to the values of the key registers of
  both neighbor cells. As a result, the values of the local blocking rule
  registers are computed. In the second step, where the transition from q1 to
  q0 is applied, the local blocking rule values are compared to the
  corresponding ones of the neighbors to determine, if necessary, with which
  of them the swap will be done.
     To simplify the description, Gordillo and Luna take $X_i$, $S_i^l$,
  $S_i^r$ as the local memory registers of cell i coding the key, the left
  blocking rule and the right blocking rule respectively.
       Similarly, the subindices i - 1 and i + 1 address the preceding (left)
  and the subsequent (right) cells, which are directly connected to cell i.
  The computation of the blocking rules in the first step (the transition
  from q0 to q1) is defined as follows:

  \[
  S_i^r = \begin{cases} 1 & \text{if } X_i > X_{i+1} \\ 0 & \text{otherwise} \end{cases} \tag{1}
  \]

  \[
  S_i^l = \begin{cases} 1 & \text{if } X_i < X_{i-1} \text{ and } X_i \le X_{i+1} \\ 0 & \text{otherwise} \end{cases} \tag{2}
  \]

       Note that the left blocking rule is the more constraining of the two.
       When the blocking rules have been computed, the swapping rules are
  locally decided in the second automata step (the transition from q1 to q0),
  by the following evaluation:

  \[
  X_i = \begin{cases}
  X_{i+1} & \text{if } S_i^r = 1 \text{ and } S_{i+1}^l = 1 \\
  X_{i-1} & \text{if } S_i^l = 1 \text{ and } S_{i-1}^r = 1 \\
  X_i & \text{otherwise}
  \end{cases} \tag{3}
  \]

       To terminate the operations, a general mechanism is used in each cycle.
       With this algorithm, sorting n elements takes 2n-3 steps in the worst
  case.
       In the first step of Gordillo-Luna's first algorithm, the blocking
  rules for the i-th cell are computed from three cells (the right neighbor
  i+1, the cell itself i, and the left neighbor i-1).
       Combining equations 1, 2 and 3 results in:

  \[
  X_i = \begin{cases}
  X_{i+1} & \text{if } (X_i > X_{i+1} \text{ and } X_{i+1} \le X_{i+2}) \\
  X_{i-1} & \text{if } (X_i < X_{i-1} \text{ and } X_i \le X_{i+1}) \\
  X_i & \text{otherwise}
  \end{cases} \tag{4}
  \]

       Equation 4 shows that four cells (i-1, i, i+1 and i+2) are used for
  computing the next state of the i-th cell.

  3.1 Gordillo-Luna's Second Algorithm
       In Gordillo-Luna's first algorithm the left blocking rule is
  constrained so that most of the swaps take place with the right neighbor of
  the automata cell. Because of this, the algorithm starts by moving the
  smallest key from left to right. On the other hand, an equivalent algorithm
  can apply a similar constraint to the right blocking rule, so that most of
  the swaps take place on the left side of each cell, permitting the biggest
  key to start moving to the right.
       In the first step of Gordillo-Luna's second algorithm, the blocking
  rules of the i-th cell are calculated from five cells (i-2, i-1, i, i+1 and
  i+2). In the second step, the blocking rule results of adjacent cells are
  compared with each other and the replacement decision is made. If the two
  steps are combined, the replacement decision for a cell depends on six
  cells (3 neighbors at left, the cell itself and 2 neighbors at right). The
  sorting time of this algorithm is $O([\frac{3n}{2} - 2])$.

  4     PROPOSED SORTING ALGORITHMS
       First we review the tree representation used in our proposed
  algorithms, and then three different parallel sorting algorithms are
  explained in detail.
       Definition 2: Any array can be displayed as a tree, as in Figure 1.
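  As a reference point for the comparisons made in this section, Gordillo-Luna's
  first algorithm (equations 1 to 3 of Section 3) can be sketched in Python as
  below. This is a minimal sketch under stated assumptions, not the authors'
  implementation: the function names gordillo_luna_step and gordillo_luna_sort
  are hypothetical, the array is padded with a lower-bound sentinel on the left
  and an upper-bound sentinel on the right, and the run stops when a full cycle
  changes no key (the paper only says that a general termination mechanism is
  used).

```python
import math


def gordillo_luna_step(keys):
    """One full cycle (q0 -> q1 -> q0) of equations (1)-(3) on a list of keys."""
    n = len(keys)
    # Assumption: pad with -inf on the left and +inf on the right so that
    # border cells can evaluate the same rules as intermediate cells.
    x = [-math.inf] + list(keys) + [math.inf]
    s_r = [0] * (n + 2)  # right blocking rule registers
    s_l = [0] * (n + 2)  # left blocking rule registers

    # Step 1 (q0 -> q1): blocking rules, equations (1) and (2).
    for i in range(1, n + 1):
        s_r[i] = 1 if x[i] > x[i + 1] else 0
        s_l[i] = 1 if (x[i] < x[i - 1] and x[i] <= x[i + 1]) else 0

    # Step 2 (q1 -> q0): swapping rule, equation (3). All cells read the old
    # keys in x and write into new, which models the synchronous update.
    new = list(x)
    for i in range(1, n + 1):
        if s_r[i] == 1 and s_l[i + 1] == 1:
            new[i] = x[i + 1]
        elif s_l[i] == 1 and s_r[i - 1] == 1:
            new[i] = x[i - 1]
    return new[1:n + 1]


def gordillo_luna_sort(keys):
    """Repeat cycles until nothing changes; return (sorted keys, cycles used)."""
    keys, steps = list(keys), 0
    while True:
        nxt = gordillo_luna_step(keys)
        if nxt == keys:
            return keys, steps
        keys, steps = nxt, steps + 1


print(gordillo_luna_sort([7, 6, 5, 4, 3, 2, 1]))  # ([1, 2, 3, 4, 5, 6, 7], 11)
```

  For the descending input of 7 keys the sketch needs 11 cycles, which agrees
  with the 2n-3 worst case quoted above.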

         Figure 1: Tree representation of array

       In this figure, "i, i+1, i+2, ..., i+10" are array elements and each
  edge represents the value of an element relative to its adjacent element.
  For example, in Figure 1 the value of element i+3 is greater than the values
  of elements i+2 and i+4. Similarly, element i+7 has a smaller value than
  element i+6 but a greater value than element i+8.
       Based on this definition, an array is sorted in ascending order only if
  it has a tree representation like Figure 2.

    Figure 2: Tree representation of ascending array

      For an array sorted in descending order the same definition holds, with
  the direction of the edges reversed.
      Obviously the worst case for ascending sorting of an array occurs when
  the array is sorted in descending order.

  4.1 First Proposed Algorithm

      This algorithm uses a symmetric neighborhood of radius one and its
  sorting time is equal to that of Gordillo-Luna's first algorithm. In our
  proposed algorithm, each cell uses only its right and left neighbor cells to
  sort the array, so the number of cells used is smaller than in
  Gordillo-Luna's first algorithm. This reduction results in less inter-cell
  communication and less overall computation.
      Another difference is that, unlike Gordillo-Luna's algorithm, our
  algorithm uses neither blocking rules nor memory in the cellular automata.
      In this method, the right and left neighbors of each cell are considered
  as follows:

            Figure 3: Right and left neighbors

      The replacement conditions of each cell are shown in Figure 4:

          Figure 4: Replacement conditions of the i-th cell

      This means that both neighbors of the i-th cell (the left neighbor, cell
  i-1, and the right neighbor, cell i+1) are less than the i-th cell. If these
  conditions are met, the i-th and (i+1)-th cells are exchanged. In other
  words:

  \[
  \text{if } (X_{i-1}^{t} < X_i^{t} \text{ and } X_i^{t} > X_{i+1}^{t})
  \text{ then } \{X_i^{t+1} = X_{i+1}^{t} \text{ and } X_{i+1}^{t+1} = X_i^{t}\}
  \text{ else } \{\text{nothing}\} \tag{5}
  \]

      In the above equation, $X_i^{t+1}$ and $X_{i+1}^{t+1}$ are the values of
  the i-th and (i+1)-th cells at time t+1, and $X_{i-1}^{t}$, $X_i^{t}$,
  $X_{i+1}^{t}$ are respectively the values of the (i-1)-th (left neighbor),
  i-th and (i+1)-th (right neighbor) cells at time t.
      So, in the worst case, an array can be sorted in 2n-3 steps, which is
  equal to the time of Gordillo-Luna's first algorithm [3]. In other cases,
  termination of the algorithm can be announced by an outer supervisor: if no
  replacement is done in a step, the array is sorted. An example of ascending
  sorting by this approach is illustrated in Table 2.

      Table 2: Parallel sorting on cellular automata (first proposed
               algorithm), worst case for ascending sorting, using 7 numbers

      Step         Cell values
      (initial)    7    6    5    4    3    2    1
      1            6    7    5    4    3    2    1
      2            6    5    7    4    3    2    1
      3            5    6    4    7    3    2    1
      4            5    4    6    3    7    2    1
      5            4    5    3    6    2    7    1
      6            4    3    5    2    6    1    7
      7            3    4    2    5    1    6    7
      8            3    2    4    1    5    6    7
      9            2    3    1    4    5    6    7
      10           2    1    3    4    5    6    7
      11           1    2    3    4    5    6    7

  4.2 Second Proposed Algorithm

      The sorting time of the second algorithm is the same as that of
  Gordillo-Luna's second algorithm, but the main advantage of our algorithm is
  that, instead of using 6 cells (Gordillo-Luna's second algorithm), we use
  just 4 cells. This reduction in the neighborhood radius of each

  cell leads to less overall computation and fewer inter-cell communications
  (which has a marked effect in reducing the sorting time).
      This algorithm considers three neighbor cells for each cell (two cells
  at right and one at left) and is done in two steps.

           Figure 5: Neighbor cells for the i-th cell.

      In the first step, the replacement conditions of the i-th cell are
  shown in Figure 6:

      Figure 6: Replacement conditions of the i-th cell

      Figure 6 means that, in addition to the first rule (the condition
  already used in equation 5), another rule is taken into account to reduce
  the sorting time. With this new rule, four cells are used for the decision:
  if $x_i$ is greater than $x_{i+1}$ and $x_{i+1}$ is smaller than $x_{i+2}$,
  the case is suitable for replacement. These conditions cause the memory cell
  to be set to one:

  \[
  \text{if } [(X_{i-1}^{t} < X_i^{t} \text{ and } X_i^{t} > X_{i+1}^{t})
  \text{ or } (X_i^{t} > X_{i+1}^{t} \text{ and } X_{i+1}^{t} < X_{i+2}^{t})]
  \text{ then } S_i = 1 \text{ else } S_i = 0 \tag{6}
  \]

      $S_i$ indicates a replacement between cells $x_i$ and $x_{i+1}$.
  Finally, in the second step the replacement is done only if $S_i = 1$ and
  $S_{i-1} = 0$:

  \[
  \text{if } S_i = 1 \text{ and } S_{i-1} = 0
  \text{ then } \{X_i^{t+1} = X_{i+1}^{t} \text{ and } X_{i+1}^{t+1} = X_i^{t}\}
  \text{ else } \{\text{nothing}\} \tag{7}
  \]

  4.3 Third Proposed Algorithm
      This algorithm uses exactly the same number of cells as Gordillo-Luna's
  second algorithm, but with a substantial difference: its sorting time is a
  significant improvement over the sorting time of Gordillo-Luna's second
  algorithm, $O([\frac{3n}{2} - 2])$.
      This algorithm is performed in two steps, and computing the next state
  of each cell needs six cells (3 neighbors at left, the cell itself, and 2
  neighbors at right):

                              Table 3: Ascending array sorting by using the third proposed algorithm

           Number of states       An example of worst case ascending sorting in an array with 15 numbers
                         15     14    13    12     11       10     9       8         7        6       5        4       3       2      1
                1        14     15    12    13     11       10     9       8         7       6        5       4        3       1      2
                2        14     12    15    11     13        9     10      8         7       6        5       4        1       3      2
                3        12     14    11    15      9       13     8      10         6       7        5       1        4       2      3
                4        12     11    14    9      15        8     13      6        10       5        7       1        2       4      3
                5        11     12     9    14      8       15     6      13         5       10       1       7        2       3      4
                6        11      9    12    8      14        6     15      5        13       1       10       2        7       3      4
                7        9      11     8    12      6       14     5      15         1       13       2       10       3       7      4
                8        9       8    11    6      12        5     14      1        15       2       13       3       10       4      7
                9        8       9     6    11      5       12     1      14         2       15       3       13       4       10     7
               10        8       6     9    5      11        1     12      2        14       3       15       4       13       7      10
               11        6       8     5    9       1       11     2      12         3       14       4       15       7       13     10
               12        6       5     8    1       9        2     11      3        12       4       14       7       15       10     13
               13        5       6     1     8      2        9     3      11         4       12       7       14      10       15     13
               14        5       1     6    2       8        3     9       4        11       7       12       10      14       13     15
               15        1       5     2    6       3        8     4       9         7       11      10       12      13       14     15
               16        1       2     5     3      6        4     8       7         9       10      11       12      13       14     15
               17        1       2     3    5       4        6     7       8         9       10      11       12      13       14     15
               18        1       2     3    4       5        6     7       8         9       10      11       12      13       14     15
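  The rules of the three proposed algorithms (equations 5, 6 and 7, and the
  rule set given as equation 8) share one shape: every cell first computes a
  flag S_i from the current keys, and then cells i and i+1 are exchanged where
  S_i = 1 and S_{i-1} = 0. The Python sketch below is written under that
  reading and is not the authors' implementation. The names flag_first,
  flag_second, flag_third, ca_step and ca_sort are hypothetical; the padding
  with lower-bound and upper-bound sentinels follows the paper's remark about
  fixed extremes at both ends of the array; distinct keys are assumed, since
  the rules use strict comparisons; and the supervisor is modeled by stopping
  when a step performs no replacement.

```python
import math

LEFT_PAD, RIGHT_PAD = 3, 2  # sentinels needed by the widest rule (i-3 .. i+2)


def flag_first(x, i):
    # Condition of equation (5): cell i is greater than both neighbors.
    # S_{i-1} is then automatically 0, so the shared step 2 adds nothing.
    return x[i - 1] < x[i] > x[i + 1]


def flag_second(x, i):
    # Equation (6).
    return ((x[i - 1] < x[i] and x[i] > x[i + 1]) or
            (x[i] > x[i + 1] and x[i + 1] < x[i + 2]))


def flag_third(x, i):
    # Equation (8).
    return ((x[i - 1] < x[i] and x[i] > x[i + 1]) or
            (x[i - 3] < x[i - 2] and x[i - 2] > x[i - 1]
             and x[i - 1] > x[i] and x[i] > x[i + 1]) or
            (x[i - 2] > x[i - 1] and x[i - 1] > x[i]
             and x[i] > x[i + 1] and x[i + 1] < x[i + 2]))


def ca_step(keys, flag):
    """One synchronous step: compute all flags, then apply the exchanges."""
    n = len(keys)
    x = [-math.inf] * LEFT_PAD + list(keys) + [math.inf] * RIGHT_PAD
    first, last = LEFT_PAD, LEFT_PAD + n - 1

    # Step 1: every real cell computes S_i from the keys at time t.
    s = [False] * len(x)
    for i in range(first, last + 1):
        s[i] = flag(x, i)

    # Step 2: exchange cells i and i+1 where S_i = 1 and S_{i-1} = 0.
    # Two adjacent exchanges can never both fire, so writes do not overlap.
    new = list(x)
    for i in range(first, last):
        if s[i] and not s[i - 1]:
            new[i], new[i + 1] = x[i + 1], x[i]
    return new[first:last + 1]


def ca_sort(keys, flag):
    """Run steps until no replacement occurs; return (keys, steps used)."""
    keys, steps = list(keys), 0
    while True:
        nxt = ca_step(keys, flag)
        if nxt == keys:
            return keys, steps
        keys, steps = nxt, steps + 1


print(ca_sort([7, 6, 5, 4, 3, 2, 1], flag_first))  # ([1, 2, 3, 4, 5, 6, 7], 11)
print(ca_step(list(range(15, 0, -1)), flag_third))
# [14, 15, 12, 13, 11, 10, 9, 8, 7, 6, 5, 4, 3, 1, 2]
```

  With flag_first the sketch reproduces the 11 steps of Table 2, and with
  flag_third the first steps on the descending input of 15 keys match the
  first rows of Table 3.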

                 Figure 7: Neighbor cells for the i-th cell.

      In the first step, the replacement conditions are considered as follows:

         Figure 8: Replacement conditions of the i-th cell

       This means that, in addition to the first rule (the condition already
  used in equation 5), other rules are also taken into account. The second
  rule of Figure 8 says that if $x_{i-3}$ is smaller than $x_{i-2}$,
  $x_{i-2}$ is greater than $x_{i-1}$, $x_{i-1}$ is greater than $x_i$ and
  $x_i$ is greater than $x_{i+1}$, then the i-th and (i+1)-th cells can be
  exchanged, and the memory cell is set to 1. The third rule of Figure 8
  follows by a similar argument.
       The rules of this algorithm can be written as follows:

  \[
  \begin{aligned}
  \text{if } [\, & (X_{i-1} < X_i \text{ and } X_i > X_{i+1}) \text{ or} \\
  & (X_{i-3} < X_{i-2} \text{ and } X_{i-2} > X_{i-1} \text{ and } X_{i-1} > X_i \text{ and } X_i > X_{i+1}) \text{ or} \\
  & (X_{i-2} > X_{i-1} \text{ and } X_{i-1} > X_i \text{ and } X_i > X_{i+1} \text{ and } X_{i+1} < X_{i+2}) \,] \\
  \text{then } & S_i = 1 \text{ else } S_i = 0
  \end{aligned} \tag{8}
  \]

       In the second step, if $S_i = 1$ and $S_{i-1} = 0$, then the i-th and
  (i+1)-th cells are exchanged, as in equation 7.
       For parallel implementation of these algorithms, we define extreme
  values at the two ends of the array (a lower bound before the first element
  and an upper bound after the last element); these fixed values are simply
  copied at each step. The algorithms are programmed for the intermediate
  cells.

  5    CONCLUSION
       In this paper we proposed three novel sorting algorithms based on 1-D
  (linear) cellular automata. The first and second proposed algorithms use a
  smaller neighborhood radius than Gordillo-Luna's algorithms but have the
  same sorting time. The third proposed algorithm uses the same neighborhood
  radius as Gordillo-Luna's second algorithm but has a lower sorting time.
  This improvement shows the efficiency and robustness of our proposed
  approach. It is important to highlight the significance of cellular
  automata as a computational mechanism for efficiently solving problems that
  involve large amounts of data uniformly distributed in space. The proposed
  algorithms are mathematical model-based algorithms and are independent of
  the implementation platform.

  REFERENCES
  [1] S. Wolfram, "Computation Theory of Cellular Automata", Commun. Math.
      Phys. 96, pp. 15-57, Springer, 1984.
  [2] S. Wolfram (ed.), "Theory and applications of CA", World Scientific,
      Singapore, 1986.
  [3] J. L. Gordillo, J. V. Luna, "Parallel Sort On a Linear Array Of Cellular
      Automata", IEEE trans. Computer vol. 2, pp. 1904-1910, 1994.
  replacement is done:                                                                          [4] P. Sarkar , “Brief History of Cellular Automata” ,
  if S i == 1 And S i−1 == 0 Then {X t +1 = X t And X t +1 = X t } (9)
                                                                                                    ACM Computing Surveys, Vol. 32, No. 1, 2000.
                                                 i          i+1            i+1        i
                                                                                                [5] C. D. Thompson ,H. T. Kung, “Sorting on a
  else {nothing}
                                                                                                    Mesh Connected Parallel Computer” , Carnegie
                                                                                                    – Mellon University, Communications of the
      An example of sorting with this algorithm is                                                  ACM, vol. 20, pp 263-271 , 1977.
  illustrated at table3.                                                                        [6] D. S. Hirschberg , “ Fast Parallel Sorting
      Sorting time of Gordillo-Luna’s second algorithm                                              Algorithms” , Communications of the ACM , vol.
  and third proposed algorithm with different sizes of                                              21,pp. 657-661, August 1978.
  an array are available table 4.                                                               [7] W. P. Goodwin, S. K. Das , “Implementing
                                                                                                    Parallel Sorting Algorithms on a Linear Array of
     Table 4. Comparison between Gordillo-luna’s second                                             Transputers” , ACM , pp. 789-796 , 1989.
             algorithm & third proposed algorithm.
                                                                                                [8] K. Qureshi, “A           Practical     Performance
    No. of sorting       Worst case           Worst case
      elements        sorting time with    sorting time with
                                                                                                    Comparison of Parallel Sorting Algorithms on
                       Gordillo-Luna’s      third proposed                                          Homogeneous Network of Workstations” ,
                      second algorithm         algorithm                                        [9] K. E. Batcher, “ Sorting networks and their
                                                                                                    Applications” , AFIP Proc, Spring Joint
           n=10                              13                               11
                                                                                                    Computer Conference, Vol. 32, pp. 307-314,
           n=15                              21                               18
           n=20                              28                               24
                                                                                                [10] C. Rub, “On Batcher’s Merge Sorts as Parallel
           n=50                              73                               61
          n=100                              148                              124
                                                                                                    Sorting Algorithms”,
          n=200                              298                              249               [11] G.M.Megson , “An introduction to Systolic
          n=1000                            1498                             1249                   Algorithm design” clara don press Oxford,1992.
         n=10000                            14998                            12499              [12] V. Kumar ,”Introduction to parallel Computing” ,
         n=50000                            74998                            62499                  2nd Edition, 2003.
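
 As an illustration of the two-step update of the third proposed
 algorithm (Eqs. 8 and 9), the following Python sketch applies one
 synchronous update to a list. It is a minimal sketch, not the authors'
 implementation. The function name ca_step and the boundary handling are
 assumptions made here: neighbors outside the array are treated as minus
 infinity on the left and plus infinity on the right, so that every cell
 can evaluate Eq. 8. The boundary treatment used in the paper may differ.
 Indices in the code are 0-based.

```python
import math


def ca_step(x):
    """Apply one update (Eq. 8, then Eq. 9) and return (new_x, s).

    Assumption: out-of-range neighbors read as -inf (left) or +inf (right).
    """
    n = len(x)

    def at(k):
        # Hypothetical boundary rule, not taken from the paper.
        if k < 0:
            return -math.inf
        if k >= n:
            return math.inf
        return x[k]

    # First step (Eq. 8): compute the memory bit S_i of every cell.
    s = []
    for i in range(n):
        a3, a2, a1 = at(i - 3), at(i - 2), at(i - 1)
        c, b1, b2 = at(i), at(i + 1), at(i + 2)
        rule1 = a1 < c and c > b1
        rule2 = a3 < a2 and a2 > a1 and a1 > c and c > b1
        rule3 = a2 > a1 and a1 > c and c > b1 and b1 < b2
        s.append(1 if (rule1 or rule2 or rule3) else 0)

    # Second step (Eq. 9): exchange cells i and i+1 when S_i = 1 and S_{i-1} = 0.
    # All exchanges read the old values in x, so the update is synchronous.
    y = list(x)
    for i in range(n - 1):
        s_left = s[i - 1] if i > 0 else 0
        if s[i] == 1 and s_left == 0:
            y[i], y[i + 1] = x[i + 1], x[i]
    return y, s


print(ca_step([4, 3, 2, 1]))  # ([3, 4, 1, 2], [1, 0, 1, 0])
```

 For the input [4, 3, 2, 1], the first rule of Eq. 8 marks cell 0 and
 the second rule marks cell 2, so two exchanges take place in the same
 update. Because an exchange at cell i requires S_{i-1} = 0, two
 exchanges can never involve the same cell.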

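
 To put the figures of Table 4 in perspective, the short Python sketch
 below computes the fraction of the worst-case sorting time saved by the
 third proposed algorithm relative to Gordillo-Luna's second algorithm.
 The numbers are copied from Table 4. The names TABLE_4 and
 relative_reduction are hypothetical and introduced only for this
 illustration.

```python
# Worst-case sorting times from Table 4:
# n -> (Gordillo-Luna's second algorithm, third proposed algorithm)
TABLE_4 = {
    10: (13, 11),
    15: (21, 18),
    20: (28, 24),
    50: (73, 61),
    100: (148, 124),
    200: (298, 249),
    1000: (1498, 1249),
    10000: (14998, 12499),
    50000: (74998, 62499),
}


def relative_reduction(n):
    """Fraction of the baseline sorting time saved for array size n."""
    baseline, proposed = TABLE_4[n]
    return (baseline - proposed) / baseline


for n in TABLE_4:
    print(f"n={n}: {relative_reduction(n):.1%} shorter")
```

 For the tabulated sizes the saving lies between about 14% and 17%.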
