Final Presentation

Lior David, Ami Galperin
Supervisor: Oded Green
                    Introduction
                    Building the covariance matrix
                             The naïve algorithm
                             Our algorithm
                                  Terminology
                                  The Algorithm
                                  Optimizations
                                  Results
                    MVM on Plurality
                           The MVM algorithm
                           Plurality Platform
                           Results
                    Future Projects
                    Conclusions


April 18, 2010                        Parallel Covariance Matrix Creation - Final Presentation   2
   Developing a parallel algorithm for the creation
   of a covariance matrix
   Compatibility with Plurality’s HAL platform
   Maximized parallelization and core utilization
   Integrating the algorithm into Elta’s MVM
   (Minimum Variance Method) algorithm
   implementation


MVM is a modern 2-D spectral estimation algorithm
used by Elta’s Synthetic Aperture Radar (SAR).
The MVM algorithm:
            Improves resolution
            Removes side lobe artifacts (noise)
            Reduces speckle compared to what is possible with
            conventional Fourier transform SAR imaging
            techniques
   One of MVM’s main building blocks is the
   creation of a covariance matrix

Plurality’s HyperCore Architecture Line (HAL) family of
massively parallel manycore processors features:
   Unique task-oriented programming model
   Near-serial programmability
   High performance at low cost per watt per square
   millimeter
   Unique shared memory architecture - 2 MB cache size




Motivation:

Implementing the naïve algorithm will give us a
greater understanding of the parallelization
problem.




Chip [NxM]: the input matrix of N rows and M columns

          C1,1   C1,2   C1,3   …   C1,M
          C2,1   C2,2   C2,3   …   C2,M
          C3,1   C3,2   C3,3   …   C3,M
           …      …      …     …    …
          CN,1   CN,2   CN,3   …   CN,M
Sub-aperture [N1xM1]: an N1xM1 window of cells inside the chip
[Figure: a sub-aperture highlighted on the chip]
Each sub-aperture is flattened column by column into a vector.
Example for the 3x3 sub-aperture at rows 2-4, columns 2-4 of the chip:

          C2,2   C2,3   C2,4
          C3,2   C3,3   C3,4
          C4,2   C4,3   C4,4

Its conjugated row vector:

          C2,2*   C3,2*   C4,2*   C2,3*   C3,3*   C4,3*   C2,4*   C3,4*   C4,4*
Every sub-aperture holds its own covariance matrix Cov, with cells R1,1 … RM∙N,M∙N.
The covariance matrix Cov is the sum of the Cov matrices of all sub-apertures:

    \[ Cov_{xx} \sim \sum_{p=0}^{N-N_1} \sum_{q=0}^{M-M_1} V_{pq} V_{pq}^{*} \]

where V_pq is the vector of the sub-aperture at offset (p, q) in the chip.
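The sum over sub-apertures can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the presenters' implementation: the function names `flatten_column_major` and `naive_cov`, the 0-based indexing, the nested-list chip, and the parameter names `sub_rows` and `sub_cols` (standing in for N1 and M1) are all chosen for the example. The column-by-column flattening follows the order of the sub-aperture vector shown on the earlier slide.

```python
def flatten_column_major(chip, p, q, sub_rows, sub_cols):
    """Return the sub-aperture with top-left cell (p, q) as a flat vector,
    read column by column."""
    return [chip[p + i][q + j] for j in range(sub_cols) for i in range(sub_rows)]


def naive_cov(chip, sub_rows, sub_cols):
    """Sum V * V^H over every sub-aperture position (the naive algorithm)."""
    n, m = len(chip), len(chip[0])
    size = sub_rows * sub_cols
    cov = [[0j] * size for _ in range(size)]
    for p in range(n - sub_rows + 1):          # p = 0 .. N - N1
        for q in range(m - sub_cols + 1):      # q = 0 .. M - M1
            v = flatten_column_major(chip, p, q, sub_rows, sub_cols)
            for a in range(size):
                for b in range(size):
                    # every product is recomputed for every sub-aperture
                    cov[a][b] += v[a] * v[b].conjugate()
    return cov
```

Every sub-aperture recomputes all of its products, which is why the same multiplication is executed many times.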
Shortcomings
   Each multiplication is executed many times
            For a 32x32 chip, the total number of multiplications is 11.4M,
            while the optimal number is 208K (about x55, or x28 if only
            the Hermitian half of each Cov matrix is computed)
   The naïve algorithm is difficult to parallelize.
   Two main difficulties:
            Simultaneous writing to the same Rcells – requires
            mutexes
            Memory cost of holding a Cov matrix for every
            permutation (each is 250 KB) is too expensive
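The two multiplication counts can be checked with a short calculation. The sketch below assumes the 13x13 sub-aperture quoted later in the slides and counts a product C[x]∙C[y]* and its conjugate C[y]∙C[x]* as one multiplication. The function names are hypothetical.

```python
def naive_multiplications(n, m, sub_rows, sub_cols):
    """Every sub-aperture computes a full (sub_rows*sub_cols)^2 matrix."""
    windows = (n - sub_rows + 1) * (m - sub_cols + 1)
    return windows * (sub_rows * sub_cols) ** 2


def unique_multiplications(n, m, sub_rows, sub_cols):
    """Count distinct cell pairs that share at least one sub-aperture,
    counting a pair and its mirrored (conjugate) pair once."""
    count = 0
    for di in range(-(sub_rows - 1), sub_rows):
        for dj in range(-(sub_cols - 1), sub_cols):
            if (di, dj) < (0, 0):
                continue  # mirrored offset, already counted
            # number of chip positions where both cells are in bounds
            count += (n - abs(di)) * (m - abs(dj))
    return count
```

For a 32x32 chip and a 13x13 sub-aperture, these return 11,424,400 and 207,880, matching the 11.4M and 208K figures on the slide.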


Disadvantages
Plurality Platform
        Mutexes add complexity
        Memory space: the cache size is only 2 MB


                 The problem requires a different solution!



Our Algorithm



        A whole different ball game!

But first …

Before presenting the algorithm, we need a common language:
the following slides define the terms we have created.




Permutation: a pair of multiplier cells (M1, M2) on the chip, named by the offset between them.
   • Permutation [1,0]: M1 = C1,1 and M2 = C2,1
   • Permutation [1,1]: M1 = C1,1 and M2 = C2,2
Block: the rectangle of chip cells spanned by the two multipliers.
For Permutation [1,1] with M1 = C1,1 and M2 = C2,2, the Block is the 2x2 rectangle from C1,1 to C2,2.
BNW: a position of the Block on the chip.
[Figure: the chip with the BNW marked at its top-left corner, at C1,1]
Shifting
[Figure: a Block with M1 = C4,5 and M2 one row down and one column to the right]
   • Shift only upwards and leftwards
   • The Block is always inside the shifted window
   • A shift of (0,0) is named the Zero iteration
Cov: the covariance matrix [M∙N, M∙N]

          R1,1     R1,2     R1,3     …   R1,M∙N
          R2,1     R2,2     R2,3     …   R2,M∙N
          R3,1     R3,2     R3,3     …   R3,M∙N
           …        …        …       …    …
          RM∙N,1   RM∙N,2   RM∙N,3   …   RM∙N,M∙N
Rcell: a single cell Ri,j of Cov
Parallel
Each multiplication is executed once (208k for 32x32 chip)
Memory efficient
Generic
 Concept:
  Each Rcell in Cov is calculated by one specific
  permutation. This enables different
  permutations to work simultaneously.

1. For each permutation (1:313)

     1.1 For each legal BNW

                  1.1.1. Multiply the two multipliers

                  1.1.2. For each legal shift (including the zero iteration)

                         1.1.2.1. Add the multiplication product to the
                                  matching Rcell in Cov
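The steps above can be sketched in plain Python. This is one possible reading, not the presenters' code, and it rests on several assumptions. A permutation is taken to be an offset (di, dj) between the two multipliers. A legal BNW is any placement that keeps both multipliers on the chip. A legal shift is any sub-aperture window that contains both multipliers. Offsets that mirror an already handled one are filled in at the end by Hermitian symmetry. The names `fast_cov`, `sub_rows` and `sub_cols` (standing in for N1 and M1, to avoid clashing with the multipliers M1 and M2) are hypothetical.

```python
def fast_cov(chip, sub_rows, sub_cols):
    """Build Cov with each product computed once; returns (cov, multiplications)."""
    n, m = len(chip), len(chip[0])
    size = sub_rows * sub_cols
    cov = [[0j] * size for _ in range(size)]
    mults = 0
    for dj in range(sub_cols):                       # 1. each permutation
        for di in range(-(sub_rows - 1), sub_rows):
            if dj == 0 and di < 0:
                continue                             # mirrored offset
            for i1 in range(max(0, -di), min(n, n - di)):   # 1.1 each legal placement
                i2 = i1 + di
                for j1 in range(m - dj):
                    j2 = j1 + dj
                    # 1.1.1 multiply the two multipliers: RES = M1 * conj(M2)
                    res = chip[i1][j1] * chip[i2][j2].conjugate()
                    mults += 1
                    # 1.1.2 every window (top-left p, q) containing both cells
                    p_lo = max(0, max(i1, i2) - sub_rows + 1)
                    p_hi = min(min(i1, i2), n - sub_rows)
                    q_lo = max(0, j2 - sub_cols + 1)
                    q_hi = min(j1, m - sub_cols)
                    for p in range(p_lo, p_hi + 1):
                        for q in range(q_lo, q_hi + 1):
                            # column-major index of each multiplier in the window
                            a = (j1 - q) * sub_rows + (i1 - p)
                            b = (j2 - q) * sub_rows + (i2 - p)
                            cov[a][b] += res         # 1.1.2.1 matching Rcell
    # lower triangle from Hermitian symmetry (assumption, see lead-in)
    for a in range(size):
        for b in range(a):
            cov[a][b] = cov[b][a].conjugate()
    return cov, mults
```

Each offset writes to its own diagonal of Cov, so in this reading different permutations never touch the same Rcell and can run on different cores without mutexes.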




Finding all unique permutations
    Iterative algorithm
      1. Initialize the Delta(x,y) set D and the Permutation(x,y) set P as empty
      2. For each pair of cells (M1,M2) in an N1xM1 matrix
                 2.1. If |M1-M2| is not in D
                      2.1.1. Add |M1-M2| to D
                      2.1.2. Add (M1,M2) to P
    The unique permutation count is 313 (for Sub-aperture [13x13])
    Executed off-line
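A minimal sketch of this off-line search follows. It assumes that |M1-M2| means the offset between the two cells with its overall sign ignored, so that (M1,M2) and (M2,M1) share one delta. That reading is an assumption, but it is the one that reproduces the count of 313 for a 13x13 sub-aperture. The function name `unique_permutations` is hypothetical.

```python
from itertools import product


def unique_permutations(sub_rows, sub_cols):
    """Return one representative cell pair (M1, M2) per unique delta."""
    cells = list(product(range(sub_rows), range(sub_cols)))
    deltas = set()        # D: deltas seen so far
    permutations = []     # P: one (M1, M2) pair per delta
    for m1, m2 in product(cells, repeat=2):
        delta = (m2[0] - m1[0], m2[1] - m1[1])
        if delta < (0, 0):
            # treat a delta and its negation as the same delta
            delta = (-delta[0], -delta[1])
        if delta not in deltas:
            deltas.add(delta)
            permutations.append((m1, m2))
    return permutations
```

For a 13x13 sub-aperture there are 25 x 25 = 625 signed offsets. Pairing each offset with its negation and keeping the zero offset once gives (625 + 1) / 2 = 313.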
[Walk-through figure: the Chip [NxM] on the left and Cov, the covariance matrix [M∙N, M∙N], on the right]
       For a given Permutation [1,1]: M1 = C1,1 and M2 = C2,2
       There's a Block: the rectangle from C1,1 to C2,2 that contains M1 and M2
     Legal BNWs for this Block
     [Figure: the BNW marked at the top-left corner of the chip]
     For a given BNW: the Block is placed with M1 = C4,5 and M2 one row down and one column to the right
     RES = M1∙M2* (M1 multiplied by the complex conjugate of M2)
     The multipliers' numbering: the cells of the 3x3 window that starts at M1 are numbered column by column

           1   4   7
           2   5   8
           3   6   9
     The Zero Iteration

[Figure: with the block in its initial position, the result RES is written into Rcell (1,5) of R. The figure labels the diagonal through that cell Diag(5-1) and also marks the main diagonal.]

     Shifting

[Figure, four animation steps: the block of multipliers 1 to 9 in chip matrix C is first shifted one row upwards, which writes RES into Rcell (2,6) next to the zero-iteration result in Rcell (1,5), both on Diag(5-1). The block is then shifted one column leftwards, which writes RES into a further Rcell of R.]

We came across a regularity in the offset of the Rcell coordinates when shifting:
    Leftwards: (+Sub-ap size, +Sub-ap size)
    Upwards: (+1, +1)

[Figure: covariance matrix R (M∙N × M∙N).]

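The offset regularity can be sketched in a few lines. This is only an illustration, assuming 1-based Rcell coordinates; the helper name `rcell_after_shifts`, its arguments, and the sub-aperture size of 3 used in the example are hypothetical and do not come from the slides.

```python
def rcell_after_shifts(zero_cell, up_shifts, left_shifts, sub_ap_size):
    """Return the Rcell reached after shifting the block.

    zero_cell   -- (row, col) of the Rcell written in the zero iteration
    up_shifts   -- number of upward shifts; each adds (+1, +1)
    left_shifts -- number of leftward shifts; each adds
                   (+sub_ap_size, +sub_ap_size)
    """
    offset = up_shifts + left_shifts * sub_ap_size
    row, col = zero_cell
    return (row + offset, col + offset)


# The zero iteration writes Rcell (1, 5); one upward shift leads to Rcell (2, 6).
print(rcell_after_shifts((1, 5), up_shifts=1, left_shifts=0, sub_ap_size=3))
```

Both coordinates always move by the same amount, so the difference between column and row never changes while shifting.
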
[Figure: covariance matrix R (M∙N × M∙N) with its Rcells colored. Each color represents a different permutation.]

For a given permutation:
    RES is always written into the same group of Rcells
        All on the same diagonal
        Not necessarily all the cells of that diagonal
    There is no overlap between the Rcells of different permutations.
    The basis for parallelism!

    Each shift writes to one unique Rcell.
    Theoretically enables parallelism of Rcell granularity (an instance per Rcell)

[Figure: covariance matrix R (M∙N × M∙N).]

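The properties listed for a permutation can be checked with a small sketch. It rests on an interpretation that the slides do not state explicitly: that one permutation corresponds to one relative displacement between the two multipliers, and that multipliers are indexed column by column inside the sub-aperture. Under that assumption the sketch reproduces the permutation counts quoted in this presentation (61 for a 6x6 sub-aperture, 113 for 8x8). The function name is hypothetical.

```python
from collections import defaultdict


def permutation_groups(sub_ap_size):
    """Group upper-triangle Rcells by the displacement between their multipliers.

    Assumption (not from the slides): multipliers are indexed column-major
    inside the sub-aperture, and one permutation is one displacement (dr, dc).
    """
    s = sub_ap_size
    groups = defaultdict(set)
    for c1 in range(s):
        for r1 in range(s):
            i = r1 + s * c1              # column-major index of multiplier 1
            for c2 in range(s):
                for r2 in range(s):
                    j = r2 + s * c2      # column-major index of multiplier 2
                    if j >= i:           # upper triangle only
                        groups[(r2 - r1, c2 - c1)].add((i, j))
    return groups


groups = permutation_groups(6)

# Every Rcell of a group lies on one diagonal: j - i is constant.
for cells in groups.values():
    assert len({j - i for i, j in cells}) == 1

# No Rcell belongs to two groups, so groups can be written without locking.
all_cells = [cell for cells in groups.values() for cell in cells]
assert len(all_cells) == len(set(all_cells))

print(len(groups))
```
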
Different permutations carry different workloads, so changing the order in which the permutations execute may improve core utilization.

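Why execution order matters can be illustrated with greedy list scheduling. The sketch below compares an arbitrary order with a heaviest-first order (the standard longest-processing-time heuristic, used here for illustration only; the slides do not say which order was chosen). The workload numbers are made up.

```python
import heapq


def makespan(workloads, cores):
    """Greedy list scheduling: each job goes to the core that frees up first."""
    finish_times = [0] * cores
    heapq.heapify(finish_times)
    for work in workloads:
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + work)
    return max(finish_times)


# Hypothetical per-permutation workloads, in arbitrary units.
workloads = [1, 1, 1, 1, 2, 2, 6]

print(makespan(workloads, cores=2))                        # given order
print(makespan(sorted(workloads, reverse=True), cores=2))  # heaviest first
```

With these numbers the given order finishes after 10 units and the heaviest-first order after 7, which equals the total work divided by the two cores.
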
Parallelization Opportunities
    Different permutations can be processed simultaneously
    Different chips can be processed simultaneously
    Finer-grained parallelism at Rcell granularity (an instance per Rcell)

Platform Comparison
Plurality vs. Distributed Systems
  Our algorithm is best suited to shared-memory platforms, since Cov is shared by all cores
  On distributed-memory platforms, communication overhead would reduce its efficiency
  Plurality provides a much higher performance-to-power ratio than Elta's grid computing

Concept:
Execute many data-independent calculations offline and store the results in memory-efficient static look-up tables.
Advantages:
  Reduces calculation at run time by 50%
  Same tables used for all chips

Permutations Table
Holds the relevant info for each permutation:

     Multipliers’ indices
     Block borders
     Zero iteration coordinates

Optimal table size: (4∙6 + 8∙2) bit ∙ 313 = 1.5 KBytes

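The table-size figure can be reproduced directly. The sketch below reads the formula as six 4-bit fields plus two 8-bit fields per permutation; that reading, and the constant names, are assumptions made for illustration.

```python
FIELDS_4_BIT = 6      # assumed: six 4-bit fields per entry
FIELDS_8_BIT = 2      # assumed: two 8-bit fields per entry
PERMUTATIONS = 313

bits_per_entry = 4 * FIELDS_4_BIT + 8 * FIELDS_8_BIT
total_bytes = bits_per_entry * PERMUTATIONS / 8

print(bits_per_entry)                  # bits per permutation entry
print(total_bytes)                     # table size in bytes
print(round(total_bytes / 1024, 2))    # table size in KBytes
```

This gives 40 bits per entry and 1565 bytes in total, which rounds to the 1.5 KBytes stated above.
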
Offsets Table
Maps each shift to an Rcell
Concept:
      Uses the regularity in the offset of the Rcell coordinates when shifting upwards (+1,+1) or leftwards (+Sub-ap size, +Sub-ap size)

Optimal table size: 2∙(13∙13∙8) bit ∙ 313 = 106 KBytes

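A possible shape for such a table is sketched below, assuming a 13x13 sub-aperture (inferred from the 13∙13 term) and two 8-bit coordinates per shift. The function `build_offsets` and the zero-iteration cell used in the example are hypothetical, and the sketch does not check that the resulting coordinates stay inside R.

```python
SUB_AP_SIZE = 13            # assumed from the 13*13 term in the size formula
PERMUTATIONS = 313
BITS_PER_COORDINATE = 8


def build_offsets(zero_cell, sub_ap_size):
    """Precompute the Rcell for every (up, left) shift count of one permutation."""
    row, col = zero_cell
    return [
        [(row + up + left * sub_ap_size, col + up + left * sub_ap_size)
         for left in range(sub_ap_size)]
        for up in range(sub_ap_size)
    ]


table = build_offsets(zero_cell=(1, 5), sub_ap_size=SUB_AP_SIZE)
print(table[1][0])   # Rcell after one upward shift from Rcell (1, 5)

total_bits = 2 * (SUB_AP_SIZE * SUB_AP_SIZE * BITS_PER_COORDINATE) * PERMUTATIONS
print(total_bits // 8)   # table size in bytes
```

The size works out to 105,794 bytes, in line with the 106 KBytes quoted above.
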
Using Matrix Characteristics
Concept:
 Use matrix characteristics to reduce calculations
Important observation:
 Cov is a Hermitian matrix:

 $R = R^{\dagger}$

 $R_{i,j} = \overline{R_{j,i}}$

Using Matrix Characteristics
Highlight:
  Build only Cov’s upper triangle and, if necessary, generate the lower triangle inexpensively
Advantages:
    Reduces calculations by half
    Requires less space for storing the Cov matrix
    Most eigendecomposition algorithms require only the upper triangle

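Generating the lower triangle from the upper one takes a single conjugate copy per cell. The following is a minimal sketch on a small made-up matrix; the function name and the sample values are illustrative only.

```python
def mirror_upper_triangle(upper):
    """Fill the lower triangle of a square matrix from its upper triangle.

    `upper` is a list of rows; entries below the diagonal are ignored.
    """
    n = len(upper)
    full = [row[:] for row in upper]
    for i in range(n):
        for j in range(i + 1, n):
            # Hermitian property: R[j][i] is the complex conjugate of R[i][j].
            full[j][i] = upper[i][j].conjugate()
    return full


cov_upper = [
    [2 + 0j, 1 + 2j, 0 - 1j],
    [0j,     3 + 0j, 4 + 1j],
    [0j,     0j,     5 + 0j],
]
cov = mirror_upper_triangle(cov_upper)
print(cov[1][0], cov[2][0], cov[2][1])
```
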
[Figure: Different Chip Sizes. Run time in seconds (axis 0 to 0.12) versus chip size (6, 8, 16, 26, 32, 33, 36) for the naive algorithm and ours. Chart note: not optimized for x86.]

[Figure: Different Sub-aperture Sizes. Run time in seconds (axis 0 to 0.08) versus sub-aperture size (3, 4, 6, 8, 11, 13, 15) for the naive algorithm and ours. Chart note: not optimized for x86.]

Preliminary Algorithm
    S(Kx,Ky) → 2D FFT → image(X,Y)

Elta’s Algorithm
    image(X,Y) → Fragmentation → image(X,Y) chips, 32x32 (the chips overlap)
    image(X,Y) chip, 32x32 → 2D IFFT → S(X,Y), 32x32
    S(X,Y), 32x32 → MVM → MVM image(X,Y), 32x32
    MVM image(X,Y), 32x32 → Attachment → image(X,Y)

Input image: the original SAR image is segmented into chips (32x32 each). The chips overlap.
Output image: MVM is applied to each chip. The chips are then attached to each other to form a full-size MVM image.

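The fragmentation step can be sketched as a sliding window over the image. The slides state that the chips are 32x32 and that they overlap, but not by how much, so the `step` parameter, the 64x64 test image, and the function name are all hypothetical.

```python
def segment_into_chips(image, chip_size, step):
    """Split a 2-D image (list of rows) into square chips.

    Chips overlap whenever step < chip_size. `step` is a hypothetical
    parameter: the amount of overlap is not given in the slides.
    """
    rows, cols = len(image), len(image[0])
    chips = []
    for top in range(0, rows - chip_size + 1, step):
        for left in range(0, cols - chip_size + 1, step):
            chip = [row[left:left + chip_size]
                    for row in image[top:top + chip_size]]
            chips.append(((top, left), chip))
    return chips


# A made-up 64x64 image whose pixel value encodes its position.
image = [[r * 64 + c for c in range(64)] for r in range(64)]
chips = segment_into_chips(image, chip_size=32, step=16)
print(len(chips))
```

Each chip is stored with its top-left position, which is the information an attachment step would need to place the processed chip back into the full image.
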
[Figure: MVM task flow with the stages INIT, Segmentation, 2D-IFFT, Covariance (1), Eigenvalues (2), FFT, Attachment, and FINISH. The numbered Covariance and Eigenvalues stages appear under the label "Main effort".]

Plurality’s HyperCore Architecture Line (HAL) family of massively parallel manycore processors includes:

    16 to 256 32-bit RISC cores
    4 to 64 co-processors, each including a floating-point unit and a multiplier/divider.
    Each co-processor is shared by four RISC cores
    Shared-memory architecture with 2 MB of memory. No level-one cache.
    Hardware-based scheduler that supports a task-oriented programming model
    A cycle-accurate simulator that runs on an x86 platform
    Integration into the Eclipse IDE
    An emulator supporting native Linux and Windows environments

The emulator mimics the behavior of HAL's hardware scheduler while running on an x86 processor in Linux- or Windows-based environments.

   No need to move to new hardware and a new programming model
   The emulator is written in ANSI C, so almost any compiler can build it
   It comes with a prebuilt Makefile and a Visual Studio solution
   The emulator calls each task with all the information it requires: the correct task instance, timing, and core ID
   However, it is not cycle-accurate!

A cycle-accurate hardware simulator that reproduces the exact behavior of real HAL hardware. The simulator is integrated into the Eclipse IDE, but is very hard to debug with.

   Cycle-accurate simulation
   Uses GNU’s well-known binutils and the GDB debugger
   Integrated into the Eclipse IDE
   Eases the transition to hardware

   Compilation of the whole MVM algorithm using
   Plurality's emulator
   Compilation of our covariance matrix creation
   program using Plurality's simulator
   Using the Eclipse development environment to
   measure the cycle-accurate performance




N of M pre-compiler
   Motivation
            Works around a feature that Plurality has not yet implemented
            Allows manual scheduling in order to save processing time
   For a given task, limits the number of concurrent instances out of its defined quota
   Implemented in Perl

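The idea of capping concurrent instances can be illustrated with a counting semaphore. This is only a conceptual sketch using Python threads: the actual pre-compiler is written in Perl and targets Plurality task maps, and none of the names below come from it.

```python
import threading


def run_limited(task, instances, max_concurrent):
    """Run `instances` copies of `task`, at most `max_concurrent` at a time.

    Returns the highest number of instances observed running together.
    """
    gate = threading.Semaphore(max_concurrent)
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def worker(instance_id):
        with gate:                       # blocks while the cap is reached
            with lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            task(instance_id)
            with lock:
                state["active"] -= 1

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(instances)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]


results = []
peak = run_limited(results.append, instances=16, max_concurrent=4)
print(peak <= 4, len(results))
```

All 16 instances run to completion, but never more than 4 of them at once.
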
[Figure: MVM task flow with the stages INIT, Segmentation, 2D-IFFT, Covariance, Eigenvalues, 2D-FFT, Attachment, and FINISH. The Covariance stage is labeled "x86 Simulator" and the rest of the flow "x86 Emulator".]

   2D (I)FFT and eigendecomposition use Intel’s MKL as a black box on the x86
   Compiled to native x86 code, but not fully optimized

[Figure: chart with three series (Series1, Series2, Series3) and a vertical axis from 0 to 6.]

[Figure: Speedup for 61 Permutations. Cycles speedup (axis 0 to 11) versus number of cores (2, 4, 8, 16, 32, 64, 128, 256). Chip size: 15x15. Sub-aperture size: 6x6.]

[Figure: Speedup for 113 Permutations. Cycles speedup (axis 0 to 19) versus number of cores (2, 4, 8, 16, 32, 64, 128, 256). Chip size: 20x20. Sub-aperture size: 8x8.]

Completing MVM on Plurality
   Implement a parallel algorithm for finding the eigenvalues and eigenvectors of a dense Hermitian matrix
   Implement the 2D (I)FFT on Plurality using Plurality’s 1-D library
   Task map optimizations

   High complexity
   Many algorithms: QR, SVD, D&C, Jacobi, etc.
   Many off-the-shelf (OTS) solutions: Intel, AMD, IBM, GNU, LAPACK, NAG, FEAST, etc.
   Shared memory ∩ Parallel ∩ Open source C = ∅

   Main features:
            Fast – O(n²) for an n×n matrix
            Parallel
            Memory efficient – O(n²) for an n×n matrix
            Complex data structures
            Implementation unavailable
   Optimal for Plurality’s platform

    Our algorithm is unique: no parallel solution has been available to date. This solution may be applied to other signal processing problems
    Implementing MRRR is possible, which would enable the complete MVM algorithm to run on Plurality's platform
    Using our solution on Plurality's platform may be very appealing, since Plurality provides a higher performance-to-power ratio than grid computing, as well as faster run time

   Plurality’s low-power platform may enable integrating SAR
            On satellites
            On Unmanned Aerial Vehicles (UAVs)
            More implications …

The MVM algorithm

Assembling a SAR image consists of 2 phases:

1. Conventional SAR (preliminary algorithm)
   Incoming radar echoes
     -> Data manipulation (RMC, Adaptive Pre-Sum, MOCOMP,
        Autofocus, Polar-to-Rectangular Interpolation)
     -> 2D FFT
     -> SAR image

2. MVM method in SAR (Elta’s MVM algorithm)
   Identify target of interest on a SAR image
     -> Filtering and 2D IFFT
     -> Obtain virtual SAR raw data corresponding to the selected target
     -> MVM process
     -> MVM SAR image of the selected target



                                                                            Our algorithm




Look-up tables
Concept:
Execute many calculations in advance and save them in
memory-efficient static look-up tables.

Advantages:
Greatly reduces calculation at run time
The same table is used for all chips


 Permutation Table
       Holds the relevant information for each permutation

       #     M1x   M1y   M2x   M2y   Bx    By    REFx  REFy
       1     …     …     …     …     …     …     …     …
       …     …     …     …     …     …     …     …     …
       …     …     …     …     …     …     …     …     …
       313   …     …     …     …     …     …     …     …




    [M1x, M1y] are the coordinates of the first multiplier
    [M2x, M2y] are the coordinates of the second multiplier
    Bx is the number of rows of the permutation block
    By is the number of columns of the permutation block
    [REFx, REFy] are the coordinates of the pixel in the REF matrix
    (at the zero iteration)

 Optimal table size: (4*6 + 8*2) bits * 313 = 1.565 KBytes
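
The size estimate above can be checked with a short sketch. It assumes, going by the (4*6 + 8*2) expression, that the six fields M1x, M1y, M2x, M2y, Bx, By take 4 bits each and that REFx, REFy take 8 bits each. The packing order and the helper names `pack_entry` / `unpack_entry` are hypothetical and only show that one table row fits into 40 bits.

```python
# Field widths in bits, read off the slide's size expression:
# six 4-bit fields and two 8-bit fields (the field order is an assumption).
FIELDS = [("M1x", 4), ("M1y", 4), ("M2x", 4), ("M2y", 4),
          ("Bx", 4), ("By", 4), ("REFx", 8), ("REFy", 8)]
NUM_PERMUTATIONS = 313

ENTRY_BITS = sum(width for _, width in FIELDS)  # 4*6 + 8*2 = 40


def pack_entry(values):
    """Pack one table row (dict of field name -> int) into a 40-bit integer."""
    word = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack_entry(word):
    """Inverse of pack_entry: recover the field values from the packed integer."""
    values = {}
    for name, width in reversed(FIELDS):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values


table_bits = ENTRY_BITS * NUM_PERMUTATIONS   # 40 * 313 bits
table_bytes = table_bits / 8                 # 1 KByte taken as 1000 bytes on the slide
print(ENTRY_BITS, table_bits, table_bytes)
```

With 1 KByte taken as 1000 bytes, 1565 bytes matches the 1.565 KBytes quoted on the slide.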

 Offsets Table
     Maps each shift to a pixel
Concept:
     We found a regularity in how the pixel coordinates are
     offset when shifting: upwards by (+1, +1), and leftwards by
     (+Sub-ap size, +Sub-ap size)




  Table’s construction
  First we create a general matrix containing all possible pixel offsets.
  Matrix[i,j] is the offset when shifting i steps upwards and j steps leftwards.

       0   13   26   39   52   65   78   91   104   117   130   143   156
       1   14   27   40   53   66   79   92   105   118   131   144   157
       2   15   28   41   54   67   80   93   106   119   132   145   158
       3   16   29   42   55   68   81   94   107   120   133   146   159
       4   17   30   43   56   69   82   95   108   121   134   147   160
       5   18   31   44   57   70   83   96   109   122   135   148   161
       6   19   32   45   58   71   84   97   110   123   136   149   162
       7   20   33   46   59   72   85   98   111   124   137   150   163
       8   21   34   47   60   73   86   99   112   125   138   151   164
       9   22   35   48   61   74   87  100   113   126   139   152   165
      10   23   36   49   62   75   88  101   114   127   140   153   166
      11   24   37   50   63   76   89  102   115   128   141   154   167
      12   25   38   51   64   77   90  103   116   129   142   155   168
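
A minimal sketch of how such a general offsets matrix could be generated. It assumes a Sub-ap size of 13, as implied by the 13 x 13 matrix, and that each upward step adds 1 and each leftward step adds the Sub-ap size. The function name `build_general_offsets` is hypothetical.

```python
SUB_AP_SIZE = 13  # assumed from the 13 x 13 matrix shown on the slide


def build_general_offsets(size=SUB_AP_SIZE):
    """Return matrix[i][j] = offset for i steps upwards and j steps leftwards."""
    return [[i + j * size for j in range(size)] for i in range(size)]


general = build_general_offsets()
for row in general:
    print(" ".join(f"{value:4d}" for value in row))
```

The first row printed starts 0, 13, 26 and the last row ends with 168, as in the matrix above.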




   Then we add each permutation’s zero-iteration coordinates (x,y)
   to the matrix to form that permutation’s offsets table:

        Coords_zero-iteration(x,y) + Matrix

   where Matrix is the 13 x 13 general offsets matrix
   (Matrix[i,j] = i + 13*j, entries 0 to 168).




   Doing this for all 313 permutations gives 313 offsets tables:

        313 x [ Coords_zero-iteration(x,y) + Matrix ]

 Optimal table size: (13*13*8) bits * 313 = 52.9 KBytes
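
The construction and the size estimate can be sketched as follows. Two things here are assumptions rather than statements from the slides: that the same scalar offset is added to both coordinates (suggested by the (+1, +1) and (+Sub-ap size, +Sub-ap size) shift rule), and the example zero-iteration coordinates, which are made up. The slides do not say how an entry is encoded in the 8 bits used by the size estimate, so the sketch keeps plain coordinate pairs and reproduces the size arithmetic separately.

```python
SUB_AP_SIZE = 13
NUM_PERMUTATIONS = 313
BITS_PER_ENTRY = 8  # taken from the slide's size expression

# General offsets matrix: GENERAL[i][j] = i + 13*j
GENERAL = [[i + j * SUB_AP_SIZE for j in range(SUB_AP_SIZE)]
           for i in range(SUB_AP_SIZE)]


def permutation_offsets(ref_x, ref_y):
    """Add one permutation's zero-iteration coordinates to every offset.

    Assumption: the scalar offset applies equally to both coordinates.
    """
    return [[(ref_x + off, ref_y + off) for off in row] for row in GENERAL]


# Hypothetical zero-iteration coordinates, for illustration only.
example = permutation_offsets(0, 14)
print(example[0][:3])

# Size arithmetic from the slide: (13*13*8) bits per table, 313 tables.
total_bits = SUB_AP_SIZE * SUB_AP_SIZE * BITS_PER_ENTRY * NUM_PERMUTATIONS
total_kbytes = total_bits / 8 / 1000  # 1 KByte taken as 1000 bytes
print(round(total_kbytes, 1))
```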




Using Matrix Characteristics
Concept:
Using matrix characteristics to reduce calculations
Important observation:
 REF is a Hermitian matrix:

     $R = R^{\dagger}$

     $R_{i,j} = \overline{R_{j,i}}$
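
Because a Hermitian matrix is fixed by one triangle, only the entries with j >= i need to be computed; the rest follow by conjugation. The sketch below illustrates this on a covariance-style sum of outer products. That form, and the function name `hermitian_outer_sum`, are assumptions made for illustration, not the presentation's actual implementation.

```python
def hermitian_outer_sum(vectors):
    """Accumulate R = sum_k v_k v_k^H, computing only the upper triangle.

    The lower triangle is filled afterwards from R[j][i] = conj(R[i][j]).
    """
    n = len(vectors[0])
    R = [[0j] * n for _ in range(n)]
    for v in vectors:
        for i in range(n):
            for j in range(i, n):          # upper triangle, diagonal included
                R[i][j] += v[i] * v[j].conjugate()
    for i in range(n):
        for j in range(i + 1, n):          # mirror with conjugation
            R[j][i] = R[i][j].conjugate()
    return R


# Made-up sample vectors, for illustration only.
samples = [[1 + 2j, 3 - 1j, 0.5j], [2 + 0j, -1 + 1j, 4 - 2j]]
R = hermitian_outer_sum(samples)
print(R[0][1], R[1][0])
```

For an n x n matrix this computes n(n+1)/2 products per vector instead of n^2, which is roughly half the multiplications.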