

                 A Perspective on the Future of
                 Massively Parallel Computing:
        Fine-grain vs. Coarse-grain Parallel Models
                 Comparison and Contrasts

                 A review

                 Presented by Nanyan Jiang
                Outline of the paper
• Coarse grain parallel models
   – Von Neumann machine
   – Multiprocessor
   – Distributed systems
• Fine-grain parallel models – the connectionist
   – Artificial neural network (ANN)
   – Cellular automata (CA)
• Comparison and contrasts between coarse-grain and
  fine-grain models
   – Architectural level
   – Functional level
• Conclusion
              Different Premises
• Coarse-grain models: each individual processor is
  computationally powerful, based on the processor-
  separated-from-writable-storage paradigm
• Fine-grain models: very large networks or grids of
  interconnected simple processing units
   – Complex interconnection and interaction patterns
     among the entities
   – The entities themselves have very limited
     computational power
              Coarse-grain Parallel Models
• Von Neumann machine: a single processor plus a memory
• Extended to the multiprocessor approach (10 - 1000 processors):
   – Shared-memory machine: processors attached to a shared
     memory over a shared bus; most or all other devices are
     also shared (e.g., I/O devices, OS)
   – Distributed-memory machine: each processor has its own
     memory; cooperation is via message passing
   – Key question: how many processors can be effectively
     packed together?
• Distributed system (loosely coupled, 10^6 - 10^8 nodes):
  each node has its own processor, memory, devices and OS,
  tied together by a middleware layer
   – Key question: how to effectively communicate, coordinate
     and cooperate to accomplish a common goal?
   Fine-grain Models – connectionist
• A very large number of highly interconnected very simple basic
  information processing units
• A large number of processors and an even larger number of
  processor-to-processor communication links

                          CA                  ANN

  Learning capability     No                  Yes

  Interconnection         Only with nearby    Fairly complex and
                          neighbors           dense interconnection

  Pattern of connection   Heterogeneous       Homogeneous

  Functional neighborhood Nearby nodes        Larger region
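The CA column above, identical memoryless units that see only their nearest neighbors, can be sketched in a few lines. This is a minimal elementary cellular automaton; the choice of rule 110 and the initial state are illustrative:

```python
def ca_step(cells: list[int], rule: int = 110) -> list[int]:
    """One synchronous update of an elementary cellular automaton.
    Every cell applies the same fixed rule (no learning) and reads
    only its two nearest neighbours (wrap-around boundary)."""
    n = len(cells)
    return [
        # pack (left, self, right) into 3 bits, look up that bit of the rule
        (rule >> (cells[(i - 1) % n] << 2 | cells[i] << 1
                  | cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

state = [0, 0, 0, 1, 0, 0, 0]   # a single live cell
print(ca_step(state))           # [0, 0, 1, 1, 0, 0, 0]
```

Each unit is trivially simple; any interesting behaviour comes entirely from the repeated local interactions, which is the connectionist premise in miniature.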
 Structural comparison and contrasts

                    Fine-grain models               Coarse-grain models
                    e.g., ANN and CA                e.g., multiprocessors (MP),
                                                    distributed systems (DS)

Number of           ~10^10 nodes,                   MP: 10^3 - 10^4
processing units    ~10^14 communication links      DS: 10^6 - 10^8

Computational       Each node: very simple, very    Each node: powerful, expensive,
power               cheap and highly energy         and tends to dissipate a good
                    efficient; computational        deal of energy
                    power comes from interaction

Memory / the        ANN: the edge weights.          Clear distinction between
read/write/         Program at the processors,      processors and memory
storage medium      data on the edges.              (physically and logically)
                    CA: program in the processor,
                    a node's state is the data
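The "memory lives in the edge weights" row can be illustrated with a minimal perceptron sketch (the OR task, learning rate and epoch count are illustrative choices): after training, nothing is written to any separate store; everything the network "knows" is held in the weight values themselves.

```python
def train_or(epochs: int = 20, lr: float = 0.5) -> list[float]:
    """Train a single perceptron on OR with the classic update rule."""
    data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
    w = [0.0, 0.0, 0.0]          # two input weights + bias: the whole memory
    for _ in range(epochs):
        for (x1, x2), target in data:
            out = 1 if w[0] * x1 + w[1] * x2 + w[2] > 0 else 0
            err = target - out
            w[0] += lr * err * x1   # learning = writing into the edges,
            w[1] += lr * err * x2   # not into a separate memory bank
            w[2] += lr * err
    return w

w = train_or()
predict = lambda x1, x2: 1 if w[0] * x1 + w[1] * x2 + w[2] > 0 else 0
print([predict(0, 0), predict(0, 1), predict(1, 0), predict(1, 1)])  # [0, 1, 1, 1]
```

There is no processor/memory boundary to cross: the "program" (threshold rule) and the "data" (weights) live in the same place, which is exactly the contrast with the right-hand column.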
 Functional comparison and contrasts:
 how information is processed

                     Fine-grain                         Coarse-grain

Writes to storage    No                                 Yes

Interaction with     CA: a closed physical world        Countable inputs
the outer world      with an uncountable number of
                     configurations; entirely
                     transparent, since the state is
                     fully described by (1) the
                     underlying topology, (2) the
                     DFA at each node, and (3) the
                     current state of each node
                     ANN: internal states, observable
                     external states, and internal
                     synaptic edge weights
               Comparative advantages

                   Connectionist parallel models      Classical parallel models

   Application     Pattern recognition, where         Multiplying very large numbers;
   domain          patterns are highly structured     searching through very large
                   and/or context-sensitive           amounts of structure-free data

   Characteristics Plasticity: learning, dynamic      No learning and no interaction
                   interaction with the               with the environment; follows
                   environment, and/or                a simple set of arithmetic
                   adaptability of the components     operations of fixed size and
                   embedded in them                   complexity
Advantages of fine-grain connectionist models
- Scalability
- Avoiding the slow storage bottleneck
- Flexibility, adaptability and modifiability without
  explicitly re-programming the entire system
- Robustness and reliability in the presence of noise
- Graceful degradation
- Energy consumption
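Graceful degradation comes from redundancy: the represented value is spread over many links, so losing some of them shifts the output slightly rather than crashing the computation. A tiny sketch of this idea (the 1000-link setup and failure fractions are illustrative, not from the source):

```python
import random

def degraded_output(weights: list[float], inputs: list[float],
                    fail_frac: float) -> float:
    """Randomly drop a fraction of links and keep computing with the
    rest: the answer degrades gradually, not all at once."""
    random.seed(0)                       # reproducible illustration
    survivors = [w * x for w, x in zip(weights, inputs)
                 if random.random() >= fail_frac]
    return sum(survivors) / max(len(survivors), 1)

# 1000 redundant links, all encoding the same value 0.8
weights = [1.0] * 1000
inputs = [0.8] * 1000
for frac in (0.0, 0.3, 0.6):
    print(round(degraded_output(weights, inputs, frac), 2))  # stays near 0.8
```

Contrast this with a coarse-grain system, where the failure of the single link to memory (or of one unreplicated processor) halts the whole computation.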
Comparative advantage of connectionist models

                      Vacuum tubes     Neural switches

 Interconnectivity                     Higher
 Density                               Higher
 Size                                  Smaller
 Energy consumption                    Lower

                      MOSFETs          Neural switches

 Interconnectivity                     Higher
 Density              Higher
 Size                 Smaller
 Energy consumption                    Lower

                      Distributed system           Neural switches

 Link failure         Links are geographically     Robust
                      distant, not always
                      compatible, and serve
                      conflicting individual
                      goals

 Scalability                                       Better: the delays related to
                                                   fetching data from "storage"
                                                   need not grow proportionally
                                                   to system size, as there is no
                                                   physically separate memory
                                                   store
                  Conclusion
• Connectionist models can be viewed as models of
  very fine-grained parallelism, as the processing
  units and their basic operations are much simpler
  than in coarse-grain models
• Differences between the two classes of models:
   – Structural and functional
• Different application domains, and why
• Comparative advantages:
   – Scalability, robustness, graceful degradation,
     energy efficiency
