The CMU Reconfigurable Computing Project



                    April 9, 1999
                    Mihai Budiu
                 mihaib@cs.cmu.edu



SSS 4/9/99         CMU Reconfigurable Computing   1
             Current Project Members

  CS Department                          ECE Department

                                         Herman Schmit
  Seth Copen Goldstein                   Srihari Cadambi
  Mihai Budiu                            Matt Moe
                                         Robert Taylor
                                         Ronald Laufer

Why Study Reconfigurable Hardware?
             It is a nice computation paradigm
                   (wire your own computer)




Why Study Reconfigurable Hardware?
      Algorithm             Year  System          Versus               Speedup (x)
      DNA matching          1992  SPLASH 2        SPARC 10             4300
      FIR Filter            1998  PipeRench       UltraSparc 300 MHz   90
      IDEA Encryption       1998  PipeRench       UltraSparc 300 MHz   61
      SAT solver            1997  Pamette         SPARC 5 110 MHz      17-1100
      Ray Casting           1995  RIPP-10         Pentium 75 MHz       33.8
      Hidden Markov Model   1996  1 Xilinx FPGA   SPARC 10             24.4
      DES Encryption        1996  GARP            UltraSparc 170 MHz   24
      SPEC92                1994  MIPS+RC         MIPS                 1.22



             Commercial Players




 Source: In-stat April 1998
 *Does not include software, hardwire or support EPROMs
What Is “Reconfigurable Hardware”?
[Figure: an interconnection network of switches linking universal gates and/or storage elements]


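        Basic Ingredient: RAM cell

[Figure: a RAM addressed by a0 and a1, holding contents 0, 0, 0, 1, drives its data output exactly like an AND gate computing a0 & a1]

                  Universal gate = RAM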
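
The "universal gate = RAM" idea can be sketched in a few lines of Python. This is only an illustration, not the project's code: the helper name `make_lut` and the convention that a0 is the least significant address bit are assumptions.

```python
def make_lut(contents):
    """Model a small RAM used as a gate (hypothetical helper).

    contents[i] is the bit stored at address i. The gate inputs form the
    address, with the first input (a0) as the least significant bit.
    """
    def lut(*address_bits):
        if 2 ** len(address_bits) != len(contents):
            raise ValueError("RAM size does not match the number of inputs")
        address = sum(bit << i for i, bit in enumerate(address_bits))
        return contents[address]
    return lut

# Contents 0, 0, 0, 1 give an AND gate, as on the slide.
and_gate = make_lut([0, 0, 0, 1])
# Rewriting the contents turns the same RAM into a different gate.
xor_gate = make_lut([0, 1, 1, 0])
```

Changing the stored bits is all it takes to change the function, which is what makes the RAM cell a convenient building block.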


              Basic Ingredients (ctd)

[Figure: switches in the interconnect, each set open or closed by a stored 0 or 1]

             A switch is controlled by a 1-bit RAM cell
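
A rough sketch of how stored bits steer signals, assuming a small grid of switches in which `switch_bits[o][i] == 1` closes the switch joining input wire i to output wire o. The function name `route` and the grid layout are illustrative assumptions, not taken from the slides.

```python
def route(inputs, switch_bits):
    """Return the value seen on each output wire (None if nothing drives it)."""
    outputs = []
    for row in switch_bits:
        # Collect every input connected to this output by a closed switch.
        driven = [inputs[i] for i, closed in enumerate(row) if closed]
        if len(driven) > 1:
            raise ValueError("two closed switches drive the same output wire")
        outputs.append(driven[0] if driven else None)
    return outputs

# The same two wires, routed straight through or crossed over,
# depending only on the configuration bits.
straight = route([1, 0], [[1, 0], [0, 1]])
crossed = route([1, 0], [[0, 1], [1, 0]])
```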
                          Outline
•   What is reconfigurable hardware
•   RH vs other computation paradigms
•   Challenges in RH research
•   PipeRench: the CMU project:
     – the hardware
     – the software
• Conclusions
                   RH vs ASICs
• Generally Application-Specific Integrated Circuits
  will be faster than RH:
     – RH wires are slow & big
     – RH bit-slices are costly to interconnect
     – RH devices must store configuration on the chip
                                  but
• RH can be reprogrammed
     – new algorithms
     – to fix bugs
• RH cheaper in small production
• RH tolerates faults better
• RH sometimes faster with staged computation
             RH vs Microprocessors
• RH less flexible (like a VLIW with fixed
  instructions)
                       but
• RH provides more (customized)
  computation elements
• RH can decrease memory traffic
• RH can be tailored for specific algorithms
  and data types

RH will not replace µPs, but will complement them
                        Types of RH

• FPGAs: bit-level logic functionality
             (the basic processing elements compute on 1 bit)
• word-based architectures: PipeRench (CMU)
             (basic PE operates on 8 bits)
             (basic PE is a small ALU)
• coarse architectures: RAW (MIT)
             (basic PE is a MIPS 2000 core)

                                    RH In A System
    [Figure: coupling (EPS image not included)]




              Challenges In RC
• Software tools:
     – Programming RC like software development
     – Automatic compilation from HLL
     – Automatic program partitioning
• Mapping algorithms efficiently (no ISA)
• System issues
     – interfaces
     – find “ideal” RC fabric


   The CMU Reconfigurable
      Computing Project


             Hardware Goals

• To build a complete reconfigurable
  hardware device
• To build the system integration hardware
• To host the device in a PC




    Our Device:

•   Word-based processing elements
•   Pipelined architecture
•   Virtualized hardware
•   Local interconnection network
•   Wide pipelined bus


[Figure: device organization. Labels: configuration memory; data & config controller; stripes; processing elements]

             Hardware Virtualization
[Figure: the actual available hardware holds the instructions currently in hardware; the other instructions are paged out]



        Hardware Virtualization (2)
[Figure: hardware stages labelled compute, compute, compute, configure; "page in" and "page out" arrows connect the hardware to the program in configuration memory]

 Overlap configuration with computation.
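
A toy schedule can make the overlap concrete. The sketch below assumes a simple round-robin policy in which one physical stripe is reconfigured per cycle while the others keep computing. The slides do not spell out the actual policy, so the function `schedule` and its replacement rule are assumptions for illustration only.

```python
def schedule(num_virtual, num_physical, cycles):
    """Trace which virtual stripe is paged in each cycle (toy model).

    Returns a list of (cycle, incoming_virtual, physical_slot, computing),
    where `computing` lists the virtual stripes that compute during the
    cycle in which `incoming_virtual` is being configured.
    """
    fabric = [None] * num_physical      # virtual stripe held by each slot
    trace = []
    for t in range(cycles):
        slot = t % num_physical         # slot being reconfigured (paged out)
        incoming = t % num_virtual      # virtual stripe being paged in
        computing = [v for i, v in enumerate(fabric)
                     if i != slot and v is not None]
        fabric[slot] = incoming
        trace.append((t, incoming, slot, computing))
    return trace

# Five virtual stripes time-shared on three physical stripes.
trace = schedule(num_virtual=5, num_physical=3, cycles=4)
```

In this model, by cycle 3 virtual stripe 3 is being configured into slot 0 while stripes 1 and 2 compute, so configuration time is hidden behind useful work.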
                   Processing Elements
[Figure: PE2, PE1, PE0; signals a, b, Cin, out]
• Look-up table
• Any 3-to-1 function
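
As an illustration of "any 3-to-1 function", the sketch below uses 8-entry look-up tables over inputs a, b and Cin to build a ripple adder. Using a second table for the carry is an assumption made for the example and may differ from how the real device produces carries. The names `lut3`, `SUM_TABLE`, `CARRY_TABLE` and `ripple_add` are hypothetical.

```python
def lut3(table, a, b, c):
    """Look up a 3-input function; the table is indexed by a + 2*b + 4*c."""
    return table[a | (b << 1) | (c << 2)]

SUM_TABLE = [0, 1, 1, 0, 1, 0, 0, 1]    # a XOR b XOR c
CARRY_TABLE = [0, 0, 0, 1, 0, 1, 1, 1]  # majority(a, b, c)

def ripple_add(x, y, width=8):
    """Add two `width`-bit numbers one bit-slice at a time."""
    result, carry = 0, 0
    for i in range(width):
        a, b = (x >> i) & 1, (y >> i) & 1
        result |= lut3(SUM_TABLE, a, b, carry) << i
        carry = lut3(CARRY_TABLE, a, b, carry)
    return result, carry
```

Swapping in different table contents gives a subtractor, a comparator or plain bitwise logic from the same structure.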



      The Interconnection Network
[Figure: a word-level cross-bar above a row of PEs (PE 1 ... PE N) with pass registers below. Labelled widths: B bits, P*B bits, P*B*N bits]
                                                    The PCI Board
      [Figure: chip.eps (EPS image not included)]




                Software Goal
To program reconfigurable devices using the
 standard software development processes:
   – Compile C or Java
   – Do it quickly

[Figure: compilation flow. Labels: Java; Partitioner; Data-flow Intermediate Language (DIL); Built; Configuration; Reconfigurable HW; CPU]

        Building Circuits From DIL

             a = b + c * d;
             e = c - d;

• variables → wires
• operators → gates

[Figure: data-flow graph with inputs b, c, d; operators *, +, -; outputs a, e]
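
The variables-to-wires, operators-to-gates mapping can be sketched with Python's standard `ast` module. This is not the DIL compiler: it relies on the assumption that the two example statements also happen to parse as Python, and the temporary wire names (`t0`, ...) and the function `build_circuit` are invented for the illustration.

```python
import ast

OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*"}

def build_circuit(source):
    """Turn assignments into a gate list of (operator, in1, in2, out_wire)."""
    gates = []

    def visit(node, out=None):
        if isinstance(node, ast.Name):
            return node.id                    # a variable is just a wire
        if isinstance(node, ast.BinOp):
            left = visit(node.left)
            right = visit(node.right)
            if out is None:
                out = f"t{len(gates)}"        # wire for an intermediate value
            gates.append((OPS[type(node.op)], left, right, out))
            return out
        raise ValueError("unsupported construct in this sketch")

    for stmt in ast.parse(source).body:
        # Each statement is assumed to be `name = expression`.
        visit(stmt.value, out=stmt.targets[0].id)
    return gates

circuit = build_circuit("a = b + c * d;\ne = c - d;")
```

For the slide's example this yields three gates: a multiplier feeding an adder that drives `a`, and a subtractor that drives `e`, with `c` and `d` fanning out to two gates each.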




Mapping Circuits To
[Figure: several drawings of a circuit with inputs a, b, c and operators + and -]

             The DIL Compiler Front-End
[Figure: front-end block diagram. Labels: DIL input file; Parser; Evaluator; Loader; component library; component circuits; Circuit; Backend]

                    The DIL Compiler Backend
[Figure: backend block diagram. Labels: Front-end; Circuit (expanded); Optimizer; Circuit; Placer-Router; Circuit (placed); Code generator; xfig; C++; Asm]

     The whole compilation process is very fast (compared to classical
     CAD tools).

     We can compile two orders of magnitude faster.

Processing Element Size Tradeoffs


                       Small                       Big
             Efficient usage                       Wasteful
                     Slower                        Faster bit-slice
  Flexible interconnect                            Coarse routing
  Bigger configuration                             Fewer configuration bits
Place and route easier                             Constrains the compiler



             Stripe Width Tradeoffs
                  Wider                         Narrower
          Fewer stripes                         More will fit
        Virtualize more                         Fewer page-ins
       Bandwidth waste                          Less bandwidth available
        Placer freedom                          Placement constrained




             Bus Width Tradeoffs

                Wider                           Narrower
             More area                          Less area
        High bandwidth                          Time-mux bus




                  Clock Speed Tradeoffs
                       (run-time)

                     Faster    Slower
        Short critical path    Big chains
        Long pipeline built    Compact circuits
     Decomposition overhead    Little decomposition
           Virtualized more    Less virtualized

[Figure: a 24-bit addition built from 8-bit adders (faster clock) next to a single 24-bit adder (slower clock)]
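
The decomposition in the figure can be sketched as a 24-bit addition carried out by three 8-bit adds linked through their carries. Treating each 8-bit add as one pipeline stage is an assumption made for the example, and the names `add8` and `add24` are hypothetical.

```python
def add8(a, b, carry_in=0):
    """One 8-bit adder: returns (8-bit sum, carry out)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8

def add24(x, y):
    """Add two 24-bit numbers as three chained 8-bit additions."""
    result, carry = 0, 0
    for stage in range(3):                 # low byte first
        a = (x >> (8 * stage)) & 0xFF
        b = (y >> (8 * stage)) & 0xFF
        partial, carry = add8(a, b, carry)
        result |= partial << (8 * stage)
    return result, carry
```

Each short add keeps the critical path to 8 bits, at the cost of three stages instead of one, which is the trade the table describes.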

                 Configuration Bits per Stripe
[Figure: configuration bits (y-axis, 0 to 1600) versus stripe width (x-axis, 64 to 144), one series per PE bit width: 2, 4, 8, 16, 32]

   [Figure: fir-throughput.eps (EPS image not included)]




                 Project Status
• Operational:
     – Behavioral and structural models of PipeRench
       in Verilog
     – Assembler, simulator
     – Tools for visualization and debugging
     – One tile fabricated and tested
     – Very fast compiler from intermediate language
• In progress:
     – Prototype PipeRench to be taped out this summer
     – PCI board to host PipeRench in a PC

 Simulated Speed-up vs. UltraSparc @ 300 MHz
 (bar chart, logarithmic axis from 1.0 to 1000.0)

      Kernel     Speed-up
      ATR        328.8
      Cordic      29.0
      DCT         20.6
      FIR         90.9
      IDEA        61.8
      Nqueens     26.0
      Over        76.1

                  Future Work
• Build the PCI board
• Build the OS device drivers
• Start investigating HLL issues:
     – automatic partitioning
     – translation to DIL
     – special code transformations


                Conclusions
• A set of important applications can benefit from
  RC devices
• RC offers potential for substantial performance
  improvement at a low cost
• RC devices will soon be mainstream
  in the embedded computing world;
  perhaps in the future they will also
  permeate the desktop
[Graphic label: Pentium V]

