
Generic FPGA - Architecture
•   FPGA Integrated Circuit
     –   What's Inside the FPGA?

[Figure: FPGA chip with IO Pins]

                                         Dr. J. M. Emmert
                                   Dept of Electrical Engineering
                        Generic FPGA - Architecture
•   Two Abstract Layers
     –   Programmable Logic Layer
     –   Programming Memory Layer

[Figure: FPGA drawn as two layers, labeled Prgm Logic Layer and Prgm Mem Layer]
                         Generic FPGA - Architecture
•   Two Abstract Layers
     –   Programmable Logic Layer
     –   Programming Memory Layer (memory cells)




[Figure: Prgm Logic Layer and Prgm Mem Layer; the Prgm Mem Layer is labeled Memory Cells (MCs)]
                        Generic FPGA - Architecture
•   Two Abstract Layers
     –   Programming Memory Layer (cont …)




[Figure: Prgm Mem Layer, labeled Memory Cells (MCs)]
Basic (Generic) Programming Technologies
•   How does the programming work?
•   SRAM example
     – SRAM MCs used for LUTs
     – SRAM cells used for CBs and SBs to hold switch values

[Figure: a PLB with LUT cells 0-3 and a CB, with switches SW0-SW3, next to a column of SRAM MCs for programming the FPGA switches and LUT cells. The column has an input labeled W.]

Column of SRAM MCs, top to bottom:
     MC   LUT cell 3
     MC   LUT cell 2
     MC   LUT cell 1
     MC   LUT cell 0
     MC   SW0
     MC   SW1
     MC   SW2
     MC   SW3
     …
Basic (Generic) Programming Technologies
•   SRAM example (cont.): the same column with example values stored in the MCs

     MC   Value   Programs
     MC   1       LUT cell 3
     MC   0       LUT cell 2
     MC   0       LUT cell 1
     MC   0       LUT cell 0
     MC   1       SW0
     MC   0       SW1
     MC   0       SW2
     MC   1       SW3
     …

[Figure: the PLB, CB, and switches SW0-SW3 annotated with the stored values]
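
As a rough illustration of the column above, the following Python sketch treats the eight MC values as a list and reads them back as LUT contents and switch states. The list layout, the name `lut_output`, the convention that inputs (I1, I0) select LUT cell 2*I1 + I0, and the reading of 1 as "switch closed" are all assumptions made for illustration; the slide does not state them.

```python
# Values held by the column of SRAM memory cells (MCs), top to bottom,
# as shown on the slide: LUT cells 3..0, then switches SW0..SW3.
mc_column = [1, 0, 0, 0, 1, 0, 0, 1]

# Assumed layout: the first four MCs hold LUT cells 3, 2, 1, 0.
lut_cells = {3: mc_column[0], 2: mc_column[1], 1: mc_column[2], 0: mc_column[3]}

# Assumed layout: the next four MCs hold SW0..SW3 (assumed: 1 = closed).
switches = {f"SW{i}": bool(mc_column[4 + i]) for i in range(4)}


def lut_output(i1: int, i0: int) -> int:
    """Return the LUT output, assuming inputs (i1, i0) select cell 2*i1 + i0."""
    return lut_cells[2 * i1 + i0]


if __name__ == "__main__":
    for i1 in (0, 1):
        for i0 in (0, 1):
            print(f"I1={i1} I0={i0} -> {lut_output(i1, i0)}")
    print(switches)
```

With these values only LUT cell 3 holds a 1, so under the assumed indexing the LUT behaves like a 2-input AND.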
Basic (Generic) Programming Technologies
•   SRAM example (cont.)

[Figure: the column of SRAM MCs for programming the FPGA switches and LUT cells, with input W; the MCs are grouped and labeled "Memory Frame".]
Basic (Generic) Programming Technologies
•   SRAM example (cont.)

[Figure: the Memory Frame (column of SRAM MCs with input W) => Symbol for one Memory Frame: a box labeled "Column" with input W.]
NOTE: in the symbol the outputs are not shown
Basic (Generic) Programming Technologies
•   Frame Based Programming
     –   All frame inputs for a given row are tied together and to the FB output
     –   Column address decoder selects the frame to write with information in the frame buffer
     –   Each column is made up of one or more memory frames
          •   NOTE: may have more than one frame / column
     –   Column/frame decoder picks which frame to program with the current frame buffer data
     –   Frame buffer is a register
          •   Serial load
          •   Parallel output

[Figure: a serial string feeds the Frame Buffer; the frame buffer outputs run across the rows of the columns; the column address decoder drives the W input of each column.]
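
The "serial load, parallel output" behaviour of the frame buffer can be sketched as a small shift register. This is a minimal model, not the circuit from the slides: the class name `FrameBuffer`, the method names, and the choice that a new bit enters at position 0 are assumptions for illustration.

```python
from __future__ import annotations


class FrameBuffer:
    """Serial-in, parallel-out register (hypothetical model of the frame buffer)."""

    def __init__(self, width: int) -> None:
        self.bits = [0] * width

    def shift_in(self, bit: int) -> None:
        """Shift one bit in at position 0; the oldest bit falls off the end."""
        self.bits = [bit] + self.bits[:-1]

    def parallel_out(self) -> list[int]:
        """Return all stored bits at once, one per row."""
        return list(self.bits)


fb = FrameBuffer(4)
for b in [1, 0, 1, 1]:  # serial string, first bit sent first
    fb.shift_in(b)
print(fb.parallel_out())
```

In this model the last bit shifted in ends up at position 0, so the serial string appears reversed on the parallel outputs.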
Basic (Generic) Programming Technologies
•   Frame Based Programming (cont.)
     –   Frame buffer is a register => loaded with values to write to addressed frame
          •   Serial load
          •   Parallel output

[Figure: serial string, Frame Buffer, column address decoder, and four columns, each with a W input]
Basic (Generic) Programming Technologies
•   Example for FPGA with four programming memory frames and N memory cells in each frame

[Figure: the serial string feeds the Frame Buffer (positions 0, 1, 2, ..., N-1), whose outputs feed four columns; a column address decoder with a 2-bit input and outputs 0-3 drives the W input of each column.]

•   For this example we only have four programming memory frames, so we need two address bits to pick the right frame
•   To program each frame we need 2 address bits + N programming bits = N+2 bits
•   Thus the frame buffer must hold N+2 bits, where the first two bits are for the address decoder and the next N bits are for the memory frame
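
The N+2 count above can be generalized in a short sketch. The generalization to an arbitrary number of frames (address bits = ceil(log2(number of frames))) is standard binary addressing rather than something stated on the slide, and the function names are hypothetical.

```python
def address_bits(num_frames: int) -> int:
    """Bits needed to give each of num_frames frames a distinct binary address."""
    return (num_frames - 1).bit_length()


def frame_buffer_bits(num_frames: int, cells_per_frame: int) -> int:
    """Frame buffer width: address bits plus one bit per memory cell in a frame."""
    return address_bits(num_frames) + cells_per_frame


# The slide's example: four frames, N cells per frame -> N + 2 bits.
N = 100  # arbitrary value chosen for the demo
print(address_bits(4), frame_buffer_bits(4, N))
```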
Basic (Generic) Programming Technologies
•   Example for FPGA with four programming memory frames and N memory cells in each frame

[Figure: same structure as before; the serial string is shown as 0, 0, f MC0, f MC1, ..., f MCN-1.]

•   To program the first frame we shift in N frame programming bits then “00” to address the first frame
•   Then we assert the write signal to store the N programming bits in frame 0
Basic (Generic) Programming Technologies
•   Example for FPGA with four programming memory frames and N memory cells in each frame

[Figure: same structure as before; the serial string is shown as 0, 1, f MC0, f MC1, ..., f MCN-1.]

•   To program the second frame we shift in N frame programming bits then “01” to address the second frame
•   Then we assert the write signal to store the N programming bits in frame 1
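
The two programming steps above can be sketched end to end. This is a simplified model under stated assumptions: N is set to 4 for the demo, the function name `program_frame` is hypothetical, and the bit order (frame bits sent last-cell-first, address sent low bit first, so the buffer reads address then MC0..MCN-1) is one choice that matches the order in which the slides say the bits are shifted in.

```python
N = 4           # memory cells per frame (small value chosen for the demo)
NUM_FRAMES = 4  # four frames -> 2 address bits


def program_frame(frames, frame_bits, address):
    """Shift N frame bits, then 2 address bits, into the buffer; then write."""
    buffer = [0] * (N + 2)
    addr_bits = [(address >> 1) & 1, address & 1]  # e.g. 1 -> "01"
    # Frame bits go in first and address bits last, as on the slides.
    # Assumed ordering: each group is sent in reverse so the buffer ends up
    # as [a1, a0, MC0, MC1, ..., MC(N-1)].
    serial = list(reversed(frame_bits)) + list(reversed(addr_bits))
    for bit in serial:
        buffer = [bit] + buffer[:-1]       # serial load, one bit per step
    selected = 2 * buffer[0] + buffer[1]   # column address decoder
    frames[selected] = buffer[2:]          # write signal W asserted
    return buffer


frames = [[0] * N for _ in range(NUM_FRAMES)]
program_frame(frames, [1, 0, 1, 1], 0)  # first frame, address "00"
program_frame(frames, [0, 1, 1, 0], 1)  # second frame, address "01"
print(frames)
```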
       Basic (Generic) Programming Technologies
        • Example for Virtex4
Xilinx ASCII Bitstream
Created by Bitstream I.34
Design name: Toplevel.ncd
Architecture: virtex4
Part:          4vsx35ff668
Date:          Thu Jan 29 13:28:04 2009
Bits:          13700288
11111111111111111111111111111111
10101010100110010101010101100110
00100000000000000000000000000000
00110000000000001000000000000001
00000000000000000000000000000111
00100000000000000000000000000000
00100000000000000000000000000000
00110000000000010010000000000001
00000000000001010011000111100101
00110000000000011000000000000001




       Ref: xilinx.com




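
The 32-bit lines of the ASCII bitstream are easier to read in hexadecimal. The sketch below converts the first five words shown above and looks for 0xAA995566, the synchronization word used in Xilinx configuration bitstreams. The variable names are illustrative, and no attempt is made to interpret the words that follow the sync word.

```python
# First five 32-bit words copied from the ASCII bitstream above.
words = [
    "11111111111111111111111111111111",
    "10101010100110010101010101100110",
    "00100000000000000000000000000000",
    "00110000000000001000000000000001",
    "00000000000000000000000000000111",
]

SYNC_WORD = 0xAA995566  # Xilinx configuration sync word

# Convert each binary string to an 8-digit hexadecimal string.
hex_words = [f"{int(w, 2):08X}" for w in words]

# Index of the first word equal to the sync word.
sync_index = next(i for i, w in enumerate(words) if int(w, 2) == SYNC_WORD)

print(hex_words)
print("sync word at index", sync_index)
```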
•   So far … covered basic components on generic FPGAs
     –   LBs
          •   Functional units
          •   FFs
     –   Routing network
          •   CBs
          •   SBs
          •   Wiring segments
     –   Programming technologies
          •   MCs
          •   Switches
          •   SRAM – frame example
•   Now …
     –   How to build an FPGA
          •   Architectural considerations
          1. Size / Type of LBs
               –   # LUTs/LB ?
               –   # inputs/LUT ?
          2. Interconnect
               –   W ?
               –   FC ?
               –   FS ?
     –   How to pick the right FPGA
          3. Evaluation of -or- comparison of commercial FPGAs how?
•   Later
     –   Basic generic CAD flow
     –   Specialty components
•   Questions ???

Technology   Description                                           Volatility   Reprogrammable   Area      R        C
SRAM         RAM cell stores '1' or '0'                            Y            In-ckt           Large     1-2K     10-20fF
Fuse         Break a connection                                    N            N                Small     50-500   1-5fF
Antifuse     Special “one time” transistor, Zhigh => Zlow          N            N                          50       1fF
EPROM        Electrically programmable (UV erasable)               N            Out-ckt          Small     2-4K     10-20fF
EEPROM       Electrically programmable (electrically erasable)     N            In-ckt           2xEPROM   2-4K     10-20fF

[Figure: Frame Based Programming diagram (serial string, Frame Buffer, column address decoder, columns with W inputs)]
Generic FPGA Architecture
•   How to build an FPGA (not how to use an FPGA)
     –   Architecture considerations for LBs
     –   Architecture considerations for FPGA
     –   Architecture of routing network (Fs, Fc, W)
     1.  Size of Functional Unit
          A.  Number of input bits
          B.  Number of output bits (decomposable?)
     2.  Functionality of Functional Unit
          A.  LUT
          B.  Mux
          C.  PLA
     3.  Is it advantageous to include a FF in PLB?
          A.  Or should we create FFs with functional units?
     4.  Routing Area vs. Logic Area
          A.  Size of routing area versus size of logic area?

[Figure: PLB containing a Functional Unit (Combinational Logic) with inputs I3-I0 and output Z, a D flip-flop clocked by CLK, and multiplexers with select inputs S0 and S1; outputs are labeled Y.]
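
One cost behind the "number of input bits" question can be made concrete for the LUT option. A K-input LUT stores one bit per input combination, so it needs 2^K memory cells; this matches the earlier SRAM example, where four LUT cells serve a 2-input LUT. The function name below is hypothetical, and the sketch counts only LUT cells, not routing or flip-flops.

```python
def lut_memory_cells(num_inputs: int) -> int:
    """Memory cells needed by a LUT with num_inputs inputs (one per input combination)."""
    return 2 ** num_inputs


# LUT storage doubles with every added input.
for k in range(2, 7):
    print(f"{k}-input LUT: {lut_memory_cells(k)} memory cells")
```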

References:
- Rose et al., Architecture of Programmable Gate Arrays: The Effect of LB Functionality on Area Efficiency, IEEE Journal of Solid State Circuits, Oct. 1990.
- Kouloheris et al., FPGA Area vs. Cell Granularity: LUT & PLA Cells, First ACM Workshop on FPGAs, Feb. 1992.
- Benchmark Circuits, ISCAS 1989.
- Rose et al., The Effect of LUT and Cluster Size on Deep-Submicron FPGA Performance and Density, FPGA 2000, Feb. 2000.
          Generic FPGA Architecture
1. Size of the Functional Unit
   A. # inputs for a combinational block of logic?
      •   What is the right #?
      •   Example: [Figure: PLB in which a combinational functional unit (output Z), a D flip-flop, and a 2:1 mux produce output Y; pins I3-I0, S0, S1, CLK]

   B. # outputs for a combinational unit?
      •   What is the right #?
      •   Decomposable?
      •   Example: [Figure: combinational functional unit with an open number (?) of inputs and outputs]

      Generic FPGA Architecture
A. Number of Inputs for combinational
  functional unit
• Example: f = abd + bcd' + a'b'c'
     – Given 2LUTs

     Each row is one 2LUT. Contents are listed for addresses 3, 2, 1, 0;
     the first input listed is the high address bit.

     LUT   Inputs                      3  2  1  0   Output
     1     a, b                        1  0  0  0   a●b
     2     a●b, d                      1  0  0  0   a●b●d
     3     b, c                        1  0  0  0   b●c
     4     b●c, d                      0  1  0  0   b●c●d'
     5     a, b                        0  0  0  1   a'●b'
     6     a'●b', c                    0  1  0  0   a'●b'●c'
     7     a●b●d, b●c●d'               1  1  1  0   a●b●d + b●c●d'
     8     a●b●d + b●c●d', a'●b'●c'    1  1  1  0   a●b●d + b●c●d' + a'●b'●c'

     10+7 = 17 signal connections
     8*4 = 32 MCs
     4 levels of logic

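The 2LUT mapping above can be checked by simulation. The following is a minimal sketch, assuming the first input of each LUT is the high address bit; the helper names (`lut`, `f_2lut`, `f_reference`) are illustrative and do not come from the slides.

```python
from itertools import product

def lut(contents, *inputs):
    """Read a LUT: contents[i] is the bit stored at address i; the first input is the MSB."""
    addr = 0
    for bit in inputs:
        addr = (addr << 1) | bit
    return contents[addr]

# Memory-cell contents listed from address 0 upward (the slide lists 3 down to 0).
AND2 = [0, 0, 0, 1]      # x AND y
AND_NOT2 = [0, 0, 1, 0]  # x AND (NOT y)
NOR2 = [1, 0, 0, 0]      # (NOT x) AND (NOT y)
OR2 = [0, 1, 1, 1]       # x OR y

def f_2lut(a, b, c, d):
    """Eight cascaded 2LUTs, four levels deep."""
    t1 = lut(AND2, a, b)       # a.b
    t2 = lut(AND2, t1, d)      # a.b.d
    t3 = lut(AND2, b, c)       # b.c
    t4 = lut(AND_NOT2, t3, d)  # b.c.d'
    t5 = lut(NOR2, a, b)       # a'.b'
    t6 = lut(AND_NOT2, t5, c)  # a'.b'.c'
    t7 = lut(OR2, t2, t4)      # a.b.d + b.c.d'
    return lut(OR2, t7, t6)    # f

def f_reference(a, b, c, d):
    """f = abd + bcd' + a'b'c' evaluated directly."""
    return int((a and b and d) or (b and c and not d) or (not a and not b and not c))

# Compare the LUT network against the expression for all 16 input combinations.
matches = all(f_2lut(*v) == f_reference(*v) for v in product((0, 1), repeat=4))
memory_cells = 8 * 2**2  # eight LUTs, four MCs each
print(matches, memory_cells)
```
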
             Generic FPGA Architecture
    • Example: f = abd + bcd' + a'b'c'
         – Given 3LUTs

     Each row is one 3LUT. Contents are listed for addresses 7 down to 0;
     the first input listed is the high address bit.

     LUT   Inputs                        7  6  5  4  3  2  1  0   Output
     1     a, b, d                       1  0  0  0  0  0  0  0   a●b●d
     2     b, c, d                       0  1  0  0  0  0  0  0   b●c●d'
     3     a, b, c                       0  0  0  0  0  0  0  1   a'●b'●c'
     4     a●b●d, b●c●d', a'●b'●c'       1  1  1  1  1  1  1  0   f

     10+3 = 13 signal connections
     4*8 = 32 MCs
     2 levels of logic

              Generic FPGA Architecture
    • Example: f = abd + bcd' + a'b'c'
          – Given 4LUTs

     One 4LUT with inputs a, b, c, d (a is the high address bit) and output f:

     Address    15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
     Contents    1  1  1  0  0  0  0  0  0  1  0  0  0  0  1  1

     5 signal connections
     16 MCs
     1 level of logic

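The 4LUT contents can be derived mechanically from the expression for f. This is a small sketch, assuming a is the most significant address bit; `lut_contents` is a hypothetical helper name, not something defined in the slides.

```python
def f(a, b, c, d):
    """f = abd + bcd' + a'b'c'"""
    return int((a and b and d) or (b and c and not d) or (not a and not b and not c))

def lut_contents(func, k):
    """Return the 2**k memory-cell bits; the first argument of func is the high address bit."""
    contents = []
    for addr in range(2**k):
        bits = [(addr >> (k - 1 - i)) & 1 for i in range(k)]
        contents.append(func(*bits))
    return contents

contents = lut_contents(f, 4)
ones = [addr for addr, bit in enumerate(contents) if bit]
print(ones)  # addresses whose memory cell stores a 1
```

The addresses that hold a 1 should agree with the 4LUT contents shown for this example.
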
   Generic FPGA Architecture
• Example: f = abd + bcd' + a'b'c'
  – Given 5LUTs

     One 5LUT with inputs '0', a, b, c, d (the high address bit is tied to '0') and output f:

     Addresses 31-16: x (don't care)
     Address    15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
     Contents    1  1  1  0  0  0  0  0  0  1  0  0  0  0  1  1

     5 signal connections
     32 MCs
     1 level of logic

       Generic FPGA Architecture
• Example: f = abd + bcd' + a'b'c'

            nLUT            # MCs            # connections   # levels
               2              32                     17         4
               3              32                     13         2
               4              16                      5         1
               5              32                      5         1

•   Fewer inputs => more signal nets between functional units (routing)
•   Fewer inputs => more levels of logic (delay)
•   More inputs (>4) => wasted area for the example
•   More inputs => more area per functional unit
     – Functional units are located further apart
        => Longer connections between functional units (delay and routing)

   Generic FPGA Architecture
B. Number of outputs for functional unit
• Example: Multiple Output kLUT
  – How do we implement
        • Z1 = A ● B ● C
        • Z2 = A ⊕ B ⊕ C

     With two separate single-output 3LUTs (inputs A, B, C on each):

     Address   7  6  5  4  3  2  1  0
     Z1        1  0  0  0  0  0  0  0
     Z2        1  0  0  1  0  1  1  0

   Generic FPGA Architecture
• Example: Multiple Output kLUT
  – How do we implement
        • Z1 = A ● B ● C
        • Z2 = A ⊕ B ⊕ C

     With one two-output 3LUT (inputs A, B, C; two MCs per address):

     Address   7  6  5  4  3  2  1  0
     Z1        1  0  0  0  0  0  0  0
     Z2        1  0  0  1  0  1  1  0

     Both outputs share common internal addressing
     All input variables must be the same
        - Reduced number of routed input signals
     More than one output
        - Increased number of output signals that must be routed
     Less routing outside of LB
     Increased number of MCs per LUT
     More functionality per LB

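A multiple-output LUT can be modeled as a memory that stores M bits per address behind a single shared address decoder. The sketch below is illustrative only: the class name `MultiOutputLUT` is hypothetical, and Z2 is taken to be the three-input XOR, which is what the Z2 column of memory cells encodes.

```python
class MultiOutputLUT:
    """k-input LUT storing M bits per address, so all M outputs share one address."""

    def __init__(self, k, rows):
        assert len(rows) == 2**k
        self.k = k
        self.rows = rows  # rows[addr] is a tuple of M output bits

    def read(self, *inputs):
        """Return all M outputs for the given inputs (first input is the high address bit)."""
        addr = 0
        for bit in inputs:
            addr = (addr << 1) | bit
        return self.rows[addr]

    @property
    def memory_cells(self):
        return len(self.rows) * len(self.rows[0])

# Program Z1 = A AND B AND C and Z2 = A XOR B XOR C into one 3-input, 2-output LUT.
rows = []
for addr in range(8):
    a, b, c = (addr >> 2) & 1, (addr >> 1) & 1, addr & 1
    rows.append((a & b & c, a ^ b ^ c))
lut = MultiOutputLUT(3, rows)

# Columns listed from address 7 down to 0, as on the slide.
z1_column = [row[0] for row in reversed(rows)]
z2_column = [row[1] for row in reversed(rows)]
print(z1_column, z2_column, lut.memory_cells)
```

Only three input signals are routed to the block, but two output signals leave it, and the MC count doubles relative to a single-output 3LUT.
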
             Generic FPGA Architecture
 • Example: Multiple Output Decomposable kLUT
         –   Smaller LUTs can be combined to form larger LUTs
         –   Single output (M = 1) with k inputs -or-
         –   Two outputs (M = 2) with k-1 inputs -or-
         –   Three outputs (M = 3) with k-2 inputs -or- …

     [Figure: four 4-MC sub-LUTs (addresses 3-0) stacked top to bottom. The top
      sub-LUT drives O3. A 2:1 mux controlled by S2 selects between the top two
      sub-LUTs and drives O2. The third sub-LUT drives O1. A 2:1 mux controlled by
      S1 selects between the bottom two sub-LUTs. A final 2:1 mux controlled by S0
      selects between the two mux outputs and drives O0.]

     Example: 16 MC Decomposable 4LUT

     M = 4 with k = 2 (S2 = S1 = S0 = '1')
         Use O3, O2, O1, and O0
         Sub-LUT address inputs, top to bottom: (I7, I6), (I5, I4), (I3, I2), (I1, I0)
                    -or-
     M = 2 with k = 3 (S0 = '1', S1 = I0, and S2 = I3)
         Use O2 and O0
         Sub-LUT address inputs, top to bottom: (I5, I4), (I5, I4), (I2, I1), (I2, I1)
                    -or-
     M = 1 with k = 4 (S1 = S2 = I1 and S0 = I0)
         Use O0 only
         Sub-LUT address inputs: (I3, I2) on all four sub-LUTs

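The three operating modes can be explored with a small behavioral model. This is a sketch under stated assumptions: the mux polarity (select = '1' picks the lower input) is inferred from the select settings listed for each mode, and the names `mux`, `decomposable_lut`, and `program_k4` are hypothetical.

```python
from itertools import product

def mux(sel, in0, in1):
    """2:1 mux; sel = 1 picks in1 (assumed to be the lower input in the figure)."""
    return in1 if sel else in0

def decomposable_lut(sub, addr_pairs, s2, s1, s0):
    """sub: four 4-bit lists, top to bottom. addr_pairs: four (msb, lsb) address inputs.
    Returns (O3, O2, O1, O0)."""
    r = [cells[(msb << 1) | lsb] for cells, (msb, lsb) in zip(sub, addr_pairs)]
    o3 = r[0]
    o2 = mux(s2, r[0], r[1])
    o1 = r[2]
    lower = mux(s1, r[2], r[3])
    o0 = mux(s0, o2, lower)
    return o3, o2, o1, o0

def program_k4(func):
    """Split a 4-input function g(i3, i2, i1, i0) across the sub-LUTs for the M=1, k=4 mode."""
    sub = []
    for i0, i1 in ((0, 0), (0, 1), (1, 0), (1, 1)):
        sub.append([func(addr >> 1, addr & 1, i1, i0) for addr in range(4)])
    return sub

def f(a, b, c, d):
    """f = abd + bcd' + a'b'c' from the earlier example."""
    return int((a and b and d) or (b and c and not d) or (not a and not b and not c))

# M=1, k=4: all sub-LUTs addressed by (I3, I2); S2 = S1 = I1 and S0 = I0; use O0 only.
sub = program_k4(f)
k4_ok = all(
    decomposable_lut(sub, [(i3, i2)] * 4, s2=i1, s1=i1, s0=i0)[3] == f(i3, i2, i1, i0)
    for i3, i2, i1, i0 in product((0, 1), repeat=4)
)

# M=4, k=2: four independent 2-input functions (AND, OR, XOR, NAND), all selects tied to '1'.
sub4 = [[0, 0, 0, 1], [0, 1, 1, 1], [0, 1, 1, 0], [1, 1, 1, 0]]
outs = decomposable_lut(sub4, [(1, 1), (0, 0), (1, 0), (1, 1)], s2=1, s1=1, s0=1)
print(k4_ok, outs)
```
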
   Generic FPGA Architecture
A & B. # LB IO pins
  – More pins => more routing resources between PLBs

     [Figure: two functional units, each beside a routing channel of width W]

  – More pins => more logic per LB => fewer connecting signal nets
    between functional units for the same amount of mapped logic
       • Performance: # levels of logic
  –   Summary
       •   More IO / functional unit => more logic mapped / functional unit
       •   More IO / functional unit => fewer signal nets required between functional units for the same amount of
           combinational logic
       •   More IO / functional unit => fewer levels of logic in implementation (less delay)
       •   More IO / functional unit => more routing resources to connect LBs
       •   More IO / functional unit => larger functional units => functional units are further apart (more delay)
       •   More IO / functional unit => unused MCs when functions are smaller than the functional unit

     Generic FPGA Architecture


2. Type of Functional Unit
   Functionality ≡ # of different Boolean logic functions that a
     functional unit can implement
   – What affects functionality?
       • Type of element
                   Mux
                   LUT
                   PLA NAND-NAND (AND-OR)
       • # inputs
       • LB Internal routing structure



      Generic FPGA Architecture
     eg: k=3, N=3, M=1 NAND-NAND PAL Structure

     [Figure: NAND-NAND PAL with inputs I2, I1, I0 (A, B, C) and one output Z]

                               A + B + C = Z ?
                               AB + BC + AC = Z ?
                               A ⊕ B ⊕ C = Z ?
                                   = (A'B + AB')'C + (A'B + AB')C' ?
                                   = (A'B'C + ABC + A'BC' + AB'C') ?

     3LUT contents for Z = A ⊕ B ⊕ C (inputs A, B, C):

     Address    7  6  5  4  3  2  1  0
     Contents   1  0  0  1  0  1  1  0

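The three questions can be explored by brute force. A NAND-NAND array computes a sum of products, so the sketch below enumerates every function that is an OR of at most N = 3 product terms over A, B, C. It assumes each first-level gate may use any subset of true or complemented inputs and that unused gates can be disabled; the function names are illustrative.

```python
from itertools import combinations, product

VARS = 3
MINTERMS = list(product((0, 1), repeat=VARS))  # (A, B, C) rows, 000 to 111

def cube_table(cube):
    """Truth table of one product term; cube[i] is 1 (true literal), 0 (complemented), None (absent)."""
    return tuple(
        int(all(lit is None or lit == val for lit, val in zip(cube, row)))
        for row in MINTERMS
    )

# All 27 product terms over three variables, including the constant-1 term.
cubes = {cube_table(c) for c in product((0, 1, None), repeat=VARS)}

def sop_functions(max_terms):
    """All truth tables expressible as an OR of at most max_terms product terms."""
    found = {tuple([0] * len(MINTERMS))}
    for n in range(1, max_terms + 1):
        for terms in combinations(cubes, n):
            found.add(tuple(int(any(col)) for col in zip(*terms)))
    return found

def table(func):
    return tuple(int(func(*row)) for row in MINTERMS)

reachable = sop_functions(3)
or3 = table(lambda a, b, c: a | b | c)
maj3 = table(lambda a, b, c: (a & b) | (b & c) | (a & c))
xor3 = table(lambda a, b, c: a ^ b ^ c)
print(or3 in reachable, maj3 in reachable, xor3 in reachable)
```

Under these assumptions the OR and majority functions fit in three product terms, while the three-input XOR needs four, so it does not fit this PAL even though a single 3LUT holds it.
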
         Generic FPGA Architecture
     eg: k=3, N=3, M=1 NAND-NAND PAL Structure

     [Figure: inputs I2, I1, I0 (A, B, C) feed first-level NAND gates F1, F2, and F3,
      which feed the second-level NAND with output Z]

     1st level NAND functions (F1, F2, and F3)
         output a '1' => 1 (combinations of no var)
         output a '0' => 1
         output A or A' => 2 (combinations of one var)
         output B or B' => 2
         output C or C' => 2
         combinations of A NAND B => 4 (comb of two var)
         combinations of B NAND C => 4
         combinations of A NAND C => 4
         combinations of A NAND B NAND C => 8 (comb of three)
                      total => 28 functions
     2nd level NAND
         F1', F1 NAND F2, F1 NAND F2 NAND F3
                      => 3 functions of 1st level
     => 84 functions

            Generic FPGA Architecture
        eg: k=3 input LUT (3LUT)

     Inputs A, B, C; output Z1. Example MC programmings:

     Address     7  6  5  4  3  2  1  0
     Contents    0  0  0  0  0  0  0  0
     Contents    1  0  0  0  0  0  0  0
        …
     Contents    1  1  1  1  1  1  1  1

              2^k = 2^3 = 8 MCs
              2^8 = 256 combinations (00000000 to 11111111) to program the cells

              In general?

              How about if we have a new transistor with 3 states instead of 2 (binary)?

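The count generalizes directly: a kLUT has 2^k memory cells, and with binary cells there are 2^(2^k) ways to program them. The sketch below also includes a `states_per_cell` parameter as one possible reading of the three-state question, assuming the addressing stays binary and only the stored value gains a third state; the function names are illustrative.

```python
def memory_cells(k):
    """Number of memory cells in a k-input LUT."""
    return 2**k

def lut_programmings(k, states_per_cell=2):
    """Number of distinct ways to program all cells of a k-input LUT."""
    return states_per_cell ** memory_cells(k)

counts = {k: lut_programmings(k) for k in range(1, 6)}
print(counts)
print(lut_programmings(3, states_per_cell=3))  # three-state cells, binary addressing (assumption)
```
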
      Generic FPGA Architecture
• Costs of Functional Unit Functionality
  – Functionality vs. Area
     • Calculation of area per unit
        –   NAND       =>           2● # inputs [transistors]
        –   INV        =>           2● # inputs [transistors]
        –   2:1 Mux    =>           2 [transistors]
        –   kLUT       =>           2^k ● size of MC + Muxes + INVs [transistors]
        –   …
     • Total # units

     • In general: as Functionality ↑ Area ↑
     • In general: as Functionality ↑ # required units/system ↓
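A rough transistor estimate shows how quickly kLUT area grows with k. This sketch uses the per-gate figures listed above and adds assumptions of its own: the read path is taken to be a tree of 2^k - 1 two-input muxes with one inverter per LUT input, and the memory-cell size `mc_size` (6 here, as for a six-transistor SRAM cell) is a hypothetical parameter.

```python
def nand_transistors(num_inputs):
    """NAND gate: 2 transistors per input."""
    return 2 * num_inputs

def inv_transistors():
    """Inverter: a one-input gate, so 2 transistors."""
    return 2

def mux2_transistors():
    """2:1 mux: 2 transistors."""
    return 2

def klut_transistors(k, mc_size):
    """2^k memory cells plus an assumed mux tree and one inverter per input."""
    cells = 2**k * mc_size
    muxes = (2**k - 1) * mux2_transistors()
    inverters = k * inv_transistors()
    return cells + muxes + inverters

for k in range(2, 7):
    print(k, klut_transistors(k, mc_size=6))
```

The cell and mux terms both scale with 2^k, so under these assumptions each added input roughly doubles the unit's area.
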

          Generic FPGA Architecture
3. Is it advantageous to include a FF in PLB?
   –    Or should we create FFs with combinational functional units?

     [Figure: PLB in which a combinational functional unit (output Z), a D flip-flop,
      and a 2:1 mux produce output Y; pins I3-I0, S0, S1, CLK]

4. Routing Area vs. Logic Area
   –    Size of routing area versus size of logic area?
   –    Model for comparing routing area versus logic area:

BA ≡ bit area (size of MC)
kLUT area => M ● 2^k ● BA
   M ≡ number of outputs
LBLUT area => LUT area + FA
   FA ≡ fixed area for other circuitry

LBLUT area = M ● 2^k ● BA + FA

          Generic FPGA Architecture
3. Is it advantageous to include a FF in PLB?
   •    Or should we create FFs with combinational functional units?
4. Routing Area vs. Logic Area
   •    Size of routing area versus size of logic area?
   •    Model for comparing routing area versus logic area:

RP ≡ routing pitch
    distance between routing tracks
W ≡ # tracks or channel width                                           LB




CL ≡ sqrt(LBLUT area), the side length of the LB when it is modeled as a square




Routing Area Per Block = 2(CL ● W ● RP) + (W ● RP)^2





Total area required to map a circuit

Total Area = NLBs ● (LBLUT area + Routing Area Per Block)
   NLBs ≡ number of LBs needed to map the circuit
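
A minimal sketch of the routing and total area formulas, assuming the logic block area has already been computed. Function and parameter names are illustrative, and the numbers in the example run are made up for the arithmetic only.

```python
import math


def routing_area_per_block(lb_area: float, w: int, rp: float) -> float:
    """Routing area per block = 2 * (CL * W * RP) + (W * RP)^2."""
    cl = math.sqrt(lb_area)   # CL: side length of the (square) logic block
    channel = w * rp          # physical width of one routing channel
    return 2 * cl * channel + channel ** 2


def total_area(n_lbs: int, lb_area: float, w: int, rp: float) -> float:
    """Total area = NLBs * (LB area + routing area per block)."""
    return n_lbs * (lb_area + routing_area_per_block(lb_area, w, rp))


if __name__ == "__main__":
    # Assumed example: LB area 100, W = 5 tracks, RP = 2 -> channel width 10.
    print(routing_area_per_block(100.0, w=5, rp=2.0))   # 2*10*10 + 10^2
    print(total_area(4, 100.0, w=5, rp=2.0))
```

Note that LB area plus routing area per block equals (CL + W ● RP)^2, so each block and its share of the routing form one square tile.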





For a PLA-based functional element (k inputs, N product terms, M outputs), substitute the following equations:

Cw = max(18, sqrt(BA)) ● k + max(10, sqrt(BA)) ● M + 98

Ch = max(10, sqrt(BA)) ● N + 136

LBPLA Area = Cw ● Ch + FA ● M




                                                                       Ref: Brown
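
The PLA block equations above translate directly into code. This is a sketch: the constants 18, 10, 98, and 136 are copied from the equations, while the function name, argument names, and example values are assumptions, and the slides do not state the units.

```python
import math


def pla_block_area(k: int, n: int, m: int, bit_area: float, fixed_area: float) -> float:
    """LBPLA area = Cw * Ch + FA * M, with Cw and Ch as given above."""
    side = math.sqrt(bit_area)
    # Cw: one column per input and one per output, plus a constant overhead.
    cw = max(18.0, side) * k + max(10.0, side) * m + 98.0
    # Ch: one row per product term, plus a constant overhead.
    ch = max(10.0, side) * n + 136.0
    return cw * ch + fixed_area * m


if __name__ == "__main__":
    # Assumed example: k = 8 inputs, N = 12 product terms, M = 3 outputs.
    print(pla_block_area(k=8, n=12, m=3, bit_area=100.0, fixed_area=10.0))
```

For small bit areas the max() terms clamp to 18 and 10, so in that range the block size is independent of BA.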
                                Review
• So far … Functionality vs. Area
    –   LB Functionality ↑ => # LBs required for a circuit or system implementation ↓
    –   LB Functionality ↑ => Total # LB connections ↓ (fewer signal nets)
    –   LB Functionality ↑ => Distance between connected LBs ↑
    –   LB Functionality ↑ => LB area ↑
    –   LB Functionality ↑ => IO pins / LB ↑
    –   LB Functionality ↑ => External Routing Resources ↑
    –   Is there an optimum Functionality vs. Area point?

• Consider FPGA Area
    – FPGA Area ≡ total LB area + total Routing area
               =     10-30% + 70-90%

• How do we decide the right balance between LB area and Routing area?
    –   What is the right W?
    –   What is the right k for kLUTs?
    –   What is the right functionality?
    –   What is the right FC?
    –   What is the right FS?

       Generic FPGA Architecture
• Solution: Evaluate the area required to map several benchmark (BM) circuits
   – Representative of the circuits that will use FPGAs


• Definitions
   – Sparse circuits:            # signal nets ~ # circuit components
   – Dense circuits:             # signal nets ~ (# circuit components)^2




         Generic FPGA Architecture
• Studies looked at five types of logic elements
[Figure: functional unit (combinational logic) with output Z, a D flip-flop clocked by CLK, and a multiplexer driving block output Y]
1. Single output kLUT
2. Multiple output kLUT
    [Figure: 3-input LUT (inputs A, B, C) with two outputs, Z1 and Z2]
3. Multiple output (M) decomposable kLUT
    [Figure: decomposable LUT with select inputs S0-S2 and outputs O0-O3]
4. PLA based block
    – k inputs
    – N product terms
    – M outputs
5. Dedicated FF?
             Generic FPGA Architecture
• Steps for implementing benchmark circuits on experimental FPGA
  architectures for study
    1.   Technology Mapping
         •   Benchmark circuit => netlist of logic cell functions
         •   Each logic cell function must fit the functional element being tested
             –     E.g., a four-variable function with one output must be divided to fit in 3-LUTs

                  A(6)   B(6)   Co(6)   S(6)
                   0      0      0       0
                   0      0      1       1
                   0      1      0       1
                   0      1      1       0
                   1      0      0       1
                   1      0      1       0
                   1      1      0       0
                   1      1      1       1

    2.   Placement

         •   Each logic cell function is assigned to a specific functional unit in the FPGA layout

         [Figure: logic cell functions 1-8 assigned to positions in a grid of functional units]

    3.   Global Route
         •   Selects a path through the routing channels for each signal net


         •   The number of signal nets through any given channel cannot exceed the channel width (W)
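
The technology-mapping example in step 1 (a four-variable, single-output function that must fit in 3-LUTs) can be sketched with a Shannon expansion on the fourth input. Shannon expansion is one possible decomposition chosen here for illustration, not necessarily what the studies' mapper used; all function names are hypothetical.

```python
from itertools import product


def shannon_split(f4):
    """Split f4(a, b, c, d) into three 8-entry 3-LUT truth tables.

    lut_lo holds f4 with d = 0, lut_hi holds f4 with d = 1, and
    lut_mux selects between the two results using d.
    """
    lut_lo = [f4(a, b, c, 0) for a, b, c in product((0, 1), repeat=3)]
    lut_hi = [f4(a, b, c, 1) for a, b, c in product((0, 1), repeat=3)]
    # The multiplexer is itself a 3-input function of (lo, hi, d).
    lut_mux = [hi if d else lo for lo, hi, d in product((0, 1), repeat=3)]
    return lut_lo, lut_hi, lut_mux


def lut3(table, x, y, z):
    """Look up one bit in an 8-entry table; x is the most significant index bit."""
    return table[(x << 2) | (y << 1) | z]


def evaluate(luts, a, b, c, d):
    """Evaluate the mapped network of three 3-LUTs."""
    lut_lo, lut_hi, lut_mux = luts
    return lut3(lut_mux, lut3(lut_lo, a, b, c), lut3(lut_hi, a, b, c), d)


if __name__ == "__main__":
    parity4 = lambda a, b, c, d: a ^ b ^ c ^ d   # example 4-variable function
    luts = shannon_split(parity4)
    ok = all(evaluate(luts, *bits) == parity4(*bits)
             for bits in product((0, 1), repeat=4))
    print("mapped network matches original:", ok)
```

This generic split always uses three 3-LUTs; a real mapper would look for decompositions that need fewer blocks for a particular function.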
       Generic FPGA Architecture
• Results
1. Single output kLUT
   –   Logic area      k ↑ => #LBs ↓
                       k ↑ => area per LB ↑
   –   Routing area    k ↑ => W ↑
   –   k = 4 gives the best area, independent of BA


   Ref: Rose 90
       Generic FPGA Architecture
2. Multi-output kLUT?
• Multiple outputs are a poor choice because they increase chip area
• M = 1 => smallest FPGA chip area




                  Generic FPGA Architecture
  3. Multi-output decomposable kLUT?
  • M = 1, 2, 4, and 8 outputs were calculated
  • Required more routing area due to k
  • More flexible (fewer blocks were required)
  • Best area at M = 4
  Ref: Rose
  [Figure: two plots against M = 1, 2, 4, 8: number of blocks (axis 500 to 1100) and total area (axis 140000 to 220000)]

       Generic FPGA Architecture
4. PLA based block
• 8 to 10 inputs
• 3 to 4 outputs
• 12-13 product terms




   Ref: Kouloheris 90
       Generic FPGA Architecture
5. Dedicated FF? YES!
• Typically, the number of LBs required doubled when FFs were not available
• k = 4 still best number of inputs




   Ref: Rose 89,90
            Generic FPGA Architecture
• Summary for the study
   –   A 4-LUT is the best choice for a single output
   –   LBs with high functionality per pin are more area efficient
   –   A single output is most area efficient for non-decomposable LUTs
   –   M = 4 is best for decomposable LUTs
   –   For PLA, k = 8-10, M = 3-4, N = 12-13
   –   Dedicated FF is advantageous
• Designing your own FPGA? General approach to designing an FPGA
   –   What is the best architecture?
   –   Pick representative BM circuits
   –   Vary FPGA design parameters (W, k, M, …)
   –   Optimally tech map
   –   Optimally place
   –   Determine the minimum W to completely route
   –   Look at design metrics (area, …)
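
The "minimum W to completely route" step can be sketched for a fixed global routing. The assumptions here are that each net occupies one track in every channel it passes through, and that the global route is given as a mapping from net name to the channels it uses; the data layout and all names are hypothetical. A real flow would also re-route nets to lower the peak, which this sketch does not attempt.

```python
from collections import Counter


def channel_occupancy(routes):
    """Count how many nets pass through each channel.

    routes: dict mapping net name -> iterable of channel identifiers.
    """
    occupancy = Counter()
    for channels in routes.values():
        # Assumed: a net needs one track per channel, however often it is listed.
        occupancy.update(set(channels))
    return occupancy


def min_channel_width(routes):
    """Smallest W for which no channel carries more nets than it has tracks."""
    occupancy = channel_occupancy(routes)
    return max(occupancy.values(), default=0)


def fits(routes, w):
    """True if the given global routing is legal for channel width w."""
    return min_channel_width(routes) <= w


if __name__ == "__main__":
    # Made-up example: three nets sharing channel "c1".
    example = {"n1": ["c0", "c1"], "n2": ["c1", "c2"], "n3": ["c1"]}
    print(min_channel_width(example), fits(example, 2))
```

The resulting W feeds back into the routing area term of the area model, so the sweep over k and M can compare total area across architectures.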