On Relevance of Wire Load Models


Explicit Modeling of Control and Data for Improved NoC Router Estimation

Andrew B. Kahng+*, Bill Lin* and Siddhartha Nath+
UCSD CSE+ and ECE* Departments
{abk, billlin, sinath}@eng.ucsd.edu

Outline
•   Motivation
•   Our work: Overview
•   Methodology
•   Flit-level power estimation
•   Summary




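NoC Modeling So Far… (ORION)
[Figure: router block diagram with SRC, input links, input buffers (BUF I, BUF E, BUF W, BUF N, BUF S), arbiter, XBAR, output links and SINK. Timeline: ORION1.0 (2002) and ORION2.0 (2009), each shown with the logic template 6NOR + 2INV + DFF; labels next to ORION2.0: leakage power, clock power.]
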
What Is The Problem?
[Figure: router block diagram (SRC, links, BUF I/E/W/N/S, arbiter, XBAR, SINK), annotated with the logic template 6NOR + 2INV + DFF.]
• RTL code mismatch
• Logic transformation and technology mapping mismatch

How Bad Is It?
Router RTL generators:
Netmaker – Cambridge, UK
Stanford NoC – Stanford

[Figure: two plots of instance count for ORION2.0, NetMaker and Stanford, one against # ports (5, 6, 8, 10) and one against flit-width (16, 24, 32, 64 bits); annotated errors: 460% and 89%.]

Why such large errors?
• Assumed logic template inaccurate
• Control logic not modeled
• Implementation details missing

We Propose: Step 1
• Derive router component block parametric models from post-synthesis netlists

Parameters: P = # ports, V = # VCs, B = # BUFs, F = flit-width

Sweep of F (# instances ~ $F$):
      P    V   B   F    # Instances
      5    2   8   16   400
      5    2   8   32   825
      5    2   8   64   1673

Sweep of P (# instances ~ $P^2$):
      P    V   B   F    # Instances
      5    2   8   32   825
      8    2   8   32   2112
      10   2   8   32   3300

Combined: XBAR ~ $P^2 F$

• Key idea: No assumed logic template
• Component models derived from actual RTL synthesized with cell libraries

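The scaling trends quoted on this slide can be checked directly from the tabulated instance counts. The sketch below is only an illustration: it estimates the exponent of each parameter as the least-squares slope on a log-log scale. The helper name `loglog_slope` is hypothetical and not part of the authors' flow.

```python
import math

# Instance counts copied from the slide (V = 2, B = 8 throughout).
f_sweep = [(16, 400), (32, 825), (64, 1673)]   # P = 5, flit-width F varies
p_sweep = [(5, 825), (8, 2112), (10, 3300)]    # F = 32, port count P varies


def loglog_slope(points):
    """Least-squares slope of log(count) against log(parameter).

    A slope near k suggests count ~ parameter**k.
    (Hypothetical helper, used only for this illustration.)
    """
    xs = [math.log(x) for x, _ in points]
    ys = [math.log(y) for _, y in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


f_exponent = loglog_slope(f_sweep)
p_exponent = loglog_slope(p_sweep)
print(f"estimated exponent in F: {f_exponent:.2f}")
print(f"estimated exponent in P: {p_exponent:.2f}")
```

An exponent close to 1 in F and close to 2 in P is consistent with the $P^2 F$ form stated on the slide.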
We Propose: Step 2
• Automatic fitting of models with post-P&R power and area

XBAR ~ $P^2 F$; LSQR fit: $\mathrm{XBAR}_{area} = a_1 \cdot P^2 F + a_0$

  P   V   B   F    Area
  5   2   8   16   1439.9
  5   2   8   32   2916.0
  5   2   8   64   5867.4
  8   2   8   32   7465.1

• Key idea: Capture implementation details using automatic regression fit
• Characterization performed only once and usable for multiple design space explorations
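
As a minimal sketch of the fitting step, the four area data points on this slide can be fitted to $a_1 \cdot P^2 F + a_0$. The slide names LSQR as the solver. The code below substitutes a closed-form ordinary least-squares line fit, which is an assumption made to keep the example dependency-free. The function names `fit_line` and `xbar_area` are hypothetical, and the area unit is not stated in the source.

```python
# (P, V, B, F, post-P&R area) as listed on the slide; area unit not stated.
samples = [
    (5, 2, 8, 16, 1439.9),
    (5, 2, 8, 32, 2916.0),
    (5, 2, 8, 64, 5867.4),
    (8, 2, 8, 32, 7465.1),
]


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Regressor is the parametric crossbar term P^2 * F.
xs = [s[0] * s[0] * s[3] for s in samples]
ys = [s[4] for s in samples]
a1, a0 = fit_line(xs, ys)


def xbar_area(p, f):
    """Fitted crossbar area estimate for P ports and flit-width F."""
    return a1 * p * p * f + a0


for p, v, b, f, area in samples:
    err = abs(xbar_area(p, f) - area) / area
    print(f"P={p} F={f}: tool={area:.1f} fit={xbar_area(p, f):.1f} err={err:.2%}")
```

With only four points this shows the mechanics of the fit, not its accuracy on unseen configurations.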
Model Development
Flow: NoC router RTL generators, µArch params (P, V, B, F) and impl params (clock frequency) → Synthesis and P&R (DC/RC, SOCE) → Analysis of blocks (XBAR, SW & VC arbiter, input & output buffers) → New models for each component block

• Two RTL generators:
   – Netmaker (Cambridge, UK)
   – Stanford NoC
• SP&R tools:
   – Cadence RC & Synopsys DC for hierarchical synthesis to analyze each block
   – Cadence SOC Encounter for P&R

 Component   Model
 XBAR        $P^2 F$
 SWVC        $9(P^2 V^2 + P^2 + PV - P)$
 InBUF       $180PV + 2PVBF + 2P^2 VB + 3PVB + 5P^2 B + P^2 + PF + 15P$
 OutBUF      $25P + 80PV$
 CLKCTRL     $0.02(\mathrm{SWVC} + \mathrm{InBUF} + \mathrm{OutBUF})$
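
The component models in the table can be evaluated directly for a given router configuration. The sketch below assumes the expressions are raw instance-count estimates (before any regression fit), which matches the "# Instances" column used earlier in the deck but is not stated on this slide. The function name `router_instance_models` is hypothetical.

```python
def router_instance_models(p, v, b, f):
    """Evaluate the per-component models from the slide.

    p: # ports, v: # VCs, b: # BUFs, f: flit-width (bits).
    Assumption: each expression is a raw instance-count estimate.
    """
    xbar = p * p * f
    swvc = 9 * (p * p * v * v + p * p + p * v - p)
    inbuf = (180 * p * v + 2 * p * v * b * f + 2 * p * p * v * b
             + 3 * p * v * b + 5 * p * p * b + p * p + p * f + 15 * p)
    outbuf = 25 * p + 80 * p * v
    # Clock control is modeled as 2% of the control and buffer blocks.
    clkctrl = 0.02 * (swvc + inbuf + outbuf)
    return {
        "XBAR": xbar,
        "SWVC": swvc,
        "InBUF": inbuf,
        "OutBUF": outbuf,
        "CLKCTRL": clkctrl,
    }


# Example configuration taken from the sweeps shown earlier: P=5, V=2, B=8, F=32.
counts = router_instance_models(5, 2, 8, 32)
for name, value in counts.items():
    print(f"{name:8s} {value:10.1f}")
```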
Overall Methodology
[Figure: methodology flow. Inputs: technology library (cell area, cell leakage, pin cap., internal energy) and post-P&R data per block (std. cell count & area, leakage power, internal power, switching power). ORION_NEW models: basic (manual) or regression fit (LSQR) → estimates for gate count → area; power: leakage, internal, switching.]

• Manual
   – Quick and easy
   – Misses implementation details
• LSQR
   – Accurate (captures implementation details)
   – One-time overhead (generation of P&R training data points)
Results: Area And Power
[Figure: bar charts of avg/max/min estimation error (0% to 100%) for AREA and POWER, comparing NEW with ORION2.0 (2.0) at 45nm and 65nm; group label: Stanford NoC; annotations: 6.5x reduction, 4x reduction.]

Methodology scales across technologies, router RTL generators
Flit-level Power Estimation
• Dynamic power estimation using flit-level bit encodings
• Integrated with the full-system NoC simulator GARNET
[Figure: flow. Testbench and post-P&R router netlist → gate-level simulation → VCD → power analysis → power report → regression fit (with ORION_NEW models) → flit-level power model → GARNET (gem5) → flit-level power estimates.]
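
The slides do not spell out how flit bit encodings enter the power model. One plausible ingredient, shown here purely as an assumption, is the number of bits that toggle between consecutive flits on a link, since switching activity drives dynamic power. The function names `toggles` and `switching_activity` are hypothetical.

```python
def toggles(prev_flit, next_flit, width):
    """Number of bit positions that differ between two flits (Hamming distance)."""
    mask = (1 << width) - 1
    return bin((prev_flit ^ next_flit) & mask).count("1")


def switching_activity(flits, width):
    """Average fraction of bits toggling per flit-to-flit transition.

    Assumption: this ratio is one possible input to a flit-level dynamic
    power model; the source does not confirm the actual feature set.
    """
    if len(flits) < 2:
        return 0.0
    total = sum(toggles(a, b, width) for a, b in zip(flits, flits[1:]))
    return total / ((len(flits) - 1) * width)


# Illustrative 16-bit flit sequence (made-up values, not measured traffic).
flits = [0x0000, 0xFFFF, 0x00FF, 0x00FF]
activity = switching_activity(flits, 16)
print(f"average switching activity: {activity:.2f}")
```

In a flow like the one on this slide, such a per-transition activity figure would be a regressor fitted against the power reports from gate-level simulation, rather than a power number in itself.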
Results: Flit-level Power
• Accurate estimation of flit-level dynamic power
[Figure: bar chart of avg/max/min estimation error (0% to 80%) for the Flit, NEW and ORION2.0 (2.0) models; groups: Stanford NoC and NetM; annotation: 3.6x reduction.]
Summary
• New hybrid modeling methodology: relax the
  template mindset
   – Explicitly models control and data signals
   – Captures RTL and implementation details
• Using proposed parametric regression methodology,
  worst-case estimation errors reduced by a factor of
   – 6.5x from ORION2.0 for power
   – 4x from ORION2.0 for area
• We propose an application of our methodology for
  flit-level dynamic power modeling and integration with
  GARNET
   – 3.6x worst-case error reduction in dynamic power estimation
• Ongoing: Non-parametric modeling of post-P&R
  power and area
Thank You!

Backup

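Regression analysis approach
• Multi-step regression fit
  – Step 1: Fit instances of each router component with
    post-layout instance counts
      $a_1 \cdot \mathrm{Insts}_{model}\langle component \rangle + a_0 = \mathrm{Insts}_{tool}\langle component \rangle$
      $\mathrm{Insts}^{R}_{model}\langle component \rangle = a_1 \cdot \mathrm{Insts}_{model}\langle component \rangle + a_0$
  – Step 2a: Fit area of each router component with
    post-layout area
      $b_1 \cdot \mathrm{Insts}^{R}_{model}\langle component \rangle + b_0 = \mathrm{Area}_{tool}\langle component \rangle$
  – Step 2b: Fit power of each router component with
    post-layout power (leakage, internal, switching
    separately)
      $\{c_5, d_5, e_5\} \cdot \mathrm{Insts}^{R}_{model}\langle XBAR \rangle + \{c_4, d_4, e_4\} \cdot \mathrm{Insts}^{R}_{model}\langle SWVC \rangle + \{c_3, d_3, e_3\} \cdot \mathrm{Insts}^{R}_{model}\langle InBUF \rangle + \{c_2, d_2, e_2\} \cdot \mathrm{Insts}^{R}_{model}\langle OutBUF \rangle + \{c_1, d_1, e_1\} \cdot \mathrm{Insts}^{R}_{model}\langle CLKCTRL \rangle + \{c_0, d_0, e_0\} = \{P^{tool}_{leak}, P^{tool}_{int}, P^{tool}_{SW}\}$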
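
Steps 1 and 2a can be chained as in the sketch below, using the crossbar numbers that appear earlier in the deck. Two assumptions are made loudly here: the post-synthesis instance counts from the "Step 1" slide stand in for the post-layout tool counts that Step 1 calls for, and a closed-form ordinary least-squares fit replaces the LSQR solver. The helper name `fit_line` is hypothetical. Step 2b is omitted because the deck lists no per-component power data.

```python
def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# (P, F, instance count, post-P&R area) for V = 2, B = 8, taken from the
# earlier slides. Assumption: the instance counts substitute for post-layout
# tool counts, and both columns describe the crossbar.
data = [
    (5, 16, 400, 1439.9),
    (5, 32, 825, 2916.0),
    (5, 64, 1673, 5867.4),
    (8, 32, 2112, 7465.1),
]

insts_model = [p * p * f for p, f, _, _ in data]   # raw XBAR model P^2 * F
insts_tool = [row[2] for row in data]
area_tool = [row[3] for row in data]

# Step 1: refine the raw instance model against tool instance counts.
a1, a0 = fit_line(insts_model, insts_tool)
insts_refined = [a1 * x + a0 for x in insts_model]

# Step 2a: fit area against the refined instance counts.
b1, b0 = fit_line(insts_refined, area_tool)
area_est = [b1 * r + b0 for r in insts_refined]

for (p, f, _, area), est in zip(data, area_est):
    print(f"P={p} F={f}: tool area={area:.1f} estimate={est:.1f}")
```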
Related work
• Architecture templates
   – ORION2.0
• Gate-level analytical models
• Parametric regression
   – Pre- and post-layout power estimation
   – RTL simulations
• Non-parametric regression
   – MARS
[Figure: NoC modeling taxonomy with the labels: circuit model, regression model, analytical, arch templates, parametric, non-parametric, ORION_NEW + regression; flit-level, Control, Tool.]

Significant Departure: Relax the “template” mindset
Results
[Figure: four plots of instance count. Top row: against # ports (5, 6, 8, 10); bottom row: against flit-width (16, 24, 32, 64 bits). Left column: ORION2.0 vs. NetMaker and Stanford NoC; right column: NEW vs. NetMaker and Stanford NoC.]

• Avg. estimation error in # instances reduced from 109.5% to
  8.8%
   – Avg. estimation error in area reduced to 9.8%
   – Avg. estimation error in power reduced to 4.58%

								