
              Performance Analysis and Power Estimation of ARM Processor


              Team:
                          Ajayshanker Krishnamurthy
                           Swathi Tanjore Gurumani
                                  Zexin Pan

              Project Advisor:
                           Dr. Alexander Milenkovic




Apr 14, 2003                     CPE 631 Project
                      Agenda


    • Overview
    • Tools Used
    • Performance Analysis - Results
    • Power Estimation - Results
    • Conclusion




                                     Overview

[Diagram: Benchmarks -> Compile -> Target Binaries (Exe) -> Simulator -> Performance Metrics;
 Performance Metrics -> Power Estimator -> Power Dissipated]

  • MiBench
  • SimpleScalar
  • PowerAnalyzer

                                 Tools Used

Benchmarks:
A critical part of the design process, because designs are performance driven
Embedded benchmarks: embedded processors are the fastest growing market segment in the microprocessor industry
MiBench: (University of Michigan)
• Free, commercially representative embedded benchmark suite
• Set of 35 embedded applications of six categories
    – Automotive and Industrial Control, Network, Security, Consumer Devices, Office
      Automation and Telecommunications
• Security algorithms: Rijndael, Blowfish, SHA, PGP
• Small data set represents a light-weight, useful embedded application
• Large data set provides a more stressful, real-world application


                          Tools Used…

SimpleScalar: (Born 1982, @ University of Wisconsin)
• Provides an infrastructure for simulation and architectural
  modeling
• Can model a variety of platforms
     - unpipelined processors to detailed micro architectures
• Suited to the needs of researchers and instructors
    - meets the critical requirements: Performance, Flexibility & Detail
• Supports popular instruction sets: Alpha, PowerPC, x86, and ARM
• Baseline simulator models:
   - sim-safe, sim-fast, sim-cache, sim-profile, sim-bpred,
      sim-fuzz, sim-outorder
                         Tools Used…

PowerAnalyzer: SimpleScalar-Arm Power Modeling Project
• Joint venture of U Michigan & U Colorado
• Estimator that allows power/performance trade-offs to be
  examined
• Tightly Coupled with SimpleScalar Toolset for ARM
• Gives Power dissipation for each component individually
   – Switching, Internal & Leakage
• Can be configured based on two models:
   – Analytical & Empirical



              Measurement Methodology


• Configured for the current (SA-110) and next (PXA-250)
  generation
• Input: the same dataset (>3M) for all algorithms, to achieve a
  fair comparison and reliable results
• Output: raw performance and power consumption data are
  obtained from the PowerAnalyzer report
• Data processing (digesting) and visualization
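
The "data processing" step can be sketched as a small report parser. This is only an illustration: it assumes each statistic is printed as a name, a number, and a "#" comment, which is how SimpleScalar simulators print their statistics. The PowerAnalyzer report layout is not shown in these slides, and the sample text (including the `example_power` entry) is made up.

```python
import re

# Assumed line shape: "<stat_name> <number> # <description>".
STAT_LINE = re.compile(r"^(\S+)\s+(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)\s+#")

def parse_report(text):
    """Return {stat_name: float} for every line matching the assumed shape."""
    stats = {}
    for line in text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match:
            stats[match.group(1)] = float(match.group(2))
    return stats

# Made-up sample for illustration, not real simulator output.
SAMPLE_REPORT = """\
sim: ** simulation statistics **
sim_num_insn      1000000 # total number of instructions committed
sim_cycle         1500000 # total simulation time in cycles
example_power     2.5e+09 # hypothetical per-component power figure
"""

stats = parse_report(SAMPLE_REPORT)
cpi = stats["sim_cycle"] / stats["sim_num_insn"]  # cycles per instruction
print(f"CPI = {cpi:.2f}")
```

Lines that do not match the assumed shape, such as the banner line in the sample, are skipped.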




                 Performance Analysis

• Configured Sim-outorder to represent current and next generation of
  embedded processors
• Intel SA-110 for current generation
   – 32-bit general-purpose microprocessor
   – On-chip data cache (16 K), instruction cache (16 K), and MMU
   – Used in PDAs, smart phones, digital cameras, etc.
• Intel PXA-250 for next generation
   – High-performance Intel XScale core
   – On-chip data cache (32 K), instruction cache (32 K), branch target buffer,
      and MMU
   – Used in Multimedia Applications



                 Configuration

Parameter            Current          Next
I Fetch Q size       2                4
Branch Pred.         Not Taken        Bimod
I Issue Width        1                1
Cache dl1            16:32:32         32:32:32
Cache il1            16:32:32         32:32:32
TLB itlb             16:4096:4        16:4096:4
TLB dtlb             32:4096:4        32:4096:4
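
The cache and TLB entries can be checked with a little arithmetic. The sketch below assumes the three fields are number of sets, block (or page) size in bytes, and associativity, which is the field order of SimpleScalar's cache options with the name and replacement policy left out. The slides do not state the field order, so treat that reading as an assumption. The function names are illustrative only.

```python
def cache_capacity_bytes(config):
    """Capacity of a cache described as '<nsets>:<bsize>:<assoc>' (assumed order)."""
    nsets, bsize, assoc = (int(field) for field in config.split(":"))
    return nsets * bsize * assoc

def tlb_entries(config):
    """Entry count of a TLB described as '<nsets>:<page size>:<assoc>' (assumed order)."""
    nsets, _page_size, assoc = (int(field) for field in config.split(":"))
    return nsets * assoc

for label, config in [("current dl1/il1", "16:32:32"), ("next dl1/il1", "32:32:32")]:
    print(label, cache_capacity_bytes(config) // 1024, "KB")
for label, config in [("itlb", "16:4096:4"), ("dtlb", "32:4096:4")]:
    print(label, tlb_entries(config), "entries")
```

Under this reading the cache rows give 16 KB and 32 KB, which agrees with the 16 K and 32 K cache sizes listed for the SA-110 and PXA-250.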


                                                         Results

[Figure: Execution Time. Y axis: execution time (seconds), 0 to 18. X axis: RCE, RNE, RCD, RND, BFCE, BFNE, BFCD, BFND, SHAC, SHAN.]




                                             Results

[Figure: CPI. Y axis: clock cycles, 0 to 3. X axis: RCE, RNE, RCD, RND, BFCE, BFNE, BFCD, BFND, SHAC, SHAN.]




                                                  Results

[Figure: Prediction misses per 1000 instructions. Y axis: # misses per 1000, 0 to 60. X axis: RCE, RNE, RCD, RND, BFCE, BFNE, BFCD, BFND, SHAC, SHAN.]

                                                Current generation predictor: Not Taken
                                                Next generation predictor: Bimod
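
The two predictors named above can be compared on a toy branch trace. This is a minimal sketch, not the simulator's implementation: the bimodal predictor is modeled as a table of 2-bit saturating counters indexed by branch address, and the table size, the trace, and the instruction count are all assumptions chosen for illustration.

```python
def not_taken_misses(trace):
    """Static not-taken prediction: every taken branch is a miss."""
    return sum(1 for _pc, taken in trace if taken)

def bimodal_misses(trace, table_size=2048):
    """2-bit saturating counters indexed by branch address (table_size is assumed)."""
    counters = [1] * table_size          # start every entry at "weakly not taken"
    misses = 0
    for pc, taken in trace:
        index = (pc >> 2) % table_size   # drop the low bits of a 4-byte-aligned address
        predicted_taken = counters[index] >= 2
        if predicted_taken != taken:
            misses += 1
        if taken:
            counters[index] = min(3, counters[index] + 1)
        else:
            counters[index] = max(0, counters[index] - 1)
    return misses

def misses_per_1000(misses, instructions):
    """Normalize a miss count to misses per 1000 instructions."""
    return 1000.0 * misses / instructions

# Hypothetical trace: one loop branch, taken 9 times and then not taken, repeated 100 times.
trace = [(0x8000, i % 10 != 9) for i in range(1000)]
instructions = 20000  # assumed instruction count for the illustration

print("not taken:", misses_per_1000(not_taken_misses(trace), instructions))
print("bimodal:  ", misses_per_1000(bimodal_misses(trace), instructions))
```

On a loop-heavy trace like this one, the not-taken policy misses every taken iteration, while the counter table misses only once at warm-up and once per loop exit.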

                                            Results

[Figure: % of Branches. Y axis: branches (%), 0 to 7. X axis: RCE, RNE, RCD, RND, BFCE, BFNE, BFCD, BFND, SHAC, SHAN.]




 Why use power as a performance criterion?

• T. Mudge, “Power: A First-Class Architectural Design Constraint,”
  Computer, vol. 34, no. 4, April 2001, pp. 52-57
     – Limiting power consumption is critical, particularly in portable
       and mobile applications such as cell phones and laptops, because
       battery life is limited
     – Portable and mobile products are one of the major markets for ARM




                     Power Estimation

• Measurement Methodology
     – ARM simulator and power measurement tool: PowerAnalyzer 1.1
       from U Michigan
     – Configured for the current (SA-110) and next (PXA-250) generation
     – Input: the same dataset (>3M) for all algorithms, to achieve a fair
       comparison and reliable results
     – Output: raw performance and power consumption data are
       obtained from the PowerAnalyzer report
     – Data processing (digesting) and visualization




                      Power Estimation

• Difficulties using PowerAnalyzer
     – The report gives power consumption for every ARM component, but
       states no units
     – Because all of the numbers are very large, it is hard to tell
       what they mean




                                   Power Estimation

[Figure: four pie charts of power distribution by component.
 Panels: Intel StrongARM SA-110 using Rijndael; Intel XScale using Rijndael;
 StrongARM SA-110 using Blowfish; Intel XScale using Blowfish.
 Components: aio, dio, irf, il1, dl1, itlb, dtlb, clock, uarch; the XScale panels also include bimod.]

                             Power Estimation

[Figure: Total Power vs. Benchmark. Y axis: power (billions), 0 to 15. X axis (Benchmark in Security): rijndael.enc, rijndael.des, blowfish.enc, blowfish.des, sha. Series: SA-110, XScale.]
                                        Power Estimation

[Figure: Power consumption per Enc/Des byte. Y axis: power per byte (thousands), 0 to 5. X axis (Benchmark in Security): rijndael.enc, rijndael.des, blowfish.enc, blowfish.des, sha. Series: SA-110, XScale.]
                            Conclusion

     • The performance gain in the next generation of processors is offset by the
       increase in power consumption: Intel XScale almost doubles the
       power consumption for about a 10% performance gain over the SA-110
     • The next generation of processors, with larger caches, improves
       performance
     • The bimodal branch predictor greatly reduces the number of
       mispredictions
     • Power consumption depends not only on hardware architecture and
       system configuration (system clock, etc.), but also heavily on the
       benchmark and input dataset
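
The first bullet can be turned into a rough ratio. The sketch below uses the rounded figures from that bullet, with 2.0 standing in for "almost doubles" and 0.10 for "about 10%". Both are approximations, and the function names are illustrative only.

```python
def relative_perf_per_power(perf_gain, power_ratio):
    """Performance per unit power of the new design relative to the old one."""
    return (1.0 + perf_gain) / power_ratio

def relative_energy_per_task(perf_gain, power_ratio):
    """Energy for a fixed task (power x time) relative to the old design."""
    time_ratio = 1.0 / (1.0 + perf_gain)   # a faster design finishes sooner
    return power_ratio * time_ratio

perf_per_power = relative_perf_per_power(perf_gain=0.10, power_ratio=2.0)
energy_per_task = relative_energy_per_task(perf_gain=0.10, power_ratio=2.0)
print(f"performance per unit power: {perf_per_power:.2f}x")
print(f"energy per task:            {energy_per_task:.2f}x")
```

With these rounded inputs, performance per unit power falls to roughly 0.55 of the SA-110 value and energy per task rises to roughly 1.8 times it. Because both results are ratios, the missing units in the PowerAnalyzer report do not affect them.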




              Thank You


              Questions…


