A Probabilistic Pointer Analysis for Speculative Optimizations

Jeff DaSilva
Greg Steffan

Electrical and Computer Engineering
University of Toronto
ASPLOS XII

Pointers Impede Optimization

foo(int *a) {
    …
    while(…)
    {
        x = *a;
        …
    }
}

Optimizations impeded: Loop Invariant Code Motion, Parallelize

Can pointer analysis help?

Pointer Analysis

    *a = ~
    ~ = *b

   Do pointers a and b point to the same location?
      Pointer analysis answers: Definitely, Definitely Not, or Maybe
      The answer decides whether the two accesses can be optimized
   Repeat for every pair of pointers at every program point

   How can we optimize the Maybe cases?

Let's Speculate
   Implement a potentially unsafe optimization
   Verify and recover if necessary

Original:
    int *a, x;
    …
    while(…)
    {
        x = *a;
        …
    }

Speculative version (a is probably loop invariant):
    int *a, x, tmp;
    …
    tmp = *a;
    while(…)
    {
        x = tmp;
        …
    }
    <verify, recover?>

Data Speculative Optimizations
   EPIC instruction sets
        Support for speculative load/store instructions (e.g., Itanium)

   Speculative compiler optimizations
        Dead store elimination, redundancy elimination, copy propagation,
        strength reduction, register promotion

   Thread-level speculation (TLS)
        Hardware and compiler support for speculative parallel threads

   Transactional programming
        Hardware and software support for speculative parallel transactions

   Heavy reliance on detailed profile feedback

Can We Quantify Maybe?
   Estimate the potential benefit for speculating, from:
        Expected speedup (if successful)
        Recovery penalty (if unsuccessful)
        Overhead for verify
        Probability of success (the Maybe)
   Decision: Speculate?

   Ideally, Maybe would be a probability

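The slide names the inputs to the speculation decision but not how they combine. A minimal sketch, assuming a simple linear expected-value model (the formula, the function names, and the cost units are illustrative assumptions, not taken from the talk):

```python
def expected_benefit(p_success, speedup, recovery_penalty, verify_overhead):
    """Expected net gain of speculating under an ASSUMED linear cost model.

    p_success        -- probability that the speculation is correct
    speedup          -- gain when the speculation succeeds
    recovery_penalty -- cost paid when it fails
    verify_overhead  -- cost of the verification, paid in both cases
    """
    if not 0.0 <= p_success <= 1.0:
        raise ValueError("p_success must be a probability in [0, 1]")
    gain_if_right = p_success * speedup
    loss_if_wrong = (1.0 - p_success) * recovery_penalty
    return gain_if_right - loss_if_wrong - verify_overhead


def should_speculate(p_success, speedup, recovery_penalty, verify_overhead):
    """Speculate only when the expected net gain is positive."""
    return expected_benefit(
        p_success, speedup, recovery_penalty, verify_overhead) > 0.0


# Hypothetical costs in arbitrary units.
print(should_speculate(0.72, 10.0, 20.0, 1.0))
print(should_speculate(0.50, 10.0, 20.0, 1.0))
```

With these made-up costs, the same optimization is worth attempting at p = 0.72 but not at p = 0.50, which is why a plain Maybe is not enough to decide.
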
Probabilistic Pointer Analysis (PPA)

    *a = ~
    ~ = *b

   Conventional pointer analysis answers Definitely, Definitely Not, or Maybe.
   PPA attaches a probability p to each answer:
        Definitely:      p = 1.0
        Definitely Not:  p = 0.0
        Maybe:           0.0 < p < 1.0

   With what probability p do pointers a and b point to the same location?
   Do this for every pair of pointers at every program point
   Potential bonus: PPA doesn't have to be safe

PPA Research Objectives
   Accurate points-to probability information
      at every static pointer dereference
   Scalable analysis
      Goal: the entire SPEC integer benchmark suite
   Understand the scalability/accuracy tradeoff
      through a flexible static memory model

   Improve our understanding of programs

Related Work
                        Pointers?   SPEC INT?     Probabilities?
Ju et al.                                         ✓
Chen et al.                                       ✓
Fernandez & Espasa                  Hot code      ✓
Bhowmik & Franklin                  Select apps   ✓
Our PPA                                           ✓

Algorithm Design Choices
   Fixed
      Bottom Up / Top Down approach
      Linear transfer functions (for scalability)
      One-level context and flow sensitive

   Flexible
      Edge profiling (or static prediction)
      Safe (or unsafe)
      Field sensitive (or field insensitive)

Traditional Points-To Graph

int x, y, z, *b = &x;
void foo(int *a) {
    if(…)
       b = &y;
    if(…)
       a = &z;
    else
       a = b;

    while(…) {
       x = *a;
       …
    }
}

Points-to graph (figure): pointers b and a; pointed-at locations x, y, z, and UND. Each edge is marked either Definitely or Maybe.

Results are inconclusive

Probabilistic Points-To Graph

int x, y, z, *b = &x;
void foo(int *a) {
    if(…)           /* 0.1 taken (edge profile) */
       b = &y;
    if(…)           /* 0.2 taken (edge profile) */
       a = &z;
    else
       a = b;

    while(…) {
       x = *a;
       …
    }
}

Probabilistic points-to graph (figure): pointers b and a; pointed-at locations x, y, z, and UND. Each edge carries a probability p, either p = 1.0 or 0.0 < p < 1.0.
   b -> x: 0.9      b -> y: 0.1
   a -> x: 0.72     a -> y: 0.08     a -> z: 0.2

Results provide more information

LOLLIPOP: Our PPA Algorithm

Linear One-Level Interprocedural Probabilistic Pointer Analysis

Fundamentals
   Location sets (Wilson & Lam)
      One or more memory locations
      Classified as pointer, pointer target, or both

   Matrix-based approach
      Nice properties, optimized implementations
   Two key matrices:
      Points-to matrix
      Linear transformation matrix

Points-To Matrix

An N x M matrix.
   Columns 1 … M: location sets
   Rows 1 … N: pointer sets first, then location sets
   Pointer-set rows: the area of interest
   Location-set rows: the identity matrix I

All matrix rows sum to 1.0

Points-To Matrix Example

Points-to graph: b -> x (0.9), b -> y (0.1), a -> x (0.72), a -> y (0.08), a -> z (0.2)

           x     y     z     UND
   a       0.72  0.08  0.20
   b       0.90  0.10
   x       1.0
   y             1.0
   z                   1.0
   UND                       1.0

The x, y, z, and UND rows form the identity matrix I.

Computing New Points-To Matrices

   Points-To Matrix In  ->  Any Instruction  ->  Points-To Matrix Out

The Fundamental PPA Equation

   Points-To Matrix Out = Transformation Matrix × Points-To Matrix In

This can be applied to any instruction (incl. function calls)

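Transformation Matrix

An N x N matrix.
   Columns 1 … N: pointer sets first, then location sets
   Rows 1 … N: pointer sets first, then location sets
   Pointer-set rows: the area of interest
   Location-set rows: ø under the pointer-set columns, the identity matrix I under the location-set columns

All matrix rows sum to 1.0
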
Transformation Matrix Example

S1:   a = &z;

T_S1:
           a     b     x     y     z     UND
   a                               1.0
   b             1.0
   x                   1.0
   y                         1.0
   z                               1.0
   UND                                   1.0

Example - The PPA Equation

S1:   a = &z;
$PT_{out} = T_{S1} \times PT_{in}$

T_S1:
           a     b     x     y     z     UND
   a                               1.0
   b             1.0
   x                   1.0
   y                         1.0
   z                               1.0
   UND                                   1.0

PT_in:
           x     y     z     UND
   a       0.72  0.08  0.20
   b       0.90  0.10
   x       1.0
   y             1.0
   z                   1.0
   UND                       1.0

Points-to graph for PT_in: b -> x (0.9), b -> y (0.1), a -> x (0.72), a -> y (0.08), a -> z (0.2)

Example - The PPA Equation

S1:   a = &z;
$PT_{out} = T_{S1} \times PT_{in}$

PT_out:
           x     y     z     UND
   a                   1.0
   b       0.90  0.10
   x       1.0
   y             1.0
   z                   1.0
   UND                       1.0

Points-to graph for PT_out: b -> x (0.9), b -> y (0.1), a -> z (1.0)

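The worked example can be checked numerically. The sketch below multiplies the transformation matrix for S1: a = &z by the incoming points-to matrix from the slides. The row order (a, b, x, y, z, UND), the column order (x, y, z, UND), and the helper name `matmul` are choices made for this illustration, and plain Python lists stand in for the sparse matrices the talk mentions.

```python
def matmul(a, b):
    """Dense matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


# Points-to matrix in: rows a, b, x, y, z, UND; columns x, y, z, UND.
PT_IN = [
    [0.72, 0.08, 0.20, 0.0],  # a
    [0.90, 0.10, 0.0, 0.0],   # b
    [1.0, 0.0, 0.0, 0.0],     # x  (identity block)
    [0.0, 1.0, 0.0, 0.0],     # y
    [0.0, 0.0, 1.0, 0.0],     # z
    [0.0, 0.0, 0.0, 1.0],     # UND
]

# Transformation matrix for S1: a = &z. Rows and columns: a, b, x, y, z, UND.
# Row a selects row z of PT_IN; every other row is left unchanged.
T_S1 = [
    [0.0, 0.0, 0.0, 0.0, 1.0, 0.0],  # a
    [0.0, 1.0, 0.0, 0.0, 0.0, 0.0],  # b
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.0],  # x
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],  # y
    [0.0, 0.0, 0.0, 0.0, 1.0, 0.0],  # z
    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0],  # UND
]

PT_OUT = matmul(T_S1, PT_IN)
for name, row in zip(["a", "b", "x", "y", "z", "UND"], PT_OUT):
    print(name, row)
```

After S1, row a holds probability 1.0 for z while row b keeps 0.90 and 0.10, matching the result slide.
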
Combining Transformation Matrices

Basic Block
   S1: Instr
   S2: Instr
   S3: Instr

$PT_{out} = T_{S3} \, T_{S2} \, T_{S1} \, PT_{in}$

$PT_{out} = T_{BB} \, PT_{in}$, where $T_{BB} = T_{S3} \, T_{S2} \, T_{S1}$

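A small sketch of why per-instruction matrices can be collapsed into one matrix per basic block: matrix multiplication is associative, so the product of the instruction matrices can be formed once and reused. The two statements (S1: b = &y, S2: a = b), the starting state, and the helper names `identity`, `assign`, and `matmul` are hypothetical choices for this illustration.

```python
NAMES = ["a", "b", "x", "y", "z", "UND"]   # row/column order of T
IDX = {name: i for i, name in enumerate(NAMES)}


def identity():
    n = len(NAMES)
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def assign(ptr, source):
    """T for 'ptr = &source' or 'ptr = source': row ptr selects row source."""
    t = identity()
    t[IDX[ptr]] = [1.0 if j == IDX[source] else 0.0
                   for j in range(len(NAMES))]
    return t


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


# Hypothetical starting state: a -> UND, b -> x. Columns: x, y, z, UND.
PT_IN = [
    [0.0, 0.0, 0.0, 1.0],  # a
    [1.0, 0.0, 0.0, 0.0],  # b
    [1.0, 0.0, 0.0, 0.0],  # x
    [0.0, 1.0, 0.0, 0.0],  # y
    [0.0, 0.0, 1.0, 0.0],  # z
    [0.0, 0.0, 0.0, 1.0],  # UND
]

T_S1 = assign("b", "y")        # S1: b = &y
T_S2 = assign("a", "b")        # S2: a = b
T_BB = matmul(T_S2, T_S1)      # later statements multiply on the left

STEPWISE = matmul(T_S2, matmul(T_S1, PT_IN))
COMBINED = matmul(T_BB, PT_IN)
print(COMBINED[IDX["a"]], COMBINED[IDX["b"]])
```

Both routes leave a and b pointing to y with probability 1.0; the combined matrix just does it in one multiplication.
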
Control flow - if/else

Branch X is taken with probability p and branch Y with probability q, where p + q = 1.0

$T = p \, T_X + q \, T_Y$

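The if/else rule can be tried on the running example from the points-to graph slides. This sketch assumes the branch matrices are blended by the edge-profile probabilities (0.1 and 0.2), that an untaken `if` without `else` contributes the identity matrix, and that the parameter a initially points to UND; the helper names are hypothetical.

```python
NAMES = ["a", "b", "x", "y", "z", "UND"]   # row/column order of T
IDX = {name: i for i, name in enumerate(NAMES)}


def identity():
    n = len(NAMES)
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def assign(ptr, source):
    """T for 'ptr = &source' or 'ptr = source': row ptr selects row source."""
    t = identity()
    t[IDX[ptr]] = [1.0 if j == IDX[source] else 0.0
                   for j in range(len(NAMES))]
    return t


def blend(p, t_x, t_y):
    """T = p * T_X + q * T_Y with q = 1 - p."""
    q = 1.0 - p
    return [[p * x + q * y for x, y in zip(row_x, row_y)]
            for row_x, row_y in zip(t_x, t_y)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


# Assumed entry state: a -> UND (unknown argument), b -> x. Columns: x, y, z, UND.
PT_IN = [
    [0.0, 0.0, 0.0, 1.0],  # a
    [1.0, 0.0, 0.0, 0.0],  # b
    [1.0, 0.0, 0.0, 0.0],  # x
    [0.0, 1.0, 0.0, 0.0],  # y
    [0.0, 0.0, 1.0, 0.0],  # z
    [0.0, 0.0, 0.0, 1.0],  # UND
]

T_IF1 = blend(0.1, assign("b", "y"), identity())        # if(…) b = &y;
T_IF2 = blend(0.2, assign("a", "z"), assign("a", "b"))  # if(…) a = &z; else a = b;

PT_OUT = matmul(T_IF2, matmul(T_IF1, PT_IN))
print([round(v, 2) for v in PT_OUT[IDX["a"]]])
print([round(v, 2) for v in PT_OUT[IDX["b"]]])
```

Under these assumptions the rows for a and b come out as 0.72/0.08/0.20 and 0.90/0.10, the same probabilities shown on the probabilistic points-to graph.
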
Control flow - loops

Loop with body X and iteration count N:

$T = T_X^{N}$

Loop with body Y and iteration bounds <L,U>, i.e. <min,max>:

$T = \frac{1}{U-L+1} \sum_{i=L}^{U} T_Y^{\,i}$

Both operations can be implemented efficiently

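The two loop rules can be sketched with naive repeated multiplication. The loop body used here (an `if` taken half the time that executes a = b), the bounds <1,2>, and all helper names are hypothetical; an efficient implementation would presumably use something faster than the straightforward loops below.

```python
NAMES = ["a", "b", "x", "y", "z", "UND"]   # row/column order of T
IDX = {name: i for i, name in enumerate(NAMES)}


def identity():
    n = len(NAMES)
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def loop_fixed(t_body, n):
    """T for a loop that runs exactly n times: the n-th power of t_body."""
    result = identity()
    for _ in range(n):
        result = matmul(t_body, result)
    return result


def loop_range(t_body, low, high):
    """T for a loop that runs between low and high times, each count
    weighted equally: the average of t_body**i for i in low..high."""
    count = high - low + 1
    total = [[0.0] * len(NAMES) for _ in NAMES]
    power = loop_fixed(t_body, low)
    for _ in range(count):
        total = [[t + p for t, p in zip(rt, rp)]
                 for rt, rp in zip(total, power)]
        power = matmul(t_body, power)
    return [[v / count for v in row] for row in total]


# Hypothetical loop body: 'if(…) a = b;' taken with probability 0.5,
# i.e. 0.5 * T(a = b) + 0.5 * I. Only row a differs from the identity.
T_Y = identity()
T_Y[IDX["a"]] = [0.5, 0.5, 0.0, 0.0, 0.0, 0.0]

T_TWICE = loop_fixed(T_Y, 2)
T_RANGE = loop_range(T_Y, 1, 2)
print(T_TWICE[IDX["a"]][:2], T_RANGE[IDX["a"]][:2])
```

After exactly two iterations, a keeps its old targets with weight 0.25 and takes b's with weight 0.75; averaging over one or two iterations gives 0.375 and 0.625.
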
Safe vs. Unsafe

Pointer Assignment Instructions               Safe?
   x = &y      Address-of Assignment          ✓
   x = y       Copy Assignment                ✓
   x = *y      Load Assignment                ✓
   *x = y      Store Assignment               ✓

Shadow/invisible vars: non-linear to handle

Safety for Shadow Variables

   Unsafe + Steensgaard FICI PA for shadow vars = Safe

(Figure: points-to graphs over pointers p and q and shadow pointer p_1, shown for the unsafe result, the shadow-variable analysis, and the safe result. Legend: pointer, shadow ptr, pointed at.)

LOLLIPOP Implementation

Components: SUIF Infrastructure, MATLAB C Library

Inputs: .spd, Edge Profile
Pipeline:
   ICFG
   SMM: Static Memory Model
   BU: TF-Matrix Collector
   TD: Points-To Matrix Propagator
Outputs: .spx (Results, Stats)

Implementation details:
   Sparse matrices
   Efficient matrix algorithms
   Result memoization

Measuring LOLLIPOP's Efficiency and Accuracy

SPEC2000 Benchmark Data
 Benchmark     LOC   Matrix Size N   PPA Analysis Time [Unsafe]   PPA Analysis Time [Safe]
 Bzip2        4686        251        0.3 seconds                  0.3 seconds
 Mcf          2429        354        0.39 seconds                 0.61 seconds
 Gzip         8616        563        0.71 seconds                 0.77 seconds
 Crafty      21297       1917        5.49 seconds                 5.51 seconds
 Vpr         17750       1976        9.33 seconds                 10.34 seconds
 Twolf       20469       2611        16.59 seconds                20.64 seconds
 Parser      11402       2732        30.72 seconds                50.04 seconds
 Vortex      67225      11018        3 min 59 seconds             4 min 56 seconds
 Gap         71766      25882        54 min 56 seconds            83 min 38 seconds
 Perlbmk     85221      20922        44 min 15 seconds            89 min 43 seconds
 Gcc         22225      42109        5 hours 10 min               N/A

Experimental Framework: 3GHz P4 with 1GB of RAM

Scales to all of SPECint

SPEC2000 Benchmark Data
 Benchmark   PPA Analysis Time [Safe]   Points-To Profile Time
 Bzip2       0.3 seconds                13 min 34 seconds
 Mcf         0.61 seconds               19 min 56 seconds
 Gzip        0.77 seconds               3 min 48 seconds
 Crafty      5.51 seconds               14 min 47 seconds
 Vpr         10.34 seconds              3 hours 17 min
 Twolf       20.64 seconds              N/A
 Parser      50.04 seconds              84 min 52 seconds
 Vortex      4 min 56 seconds           0.7 seconds
 Gap         83 min 38 seconds          55 min 56 seconds
 Perlbmk     89 min 43 seconds          N/A
 Gcc         N/A                        39 min 58 seconds

Reduced Input Set (label placed beside the Gap and Perlbmk rows)

Experimental Framework: 3GHz P4 with 1GB of RAM

More efficient than points-to profiling

Easy SPEC2000 Benchmarks

(Figure: bar chart of Average Dereference Size, y-axis 0 to 7, annotated "more accurate"; series: unsafe, safe, p > 0.001. Labeled bar values: gzip 1.4, vpr 1.2, mcf 1.5, crafty 1.8, vortex 6.1, bzip2 1.0, twolf 1.3.)

A one-level analysis is often adequate (i.e., safe ≈ unsafe)

Challenging SPEC 95/2000 Benchmarks

(Figure: bar chart of Average Dereference Size, y-axis 0 to 140, annotated "more accurate"; series: unsafe, safe, p > 0.001. Labeled bar values: parser 42.5, perlbmk 89.0, gap 143.8, li 80.1, ijpeg 6.7, perl 18.5.)

Many improbable points-to relations can be pruned away

Metric: Average Max Certainty

while(…)
{
     x = *a;        { (0.72, x), (0.08, y), (0.2, z) }
     …
}

Max probability value = 0.72

$$\text{Avg Certainty} = \frac{\sum (\text{max probability value})}{\text{num of loads \& stores}}$$

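The metric is a plain average of the largest probability at each dereference. A minimal sketch, assuming each load or store is represented as a dict from target name to probability (that representation, the function name, and the second dereference are made up for illustration):

```python
def average_max_certainty(dereferences):
    """Mean of the largest points-to probability over all loads and stores.

    dereferences -- non-empty list of dicts mapping target name to probability
    """
    if not dereferences:
        raise ValueError("need at least one load or store")
    return sum(max(d.values()) for d in dereferences) / len(dereferences)


slide_load = {"x": 0.72, "y": 0.08, "z": 0.2}   # the 'x = *a' example
certain_store = {"x": 1.0}                       # hypothetical second dereference

print(average_max_certainty([slide_load]))
print(round(average_max_certainty([slide_load, certain_store]), 2))
```

A value near 1.0 means that most dereferences have one clearly dominant target.
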
SPEC2000 Average Certainty

(Figure: bar chart of Probabilistic Certainty, y-axis 0 to 1, annotated "more accurate". Labeled bar values: gzip 0.905, vpr 0.949, *gcc 0.970, mcf 0.920, crafty 0.946, parser 0.969, perlbmk 0.870, gap 0.783, vortex 0.908, bzip2 1.000, twolf 0.901.)

On average, LOLLIPOP can predict a single likely points-to relation

Metric: NAED
   Normalized Average Euclidean Distance

while(…)
{
    x = *a;
    …
}

P_s, LOLLIPOP's static prediction:
    { (0.72, x), (0.08, y), (0.2, z) }
P_d, the points-to profiler's dynamic frequency vector:
    { (0.88, x), (0.02, y), (0.1, z) }

$$\text{NAED} = \frac{1}{\sqrt{2}} \cdot \frac{\sum \lVert P_s - P_d \rVert}{\text{num of loads \& stores}}$$

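As a worked instance of the formula, the sketch below computes NAED for the single dereference shown on the slide. The dict representation and the function name are assumptions made for this illustration; the division by the square root of 2 scales the result into [0, 1], since two probability vectors can be at most that far apart.

```python
import math


def naed(pairs):
    """Normalized average Euclidean distance.

    pairs -- non-empty list of (static, dynamic) tuples, one per load or
             store; each element is a dict mapping target name to probability.
    """
    if not pairs:
        raise ValueError("need at least one load or store")
    total = 0.0
    for static, dynamic in pairs:
        targets = set(static) | set(dynamic)
        total += math.sqrt(sum(
            (static.get(t, 0.0) - dynamic.get(t, 0.0)) ** 2 for t in targets))
    return total / (math.sqrt(2.0) * len(pairs))


P_S = {"x": 0.72, "y": 0.08, "z": 0.2}   # static prediction from the slide
P_D = {"x": 0.88, "y": 0.02, "z": 0.1}   # dynamic frequencies from the slide

print(round(naed([(P_S, P_D)]), 4))
```

For this one dereference the distance works out to about 0.14; identical vectors give 0, and two disjoint certain predictions give 1.
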
Norm Avg Euclidean Dist (NAED)

(Figure: bar chart of NAED, y-axis 0 to 0.6, annotated "more accurate"; series: uniform, safe-ref, unsafe-ref, unsafe-train, unsafe-noep; benchmarks: crafty, gzip, perl, vortex, spec-avg.)

Summary
   A novel matrix-based PPA algorithm
   Scales to the SPECint 95/2000 benchmarks
      One-level context and flow sensitive
   Future Work
      Applying LOLLIPOP to TLS/TM
      Further optimize LOLLIPOP's implementation

References
   Manuvir Das, Ben Liblit, Manuel Fahndrich, and Jakob Rehof. Estimating
    the Impact of Scalable Pointer Analysis on Optimization. SAS 2001,
    260-278.

   Peng-Sheng Chen, Ming-Yu Hung, Yuan-Shin Hwang, Roy Dz-Ching Ju,
    and Jenq Kuen Lee. Compiler Support for Speculative Multithreading
    Architecture with Probabilistic Points-to Analysis. PPOPP 2003, 25-36.

   Jin Lin, Tong Chen, Wei-Chung Hsu, Pen-Chung Yew, Roy Dz-Ching Ju,
    Tin-Fook Ngai, and Sun Chan. A Compiler Framework for Speculative
    Analysis and Optimizations. PLDI 2003, 289-299.

   R.D. Ju, J. Collard, and K. Oukbir. Probabilistic Memory Disambiguation
    and its Application to Data Speculation. SIGARCH Comput. Archit. News
    27, 1999, 27-30.

   Manel Fernandez and Roger Espasa. Speculative Alias Analysis for
    Executable Code. PACT 2002, 221-231.

				