					 Pointer Analysis
       Lecture 2



    G. Ramalingam
Microsoft Research, India
               Recap:
 A basic pointer analysis algorithm
  CFG edges:
    1 -> 2 : x = &a
    2 -> 3 : y = x
    3 -> 4 : p = &x
    3 -> 5 : p = &y
    4 -> 6 : skip
    5 -> 6 : skip
    6 -> 7 : *x = &c
    7 -> 8 : *p = &c

  Abstract states:
    S1 = [x -> {null}, y -> {null}, p -> {null}, …]
    S2 = AS[x = &a] S1 = S1[x -> {a}]
    S3 = AS[y = x] S2 = S2[y -> S2(x)]
    …
    Abstract Transformers
• AS[stmt] : AbsDataState -> AbsDataState


• AS[ x = y ]    s = s[x -> s(y)]
• AS[ x = null ] s = s[x -> {null}]
• AS[ x = *y ]   s = s[x -> s*(s(y))]
  where s*({v1,…,vn}) = s(v1) ∪ … ∪ s(vn)
    Abstract Transformers
AS[stmt] : AbsDataState -> AbsDataState


AS[ *x = y ] s =
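A minimal Python sketch of these transformers over states that map each variable to a set of possible targets. The store case `*x = y` is an assumption here: it uses the common formulation of a strong update when s(x) is a single variable and a weak update otherwise. The function names (`assign_addr`, `assign_store`, `join`, ...) are illustrative, not part of the lecture.

```python
from typing import Dict, FrozenSet

NULL = "null"
AbsState = Dict[str, FrozenSet[str]]  # variable -> possible targets (NULL included)

def assign_addr(s: AbsState, x: str, a: str) -> AbsState:
    """x = &a"""
    return {**s, x: frozenset({a})}

def assign_copy(s: AbsState, x: str, y: str) -> AbsState:
    """x = y"""
    return {**s, x: s[y]}

def assign_null(s: AbsState, x: str) -> AbsState:
    """x = null"""
    return {**s, x: frozenset({NULL})}

def deref(s: AbsState, targets: FrozenSet[str]) -> FrozenSet[str]:
    """s*(targets): union of s(v) over the targets v; null contributes nothing."""
    out = set()
    for v in targets:
        if v != NULL:
            out |= s[v]
    return frozenset(out)

def assign_load(s: AbsState, x: str, y: str) -> AbsState:
    """x = *y"""
    return {**s, x: deref(s, s[y])}

def assign_store(s: AbsState, x: str, y: str) -> AbsState:
    """*x = y (assumed: strong update for one definite target, weak otherwise)."""
    targets = s[x] - {NULL}
    if len(s[x]) == 1 and len(targets) == 1:
        (z,) = targets
        return {**s, z: s[y]}                      # strong update: overwrite z
    # Weak update: every possible target keeps its old values as well.
    # A store through a definitely-null x leaves the state unchanged in this sketch.
    return {**s, **{z: s[z] | s[y] for z in targets}}

def join(s1: AbsState, s2: AbsState) -> AbsState:
    """Pointwise union, used where control-flow paths merge."""
    return {v: s1[v] | s2[v] for v in s1}

init = {v: frozenset({NULL}) for v in ("x", "y", "a", "b")}
s = assign_addr(init, "x", "a")               # x -> {a}
s = assign_addr(s, "y", "b")                  # y -> {b}
strong = assign_store(s, "x", "y")            # a is overwritten
merged = join(s, assign_addr(s, "x", "b"))    # x -> {a, b}
weak = assign_store(merged, "x", "y")         # a and b keep their old values
```

In the weak case both `a` and `b` retain `null`, because the analysis cannot tell which of them the store actually hit.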
        Andersen’s Analysis

• A flow-insensitive analysis
  – computes a single points-to solution valid at
    all program points
  – ignores control-flow – treats program as a
    set of statements
  – equivalent to merging all vertices into one
    (and applying algorithm A)
  – equivalent to adding an edge between every
    pair of vertices (and applying algo. A)

  – a solution R: Vars -> 2^Vars’ such that
       R ⊇ IdealMayPT(u) for every vertex u
           Example
    (Flow-Sensitive Analysis)

  Program:
    x = &a;
    y = x;
    x = &b;
    z = x;

  CFG edges:
    1 -> 2 : x = &a
    2 -> 3 : y = x
    3 -> 4 : x = &b
    4 -> 5 : z = x
              Example:
          Andersen’s Analysis

  Program:
    x = &a;
    y = x;
    x = &b;
    z = x;

  CFG edges:
    1 -> 2 : x = &a
    2 -> 3 : y = x
    3 -> 4 : x = &b
    4 -> 5 : z = x
       Andersen’s Analysis
• Strong updates?

• Initial state?
Why Flow-Insensitive Analysis?
• Reduced space requirements
  – a single points-to solution
• Reduced time complexity
  – no copying
     • individual updates more efficient
  – no need for joins
  – number of iterations?
  – a cubic-time algorithm
• Scales to millions of lines of code
  – most popular points-to analysis
     Andersen’s Analysis
A Set-Constraints Formulation
• Compute PTx for every variable x
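    Statement       Constraint
    x = null        PTx ⊇ {null}
    x = &y          PTx ⊇ {y}
    x = y           PTx ⊇ PTy
    x = *y          PTx ⊇ PTz for every z ∈ PTx's source PTy, i.e. for every z ∈ PTy
    *x = y          PTz ⊇ PTy for every z ∈ PTx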
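A small Python sketch of solving such set constraints by naive iteration to a fixed point. It assumes the standard inclusion constraints for Andersen's analysis (x = &y puts y in PTx, x = y makes PTx include PTy, and the two dereferencing forms range over the current points-to set of the dereferenced variable). The tuple encoding of statements and the name `andersen` are illustrative, and this simple loop is not the optimized cubic-time worklist algorithm.

```python
from collections import defaultdict

def andersen(stmts):
    """stmts: iterable of (kind, x, y) with kind in
    "null" (x = null), "addr" (x = &y), "copy" (x = y),
    "load" (x = *y), "store" (*x = y). Statement order is irrelevant."""
    pt = defaultdict(set)

    def include(dst, values):
        # Enforce pt[dst] ⊇ values; report whether pt[dst] grew.
        before = len(pt[dst])
        pt[dst] |= values
        return len(pt[dst]) != before

    changed = True
    while changed:                      # iterate until no constraint adds anything
        changed = False
        for kind, x, y in stmts:
            if kind == "null":
                changed |= include(x, {"null"})
            elif kind == "addr":
                changed |= include(x, {y})
            elif kind == "copy":
                changed |= include(x, pt[y])
            elif kind == "load":
                for z in list(pt[y]):
                    if z != "null":
                        changed |= include(x, pt[z])
            elif kind == "store":
                for z in list(pt[x]):
                    if z != "null":
                        changed |= include(z, pt[y])
    return {v: set(t) for v, t in pt.items() if t}

# x = &a; y = x; x = &b; z = x
program = [("addr", "x", "a"), ("copy", "y", "x"),
           ("addr", "x", "b"), ("copy", "z", "x")]
solution = andersen(program)
```

For this program the single solution gives x, y and z the same set {a, b}, whereas a flow-sensitive analysis would report y -> {a} after the second statement.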
     Steensgaard’s Analysis
• Unification-based analysis
• Inspired by type inference
  – an assignment “lhs := rhs” is interpreted as a
    constraint that lhs and rhs have the same
    type
  – the type of a pointer variable is the set of
    variables it can point-to
• “Assignment-direction-insensitive”
  – treats “lhs := rhs” as if it were both “lhs :=
    rhs” and “rhs := lhs”
• An almost-linear time algorithm
  – single-pass algorithm; no iteration required
              Example:
          Andersen’s Analysis

  Program:
    x = &a;
    y = x;
    y = &b;
    b = &c;

  CFG edges:
    1 -> 2 : x = &a
    2 -> 3 : y = x
    3 -> 4 : y = &b
    4 -> 5 : b = &c
               Example:
         Steensgaard’s Analysis

  Program:
    x = &a;
    y = x;
    y = &b;
    b = &c;

  CFG edges:
    1 -> 2 : x = &a
    2 -> 3 : y = x
    3 -> 4 : y = &b
    4 -> 5 : b = &c
   Steensgaard’s Analysis
• Can be implemented using Union-
  Find data-structure
• Leads to an almost-linear time
  algorithm
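A simplified Python sketch of the unification idea on top of a union-find structure. Each equivalence class of variables has at most one pointee class, and every assignment unifies the pointee of the left-hand side with the value of the right-hand side. The class and method names are hypothetical, and the sketch omits union-by-rank and the conditional joins of the full algorithm, so it illustrates the technique rather than the almost-linear bound.

```python
import itertools

class Steensgaard:
    def __init__(self):
        self.parent = {}                 # union-find parent links
        self.pointee = {}                # class representative -> node it points to
        self._fresh = itertools.count()  # supply of placeholder nodes

    def find(self, n):
        self.parent.setdefault(n, n)
        while self.parent[n] != n:
            self.parent[n] = self.parent[self.parent[n]]  # path halving
            n = self.parent[n]
        return n

    def target(self, n):
        """Representative of the class that n's class points to (created on demand)."""
        r = self.find(n)
        if r not in self.pointee:
            self.pointee[r] = ("tmp", next(self._fresh))
        return self.find(self.pointee[r])

    def union(self, a, b):
        a, b = self.find(a), self.find(b)
        if a == b:
            return
        self.parent[b] = a
        tb = self.pointee.pop(b, None)
        if tb is not None:
            if a in self.pointee:
                self.union(self.pointee[a], tb)   # merged classes share one pointee
            else:
                self.pointee[a] = tb

    def addr(self, x, y):    # x = &y
        self.union(self.target(x), y)

    def copy(self, x, y):    # x = y
        self.union(self.target(x), self.target(y))

    def load(self, x, y):    # x = *y
        self.union(self.target(x), self.target(self.target(y)))

    def store(self, x, y):   # *x = y
        self.union(self.target(self.target(x)), self.target(y))

    def points_to(self, x, variables):
        t = self.target(x)
        return {v for v in variables if self.find(v) == t}

# x = &a; y = x; y = &b; b = &c
an = Steensgaard()
an.addr("x", "a"); an.copy("y", "x"); an.addr("y", "b"); an.addr("b", "c")
names = ["x", "y", "a", "b", "c"]
pts_x = an.points_to("x", names)
pts_a = an.points_to("a", names)
```

Because `y = x` unifies the targets of x and y, the later `y = &b` also makes x point to b, and `b = &c` makes a point to c; an inclusion-based analysis would keep x -> {a} and give a no targets.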
    Exercise
x = &a;

y = x;

y = &b;

b = &c;

*x = &d;
May-Point-To Analyses
    Ideal-May-Point-To
                  ???


         Algorithm A

     more efficient / less precise


           Andersen’s

     more efficient / less precise


        Steensgaard’s
      Ideal Points-To Analysis:
          Definition Recap
• A sequence of states s1 s2 … sn is said to be an
  execution (of the program) iff
   – s1 is the Initial-State
   – si -> si+1 (si+1 is a successor of si) for 1 <= i < n
• A state s is said to be a reachable state iff there
  exists some execution s1 s2 … sn such that sn = s.
• RS(u) = { s | (u,s) is reachable }
• IdealMayPT (u) = { (p,x) | ∃ s ∈ RS(u). s(p) == x }
• IdealMustPT (u) = { (p,x) | ∀ s ∈ RS(u). s(p) == x }
Does Algorithm A Compute
The Most Precise Solution?
     Ideal <-> Algorithm A
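• Abstract away correlations between variables
  – relational analysis vs. independent-attribute analysis

  Reachable concrete states:
    x: &b   y: &x
    x: &y   y: &z
  Independent-attribute abstraction:
    x: {&y,&b}   y: {&x,&z}
  Concrete states the abstraction represents:
    x: &y   y: &x
    x: &y   y: &z
    x: &b   y: &x
    x: &b   y: &z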
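The loss of correlation can be reproduced with a few lines of Python. This is a sketch only: `alpha` and `gamma` are hypothetical names for the abstraction to per-variable sets and for the enumeration of all concrete states those sets allow.

```python
from itertools import product

def alpha(states):
    """Independent-attribute abstraction: one set of values per variable."""
    abstract = {}
    for st in states:
        for var, val in st.items():
            abstract.setdefault(var, set()).add(val)
    return abstract

def gamma(abstract):
    """All concrete states consistent with the per-variable sets."""
    names = sorted(abstract)
    return [dict(zip(names, combo))
            for combo in product(*(sorted(abstract[n]) for n in names))]

concrete = [{"x": "&b", "y": "&x"}, {"x": "&y", "y": "&z"}]
abstract = alpha(concrete)          # x: {&b, &y}, y: {&x, &z}
represented = gamma(abstract)       # every combination of the two sets
spurious = [s for s in represented if s not in concrete]
```

Two states go in and four come out: the combinations (x: &y, y: &x) and (x: &b, y: &z) were never reachable, but the per-variable sets cannot exclude them.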
     Is The Precise Solution
          Computable?
• Claim: The set RS(u) of reachable
  concrete states (for our language) is
  computable.

• Note: This is true for any collecting
  semantics with a finite state space.
Computing RS(u)
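One way to compute RS(u) when the state space is finite is a worklist exploration of (vertex, state) pairs. The Python sketch below assumes a CFG given as (source, transfer, destination) triples, with concrete states encoded as frozen sets of (variable, value) pairs and `None` standing for null. All function names are illustrative.

```python
from collections import deque

def freeze(d):
    return frozenset(d.items())

def addr_of(x, a):                      # x = &a
    return lambda s: freeze({**dict(s), x: a})

def copy(x, y):                         # x = y
    return lambda s: freeze({**dict(s), x: dict(s)[y]})

def skip(s):
    return s

def reachable_states(edges, entry, initial):
    """Return RS as a dict: vertex -> set of concrete states reachable there.
    A transfer function may return None to signal that execution is blocked."""
    seen = {(entry, initial)}
    work = deque(seen)
    while work:
        u, s = work.popleft()
        for src, transfer, dst in edges:
            if src != u:
                continue
            t = transfer(s)
            if t is not None and (dst, t) not in seen:
                seen.add((dst, t))
                work.append((dst, t))
    rs = {}
    for u, s in seen:
        rs.setdefault(u, set()).add(s)
    return rs

def ideal_may_pt(rs, u):
    """{ (p, x) | some s in RS(u) has s(p) == x }, ignoring null."""
    return {(p, x) for s in rs.get(u, ()) for p, x in s if x is not None}

edges = [
    (1, addr_of("x", "a"), 2),
    (2, copy("y", "x"), 3),
    (3, addr_of("p", "x"), 4),
    (3, addr_of("p", "y"), 5),
    (4, skip, 6),
    (5, skip, 6),
]
initial = freeze({"x": None, "y": None, "p": None})
rs = reachable_states(edges, 1, initial)
may6 = ideal_may_pt(rs, 6)
```

Termination relies on the finite number of (vertex, state) pairs; the set `seen` can still grow exponentially in the number of variables.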
   Precise Points-To Analysis:
           Decidability
• Corollary: Precise may-point-to analysis is
  computable.

• Corollary: Precise (demand) may-alias
  analysis is computable.
  – Given ptr-exp1, ptr-exp2, and a program point
    u, identify if there exists some reachable state at
    u where ptr-exp1 and ptr-exp2 are aliases.

• Ditto for must-point-to and must-alias

• … for our restricted language!
   Precise Points-To Analysis:
   Computational Complexity
• What’s the complexity of the least fixed-point
  computation using the collecting semantics?

• The worst-case complexity of computing
  reachable states is exponential in the number
  of variables.
  – Can we do better?

• Theorem: Computing precise may-point-to is
  PSPACE-hard even if we have only two-level
  pointers.
May-Point-To Analyses
    Ideal-May-Point-To
     more efficient / less precise


         Algorithm A

     more efficient / less precise


           Andersen’s

     more efficient / less precise


        Steensgaard’s
  Precise Points-To Analysis:
            Caveats
• Theorem: Precise may-alias analysis is
  undecidable in the presence of
  dynamic memory allocation.
  – Add “x = new/malloc ()” to language
  – State-space becomes infinite

• Digression: Integer variables +
  conditional-branching also makes any
  precise analysis undecidable.
May-Point-To Analyses
       Ideal (with Int, with Malloc)


Ideal (with Int)          Ideal (with Malloc)


         Ideal (no Int, no Malloc)


               Algorithm A


                   Andersen’s


               Steensgaard’s
  Dynamic Memory Allocation
• s: x = new () / malloc ()
• Assume, for now, that allocated object stores
  one pointer
  – s: x = malloc ( sizeof(void*) )
• Introduce a pseudo-variable Vs to represent
  objects allocated at statement s, and use
  previous algorithm
  – treat s as if it were “x = &Vs”
  – also track possible values of Vs
  – allocation-site based approach
• Key aspect: Vs represents a set of objects
  (locations), not a single object
  – referred to as a summary object (node)
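A sketch of the allocation-site approach in Python, restricted to straight-line code. The statement encoding, the function `run`, and the pseudo-variable name `V_s1` are assumptions made for illustration. The point it shows is that a store through a pointer to a summary object must be a weak update, because the pseudo-variable stands for many objects.

```python
def run(program, summaries):
    """program: list of (op, lhs, rhs) with op in
    "addr" (lhs = &rhs; also models 'lhs = new' at site rhs),
    "copy" (lhs = rhs), "store_addr" (*lhs = &rhs).
    summaries: names of pseudo-variables that stand for sets of objects."""
    state = {}
    for op, lhs, rhs in program:
        if op == "addr":
            state[lhs] = {rhs}
        elif op == "copy":
            state[lhs] = set(state.get(rhs, set()))
        elif op == "store_addr":
            targets = set(state.get(lhs, set()))
            # A strong update needs a single target that is one concrete location.
            strong = len(targets) == 1 and not (targets & summaries)
            for t in targets:
                if strong:
                    state[t] = {rhs}
                else:
                    state.setdefault(t, set()).add(rhs)
    return state

# s1: x = new;  y = x;  *y = &b;  *y = &a;
program = [("addr", "x", "V_s1"), ("copy", "y", "x"),
           ("store_addr", "y", "b"), ("store_addr", "y", "a")]
with_summary = run(program, summaries={"V_s1"})
as_plain_var = run(program, summaries=set())
```

With `V_s1` treated as a summary object the result keeps both b and a; treating it as an ordinary variable would overwrite b, which is only justified when the abstract location denotes exactly one object.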
 Dynamic Memory Allocation:
         Example

  Program:
    x = new;
    y = x;
    *y = &b;
    *y = &a;

  CFG edges:
    1 -> 2 : x = new
    2 -> 3 : y = x
    3 -> 4 : *y = &b
    4 -> 5 : *y = &a
Dynamic Memory Allocation:
 Summary Object Update



  CFG edge:
    4 -> 5 : *y = &a
 Dynamic Memory Allocation:
       Object Fields
• Field-sensitive analysis
  class Foo {
     A* f;
     B* g;
  }
  s: x = new Foo()

  x->f = &b;

  x->g = &a;
 Dynamic Memory Allocation:
       Object Fields
• Field-insensitive analysis
  class Foo {
     A* f;
     B* g;
  }
  s: x = new Foo()

  x->f = &b;

  x->g = &a;
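The difference between the two treatments of fields can be sketched in Python as a choice of key for the points-to map. The function `analyze`, the tuple encoding of stores, and the pseudo-variable `V_s` for the allocation site are illustrative assumptions.

```python
def analyze(stores, field_sensitive):
    """stores: list of (obj, field, target) for 'obj->field = &target',
    where obj is the abstract object the base pointer refers to."""
    pts = {}
    for obj, field, target in stores:
        # Field-sensitive: one entry per (object, field).
        # Field-insensitive: all fields of an object share one entry.
        key = (obj, field) if field_sensitive else (obj, "*")
        pts.setdefault(key, set()).add(target)
    return pts

# s: x = new Foo();  x->f = &b;  x->g = &a;
stores = [("V_s", "f", "b"), ("V_s", "g", "a")]
sensitive = analyze(stores, field_sensitive=True)
insensitive = analyze(stores, field_sensitive=False)
```

The field-sensitive result keeps f -> {b} and g -> {a} apart, while the field-insensitive one reports that any field of the object may point to a or b.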
Interpreting Branch
     Conditions
   Conditional Control-Flow
     (In The Concrete Semantics)
• Encoding conditional-control-flow
  – using “assume” statements

  Program:
    if (P) then
       S1;
    else
       S2;
    endif

  CFG edges:
    1 -> 2 : assume P
    2 -> 3 : S1
    1 -> 4 : assume !P
    4 -> 5 : S2
   Conditional Control-Flow
     (In The Concrete Semantics)
• Semantics of “assume” statements
  – DataState -> {true,false}



  Program:
    if (P) then
       S1;
    else
       S2;
    endif

  CFG edges:
    1 -> 2 : assume P
    2 -> 3 : S1
    1 -> 4 : assume !P
    4 -> 5 : S2
         Abstracting “assume”
              statements
  Program:
    if (x != null) then
       y = x;
    else
       …
    endif

  CFG edges:
    1 -> 2 : assume (x != null)
    2 -> 3 : y = x
    1 -> 4 : assume (x == null)
    4 -> 5 : S2
    Abstracting “assume”
         statements

  CFG edge:
    2 -> 3 : assume x == y
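A possible abstraction of these assume statements over points-to sets, sketched in Python. It assumes that each set over-approximates the values a variable may hold, so a condition can only remove values that contradict it; `None` stands for an unreachable state. The function names are hypothetical.

```python
def assume_eq(s, x, y):
    """assume x == y: both variables must hold a value common to both sets."""
    common = s[x] & s[y]
    if not common:
        return None                      # no concrete state satisfies the condition
    return {**s, x: common, y: common}

def assume_not_null(s, x):
    """assume (x != null): drop null from the set of x."""
    rest = s[x] - {"null"}
    return {**s, x: rest} if rest else None

def assume_null(s, x):
    """assume (x == null): keep only null, if x may be null at all."""
    return {**s, x: frozenset({"null"})} if "null" in s[x] else None

state = {"x": frozenset({"a", "null"}), "y": frozenset({"a", "b"})}
after_eq = assume_eq(state, "x", "y")            # x -> {a}, y -> {a}
after_not_null = assume_not_null(state, "x")     # x -> {a}
blocked = assume_null(after_not_null, "x")       # None: x cannot be null here
```

Returning the input state unchanged is always a sound alternative; the refinements above only add precision on the branch where the condition holds.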
          Other Aspects
• Context-sensitivity
• Indirect (virtual) function calls and
  call-graph construction
• Pointer arithmetic
• Object-sensitivity
        Andersen’s Analysis:
Further Optimizations and Extensions
• Fahndrich et al., Partial online cycle elimination in
  inclusion constraint graphs, PLDI 1998.
• Rountev and Chandra, Offline variable substitution
  for scaling points-to analysis, 2000.
• Heintze and Tardieu, Ultra-fast aliasing analysis using
  CLA: a million lines of C code in a second, PLDI
  2001.
• M. Hind, Pointer analysis: Haven’t we solved this
  problem yet?, PASTE 2001.
• Hardekopf and Lin, The ant and the grasshopper: fast
  and accurate pointer analysis for millions of lines of
  code, PLDI 2007.
• Hardekopf and Lin, Exploiting pointer and location
  equivalence to optimize pointer analysis, SAS 2007.
• Hardekopf and Lin, Semi-sparse flow-sensitive
  pointer analysis, POPL 2009.
       Andersen’s Analysis:
      Further Optimizations
• Cycle Elimination
  – Offline
  – Online
• Pointer Variable Equivalence
    Context-Sensitivity Etc.
• Liang & Harrold, Efficient computation of
  parameterized pointer information for
  interprocedural analyses. SAS 2001.
• Lattner et al., Making context-sensitive points-to
  analysis with heap cloning practical for the real
  world, PLDI 2007.
• Zhu & Calman, Symbolic pointer analysis revisited.
  PLDI 2004.
• Whaley & Lam, Cloning-based context-sensitive
  pointer alias analysis using BDD, PLDI 2004.
• Rountev et al. Points-to analysis for Java using
  annotated constraints. OOPSLA 2001.
• Milanova et al. Parameterized object sensitivity for
  points-to and side-effect analyses for Java. ISSTA
  2002.
            Applications
• Compiler optimizations

• Verification & Bug Finding
  – use in preliminary phases
  – use in verification itself
Questions?

				