					 Pointer Analysis



    G. Ramalingam
Microsoft Research, India
A Constant Propagation Example


   x = 3;
                •   x is always 3 here
                •   can replace x by 3
   y = 4;       •   and replace x+5 by 8
                •   and so on
   z = x + 5;
A Constant Propagation Example
         With Pointers


 x = 3;
              • Is x always 3 here?

 *p = 4;

 z = x + 5;
    A Constant Propagation Example
             With Pointers
 p = &y;           if (?)               p = &x;
 x = 3;              p = &x;            x = 3;
 *p = 4;           else                 *p = 4;
 z = x + 5;          p = &y;            z = x + 5;
                   x = 3;
 x is always 3     *p = 4;              x is always 4
                   z = x + 5;

                         x may be 3 or 4
              (i.e., x is unknown in our lattice)

           Pointers affect most program analyses.
    A Constant Propagation Example
             With Pointers
 p = &y;        if (?)          p = &x;
 x = 3;           p = &x;       x = 3;
 *p = 4;        else            *p = 4;
 z = x + 5;       p = &y;       z = x + 5;
                x = 3;
 p always       *p = 4;        p always
points-to y     z = x + 5;    points-to x


              p may point-to x or y
       Points-to Analysis
• Determine the set of targets a pointer
  variable could point-to (at different
  points in the program)
  – “p points-to x”
    • “p stores the value &x”
    • “*p denotes the location x”
  – targets could be variables or locations in
    the heap (dynamic memory allocation)
    • p = &x;
    • p = new Foo(); or p = malloc (…);
  – must-point-to vs. may-point-to
A Constant Propagation Example
         With Pointers


 *q = 3;        Can *p denote the
                same location as *q?
 *p = 4;

 z = *q + 5;
               what values can
               this take?
       More Terminology
• *p and *q are said to be aliases (in a
  given concrete state) if they
  represent the same location
• Alias analysis
  – determine if a given pair of references
    could be aliases at a given program
    point
  – *p may-alias *q
  – *p must-alias *q
         Pointer Analysis
• Points-To Analysis   • Alias Analysis
  – may-point-to         – may-alias
  – must-point-to        – must-alias
            Points-To Analysis:
            A Simple Example


p = &x;
q = &y;
if (?) {
   q = p;
}
x = &a;
y = &b;
z = *q;
                 Points-To Analysis:
                 A Simple Example
             p        q       x      y      z

            null     null    null   null   null
p = &x;
             x       null    null   null   null
q = &y;
             x        y      null   null   null
if (?) {
             x        y      null   null   null
   q = p;
             x        x      null   null   null
}
             x       {x,y}   null   null   null
x = &a;      x       {x,y}    a     null   null
y = &b;      x       {x,y}    a      b     null
z = *q;      x       {x,y}    a      b     {a,b}
        Points-To Analysis
x = &a;
y = &b;
if (?) {         How should we handle
   p = &x;       this statement?
} else {
   p = &y;
}
             x: a       y: b       p: {x,y}   a: null
*x = &c;                                                 (strong update)
             x: a       y: b       p: {x,y}   a: c
*p = &c;                                                 (weak update)
             x: {a,c}   y: {b,c}   p: {x,y}   a: c
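The two updates above can be sketched in Python. This is only an illustration under assumptions: the function name `store_addr`, the dict-of-sets state, and the rule "strong update only when the pointer has exactly one non-null target" are choices made here, not definitions from the slides (which pose the rule as a question).

```python
NULL = "null"  # hypothetical encoding of the slides' null


def store_addr(state, ptr, target):
    """Abstract effect of `*ptr = &target` on a may-point-to state.

    state maps each variable to the set of variables it may point to.
    """
    pointees = state[ptr]
    new_state = {var: set(pts) for var, pts in state.items()}
    if len(pointees) == 1 and NULL not in pointees:
        # Strong update: the written location is known exactly,
        # so its old contents are replaced.
        (loc,) = pointees
        new_state[loc] = {target}
    else:
        # Weak update: any pointee might be the one written, so each
        # keeps its old contents and gains the new target.
        for loc in pointees - {NULL}:
            new_state[loc] = new_state[loc] | {target}
    return new_state


before = {"x": {"a"}, "y": {"b"}, "p": {"x", "y"},
          "a": {NULL}, "b": {NULL}, "c": {NULL}}
after_x = store_addr(before, "x", "c")   # *x = &c  (strong)
after_p = store_addr(after_x, "p", "c")  # *p = &c  (weak)
```

Here `after_x["a"]` is `{"c"}` and `after_p` maps x to `{"a", "c"}` and y to `{"b", "c"}`, matching the three rows of the slide.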
                  Questions
• When is it correct to use a strong
  update? A weak update?

• Is this points-to analysis precise?

• What does it mean to say
  –   p   must-point-to x at pgm point u
  –   p   may-point-to x at pgm point u
  –   p   must-not-point-to x at u
  –   p   may-not-point-to x at u
 Points-To Analysis, Formally

• We must formally define what we
  want to compute before we can
  answer many such questions
  Static Program Analysis
• A static program analysis computes
  approximate information about the
  runtime behavior of a given program
  1. The set of valid programs is defined by
     the programming language syntax
  2. The runtime behavior of a given program
     is defined by the programming language
     semantics
  3. The analysis problem defines what
     information is desired
  4. The analysis algorithm determines what
     approximation to make
    Programming Language:
           Syntax
• A program consists of
  – a set of variables Var
  – a directed graph (V,E,entry) with a
    distinguished entry vertex, with every edge
    labelled by a primitive statement
• A primitive statement is of the form
  • x = null
  • x=y             Omitted (for now)
                    • Dynamic memory allocation
  • x = *y
                    • Pointer arithmetic
  • x = &y;         • Structures and fields
  • *x = y          • Procedures
  • skip
  (where x and y are variables in Var)
        Example Program
Vars = {x,y,p,a,b,c}

x = &a;
y = &b;
if (?) {
   p = &x;
} else {
   p = &y;
}
*x = &c;
*p = &c;

CFG (entry = 1):
   1 -> 2 : x = &a
   2 -> 3 : y = &b
   3 -> 4 : p = &x
   3 -> 5 : p = &y
   4 -> 6 : skip
   5 -> 6 : skip
   6 -> 7 : *x = &c
   7 -> 8 : *p = &c
    Programming Language:
     Operational Semantics
• Operational semantics == an
  interpreter (defined mathematically)
• State
  – Data-State ::= Var -> (Var U {null})
  – PC ::= V (the vertex set of the CFG)
  – Program-State ::= PC x Data-State
• Initial-state:
  – (entry, \x. null)
                    Example States
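Vars = {x,y,p,a,b,c}      (N abbreviates null)

Initial data-state
   x: N, y: N, p: N, a: N, b: N, c: N

Initial program-state
   <1, x: N, y: N, p: N, a: N, b: N, c: N>

CFG (entry = 1):
   1 -> 2 : x = &a
   2 -> 3 : y = &b
   3 -> 4 : p = &x
   3 -> 5 : p = &y
   4 -> 6 : skip
   5 -> 6 : skip
   6 -> 7 : *x = &c
   7 -> 8 : *p = &c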
      Programming Language:
       Operational Semantics
• Meaning of primitive statements
    – CS[stmt] : Data-State -> Data-State

•   CS[ x = y ] s =
•   CS[ x = *y ] s =
•   CS[ *x = y ] s =
•   CS[ x = null ] s =
•   CS[ x = &y ] s = s[x ↦ y]
      Programming Language:
       Operational Semantics
• Meaning of primitive statements
    – CS[stmt] : Data-State -> Data-State

•   CS[ x = y ] s = s[x ↦ s(y)]
•   CS[ x = *y ] s = s[x ↦ s(s(y))]
•   CS[ *x = y ] s = s[s(x) ↦ s(y)]
•   CS[ x = null ] s = s[x ↦ null]
•   CS[ x = &y ] s = s[x ↦ y]

    (must say what happens if null is dereferenced)
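The concrete transformers can be read as a small interpreter. The following Python sketch is an illustration only: the tuple encoding of statements, the name `cs`, and the decision to raise an exception on a null dereference are assumptions made here (the slide only notes that this case must be specified).

```python
NULL = None  # stands for the slides' null


def cs(stmt, s):
    """Return the data-state CS[stmt] s as a new dict; s is not modified."""
    op, *args = stmt
    t = dict(s)
    if op == "copy":        # x = y
        x, y = args
        t[x] = s[y]
    elif op == "load":      # x = *y
        x, y = args
        if s[y] is NULL:
            raise ValueError("null dereference in x = *y")
        t[x] = s[s[y]]
    elif op == "store":     # *x = y
        x, y = args
        if s[x] is NULL:
            raise ValueError("null dereference in *x = y")
        t[s[x]] = s[y]
    elif op == "null":      # x = null
        (x,) = args
        t[x] = NULL
    elif op == "addr":      # x = &y
        x, y = args
        t[x] = y
    elif op != "skip":
        raise ValueError(f"unknown statement {stmt!r}")
    return t


s0 = {v: NULL for v in ["x", "y", "z", "a"]}
s1 = cs(("addr", "x", "a"), s0)   # x = &a
s2 = cs(("addr", "y", "x"), s1)   # y = &x
s3 = cs(("load", "z", "y"), s2)   # z = *y, so z now holds &a
s4 = cs(("store", "y", "y"), s3)  # *y = y, so x now holds &x
```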
    Programming Language:
     Operational Semantics
• Meaning of program
  – a transition relation → on program-states
  – → ⊆ Program-State × Program-State
  – state1 → state2 means that the execution of
    some edge in the program can transform
    state1 into state2
• Defining →
  – (u,s) → (v,s’) iff the program contains a
    control-flow edge u->v labelled with a
    statement stmt such that CS[stmt]s = s’
    Programming Language:
     Operational Semantics
• A sequence of states s1 s2 … sn is said to
  be an execution (of the program) iff
  – s1 is the Initial-State
  – si → si+1 for 1 <= i < n
• A state s is said to be a reachable
  state iff there exists some execution
  s1 s2 … sn such that sn = s.
• Define RS(u) = { s | (u,s) is reachable }
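Because the state space of this language is finite, RS(u) can be computed by exhaustive search over the transition relation. The Python sketch below does this for the example program with vertices 1 to 8. The names `step`, `EDGES`, and `reachable_states` are hypothetical, the `store_addr` form covers `*x = &y` as written in the example (it is not in the primitive statement list), and treating a null dereference as "no successor state" is an assumption.

```python
from collections import deque

NULL = None


def step(stmt, s):
    """Concrete effect of one edge label; None if null is dereferenced."""
    op, *args = stmt
    t = dict(s)
    if op == "addr":            # x = &y
        t[args[0]] = args[1]
    elif op == "store_addr":    # *x = &y, as used in the example program
        if s[args[0]] is NULL:
            return None
        t[s[args[0]]] = args[1]
    # "skip" leaves the state unchanged
    return t


EDGES = [
    (1, 2, ("addr", "x", "a")),
    (2, 3, ("addr", "y", "b")),
    (3, 4, ("addr", "p", "x")),
    (3, 5, ("addr", "p", "y")),
    (4, 6, ("skip",)),
    (5, 6, ("skip",)),
    (6, 7, ("store_addr", "x", "c")),
    (7, 8, ("store_addr", "p", "c")),
]
VARS = ["x", "y", "p", "a", "b", "c"]


def freeze(s):
    """Hashable form of a data-state: pairs sorted by variable name."""
    return tuple(sorted(s.items(), key=lambda kv: kv[0]))


def reachable_states(edges, variables, entry):
    """Return a dict mapping each vertex u to RS(u) as a set of frozen states."""
    init = (entry, freeze({v: NULL for v in variables}))
    seen, work = {init}, deque([init])
    while work:
        u, fs = work.popleft()
        for src, dst, stmt in edges:
            if src != u:
                continue
            t = step(stmt, dict(fs))
            if t is None:
                continue
            nxt = (dst, freeze(t))
            if nxt not in seen:
                seen.add(nxt)
                work.append(nxt)
    rs = {}
    for u, fs in seen:
        rs.setdefault(u, set()).add(fs)
    return rs


RS = reachable_states(EDGES, VARS, entry=1)
```

At vertex 8 there are two reachable data-states, one for each branch: in one p holds &x and x holds &c, in the other p holds &y and y holds &c.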
    Programming Language:
     Operational Semantics


                               All of this
                           formalism for this
                             one definition




• Define RS(u) = { s | (u,s) is reachable }
    Ideal Points-To Analysis:
        Formal Definition
• Let u denote a vertex in the CFG

• Define IdealMustPT (u) to be
   { (p,x) | forall s in RS(u). s(p) == x }

• Define IdealMayPT (u) to be
  { (p,x) | exists s in RS(u). s(p) == x }
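Given RS(u) as an explicit set of data-states, both definitions translate directly into set comprehensions. The Python sketch below uses hypothetical names (`ideal_may_pt`, `ideal_must_pt`) and hand-writes the two states reachable at vertex 6 of the example program, so it illustrates the definitions rather than any particular tool.

```python
NULL = "null"


def ideal_may_pt(states):
    """{ (p,x) | exists s in states. s(p) == x }"""
    return {(p, x) for s in states for p, x in s.items()}


def ideal_must_pt(states, variables):
    """{ (p,x) | forall s in states. s(p) == x }"""
    targets = list(variables) + [NULL]
    return {(p, x) for p in variables for x in targets
            if all(s[p] == x for s in states)}


VARS = ["x", "y", "p", "a", "b", "c"]
base = {"x": "a", "y": "b", "a": NULL, "b": NULL, "c": NULL}
# RS(6) for the example program: the two branches differ only in p.
rs_6 = [dict(base, p="x"), dict(base, p="y")]

may_6 = ideal_may_pt(rs_6)
must_6 = ideal_must_pt(rs_6, VARS)
```

Here p may point to x or y but must point to neither, while x must point to a. Note that the "forall" is vacuously true when RS(u) is empty, so an unreachable vertex gets every pair in its must set.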
    May-Point-To Analysis:
Formal Requirement Specification

          May Point-To Analysis
     Compute R: V -> 2^(Var × Var’) such that
         R(u) ⊇ IdealMayPT(u)
       (where Var’ = Var U {null})
      For every vertex u in the CFG,
       compute a set R(u) such that
    R(u) ⊇ { (p,x) | ∃ s ∈ RS(u). s(p) == x }
    May-Point-To Analysis:
Formal Requirement Specification
            Compute R: V -> 2^(Var × Var’) such that
                R(u) ⊇ IdealMayPT(u)
• An algorithm is said to be correct if the solution R it
  computes satisfies
             ∀ u ∈ V. R(u) ⊇ IdealMayPT(u)
• An algorithm is said to be precise if the solution R it
  computes satisfies
             ∀ u ∈ V. R(u) = IdealMayPT(u)
• An algorithm that computes a solution R1 is said to
  be more precise than one that computes a solution
  R2 if
                 ∀ u ∈ V. R1(u) ⊆ R2(u)
                 Back To Our
            May-Point-To Algorithm
              p      q       x      y      z

             null   null    null   null   null
p = &x;
              x     null    null   null   null
q = &y;
              x      y      null   null   null
if (?) {
              x      y      null   null   null
   q = p;
              x      x      null   null   null
}
              x     {x,y}   null   null   null
x = &a;       x     {x,y}    a     null   null
y = &b;       x     {x,y}    a      b     null
z = *q;       x     {x,y}    a      b     {a,b}
    (May-Point-To Analysis)
         Algorithm A
• Is this algorithm correct?
• Is this algorithm precise?

• Let’s first completely and formally
  define the algorithm.
  Algorithm A: A Formal Definition
  The “Data Flow Analysis” Recipe
• Define semi-lattice of abstract-values
  – AbsDataState ::= Var -> 2^Var’
  – f1 ⊔ f2 = \x. (f1(x) ∪ f2(x))
  – bottom = \x. {}
• Define initial abstract-value
  – InitialAbsState = \x. {null}
• Define transformers for primitive
  statements
    • AS[stmt] : AbsDataState -> AbsDataState
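The semi-lattice part of the recipe is small enough to write out. In this Python sketch an abstract data-state is a dict from variable names to frozensets of targets; the helper names (`bottom`, `join`, `leq`, `initial_abs_state`) are hypothetical and chosen only for this illustration.

```python
NULL = "null"


def bottom(variables):
    """bottom = \\x. {}"""
    return {x: frozenset() for x in variables}


def initial_abs_state(variables):
    """InitialAbsState = \\x. {null}"""
    return {x: frozenset({NULL}) for x in variables}


def join(f1, f2):
    """Pointwise union of two abstract data-states over the same variables."""
    return {x: f1[x] | f2[x] for x in f1}


def leq(f1, f2):
    """The partial order induced by join: pointwise subset."""
    return all(f1[x] <= f2[x] for x in f1)


# Merging the two branches of `if (?) { q = p; }` when p points to x:
then_branch = {"p": frozenset({"x"}), "q": frozenset({"x"})}
else_branch = {"p": frozenset({"x"}), "q": frozenset({"y"})}
merged = join(then_branch, else_branch)
```

After the merge, q maps to {x, y} while p still maps to {x}.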
   Algorithm A: A Formal Definition
   The “Data Flow Analysis” Recipe
• Let st(v,u) denote stmt on edge v->u
                x(v)           x(w)
                  v             w

              st(v,u)          st(w,u)
                         u
                        x(u)
• Compute the least-fixed-point of the following
  “dataflow equations”
  – x(entry) = InitialAbsState
  – x(u) = ⊔ { AS[st(v,u)] x(v) | v->u is an edge }
             Algorithm A:
           The Transformers
• Abstract transformers for primitive
  statements
    – AS[stmt] : AbsDataState -> AbsDataState
• AS[ x = y ] s = s[x ↦ s(y)]
• AS[ x = null ] s = s[x ↦ {null}]
• AS[ x = &y ] s = s[x ↦ {y}]
• AS[ x = *y ] s = s[x ↦ s*(s(y))]
  where s*({v1,…,vn}) = s(v1) ∪ … ∪ s(vn)
• AS[ *x = y ] s = ???
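Putting the lattice, the dataflow equations, and the four transformers defined so far together gives a runnable sketch of Algorithm A. It is applied to the simple example (p = &x; q = &y; if (?) { q = p; } x = &a; y = &b; z = *q;). Everything named here (`transfer`, `analyze`, the edge list and its vertex numbering) is an assumption for illustration, as is ignoring a possible null target in x = *y. The transformer for *x = y is deliberately left out, since the slide leaves it open.

```python
NULL = "null"


def join(f1, f2):
    return {x: f1[x] | f2[x] for x in f1}


def transfer(stmt, s):
    """AS[stmt] s for the statements whose transformer is defined above."""
    op, *args = stmt
    t = dict(s)
    if op == "copy":        # x = y
        x, y = args
        t[x] = s[y]
    elif op == "null":      # x = null
        (x,) = args
        t[x] = frozenset({NULL})
    elif op == "addr":      # x = &y
        x, y = args
        t[x] = frozenset({y})
    elif op == "load":      # x = *y, using s*(s(y))
        x, y = args
        # Assumption: a possible null target contributes nothing.
        t[x] = frozenset().union(*(s[v] for v in s[y] if v != NULL))
    elif op != "skip":
        raise ValueError(f"no transformer defined for {stmt!r}")
    return t


def analyze(edges, variables, entry):
    """Iterate the dataflow equations upward from bottom until stable."""
    vertices = {entry} | {v for v, _, _ in edges} | {u for _, u, _ in edges}
    X = {u: {x: frozenset() for x in variables} for u in vertices}
    X[entry] = {x: frozenset({NULL}) for x in variables}
    changed = True
    while changed:
        changed = False
        for v, u, stmt in edges:
            new = join(X[u], transfer(stmt, X[v]))
            if new != X[u]:
                X[u], changed = new, True
    return X


EDGES = [
    (1, 2, ("addr", "p", "x")),
    (2, 3, ("addr", "q", "y")),
    (3, 4, ("copy", "q", "p")),   # then-branch: q = p
    (3, 4, ("skip",)),            # branch not taken
    (4, 5, ("addr", "x", "a")),
    (5, 6, ("addr", "y", "b")),
    (6, 7, ("load", "z", "q")),
]
X = analyze(EDGES, ["p", "q", "x", "y", "z", "a", "b"], entry=1)
```

The state at the final vertex, `X[7]`, has q mapped to {x, y} and z mapped to {a, b}, matching the last row of the table in the simple example.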
   Correctness & Precision
• We have a complete & formal
  definition of the problem.
• We have a complete & formal
  definition of a proposed solution.

• How do we reason about the
  correctness & precision of the
  proposed solution?
        Enter: The French Recipe
        (Abstract Interpretation)
Concrete Domain
• Concrete states: C
• Semantics: For every statement st,
         CS[st] : C -> C




                         α (abstraction)
     2^Data-State   -------------------->   2^(Var × Var’)
                    <--------------------
                         γ (concretization)
   Points-To Analysis
(Abstract Interpretation)

                               MayPT(u)
                                  ⊇
   RS(u)   ------- α ------->  IdealMayPT(u)

2^Data-State                   2^(Var × Var’)

  α(Y) = { (p,x) | exists s in Y. s(p) == x }

           IdealMayPT (u) = α ( RS(u) )
Approximating Transformers:
   Correctness Criterion
   c is said to be correctly approximated by a
                 iff  α(c) ≤ a

    c1  ---- correctly approximated by ---->  a1
    |                                         |
    | f                                       | f#
    v                                         v
    c2  ---- correctly approximated by ---->  a2

    C                                         A
Approximating Transformers:
   Correctness Criterion

    c1  <------- concretization γ --------  a1
    |                                       |
    | f                                     | f#
    v                                       v
    c2  --------- abstraction α --------->  a2

    C                                       A

    requirement:  f#(a1) ≥ α(f(γ(a1)))
     Concrete Transformers
• CS[stmt] : Data-State -> Data-State

•   CS[ x = y ] s = s[x ↦ s(y)]
•   CS[ x = *y ] s = s[x ↦ s(s(y))]
•   CS[ *x = y ] s = s[s(x) ↦ s(y)]
•   CS[ x = null ] s = s[x ↦ null]

• CS*[stmt] : 2^Data-State -> 2^Data-State
• CS*[st] X = { CS[st]s | s ∈ X }
    Abstract Transformers
• AS[stmt] : AbsDataState -> AbsDataState


• AS[ x = y ] s = s[x ↦ s(y)]
• AS[ x = null ] s = s[x ↦ {null}]
• AS[ x = *y ] s = s[x ↦ s*(s(y))]
  where s*({v1,…,vn}) = s(v1) ∪ … ∪ s(vn)
• AS[ *x = y ] s = ???
    Algorithm A: Transformers
      Weak/Strong Update
Concrete (C)                          Abstract (A)

x: &y y: &x z: &a    <---- γ ----     x: {&y} y: {&x,&z} z: {&a}
x: &y y: &z z: &a

     | f: *y = &b;                         | f#: *y = &b;
     v                                     v

x: &b y: &x z: &a    ----- α ---->    x: {&y,&b} y: {&x,&z} z: {&a,&b}
x: &y y: &z z: &b
    Algorithm A: Transformers
      Weak/Strong Update
Concrete (C)                          Abstract (A)

x: &y y: &x z: &a    <---- γ ----     x: {&y} y: {&x,&z} z: {&a}
x: &y y: &z z: &a

     | f: *x = &b;                         | f#: *x = &b;
     v                                     v

x: &y y: &b z: &a    ----- α ---->    x: {&y} y: {&b} z: {&a}
x: &y y: &b z: &a
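Both diagrams can be replayed mechanically: concretize the abstract state, run the concrete statement on each concrete state, abstract the results, and compare with the abstract transformer. The Python sketch below does this for the state x: {&y}, y: {&x,&z}, z: {&a}. The function names, the restriction to the three variables shown, and the weak/strong rule inside `f_sharp` are assumptions made for this illustration; null targets are not modelled.

```python
from itertools import product

VARS = ["x", "y", "z"]


def gamma(a):
    """All concrete states s with s(v) in a(v) for every variable v."""
    choices = [sorted(a[v]) for v in VARS]
    return [dict(zip(VARS, pick)) for pick in product(*choices)]


def alpha(states):
    """Pointwise collection of the values seen across the states."""
    return {v: {s[v] for s in states} for v in VARS}


def f(s, ptr, target):
    """Concrete *ptr = &target."""
    t = dict(s)
    t[s[ptr]] = target
    return t


def f_sharp(a, ptr, target, allow_strong=True):
    """Abstract *ptr = &target: strong if ptr has one target, else weak."""
    t = {v: set(a[v]) for v in a}
    if allow_strong and len(a[ptr]) == 1:
        (loc,) = a[ptr]
        t[loc] = {target}
    else:
        for loc in a[ptr]:
            t[loc] = t[loc] | {target}
    return t


def leq(a1, a2):
    return all(a1[v] <= a2[v] for v in VARS)


a1 = {"x": {"y"}, "y": {"x", "z"}, "z": {"a"}}

# *y = &b: y has two possible targets, so the update is weak.
best_y = alpha([f(s, "y", "b") for s in gamma(a1)])
abs_y = f_sharp(a1, "y", "b")

# *x = &b: x has exactly one target, so a strong update is possible.
best_x = alpha([f(s, "x", "b") for s in gamma(a1)])
abs_x = f_sharp(a1, "x", "b")
abs_x_weak = f_sharp(a1, "x", "b", allow_strong=False)
```

In both cases the abstract result equals α(f(γ(a1))), so the requirement f#(a1) ≥ α(f(γ(a1))) holds with equality. Forcing a weak update for *x = &b (`abs_x_weak`) still satisfies the requirement but keeps the stale targets of y, so it is correct yet less precise.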
  Algorithm A: Transformers
     Weak/Strong Update
• Transformer for “*p = q”

				