```
Pointer Analysis
Lecture 2

G. Ramalingam
Microsoft Research, India
Recap:
A basic pointer analysis algorithm
1              S1 = [x -> {null}, y -> {null}, p -> {null},…]
x = &a
2              S2 = AS[x = &a] S1           S2 = S1 [x -> {a}]
y=x
3              S3 = AS[y = x] S2           S3 = S2 [y -> S2(x)]
p = &x         p = &y
4            5              …
skip         skip
6
*x = &c
7                     …
*p = &c
8
Abstract Transformers
• AS[stmt] : AbsDataState -> AbsDataState

• AS[ x = y ]    s = s[x -> s(y)]
• AS[ x = null ] s = s[x -> {null}]
• AS[ x = *y ]   s = s[x -> s*(s(y))]
where s*({v1,…,vn}) = s(v1) ∪ … ∪ s(vn)
Abstract Transformers
AS[stmt] : AbsDataState -> AbsDataState

AS[ *x = y ] s =
Andersen’s Analysis

• A flow-insensitive analysis
– computes a single points-to solution valid at
all program points
– ignores control-flow – treats program as a
set of statements
– equivalent to merging all vertices into one
(and applying algorithm A)
– equivalent to adding an edge between every
pair of vertices (and applying algo. A)

– a solution R: Vars -> 2^Vars’ such that
R ⊇ IdealMayPT(u) for every vertex u
Example
(Flow-Sensitive Analysis)

Program     CFG edge
x = &a;     1 -> 2
y = x;      2 -> 3
x = &b;     3 -> 4
z = x;      4 -> 5
Example:
Andersen’s Analysis

Program     CFG edge
x = &a;     1 -> 2
y = x;      2 -> 3
x = &b;     3 -> 4
z = x;      4 -> 5
Andersen’s Analysis
• Strong updates?

• Initial state?
Why Flow-Insensitive Analysis?
• Reduced space requirements
– a single points-to solution
• Reduced time complexity
– no copying
• individual updates more efficient
– no need for joins
– number of iterations?
– a cubic-time algorithm
• Scales to millions of lines of code
– most popular points-to analysis
Andersen’s Analysis
A Set-Constraints Formulation
• Compute PTx for every variable x
Statement       Constraint

x = null

x = &y

x=y

x = *y

*x = y
Steensgaard’s Analysis
• Unification-based analysis
• Inspired by type inference
– an assignment “lhs := rhs” is interpreted as a
constraint that lhs and rhs have the same
type
– the type of a pointer variable is the set of
variables it can point-to
• “Assignment-direction-insensitive”
– treats “lhs := rhs” as if it were both “lhs :=
rhs” and “rhs := lhs”
• An almost-linear time algorithm
– single-pass algorithm; no iteration required
Example:
Andersen’s Analysis

Program     CFG edge
x = &a;     1 -> 2
y = x;      2 -> 3
y = &b;     3 -> 4
b = &c;     4 -> 5
Example:
Steensgaard’s Analysis

Program     CFG edge
x = &a;     1 -> 2
y = x;      2 -> 3
y = &b;     3 -> 4
b = &c;     4 -> 5
Steensgaard’s Analysis
• Can be implemented using Union-
Find data-structure
• Leads to an almost-linear time
algorithm
Exercise
x = &a;

y = x;

y = &b;

b = &c;

*x = &d;
May-Point-To Analyses
Ideal-May-Point-To
???

Algorithm A

more efficient / less precise

Andersen’s

more efficient / less precise

Steensgaard’s
Ideal Points-To Analysis:
Definition Recap
• A sequence of states s1 s2 … sn is said to be an
execution (of the program) iff
– s1 is the Initial-State
– si -> si+1 for 1 <= i < n
• A state s is said to be a reachable state iff there
exists some execution s1 s2 … sn such that sn = s.
• RS(u) = { s | (u,s) is reachable }
• IdealMayPT (u) = { (p,x) | ∃ s ∈ RS(u). s(p) == x }
• IdealMustPT (u) = { (p,x) | ∀ s ∈ RS(u). s(p) == x }
Does Algorithm A Compute
The Most Precise Solution?
Ideal <-> Algorithm A
• Abstract away correlations
between variables
– relational analysis vs.
– independent attribute

x: &y y: &x
x: &y y: &z

x: &b y: &x
x: &b y: &z
x: {&y,&b} y: {&x,&z}

x: &b y: &x

x: &y y: &z
Is The Precise Solution
Computable?
• Claim: The set RS(u) of reachable
concrete states (for our language) is
computable.

• Note: This is true for any collecting
semantics with a finite state space.
Computing RS(u)
Precise Points-To Analysis:
Decidability
• Corollary: Precise may-point-to analysis is
computable.

• Corollary: Precise (demand) may-alias
analysis is computable.
– Given ptr-exp1, ptr-exp2, and a program point
u, identify if there exists some reachable state at
u where ptr-exp1 and ptr-exp2 are aliases.

• Ditto for must-point-to and must-alias

• … for our restricted language!
Precise Points-To Analysis:
Computational Complexity
• What’s the complexity of the least-fixed point
computation using the collecting semantics?

• The worst-case complexity of computing
reachable states is exponential in the number
of variables.
– Can we do better?

• Theorem: Computing precise may-point-to is
PSPACE-hard even if we have only two-level
pointers.
May-Point-To Analyses
Ideal-May-Point-To
more efficient / less precise

Algorithm A

more efficient / less precise

Andersen’s

more efficient / less precise

Steensgaard’s
Precise Points-To Analysis:
Caveats
• Theorem: Precise may-alias analysis is
undecidable in the presence of
dynamic memory allocation.
– Add “x = new/malloc ()” to language
– State-space becomes infinite

• Digression: Integer variables together with
conditional branching also make any
precise analysis undecidable.
May-Point-To Analyses
Ideal (with Int, with Malloc)

Ideal (with Int)          Ideal (with Malloc)

Ideal (no Int, no Malloc)

Algorithm A

Andersen’s

Steensgaard’s
Dynamic Memory Allocation
• s: x = new () / malloc ()
• Assume, for now, that allocated object stores
one pointer
– s: x = malloc ( sizeof(void*) )
• Introduce a pseudo-variable Vs to represent
objects allocated at statement s, and use
previous algorithm
– treat s as if it were “x = &Vs”
– also track possible values of Vs
– allocation-site based approach
• Key aspect: Vs represents a set of objects
(locations), not a single object
– referred to as a summary object (node)
Dynamic Memory Allocation:
Example

Program      CFG edge
x = new;     1 -> 2
y = x;       2 -> 3
*y = &b;     3 -> 4
*y = &a;     4 -> 5
Dynamic Memory Allocation:
Summary Object Update

4
*y = &a
5
Dynamic Memory Allocation:
Object Fields
• Field-sensitive analysis
class Foo {
A* f;
B* g;
}
s: x = new Foo()

x->f = &b;

x->g = &a;
Dynamic Memory Allocation:
Object Fields
• Field-insensitive analysis
class Foo {
A* f;
B* g;
}
s: x = new Foo()

x->f = &b;

x->g = &a;
Interpreting Branch
Conditions
Conditional Control-Flow
(In The Concrete Semantics)
• Encoding conditional-control-flow
– using “assume” statements

if (P) then S1; else S2; endif

CFG edge          CFG edge
1 -> 2 : assume P     1 -> 4 : assume !P
2 -> 3 : S1           4 -> 5 : S2
Conditional Control-Flow
(In The Concrete Semantics)
• Semantics of “assume” statements
– DataState -> {true,false}

if (P) then S1; else S2; endif

CFG edge          CFG edge
1 -> 2 : assume P     1 -> 4 : assume !P
2 -> 3 : S1           4 -> 5 : S2
Abstracting “assume”
statements
if (x != null) then y = x; else … endif

1 -> 2 : assume (x != null)     1 -> 4 : assume (x == null)
2 -> 3 : y = x                  4 -> 5 : S2
Abstracting “assume”
statements

2

assume x == y

3
Other Aspects
• Context-sensitivity
• Indirect (virtual) function calls and
call-graph construction
• Pointer arithmetic
• Object-sensitivity
Andersen’s Analysis:
Further Optimizations and Extensions
• Fahndrich et al., Partial online cycle elimination in
inclusion constraint graphs, PLDI 1998.
• Rountev and Chandra, Offline variable substitution
for scaling points-to analysis, 2000.
• Heintze and Tardieu, Ultra-fast aliasing analysis using
CLA: a million lines of C code in a second, PLDI
2001.
• M. Hind, Pointer analysis: Haven’t we solved this
problem yet?, PASTE 2001.
• Hardekopf and Lin, The ant and the grasshopper: fast
and accurate pointer analysis for millions of lines of
code, PLDI 2007.
• Hardekopf and Lin, Exploiting pointer and location
equivalence to optimize pointer analysis, SAS 2007.
• Hardekopf and Lin, Semi-sparse flow-sensitive
pointer analysis, POPL 2009.
Andersen’s Analysis:
Further Optimizations
• Cycle Elimination
– Offline
– Online
• Pointer Variable Equivalence
Context-Sensitivity Etc.
• Liang & Harrold, Efficient computation of
parameterized pointer information for
interprocedural analyses. SAS 2001.
• Lattner et al., Making context-sensitive points-to
analysis with heap cloning practical for the real
world, PLDI 2007.
• Zhu & Calman, Symbolic pointer analysis revisited.
PLDI 2004.
• Whaley & Lam, Cloning-based context-sensitive
pointer alias analysis using BDD, PLDI 2004.
• Rountev et al. Points-to analysis for Java using
annotated constraints. OOPSLA 2001.
• Milanova et al. Parameterized object sensitivity for
points-to and side-effect analyses for Java. ISSTA
2002.
Applications
• Compiler optimizations

• Verification & Bug Finding
– use in preliminary phases
– use in verification itself
Questions?

```
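The set-constraints view of Andersen's analysis from the slides can be sketched in a few lines. This is a minimal illustration, not the lecture's own code. It assumes the standard inclusion constraints (x = &y gives y ∈ PTx; x = y gives PTx ⊇ PTy; x = *y gives PTx ⊇ PTz for every z ∈ PTx's source PTy; *x = y gives PTz ⊇ PTy for every z ∈ PTx), ignores null, and uses a naive re-scan of all statements until nothing changes instead of a worklist. The function name `andersen` and the tuple encoding `(kind, lhs, rhs)` are hypothetical choices made for this example.

```python
from collections import defaultdict


def andersen(statements):
    """Flow-insensitive, inclusion-based points-to analysis (naive fixpoint).

    Each statement is a hypothetical tuple (kind, lhs, rhs) with kind in:
      "addr"  : lhs = &rhs
      "copy"  : lhs = rhs
      "load"  : lhs = *rhs
      "store" : *lhs = rhs
    """
    pts = defaultdict(set)

    def include(dst, new):
        # Add `new` to PT[dst]; report whether the set grew.
        before = len(pts[dst])
        pts[dst] |= new
        return len(pts[dst]) != before

    changed = True
    while changed:  # statement order is irrelevant: iterate to a fixed point
        changed = False
        for kind, lhs, rhs in statements:
            if kind == "addr":      # rhs is a member of PT[lhs]
                changed |= include(lhs, {rhs})
            elif kind == "copy":    # PT[lhs] includes PT[rhs]
                changed |= include(lhs, set(pts[rhs]))
            elif kind == "load":    # PT[lhs] includes PT[z] for z in PT[rhs]
                for z in list(pts[rhs]):
                    changed |= include(lhs, set(pts[z]))
            elif kind == "store":   # PT[z] includes PT[rhs] for z in PT[lhs]
                for z in list(pts[lhs]):
                    changed |= include(z, set(pts[rhs]))
    return {v: s for v, s in pts.items() if s}


# The flow-sensitive example program: x = &a; y = x; x = &b; z = x;
program = [("addr", "x", "a"), ("copy", "y", "x"),
           ("addr", "x", "b"), ("copy", "z", "x")]
print(andersen(program))
```

On this program the single solution maps x, y and z all to {a, b}, whereas a flow-sensitive analysis would keep y at {a} after vertex 3 and z at {b} after vertex 5. That loss of precision is the price of ignoring control flow.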
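Steensgaard's unification-based analysis with a Union-Find structure can be sketched in the same spirit. This is an illustrative sketch under stated assumptions, not the lecture's implementation: every equivalence class is given at most one pointee class, placeholder nodes named `("$tmp", n)` stand for not-yet-known pointees, and the class name `Steensgaard`, the helper `pointee`, and the `(kind, lhs, rhs)` statement encoding are all hypothetical. Union by rank is omitted for brevity, so this sketch does not by itself achieve the almost-linear bound.

```python
import itertools


class Steensgaard:
    def __init__(self):
        self.parent = {}   # union-find forest over variables and placeholders
        self.target = {}   # class representative -> some node of its pointee class
        self.fresh = itertools.count()

    def find(self, v):
        self.parent.setdefault(v, v)
        root = v
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[v] != root:  # path compression
            self.parent[v], v = root, self.parent[v]
        return root

    def union(self, a, b):
        a, b = self.find(a), self.find(b)
        if a == b:
            return
        self.parent[b] = a
        tb = self.target.pop(b, None)
        if tb is None:
            return
        if a in self.target:
            # Both classes had a pointee: unify the pointees as well.
            self.union(self.target[a], tb)
        else:
            self.target[a] = tb

    def pointee(self, v):
        # Representative of the class v points to (placeholder if unknown).
        r = self.find(v)
        if r not in self.target:
            self.target[r] = ("$tmp", next(self.fresh))
        return self.find(self.target[r])

    def add(self, kind, lhs, rhs):
        if kind == "addr":     # lhs = &rhs
            self.union(self.pointee(lhs), rhs)
        elif kind == "copy":   # lhs = rhs
            self.union(self.pointee(lhs), self.pointee(rhs))
        elif kind == "load":   # lhs = *rhs
            self.union(self.pointee(lhs), self.pointee(self.pointee(rhs)))
        elif kind == "store":  # *lhs = rhs
            self.union(self.pointee(self.pointee(lhs)), self.pointee(rhs))


def steensgaard(statements):
    s = Steensgaard()
    for stmt in statements:  # single pass, no fixed-point iteration
        s.add(*stmt)
    names = sorted({v for _, lhs, rhs in statements for v in (lhs, rhs)})
    result = {}
    for v in names:
        r = s.find(v)
        if r in s.target:
            t = s.find(s.target[r])
            pts = {n for n in names if s.find(n) == t}
            if pts:
                result[v] = pts
    return result


# The example program: x = &a; y = x; y = &b; b = &c;
program = [("addr", "x", "a"), ("copy", "y", "x"),
           ("addr", "y", "b"), ("addr", "b", "c")]
print(steensgaard(program))
```

On this program a and b end up in one class, so x and y both point to {a, b} and, because the class is shared, a is also reported as pointing to c. Andersen's analysis on the same statements keeps x at {a} and gives a no pointees, which shows where unification trades precision for speed.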