# Certifying Software for Correctness w.r.t. its Mathematical Specification

Automated Software Engineering
Lecture 14
Grigore Rosu
(Thanks to Bernd Fischer and Jon Whittle of NASA Ames for their help)

Software Analysis I
[Figure: axes labeled Automation and Efficacy]

   Type checking
      Automatic, efficient and effective, but catches only a limited set of errors
   Testing
      Ad hoc, but the most widely used in practice

Program Analysis II

   Advanced static analyzers (Polyspace, ESC)
      Often scale up, but produce false positives and false negatives
   Runtime verification and monitoring
      Explore the execution trace of a program for potential or actual errors
      Scale up well and find many errors, but not all errors
   Model checking and theorem proving
      General and give good confidence, but do not scale up well

Goal: Lectures 14, 15, 16, 17, 18, …

   Learn what program synthesis and theorem proving are
   Understand why theorem proving is hard
   The role of code annotations (hints)
   Sometimes annotations can be generated automatically (happy case!)
   Program synthesis, certifying compilers
   Apply it to safety-critical, domain-specific programs

Program Synthesis

   Schema-based program synthesis
    Amphion – synthesizes astronomical navigation software
    AutoBayes – synthesizes data analysis programs
    AutoFilter – synthesizes state estimation programs

[Figure: algorithm schemas A1, A2, A3 and a specification are fed to the synthesis engine, which outputs a program built from the schema instances I(A1), I(A2), I(A3).]

Amphion:
   NAIF is a NASA library with functions for
   Example problem:
      Given a time tEarth and an observation point pObs on Earth, calculate the angle under which Saturn is seen from pObs at time tEarth. The angle is measured relative to the normal to Earth’s surface at pObs.
   Amphion – NAIF-based program synthesis system

Amphion: Example

...
// tEarth : time of observation
// pObs : position of observation
rEarth := bodvar(earthId, ‘radii’);
dNorm := surfnm(rEarth, pObs);
pEarth := findp(earthId, tEarth);
mEarth := bodmat(earthId, tEarth);
pObs := mtxv(mEarth, pObs);
dNorm := mtxv(mEarth, dNorm);
tSaturn := sent(saturnId, earthId, tEarth);
pSaturn := findp(saturnId, tSaturn);
dSaturn := vsub(pSaturn, pObs);
angle := vsep(dNorm, dSaturn);

[Figure: geometry of the observation in the J2000 frame, showing Earth, Saturn, pObs, pEarth, pSaturn, rEarth, mEarth, dNorm, dSaturn and the resulting angle.]
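
The last two steps of the synthesized program reduce to plain vector arithmetic. The following Python sketch assumes that vsub is a component-wise subtraction and that vsep returns the angular separation of two vectors in radians. The functions are illustrative re-implementations that only borrow the names used in the program, not the NAIF routines themselves, and the input vectors are made up.

```python
import math

def vsub(a, b):
    """Component-wise difference a - b of two 3-vectors (assumed meaning)."""
    return tuple(x - y for x, y in zip(a, b))

def vsep(a, b):
    """Angular separation in radians between two non-zero 3-vectors (assumed meaning)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    # Clamp to [-1, 1] so rounding errors cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cos_angle)

# Hypothetical inputs, assumed to be expressed in one common frame already.
p_obs = (1.0, 0.0, 0.0)     # observer position
d_norm = (1.0, 0.0, 0.0)    # surface normal at the observer
p_saturn = (1.0, 5.0, 0.0)  # target position

d_saturn = vsub(p_saturn, p_obs)  # direction from observer to target
angle = vsep(d_norm, d_saturn)    # angle relative to the surface normal
print(math.degrees(angle))
```

In this made-up configuration the target lies exactly on the observer's horizon, so the angle to the surface normal is a right angle.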
Amphion: Abstract Domain

AutoBayes: Synthesizes Data Analysis Programs

Ground cover map:
• multiple Landsat-bands
• for pixels: estimate classes
• for classes: estimate parameters

Implementation problems:
• which model?
• which algorithm?
• efficient C/C++ code?
• correctness?

Initial model:
model landsat as ‘Landsat Clustering’.
const nat n_points as ‘number of pixels’.
const nat n_bands as ‘number of bands’.
const nat n_classes := 5 as ‘number of classes’
where n_classes << n_points.
double phi(1..n_classes) as ‘relative class strengths’
where 1 = sum(I := 1..n_classes, phi(I)).
double mu(1..n_classes), sigma(1..n_classes)
where 0 < sigma(_).
int c(1..n_points) as ‘class assignments’.
c(_) ~ discrete(phi).
data double x(1..n_points, 1..n_bands) as ‘pixels’.
x(I,_) ~ gauss(mu(c(I)), sigma(c(I))).
max pr(x | {phi,mu,sigma}) for {phi,mu,sigma}.

Model refinements:
sigma(_) ~ invgamma(delta/2+1,sigma0*delta/2).
...

Model changes:
x(I,_) ~ cauchy(mu(c(I)), sigma(c(I))).
x(I,_) ~ mixture(c(I) cases
                 1 -> gauss(0, error),
                 _ -> cauchy(mu(c(I)),sigma(c(I)))).
...

AutoBayes

AutoBayes System Architecture:
[Figure: program schemas (Decomposition, EM-Algorithm, ...)]
• end-to-end system
• code & documentation
• multiple backends (C/Matlab, C++/Octave, ...)
• multiple programs for one model
• fast & scalable – Landsat: ~400 LoC / ~5 secs

Current Status & Work in Progress:
• Textbook solutions for...
  – normal models with priors
  – Poisson, binomial models
• Clustering & Classification
  – applied to NASA datasets (Landsat, meteorite classification)
  – simple EM & k-means schemas
  – efficient data structures
• Changepoint / Markov models
  – applied to NASA datasets (gamma-ray burst step detection)
  – sensor failure models
  – application to diagnosis tasks
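
To give a feel for the kind of code an EM schema could produce for the Landsat clustering model, here is a hand-written Python sketch of expectation-maximization for a Gaussian mixture. It is not AutoBayes output. It assumes a single band (one-dimensional pixels) and two classes instead of five, and the function name, initialization strategy and synthetic data are all illustrative choices.

```python
import math
import random

def gauss_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std. deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def em_gaussian_mixture(xs, n_classes, n_iter=50):
    """Estimate phi (class strengths), mu and sigma of a 1-D Gaussian mixture."""
    n = len(xs)
    ordered = sorted(xs)
    # Illustrative initialization: evenly spaced quantiles as starting means.
    mu = [ordered[(2 * k + 1) * n // (2 * n_classes)] for k in range(n_classes)]
    mean = sum(xs) / n
    spread = max(math.sqrt(sum((x - mean) ** 2 for x in xs) / n), 1e-6)
    sigma = [spread] * n_classes
    phi = [1.0 / n_classes] * n_classes
    for _ in range(n_iter):
        # E-step: posterior probability of each class for each data point.
        resp = []
        for x in xs:
            w = [phi[k] * gauss_pdf(x, mu[k], sigma[k]) for k in range(n_classes)]
            total = sum(w)
            resp.append([wk / total for wk in w])
        # M-step: re-estimate the parameters from the weighted data.
        for k in range(n_classes):
            nk = sum(r[k] for r in resp)
            phi[k] = nk / n
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sigma[k] = max(math.sqrt(var), 1e-6)
    return phi, mu, sigma

# Synthetic data: two well-separated classes of 100 points each.
rng = random.Random(42)
xs = [rng.gauss(0.0, 1.0) for _ in range(100)] + [rng.gauss(10.0, 1.0) for _ in range(100)]
phi, mu, sigma = em_gaussian_mixture(xs, n_classes=2)
print(phi, mu, sigma)
```

The estimated means should land close to the two true class centers, and the entries of phi sum to one, matching the constraint stated in the model.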

AutoFilter: Synthesizes State Estimation Programs

AutoFilter: code generator for state estimation problems
(e.g., estimate the orientation of an aircraft or spacecraft given a
model of the spacecraft’s behavior and noisy sensor data)
Based on schema-based program synthesis techniques: core algorithms (e.g.,
Kalman filter, particle filter) are formalized as generic template algorithms.
Given a high-level problem description, AutoFilter instantiates and
composes template algorithms to produce executable code (Matlab/C or C)

[Figure: AutoFilter “fills in holes”: problem details are inserted into a template with “holes”.]

AutoFilter Example: Deep Space Attitude Estimation

[Figure: a specification (attitude quaternion equations involving q, qSRU and q*est) is translated into generated code that performs the following steps.]
   Enter prior estimate and its error covariance
   Compute Kalman gain
   Update estimate with measurement
   Compute error covariance for updated measurement
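
The steps of the generated code (Kalman gain, measurement update, covariance update) can be illustrated with a scalar Kalman filter. This Python sketch is not the generated attitude estimator: it assumes a one-dimensional state, and the parameter names (h for the measurement model, r and q for the noise variances, phi for the state transition) and the measurement values are hypothetical.

```python
def kalman_step(x_est, p_est, z, h=1.0, r=1.0, phi=1.0, q=0.0):
    """One cycle of a scalar Kalman filter.

    x_est, p_est: prior estimate and its error covariance
    z: new measurement, modeled as z = h * x + noise with variance r
    phi, q: state transition factor and process noise variance
    """
    # Compute Kalman gain.
    k = p_est * h / (h * p_est * h + r)
    # Update estimate with measurement.
    x_upd = x_est + k * (z - h * x_est)
    # Compute error covariance for the updated estimate.
    p_upd = (1.0 - k * h) * p_est
    # Propagate estimate and covariance to the next time step.
    x_next = phi * x_upd
    p_next = phi * p_upd * phi + q
    return x_next, p_next

# Enter prior estimate and its error covariance (a vague prior).
x, p = 0.0, 100.0
for z in [5.2, 4.9, 5.1, 4.8, 5.0]:  # made-up noisy measurements of a constant
    x, p = kalman_step(x, p, z)
print(x, p)
```

With a constant state and no process noise the filter behaves like a running average, so the estimate settles near the mean of the measurements while the error covariance shrinks with every update.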

Annotated Program Synthesis

   Schema-based program synthesis
    AutoFilter – synthesizes Kalman filters
    AutoBayes – synthesizes data analysis programs
    Amphion – synthesizes astronomical navigation software

[Figure: annotated algorithm schemas A1, A2, A3 and a specification are fed to the synthesis engine, which outputs an annotated program built from I(A1), I(A2), I(A3). Further labels: Theorem proving, Annotations, Certifier, Warnings, Monitor.]

Certifying Domain-Specific Properties of Synthesized Code

   AutoFilter
   Focus on “certifying” rather than “synthesizing”
   Use theorem proving

The General Picture

[Figure: certification pipeline]
   Spec
   Domain-specific software generator (DSSG)
      Domain model
      Specification language
      Algorithm library
      Spec-to-code translator
   code + proofs of properties
   Certification engine
      Properties to check
      Safety policy
      Kernel proof checker
   checked proofs

An Instance: Mathematical Software

[Figure: the general picture instantiated for Kalman filters]
   Spec: differential equations
   AutoFilter program synthesis system for Kalman Filters
      Model of Kalman Filters, differentials, matrices, linearization, discretization etc.
      Schemas representing basic Kalman Filter, linearized KF, extended KF, particle filter
      Schema-based synthesis engine
   Kalman Filter + statistical optimality proof; frame/unit safety proof
   Maude certifier
      Properties to check: statistically optimal? frame consistency? units consistency?
      Axioms for mathematical operations
      Inductive theorem prover
   checked proofs

What is the Problem?

   Code generators are increasingly used, even in mission-critical applications. How do we know that correct code is generated?
   Two approaches:
      Verify the code generator: too costly, since the generator is big and complex, and proofs have to be redone at each update
      Product-oriented certification: verify each individual generated program. This can be automated by building proof schemas into the generator so that it outputs code + a proof of its correctness. The proof can then be checked by a simple, fully verified proof checker.
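
The idea behind product-oriented certification (trust a small checker rather than a large generator) can be shown with a toy analogy in Python. This is not the AutoFilter mechanism: the "generator" here is simply the extended Euclidean algorithm, which returns a result together with a certificate, and the function names are made up for the illustration.

```python
def untrusted_gcd(a, b):
    """Stand-in for a large, unverified generator.

    Returns a claimed gcd g of a and b plus a certificate (s, t)
    that is supposed to satisfy s*a + t*b == g.
    """
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        quot = old_r // r
        old_r, r = r, old_r - quot * r
        old_s, s = s, old_s - quot * s
        old_t, t = t, old_t - quot * t
    return old_r, (old_s, old_t)

def check_certificate(a, b, g, cert):
    """Small, easily reviewed checker.

    If g divides both a and b and g is an integer combination of a and b,
    then g is the greatest common divisor, however it was computed.
    """
    s, t = cert
    return g > 0 and a % g == 0 and b % g == 0 and s * a + t * b == g

g, cert = untrusted_gcd(240, 46)
print(g, cert, check_certificate(240, 46, g, cert))
```

Only check_certificate needs to be trusted. A wrong answer, or a right answer with a wrong certificate, is rejected no matter how complicated the producer is.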

Our Approach

   We focus on
      Code generators for mission-critical applications
      NASA-specific safety-critical domains
      Generating large and useful programs rather than small, “correct-by-construction” and often useless (for NASA’s requirements!) programs
   Prove crucial properties about generated code
   Product-oriented certification approach
   Independent, simple domain-specific certifiers

AutoFilter Code Generator 1

AutoFilter: code generator for state estimation problems
(e.g., estimate the orientation of an aircraft or spacecraft given a
model of the spacecraft’s behavior and noisy sensor data)
Based on schema-based program synthesis techniques: core algorithms (e.g.,
Kalman filter, particle filter) are formalized as generic template algorithms.
Given a high-level problem description, AutoFilter instantiates and
composes template algorithms to produce executable code (Matlab/C or C)

[Figure: AutoFilter “fills in holes”: problem details are inserted into a template with “holes”.]

AutoFilter Code Generator 2

Advantage: (parts of) algorithms can be formalized once and then
combined in non-trivial ways to automatically generate code that is
highly tuned for the particular problem

Problem: how do we know schemas are composed correctly?
For Kalman Filters, correctness means that the code computes a
statistically optimal estimate: i.e., the mean squared error is minimized

Solution: we augment the schemas to provide segments of the
optimality proofs that are then composed during synthesis to
output code + a proof of optimality of the code
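
What "statistically optimal" means can be checked by hand in the scalar case. The derivation below is a sketch under assumptions that are not on the slide: a scalar state x, a measurement z = x + v with noise variance R, a prior estimate with error variance P, and prior error uncorrelated with the measurement noise. The symbols K, P, R and the hat notation are introduced here for illustration only.

```latex
\begin{align*}
\hat{x}^{+} &= \hat{x} + K\,(z - \hat{x})
  && \text{(update with gain } K\text{)} \\
x - \hat{x}^{+} &= (1-K)\,(x - \hat{x}) - K\,v
  && \text{(error after the update)} \\
P^{+}(K) &= \mathrm{E}\big[(x - \hat{x}^{+})^{2}\big] = (1-K)^{2}\,P + K^{2}\,R
  && \text{(mean squared error)} \\
\frac{dP^{+}}{dK} &= -2\,(1-K)\,P + 2\,K\,R = 0
  \;\Longrightarrow\; K^{*} = \frac{P}{P+R} \\
P^{+}(K^{*}) &= \frac{P\,R}{P+R} = (1-K^{*})\,P
\end{align*}
```

Since the second derivative 2(P + R) is positive, K* is the unique minimizer. It coincides with the usual scalar Kalman gain, and the last line is the usual covariance update.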

AutoFilter Code Generator 3

Advantage: generic proofs can be formalized once and for all by
trained experts, then combined in non-trivial ways to automatically
generate global correctness proofs for the entire generated code

Problem: Detailed proofs are hard to compose! Proof
compositionality is becoming an important but complex area.

Partial solution: Synthesize proofs from high-level proof hints!
Hints are more conceptual and are more likely to compose.
We use Maude’s ITP tool and a database of domain-specific
lemmas to generate detailed proofs from high-level hints.

Kalman Filter Schema

/* applicability conditions */
… process and measurement noises are Gaussian …
… process/measurement noise independent …
/* set up template */
result := kalman(local(%),initialize(%),loop(%),postloop(%))
%loop := for(pvar, 0, n, update(zupdate(%),
                                phiupdate(%),
                                hupdate(%),
                                gain(%),
                                estimateUpdate(%),
                                covarUpdate(%),
                                storeOutput(%),
                                propagateEstimate(%),
                                propagateCovar(%)))
/* fill in some slots */
/* recursively invoke schemas to fill remaining slots */

   The schema builds a data structure of slots – each slot is instantiated by this or other schemas
   A slot is a code fragment
   Slots are annotated with pre/post conditions
   Proof goal: minimization of mean-squared error. This is proved by showing a chain of post ⇒ pre implications. These subproofs and the pre/post-conditions are provided as annotations with the generated code

Schemas and Proofs

Each slot is a “gap” that is filled either by recursive
schema invocation, symbolic computation, low-level
reasoning – or a combination

Each slot has its pre/post conditions, so as long
as the pre/post conditions are satisfied by whatever
fills the gap, the overall proof is valid

This means we can reuse the same proof structure
for multiple instantiations of the schema, i.e. we
can provide generic proofs for generic programs
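
The way pre/post conditions let a proof structure be reused can be sketched in Python. This is a strong simplification and not the AutoFilter representation: conditions are modeled as sets of named facts, "implies" is plain set inclusion, established facts are assumed to persist, and the slot and fact names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Slot:
    name: str
    pre: frozenset   # facts required before the code fragment runs
    post: frozenset  # facts guaranteed after the code fragment runs

def chain_is_valid(slots, initial_facts):
    """Check that every slot's precondition follows from what is known so far."""
    known = set(initial_facts)
    for slot in slots:
        missing = slot.pre - known
        if missing:
            return False, f"{slot.name}: missing {sorted(missing)}"
        known |= slot.post
    return True, "ok"

# Hypothetical slots of one loop iteration.
gain = Slot("gain", frozenset({"prior_covariance"}), frozenset({"optimal_gain"}))
estimate = Slot("estimateUpdate", frozenset({"optimal_gain"}),
                frozenset({"optimal_estimate"}))
covar = Slot("covarUpdate", frozenset({"optimal_gain"}),
             frozenset({"updated_covariance"}))

print(chain_is_valid([gain, estimate, covar], {"prior_covariance"}))
print(chain_is_valid([estimate, gain, covar], {"prior_covariance"}))
```

The check never looks inside a slot. Any code fragment that meets the same pre/post conditions can be substituted without redoing the overall argument, which is the reuse property described above.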

Proof checking

[Figure: code + proof goal + proof → certification engine → automatically checked proof]

   For mission-critical applications, the code reviewer needs to be able to check the proofs easily
   The proof is checked by an independent party using a trusted proof checker. Currently, this proof checker is the Maude rewrite engine (not trusted, however!)
   The certification engine requires axioms about matrices, differentiation, etc. (currently ~500 axioms and lemmas)
   Note: the proof is too complex to be derived from the code alone. The schema provides the crucial lemmas needed for the proof as hints. The proof checker just replays the proof.

Results (AutoFilter)

   AutoFilter used to generate KF code for Deep Space application and for thruster control software
   synthesizes ~150 LOC (C/C++) per second
   high leverage factors (1:30+)
   similar specifications ~ different solutions

Results (Certification)

   Certification approaches developed for minimization of mean-squared error, frame consistency, measurement unit consistency
   Basic, extended and information KF schemas with proof annotations were certified (proof of basic KF takes 2 minutes on a 2 × 2.4 GHz, 4 GB machine; > 1 million proof steps; > 10 pages of informal textbook proof)
   Measurement unit certifier for a segment of C
   Coordinate frame certifier: ~1,000 LOC per second
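
Measurement unit consistency is the easiest of these properties to illustrate. The Python sketch below is not the certifier described in the lecture: it assumes expressions are given as small nested tuples, represents a unit as a pair of exponents (metres, seconds), and uses made-up variable names.

```python
METRE = (1, 0)      # exponents of (metre, second)
SECOND = (0, 1)
VELOCITY = (1, -1)  # metres per second

class UnitError(Exception):
    """Raised when an expression mixes incompatible units."""

def unit_of(expr, env):
    """Infer the unit of an expression tree, or raise UnitError."""
    op = expr[0]
    if op == "var":
        return env[expr[1]]
    left = unit_of(expr[1], env)
    right = unit_of(expr[2], env)
    if op in ("add", "sub"):
        # Addition and subtraction require identical units.
        if left != right:
            raise UnitError(f"cannot {op} units {left} and {right}")
        return left
    if op == "mul":
        return tuple(a + b for a, b in zip(left, right))
    if op == "div":
        return tuple(a - b for a, b in zip(left, right))
    raise ValueError(f"unknown operator {op!r}")

env = {"pos": METRE, "vel": VELOCITY, "dt": SECOND}

# pos + vel * dt is consistent: both summands are lengths.
good = ("add", ("var", "pos"), ("mul", ("var", "vel"), ("var", "dt")))
print(unit_of(good, env))

# pos + vel mixes a length with a velocity and is rejected.
bad = ("add", ("var", "pos"), ("var", "vel"))
try:
    unit_of(bad, env)
except UnitError as err:
    print("rejected:", err)
```

The check is a single bottom-up pass over the expression, which is in line with the observation that such domain-specific certifiers can be simple and fast.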

Future Work

   Proof compositionality (glue proofs)
   Develop a simple proof checker for membership equational logic
   Use Maude and its ITP tool to generate proofs, not to check them!
   Add more schemas with proofs to AutoFilter
   Apply our certification technology to other schema-based synthesis engines.
