Run-Time Verification

     Verification Based on Run-Time, Field-Data, and Beyond
               Séverine Colin
               Laboratoire d’Informatique (LIFC)
               Université de Franche-Comté-CNRS-INRIA

               Leonardo Mariani
                Dipartimento di Informatica, Sistemistica e Comunicazione (DISCo)
                Università di Milano Bicocca

               Tope Omitola
                Computer Laboratory
                University of Cambridge, UK

          Traditional Run-Time Verification Techniques
           –   checking properties on execution data at run-time
          Test and Verification Techniques based on Field-data
           –   gathering execution data to increase effectiveness of
               (off-line) test and verification techniques
          Discussion on Test, Verification and Model-Checking
          Conclusions

       Run-Time Verification Techniques

          Basic idea: to extract an execution trace of
           an executing program and to analyze it to
           detect errors
          To check for classical error patterns (data races,
           deadlocks)
          To verify a program against formal requirements

       Data races detection

          Data race: two concurrent threads access a
           shared variable at the same time and at least
           one access is a write
          The Eraser tool dynamically detects data races
          It enforces that every shared variable is protected
           by some lock
          The Eraser algorithm is used by PathExplorer and
           Visual Threads
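
The locking discipline above can be sketched as a lockset refinement: for each shared variable, keep the set of locks held on every access so far and warn when that set becomes empty. This is a simplified illustration in the spirit of Eraser, not the tool's implementation; the class and method names are hypothetical, and the initialisation and read-sharing states that the real algorithm uses to reduce false alarms are left out.

```python
from collections import defaultdict


class LocksetChecker:
    """Simplified lockset refinement (hypothetical names, illustration only)."""

    def __init__(self):
        self.held = defaultdict(set)   # thread id -> locks currently held
        self.candidates = {}           # variable -> locks held on every access so far
        self.warnings = []             # variables whose candidate set became empty

    def acquire(self, thread, lock):
        self.held[thread].add(lock)

    def release(self, thread, lock):
        self.held[thread].discard(lock)

    def access(self, thread, var):
        held_now = set(self.held[thread])
        if var not in self.candidates:
            # First access: every lock held now is a candidate protector.
            self.candidates[var] = held_now
        else:
            # Later accesses: keep only the locks that are still held.
            self.candidates[var] &= held_now
        if not self.candidates[var] and var not in self.warnings:
            self.warnings.append(var)


checker = LocksetChecker()
checker.acquire("t1", "m")
checker.access("t1", "x")      # x accessed while holding lock m
checker.release("t1", "m")
checker.access("t2", "x")      # x accessed with no lock held
print(checker.warnings)
```

The warning is raised even though the two accesses did not actually overlap in this run, which is the point of checking the discipline rather than the interleaving.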
       Deadlock Detection

          Deadlocks can occur whenever multiple shared
           resources are required to accomplish a task
          A model representation of the program is
           constructed during the program execution
          Deadlock: circularity in the dependency graph
          Used by Visual Threads and PathExplorer
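
One way to picture the model built during execution is a lock-order graph: an edge a -> b is recorded whenever a thread acquires b while holding a, and a cycle signals a potential deadlock. The sketch below assumes that representation and a hypothetical event format of (thread, operation, lock) tuples; the tools named above may build their models differently.

```python
from collections import defaultdict


def build_lock_graph(events):
    """events: iterable of (thread, op, lock), op being "acquire" or "release"."""
    held = defaultdict(list)    # thread -> locks currently held, in acquisition order
    graph = defaultdict(set)    # lock -> locks acquired while it was held
    for thread, op, lock in events:
        if op == "acquire":
            for outer in held[thread]:
                graph[outer].add(lock)
            held[thread].append(lock)
        elif op == "release":
            held[thread].remove(lock)
    return graph


def has_cycle(graph):
    """Depth-first search for a cycle in the lock-order graph."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = defaultdict(int)

    def visit(node):
        colour[node] = GREY
        for nxt in graph.get(node, ()):
            if colour[nxt] == GREY:
                return True                      # back edge: circular dependency
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in list(graph))


trace = [
    ("t1", "acquire", "A"), ("t1", "acquire", "B"),
    ("t1", "release", "B"), ("t1", "release", "A"),
    ("t2", "acquire", "B"), ("t2", "acquire", "A"),
    ("t2", "release", "A"), ("t2", "release", "B"),
]
print(has_cycle(build_lock_graph(trace)))
```

The trace itself completes without deadlock; the cycle A -> B -> A only shows that another interleaving of the same threads could block.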

       Monitoring and Checking (MaC)

          System requirements are formalised
          Monitoring script is constructed:
           –   to instrument the code
           –   to establish a mapping from low-level information
               into high-level events
          At run-time, generated events are monitored
           for compliance with the requirements
       MaC: Events and Conditions

          Events occur instantaneously during the
           system execution
          Conditions are information that hold for a
           duration of time
          Three-valued logic: true, false, undefined
          PEDL (Primitive Event Definition Language):
           language for monitoring scripts
          MEDL (Meta Event Definition Language):
           language for safety requirements
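
The distinction between conditions (holding over a duration) and events (instantaneous) can be illustrated with a small sketch. It is not PEDL or MEDL syntax: the function names are hypothetical, `None` stands for "undefined", and Kleene's strong connectives are assumed for the three-valued logic, which the slides do not specify.

```python
def k_not(a):
    """Three-valued negation; None stands for 'undefined'."""
    return None if a is None else (not a)


def k_and(a, b):
    """Three-valued conjunction (Kleene's strong connective, an assumption)."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def k_or(a, b):
    """Three-valued disjunction (Kleene's strong connective, an assumption)."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def start_events(samples):
    """Yield the times at which a condition becomes true.

    samples: iterable of (time, value) pairs, value in {True, False, None}.
    """
    previous = None
    for time, value in samples:
        if value is True and previous is not True:
            yield time
        previous = value


door_open = [(0, None), (1, False), (2, True), (3, True), (4, False), (5, True)]
print(list(start_events(door_open)))
```

Here the condition `door_open` holds during two intervals, and each interval gives rise to one instantaneous start event.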
       PathExplorer (1/2)

          An instrumentation module (using Jtrek): it emits
           relevant events
          An interaction module: it sends events to the
           observer module
          An observer module: it verifies the
           requirement specification

       PathExplorer (2/2)

          Requirements are written using past-time LTL
           (monitoring operators are added: ↑F, ↓F,
           [F,F)S, [F,F)w)
          Use the recursive nature of past time
           temporal logic: the satisfaction relation for a
           formula can be calculated along the
           execution trace looking only one step
           backwards (see our paper for the algorithm)
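
The one-step-backwards idea can be sketched as follows: at each state, the value of every subformula is computed from the current state and from the subformula values stored for the previous state only. This is an illustrative evaluator, not the algorithm from the paper. The tuple encoding of formulas, the operator names, and the treatment of the first state (no previous state, so "previously" falls back to the current value and ↑F, ↓F are false) are assumptions; the interval operators are omitted.

```python
def step(formula, state, prev):
    """Evaluate all subformulas of `formula` in `state`.

    prev: dict of subformula values in the previous state, or None at the first state.
    Returns the dict of subformula values for the current state.
    """
    now = {}

    def ev(f):
        if f in now:
            return now[f]
        op = f[0]
        if op == "atom":
            v = f[1] in state
        elif op == "not":
            v = not ev(f[1])
        elif op == "and":
            a, b = ev(f[1]), ev(f[2])      # evaluate both so every value is recorded
            v = a and b
        elif op == "or":
            a, b = ev(f[1]), ev(f[2])
            v = a or b
        elif op == "prev":                 # F held in the previous state
            cur = ev(f[1])
            v = cur if prev is None else prev[f[1]]
        elif op == "once":                 # F held at some point so far
            cur = ev(f[1])
            v = cur or (prev is not None and prev[f])
        elif op == "since":                # f[1] has held ever since f[2] held
            a, b = ev(f[1]), ev(f[2])
            v = b or (a and prev is not None and prev[f])
        elif op == "start":                # ↑F: F was false and is now true
            cur = ev(f[1])
            v = cur and prev is not None and not prev[f[1]]
        elif op == "end":                  # ↓F: F was true and is now false
            cur = ev(f[1])
            v = (not cur) and prev is not None and prev[f[1]]
        else:
            raise ValueError("unknown operator: " + op)
        now[f] = v
        return v

    ev(formula)
    return now


def monitor(formula, trace):
    """Return the verdict for `formula` after each state of the trace."""
    prev, verdicts = None, []
    for state in trace:
        prev = step(formula, state, prev)
        verdicts.append(prev[formula])
    return verdicts


# "An alarm is raised only if a fault has occurred at some point so far."
ALARM_OK = ("or", ("not", ("atom", "alarm")), ("once", ("atom", "fault")))
print(monitor(ALARM_OK, [set(), {"alarm"}, {"fault"}]))
```

Only one dictionary of previous values is kept, so the memory needed by the monitor depends on the size of the formula and not on the length of the trace.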

        T&V Techniques based on Field-data

           Field-data: “run-time data collected from the field”
           Why collect field data for Test and Verification?
            –   limited knowledge about the final system,
                    e.g., software components are usually developed in isolation,
                     assembled with third-party components and, finally, deployed
                     in unknown environments
            –   uncertainty of the final environment
                    e.g., in the case of ubiquitous computing, pervasive
                     computing, mobile computing, and wireless networks, it is not
                     possible to predict in advance every possible situation
            –   dynamic environments
                    e.g., in the case of mobile code, self-adaptive systems and
                     peer-to-peer systems, resources suddenly appear and disappear
        Existing Approaches

           Field-data has been collected for:
            –   Evaluating usability of an application (usability testing)
            –   Modelling usage of the system
                    which components, modules and functionalities are used?
            –   Learning properties of the implementation
            –   Modelling program faults
                    which failures have been recognized on the target system?

        Evaluating Usability

           Traditionally, data for usability testing has been
            gathered by running testing sessions
           Novel approaches: silent data-gathering systems
            –   Automatic Navigability Testing System (ANTS) [Rod02]
            –   Web Variable Instrumented Program (Webvip) [VG]
            –   Gamma System [OLHL02]

          Silent Data-Gathering Systems (1/2)

           [Figure: ANTS and Webvip architectures. ANTS: ANTS server, server agent, communication, client-side agent, user’s actions, multimedia content. Webvip: data server, upload, session file, script, user’s actions.]
        Silent Data-Gathering Systems (2/2)


                                   figure appeared in [OLHL02]
        Modelling Usage of the System (1/2)

           for performing system-specific impact analysis
            –   Law and Rothermel’s impact analysis [LR03]
                    the program is instrumented to produce execution traces
                     representing the procedure-level execution flow, e.g.,
                    the impacted set for procedure P is computed by selecting
                     procedures that are called by P and procedures that are in the call
                     stack when P returns
            –   Orso et al.’s impact analysis [OAH03]
                    entity-level instrumentation: an execution trace is a sequence of
                     traversed entities
                    a change c on entity e potentially affects all entities of traces
                     containing e
                    the impact set is given by the intersection between the potentially
                     affected entities and the result of a forward slicing with the variables
                     used in change c as slicing criterion
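
The set computation behind the entity-level impact analysis can be sketched in a few lines. The function name and the toy traces are hypothetical, and the forward slice is taken as a given input (a real implementation would obtain it from a static slicer).

```python
def impact_set(traces, changed_entity, forward_slice):
    """Entities potentially impacted by a change to `changed_entity`.

    traces: iterable of execution traces, each a sequence of traversed entities.
    forward_slice: entities in the forward slice for the change (assumed given).
    """
    affected = set()
    for trace in traces:
        if changed_entity in trace:
            affected.update(trace)          # every entity of a trace containing e
    return affected & set(forward_slice)    # keep only those also in the slice


field_traces = [
    ["main", "parse", "eval"],
    ["main", "print"],
    ["main", "parse", "report"],
]
print(sorted(impact_set(field_traces, "parse", {"parse", "eval", "print"})))
```

In this toy data, `print` is in the slice but never executed together with `parse` in the field, so it is excluded; `report` was executed with `parse` but is not in the slice, so it is excluded too.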
        Modelling Usage of the System (2/2)

           Information from impact analysis can be used in regression testing
             –   Orso et al.’s regression testing [OAH03]
                   entity-level instrumentation
                   test suite T’ is initialized with all test cases contained in existing test suite
                    T traversing the change
                   T’ is augmented with test cases covering uncovered impacted entities
                    computed with Orso et al.’s impact analysis technique
                   test suite prioritization is performed by privileging test cases covering more
                    impacted entities
           for increasing confidence in the program
             –   Pavlopoulou and Young’s perpetual testing [PY99]
                     normal executions are considered as tests
                     instrumentation measures statement coverage of uncovered blocks, even
                      in the final environment
                     the program can be iteratively regenerated to reduce instrumentation
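
The regression-testing steps above (select the tests that traverse the change, find impacted entities that no selected test covers, rank tests by how many impacted entities they cover) can be sketched as follows. The function name, the coverage map and the test names are hypothetical; writing the new tests for the uncovered entities is left to the tester.

```python
def select_and_prioritize(coverage, changed_entity, impacted):
    """Select and rank regression tests for a change.

    coverage: test name -> set of entities the test traverses.
    impacted: set of impacted entities (e.g. from an impact analysis).
    Returns (ranked test names, impacted entities not covered by any selected test).
    """
    impacted = set(impacted)
    # T' starts with the existing tests that traverse the change.
    selected = {t: ents for t, ents in coverage.items() if changed_entity in ents}
    covered = set().union(*selected.values())
    # Impacted entities with no covering test call for additional test cases.
    uncovered = impacted - covered
    # Tests covering more impacted entities come first.
    ranked = sorted(selected, key=lambda t: len(selected[t] & impacted), reverse=True)
    return ranked, uncovered


suite_coverage = {
    "t1": {"main", "parse"},
    "t2": {"main", "print"},
    "t3": {"main", "parse", "eval"},
}
ranked, uncovered = select_and_prioritize(
    suite_coverage, "parse", {"parse", "eval", "report"}
)
print(ranked, uncovered)
```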
        Learning Properties (1/2)
           Automatic synthesis of properties/invariants
             –   Ernst et al.’s approach [ECGN01]
                     initially, a large set of invariants is supposed to hold over monitored
                      variables
                     each execution can falsify some invariants; falsified invariants are
                      discarded
                     for each surviving invariant, the probability that it holds “randomly”
                      (by chance) is computed
                     if this probability is below a given threshold, the invariant is accepted
                     synthesized properties are defined by the set of accepted invariants
           Automatic synthesis of programs
             –   Many approaches from machine learning, but they learn very simple
             –   Lau et al.’s approach [LDW03]
                     it is still simple, but it learns small computer programs
                      based on accurate execution traces and programming constructs
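
The invariant-synthesis loop described for Ernst et al.'s approach can be illustrated with a toy sketch: start from a set of candidate invariants, drop the ones falsified by observed values, and accept a survivor only if the chance of it holding by accident over the observed samples is below a threshold. The candidate set, the per-sample chance probabilities in `CHANCE` and the independence assumption are made up for illustration and are not taken from [ECGN01].

```python
# Candidate invariants over a single monitored variable x (illustrative set).
CANDIDATES = {
    "x >= 0": lambda x: x >= 0,
    "x != 0": lambda x: x != 0,
    "x is even": lambda x: x % 2 == 0,
}

# Hypothetical probability that one random sample satisfies each candidate.
CHANCE = {"x >= 0": 0.5, "x != 0": 0.9, "x is even": 0.5}


def infer_invariants(samples, threshold=0.01):
    """Return the candidates that survive all samples and are unlikely to hold by chance."""
    alive = dict(CANDIDATES)
    for x in samples:
        # Each observation can falsify candidates; falsified ones are dropped.
        alive = {name: pred for name, pred in alive.items() if pred(x)}
    n = len(samples)
    # Assuming independent samples, a candidate holds by chance with probability p ** n.
    return sorted(name for name in alive if CHANCE[name] ** n < threshold)


print(infer_invariants([2, 4, 8, 10, 6, 12, 14]))
```

With few observations nothing is accepted, even if no candidate was falsified, because the chance probability is still above the threshold.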
        Learning Properties (2/2)

           Synthesized properties, invariants and
            programs can be used to
            –   check the implementation with respect to the
            –   verify safety of updates (in terms of components’
                    Ernst et al.’s approach has been used to verify the Pre-cond,
                     Post-cond and Inv corresponding to implemented services
                     when replacing components [ME03]
            –   derive test suites
            –   give the programmer confidence in the
                implementation
        Test, Verification and Model-Checking

           Evolution of Testing, Model Checking, and
            Run-time Verification
           Will mention their advantages and disadvantages
           Mention future research agenda
           Conclusion


           It started with “The Software Crisis” [NATO,
           Led to calls for software “Engineering”
            [Bauer, 1968]
           Focus on methodology for constructing
            software (e.g. Structured Programming
            [Dijkstra, 1969]; Chief Programmer Team
            [Harlan Mills @ IBM, 1973])


           Higher level languages viewed as panacea
            (C, Java, ML, Meta-ML)
           Buggy software was still being produced
           Focus shifted to detecting and preventing
            mistakes during software construction ---

        TVM - Testing

           2 main approaches to Testing: Reliability
            Growth Modelling (RGM) and Random Testing
           In RGM, the program is tested, fails, is
            corrected, and is tested again, many times over
           The MTBF (Mean Time Between Failures) is entered
            into a mathematical model derived from
            previous experience
        TVM - Testing

           When the model indicates a very long MTBF,
            we stop testing and ship the product
           Pitfalls of RGM:
           Very tenuous (weak) link between past
            development processes and the current one
           Correction of a bug can introduce new bugs,
            which reduces dependability, and

        TVM - Testing

           Industrial practice has found that you need
            extremely large amounts of failure-free testing,
            so RGM is not cost-effective
           Random Testing: test cases are selected
            randomly from a domain of possible inputs
           Advantages of Random Testing over RGM:
           Random, therefore automatable, you are
            more likely to find errors, and

        TVM - Testing

           Random testing draws on tools from
            information theory to analyse results
           Pitfalls of Random Testing:
           Distribution of random test cases may not be
            the same as real usage of system
           Random testing takes no account of program
            size: a 10-line program is treated the same as
            a 10000-line program
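
Random selection of test cases from an input domain can be sketched as below. The harness, the uniform distribution, the fixed seed and the deliberately buggy function are all assumptions made for illustration; the built-in `abs` plays the role of the oracle.

```python
import random


def random_test(program, oracle, domain, runs=1000, seed=0):
    """Run `program` on inputs drawn uniformly at random from `domain`.

    Returns the inputs on which `program` disagrees with `oracle`.
    """
    rng = random.Random(seed)       # fixed seed so that a run can be repeated
    failures = []
    for _ in range(runs):
        x = rng.choice(domain)
        if program(x) != oracle(x):
            failures.append(x)
    return failures


def buggy_abs(x):
    """Absolute value with a seeded bug: wrong for -2 and -1."""
    return x if x > -3 else -x


failures = random_test(buggy_abs, abs, range(-10, 11))
print(sorted(set(failures)))
```

The uniform draw over `range(-10, 11)` is exactly the kind of distribution that may differ from real usage, and nothing in the harness depends on how large the program under test is.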

        TVM - Program Review

           Buggy software was still being produced
           Another panacea tried was Program Review
            (Software Inspection)
           Depends on humans making the right
           Prone to human error

        TVM - Program Proving (Theorem Proving)

           Solution then became Formal Deductive
            Reasoning – Program Proving
           Automated Theorem Provers (e.g. Isabelle
            [Camb]) were developed to prove programs
           A main problem with theorem provers is the
            impracticality of proving all layers of the
            system from software programs to hardware
            to circuits
        TVM - Model Checking

           Alternative approach to theorem provers is
            model checking
           In model checking, the specification for a
            system is expressed in temporal logic and the
            system is modelled as a finite graph of state
            transitions; a model checker checks
            whether the graph satisfies the temporal
            logic specification
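
For the simplest kind of temporal property, an invariant that must hold in every reachable state, the check amounts to exploring the finite transition graph. The sketch below assumes an explicit-state representation (a dict from state to successor states) and hypothetical names; it returns a path to a violating state when one exists. Full temporal-logic model checking needs considerably more machinery than this.

```python
from collections import deque


def check_invariant(initial, transitions, holds):
    """Breadth-first exploration of a finite transition graph.

    transitions: dict mapping a state to an iterable of successor states.
    holds: predicate that must be true in every reachable state.
    Returns None if the invariant holds, else a path from `initial` to a violation.
    """
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not holds(state):
            path = []
            while state is not None:        # walk back to the initial state
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in transitions.get(state, ()):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


machine = {"idle": ["busy"], "busy": ["idle", "error"]}
print(check_invariant("idle", machine, lambda s: s != "error"))
```

Every reachable state is visited once, which is why the cost grows with the number of states in the model.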

        TVM - Model Checking

           Advantages over theorem provers:
           Algorithmic, so the user needs only to press a
            button and wait for the result, while with
            theorem provers a user may need to direct
            the theorem prover to find a solution
           Gives counterexamples if the formula is not satisfied

        TVM - Model Checking

           Disadvantages of model checking:
           Computational complexity, and
           Some information about the system is lost
            when you turn a system with an infinite
            number of states into one with a finite number
           There are calls for Run-Time Verification of

        TVM - Run-Time Verification (RTV)

           Some ideas of this were presented above.
           Observations of some RTV tools:
           Simply debuggers with fancy features
           Or they provide good tracing mechanisms
           Encouraging observations of RTV tools:
           Some use LTL (or extensions) to describe
            the program monitor

        TVM - RTV

           Some use LTL as the basis for a Property
            Specification Language, such as PEDL,
           May be used as a basis for understanding
            and for theory

        Call to Arms - Future Research Agenda

           We need a Theory of Testing
           Such a theory should integrate good aspects of
            testing, model checking, and run-time verification
           I shall mention some approaches (references
            in our paper)

        Some Approaches to Theory of Testing

           Type Systems/Abstract Interpretation
           Work from compiling and type systems directed
            towards optimisation of code can provide good
            information to direct selection of test cases
           Polymorphism and linearity can help
           Very little work so far on Semantics of Testing
            (encouraging work from this workshop)

        Some Approaches to Theory of Testing

           Developing semantic structures (e.g. of
            domain) that facilitate testing may be
            something to look at
           Semantics of A.I. Planning to provide a basis
            for semantics of run-time verification (ref. in
            our paper)
           Domain theory in concurrency to provide
            semantics for distributed system testing (ref.
            in paper)

           Call to arms for theory builders and tool builders
           Come up with good theories and better tools
           Provide tools for software professionals to
            use for system specification, design, build,
            test, audit, monitor systems
           Let’s do it !!!