									Automatic Synthesis of Fault-Tolerance


                   Ali Ebnenasir
Software Engineering and Network Systems Laboratory
    Computer Science and Engineering Department
             Michigan State University

          Advisor: Dr. Sandeep Kulkarni
                                  Problem
•   Given a program p and a class of faults f,

    Question: How do we add desired fault-tolerance properties to p
              in order to create a new program p’ such that:

       Requirements:

           1. In the absence of f, the resulting fault-tolerant program p’ behaves similarly
              to p
           2. In the presence of f, the resulting fault-tolerant program p’ satisfies the
              desired fault-tolerance property.




                    Solution Strategies
•   Two possible approaches

    1. Manually redesign p into p’ and verify its correctness with respect to the
       problem requirements
       •   Expensive approach


    2. Automatically synthesize p’ from p
       •   Correct by construction




Previous Work on Automated Synthesis




                                   Synthesis:
                               Specification-Based
• Inputs
    – Specification of p (temporal logic expressions / automata)
    – Fault-tolerance requirements (temporal logic expressions)
    – Faults
• Synthesis algorithm: prove the satisfiability of the specification
• Output: fault-tolerant program p’
• Program synthesis: [EmersonClarke 1982], [AttieEmerson 2001], [KupfermanVardi 2001]
• Fault-tolerance synthesis: [AroraAttieEmerson 1998]
                                    Synthesis:
                                   Calculational
• Inputs
    – Fault-intolerant program p (transitions)
    – Fault-tolerance requirements
    – Faults (transitions)
• Synthesis algorithm: calculate the set of transitions
• Output: fault-tolerant program p’ (transitions)
• [KulkarniArora 2000], [KulkarniAroraChippada 2001]
         The Complexity of Calculational Synthesis


     • High atomicity model: processes can atomically read/write all
       program variables
           – Polynomial in the state space of the fault-intolerant program p [KA00]

     • Low atomicity model (distributed programs): processes have
       read/write restrictions with respect to program variables

           – Exponential in the state space of the fault-intolerant program p for
             synthesizing masking fault-tolerance [KA00]

Goal: propose techniques for the synthesis of fault-tolerant distributed programs

[KA00] S.S. Kulkarni and A. Arora, Automating the addition of fault-tolerance, FTRTFT 2000.
                                  Outline
• Preliminary concepts
• Synthesis problem
• Current results
   – Theoretical issues
       • Step-wise automation
       • Polynomial-time boundary
       • Heuristics
       • Pre-synthesized fault-tolerance components
   – Practical issues
       • A framework for the synthesis of fault-tolerant programs

• Contributions
• Open problems
                     Preliminary Concepts:
                      Programs and Faults
• Program
     – Finite number of variables with finite domains
     – Finite number of processes
•   State: a valuation of program variables
•   Finite state space Sp
•   State predicate X: X ⊆ Sp
•   Program p, Fault f: each a subset of { (s0, s1) | (s0, s1) ∈ Sp × Sp }
•   Closure: X is closed in p iff every transition of p that starts in X also ends in X
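The definitions above can be made concrete with a small sketch. The one-variable counter program, the helper names `state_space` and `is_closed`, and the particular transitions are all assumptions chosen for illustration, not taken from the talk.

```python
from itertools import product

# Hypothetical toy program: a single counter variable c with domain {0, 1, 2}.
VARIABLES = {"c": (0, 1, 2)}

def state_space(variables):
    """Enumerate every valuation of the variables as a tuple of (name, value) pairs."""
    names = sorted(variables)
    return {tuple(zip(names, values))
            for values in product(*(variables[n] for n in names))}

def is_closed(predicate, transitions):
    """X is closed in p iff every transition of p that starts in X also ends in X."""
    return all(s1 in predicate for (s0, s1) in transitions if s0 in predicate)

Sp = state_space(VARIABLES)
s = {v: (("c", v),) for v in (0, 1, 2)}         # shorthand for the three states
p = {(s[0], s[1]), (s[1], s[0]), (s[2], s[0])}  # program transitions, a subset of Sp x Sp
f = {(s[1], s[2])}                              # fault transitions, a subset of Sp x Sp
X = {s[0], s[1]}                                # state predicate, a subset of Sp

print(is_closed(X, p))      # X is closed in the program alone
print(is_closed(X, p | f))  # but not once the fault transitions are added
```

In this toy instance the fault is exactly what breaks closure of X, which is why a larger predicate (the fault-span) is needed to reason about behavior in the presence of faults.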
                 Preliminary Concepts:
           Specifications and Fault-Tolerance
• Safety specification: something bad never happens
   – Representation: a set of bad transitions, a subset of { (s0, s1) | (s0, s1) ∈ Sp × Sp }
         • E.g., transitions that change the value of a counter from non-zero
            values to zero
•   Liveness specification: something good will eventually happen
     – In the absence of faults, fault-tolerant program p’ satisfies the liveness
       specification of the fault-intolerant program p
• Invariant S, fault-span T: S ⊆ T ⊆ Sp
• Fault-tolerance: Failsafe, Nonmasking, Masking

[Figure: state space Sp containing fault-span T, which contains invariant S; p is labeled inside S, p/f inside T, and a fault transition f leaves S; the legend distinguishes program transitions from fault transitions.]
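As a rough illustration of how an invariant, a fault-span and a safety specification relate, the sketch below computes the states reachable from S under program and fault transitions. That set is the smallest candidate for T. The counter values, the transitions and the function names are hypothetical; only the counter-to-zero safety example comes from the slide.

```python
def reachable(start, transitions):
    """States reachable from `start` by following `transitions` zero or more times."""
    seen, frontier = set(start), list(start)
    while frontier:
        current = frontier.pop()
        for (a, b) in transitions:
            if a == current and b not in seen:
                seen.add(b)
                frontier.append(b)
    return seen

def violates_safety(transitions, bad_transitions):
    """A set of transitions violates safety if it contains any bad transition."""
    return bool(set(transitions) & set(bad_transitions))

# Hypothetical example: states are the values 0..3 of a counter.
p = {(1, 2), (2, 1), (3, 1)}    # program transitions
f = {(2, 3)}                    # fault transitions
S = {1, 2}                      # invariant
bad = {(1, 0), (2, 0), (3, 0)}  # safety: never go from a non-zero value to zero

T = reachable(S, p | f)         # smallest set containing S that is closed in p and f
print(sorted(T))
print(violates_safety(p | f, bad))
```

Here the fault takes the program from S to state 3, so T is strictly larger than S, and the program transition (3, 1) provides recovery back to the invariant.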
                      Preliminary Concepts:
                       Distribution Model
• Read/Write restrictions (low atomicity model)
   – Assumption: a process cannot write a variable that it cannot read.
• Example: program p
        • Two processes j, k
        • Two Boolean variables a and b
        • Process j cannot read b, but can read and write a
• Write restrictions
    – Can we include the following transition in the set of transitions of
      process j?
                a=0,b=0  →  a=1,b=1
    – No: the transition changes b, which j cannot read and therefore cannot write.

Write restrictions identify the set of transitions of each process.
                 Preliminary Concepts:
            Distribution Model – Continued
 • Read restrictions
     – Can we include the following transition in the set of transitions
       of process j?

                a=0,b=0  →  a=1,b=0

       Only if we include the transition
                 a=0,b=1  →  a=1,b=1

Groups of transitions (instead of individual transitions) must be chosen.
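The grouping effect of read restrictions can be sketched as follows for the two-variable example. The `group` and `state` helpers and the state encoding are assumptions made for illustration. The function returns every transition that process j must include together with a given one, and rejects transitions that write an unreadable variable.

```python
from itertools import product

DOMAINS = {"a": (0, 1), "b": (0, 1)}

def state(a, b):
    """Encode a state as a tuple of (variable, value) pairs, sorted by name."""
    return (("a", a), ("b", b))

def group(transition, unreadable, domains=DOMAINS):
    """All transitions that must be chosen together with `transition` because
    the process cannot read the variables in `unreadable`."""
    s0, s1 = (dict(s) for s in transition)
    # A process cannot write a variable it cannot read, so those must not change.
    if any(s0[v] != s1[v] for v in unreadable):
        raise ValueError("transition writes an unreadable variable")
    names = sorted(unreadable)
    result = set()
    for values in product(*(domains[v] for v in names)):
        override = dict(zip(names, values))
        t0 = tuple(sorted({**s0, **override}.items()))
        t1 = tuple(sorted({**s1, **override}.items()))
        result.add((t0, t1))
    return result

# Process j cannot read b: choosing (a=0,b=0) -> (a=1,b=0) forces (a=0,b=1) -> (a=1,b=1).
g = group((state(0, 0), state(1, 0)), unreadable={"b"})
for t0, t1 in sorted(g):
    print(dict(t0), "->", dict(t1))
```

A synthesis algorithm working under these restrictions has to keep or drop each such group as a whole, which is the source of the grouping issue discussed later.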
                         Synthesis Problem
• Inputs to the synthesis algorithm
     – Fault-intolerant program p
     – Specification Spec
     – Invariant S
     – Faults f
     – Distribution restrictions
     – Desired level of fault-tolerance (Masking/Nonmasking/Failsafe)
• Outputs
     – (Masking/Nonmasking/Failsafe) fault-tolerant program p'
     – Invariant S'
•     Requirements
     1.    No new behaviors are added in the absence of faults.
     2.    In the presence of faults, p’ provides the desired level of fault-tolerance.
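The first requirement can be checked mechanically once it is formalized. The sketch below assumes one common reading, which the slide does not spell out: the new invariant S' is a subset of S, and inside S' the new program uses only transitions of the original program. The function names and the example transitions are hypothetical.

```python
def restrict(transitions, predicate):
    """p|X: the transitions of p that start and end in X."""
    return {(s0, s1) for (s0, s1) in transitions
            if s0 in predicate and s1 in predicate}

def adds_no_new_behavior(p, S, p_new, S_new):
    """Assumed formalization: S' is a subset of S, and p'|S' is a subset of p|S'."""
    return S_new <= S and restrict(p_new, S_new) <= restrict(p, S_new)

# Hypothetical example over states 1..3.
p = {(1, 2), (2, 1)}
S = {1, 2}

# Adding a recovery transition from outside the invariant is allowed.
p_recovery = p | {(3, 1)}
print(adds_no_new_behavior(p, S, p_recovery, {1, 2}))

# Adding a new transition inside the invariant changes fault-free behavior.
p_changed = p | {(1, 1)}
print(adds_no_new_behavior(p, S, p_changed, {1, 2}))
```

Under this reading, synthesis is free to add recovery outside the invariant but may only remove, never add, behavior inside it.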
                          Theoretical Issues:
                  Step-Wise Automation


[Figure: the intolerant program at the bottom, the failsafe and nonmasking fault-tolerant programs in the middle, and the masking fault-tolerant program at the top; labels in the figure: [KA00], Failsafe.]
                        Theoretical Issues:
              Polynomial-Time Boundary
• Complexity: reduction from 3-SAT to the problem of
  synthesizing failsafe fault-tolerant distributed programs
   – In general, the problem of synthesizing failsafe fault-tolerant
     distributed programs from their fault-intolerant version is
     NP-complete.
• Intuitively, the exponential complexity is due to the inability of a
  process to safely estimate unreadable variables even in the
  absence of faults (grouping issue).

• What are the necessary and sufficient conditions for polynomial
  synthesis of failsafe fault-tolerant distributed programs?
• Restrictions on
   – The transitions of the fault-intolerant programs
   – Specifications
                                   Theoretical Issues:
                   Monotonicity of Specifications
     • Definition: A specification spec is positive monotonic with
       respect to a Boolean variable x iff:
          • For every (s0, s1) and (s’0, s’1) grouped together due to the inability to
            read x, where x = false in s0 and s1 and x = true in s’0 and s’1:
               – If (s0, s1) does not violate safety,
                 then (s’0, s’1) does not violate safety
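The definition can be tested by brute force on a small state space. In the sketch below, the safety specification is a set of bad transitions, and two transitions are treated as grouped when they differ only in the value of x. The variable y, the helper names and both example specifications are assumptions for illustration.

```python
def st(x, y):
    """Encode a state over a Boolean x and a bit y as sorted (variable, value) pairs."""
    return (("x", x), ("y", y))

def with_value(state, var, value):
    """Copy of `state` with `var` set to `value`."""
    d = dict(state)
    d[var] = value
    return tuple(sorted(d.items()))

def spec_is_positive_monotonic(bad, transitions, x):
    """For each transition with x = false throughout: if it does not violate safety,
    its grouped twin with x = true must not violate safety either."""
    for (s0, s1) in transitions:
        if dict(s0)[x] or dict(s1)[x]:
            continue  # the definition starts from transitions where x is false
        twin = (with_value(s0, x, True), with_value(s1, x, True))
        if (s0, s1) not in bad and twin in bad:
            return False
    return True

# Every transition that leaves x = false unchanged.
candidates = {(st(False, y0), st(False, y1)) for y0 in (0, 1) for y1 in (0, 1)}

bad_when_false = {(st(False, 0), st(False, 1))}  # setting y is unsafe only if x is false
bad_when_true = {(st(True, 0), st(True, 1))}     # setting y is unsafe only if x is true

print(spec_is_positive_monotonic(bad_when_false, candidates, "x"))
print(spec_is_positive_monotonic(bad_when_true, candidates, "x"))
```

With the first specification a process that cannot read x may safely assume x is false: anything safe under that assumption stays safe if x turns out to be true.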
                           Theoretical Issues:
                 Monotonicity of Programs
     • Definition: Program p with invariant S is positive
       monotonic with respect to a Boolean variable x iff:
          • For every (s0, s1) and (s’0, s’1) grouped together due to the inability to
            read x

[Figure: the grouped transitions (s0, s1) with x = false and (s’0, s’1) with x = true, drawn relative to invariant S.]

Monotonicity requirements capture the notion that safe assumptions
can be made about variables that cannot be read.
                      Theoretical Issues:
                Monotonicity Theorem
• Sufficiency:
   If
        • Program is negative monotonic, and
        • Spec is positive monotonic
   or
        • Program is positive monotonic, and
        • Spec is negative monotonic
   then
        synthesis of failsafe fault-tolerance can be done in polynomial time
• Necessity: If only one of these conditions is satisfied, then
  synthesizing failsafe fault-tolerance remains NP-complete.
• For many problems, these requirements are easily met
  (e.g., Byzantine agreement, consensus, atomic commit)
                                       Theoretical Issues:
         An Example for Monotonicity Theorem
• Dijkstra’s guarded commands (actions)
   – Guard → Statement
   – { (s0, s1) | Guard holds at s0 and atomic execution of Statement yields s1 }
• Example: Byzantine agreement
   – Safety Specification of Byzantine agreement:
        • Agreement: No two non-Byzantine non-generals can finalize with different
          decisions
        • Validity: If g is not Byzantine then no non-Byzantine process can finalize
          with a decision different from that of g
   – Processes: General, g, and three non-generals j, k, and l
              –   d.g : {0, 1}
              –   d.j, d.k, d.l : {0, 1, ⊥ }
              –   b.g, b.j, b.k, b.l : {0, 1}
              –   f.j, f.k, f.l : {0, 1}
                           Theoretical Issues:
    An Example for Monotonicity Theorem

• Program actions for process j
       d.j = ⊥ ∧ f.j = 0   →   d.j := d.g
       d.j ≠ ⊥ ∧ f.j = 0   →   f.j := 1

• Fault transitions for process j
       ¬b.g ∧ ¬b.j ∧ ¬b.k ∧ ¬b.l   →   b.j := 1
       b.j                          →   d.j := 0|1

• Read/Write restrictions:
   – Readable variables for process j:
       • b.j, d.j, f.j, d.g, d.k, d.l
   – Process j can write d.j, f.j
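A guarded command denotes the set of transitions (s0, s1) where the guard holds at s0 and executing the statement yields s1. The sketch below expands the two program actions of process j into that set. It is restricted to the three variables the actions mention (d.g, d.j, f.j), and the encoding of ⊥ as `None`, the `ACTIONS_J` table and the helper names are assumptions for illustration.

```python
from itertools import product

BOTTOM = None  # stands for the undecided value ⊥

# Only the variables that the two program actions of process j mention.
DOMAINS = {"d.g": (0, 1), "d.j": (0, 1, BOTTOM), "f.j": (0, 1)}

# Each action is a (guard, statement) pair over a state dictionary.
ACTIONS_J = [
    # d.j = ⊥ ∧ f.j = 0  →  d.j := d.g
    (lambda s: s["d.j"] is BOTTOM and s["f.j"] == 0,
     lambda s: {**s, "d.j": s["d.g"]}),
    # d.j ≠ ⊥ ∧ f.j = 0  →  f.j := 1
    (lambda s: s["d.j"] is not BOTTOM and s["f.j"] == 0,
     lambda s: {**s, "f.j": 1}),
]

def freeze(s):
    """Hashable form of a state: values in the order (d.g, d.j, f.j)."""
    return tuple(s[name] for name in sorted(DOMAINS))

def transitions(actions, domains):
    """{ (s0, s1) | some guard holds at s0 and its statement yields s1 }."""
    names = sorted(domains)
    result = set()
    for values in product(*(domains[n] for n in names)):
        s0 = dict(zip(names, values))
        for guard, statement in actions:
            if guard(s0):
                result.add((freeze(s0), freeze(statement(s0))))
    return result

p_j = transitions(ACTIONS_J, DOMAINS)
print(len(p_j))
```

The fault actions could be expanded the same way by adding the b variables to `DOMAINS`; they are left out here to keep the state space small.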
                              Theoretical Issues:
 An Example for Monotonicity Theorem – Continued
• Observation 1: Negative monotonicity of specification with
                  respect to f.j
• Observation 2: Positive monotonicity of program, consisting of
                  the transitions of j, with respect to f.k


• Observation 3: Positive monotonicity of specification with
                         respect to b.j
   – The specification does not stipulate anything about the Byzantine processes
• Observation 4: Negative monotonicity of program, consisting of
                         the transitions of j, with respect to b.k
   Therefore, synthesis of an agreement program that is failsafe to Byzantine
   faults can be done in polynomial time.
                          Theoretical Issues:
                                 Heuristics
•   Heuristic: A strategy for making deterministic decisions to
     reduce the complexity of synthesis
    – Example: Reuse the structure of nonmasking programs in the synthesis of
        their masking versions


[Figure: the intolerant program is first made nonmasking fault-tolerant; fault-tolerance enhancement then turns the nonmasking fault-tolerant program into a masking fault-tolerant one.]
                                       Theoretical Issues:
    Pre-Synthesized Fault-Tolerance Components
 • What if existing heuristics fail?

 • How can we reuse the techniques used in the synthesis of a
   program, in the synthesis of another program?

 • Can we encapsulate commonly encountered synthesis patterns in
   terms of pre-synthesized fault-tolerance components?

 • Detectors and correctors are necessary and sufficient in the design
   of fault-tolerance [AK98]
       – Detectors and correctors have the potential to provide a rich library of
         pre-synthesized fault-tolerance components
[AK98] A. Arora and S.S. Kulkarni, Detectors and Correctors: A Theory of Fault-Tolerance, IEEE ICDCS 1998.
               Theoretical Issues:
       Using Pre-Synthesized Components

• If available heuristics fail to add recovery from a deadlock
  state sd:
     1. Automatically specify the required component
     2. Extract the component from the component library
     3. Verify the interference-freedom of the composition
     4. Add the extracted component to the fault-intolerant program
             Theoretical Issues:
Pre-Synthesized Components - Achievements
• Reducing the chance of failure in the synthesis

• Providing a mechanism for the reuse of synthesis
  techniques

• Extending the scope of the synthesis problem to cases where the state
  space is expanded during synthesis

• Controlling the way new variables are introduced


                      Practical Issues:
                    Framework Goals
• Goals of the framework design

  – Ability to synthesize fault-tolerant programs from their fault-
    intolerant versions

  – Ability to integrate new heuristics into the framework

  – Ability to change implementation




                              Practical Issues:
                         Synthesis Framework
• Layers (top to bottom)
    – Library of pre-synthesized fault-tolerance components
    – Synthesis algorithm
    – Interactive user interface
    – The user (fault-tolerance developer)
• Exchanged between the library and the synthesis algorithm: component specification,
  pre-synthesized detectors/correctors
• Exchanged between the synthesis algorithm, the user interface and the user:
  p, S, f, spec; results; queries; p’, S’
• Input format: guarded commands, state predicates
• Output format: guarded commands/Promela, state predicates
                                   Practical Issues:
           Framework Internals – Synthesis Algorithm
• Input: fault-intolerant program p, S, f, spec
• Phases
    – Initialization: expand the reachability graph; remove bad transitions;
      remove bad states
    – Preserve invariant: calculate a valid fault-span; ensure safety;
      ensure deadlock freedom
    – Modify invariant: calculate a valid invariant; ensure deadlock freedom;
      resolve non-progress cycles
• Output: fault-tolerant program p’, S’; reachability graph of the
  fault-tolerant program
• The figure also marks the interaction points
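To give a feel for steps such as "remove bad states", "remove bad transitions" and "ensure deadlock freedom", here is a minimal sketch of adding failsafe fault-tolerance in the high atomicity model, in the spirit of [KA00]. It is not the framework's implementation: it ignores read/write restrictions and user interaction, and the function name `add_failsafe` and the example program are assumptions.

```python
def add_failsafe(p, f, S, bad):
    """Return (p_new, S_new) for program p, faults f, invariant S and bad transitions."""
    # Bad states: states from which fault transitions alone can violate safety.
    ms = {s0 for (s0, s1) in f if (s0, s1) in bad}
    changed = True
    while changed:
        changed = False
        for (s0, s1) in f:
            if s1 in ms and s0 not in ms:
                ms.add(s0)
                changed = True

    # Remove bad transitions: those that violate safety or reach a bad state.
    safe_p = {(s0, s1) for (s0, s1) in p if (s0, s1) not in bad and s1 not in ms}

    # Shrink the invariant until no state in it is deadlocked.
    S_new = set(S) - ms
    changed = True
    while changed:
        changed = False
        for s in list(S_new):
            if not any(s0 == s and s1 in S_new for (s0, s1) in safe_p):
                S_new.discard(s)
                changed = True

    # Keep the new invariant closed: drop transitions that leave it.
    p_new = {(s0, s1) for (s0, s1) in safe_p
             if not (s0 in S_new and s1 not in S_new)}
    return p_new, S_new

# Hypothetical example over states 0..4.
p = {(0, 1), (1, 0), (1, 2), (2, 0)}
f = {(2, 3), (3, 4)}
S = {0, 1, 2}
bad = {(3, 4)}  # the only transition that violates safety

p_new, S_new = add_failsafe(p, f, S, bad)
print(sorted(p_new), sorted(S_new))
```

In this example, state 2 is removed from the invariant because faults alone can drive the program from it to the unsafe transition, and the program transition into state 2 is removed with it. An empty `S_new` would mean this sketch found no failsafe program.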
                     Practical Issues:
             Current Status of the Framework
• Example synthesized programs:
   – Token ring with 7 processes
   – Byzantine agreement with 4 non-general processes and one general
     process
   – An agreement program that is subject to both Byzantine and fail-stop
     faults (1.3 million states)


• Currently, the framework can
   – handle different types of faults (e.g., process restart, Byzantine, fail-
     stop)
   – synthesize programs that are simultaneously subject to multiple types
     of faults


                             Contributions
• Showing the NP-completeness of synthesizing failsafe fault-tolerance


• Identifying the necessary and sufficient conditions for polynomial-time
   synthesis of failsafe fault-tolerance


• Reusing the computational structure of fault-intolerant programs to
   reduce the complexity of synthesis (enhancement)


• Identifying synthesis patterns as pre-synthesized fault-tolerance
  components

• Developing a framework for the synthesis of fault-tolerant programs
                          Open Problems
• Theoretical issues
   – Transforming non-monotonic programs/specifications into monotonic ones
       • Extending the scope of the programs that can reap the benefit of efficient
         automation

   – Necessary and sufficient conditions for simultaneous addition of multiple
     pre-synthesized fault-tolerance components

   – Necessary and sufficient conditions for polynomial-time synthesis of
     nonmasking fault-tolerant programs

   – Automated synthesis of multitolerance


                  Open Problems - Continued

       • Practical issues
            – Distributed synthesis algorithm
            – Symbolic synthesis of fault-tolerance


[Figure: a distributed synthesis algorithm sends yes/no queries (verify safety, closure, ..., cycle detection) to multiple SAT solvers.]
        Open Problems - Continued

• Using model checkers for acquiring behavioral
  information during synthesis


[Figure: a distributed synthesis algorithm on top of multiple SPIN instances.]
                          Publications
• Published papers
   – Sandeep S. Kulkarni and Ali Ebnenasir. "Enhancing The Fault-
     Tolerance of Nonmasking Programs". IEEE ICDCS 2003.
   – Ali Ebnenasir. "Algorithmic Synthesis of Fault-Tolerant Distributed
     Programs". Doctoral Symposium of ICDCS 2003.
   – Sandeep S. Kulkarni and Ali Ebnenasir. "The Complexity of Adding
     Failsafe Fault-Tolerance". IEEE ICDCS 2002.

• Submitted papers
   – Sandeep S. Kulkarni and Ali Ebnenasir. "Adding Fault-Tolerance
     Using Pre-Synthesized Components". Submitted to CBSE7, ICSE
     2004.
   – Ali Ebnenasir and Sandeep S. Kulkarni. "A Framework for
     Automatic Synthesis of Fault-Tolerance". Submitted to DSN 2004.
   – Sandeep S. Kulkarni and Ali Ebnenasir. "Automated Synthesis of
     Multitolerance". Submitted to DSN 2004.

    Thank you!




Questions and comments?



