CS 406 Fall 98 Software Testing
  Part III: Test Assessment and Improvement

                  Aditya P. Mathur
                  Purdue University




Last update: July 19, 1998
Learning Objectives
 To understand the relevance and importance
  of test assessment.
 To learn the fundamental principle
  underlying test assessment.
 To learn various methods and tools for test
  assessment.


               Test assessment and improvement   2
Learning objectives
 To understand the relative
  strengths/weaknesses of test assessment
  methods.
 To learn how to improve tests based on a
  test assessment procedure.



What is test assessment?
 Once a test set T, a collection of test inputs,
  has been developed, we ask:
                How good is T?
 It is the measurement of the goodness of T
  which is known as test assessment.
 Test assessment is carried out based on one
  or more criteria.

Test assessment-continued
 These criteria are known as test adequacy
  criteria.
 Test assessment is also known as test
  adequacy assessment.




Test assessment-continued
 Test assessment provides the following
  information:
  – A metric, also known as the adequacy score or
    coverage, usually between 0 and 1.
  – A list of all the weaknesses found in T, which
    when removed, will raise the score to 1.
  – The weaknesses depend on the criteria used for
    assessment.
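The adequacy score and the list of weaknesses can be sketched in a few lines of C. This is only an illustration: the array covered[] and the function names are assumptions made here, not part of any tool discussed in this course. Each entry of covered[] stands for one element of whatever the chosen criterion asks us to cover.

```c
#include <assert.h>
#include <stddef.h>

/* Adequacy score of a test set: covered elements / total elements.
   covered[i] is nonzero if element i was exercised by at least one
   test in T. (Hypothetical representation, for illustration only.) */
double adequacy_score(const int covered[], size_t n)
{
    size_t hit = 0;
    assert(n > 0);                 /* the set of elements must be non-empty */
    for (size_t i = 0; i < n; i++)
        if (covered[i])
            hit++;
    return (double)hit / (double)n;
}

/* Writes the indices of the uncovered elements (the weaknesses of T)
   into out[] and returns how many there are. out[] needs room for
   n entries. */
size_t weaknesses(const int covered[], size_t n, size_t out[])
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (!covered[i])
            out[count++] = i;
    return count;
}
```

With covered = {1, 0, 1, 1} the score is 3/4 and the only weakness is element 1; covering it raises the score to 1.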

Test assessment-continued
 Once the coverage has been computed, and the
  weaknesses identified, one can improve T.
 Improvement of T is done by examining one or
  more weaknesses and constructing new test
  requirements designed to overcome the
  weakness(es).
 The new test requirements lead to new test
  specifications and to further testing of the
  program.
Test assessment-continued
 This is continued until all weaknesses are
  overcome, i.e. the adequacy criterion is
  satisfied (coverage=1).
 In some instances it may not be possible to
  satisfy the adequacy criteria for one or more
  of the following reasons:
     • Lack of sufficient manpower
     • Weaknesses that cannot be removed because they
       are infeasible.
Test assessment-continued
      • The cost of removing the weaknesses is not
        justified.
 While improving T by removing its
  weaknesses, one usually tests the program
  more thoroughly than it has been tested so
  far.
 This additional testing is likely to result in
  the discovery of remaining errors.
Test assessment-continued
 Hence we say that test assessment and
  improvement helps in the improvement of
  software reliability.
 Test assessment and improvement is
  applicable throughout the testing process
  and during all stages of software
  development.

Test assessment-summary procedure
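  0. Develop T.
  1. Select an adequacy criterion C.
  2. Measure the adequacy of T w.r.t. C.
  3. Is T adequate? If no, go to step 4; if yes, go to step 5.
  4. Improve T, then return to step 2.
  5. Is more testing warranted? If yes, return to step 1; if no, go to step 6.
  6. Done.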
Principle underlying test assessment

 There is a uniform principle that underlies
  test assessment throughout the testing
  process.
 This principle is known as the coverage
  principle.
 It has come about as a result of intensive
  research at Purdue and other research
  groups in software testing.
The coverage principle
 To formulate and understand the coverage
  principle, we need to understand:
  – coverage domains
  – coverage elements
 A coverage domain is a finite domain,
  related to the program under test, that we
  want to cover. Coverage elements are the
  individual elements of this domain.
The coverage principle-continued
 Coverage Domains                Coverage Elements

Requirements
Classes
Functions
Interface mutations
Exceptions

The coverage principle-continued
 Measuring test adequacy and improving a
 test set against a sequence of well defined,
 increasingly strong, coverage domains leads
 to improved confidence in the reliability of
 the system under test.




The coverage principle-continued
 Note the following properties of a coverage
  domain:
  – It is related to the program under test.
  – It is finite.
  – It may come from program requirements,
    related to the inputs and outputs.



The coverage principle-continued
 – It may come from program code. Can you think
   of a coverage domain that comes from the
   program code?
 – It aids in measuring test adequacy as well as the
   progress made in testing. How?




The coverage principle-continued
 Example:
  – It is required to write a program that takes in
    the name of a person as a string and searches
    for the name in a file of names. The program
    must output the record ID which matches the
    given name. In case of no match a -1 is
    returned.
    What coverage domains can be identified from
                     this requirement?
The coverage principle-continued
 As we learned earlier, improving coverage
  improves our confidence in the correct
  functioning of the program under test.
 Given a program P and a test T suppose that
  T is adequate w.r.t. a coverage criterion C.
 Does this mean that P is error free?
 Obviously……???

Test effort
 There are several measures of test effort.
 One measure is the size of T. By this
  measure a test set with a larger number of
  test cases corresponds to higher effort than
  one with fewer test cases.



Error detection effectiveness
 Each coverage criterion has its error
  detection ability. This is also known as the
  error detection effectiveness or simply
  effectiveness of the criterion.
 One measure of the effectiveness of
  criterion C is the fraction of faults
  guaranteed to be revealed by a test T that
  satisfies C.
Effectiveness-continued
 Another measure is the probability that at
  least fraction f of the faults in P will be
  revealed by test T that satisfies C.
 Unfortunately there is no absolute measure
  of the effectiveness of any given coverage
  criterion for a general class of programs and
  for arbitrary test sets.

Effectiveness-continued
 One coverage criterion results in an
  exception to this rule: What is it?
 Empirical studies conducted by researchers
  give us an idea of the relative goodness of
  various coverage criteria.
 Thus, for a variety of criteria we can make a
  statement like: Criterion C1 is definitely
  better than criterion C2.
Effectiveness-continued
 In some cases we may be able to say:
  Criterion C1 is probably better than
  criterion C2.
 Such information allows us to construct a
  hierarchy of coverage criteria.
 This hierarchy is helpful in organizing and
  managing testing. How?

Strength of a coverage criterion
 The effectiveness of a coverage criterion is
  also referred to as its strength.
 Strength is a measure of the criterion’s
  ability to reveal faults in a program.
 Criterion C1 is considered stronger than
  criterion C2 if C1 is capable of revealing
  more faults than C2.

The Saturation Effect
 The rate at which new faults are discovered
  reduces as test adequacy with respect to a
  finite coverage domain increases; it reduces
  to zero when the coverage domain has been
  exhausted.

      f / c

                0      coverage           1
                    Test assessment and improvement   26
Saturation Effect: Fault View
      [Figure: remaining faults (axis marks N, M, 0) plotted against testing effort.
       Effort-axis labels: Functional, tfs, tfe, tds, tdfe, tme.]
Saturation Effect: Reliability View
      [Figure: reliability plotted against testing effort.
       Curve labels: Functional, Decision, Dataflow, Mutation.
       Reliability-axis labels: Rf, Rd, Rdf, Rm and R'f, R'd, R'df, R'm.
       Effort-axis labels: tfs, tfe, tds, tde, tdfs, tdfe, tms, tme.
       Legend: true reliability (R), estimated reliability (R'), saturation region.]
 Functional, decision, dataflow, and mutation coverage provide
  various test evaluation criteria.
Coverage principle-discussion
 Discuss:
    How will you use the knowledge of coverage
    principle and the saturation effect in organizing
    and managing testing?

     Can you think of any other uses of the coverage
     principle and the saturation effect?


Control flow graph
 The control flow graph (CFG) of a program is a
  representation of the flow of execution
  within the program.
 It is useful in program analysis such as that
  required during test assessment and
  improvement.
 More formally, a CFG G is:

Control flow graph
 – G=(N,A)
    where N: set of nodes and A: set of arcs
 – There is a unique entry node en in N.
 – There is a unique exit node ex in N. A node
   represents a single statement or a block.
 – A block is a single-entry-single-exit sequence
   of instructions that are always executed in a
   sequence without any diversion of path except
   at the end of the block.
Control flow graph-continued
 – Every statement in a block, except possibly the
   first one, has exactly one predecessor.
 – Similarly, every statement in the block, except
   possibly the last one, has exactly one successor.
 – An arc a in A is a pair (n,m) of nodes from N
   which represents transfer of control from node n
   to node m.
 – A path of length k in G is an ordered sequence
   of arcs $a_1, a_2, \ldots, a_k$ from A such that:
Control flow graph-continued
    • The first node in $a_1$ is en.
    • The last node in $a_k$ is ex.
    • For any two adjacent arcs $a_i = (n,m)$ and
      $a_{i+1} = (p,q)$, $m = p$.
 – A path is considered executable or feasible if
   there exists a test case which causes this path to
   be traversed during program execution,
   otherwise the path is unexecutable or infeasible.
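The definition of a path can be checked mechanically. The sketch below is one possible representation, assumed here for illustration: the struct names and the function is_path are not from the course material. It tests the three conditions above and also that every arc of the sequence belongs to A. It says nothing about feasibility, which depends on the program's semantics.

```c
#include <assert.h>
#include <stddef.h>

/* An arc (n, m): transfer of control from node n to node m. */
struct arc { int from, to; };

/* G = (N, A) with its unique entry and exit nodes. Only A is stored;
   nodes are identified by integers. */
struct cfg {
    const struct arc *arcs;
    size_t n_arcs;
    int entry, exit_node;
};

static int has_arc(const struct cfg *g, struct arc a)
{
    for (size_t i = 0; i < g->n_arcs; i++)
        if (g->arcs[i].from == a.from && g->arcs[i].to == a.to)
            return 1;
    return 0;
}

/* Returns 1 if p[0..k-1] is a path in g: it starts at the entry node,
   ends at the exit node, uses only arcs of g, and each arc begins
   where the previous one ended. Returns 0 otherwise. */
int is_path(const struct cfg *g, const struct arc p[], size_t k)
{
    assert(g != NULL);
    if (k == 0)
        return 0;
    if (p[0].from != g->entry || p[k - 1].to != g->exit_node)
        return 0;
    for (size_t i = 0; i < k; i++) {
        if (!has_arc(g, p[i]))
            return 0;
        if (i + 1 < k && p[i].to != p[i + 1].from)
            return 0;
    }
    return 1;
}
```

For a graph with arcs (1,2), (1,3), (2,4), (3,4), entry 1 and exit 4, the sequence (1,2),(2,4) is a path while (1,2),(3,4) is not.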


Control flow graph-example
   Class exercise:
   Draw a CFG for the following program:
        1.     scanf (x,y); if (y<0)
        2.               pow=0-y;
        3.     else pow=y;
        4.     z=1.0;
        5.     while (pow !=0)
        6.               {z=z*x; pow=pow-1;}
        7.     if (y<0)
        8.               z=1.0/z;
        9.     printf(z);
                        What does the above program compute?
Control flow graph-example
   Class exercise:

    For the CFG you have drawn, list all paths of length
    at most 10.

    Are there more paths than what you have listed?




Structure-based test adequacy
 Based on the CFG of a program several test
  adequacy criteria can be defined.
 Some are:
     •   statement coverage criterion
     •   branch coverage criterion
     •   condition coverage criterion
     •   path coverage criterion


Statement coverage
 The coverage domain consists of all
  statements in the program. Restated, in
  terms of the control flow graph, it is the set
  of all nodes in G.
 A test T satisfies the statement coverage
  criterion if upon execution of P on each
  element of T, each statement of P has been
  executed at least once.
Statement coverage-continued
 Restated in terms of G, T is adequate w.r.t.
  the statement coverage criterion if each
  node in N is on at least one of the paths
  traversed when P is executed on each
  element of T.
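One way to measure statement coverage is to place a probe before each statement. The sketch below does this by hand for the nine numbered statements of the exponentiation program from the CFG exercise. It is an assumption-laden illustration, not the way any particular tool works: inputs are passed as parameters instead of being read with scanf, the result is returned instead of printed, and the names hit, power and statement_coverage are made up here.

```c
#include <assert.h>

#define N_STMTS 9                  /* statements 1..9 of the listing */
static int hit[N_STMTS + 1];       /* hit[i] != 0 once statement i has run */

/* The exponentiation program with a probe before each numbered
   statement. */
double power(double x, int y)
{
    int pow;
    double z;

    assert(x != 0.0 || y >= 0);    /* avoid 1.0/0.0 in statement 8 */
    hit[1] = 1;
    if (y < 0) { hit[2] = 1; pow = 0 - y; }
    else       { hit[3] = 1; pow = y; }
    hit[4] = 1; z = 1.0;
    hit[5] = 1;
    while (pow != 0) { hit[6] = 1; z = z * x; pow = pow - 1; }
    hit[7] = 1;
    if (y < 0) { hit[8] = 1; z = 1.0 / z; }
    hit[9] = 1;
    return z;                      /* stands in for printf(z) */
}

/* Fraction of the nine statements executed by the tests run so far. */
double statement_coverage(void)
{
    int covered = 0;
    for (int i = 1; i <= N_STMTS; i++)
        covered += (hit[i] != 0);
    return (double)covered / N_STMTS;
}
```

After the single test (x=2, y=3), statements 2 and 8 are still uncovered, so coverage is 7/9. Adding a test with negative y, such as (x=2, y=-1), covers them.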




Statement coverage-continued
 Class exercise:
  – For the program for which you have drawn the
    control flow graph, develop a test set that
    satisfies the statement coverage criterion.
  – Follow the procedure for test assessment and
    improvement suggested earlier.



Statement coverage-weakness
 Consider the following program:
     int abs(int x)
     {
         if (x>=0) x=0-x;
         return x;
     }




Statement coverage-weakness
 Suppose that T= {(x=0)}.
 Clearly, T satisfies the statement coverage
  criterion.
 But is the program correct? And is the error
  revealed by T, which is adequate w.r.t. the
  statement coverage criterion?
  What do you suggest we do to improve T?
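To experiment with these questions, one can run the function against an independently written oracle. The sketch below is an illustration only: abs_slide is the function from the previous slide, renamed here to avoid clashing with the standard library's abs, and abs_expected and reveals_failure are hypothetical helpers.

```c
#include <assert.h>
#include <limits.h>

/* The function from the slide, copied as given. */
int abs_slide(int x)
{
    if (x >= 0) x = 0 - x;
    return x;
}

/* Oracle: the intended behavior, written independently. */
int abs_expected(int x)
{
    assert(x != INT_MIN);          /* -INT_MIN is not representable */
    return x < 0 ? -x : x;
}

/* Returns 1 if test input x makes the program's output differ from
   the oracle's, 0 otherwise. */
int reveals_failure(int x)
{
    return abs_slide(x) != abs_expected(x);
}
```

Trying reveals_failure on x=0 and then on a few nonzero inputs shows which tests are worth adding to T.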

Branch (or edge) coverage
 In G there may be nodes which correspond
  to conditions in P. Such nodes, also called
  condition nodes, contain branches in P.
 Each such node is considered covered if the
  condition evaluates to both true and false
  during execution of P; the two outcomes
  need not occur in the same execution.

Branch coverage
 The coverage domain consists of all
  branches in P. Restated, in terms of the
  control flow graph, it is the set of all arcs
  exiting the condition nodes.
 A test T satisfies the branch coverage
  criterion if upon execution of P on each
  element of T, each branch of P has been
  executed at least once.
Branch coverage
 Class exercise:
     • Identify all condition nodes in the flow graph you
       have drawn earlier.
     • Does T= {(x=0)} satisfy the branch coverage
       criterion?
     • If not, then improve it so that it does.




Branch coverage-weakness
 Consider the following program which is
 supposed to check that the input data item is
 in the range 0 to 100, inclusive:
     int check(int x)
     {
         if ((x>=0) && (x<=200))
             return 1;   /* true */
         else
             return 0;   /* false */
     }
Branch coverage-weakness
 Class exercise:
     • Do you notice the error in this program?
     • Find a test set T which is adequate w.r.t. statement
       coverage and does not reveal the error.
     • Improve T so that it is adequate w.r.t. branch
       coverage and does not reveal the error.
     • What do you conclude about the weakness of the
       branch coverage criterion?


Condition coverage
 Condition nodes in G might have compound
  conditions.
 For example, in the check program the
  condition node contains the condition:
           ((x>=0 ) && (x<=200))

 This is a compound condition which
  consists of the elementary conditions x>=0
  and x<=200.
Condition coverage-continued
 A compound condition is considered
  covered if each of its constituent elementary
  conditions evaluates to both true and false
  over one or more executions of P.
 A test set T is adequate w.r.t. condition
  coverage if all conditions in P are covered
  when P is executed on elements of T.
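Condition coverage can be measured by wrapping each elementary condition in a probe that records its outcome. The sketch below does this for the check function, keeping the upper bound 200 exactly as given. The names seen, probe and condition_adequate are assumptions made for this illustration. Note that C evaluates && left to right and skips the second operand when the first is false, so x<=200 is only recorded when x>=0 holds.

```c
#include <assert.h>

/* seen[c][v] is set once elementary condition c has evaluated to v
   (v = 0 for false, 1 for true). c = 0 is x>=0, c = 1 is x<=200. */
static int seen[2][2];

/* Records the outcome of an elementary condition and passes it on. */
static int probe(int c, int value)
{
    assert(c == 0 || c == 1);
    seen[c][value != 0] = 1;
    return value;
}

/* The check function with each elementary condition probed. */
int check(int x)
{
    if (probe(0, x >= 0) && probe(1, x <= 200))
        return 1;
    else
        return 0;
}

/* Returns 1 once both elementary conditions have been seen to
   evaluate to true and to false. */
int condition_adequate(void)
{
    return seen[0][0] && seen[0][1] && seen[1][0] && seen[1][1];
}
```

After the tests x=50 and x=-1 the condition x<=200 has never been false, so condition_adequate() still returns 0. It returns 1 only after some test with x greater than 200 has been run.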

Condition coverage-continued
 Class exercise:
     • Improve T from the previous exercise so that it is
       adequate w.r.t. the condition coverage criterion for
       the check function and does not reveal the error.
     • Do you find the above possible?




Branch coverage-weakness, continued

 Consider the following program:
     0.   int set_z(x,y);
           {
          1.        int x,y;
          2.        if (x!=0)
          3.                  y=5;
          4.        else z=z-x;
          5.        if (z>1)
          6.                  z=z/x;               What might happen here?
          7.        else
          8.                  z=y;
           }
Branch coverage-weakness
 Class exercise:
     • Construct T for set_z such that (a) T is adequate
       w.r.t. the branch coverage criterion and (b) does not
       reveal the error.
     • What do you conclude about the effectiveness of the
       branch and condition coverage criteria?




Path coverage
 As mentioned before, a path through a
  program is a sequence of statements such
  that the entry node of the program CFG is
  the first node on the path and the exit node
  is the last one on the path.
    Is this definition equivalent to the one given
    earlier?


Path coverage-continued
 A test set T is considered adequate w.r.t. the
  path coverage criterion if every path in P is
  traversed at least once when P is executed on
  the elements of T.
 Class exercise:
     • Construct T for set_z such that T is adequate w.r.t.
       the path coverage criterion and does not reveal the
       error.
     • Is the above possible?
Path coverage-weakness
 The number of paths in a program is usually
  very large.
 How many paths in set_z?
 How many paths in check?
 How many in the program that computes
                   x^y?

Path coverage-weaknesses
 It is the infinite or prohibitively large
  number of paths that prevents the use of this
  criterion in practice.
 Suppose that a test set T covers all paths.
  Will it guarantee that all errors in P are
  revealed?
 Is obtaining 100% path coverage equivalent
  to exhaustive testing?
Variants of path coverage
 As path coverage is usually impossible to
  attain, other heuristics have been proposed.
 Loop coverage:
  – Make sure that each loop is executed 0, 1, and 2
    times.
 Try several combinations of if and switch
  statements. The combinations must come
  from requirements.
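The loop coverage heuristic can be checked with a counter on the loop body. The sketch below applies it to the loop of the exponentiation program. It is an illustration under assumptions: inputs are passed as parameters, and the names iter_seen, power_counted and loop_adequate are made up here.

```c
#include <assert.h>

/* iter_seen[k] is set once some run has executed the loop body
   exactly k times, for k = 0, 1, 2. */
static int iter_seen[3];

/* The exponentiation program with an iteration counter on its loop. */
double power_counted(double x, int y)
{
    int pow = (y < 0) ? 0 - y : y;
    int iterations = 0;
    double z = 1.0;

    assert(x != 0.0 || y >= 0);    /* avoid 1.0/0.0 below */
    while (pow != 0) {
        z = z * x;
        pow = pow - 1;
        iterations++;
    }
    if (iterations <= 2)
        iter_seen[iterations] = 1;
    if (y < 0)
        z = 1.0 / z;
    return z;
}

/* Returns 1 once the loop has been executed 0, 1, and 2 times. */
int loop_adequate(void)
{
    return iter_seen[0] && iter_seen[1] && iter_seen[2];
}
```

Since the loop body runs |y| times, tests with y = 0, y = 1 and y = 2 (or y = -2) satisfy the heuristic for this program.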
Hierarchy in Control flow criteria

         Path coverage
               |
        Condition coverage
               |
         Branch coverage
               |
        Statement coverage

   A criterion X shown above a criterion Y subsumes Y.

Exercise
 Develop a test set T that is adequate w.r.t.
  the statement, condition, and the loop
  coverage criteria for the exponentiation
  program.




Testing technique or strategy
 One can develop a testing strategy based on
  any of the criteria discussed.
 Example:
  – A testing strategy based on the statement
    coverage criterion will begin by evaluating a
    test set T against this criterion. Then new tests
    will be added to T until all the statements are
    covered, i.e. T satisfies the criterion.

Definitions
 Error-sensitive path: a path whose
  execution might lead to eventual detection
  of an error.
 Error-revealing path: a path whose
  execution will always cause the program to
  fail and the error to be detected.


Definitions
 Reliable: A testing technique is reliable for
  an error if it guarantees that the error will
  always be detected.
  – This implies that a reliable testing technique
    must lead to the exercising of at least one error-
    revealing path.



Definitions
 Weakly reliable: A testing technique is
  weakly reliable if it forces the execution of
  at least one error sensitive path.




Example: error detection
 Let us go over the example in Korel and
  Laski’s paper.
 It is a sorting program which uses the
  bubble sort algorithm.
 It sorts an array a[0:N] in descending order.
 There are two, nested, loops in the program.
 The inner loop from i6-i10 finds the largest
  element of a[R1:N].
Example: error detection
 The largest element is saved in R0 and R3
  points to the location of R0 in a.
 The outer loop swaps a[R1] with a[R3].
 The completion of one iteration of the outer
  loop ensures that the sub-array a[0:R1-1]
  has been sorted and that a[R1-1] is greater
  than or equal to any element of a[R1:N].

Example: error detection
 There is a missing re-initialization of R3 to
  R1 at the beginning of the inner loop.
 In some cases this will cause the program to
  fail.
       What are these cases?
 We will get back to this error later!


Class exercise
 Is the path testing strategy reliable for the
  sort program and for the missing
  initialization error in it?
 Is it viable?
 What about the branch testing strategy?
 What about loop testing?


Data flow graph
 It represents the flow of data in a program.
 The graph is constructed from the control
  flow graph (CFG) of the program.
 A statement that occurs within a node of the
  CFG might contain variable occurrences.
 Each variable occurrence is classified as a
  def or a use.

defs and uses
 A def represents the definition of a variable.
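  Here are some sample defs of variable x
  (the def in each statement is noted at the right):
     •   x=y*x;            (the x on the left-hand side)
     •   scanf(&x,&y);     (x receives an input value)
     •   int x;            (the declaration of x)
     •   x[i-1]=y*x;       (the x on the left-hand side)
 A use represents the use of a variable in a
  statement. Here are a few examples of uses of
  variable x (the use in each statement is noted at the right):
def-use-continued

     •   x=x+1;                              (the x on the right-hand side)
     •   printf("x is %d, y is %d", x,y);    (the argument x)
     •   cout << x << endl << y;             (the operand x)
     •   z=x[i+1];                           (the x on the right-hand side)
     •   if (x<y)...                         (the x in the condition)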
 Uses of a variable in output and assignments
  are classified as c-uses. Those in conditions
  are classified as p-uses.
def-use-continued
 c-use stands for computational use and
  p-use for predicate use.
 Both c- and p-uses affect the flow of
  control: p-uses directly as their values are
  used in evaluating conditions and c-uses
  indirectly as their values are used to
  compute other variables which in turn affect
  the outcome of condition evaluation.
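The classification can be practiced on a small function. The one below is hypothetical, written only for this illustration, and the comments give one reading of each occurrence according to the definitions above. Treating the returned value as a c-use is an assumption, by analogy with output.

```c
#include <assert.h>

/* Adds up a[0], a[1], ... and stops before the sum would exceed limit. */
int sum_until_limit(const int a[], int n, int limit)
{
    int i = 0;                      /* def of i */
    int sum = 0;                    /* def of sum */

    assert(n >= 0);                 /* p-use of n (a condition) */
    while (i < n) {                 /* p-use of i and n */
        if (sum + a[i] > limit)     /* p-use of sum, a, i, and limit */
            break;
        sum = sum + a[i];           /* c-use of sum, a, and i; def of sum */
        i = i + 1;                  /* c-use of i; def of i */
    }
    return sum;                     /* c-use of sum */
}
```

The c-use of a[i] in the assignment to sum affects control flow indirectly: sum is later used in the condition sum + a[i] > limit.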
def-use-continued
 A path from node i to node j is said to be
  def-clear w.r.t. a variable x if there is no def
  of x in the nodes strictly between node i and
  node j on the path. Nodes i and j themselves
  may have a def of x.
 A def-clear path from node i to edge (j,k) is
  one in which no node on the path has a def
  of x.
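The node-to-node form of the def-clear test can be sketched as follows. The representation is an assumption made for illustration: a path is an array of node numbers, and defs_x[n] is nonzero when node n contains a def of the variable x of interest. Only the interior nodes are inspected, since the two end nodes may define x.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 16

/* Returns 1 if path[0..len-1] is def-clear w.r.t. the variable whose
   defining nodes are flagged in defs_x[], 0 otherwise. */
int is_def_clear(const int path[], size_t len, const int defs_x[MAX_NODES])
{
    assert(len >= 2);               /* need a start node and an end node */
    for (size_t k = 1; k + 1 < len; k++) {
        assert(path[k] >= 0 && path[k] < MAX_NODES);
        if (defs_x[path[k]])
            return 0;               /* x is redefined on the way */
    }
    return 1;
}
```

In the exponentiation program z is defined in nodes 4, 6 and 8, so the node sequence 4, 5, 7, 9 is def-clear w.r.t. z, while 4, 5, 6, 5, 7, 9 is not.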
global-def
 A def of a variable x is considered global to
  its block if it is the last def of x within that
  block.
 A c-use of x in a block is considered a global
  c-use if there is no def of x preceding this
  c-use within this block.


def-use graph: definitions
 def(i): set of all variables for which there is
  a global def in node i.
 c-use(i): set of all variables that have a
  global c-use in node i.
 p-use(i,j): set of all variables for which
  there is a p-use for the edge (i,j).
 dcu(x,i): set of all nodes j such that x is in
  c-use(j) and there is a def-clear path w.r.t. x
  from node i to node j, where x is in def(i).
def-use graph: definitions
 dpu(x,i): set of all edges (j,k) such that x is in
  p-use(j,k) and there is a def-clear path w.r.t. x
  from node i to edge (j,k), where x is in def(i).
 The def-use graph of program P is
  constructed by associating def, c-use, and
  p-use sets with the nodes and edges of a flow graph.

    The next example is from Jalote’s text, pp. 425-428.
def-use graph-continued

  Sample program:
        1.   scanf (x,y); if (y<0)
        2.             pow=0-y;
        3.   else pow=y;
        4.   z=1.0;
        5.   while (pow !=0)
        6.             {z=z*x; pow=pow-1;}
        7.   if (y<0)
        8.             z=1.0/z;
        9.   printf(z);


def-use graph-continued
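  [def-use graph of the sample program, given as a table]

  Node   def        c-use        p-use on outgoing edges
  1      {x,y}      {}           (1,2): {y}      (1,3): {y}
  2      {pow}      {y}
  3      {pow}      {y}
  4      {z}        {}
  5      {}         {}           (5,6): {pow}    (5,7): {pow}
  6      {z,pow}    {z,x,pow}
  7      {}         {}           (7,8): {y}      (7,9): {y}
  8      {z}        {z}
  9      {}         {z}

  The remaining edges, (2,4), (3,4), (4,5), (6,5), and (8,9), are
  unlabeled; unlabeled edges imply an empty p-use set.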
def-use graph-class exercise
 Draw a def-use graph for the following program.
       0.    int set_z(x,y);
              {
             1.        int x,y;
             2.        if (x!=0)
             3.                  y=5;
             4.        else z=z-x;
             5.        if (z>1)
             6.                  z=z/x;
             7.        else
             8.                  z=y;
              }


def-use graph-continued
 Traverse the graph to determine dcu and dpu sets.
    (node, var)              dcu                      dpu
    (1,x)                    {6}                      {}
    (1,y)                    {2,3}                    {(1,2),(1,3),(7,8),(7,9)}
    (2,pow)                  {6}                      {(5,6),(5,7)}
    (3,pow)                  {6}                      {(5,6),(5,7)}
    (4,z)                    {6,8,9}                  {}
    (6,z)                    {6,8,9}                  {}
    (6,pow)                  {6}                      {(5,6),(5,7)}
    (8,z)                    {9}                      {}
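The dcu column can be reproduced by a traversal of the graph: starting from the successors of node i, follow edges, record every node that has a global c-use of the variable, and stop extending a path at any node that defines the variable. The sketch below encodes the def-use graph of the sample program by hand. The tables succ, def_set and cuse_set, and the function names, are assumptions made for this illustration.

```c
#include <assert.h>
#include <string.h>

#define N_NODES 10                 /* nodes 1..9; index 0 is unused */
enum { V_X, V_Y, V_POW, V_Z, N_VARS };

/* Successors of each node; 0 marks "no successor". */
static const int succ[N_NODES][2] = {
    {0,0}, {2,3}, {4,0}, {4,0}, {5,0}, {6,7}, {5,0}, {8,9}, {9,0}, {0,0}
};

/* def_set[n][v] / cuse_set[n][v]: variable v has a global def /
   a global c-use in node n. */
static const int def_set[N_NODES][N_VARS] = {
    [1] = {[V_X] = 1, [V_Y] = 1}, [2] = {[V_POW] = 1}, [3] = {[V_POW] = 1},
    [4] = {[V_Z] = 1}, [6] = {[V_POW] = 1, [V_Z] = 1}, [8] = {[V_Z] = 1}
};
static const int cuse_set[N_NODES][N_VARS] = {
    [2] = {[V_Y] = 1}, [3] = {[V_Y] = 1},
    [6] = {[V_X] = 1, [V_POW] = 1, [V_Z] = 1},
    [8] = {[V_Z] = 1}, [9] = {[V_Z] = 1}
};

static void visit(int n, int v, int seen[], int in_dcu[])
{
    if (n == 0 || seen[n])
        return;
    seen[n] = 1;
    if (cuse_set[n][v])
        in_dcu[n] = 1;             /* reached by a def-clear path */
    if (def_set[n][v])
        return;                    /* v is redefined: stop this path */
    visit(succ[n][0], v, seen, in_dcu);
    visit(succ[n][1], v, seen, in_dcu);
}

/* Sets in_dcu[j] = 1 for every node j in dcu(v, i), 0 elsewhere. */
void dcu(int v, int i, int in_dcu[N_NODES])
{
    int seen[N_NODES] = {0};
    assert(i >= 1 && i < N_NODES && def_set[i][v]);
    memset(in_dcu, 0, N_NODES * sizeof in_dcu[0]);
    visit(succ[i][0], v, seen, in_dcu);
    visit(succ[i][1], v, seen, in_dcu);
}
```

For example, dcu(V_Z, 4, r) marks nodes 6, 8 and 9, and dcu(V_Z, 6, r) marks 6, 8 and 9 as well, because node 6 is reached again through the loop edge (6,5).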
Test generation
 Class exercises:
   – For the above graph generate a test set that satisfies
       • the branch coverage criterion
       • the all-defs criterion: for each definition of each variable, at
         least one use (c- or p-use) must be exercised.
       • the all-uses criterion: all p-uses and all c-uses of all variable
         definitions must be covered.

  Develop the tests incrementally, i.e. by modifying
  the previous test set!

Data flow testing tool
 We will use SUDS, a data flow testing tool
  developed at Bellcore and available
  commercially from IBM.
 The acronym SUDS stands for Software
  Understanding and Debugging System.
 SUDS is a collection of tools of which
  ATAC is the one that measures control
  flow and data flow coverage.
ATAC processing: phase I

[Flow diagram]
  P, the program under test -> preprocess, compile, and instrument
      -> .atac files and an instrumented version of P (executable)
  Instrumented version of P + test set -> (upon execution)
      -> .trace file and program output
ATAC processing: phase II

[Flow diagram]
  .atac files + .trace file -> coverage analyzer
      -> control flow and data flow coverage values
ATAC demo
 Open DOS window.
 Go to /Program Files/bellcore/xSUDS/tutorial
 Type
     ataccl /Fedemo main.c wc.c
 Type
     xsuds *.atac
 You may now view program complexity statistics in the
  xSUDS window.


ATAC demo-continued
 Go back to the DOS window and type:
    demo -c input1
 Go to the xSUDS window and examine various coverage
  values.
 Go back to the DOS window and type:
    demo -c input2
 Go to the xSUDS window and examine how various
  coverage values have changed.


ATAC demo-continued
 Repeat the above steps of executing demo
  on several test inputs. Analyze coverage
  values and observe how they change with
  new test data.
 Other tools in SUDS will be discussed in
  the laboratory.


Mutation testing
 What is mutation testing?
  – Mutation testing is a code-based test assessment
    and improvement technique.
  – It relies on the competent programmer
    hypothesis which is the following assumption:
     Given a specification a programmer develops a
     program that is either correct or differs from the
     correct program by a combination of simple errors.

Mutation testing-continued
 The process of program development is
 considered iterative, whereby an initial
 version of the program is refined by
 making simple changes, or combinations of
 simple changes, towards the final version.




Mutation testing-definitions
 Given a program P, a mutant of P is obtained by making a
  simple change in P.
    Program                     Mutant
    int x,y;                    int x,y;
    if (x!=0)                   if (x!=0)
        y=5;                        y=5;
    else z=z-x;                 else z=z-x;
    if (z>1)                    if (z>1)
        z=z/x;                      z=z/zpush(x);
    else                        else
        z=y;                        z=y;
                                What is zpush?
Another mutant
    Program                     Mutant
    int x,y;                    int x,y;
    if (x!=0)                   if (x!=0)
        y=5;                        y=5;
    else z=z-x;                 else z=z-x;
    if (z>1)                    if (z<1)
        z=z/x;                      z=z/x;
    else                        else
        z=y;                        z=y;




Mutant
 A mutant M is considered distinguished by
  a test case $t \in T$ iff:
         • $P(t) \neq M(t)$
           where P(t) and M(t) denote, respectively, the
           observed behavior of P and M when executed on
           test input t.
 A mutant M is considered equivalent to P
  iff:
         • $P(t) = M(t)$ for every test input t in the input
           domain of P, not only for $t \in T$.
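As a minimal sketch, the program fragment and the mutant that changes z>1 to z<1 can be wrapped as C functions so that a single test input can be run against both. The function names p, m, and distinguishes are hypothetical. Passing x, y, z as parameters and treating the final value of z as the observed behavior are assumptions made for this illustration.

```c
/* Original fragment P. Inputs are x, y, z; the final z is the observed output.
   Callers should avoid inputs that reach the division with x == 0. */
int p(int x, int y, int z)
{
    if (x != 0) y = 5;
    else z = z - x;
    if (z > 1) z = z / x;
    else z = y;
    return z;
}

/* Mutant M: the relational operator in the second condition is replaced. */
int m(int x, int y, int z)
{
    if (x != 0) y = 5;
    else z = z - x;
    if (z < 1) z = z / x;
    else z = y;
    return z;
}

/* Returns 1 if the test input t = (x, y, z) distinguishes M from P, else 0. */
int distinguishes(int x, int y, int z)
{
    return p(x, y, z) != m(x, y, z);
}
```

With t = (x=2, y=0, z=4), P yields 2 and M yields 5, so t distinguishes M. With t = (2, 0, 1) both yield 5, so that test case leaves M live.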
Mutation score
 During testing a mutant is considered live if
  it has not been distinguished or proven
  equivalent.
 Suppose that a total of #M mutants are
  generated for program P.
 The mutation score of a test set T, designed
  to test P, is computed as:
    number of distinguished mutants/(#M-number of equivalent mutants)
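The score can be read as the fraction of nonequivalent mutants that T distinguishes, so a score of 1 means that no live mutants remain. A small sketch of that arithmetic follows. The function name mutation_score and the convention of returning -1.0 when every mutant is equivalent are assumptions of this example.

```c
/* Mutation score = distinguished / (total - equivalent).
   Returns -1.0 when no nonequivalent mutants exist, since the score
   is undefined in that case (a convention chosen for this sketch). */
double mutation_score(int distinguished, int total, int equivalent)
{
    int nonequivalent = total - equivalent;
    if (nonequivalent <= 0)
        return -1.0;
    return (double)distinguished / nonequivalent;
}
```

For example, with #M = 5 mutants of which 1 is equivalent, a test set that distinguishes 3 mutants scores 3/4 = 0.75; distinguishing all 4 nonequivalent mutants raises the score to 1.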

Test adequacy criterion
 A test set T is considered adequate w.r.t. the
  mutation criterion if its mutation score is 1.
 The number of mutants generated depends
  on P and the mutant operators applied on P.
 A mutant operator is a rule that when
  applied to the program under test generates
  zero or more mutants.

Mutant operators
 Consider the following program:
     int abs (int x)
     {
         if (x<0) x=0-x;   /* negate only negative values */
         return x;
     }




Mutation operator
 Consider the following rule:
     • Replace each relational operator in P by all possible
       relational operators excluding the one that is being
       replaced.
 Assuming the set of relational operators to
  be: {<, >, <=, >=, ==, !=}, the above
  mutant operator will generate a total of 5
  mutants of P, one for each possible replacement
  of the single relational operator in abs.
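The rule above can be sketched as a function that lists, for one relational operator, every operator that may replace it. The names RELOPS, NUM_RELOPS, and relop_replacements are hypothetical. A real mutation tool works on the parsed program rather than on operator strings.

```c
#include <string.h>

#define NUM_RELOPS 6

const char *RELOPS[NUM_RELOPS] = { "<", ">", "<=", ">=", "==", "!=" };

/* Fills out[] with every relational operator except op.
   out must have room for NUM_RELOPS - 1 entries.
   Returns the number of replacements, or -1 if op is not relational. */
int relop_replacements(const char *op, const char *out[])
{
    int i, n = 0, found = 0;

    for (i = 0; i < NUM_RELOPS; i++)
        if (strcmp(RELOPS[i], op) == 0)
            found = 1;
    if (!found)
        return -1;

    for (i = 0; i < NUM_RELOPS; i++)
        if (strcmp(RELOPS[i], op) != 0)
            out[n++] = RELOPS[i];
    return n;
}
```

Applied to the condition z>1 from the earlier slides, the five replacements give the mutants z<1, z<=1, z>=1, z==1, and z!=1. A program with k relational operators would yield 5k mutants under this rule.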

Mutation operators
 Mutation operators are language dependent.
 For Fortran a total of 22 operators were
  proposed.
 For C a total of 77 operators were proposed.
  None have been proposed for C++ though
  most of the operators for C are applicable to
  C++ programs.

Equivalent mutant
 Consider the following program P:
     int x,y,z;
     scanf("%d %d",&x,&y);
     if (x>0) {
        x=x+1; z=x*(y-1);
     } else {
        x=x-1; z=x*(y-1);
     }

 Here z is considered the output of P.

Equivalent mutant-continued
 Now suppose that a mutant of P is obtained
  by changing x=x+1 to x=abs(x)+1.
 This mutant is equivalent to P as no test
  case can distinguish it from P: the mutated
  statement executes only when x>0, and in
  that case abs(x) equals x.
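The following sketch compares P and this mutant over a small range of inputs. The function names p_out, m_out, and count_differences are hypothetical, and x and y are passed as parameters instead of being read with scanf. A zero count over a finite range only illustrates equivalence; it does not prove it.

```c
#include <stdlib.h>

/* P: z is the output, computed from the inputs x and y. */
int p_out(int x, int y)
{
    int z;
    if (x > 0) { x = x + 1; z = x * (y - 1); }
    else       { x = x - 1; z = x * (y - 1); }
    return z;
}

/* Mutant: x=x+1 is replaced by x=abs(x)+1. */
int m_out(int x, int y)
{
    int z;
    if (x > 0) { x = abs(x) + 1; z = x * (y - 1); }
    else       { x = x - 1;      z = x * (y - 1); }
    return z;
}

/* Counts the inputs (x, y), each in [-limit, limit], on which the outputs differ. */
int count_differences(int limit)
{
    int x, y, diffs = 0;
    for (x = -limit; x <= limit; x++)
        for (y = -limit; y <= limit; y++)
            if (p_out(x, y) != m_out(x, y))
                diffs++;
    return diffs;
}
```

In practice, deciding equivalence requires reasoning about the code, because no finite set of test runs can rule out a distinguishing input.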




Mutation testing procedure
 Given P and a test set T:
    1. Generate mutants
    2. Compile P and the mutants
    3. Execute P and the mutants on each test
           case.
    4. Determine equivalent mutants.
    5. Determine mutation score.
    6. If mutation score is not 1 then improve
            the test set and repeat from step 3.
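The steps above can be sketched on a tiny hypothetical program, p_max, which returns the larger of two integers. Its five mutants are written by hand here (steps 1 and 2) by replacing the relational operator. All names in the block are assumptions of this example, and the number of equivalent mutants (step 4) is supplied by the tester, since it cannot be computed automatically in general.

```c
/* Hypothetical program under test and its five relational-operator mutants. */
int p_max(int a, int b) { return (a >  b) ? a : b; }
int m_lt(int a, int b)  { return (a <  b) ? a : b; }
int m_ge(int a, int b)  { return (a >= b) ? a : b; }  /* equivalent to p_max */
int m_le(int a, int b)  { return (a <= b) ? a : b; }
int m_eq(int a, int b)  { return (a == b) ? a : b; }
int m_ne(int a, int b)  { return (a != b) ? a : b; }

typedef int (*prog_fn)(int, int);
typedef struct { int a, b; } test_case;

#define NUM_MUTANTS 5
const prog_fn MUTANTS[NUM_MUTANTS] = { m_lt, m_ge, m_le, m_eq, m_ne };

/* Step 3: run P and each mutant on every test case and count the mutants
   whose output differs from P's on at least one test case. */
int count_distinguished(const test_case *tests, int num_tests)
{
    int i, j, count = 0;
    for (i = 0; i < NUM_MUTANTS; i++) {
        for (j = 0; j < num_tests; j++) {
            if (MUTANTS[i](tests[j].a, tests[j].b) !=
                p_max(tests[j].a, tests[j].b)) {
                count++;
                break;  /* one distinguishing test case is enough */
            }
        }
    }
    return count;
}

/* Step 5: mutation score, given the number of equivalent mutants from step 4. */
double score(const test_case *tests, int num_tests, int num_equivalent)
{
    return (double)count_distinguished(tests, num_tests) /
           (NUM_MUTANTS - num_equivalent);
}
```

With T = {(1,2)}, three mutants are distinguished and the score is 3/(5-1) = 0.75, because m_eq is still live. Step 6 then calls for a new test case; adding (2,1) distinguishes m_eq and raises the score to 1.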
Mutation testing procedure-continued
 In practice the above procedure is
  implemented incrementally.
 One applies a few selected mutant
  operators to P and computes the mutation
  score w.r.t. the mutants generated.
 Once these mutants have been distinguished
  or proven equivalent, another set of mutant
  operators is applied.
Mutation testing procedure-continued
 This procedure is repeated until either all
  the mutants have been exhausted or some
  external condition forces testing to stop.
 We will not discuss the details of practical
  application of mutation testing.



Tools for mutation testing
 Mothra: for Fortran, developed at Purdue,
  1990
 Proteum: for C, developed at the University
  of São Paulo at São Carlos in Brazil.




Uses of Mutation testing
 Mutation testing is useful during integration
  testing to check for integration errors.
 Only the variables that are in the interfaces
  of the components being integrated are
  mutated. This reduces the complexity of
  mutation testing.


Summary
 Test adequacy criterion
 Test improvement
 Coverage principle
 Saturation effect
 Control flow criteria
 Data flow criteria
  – def, use, p-use, c-use, all-uses
Summary continued
 xSUDS, data flow testing tool.
 Mutation testing
  – mutant, distinguishing a mutant, live mutant,
    mutation score, competent programmer
    hypothesis.



