									                   Indus - Java Program Slicer
               Venkatesh Prasad Ranganath, Kansas State University
                          <rvprasad@cis.ksu.edu>



Table of Contents
     Background
           Program Slicing
           Dependences
           Java Program Slicer
     Design and Architecture
           Design Rationale
           Architecture
     Implementation details
           Bird's eye view
           The Details
     Closing Note
     Bibliography



Background
Program Slicing
     Program slicing is a well known (at least in the research arena) program analysis technique that can be
     used to find the program points [1] affected by a given program point and vice versa. Given a program
     point, the slicing algorithm identifies the program points that affect the given program point. The
     program in which the slice is identified is referred to as the substrate program, the given program
     point is referred to as the slice criterion, and the identified program points constitute a program
     slice w.r.t. the given criteria [2].

     A program slice constructed by identifying program points that affect a given program point is called a
     backward slice. A program slice composed of program points affected by a program point is called a
     forward slice. We use the term complete slice to indicate a program slice that is the union of the
     backward and forward slices w.r.t. the given criteria. We refer to this aspect of the slice as the type
     of the slice. Clearly, the program points identified as mentioned above may not constitute an
     executable program. Hence, we introduce the term executable slice to indicate that a slice is executable.

Dependences
     The concept of a program point affecting/affected by another program point is captured as dependences.
     Dependence can be thought of as a relation between two program points x and y that indicates if x
     depends on y. In a dependence relation between x and y where x depends on y, we refer to x as the
     dependent and y as the dependee [3].

     There are many notions of dependence. Data and control dependence are the most common and simple
     notions, and they can occur even in simple non-procedural sequential programs. In a simple setting,
     data dependence indicates whether the variable being read at a program point is influenced by another
     program point at which the same variable is being written. Similarly, control dependence indicates
     whether the flow of control to a program point depends on another program point.

     [1] A program point may be an expression or a statement in a program.
     [2] We shall use criteria and slice criteria interchangeably. Likewise, we shall use slice and program
         slice interchangeably.
     [3] We shall often refer to the dependent program point as the dependent and the dependee program
         point as the dependee.

     Given the notion of dependences, a program slice can be thought of as the transitive closure of the slice
     criteria based on dependence relation.
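
     The closure view can be made concrete with a small sketch. The seven-statement toy program, the
     statement labels s1 to s7, and the class and method names below are all hypothetical and are not
     part of Indus; the dependence edges are written out by hand rather than computed by an analysis.
     A backward slice follows edges from dependents to dependees, a forward slice follows the inverted
     edges, and a complete slice is the union of the two.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class SliceClosureSketch {

    // Toy substrate program (labels are hypothetical):
    //   s1: int x = input;
    //   s2: int y = 5;
    //   s3: int z = 0;
    //   s4: if (x > 0)
    //   s5:     z = y;
    //   s6: print(z);
    //   s7: print(x);

    /** Maps each dependent to its dependees (data and control dependences merged). */
    static Map<String, List<String>> dependees() {
        return Map.of(
            "s4", List.of("s1"),        // data dependence on x
            "s5", List.of("s2", "s4"),  // data dependence on y, control dependence on s4
            "s6", List.of("s3", "s5"),  // data dependence on z
            "s7", List.of("s1"));       // data dependence on x
    }

    /** Reverses every edge, giving a map from each dependee to its dependents. */
    static Map<String, List<String>> invert(Map<String, List<String>> edges) {
        Map<String, List<String>> inverted = new HashMap<>();
        edges.forEach((from, targets) -> targets.forEach(
            to -> inverted.computeIfAbsent(to, k -> new ArrayList<>()).add(from)));
        return inverted;
    }

    /** Transitive closure of the criteria over the given edges (worklist algorithm). */
    static Set<String> closure(Map<String, List<String>> edges, Collection<String> criteria) {
        Set<String> slice = new TreeSet<>();
        Deque<String> worklist = new ArrayDeque<>(criteria);
        while (!worklist.isEmpty()) {
            String point = worklist.pop();
            if (slice.add(point)) {  // visit each program point only once
                worklist.addAll(edges.getOrDefault(point, List.of()));
            }
        }
        return slice;
    }

    static Set<String> backwardSlice(Collection<String> criteria) {
        return closure(dependees(), criteria);
    }

    static Set<String> forwardSlice(Collection<String> criteria) {
        return closure(invert(dependees()), criteria);
    }

    static Set<String> completeSlice(Collection<String> criteria) {
        Set<String> slice = new TreeSet<>(backwardSlice(criteria));
        slice.addAll(forwardSlice(criteria));
        return slice;
    }

    public static void main(String[] args) {
        System.out.println("backward(s6) = " + backwardSlice(List.of("s6")));
        System.out.println("forward(s1)  = " + forwardSlice(List.of("s1")));
        System.out.println("complete(s5) = " + completeSlice(List.of("s5")));
    }
}
```

     In this toy program the backward slice w.r.t. s6 contains every statement except s7, since printing
     x has no influence on the value of z that is printed at s6.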

     Please refer to the user guide of StaticAnalyses module for more information about dependences.

Java Program Slicer
     Most of the literature about program slicing concentrates on its use for the purpose of program
     understanding. Lately there have been efforts to apply slicing in areas such as error detection [cite
     Snelting/Krinke] and model size reduction in model checking [CorbettICSE00]. The last application has
     been our driving force to design and implement a program slicer for Java in Java [4].

     Bandera is a tool kit that can be used to verify properties of a Java program via model checking.
     Given a property, various tools are used to extract a model of the program from the source and verify
     whether the model has the given property, hence verifying that the program also has the property.
     During the process of extracting a model, we apply program slicing to prune out the parts of the
     program that are not necessary to verify the given property.

     Our first implementation of a Java program slicer was unsuccessful for two reasons. One reason was
     that it was buggy. The second reason was that it was tightly coupled with Bandera. Both these reasons
     compelled us to design and implement a Java program slicer from scratch, hence the current product and
     the document you are reading!

     Please refer to [HatcliffSAS99] for more information about slicing for the purpose of model checking.


Design and Architecture
Design Rationale
     Anyone with a mind for software engineering will realize that dependence information and program
     slicing should not be tightly coupled, as the latter depends on the former while the former does not
     depend on the latter. Hence, we have separated the slicer from the dependence analyses. This means
     that the slicer module depends on "a" dependence module to provide the information it requires via a
     well defined "minimalistic" interface. Hence, the slicer can be composed with any implementation of
     the specified interface. Similarly, any application requiring dependence information can use the
     dependence analyses as is [5]. The net effect is that we were able to break down the previous single
     large chunk of the slicer into two smaller reusable modules.

     Our previous implementation of the slicer was monolithic, as it was specifically designed and
     implemented for Bandera. Hence, it was geared to generate executable backward slices. As mentioned
     before, this is just one type of slice that can be generated. Our experience indicates that various
     types of slices and various properties required of the slice can be combined in various ways, and not
     all combinations are valid. For example, it may be possible to add a minimal set of extra program
     points, such as return statements in procedures, along with minor alterations, to a backward slice to
     make it executable such that the behavior of the slice is identical to that of the substrate program
     up until the program points that are the slice criteria. However, the same is not true for forward
     slices, as the future state of the program depends on the previous states beyond the criteria, hence
     requiring a backward slice from the criteria, and this makes the slice a complete slice. This is the
     reason we decoupled generating a slice of a given type from ensuring any property, such as
     executability, required of the slice. This is reflected "as is" in the design by having one module
     generate the slice of a type while another module "massages" the generated slice so that it possesses
     the required property. We refer to the former module as the slicing engine, and the latter is
     considered part of the post processing phase.

     [4] In accordance with "Every good work of software starts by scratching a developer's personal itch",
         one of the lessons in "The Cathedral and the Bazaar" document.
     [5] Maybe with a thin interface adaptation layer.


     Figure 1. The design of the Slicer




     Most literature on slicing does not make the distinction between the identification of the slice and
     the representation of the slice, as it does not consider the end application. For those familiar with
     slicing this may seem rather too subtle and artificial, but it is not. By definition, a program slice
     is just some parts of the program picked by an algorithm that tracks dependences, and this process
     only concerns the identification of these parts and nothing more. The application that uses the slice
     decides on the representation of the slice. If the application is a visual program understanding tool,
     it may require the slice to be represented as tagged AST nodes. An application that validates program
     slicers will require the slice to be residualized as an XML document that can be compared with another
     XML document containing the expected slice. If slicing is used to remove unnecessary code, say
     logging, from the code base as a form of optimization, then the application will require the slice to
     be residualized into an executable form, say a class file in Java. This clearly indicates that the
     identification of the slice and the representation of the slice are two different activities, and we
     have used this distinction to further modularize our design by breaking down post processing into a
     slice post processing phase and a residualization phase.

     One major ramification of the above distinction is that it enables one to view program slicing as an
     analysis, contrary to the traditional view of it as a program transformation. This may enable other
     transformations such as specialization to be combined with program slicing.

     Figure 1 provides a graphical illustration of the various parts of the slicer along with their
     dependences based on the control flow between them. This modularization turns the various parts of the
     slicer into libraries, which leads to another benefit: customization. Given these library modules,
     users will be able to assemble a slicer customized to their needs without much hassle.
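
     The separation into identification, slice post processing, and residualization can be sketched as
     three small interfaces that are composed by the caller. This is only an illustration of the design
     idea under simplifying assumptions: every interface, class, and method name below is hypothetical,
     a slice is reduced to a set of statement labels, and none of this is the Indus API.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class SlicerPipelineSketch {

    /** Phase 1: identifies the program points that belong to the slice. */
    interface SlicingEngine {
        Set<String> identify(Set<String> criteria);
    }

    /** Phase 2: extends the identified slice so that it has a required property. */
    interface SlicePostProcessor {
        Set<String> process(Set<String> slice);
    }

    /** Phase 3: turns the slice into the representation the application wants. */
    interface Residualizer<R> {
        R residualize(Set<String> slice);
    }

    /** Assembles a customized slicer from independently chosen parts. */
    static <R> R run(SlicingEngine engine, List<SlicePostProcessor> postProcessors,
                     Residualizer<R> residualizer, Set<String> criteria) {
        Set<String> slice = engine.identify(criteria);
        for (SlicePostProcessor postProcessor : postProcessors) {
            slice = postProcessor.process(slice);
        }
        return residualizer.residualize(slice);
    }

    /** Toy engine: pretends every criterion depends on statement s1. */
    static SlicingEngine toyEngine() {
        return criteria -> {
            Set<String> slice = new TreeSet<>(criteria);
            slice.add("s1");
            return slice;
        };
    }

    /** Toy post processor: adds a return statement, as executability might require. */
    static SlicePostProcessor addReturn() {
        return slice -> {
            Set<String> extended = new TreeSet<>(slice);
            extended.add("return");
            return extended;
        };
    }

    /** Residualizes the slice as a small XML string. */
    static Residualizer<String> asXml() {
        return slice -> {
            StringBuilder xml = new StringBuilder("<slice>");
            for (String stmt : slice) {
                xml.append("<stmt>").append(stmt).append("</stmt>");
            }
            return xml.append("</slice>").toString();
        };
    }

    /** Residualizes the slice as a mere statement count. */
    static Residualizer<Integer> asSize() {
        return Set::size;
    }

    public static void main(String[] args) {
        Set<String> criteria = Set.of("s5");
        System.out.println(run(toyEngine(), List.of(addReturn()), asXml(), criteria));
        System.out.println(run(toyEngine(), List.of(), asSize(), criteria));
    }
}
```

     The same engine is reused in both calls; only the post processors and the residualizer change, which
     is the kind of customization the modular design is meant to allow.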

Architecture




The slicer is available as a single unit with many modules. Each module is assigned a particular
functionality. The classes of a module may solely provide the functionality of the module or collaboratively
provide the functionality along with other classes in the module. Each module will also provide a
well-defined interface if the functionality is aimed for extension by the user. Based on this design
principle, the following modules exist in the slicer. Figure 2 provides a UML style illustration of the
modules and the dependences between them.


slicer                             This module is responsible for the identification of the slice. Hence,
                                   it contains factory classes required to generate slice criteria,
                                   classes that contain the algorithm to identify the slice, and classes
                                   to collect the identified slice. In our implementation, we have
                                   chosen to identify the slice by annotating the AST nodes that are
                                   part of the slice. Note that, as mentioned earlier, this is a
                                   plausible representation technique as well.

slicer.processing                  This module contains various forms of post processing that can be
                                   performed on the identified slice in the slice post processing phase.
                                   For example, the functionality of making a slice executable is
                                   realized as a class with a well-defined interface. The user can
                                   implement this interface to hook in another post processing strategy.

slicer.transformations             This module contains classes that transform the program based on
                                   the identified slice. One may use other transformations that can be
                                   driven by the identified slice but were not intended to be driven by
                                   it. However, the basic intention was to capture the transformations
                                   that are specific to slicing in this module. Hence, the user will find
                                   classes that can be used to residualize a slice in this module.

tools.slicer                       This module contains classes that package all the relevant parts
                                   required for slicing as a "slicer" facade or tool that can be readily
                                   used by the end application. The facades adhere to the Indus Tool API
                                   for the sake of consistency and compositionality.

                                   Most first time users would want to start experimenting with the tool
                                   implementation available in this module and later use these classes
                                   as examples to assemble a dedicated "slicer".

toolkits                           This module contains adapter classes that adapt the facade/tool classes
                                   available in the tools.slicer module to be amenable to a tool kit,
                                   preferably via a tool API. For example, we adapt the facades for
                                   Bandera and plan to do the same for Eclipse.



Figure 2. UML-style dependence/relationship between various modules in the
slicer








Implementation details
     This implementation uses Jimple from the Soot toolkit as the intermediate representation. Soot is
     available from the Sable group at McGill University. Hence, the object system should be represented in
     Jimple to use this slicer. The reader should be comfortable with the basic concepts of Soot.

     Each of the modules listed in the section called "Architecture" is implemented as a Java package with
     the same name rooted in the package edu.ksu.cis.indus. Hence, the fully qualified Java package
     name of the module tools.slicer is edu.ksu.cis.indus.tools.slicer. However, we shall refer
     to the packages via their module-based names rather than their fully qualified Java names for the sake
     of simplicity.


Bird's eye view
     tools.slicer.SliceXMLizerCLI is a class that uses the tools.slicer.SlicerTool [6] to
     generate the slice. It residualizes the slice as an XML document and each class in the slice as a
     Jimple file and a class file. Following is a snippet of the main code in this class. We walk through
     the main control flow of this class below.


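        public static void main(final String[] args) {
          final SliceXMLizer _driver = .....
          _driver.initialize();
          _driver.execute();      // run the slicer
          _driver.writeXML();     // write the slice and the substrate program as XML
          _driver.residualize();  // write the slice as Jimple files and class files
        }

        protected final void execute() {
          slicer.setTagName(nameOfSliceTag);           // tag that marks members of the slice
          slicer.setSystem(scene);                     // Soot Scene that holds the system
          slicer.setRootMethods(rootMethods);          // entry point methods
          slicer.setCriteria(Collections.EMPTY_LIST);  // criteria are auto generated
          slicer.run(Phase.STARTING_PHASE, true);      // verify the configuration and execute
        }

     [6] We refer to tools.slicer.SlicerTool as SlicerTool.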



    The XMLizer is created and initialized in the first 2 lines of main. This is followed by the
    execution of the slicer, the writing of the slice and the substrate program as XML documents in
    writeXML, and the residualization of the slice as Jimple files and class files in residualize.
    These documents and class files are used as artifacts in the regression test framework used to
    test the slicer.

    As mentioned earlier, we use an annotation-based approach to identify the slice, relying on the
    inherent support in Soot for tagging AST nodes. Hence, setTagName provides the name of the tag that
    should be used to annotate the AST nodes of the substrate program that belong to the slice.

    Soot uses a Scene as an abstraction of the system that is being operated on. All the classes and
    their components can be accessed from the Scene via well defined interfaces. To use the slicer, the
    user loads the classes that form the system into a Scene and provides it to the slicer via
    setSystem.

    Given just the criteria, the slicer can include parts of the system that may not be relevant in a
    particular run. Although this information is useful in impact analysis, it is overly imprecise in
    most cases. Hence, the user should identify the set of methods in the system that should be
    considered as entry points while generating the slice. These entry point methods, or root methods
    (from the view of a call graph), are provided to the slicer via setRootMethods.

    The slice criteria are set via setCriteria. It may be surprising that the code passes an empty
    collection of criteria. As the slicer was designed and implemented as part of a larger model
    checking project, the SlicerTool has logic that can be switched on to auto generate the criteria
    that are crucial to detect deadlocks in the system. These criteria correspond exactly to
    enter_monitor and exit_monitor statements.

    As for toggling the switches, SlicerTool is based on the Indus Tool API, which has inherent
    support for configuration based on XML data via an SWT-based GUI. Hence, the tools.slicer
    package comes with a default configuration that is used if none is specified, and it controls the
    toggling of the various switches. This default configuration uses all possible dependences in
    their most precise mode to calculate an executable backward slice that preserves the deadlocking
    property of the system.

    The wheels start to roll with the call to run. Although the invoked method is part of the Indus
    Tool API, the simplified under-the-hood view is that the tool is asked to verify its current
    configuration and, if the configuration is valid, to execute. Please refer to the documentation of
    Indus for the details of the arguments.

    The slicer tool executes in 3 phases: starting/initial, dependence calculation, and slicing. If it
    seems that these phases depart from the phases mentioned earlier, it is because the tool is
    providing a facade. A user who just wants to customize the residualization process can extend
    SlicerTool to alter the post processing phase suitably and use the extended version. The classes
    from slicer.processing and slicer.transformations will be used in the post processing phase.
    Users who want more fine tuned customization are advised to put together a new facade along the
    lines of SlicerTool according to their needs.





          We will get into the guts of the slicer in the next section.
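
          The annotation-based approach described above can be pictured with a small stand-in. The Node
          class and its addTag/hasTag methods below are hypothetical simplifications written for this
          illustration; they only mimic the idea of tagging AST nodes and are not Soot's or Indus's
          classes. Identification marks nodes with a named tag and leaves the program untouched, and a
          later step can read the tag to produce whatever representation it needs.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class TagBasedSliceSketch {

    /** Hypothetical stand-in for an AST node that can carry named tags. */
    static final class Node {
        final String text;
        private final Set<String> tags = new HashSet<>();

        Node(String text) {
            this.text = text;
        }

        void addTag(String name) {
            tags.add(name);
        }

        boolean hasTag(String name) {
            return tags.contains(name);
        }
    }

    /** Identification: tag the nodes at the given positions; nothing is removed. */
    static void markSlice(List<Node> program, Set<Integer> sliceIndices, String tagName) {
        for (int index : sliceIndices) {
            program.get(index).addTag(tagName);
        }
    }

    /** A simple residualization: keep only the text of the nodes that carry the tag. */
    static List<String> taggedText(List<Node> program, String tagName) {
        List<String> kept = new ArrayList<>();
        for (Node node : program) {
            if (node.hasTag(tagName)) {
                kept.add(node.text);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Node> program = List.of(new Node("a = 1"), new Node("b = 2"), new Node("c = a"));
        // Two slices can coexist on the same program because each uses its own tag name.
        markSlice(program, Set.of(0, 2), "sliceA");
        markSlice(program, Set.of(1), "sliceB");
        System.out.println(taggedText(program, "sliceA"));
        System.out.println(taggedText(program, "sliceB"));
    }
}
```

          Because the tag name is a parameter, the substrate program stays intact and several slices can
          be recorded on it side by side, which is one reason a tag name has to be supplied to the tool.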


     Figure 3. SlicerTool Configuration GUI




     As the slicer adheres to the Indus Tool API, it comes with a built in configuration GUI (as illustrated
     in Figure 3) that can be used from inside the application using the slicer. The configuration logic
     comes with serialization and deserialization support as well.

The Details
     In this section we shall deal with the implementation of the SlicerTool. In particular, we shall only
     present the details of how the slicing engine is set up and driven to identify the slice, which is
     later massaged via post processing. The following snippet is the only sequence of method invocations
     required on the slicing engine to identify a slice.


   void execute() {
     ....
     engine.setTagName(tagName);                // tag used to mark the slice
     engine.setCgi(callGraph);                  // call graph built from the root methods
     engine.setSliceType(
       _slicerConfig.getProperty(
          SlicerConfiguration.SLICE_TYPE));     // type of the slice
     engine.setInitMapper(initMapper);          // allocation site to constructor call
     engine.setBasicBlockGraphManager(bbgMgr);  // cached basic block graphs
     engine.setAnalysesControllerAndDependenciesToUse(
       daController, _slicerConfig.getNamesOfDAsToUse());
     engine.setSliceCriteria(criteria);
     engine.initialize();
     engine.slice();                            // tags the AST nodes in the slice
     postProcessSlice();
   }



     The root methods/entry point methods set on the SlicerTool earlier are used to construct a call
     graph, which the slicing algorithm uses to deal with interprocedural control flow. Hence, the call
     graph, exposed via the ICallGraphInfo interface defined in the StaticAnalyses project, is provided
     to the engine via setCgi.

     The type of the slice is set via setSliceType. Note that this does not specify anything about any
     additional property required of the slice.

     The call to setInitMapper is more of a residue of the fact that the instantiation of an object in
     Java is represented as 2 statements in Jimple. This is in accordance with the byte code format,
     where the object is created by one instruction and later initialized by another instruction.
     Hence, the coupling between an allocation site and the constructor invocation site needs to be
     made explicit, and this is provided via an implementation of the
     edu.ksu.cis.indus.interfaces.INewExpr2InitMapper interface.

     As a matter of optimization, rather than creating the basic block graphs every time they are
     required, we cache the graphs via an edu.ksu.cis.indus.common.graph.BasicBlockGraphMgr class
     instance. This also makes it very easy to vary the control flow graph representation used to
     create the basic block graphs across all analyses and transformations being used. This manager
     instance is provided to the slicing engine via setBasicBlockGraphManager.

     The call to setAnalysesControllerAndDependenciesToUse provides the slicing engine with the
     analysis controller that was used to drive the various analyses, along with the IDs of the
     dependence analyses that should be considered while slicing. The analysis controller serves as a
     reference container for the dependence analyses.

     The slicing criteria are provided to the slicing engine via setSliceCriteria. Users who want to
     create slicing criteria on their own should use the slicer.SliceCriteriaFactory.

     The call to initialize requests the slicing engine to initialize itself. This step must succeed
     for the slicer to function, and it should succeed provided the supplied objects are in valid
     states.

     The call to slice makes the slicing engine identify the slice by annotating/tagging the AST nodes
     that belong to the slice with a tag of the name provided to it.
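
     The caching idea behind the basic block graph manager can be sketched with a small generic class.
     This is an assumption-laden illustration and not the BasicBlockGraphMgr implementation: the class
     name, the use of a builder function, and the build counter are all hypothetical. Graphs are built
     on first request and reused afterwards, and replacing the builder changes the graph representation
     for every client of the manager at once.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical cache keyed by method; M is the method type and G the graph type. */
final class GraphCacheSketch<M, G> {

    private final Function<M, G> builder;
    private final Map<M, G> cache = new HashMap<>();
    private int builds;

    /** The builder decides which control flow graph representation is used. */
    GraphCacheSketch(Function<M, G> builder) {
        this.builder = builder;
    }

    /** Returns the cached graph for the method, building it only on first request. */
    G getGraph(M method) {
        return cache.computeIfAbsent(method, m -> {
            builds++;
            return builder.apply(m);
        });
    }

    /** Number of graphs actually built so far. */
    int buildCount() {
        return builds;
    }

    public static void main(String[] args) {
        GraphCacheSketch<String, String> manager =
            new GraphCacheSketch<>(method -> "bbg(" + method + ")");
        manager.getGraph("main");
        manager.getGraph("main");   // served from the cache
        manager.getGraph("helper");
        System.out.println(manager.buildCount() + " graphs built for 3 requests");
    }
}
```

     Sharing one such manager between the dependence analyses, the slicing engine, and the post
     processors means each method's graph is built at most once per run.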

The call to postProcessSlice method in the SlicerTool combines various post processing
classes to massage the slice and the core of this method is given below.


   // Make the slice executable if the configuration asks for it.
   if (((Boolean) _slicerConfig.getProperty(
          SlicerConfiguration.EXECUTABLE_SLICE)).booleanValue()) {
       final ISlicePostProcessor _postProcessor =
         new ExecutableSlicePostProcessor();
       _postProcessor.process(_methods, bbgMgr, _collector);
   }

   // Pick the goto processor that matches the type of the slice.
   if (_sliceType.equals(SlicingEngine.FORWARD_SLICE)) {
     _gotoProcessor = new ForwardSliceGotoProcessor(_collector);
   } else if (_sliceType.equals(SlicingEngine.BACKWARD_SLICE)) {
     _gotoProcessor = new BackwardSliceGotoProcessor(_collector);
   } else if (_sliceType.equals(SlicingEngine.COMPLETE_SLICE)) {
     _gotoProcessor = new CompleteSliceGotoProcessor(_collector);
   }
   _gotoProcessor.process(_methods, bbgMgr);



          In the first block, the generated slice is massaged to make it executable, if required.

          In the second block, a goto processor is picked depending on the type of the slice. The
          purpose of this processing is to ensure that the control flow skeleton of the slice is
          identical to that of the substrate program, as unconditional jumps are not considered by the
          slicing algorithm for the reason that they do not alter the control flow during execution. The
          slice is processed through the selected goto processor to provide a possibly extended slice.

          For the interested reader, _collector is an object that is used by the slicing engine to do
          bookkeeping operations pertaining to the identification of the slice. In particular, it
          annotates the AST nodes that are part of the slice and maintains auxiliary information about
          the identified slice. Users need to be concerned with this class only if they plan to add to
          the post processing phase.
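
          To give a feel for what a post processing step does, here is a minimal sketch of one way a
          slice could be nudged towards executability by pulling in return statements, the example of
          extra program points mentioned in the design rationale. It is not the
          ExecutableSlicePostProcessor: the data model (method bodies as lists of strings, slice
          members as "method#index" keys) and all names are hypothetical, and a real post processor
          works on Jimple through the collector.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class ReturnInclusionSketch {

    /** Key that identifies the statement at the given position of a method. */
    static String id(String method, int index) {
        return method + "#" + index;
    }

    /**
     * For every method that already has a statement in the slice, adds the
     * method's return statements to the slice. Methods that are not touched
     * by the slice are left alone.
     */
    static void includeReturns(Map<String, List<String>> methods, Set<String> slice) {
        for (Map.Entry<String, List<String>> entry : methods.entrySet()) {
            String method = entry.getKey();
            List<String> body = entry.getValue();

            boolean touched = false;
            for (int i = 0; i < body.size(); i++) {
                touched |= slice.contains(id(method, i));
            }
            if (!touched) {
                continue;
            }
            for (int i = 0; i < body.size(); i++) {
                if (body.get(i).startsWith("return")) {
                    slice.add(id(method, i));
                }
            }
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> methods = Map.of(
            "f", List.of("x = 1", "y = x", "return y"),
            "g", List.of("return 0"));
        Set<String> slice = new TreeSet<>(Set.of(id("f", 0)));  // identified slice
        includeReturns(methods, slice);
        System.out.println(slice);
    }
}
```

          Only f gains its return statement here; g stays outside the slice because the identified
          slice never reached it.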


Closing Note
    The XMLizing classes used by this project and its parent and sibling projects use the xmlizing
    framework to drive the slicer. So, we urge you to peruse the source code of these classes before
    asking questions on the forum or the mailing list. We will be glad to answer any questions you may
    have regarding the usage, but it would probably be faster to mimic an existing working piece of code
    while starting to use a new tool.

    The reader is encouraged to use the modules as is or to extend them as required. In the process,
    users are urged to submit reports of any bugs uncovered, with suitable information about the
    triggering input and configuration.

    The interfaces of the modules are not fixed, as the development team has not foreseen all possible
    applications of and tweaks to the slicer. Hence, users are encouraged to raise change requests to the
    development team along with any feature requests they may have. However, please note that the
    development team may not be able to implement all requested features, in which case they will assist
    by providing any information or alterations to enable the requested features.

    Please refer to Indus [http://indus.projects.cis.ksu.edu] for more documentation, distribution, mailing
    list, forums, and links to other subprojects.

    We hope you have a pleasant experience using our product.



Bibliography
    [HatcliffSAS99] John Hatcliff. James C. Corbett. Matthew B. Dwyer. Stefan Sokolowski. Hongjun
        Zheng. "A Formal Study of Slicing for Multi-threaded Programs with JVM Concurrency
        Primitives". Proceedings of the 1999 International Symposium on Static Analysis (SAS'99). Sep
        1999.

    [CorbettICSE00] James C. Corbett. Matthew B. Dwyer. John Hatcliff. Shawn Laubach. Corina S.
        Pasareanu. Robby. Hongjun Zheng. "Bandera: Extracting Finite-state Models from Java source
        code". Proceedings of the 22nd International Conference on Software Engineering (ICSE'00).
        439-448. June 2000.



