					  Pointcut Rejuvenation: Recovering Pointcut Expressions in Evolving
                      Aspect-Oriented Software 1

                    Raffi Khatchadourian2                      Phil Greenwood, Awais Rashid
                    Ohio State University                          Lancaster University
                 khatchad@cse.ohio-state.edu               {greenwop,awais}@comp.lancs.ac.uk
                                                   Guoqing Xu
                                               Ohio State University
                                              xug@cse.ohio-state.edu

                                        Technical Report COMP-001-2008
                                                   August 2008
                                             Computing Department
                                                    InfoLab21
                                                   South Drive
                                              Lancaster University
                                            Lancaster LA1 4WA UK




   1
     This material is based upon work supported in part by European Commission grants IST-33710 (AMPLE) and IST-2-
004349 (AOSD-Europe).
   2
     This work was administered during this author’s visit to the Computing Department, Lancaster University, United King-
dom.
                                                    Abstract

   Pointcut fragility is a well-documented problem in Aspect-Oriented Programming. Changes to the advised
base-code can lead to join points incorrectly falling in or out of the scope of pointcut expressions. In this paper,
we present a semi-automated approach which limits the problems associated with fragile pointcuts by assisting the
developer to rejuvenate pointcuts as the base-code evolves. The approach is based on deriving intentional patterns
from the advised join points which can later be used to offer suggestions of new join points which should also be
advised. We evaluate the approach in two phases: (i) the accuracy of the patterns, using single versions of 23
AspectJ programs, and (ii) the accuracy of the suggestions based on these patterns, using multiple versions of 4 of
these programs. Not only do these results reveal the usefulness of such a tool, but they also provide insights into
the design of pointcuts.




Contents

List of Figures

List of Tables

1   Introduction

2   Motivating Example

3   Algorithm
    3.1 Assumptions
    3.2 Pointcut Rejuvenation
    3.3 Concern Graphs
         3.3.1 Specification
         3.3.2 Intention Patterns

4   Experimental Evaluation
    4.1 Implementation
    4.2 Experimental Evaluation
        4.2.1 Phase I: Correlation Analysis
        4.2.2 Phase II: Expression Recovery

5   Related Work
    5.1 Pointcut Fragility
    5.2 Aspects and Refactoring
    5.3 Automated Aspect-Oriented Software Development
    5.4 Concern Traceability

6   Conclusion and Future Work

Bibliography

List of Figures

 2.1   Hybrid automobile example.
 2.2   Speeding prevention aspect.
 2.3   A new fuel cell class.

 3.1   Algorithm formalism notation.
 3.2   Top-level rejuvenation algorithm.
 3.3   A subset of $CG^+_P$ computed from the motivating example.
 3.4   Rejuvenation approach meta-model.
 3.5   Extended concern graph formalism notation.
 3.6   Intention pattern creation algorithm.
 3.7   Vertex-based pattern path extraction algorithm.
 3.8   Arc-based pattern path extraction algorithm.
 3.9   Pattern attribute equations.

 4.1   Venn diagram depicting relationships between PCEs in subsequent software versions.
 4.2   Rejuvenation results: Receiver operating characteristic (ROC) plot.
 4.3   Receiver operating characteristic (ROC) curve.

List of Tables

 4.1   Phase I: Analysis experimental results.
 4.2   Phase II: Rejuvenation experimental subjects.
 4.3   Phase II: Rejuvenation experimental results.

Chapter 1

Introduction

Aspect-Oriented Programming (AOP) [19] has emerged to reduce the scattering and tangling of crosscutting
concern (CCC) implementations. This is achieved by specifying that certain behavior (advice) should be
composed at specific execution points (join points), which are quantified via pointcut expressions (PCEs). A PCE does
so by logically connecting various predicates over static and dynamic program characteristics like method/field
names and execution control-flow. An optimal PCE is one that truly conveys the essence of where a CCC applies
in the base-code so that not only are current join points selected but future ones as well.
   With the requirements of typical software tending to change over time, the constituent source code may undergo
many alterations such as refactoring, incorporation of new technologies, architectural restructuring, design
modifications, etc. The fragile pointcut problem [21] can manifest itself in such circumstances, with join points incorrectly
falling in or out of the scope of the PCEs. Designing a robust PCE is often considered a “dark-art” with multiple
design choices available to the developer. For example, if all method executions within a class named Foo need to
be advised, two different strategies could be employed: (i) a generic PCE could be specified that quantifies over all
methods (e.g., execution(* Foo.*(..))), or (ii) each method could be enumerated individually (e.g.,
execution(* Foo.methodA()) || execution(* Foo.methodB()) || ...). Deciding which strategy is best in order
to balance robustness, correctness, and precision is a non-trivial task. Moreover, it is difficult to determine, prior to
maintenance changes, whether or not a given PCE is, in fact, robust.
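
The trade-off between the two strategies can be illustrated with a small Java sketch. It does not use AspectJ's matcher; the two predicates merely emulate the generic and the enumerated PCE over method signatures written as strings, and the class name PceStrategySketch and the later-added method Foo.methodC() are hypothetical names introduced only for this illustration.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;

class PceStrategySketch {
    // Strategy (i): emulates execution(* Foo.*(..)) over string signatures.
    static final Predicate<String> GENERIC = sig -> sig.startsWith("Foo.");

    // Strategy (ii): emulates execution(* Foo.methodA()) || execution(* Foo.methodB()).
    static final Predicate<String> ENUMERATED =
        sig -> sig.equals("Foo.methodA()") || sig.equals("Foo.methodB()");

    // Returns the signatures that the given emulated PCE selects.
    static Set<String> select(Predicate<String> pce, Set<String> signatures) {
        Set<String> selected = new TreeSet<>();
        for (String sig : signatures) {
            if (pce.test(sig)) {
                selected.add(sig);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Foo.methodC() is a hypothetical method added by a later revision.
        Set<String> evolved = Set.of("Foo.methodA()", "Foo.methodB()", "Foo.methodC()");
        System.out.println("generic:    " + select(GENERIC, evolved));
        System.out.println("enumerated: " + select(ENUMERATED, evolved));
    }
}
```

In this sketch the generic predicate silently picks up the new method, while the enumerated one silently misses it; whether either outcome is correct depends on the developer's intention, which neither PCE records.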
   For a multitude of reasons, the true intentions of the programmer may not always be accurately captured in a
PCE. Firstly, PCE languages may not be sufficiently expressive to represent such intentions [22], these difficulties
being rooted in the inherent fragility of typical PCE languages [21]. Furthermore, PCEs in general, and
particularly in AspectJ [18], are difficult to write [36] and thus may require appropriate levels of savvy and familiarity
with the intricacies of the PCE language.
   Several approaches [25, 6, 29, 20, 30] aim to combat the fragile pointcut problem by proposing new pointcut
language constructs to improve their expressiveness. Other approaches attempt to limit the scope of where advice
may apply in the base-code through more clearly defined interfaces [1, 13], or by enforcing structural and/or
behavioral constraints upon advice [16, 12, 33]. However, each of these approaches tends to require some level
of anticipation of the future. Consequently, there may nevertheless exist situations where PCEs must be manually
updated and augmented in order to capture new join points as the software evolves. This process unfortunately
develops into a vicious cycle where these new PCEs may also inherit similar problems. Instead, we propose an
approach that provides semi-automated assistance in rejuvenating a PCE upon changes to the base-code to alleviate
the following problems:

Evolution. Detect when changes made to the base-code cause a PCE to be invalidated.
Anticipation. Determine the common characteristics shared by the join points in the scope of a PCE to anticipate
      future join points which may fall within the scope of that PCE.

Maintenance. Validate the accuracy of each PCE, ensuring that all join points are captured as intended.

  The key contributions of our approach are as follows:

Intention recognition. Our approach semi-automatically infers, based on specific patterns inherent to the join
      points selected in the original PCE, new join points that ought to be included in a revised version of the
      PCE. This helps developers maintain PCEs by analyzing patterns in the software’s underlying intentional
      structure.

Correlation analysis. To support our intention recognition algorithm, we applied the analysis phase of our
     proposal to PCEs contained within single versions of 23 AspectJ programs in order to study the quality of the
     patterns produced. We found that the derived patterns, on average, were able to cluster elements
     corresponding to join points that were solely contained within the analyzed PCE with low α (false positive) and β (false
     negative) error rates of 18% and 16%, respectively.

Expression recovery. To ensure the applicability and practicality of our approach, we implemented our
     inferencing algorithm as an Eclipse IDE^1 plugin. We empirically evaluated the tool’s performance and accuracy in
     assisting the recovery of PCEs within 4 multi-version AspectJ projects. We found that the tool successfully
     inferred 94% of the new join points that were intended to be applicable to an existing pointcut as the software
     evolved, with these suggestions appearing in the top 4th percentile of the results. These studies indicate that
     the approach is useful in alleviating the burden of recovering PCEs upon base-code modification.




  1
      http://www.eclipse.org


Chapter 2

Motivating Example

Figure 2.1 shows an example control system for a hybrid-powered vehicle which has two different power sources:
a diesel engine (line 20) and an electric motor (line 26)^1. Calls to DieselEngine.increase(Fuel) (line 22) or
ElectricMotor.increase(Current) (line 28) increase the vehicle’s speed (line 3), with the HybridAutomobile
notified to compute the new speed (lines 7–8, 13–14).
    Suppose now that certain highways exhibit a feature that notifies vehicles of the speed limit. As such, the
code portrayed in Figure 2.1 is augmented by an aspect SpeedingViolationPrevention (Figure 2.2) to prevent
speeding. It does so via around advice (lines 2–4). The PCE (line 3) specified to compose this advice selects
join points corresponding to the execution of DieselEngine.increase(Fuel) and ElectricMotor.increase(Current).
Class Energy (not shown) is an abstract superclass which both classes Fuel and Current (also not
shown) extend. The type pattern Energy+ is a wildcard that denotes object references of type Energy and its
subclasses.
    Further suppose that the base-code evolves to add a new fuel cell energy source. Consequently, a FuelCell
class (Figure 2.3) is created. Requests to increase power from the FuelCell require passing a numerical parameter
(e.g., double) to a method (line 4) representing the amount of energy the fuel cell is to generate. Intuitively, the
SpeedingViolationPrevention aspect should also apply to the execution of this method. However, the PCE
listed on line 3, Figure 2.2 fails to capture this new but semantically equivalent join point. Although the new
method’s signature is consistent with the other join points with only the parameter type differing, i.e., double is
a primitive type that could not hold references to type Energy or any of its sub-classes, this difference causes the
PCE not to match this method. It would be helpful to developers if such join points that may have been overlooked
when manually updating PCEs to reflect new changes in the base-code could be mechanically suggested. In the
following sections, we will continue to use the hybrid-powered vehicle example to demonstrate how our proposed
approach can be used to identify such new join points in a semi-automated fashion.
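
The mismatch can be reproduced outside AspectJ. The following is a minimal Java sketch, not AspectJ's actual matching logic: it approximates the pattern execution(void increase(Energy+)) with reflection, assuming that "Energy+" amounts to "a parameter type assignable to Energy". The classes mirror Figures 2.1 and 2.3 with their bodies elided, and PatternSketch, matchesIncreaseEnergyPlus, and matchingClasses are hypothetical names.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Stub types mirroring the example; bodies are elided for the sketch.
abstract class Energy {}
class Fuel extends Energy {}
class Current extends Energy {}

class DieselEngine { public void increase(Fuel fuel) {} }
class ElectricMotor { public void increase(Current current) {} }
class FuelCell { public void increase(double amount) {} }

class PatternSketch {
    // Approximates execution(void increase(Energy+)): a void method named
    // "increase" whose single parameter is Energy or one of its subtypes.
    static boolean matchesIncreaseEnergyPlus(Method m) {
        return m.getName().equals("increase")
            && m.getReturnType() == void.class
            && m.getParameterCount() == 1
            && Energy.class.isAssignableFrom(m.getParameterTypes()[0]);
    }

    // Returns the simple names of the classes that declare a matching method.
    static List<String> matchingClasses(Class<?>... classes) {
        List<String> matched = new ArrayList<>();
        for (Class<?> c : classes) {
            for (Method m : c.getDeclaredMethods()) {
                if (matchesIncreaseEnergyPlus(m)) {
                    matched.add(c.getSimpleName());
                }
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        // FuelCell.increase(double) is absent from the result because the
        // primitive type double is not assignable to Energy.
        System.out.println(matchingClasses(
            DieselEngine.class, ElectricMotor.class, FuelCell.class));
    }
}
```

Under these assumptions only DieselEngine and ElectricMotor are reported, which mirrors how the PCE misses the semantically equivalent FuelCell method.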




  1
      This example was inspired by one of the authors’ work at the Center for Automotive Research at the Ohio State University.


 1   package p;
 2   class HybridAutomobile {
 3     private double overallSpeed;
 4
 5     // Sets the new speed for changes in fuel.
 6     public void notifyChangeIn(Fuel fuel) {
 7         this.overallSpeed +=
 8            fuel.calculateDeltaInMPH(this);
 9         /* Update attached observers... */ }
10
11     // Sets the new speed for changes in electricity.
12     public void notifyChangeIn(Current current) {
13         this.overallSpeed +=
14            current.calculateDeltaInMPH(this);
15         /* Update attached observers... */ }
16
17     public double getOverallSpeed() {
18       return overallSpeed;}}
19
20   class DieselEngine {
21     private HybridAutomobile car;
22     public void increase(Fuel fuel) {
23       // ...
24       this.car.notifyChangeIn(fuel);}}
25
26   class ElectricMotor {
27     private HybridAutomobile car;
28     public void increase(Current current) {
29       // ...
30       this.car.notifyChangeIn(current);}}
31
32   class Dashboard {
33     private HybridAutomobile car;
34     public void update() {
35       // ...
36       this.display(car.getOverallSpeed()); }}




                                                         Figure 2.1. Hybrid automobile example.


 1   aspect SpeedingViolationPrevention {
 2     Object around() :
 3       execution(void increase(Energy+))
 4         { /* ... */ }}




                                                        Figure 2.2. Speeding prevention aspect.


 1   package p;
 2   class FuelCell {
 3     private HybridAutomobile car;
 4     public void increase(double amount) {
 5       // ...
 6       Current current =
 7         this.generateCurrent(amount);
 8       this.car.notifyChangeIn(current);}}




                                                               Figure 2.3. A new fuel cell class.


Chapter 3

Algorithm

In this section, we present our inferencing algorithm that serves as a mechanism for unveiling developers’ un-
derlying intent in capturing certain program elements in a PCE. The algorithm utilizes a concept similar to that
of a concern graph [28] extended with several elements found in the current Java language, e.g., annotations^1, and
adapted for use with AOP. Concern graphs have been previously used to discover, describe, and track concerns
in evolving source code [27]. Our goal, however, is to exhaustively exploit rich, structural relationships between
program elements corresponding to captured join points, extract general patterns related to this data (as inspired by
[7]), and finally apply these patterns to later versions of the software in order to accurately and effectively recover
PCEs.

3.1      Assumptions

   The algorithm presented in this section works under a key assumption that the initial PCE to be analyzed and
later rejuvenated is specified correctly. Specifically, we assume that advice is created in order to materialize the
implementation of a concern that crosscuts the underlying base-code; thus, the PCE bound to the advice should
quantify over the join points that correspond to this CCC. In fact, we treat the bound PCE as a mapping between
the concern realized by the advice and where the concern applies in the base-code. This assumption is crucial in
the ability of the algorithm to successfully infer new join points that should be incorporated in a revised version of
the PCE.
   For the sake of presentation, we further make several simplifying assumptions about the underlying source code
to be analyzed; we discuss in Chapter 4 how many of these are relaxed in our implementation.

       • We assume that inter-type declarations (static crosscutting) are not utilized by the analyzed aspects.
         Inter-type declarations allow aspects to introduce and modify facets of the base-code existing at compile time,
         e.g., member introduction, class hierarchy alteration, interface implementation injection, and exception
         softening.

       • Although it is possible for a PCE to capture join points associated within an advice body (possibly the one it
         is bound to), we adopt the perspective that aspects are indeed separate from the base-code; advice may only
         apply to join points associated with classes, interfaces, and other Java types.

       • We assume that we are able to statically identify all references to program entities contained within the
         base-code. This assumption could be invalidated through the use of reflection and custom class loaders.
   1
     During our empirical studies, we found augmenting concern graphs with program elements found in the current Java language to be
especially useful in uncovering the essence of join points underlying a certain PCE. The reason is that annotations, in particular,
are used, surprisingly, rather often as a mechanism to specify where a CCC should apply. We discuss these details further in Chapter 4.


 $\omega$           a join point shadow; code corresponding to a join point
 $A$                a piece of advice
 $A_{pce}$          a pointcut bound to advice $A$; a set of join point shadows
 $A'_{pce}$         a subsequent revision of $A_{pce}$
 $P$                the original program, the underlying base-code
 $P'$               a subsequent revision of program $P$
 $\Omega_P$         the set of join point shadows contained within program $P$
 $CG^+_P$           a finite graph representing structural relationships between program elements in $P$
 $\pi$              an acyclic path (sequence of arcs) in $CG^+_P$; an intention
 $\Pi_P$            a set of acyclic paths in $CG^+_P$
 $\hat{\pi}$        an intention pattern derived from $\pi$, possibly containing wild-cards
 $W(\hat{\pi})$     a multiset projection of wild-card elements in $\hat{\pi}$
 $\hat{\Pi}_P$      a set of intention patterns derived from $CG^+_P$


                                              Figure 3.1. Algorithm formalism notation.



       • Furthermore, we assume that the original source code successfully compiles under an AspectJ Development
         Tools^2 (AJDT) 1.6 compiler.

   Recall that a join point refers to a well-defined point in the execution of the base-code; thus, the definition of a
join point is dynamic in nature. A join point shadow, on the other hand, refers to base-code corresponding to a join
point [35], i.e., a point in the program text where the compiler may actually perform the weaving [24]. Whether or
not the base-code is actually advised at that point is dependent on (i) advice being applicable at that point, and (ii)
possible dynamic conditions being met. As such, we consider a given program P as containing a set of join point
shadows ΩP (see Figure 3.1) that may or may not be currently under the influence of advice^3. Moreover, a given
piece of advice A specifies its applicability to a base program P via its bound PCE Apce , which selects a subset of
shadows contained within P, i.e., A applies to P at ΩP ∩ Apce . Therefore, each ω ∈ ΩP ∩ Apce specifies where
A should apply to P but does not specify when. That is, we assume that no dynamic conditions are associated
with Apce ; thus, we utilize solely static information in our analysis. Chapter 4 discusses how our implementation
conservatively relaxes this assumption so that PCEs utilizing dynamic conditions may nevertheless be used as
input to our tool.
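
The static view of applicability described above reduces to a set intersection. As a minimal sketch, assuming join point shadows are represented simply as strings (a simplification made only for illustration), the shadows where advice A applies can be computed as follows; ShadowSelectionSketch and advisedShadows are hypothetical names.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class ShadowSelectionSketch {
    // Computes the intersection of the program's shadows (Omega_P) with the
    // shadows selected by the pointcut (A_pce): where the advice applies.
    static Set<String> advisedShadows(Set<String> programShadows, Set<String> pointcut) {
        Set<String> advised = new LinkedHashSet<>(programShadows);
        advised.retainAll(pointcut);
        return advised;
    }

    public static void main(String[] args) {
        // Shadows are written as plain strings; this encoding is an assumption.
        Set<String> omegaP = new LinkedHashSet<>(List.of(
            "execution(DieselEngine.increase(Fuel))",
            "execution(ElectricMotor.increase(Current))",
            "execution(Dashboard.update())"));
        Set<String> aPce = new LinkedHashSet<>(List.of(
            "execution(DieselEngine.increase(Fuel))",
            "execution(ElectricMotor.increase(Current))"));
        System.out.println(advisedShadows(omegaP, aPce));
    }
}
```

The result says where the advice applies but nothing about when, since no dynamic conditions are modeled.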
   Lastly, we assume that we can accurately resolve the declaration of a particular piece of advice across varying
versions of the software. This assumption is important to our analysis since, in AO languages like AspectJ, advice
is considered to be anonymous, which may make it difficult to track it in subsequent versions. Chapter 6 discusses
future plans on how our tool can be altered to enforce the validity of this assumption.

3.2       Pointcut Rejuvenation

   Figure 3.2 depicts the Rejuvenate function, the top-level pointcut rejuvenation algorithm that drives the
approach. Input to the function is a pointcut $A_{pce}$ from the original program $P$ to be recovered as a result of the new
version of the program, $P'$. Conceptually, applying this function to the example given in Chapter 2, we would take
$A_{pce}$ to be the PCE declared on line 3, Figure 2.2, $P$ to be the sequence of classes depicted in Figure 2.1, and
$P'$ to be $P$ concatenated with the FuelCell class found in Figure 2.3. Parameter $d$, which will be discussed in
more detail later, represents the maximum analysis depth, which serves to restrict the depth of structural program
relationships analyzed, thus limiting the length (in number of arcs) of the patterns produced by the algorithm.
   2
       http://www.eclipse.org/ajdt
   3
       This definition differs slightly from those given in the literature.


function Rejuvenate($A_{pce}$, $P$, $P'$, $d$)
  1: $CG^+_P \leftarrow$ BuildGraph($P$)  /* Construct the extended concern graph for the original program */
  2: $\hat{\Pi}_P \leftarrow$ CreatePatterns($A_{pce}$, $CG^+_P$, $d$)  /* Derive intention patterns relevant to the PCE from the graph of the original program */
  3: $CG^+_{P'} \leftarrow$ BuildGraph($P'$)  /* Construct the extended concern graph for the revised program */
  4: $\hat{\Pi}_{P'} \leftarrow$ CreatePatterns($\Omega_{P'}$, $CG^+_{P'}$, $d$)  /* Derive all possible intention patterns from the graph of the revised program */
  5: $\hat{\Pi}_{P \cap P'} \leftarrow \hat{\Pi}_P \cap \hat{\Pi}_{P'}$  /* Intersect the patterns derived from the old version with the ones from the new version */
  6: $S \leftarrow$ MakeSuggestions($\hat{\Pi}_{P \cap P'}$, $CG^+_{P'}$)  /* Create a set of (suggestion, confidence) pairs */
  7: $A'_{pce} \leftarrow \emptyset$  /* The rejuvenated PCE to be returned */
  8: for all $(\omega, c) \in$ Sort($S$) do  /* For all (suggestion, confidence) pairs by descending confidence */
  9:   Suggest($\omega$, $c$)
 10:   if Selected($\omega$) then
 11:      $A'_{pce} \leftarrow A'_{pce} \cup \{\omega\}$
 12:   end if
 13: end for
 14: return $A'_{pce}$


                                  Figure 3.2. Top-level rejuvenation algorithm.



Line 1 constructs an adaptation of a concern graph $CG^+_P$ using program elements from the original program $P$.
In the next section, we specify the graph more precisely; Chapter 4 discusses how the graph was generated in our
prototype implementation.
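
The control flow of Figure 3.2 can be sketched in Java. This is an illustration under simplifying assumptions, not the tool's implementation: patterns and shadows are plain strings, the pattern sets produced by lines 1 to 4 are passed in as arguments, and MakeSuggestions and the developer's Selected decision are supplied as callbacks. RejuvenateSketch, Suggestion, and all parameter names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Predicate;

class RejuvenateSketch {
    // A suggested join point shadow paired with its confidence.
    static final class Suggestion {
        final String shadow;
        final double confidence;

        Suggestion(String shadow, double confidence) {
            this.shadow = shadow;
            this.confidence = confidence;
        }
    }

    // Mirrors lines 5-14 of the Rejuvenate function; lines 1-4 (graph building
    // and pattern creation) are assumed to have produced the two pattern sets.
    static Set<String> rejuvenate(
            Set<String> oldPatterns,
            Set<String> newPatterns,
            Function<Set<String>, List<Suggestion>> makeSuggestions,
            Predicate<String> selected) {
        // Line 5: keep only patterns present in both versions.
        Set<String> common = new HashSet<>(oldPatterns);
        common.retainAll(newPatterns);

        // Line 6: derive (suggestion, confidence) pairs from the common patterns.
        List<Suggestion> suggestions = new ArrayList<>(makeSuggestions.apply(common));

        // Line 8: present suggestions by descending confidence.
        suggestions.sort((a, b) -> Double.compare(b.confidence, a.confidence));

        // Lines 7 and 9-13: collect the shadows the developer accepts.
        Set<String> rejuvenated = new LinkedHashSet<>();
        for (Suggestion s : suggestions) {
            if (selected.test(s.shadow)) {
                rejuvenated.add(s.shadow);
            }
        }
        return rejuvenated; // Line 14
    }
}
```

Passing the developer's decision as a predicate keeps the sketch semi-automated in the same sense as the algorithm: suggestions are ranked mechanically, but only accepted shadows enter the rejuvenated PCE.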

3.3   Concern Graphs

   A representation similar to that of a concern graph is adapted and extended in our approach to help uncover
the essence of shadows captured by a particular PCE. Each finite, acyclic path in the directed graph represents a
developer’s intention of where a CCC may apply in the source code. Informally, a developer intention is an aim
or goal the developer has in mind when creating and/or maintaining programming elements to realize a particular
requirement [11]. For instance, consider the code of the hypothetical hybrid automobile example given in Chapter
2. Here, the developer has written a piece of advice (Figure 2.2, lines 2–4) that is intended to “advise the executions
of methods that are responsible for contributing to overall speed of the vehicle in order to bypass them under
certain conditions” (I1). To carry out this intention, the developer identifies the methods across the software that are responsible for this behavior, and writes
a suitable PCE (e.g., Figure 2.2, line 3) that captures the executions of these methods. Of course, exactly how
this intention is encoded may not be unique and is largely dependent on the expressiveness of the available PCE
language, as well as the developer’s expertise with that language.
   Relating this to our example, the intention I1 has been encoded using the PCE
execution(void increase(Energy+)). We can rephrase this intention in another way, e.g., “to advise the executions of methods which
possibly call methods that possibly write (or set) the field HybridAutomobile.overallSpeed” (I2). Although both
intentions I1 and I2 share a common goal in advising the same set of method executions, the latter expresses the
intention using various low-level program elements and structural characteristics existing at compile time. Thus,
we can represent I2 diagrammatically as portrayed in Figure 3.3, which depicts a subset of an extended concern
graph computed from the motivating example given in Chapter 2. Here, I2 is encoded in terms of elements of
the graph manifested as two finite, acyclic paths increase(Fuel) ⇝ overallSpeed and increase(Current) ⇝
overallSpeed. The highlighted vertices denote executions of the represented methods as being “selected” (or
enabled) by the encoding.
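
To make the notion of a path in the graph concrete, the following Java sketch rebuilds the arcs of the example subgraph and enumerates the acyclic paths that start at a given vertex, bounded by a maximum number of arcs (playing the role of the depth parameter d). It is a plain depth-first enumeration written for illustration, not the pattern extraction algorithms of this report; IntentionPathSketch, figure33, and pathsFrom are hypothetical names, and vertices are identified by strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class IntentionPathSketch {
    // Adjacency list: source vertex -> outgoing arcs stored as {relation, target}.
    private final Map<String, List<String[]>> arcs = new LinkedHashMap<>();

    void addArc(String source, String relation, String target) {
        arcs.computeIfAbsent(source, k -> new ArrayList<>())
            .add(new String[] {relation, target});
    }

    // Collects every acyclic path starting at 'start' with at most maxDepth arcs.
    List<String> pathsFrom(String start, int maxDepth) {
        List<String> result = new ArrayList<>();
        List<String> visited = new ArrayList<>();
        visited.add(start);
        extend(start, start, visited, maxDepth, result);
        return result;
    }

    private void extend(String vertex, String pathSoFar, List<String> visited,
                        int remaining, List<String> result) {
        if (remaining == 0 || !arcs.containsKey(vertex)) {
            return;
        }
        for (String[] arc : arcs.get(vertex)) {
            String target = arc[1];
            if (visited.contains(target)) {
                continue; // skip vertices already on the path to stay acyclic
            }
            String path = pathSoFar + " -" + arc[0] + "-> " + target;
            result.add(path);
            visited.add(target);
            extend(target, path, visited, remaining - 1, result);
            visited.remove(visited.size() - 1); // backtrack
        }
    }

    // The arcs of the example subgraph from the motivating example.
    static IntentionPathSketch figure33() {
        IntentionPathSketch g = new IntentionPathSketch();
        g.addArc("p", "contains", "DieselEngine");
        g.addArc("p", "contains", "ElectricMotor");
        g.addArc("DieselEngine", "declares_method", "increase(Fuel)");
        g.addArc("ElectricMotor", "declares_method", "increase(Current)");
        g.addArc("increase(Fuel)", "calls_method", "notifyChangeIn(Fuel)");
        g.addArc("increase(Current)", "calls_method", "notifyChangeIn(Current)");
        g.addArc("notifyChangeIn(Fuel)", "sets_field", "overallSpeed");
        g.addArc("notifyChangeIn(Current)", "sets_field", "overallSpeed");
        return g;
    }

    public static void main(String[] args) {
        for (String path : figure33().pathsFrom("increase(Fuel)", 2)) {
            System.out.println(path);
        }
    }
}
```

With a depth bound of 2, the enumeration from increase(Fuel) reaches overallSpeed through notifyChangeIn(Fuel); with a bound of 1 it stops at the call, which shows how the depth parameter limits the length of the paths considered.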

   p --contains--> DieselEngine --declares_method--> increase(Fuel) --calls_method--> notifyChangeIn(Fuel) --sets_field--> overallSpeed
   p --contains--> ElectricMotor --declares_method--> increase(Current) --calls_method--> notifyChangeIn(Current) --sets_field--> overallSpeed


                       Figure 3.3. A subset of $CG^+_P$ computed from the motivating example.



   The meta-model depicting the topology of entities and their corresponding relationships with regard to our
rejuvenation approach is portrayed in Figure 3.4. We invite the reader to refer to this diagram periodically
as the algorithm discussion progresses.

3.3.1        Specification

   We now specify the extended concern graph more formally utilizing the notation in Figure 3.5.

Definition 1 An extended concern graph $CG^{+}_{P}$ constructed from program P is a labeled multidigraph consisting
of a 4-tuple $CG^{+}_{P} = (V, A, R, \ell)$ where

       • $V = \varphi_P \cup \mu_P \cup \psi_P \cup \varepsilon_P \cup \iota_P \cup \gamma_P \cup \upsilon_P$ is a set of vertices representing the
         declarations of program elements contained in P (footnote 4), e.g., packages, classes, interfaces,
         enumeration types, annotations, methods, fields,

       • $A = \{(u, v) \mid u, v \in V \land \exists\, r \in R\,[r(u, v)]\}$ is a multiset of arcs connecting vertices in V,

       • R (see definition 2) is a set of binary relations depicting structural relationships existing amongst program
         elements in P at compile time,

       • $\ell : A \to R$ (see definition 3) is a labeling function labeling arcs with the relationships in R that the
         elements represented by their source and target vertices, respectively, satisfy.
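
   As an illustration of Definition 1, the following is a minimal sketch of a labeled multidigraph populated with
the subset shown in Figure 3.3. The class names (Vertex, ConcernGraph), the string encoding of relation labels, and
the restriction to five relations are assumptions made for this example only; they are not part of the formalism.

```python
from dataclasses import dataclass, field

# Relation names follow Definition 2; only those used in Figure 3.3 (plus GetsField) are listed.
RELATIONS = {"ContainsType", "DeclaresMethod", "CallsMethod", "SetsField", "GetsField"}


@dataclass(frozen=True)
class Vertex:
    kind: str  # e.g. "package", "class", "method", "field"
    name: str


@dataclass
class ConcernGraph:
    vertices: set = field(default_factory=set)
    # Arcs are (source, target, label) triples; a list permits parallel arcs (a multigraph).
    arcs: list = field(default_factory=list)

    def add_arc(self, u, v, label):
        if label not in RELATIONS:
            raise ValueError(f"unknown relation: {label}")
        self.vertices.update((u, v))
        self.arcs.append((u, v, label))

    def successors(self, u):
        """Return (target, label) pairs for every arc leaving u."""
        return [(v, label) for (s, v, label) in self.arcs if s == u]


# Vertices of the Figure 3.3 subset.
p = Vertex("package", "p")
diesel = Vertex("class", "DieselEngine")
electric = Vertex("class", "ElectricMotor")
inc_fuel = Vertex("method", "increase(Fuel)")
inc_current = Vertex("method", "increase(Current)")
notify_fuel = Vertex("method", "notifyChangeIn(Fuel)")
notify_current = Vertex("method", "notifyChangeIn(Current)")
overall_speed = Vertex("field", "overallSpeed")

g = ConcernGraph()
g.add_arc(p, diesel, "ContainsType")
g.add_arc(p, electric, "ContainsType")
g.add_arc(diesel, inc_fuel, "DeclaresMethod")
g.add_arc(electric, inc_current, "DeclaresMethod")
g.add_arc(inc_fuel, notify_fuel, "CallsMethod")
g.add_arc(inc_current, notify_current, "CallsMethod")
g.add_arc(notify_fuel, overall_speed, "SetsField")
g.add_arc(notify_current, overall_speed, "SetsField")
```

   Storing arcs as a list rather than a set is one simple way to keep A a multiset, since two program elements may
be related by more than one relation.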

   The set of vertices V of $CG^{+}_{P}$ represents declarations of various program elements, e.g., packages, classes,
interfaces, enumeration types, annotations, methods, fields (footnote 5). The set of arcs A connects vertices in V
depending on the truth value of relations found in the set R when applied to the source and target vertices,
respectively. The relations in R, details of which are specified more formally in definition 2, represent various
structural relationships, e.g., method calls (footnote 6), field accesses. Many such kinds of relationships may exist;
however, for simplicity, we mainly focus on several popular relationship types as previously utilized in the
literature [7, 5, 28, 27]. Arcs are then labeled with the satisfied relations via the labeling function $\ell$, which is
specified more formally in definition 3. Notice that, in this example, the analysis performed utilizes solely static
information; in particular, class hierarchy analysis (CHA) [8] is used to identify method call relationships.
Chapter 6 touches upon future work which may potentially result in a more accurate estimate of the truth values
associated with such relationships.

   4 The program parameter is implicit in this definition in order to simplify the presentation.
   5 We do not consider local variables and other parameters in our analysis as crosscutting concerns tend to
     crosscut a larger granularity of programming elements.
   6 For simplicity of presentation, our formalism groups class instance creations, i.e., constructor calls, with
     method calls.

   [Figure 3.4. Rejuvenation approach meta-model. Entities shown: Advice, Pointcut Expression, Join Point Shadow
    (JPS), Method Call JPS, Field Set JPS, Field Get JPS, Relationship JPS, Method Execution JPS, Program Element,
    Program Element Relationship, Method, Package, Type, Field, Concern Graph, Arc, Vertex, Intention Path,
    Intention, Intention Pattern, Vertex Wildcard, Arc Wildcard. Relationship labels shown: binds, captures,
    associated with, contains, represents, has, source, destination, <<matches>>.]




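 $\varphi_P = \{f \mid f \text{ is a field declared in } P\}$
 $\mu_P = \{m \mid m \text{ is a method declared in } P\}$
 $\psi_P = \{c \mid c \text{ is a class declared in } P\}$
 $\varepsilon_P = \{e \mid e \text{ is an enumeration type declared in } P\}$
 $\iota_P = \{i \mid i \text{ is an interface declared in } P\}$
 $\upsilon_P = \{a \mid a \text{ is an annotation type declared in } P\}$
 $\gamma_P = \{p \mid p \text{ is a package declared in } P\}$

                           Figure 3.5. Extended concern graph formalism notation.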


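Definition 2 A set R of binary predicates, i.e., relations, over program elements is the set

\[
\begin{aligned}
R = \{\; & GetsField : \mu_P \times \varphi_P \to \mathbb{B}, \\
& SetsField : \mu_P \times \varphi_P \to \mathbb{B}, \\
& CallsMethod : \mu_P \times \mu_P \to \mathbb{B}, \\
& OverridesMethod : \mu_P \times \mu_P \to \mathbb{B}, \\
& ImplementsMethod : \mu_P \times \mu_P \to \mathbb{B}, \\
& DeclaresMethod : (\psi_P \cup \varepsilon_P \cup \iota_P) \times \mu_P \to \mathbb{B}, \\
& DeclaresField : (\psi_P \cup \varepsilon_P \cup \iota_P) \times \varphi_P \to \mathbb{B}, \\
& DeclaresType : (\psi_P \cup \iota_P \cup \upsilon_P) \times (\psi_P \cup \varepsilon_P \cup \iota_P \cup \upsilon_P) \to \mathbb{B}, \\
& ExtendsClass : \psi_P \times \psi_P \to \mathbb{B}, \\
& ExtendsInterface : \iota_P \times \iota_P \to \mathbb{B}, \\
& ImplementsInterface : (\psi_P \cup \varepsilon_P) \times (\iota_P \cup \upsilon_P) \to \mathbb{B}, \\
& ContainsType : \gamma_P \times (\psi_P \cup \varepsilon_P \cup \iota_P \cup \upsilon_P) \to \mathbb{B}, \\
& Annotates : \upsilon_P \times (\gamma_P \cup \upsilon_P \cup \iota_P \cup \varepsilon_P \cup \psi_P \cup \mu_P \cup \varphi_P) \to \mathbb{B} \;\}
\end{aligned}
\]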

Definition 3 A labeling function $\ell : A \to R$ labels arcs with the predicates in R that their constituent
vertices satisfy, such that

\[
\ell(u, v) =
\begin{cases}
GetsField & \iff u \in \mu_P \land v \in \varphi_P \land GetsField(u, v) \\
SetsField & \iff u \in \mu_P \land v \in \varphi_P \land SetsField(u, v) \\
CallsMethod & \iff u \in \mu_P \land v \in \mu_P \land CallsMethod(u, v) \\
OverridesMethod & \iff u \in \mu_P \land v \in \mu_P \land OverridesMethod(u, v) \\
ImplementsMethod & \iff u \in \mu_P \land v \in \mu_P \land ImplementsMethod(u, v) \\
DeclaresMethod & \iff u \in \psi_P \cup \varepsilon_P \cup \iota_P \land v \in \mu_P \land DeclaresMethod(u, v) \\
DeclaresField & \iff u \in \psi_P \cup \varepsilon_P \cup \iota_P \land v \in \varphi_P \land DeclaresField(u, v) \\
DeclaresType & \iff u \in \psi_P \cup \iota_P \cup \upsilon_P \land v \in \psi_P \cup \varepsilon_P \cup \iota_P \cup \upsilon_P \land DeclaresType(u, v) \\
ExtendsClass & \iff u \in \psi_P \land v \in \psi_P \land ExtendsClass(u, v) \\
ExtendsInterface & \iff u \in \iota_P \land v \in \iota_P \land ExtendsInterface(u, v) \\
ImplementsInterface & \iff u \in \psi_P \cup \varepsilon_P \land v \in \iota_P \cup \upsilon_P \land ImplementsInterface(u, v) \\
ContainsType & \iff u \in \gamma_P \land v \in \psi_P \cup \varepsilon_P \cup \iota_P \cup \upsilon_P \land ContainsType(u, v) \\
Annotates & \iff u \in \upsilon_P \land v \in \gamma_P \cup \upsilon_P \cup \iota_P \cup \varepsilon_P \cup \psi_P \cup \mu_P \cup \varphi_P \land Annotates(u, v)
\end{cases}
\]


function CreatePatterns($A_{pce}$, $CG^{+}_{P} = (V, A, R, \ell)$, $d$)
  1: $T \leftarrow \emptyset$ /*The set of patterns to be returned, initially empty*/
  2: for all $v \in V$ do /*For all vertices in $CG^{+}_{P}$*/
  3:   if $EnabledVertex(v, A_{pce})$ then /*If associated with the PCE*/
  4:      for all $\pi \in PathsThrough(v, d)$ do /*For all acyclic paths of length ≤ d passing through v*/
  5:        $\hat{\Pi} \leftarrow ExtractVertexPatterns(\pi, v)$ /*Obtain vertex patterns from π using v*/
  6:        $T \leftarrow T \cup \hat{\Pi}$
  7:      end for
  8:   end if
  9: end for
 10: for all $(u, v) \in A$ do /*For all arcs in $CG^{+}_{P}$*/
 11:   if $EnabledArc(u, v, A_{pce})$ then /*If associated with the PCE*/
 12:      for all $\pi \in PathsAlong(u, v, d)$ do /*For all acyclic paths of length ≤ d along (u, v)*/
 13:        $\hat{\Pi} \leftarrow ExtractArcPatterns(\pi, u, v)$ /*Obtain arc patterns from π using (u, v)*/
 14:        $T \leftarrow T \cup \hat{\Pi}$
 15:      end for
 16:   end if
 17: end for
 18: return $T$

                               Figure 3.6. Intention pattern creation algorithm.


3.3.2   Intention Patterns

    Once $CG^{+}_{P}$ is constructed, the next step is to identify intention patterns that express general shapes of paths
(i.e., intentions) within the graph from elements associated with shadows currently selected by the given PCE.
Returning to the Rejuvenate function depicted in Figure 3.2, intention patterns of a maximum length of d are
derived (line 2) from finite, acyclic paths in $CG^{+}_{P}$ which are relevant to the input PCE. The purpose of these
patterns is to approximate the essence of the developer’s intentions behind the original PCE.

Derivation
   The function CreatePatterns, whose definition is depicted in Figure 3.6, initializes a set T to be an empty set
of patterns to be returned at line 1. The algorithm then proceeds to cycle through each vertex (line 2) and arc
(line 10), searching for graph elements that are considered enabled (or selected) by the given PCE. It does so via
a vertex-pointcut association relation $EnabledVertex : V \times \mathcal{P}(\Omega_P) \to \mathbb{B}$ and an arc-pointcut association
relation $EnabledArc : A \times \mathcal{P}(\Omega_P) \to \mathbb{B}$, respectively, where $\mathcal{P}$ denotes the power set operation and $\mathbb{B}$ the
set of boolean values. Each relation determines whether a graph element is associated with a given PCE.

Pointcut Association
Intuitively, EnabledVertex, given a vertex v and a PCE $A_{pce}$, is true iff there exists a shadow $\omega \in A_{pce}$ such
that ω corresponds to the program element v represents. Likewise, EnabledArc, given an arc (u, v) and a
PCE $A_{pce}$, is true iff there exists a shadow $\omega \in A_{pce}$ such that ω corresponds to the program element
relationship that (u, v) represents. For instance, if a PCE contains a method execution shadow for a method x, a
vertex in $CG^{+}_{P}$ representing x is considered enabled w.r.t. the PCE. Similarly, if a PCE contains a method call
shadow from a method y to a method z, then an arc between the vertices representing y and z with the label
CallsMethod is considered enabled w.r.t. the PCE. Relating this to our motivating example given in Chapter 2, the
PCE execution(void increase(Energy+)) captures the execution of two methods, namely, DieselEngine.increase(Fuel)
and ElectricMotor.increase(Current). As such, we assign dual semantics to different kinds of graph elements
in order to relate them to PCEs. In this case, vertices representing method declarations also happen to represent
the shadow corresponding to each method’s execution. To illustrate this notion, the subset of $CG^{+}_{P}$ depicted in
Figure 3.3 portrays the vertices representing these methods as shaded, denoting that the EnabledVertex relation is
true for the combination of each method and the given PCE.
   We now specify the EnabledVertex relation more formally.
   We now specify the EnabledVertex relation more formally.

Definition 4 For an extended concern graph $CG^{+}_{P} = (V, A, R, \ell)$, a vertex $v \in V$, and a PCE $A_{pce} \in \mathcal{P}(\Omega_P)$,
we have that $EnabledVertex(v, A_{pce}) \iff v \in \mu_P \land \exists\, \omega \in A_{pce}\,[\omega = execution(v)]$.
  As definition 4 depicts, a vertex is considered enabled w.r.t. a given PCE iff the vertex represents a method
whose corresponding execution join point shadow is currently being advised by the PCE. Note that execution(v)
denotes the method execution join point shadow corresponding to the method represented by the vertex v.
  We now specify the EnabledArc relation more formally.

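Definition 5 For an extended concern graph $CG^{+}_{P} = (V, A, R, \ell)$, an arc $(u, v) \in A$, and a PCE $A_{pce} \in \mathcal{P}(\Omega_P)$,
we have that

\[
\begin{aligned}
EnabledArc(u, v, A_{pce}) \iff{} & \big(\ell(u, v) = GetsField \land \exists\, \omega \in A_{pce}\,[\omega = get(v) \;\&\&\; withincode(u)]\big) \\
{}\lor{} & \big(\ell(u, v) = SetsField \land \exists\, \omega \in A_{pce}\,[\omega = set(v) \;\&\&\; withincode(u)]\big) \\
{}\lor{} & \big(\ell(u, v) = CallsMethod \land \exists\, \omega \in A_{pce}\,[\omega = call(v) \;\&\&\; withincode(u)]\big)
\end{aligned}
\]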

  As definition 5 depicts, an arc is considered enabled w.r.t. a given PCE iff one of the following holds:

  (i) the arc is labeled as a field read access and there exists a shadow in the given PCE s.t. the shadow represents
      a field get join point on the field represented by the target vertex v (denoted by get(v)), and the shadow is
      located within the body of the method represented by the source vertex u (denoted by withincode(u)), or

 (ii) the arc is labeled as a field write access and there exists a shadow in the given PCE s.t. the shadow represents
      a field set join point on the field represented by the target vertex v (denoted by set(v)), and the shadow is
      located within the body of the method represented by the source vertex u (denoted by withincode(u)), or

 (iii) the arc is labeled as a method call and there exists a shadow in the given PCE s.t. the shadow represents a
       method call join point whose called method is the method represented by the target vertex v
       (denoted by call(v)), and the shadow is located within the body of the method represented by the source
       vertex u (denoted by withincode(u)).
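
   Definitions 4 and 5 can be sketched as two small predicates. In this sketch a PCE is modeled as a set of
shadow tuples, where ("execution", m) stands for execution(m) and (kind, target, source) stands for
kind(target) && withincode(source); this tuple encoding and the function names are assumptions made purely for
illustration.

```python
# Maps an arc label to the kind of shadow it can be associated with (Definition 5).
SHADOW_KIND_FOR_LABEL = {
    "GetsField": "get",
    "SetsField": "set",
    "CallsMethod": "call",
}


def enabled_vertex(v, pce, methods):
    """Definition 4: v is a method whose execution shadow is in the PCE."""
    return v in methods and ("execution", v) in pce


def enabled_arc(u, v, label, pce):
    """Definition 5: the PCE holds a get/set/call shadow on v located within u."""
    kind = SHADOW_KIND_FOR_LABEL.get(label)
    return kind is not None and (kind, v, u) in pce


# Method vertices of the motivating example, identified here by name only.
methods = {
    "DieselEngine.increase(Fuel)",
    "ElectricMotor.increase(Current)",
    "notifyChangeIn(Fuel)",
    "notifyChangeIn(Current)",
}

# Shadows selected by execution(void increase(Energy+)).
execution_pce = {
    ("execution", "DieselEngine.increase(Fuel)"),
    ("execution", "ElectricMotor.increase(Current)"),
}

# A hypothetical PCE selecting one call shadow, used only to exercise enabled_arc.
call_pce = {("call", "notifyChangeIn(Fuel)", "DieselEngine.increase(Fuel)")}
```

   With execution_pce, only the two increase vertices are enabled, which corresponds to the shaded vertices of
Figure 3.3; arcs whose label has no associated shadow kind (e.g., DeclaresMethod) are never enabled.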

   As future work, we plan to incorporate more AO shadow types into this scheme, e.g., handler(). Chapter 6
discusses these future plans in more detail.

Path Extraction
Once it is determined that a vertex v is indeed enabled by the given PCE, CreatePatterns (line 4) traverses each
acyclic, finite path π of length ≤ d in $CG^{+}_{P}$ passing through v. E.g., one such path in the case of the example
given in Figure 3.3, when taking v = increase(Fuel) and d = 2, would be
$increase(Fuel) \xrightarrow{cm} notifyChangeIn(Fuel) \xrightarrow{sf} HybridAutomobile.overallSpeed$, where the labels cm and sf
refer to the satisfied binary relations in R, $CallsMethod : \mu_P \times \mu_P \to \mathbb{B}$ and $SetsField : \mu_P \times \varphi_P \to \mathbb{B}$,
respectively. The traversal process is similar for arcs (line 12), except that paths along the arc are considered
as opposed to paths passing through a specific vertex.
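
   The traversal can be sketched as a bounded depth-first enumeration. The report does not define PathsThrough
algorithmically in this section, so the following is only one possible realization, assuming that path length is
measured in arcs and that arcs are encoded as (source, label, target) triples.

```python
def simple_paths(arcs, max_len):
    """Yield every acyclic path (a tuple of arcs) containing 1..max_len arcs."""
    outgoing = {}
    for arc in arcs:
        outgoing.setdefault(arc[0], []).append(arc)

    def extend(path, visited):
        yield path
        if len(path) == max_len:
            return
        last_vertex = path[-1][2]
        for arc in outgoing.get(last_vertex, []):
            if arc[2] not in visited:  # never revisit a vertex, so paths stay acyclic
                yield from extend(path + (arc,), visited | {arc[2]})

    for arc in arcs:
        if arc[0] != arc[2]:  # a self-loop is itself a cycle
            yield from extend((arc,), {arc[0], arc[2]})


def paths_through(arcs, v, max_len):
    """Acyclic paths of at most max_len arcs on which v occurs as a source or target."""
    return [
        path
        for path in simple_paths(arcs, max_len)
        if any(v in (a[0], a[2]) for a in path)
    ]


# The DieselEngine side of Figure 3.3 (cm = CallsMethod, sf = SetsField).
example_arcs = [
    ("p", "contains", "DieselEngine"),
    ("DieselEngine", "declares_method", "increase(Fuel)"),
    ("increase(Fuel)", "cm", "notifyChangeIn(Fuel)"),
    ("notifyChangeIn(Fuel)", "sf", "HybridAutomobile.overallSpeed"),
]

found = paths_through(example_arcs, "increase(Fuel)", 2)
```

   For v = increase(Fuel) and d = 2, the result includes the path discussed above, alongside shorter paths and
paths that reach v through its declaring class.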
   In the case of vertices, the function then proceeds (line 5) to obtain intention patterns from the path π under
consideration using the enabled vertex as a guide, doing so via a helper function
$ExtractVertexPatterns : \Pi_P \times V \to \mathcal{P}(\hat{\Pi}_P)$. This function extracts a vertex-based intention pattern; a vertex-based
pattern is similar in nature to the path it is extracted from except that certain vertices are replaced with wildcard
elements. Wildcard elements are used for matching patterns with other paths in the graph in order to ultimately
obtain suggested shadows. A vertex on the path may be replaced by a vertex wildcard, which only matches other
vertices. A vertex wildcard may be enabled or disabled depending on its position in the path relative to the input
vertex. Shadows represented by vertices matched by enabled wildcards are those that eventually become suggested.
The pattern-path matching scheme is discussed in more detail in Section 3.3.2.
   In the case of arcs, the function proceeds on line 13 to obtain intention patterns from π using the enabled arc
as a guide, doing so via a helper function $ExtractArcPatterns : \Pi_P \times A \to \mathcal{P}(\hat{\Pi}_P)$. This function extracts an
arc-based intention pattern; an arc-based pattern is similar in nature to the path it is extracted from except that
certain vertices and arcs are replaced with wildcard elements. An arc on the path may be replaced by an arc
wildcard, which only matches other arcs. Similar to a vertex wildcard, an arc wildcard may be enabled or disabled
depending on its position in the path relative to the input arc. Shadows represented by arcs matched by enabled
wildcards are those that eventually become suggested.
   Intuitively, the functions work by replacing combinations of elements along the path with wildcards depending
on the position of the enabled graph element. Relating this concept to the example subset graph given in Figure
3.3, suppose π, the path under consideration, is the one previously considered, namely,
$increase(Fuel) \xrightarrow{cm} notifyChangeIn(Fuel) \xrightarrow{sf} HybridAutomobile.overallSpeed$, and suppose the enabled vertex is
the first vertex increase(Fuel). Thus, in this case, ExtractVertexPatterns would produce a set consisting of a
single vertex pattern $\hat{\pi}$, namely, $?^{*} \xrightarrow{cm}\ ? \xrightarrow{sf} HybridAutomobile.overallSpeed$, where ? denotes a disabled
wildcard and $?^{*}$ denotes an enabled wildcard.
   Figure 3.7 defines the ExtractVertexPatterns function more formally. To do so, we first define two additional
helper functions on arcs of the extended concern graph $CG^{+}_{P} = (V, A, R, \ell)$, $s : A \to V$ and $t : A \to V$, which,
given an arc, project the constituent source and target vertices (see Figure 3.4 for more details), respectively.
Function ExtractVertexPatterns receives two parameters, namely, a path π from which to extract the vertex-based
pattern and an enabled vertex $v \in \{u \mid \exists\, a \in \pi\,[u = s(a) \lor u = t(a)]\}$ along π. Recall from the notation
given in Figure 3.1 that both paths and patterns are sequences of arcs, and an arc belonging to a path consisting
only of that single arc is considered both the first and last arc in that path. The function then returns a set of
constructed patterns to be later matched with actual paths. Likewise, Figure 3.8 defines the ExtractArcPatterns
function more formally. Parameter e denotes the enabled arc on which to base the pattern, while an enabled arc
wildcard is denoted by raising the pair of vertices to the wildcard symbol $?^{*}$. Function CreatePatterns finally
returns the set of all such patterns T on line 18, Figure 3.6.
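
   To make the case analysis of Figure 3.7 concrete, the following sketch transcribes one reading of it. Several
inequality signs in the figure are ambiguous in this copy of the report, so the branch conditions below follow
the accompanying comments and should be treated as assumptions; the (source, label, target) arc encoding and the
string wildcards "?" and "?*" are likewise illustrative choices. Unlike the figure, the sketch omits an empty
trailing pattern from the result.

```python
DISABLED, ENABLED = "?", "?*"


def extract_vertex_patterns(path, v):
    """One reading of Figure 3.7: path is a sequence of (source, label, target) arcs."""
    patterns, current = [], []
    n = len(path)
    for i, (s, label, t) in enumerate(path, start=1):
        first, last = i == 1, i == n
        close = False  # whether the pattern built so far is emitted after this arc
        if first and s != v and t != v:
            arc = (s, label, DISABLED)
        elif first and t == v:
            arc = (s, label, ENABLED)
        elif last and s != v and t != v:
            arc = (DISABLED, label, t)
        elif last and s == v:
            arc = (ENABLED, label, t)
        elif not first and not last and t == v:
            arc = (DISABLED, label, ENABLED)
        elif first and not last and s == v:
            arc = (ENABLED, label, DISABLED)
        elif not first and not last and s != v and t != v:
            arc = (DISABLED, label, DISABLED)
        elif not first and not last and s == v:
            arc, close = (ENABLED, label, DISABLED), True
        else:
            arc, close = (DISABLED, label, ENABLED), True
        current.append(arc)
        if close:
            patterns.append(tuple(current))
            current = []
    if current:  # assumption: an empty trailing pattern carries no information
        patterns.append(tuple(current))
    return patterns


example_path = (
    ("increase(Fuel)", "cm", "notifyChangeIn(Fuel)"),
    ("notifyChangeIn(Fuel)", "sf", "HybridAutomobile.overallSpeed"),
)
example_patterns = extract_vertex_patterns(example_path, "increase(Fuel)")
```

   For the example path with increase(Fuel) as the enabled vertex, the sketch yields the single pattern described
in the text: an enabled wildcard, followed by a disabled wildcard, followed by HybridAutomobile.overallSpeed.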
   Returning to the Rejuvenate function depicted in Figure 3.2, lines 3 and 4 perform similar actions as the
previous two; however, they do so for the new version of the base-code, $P'$. Also notice that at line 4, patterns
are derived by enabling all shadows contained in the revised program ($\Omega_{P'}$); thus, all possible intention
patterns are derived from $CG^{+}_{P'}$. Then, at line 5, the patterns derived from the old version of the base-code are
intersected with all possible patterns derived from the new version. The intersection $\hat{\Pi}_{P \cap P'}$ represents the
surviving set of patterns between the two versions, thereby removing patterns containing program elements that no
longer exist in the new version of the base-code. On line 6, a set S of ⟨suggested shadow, confidence⟩ pairs
(footnote 7) is created to serve as suggested shadows to be included in a new version of the PCE. S is constructed
by matching each intention pattern in the surviving set $\hat{\Pi}_{P \cap P'}$ with the graph of the revised base-code $CG^{+}_{P'}$.

   7 The confidence factor is inspired by [7].


function ExtractVertexPatterns($\pi = \langle a_1, a_2, \ldots, a_n \rangle$, $v$)
  1: $\hat{\Pi} \leftarrow \emptyset$ /*The set of patterns to be returned, initially empty*/
  2: $\hat{\pi} \leftarrow \langle\rangle$ /*A single pattern to be built, initially the empty sequence of arcs.*/
  3: for $i \leftarrow 1, n$ do /*For each arc along path π*/
  4:   if $i = 1 \land s(a_i) \neq v \land t(a_i) \neq v$ then /*If it is the first arc and both the source and target
       vertices are disabled*/
  5:     $\hat{\pi} \leftarrow \hat{\pi} + (s(a_i), ?)$ /*Append a new arc consisting of the old source as the source vertex and a
         disabled wildcard as the target vertex to the pattern.*/
  6:   else if $i = 1 \land t(a_i) = v$ then /*Otherwise, if it is the first arc and the target vertex is enabled*/
  7:     $\hat{\pi} \leftarrow \hat{\pi} + (s(a_i), ?^{*})$ /*Append a new arc consisting of the old source as the source vertex and an
         enabled wildcard as the target vertex to the pattern.*/
  8:   else if $i = n \land s(a_i) \neq v \land t(a_i) \neq v$ then /*Otherwise, if it is the last arc and both the source and
       target vertices are disabled*/
  9:     $\hat{\pi} \leftarrow \hat{\pi} + (?, t(a_i))$ /*Append a new arc consisting of a disabled wildcard as the source vertex and
         the old target as the target vertex to the pattern.*/
 10:   else if $i = n \land s(a_i) = v$ then /*Otherwise, if it is the last arc and the source vertex is enabled*/
 11:     $\hat{\pi} \leftarrow \hat{\pi} + (?^{*}, t(a_i))$ /*Append a new arc consisting of an enabled wildcard as the source vertex and
         the old target as the target vertex to the pattern.*/
 12:   else if $i \neq 1 \land i \neq n \land t(a_i) = v$ then /*Otherwise, if it is neither the first nor the last arc and the
       target vertex is enabled*/
 13:     $\hat{\pi} \leftarrow \hat{\pi} + (?, ?^{*})$ /*Append a new arc consisting of a disabled wildcard as the source vertex and an
         enabled wildcard as the target vertex to the pattern.*/
 14:   else if $i = 1 \land i \neq n \land s(a_i) = v$ then /*Otherwise, if it is the first but not the last arc and the
       source vertex is enabled*/
 15:     $\hat{\pi} \leftarrow \hat{\pi} + (?^{*}, ?)$ /*Append a new arc consisting of an enabled wildcard as the source vertex and a
         disabled wildcard as the target vertex to the pattern.*/
 16:   else if $i \neq 1 \land i \neq n \land s(a_i) \neq v \land t(a_i) \neq v$ then /*Otherwise, if it is neither the first nor the
       last arc and both the source and target vertices are disabled*/
 17:     $\hat{\pi} \leftarrow \hat{\pi} + (?, ?)$ /*Append a new arc consisting of disabled wildcards as both the source and target
         vertices to the pattern.*/
 18:   else if $i \neq 1 \land i \neq n \land s(a_i) = v$ then /*Otherwise, if it is neither the first nor the last arc and the
       source vertex is enabled*/
 19:     $\hat{\pi} \leftarrow \hat{\pi} + (?^{*}, ?)$ /*Append a new arc consisting of an enabled wildcard as the source vertex and a
         disabled wildcard as the target vertex to the pattern.*/
 20:     $\hat{\Pi} \leftarrow \hat{\Pi} \cup \{\hat{\pi}\}$ /*Add the completed pattern to the set to be returned.*/
 21:     $\hat{\pi} \leftarrow \langle\rangle$ /*Reset $\hat{\pi}$ to be the empty sequence of arcs.*/
 22:   else /*Otherwise, it must be that it is neither the first nor the last arc and the target vertex is enabled*/
 23:     $\hat{\pi} \leftarrow \hat{\pi} + (?, ?^{*})$ /*Append a new arc consisting of a disabled wildcard as the source vertex and an
         enabled wildcard as the target to the pattern.*/
 24:     $\hat{\Pi} \leftarrow \hat{\Pi} \cup \{\hat{\pi}\}$
 25:     $\hat{\pi} \leftarrow \langle\rangle$
 26:   end if
 27: end for
 28: return $\hat{\Pi} \cup \{\hat{\pi}\}$ /*Return the accrued set of patterns along with the last completed pattern.*/

                          Figure 3.7. Vertex-based pattern path extraction algorithm.



function ExtractArcPatterns($\pi = \langle a_1, a_2, \ldots, a_n \rangle$, $e$)
  1: $\hat{\Pi} \leftarrow \emptyset$ /*The set of patterns to be returned, initially empty*/
  2: $\hat{\pi} \leftarrow \langle\rangle$ /*A single pattern to be built, initially the empty sequence of arcs.*/
  3: for $i \leftarrow 1, n$ do /*For each arc along path π*/
  4:   if $i = 1 \land a_i \neq e$ then /*If it is the first arc and it is disabled*/
  5:      $\hat{\pi} \leftarrow \hat{\pi} + (s(a_i), ?)$ /*Append a new arc consisting of the old source as the source vertex and a
          disabled wildcard as the target vertex to the pattern.*/
  6:   else if $i = n \land a_i \neq e$ then /*Otherwise, if it is the last arc and it is disabled*/
  7:      $\hat{\pi} \leftarrow \hat{\pi} + (?, t(a_i))$ /*Append a new arc consisting of a disabled wildcard as the source vertex and
          the old target as the target vertex to the pattern.*/
  8:   else if $i \neq 1 \land i \neq n \land a_i \neq e$ then /*Otherwise, if it is neither the first nor the last arc and it is
       disabled*/
  9:      $\hat{\pi} \leftarrow \hat{\pi} + (?, ?)$ /*Append a new arc consisting of disabled wildcards as both the source and target
          vertices to the pattern.*/
 10:   else if $(i = 1 \lor i = n) \land i \neq n \land a_i = e$ then /*Otherwise, if it is the first arc or the last arc but
       not the only arc and it is enabled*/
 11:      $\hat{\pi} \leftarrow \hat{\pi} + (?, ?)^{?^{*}}$ /*Append a new enabled arc wildcard consisting of disabled wildcards as both the
          source and target vertices to the pattern.*/
 12:   else if $i \neq 1 \land i \neq n \land a_i = e$ then /*Otherwise, if it is neither the first nor the last arc and it is
       enabled*/
 13:      $\hat{\pi} \leftarrow \hat{\pi} + (?, ?)^{?^{*}}$
 14:      $\hat{\Pi} \leftarrow \hat{\Pi} \cup \{\hat{\pi}\}$ /*Add the completed pattern to the set to be returned.*/
 15:      $\hat{\pi} \leftarrow \langle (?, ?)^{?^{*}} \rangle$ /*Reset $\hat{\pi}$ to be a sequence consisting of a new enabled arc wildcard with
          disabled wildcards as both the source and target vertices.*/
 16:   else /*Otherwise, it must be that it is the only arc and it is enabled*/
 17:      $\hat{\pi} \leftarrow \hat{\pi} + (s(a_i), ?)^{?^{*}}$ /*Append a new enabled arc wildcard consisting of the old source as the
          source vertex and a disabled wildcard as the target vertex to the pattern.*/
 18:      $\hat{\Pi} \leftarrow \hat{\Pi} \cup \{\hat{\pi}\}$
 19:      $\hat{\pi} \leftarrow \langle (?, t(a_i))^{?^{*}} \rangle$ /*Reset $\hat{\pi}$ to be a sequence consisting of a new enabled arc wildcard with
          a disabled wildcard as the source vertex and the old target as the target vertex.*/
 20:   end if
 21: end for
 22: return $\hat{\Pi} \cup \{\hat{\pi}\}$ /*Return the accrued set of patterns along with the last completed pattern.*/

                            Figure 3.8. Arc-based pattern path extraction algorithm.




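\[
error_{\alpha}(\hat{\pi}, A_{pce}) =
\begin{cases}
0 & \text{if } |Match(\hat{\pi}, Paths(CG^{+}_{P}))| = 0 \\[4pt]
1 - \dfrac{|A_{pce} \cap Match(\hat{\pi}, Paths(CG^{+}_{P}))|}{|Match(\hat{\pi}, Paths(CG^{+}_{P}))|} & \text{otherwise}
\end{cases}
\tag{3.1}
\]

\[
error_{\beta}(\hat{\pi}, A_{pce}) =
\begin{cases}
1 & \text{if } |A_{pce}| = 0 \\[4pt]
1 - \dfrac{|A_{pce} \cap Match(\hat{\pi}, Paths(CG^{+}_{P}))|}{|A_{pce}|} & \text{otherwise}
\end{cases}
\tag{3.2}
\]

\[
abs(\hat{\pi}) =
\begin{cases}
1 & \text{if } |\hat{\pi}| = 0 \\[4pt]
1 - \dfrac{|\hat{\pi}| - |W(\hat{\pi})|}{|\hat{\pi}|} & \text{otherwise}
\end{cases}
\tag{3.3}
\]

\[
conf(\hat{\pi}, A_{pce}) = 1 - \big[\, error_{\alpha}(\hat{\pi}, A_{pce})\,(1 - abs(\hat{\pi})) + error_{\beta}(\hat{\pi}, A_{pce})\, abs(\hat{\pi}) \,\big]
\tag{3.4}
\]

                                     Figure 3.9. Pattern attribute equations.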
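
   The equations of Figure 3.9 can be evaluated directly once the matched shadows are known. The sketch below
makes several assumptions: shadows are plain strings, $|\hat{\pi}|$ and $|W(\hat{\pi})|$ are supplied by the caller as
counts, and the two error terms in equation (3.4) are grouped before being subtracted from 1 so that the result
stays within [0, 1]. The numbers in the example are invented for illustration.

```python
def error_alpha(matched, pce):
    """Equation (3.1): share of matched shadows that the PCE does not select."""
    if not matched:
        return 0.0
    return 1 - len(pce & matched) / len(matched)


def error_beta(matched, pce):
    """Equation (3.2): share of PCE shadows that the pattern fails to match."""
    if not pce:
        return 1.0
    return 1 - len(pce & matched) / len(pce)


def abstractness(pattern_size, wildcard_count):
    """Equation (3.3): fraction of the pattern's elements that are wildcards."""
    if pattern_size == 0:
        return 1.0
    return 1 - (pattern_size - wildcard_count) / pattern_size


def confidence(matched, pce, pattern_size, wildcard_count):
    """Equation (3.4), with the weighted error sum grouped (an assumption)."""
    a = abstractness(pattern_size, wildcard_count)
    weighted_error = error_alpha(matched, pce) * (1 - a) + error_beta(matched, pce) * a
    return 1 - weighted_error


# Hypothetical input: the PCE selects two shadows; the pattern matches those and one more.
pce = {"execution(increase(Fuel))", "execution(increase(Current))"}
matched = pce | {"execution(increase(double))"}

# Assumed pattern with three elements, two of which are wildcards.
example_confidence = confidence(matched, pce, pattern_size=3, wildcard_count=2)
```

   In this example the α error is 1/3, the β error is 0, and the abstractness is 2/3, giving a confidence of
1 − (1/3)(1/3) = 8/9.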


Path Matching
We now define the pattern-to-path matching scheme of our approach more formally. To do so, we first define a
function $Match : \hat{\Pi} \times \mathcal{P}(\Pi) \to \mathcal{P}(\Omega)$ that, given a pattern and a set of paths, returns a set of suggested
shadows. Function Match works by matching the given pattern, which may contain wildcard graph elements, against
the given paths in the graph. We define this notion more formally as follows.

Definition 6 A pattern $\hat{\pi}$ matches a path π iff

   • for each vertex u along π at position i there exists a vertex v in $\hat{\pi}$ at the same position i in which either
     u = v or v is a wildcard,

   • for each arc (p, q) along π at position j there exists an arc (s, t) in $\hat{\pi}$ at the same position j in which
     either $\ell(p, q) = \ell(s, t)$ or (s, t) is a wildcard.

   We define an equivalence relation over vertices of $CG^{+}_{P} = (V, A, R, \ell)$ to be that of the traditional equality
relation, i.e., that u and v refer to the same vertex. As for arcs, the function $\ell$ refers to the labeling function
introduced in definition 1; thus, $\ell(p, q) = \ell(s, t)$ denotes that each arc satisfies the same structural relation.
Recall from definition 3 that a given arc may only satisfy a single relation from the set R.
   Given that a pattern matches a particular path, suggested shadows are ones represented by graph elements
(vertices and/or arcs) along the path which matched enabled wildcards in the pattern. Vertices representing method
declarations matched by enabled wildcards produce a suggested shadow associated with the execution of the
method. Likewise, arcs representing relationships, e.g., calls, field reads, field writes, between program elements
matched by enabled wildcards produce a suggested shadow associated with the relationship, e.g., call, get, set,
between the program elements, e.g., withincode, represented by the source and target vertices.
   In terms of the graph subset portrayed in Figure 3.3, the pattern ?* -cm-> ? -sf-> overallSpeed would match
the paths from increase(Fuel) to overallSpeed and from increase(Current) to overallSpeed. Note that the same
pattern would also match the path from increase(double) to overallSpeed had we augmented the graph in Figure 3.3
with the revised version of the base code portrayed in Figure 2.3 from our motivating example. Furthermore,
increase(double) would be represented by a vertex that matches an enabled wildcard element
in the pattern; thus, this method would be suggested for inclusion in a new version of the PCE.
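
Definition 6 and the Match function can be sketched in a few lines of Python. This is an illustrative sketch only, not the tool's implementation. It assumes that a path is a pair of tuples (vertices, arc labels), that a pattern has the same shape with some positions replaced by a Wildcard marker, and that only patterns and paths of equal length can match. The intermediate vertex setSpeed(), the non-matching path, and the tuple encoding of suggested shadows are hypothetical and are not taken from Figure 3.3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wildcard:
    """A wildcard graph element; `enabled` marks elements that yield suggestions."""
    enabled: bool = False


def element_matches(pattern_elem, path_elem):
    # Definition 6: elements match if equal or if the pattern element is a wildcard.
    return isinstance(pattern_elem, Wildcard) or pattern_elem == path_elem


def matches(pattern, path):
    """Return True iff the pattern matches the path position by position."""
    (p_vertices, p_arcs), (vertices, arcs) = pattern, path
    if len(p_vertices) != len(vertices) or len(p_arcs) != len(arcs):
        return False  # assumption: only equal-length patterns and paths match
    return (all(map(element_matches, p_vertices, vertices))
            and all(map(element_matches, p_arcs, arcs)))


def match(pattern, paths):
    """Collect suggested shadows from path elements matched by enabled wildcards."""
    suggested = set()
    p_vertices, p_arcs = pattern
    for path in paths:
        if not matches(pattern, path):
            continue
        vertices, arcs = path
        for p_vertex, vertex in zip(p_vertices, vertices):
            if isinstance(p_vertex, Wildcard) and p_vertex.enabled:
                suggested.add(("execution", vertex))
        for i, p_arc in enumerate(p_arcs):
            if isinstance(p_arc, Wildcard) and p_arc.enabled:
                # relation label plus its source and target vertices
                suggested.add((arcs[i], vertices[i], vertices[i + 1]))
    return suggested


# Pattern ?* -cm-> ? -sf-> overallSpeed: an enabled wildcard, a plain wildcard,
# and a concrete vertex, joined by concrete arcs labeled cm and sf.
pattern = ((Wildcard(enabled=True), Wildcard(), "overallSpeed"), ("cm", "sf"))
paths = [
    (("increase(Fuel)", "setSpeed()", "overallSpeed"), ("cm", "sf")),
    (("increase(double)", "setSpeed()", "overallSpeed"), ("cm", "sf")),
    (("brake()", "setSpeed()", "currentGear"), ("cm", "sf")),  # different target
]
suggested = match(pattern, paths)
```

Under these assumptions, `suggested` contains an execution shadow for increase(Fuel) and one for increase(double), while the third path is rejected because its final vertex differs from the concrete element in the pattern.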


Confidence
A confidence c paired with each suggested shadow ω is a real number in the interval [0, 1] that represents the
degree to which we believe a revised version of the original PCE to be applicable to the shadow ω. The confidence
of a suggested shadow is inherited from the pattern that produced it and is derived from three components, as
depicted in Figure 3.9. We refer to each of these components as a pattern attribute with respect to a PCE A_pce, which
is the PCE to be rejuvenated.
    The first attribute is the error α rate, depicted in equation (3.1), which is a ratio of the number of shadows
captured by both the PCE A_pce and the pattern π̂, when applied to finite, acyclic paths in the graph Paths(CG⁺_P), to
the number of shadows captured solely by the pattern. The α signifies the metric’s association with the rate of type
I (or α) errors, which relate to the number of false positives produced by the pattern. Conceptually, the error α
rate quantifies the pattern’s ability to match solely the shadows contained within the PCE; the closer the error α
is to 0, the more likely the shadows matched by the pattern are also ones contained within the PCE. It refers to
the quality of results that the pattern is likely to produce in the future. A pattern with a low error α is one that
expresses a strong relationship amongst join points captured by the PCE; we would expect future join points to
exhibit similar characteristics. Naturally, if a given pattern does not match any shadows, its corresponding error α
rate is 0.
    The second pattern attribute with respect to a given PCE is the error β rate, depicted in equation (3.2), which is a
ratio of the number of shadows captured by both the PCE A_pce and the pattern π̂, when applied to finite, acyclic
paths in the graph Paths(CG⁺_P), to the number of shadows captured solely by the given PCE. The difference
between error α and error β is subtle but important; the β signifies the metric’s association with the rate of type II
(or β) errors, which relate to the number of false negatives produced by the pattern. Conceptually, the error β rate
quantifies the pattern’s ability to match all of the shadows contained within the PCE; the closer the error β is
to 0, the more likely the pattern is to match all the shadows contained within the PCE. It refers to the quantity of
correct results that the pattern is likely to produce in the future. A pattern with a low error β expresses properties
similar to the ones expressed by the given PCE, whether or not those properties are common to the captured
shadows. Naturally, if the given PCE does not contain any shadows, the pattern’s corresponding error β rate is 1
since it could not possibly match any of the join points contained within the PCE.
    As portrayed by function CreatePatterns in Figure 3.6, a pattern π̂ is derived from a path π by replacing
concrete elements in the path with wildcard elements. Wildcard graph elements may match a number of elements
contained in the graph, as detailed previously. When predicting a pattern’s future ability to rejuvenate a given
PCE, we would like to take into account its abstractness (abbreviated abs), i.e., the ratio of the number of
constituent wildcard elements to concrete elements. Let |π̂| denote the number of unique elements (vertices and
arcs), including wildcards, contained within pattern π̂. Moreover, let W(π̂) denote the multiset projection of
wildcard elements contained in pattern π̂ (see Figure 3.1). Likewise, |W(π̂)| projects the number of wildcard
elements contained within pattern π̂. Then, the abs of a pattern π̂, which is independent of any particular PCE, is
given by equation (3.3). Note that an empty pattern has no concrete elements; thus, we consider such a pattern to
be completely abstract, i.e., having an abstractness of 1.
    The intuition behind the pattern abstractness attribute is that patterns containing many wildcard
elements are more likely to match a greater number of concrete graph elements and vice versa. Thus, we
combine the α and β error rates of a pattern using a mean weighted by the pattern’s abstractness.
The intuition behind the chosen weighting scheme is as follows. A pattern that is very abstract (i.e., containing
many wildcards) is typically less likely to home in on shadows that are only contained within the given PCE.
Conversely, a pattern that is less abstract (i.e., more concrete, containing fewer wildcards) is less likely to cover
all shadows contained within the given PCE. The combined metrics are used to derive a confidence (abbreviated
conf) pattern attribute, portrayed in equation (3.4), which is a convenient, single metric to judge the confidence we
have in the pattern being useful in detecting accurate shadows to be included in a future, rejuvenated version of the
corresponding PCE. The closer a pattern’s confidence is to 1, the more likely it will produce accurate suggestions
in the future.
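
The four pattern attributes can be sketched as small Python functions to make the weighting concrete. Equations (3.1) to (3.3) are not reproduced in this passage, so the forms below are assumptions inferred from the prose: each error rate is taken as one minus the corresponding overlap ratio (so that 0 is the best value), and abs is taken as the share of wildcard elements among all pattern elements. The shadow names and element counts are hypothetical.

```python
def error_alpha(pce_shadows, matched_shadows):
    """Assumed form of equation (3.1): share of matched shadows outside the PCE."""
    if not matched_shadows:
        return 0.0  # a pattern matching no shadows has an error alpha rate of 0
    return 1 - len(pce_shadows & matched_shadows) / len(matched_shadows)


def error_beta(pce_shadows, matched_shadows):
    """Assumed form of equation (3.2): share of PCE shadows the pattern misses."""
    if not pce_shadows:
        return 1.0  # an empty PCE gives an error beta rate of 1
    return 1 - len(pce_shadows & matched_shadows) / len(pce_shadows)


def abstractness(num_elements, num_wildcards):
    """Assumed form of equation (3.3): wildcard share; an empty pattern is fully abstract."""
    if num_elements == 0:
        return 1.0
    return num_wildcards / num_elements


def confidence(alpha, beta, abs_):
    """Equation (3.4): one minus the abstractness-weighted mean of the error rates."""
    return 1 - (alpha * (1 - abs_) + beta * abs_)


# Hypothetical example: the PCE captures four shadows and the pattern matches
# three, two of which are also captured by the PCE.
pce = {"s1", "s2", "s3", "s4"}
matched = {"s1", "s2", "s5"}
alpha = error_alpha(pce, matched)   # 1 - 2/3
beta = error_beta(pce, matched)     # 1 - 2/4
abs_ = abstractness(5, 2)           # 2 wildcards among 5 elements
conf = confidence(alpha, beta, abs_)
```

With these inputs the weighted error is 1/3 · 0.6 + 0.5 · 0.4 = 0.4, giving a confidence of 0.6. A more abstract pattern shifts weight from the α term to the β term.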
   Returning to the Rejuvenate function depicted in Figure 3.2, line 7 commences the final rejuvenation process
by initializing the new PCE A′_pce, which is to be returned by the function, as the empty set of shadows. Then, for each
(shadow, confidence) pair, sorted by decreasing confidence (line 8), the suggestion is presented to the developer along
with its confidence (line 9). The selected shadows (line 10) are then added to the rejuvenated PCE A′_pce (line
11), which is returned (line 14).
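
The final loop described above can be sketched as follows. This is a hedged illustration rather than a transcription of Figure 3.2: the `accept` callback stands in for the developer's interactive decision, and the shadow names, confidences, and 0.5 threshold are hypothetical.

```python
def rejuvenate(suggestions, accept):
    """Build the new PCE from (shadow, confidence) pairs chosen by the developer.

    `accept(shadow, confidence)` is a hypothetical stand-in for presenting a
    suggestion to the developer and recording whether it was selected.
    """
    new_pce = set()  # the rejuvenated PCE starts as the empty set of shadows
    ranked = sorted(suggestions, key=lambda pair: pair[1], reverse=True)
    for shadow, conf in ranked:  # highest confidence is presented first
        if accept(shadow, conf):
            new_pce.add(shadow)
    return new_pce


presented = []

def accept_above_half(shadow, conf):
    # Simulated developer: records presentation order, accepts conf >= 0.5.
    presented.append(shadow)
    return conf >= 0.5

suggestions = [("execution(increase(double))", 0.8),
               ("call(log(String))", 0.2),
               ("set(overallSpeed)", 0.6)]
new_pce = rejuvenate(suggestions, accept_above_half)
```

Sorting before presentation means the developer reviews the most promising shadows first and can stop early once confidences drop below a useful level.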




Chapter 4

Experimental Evaluation

In this chapter, we provide an overview of the experimental study conducted to quantitatively ascertain the
usefulness of the rejuvenation approach in terms of its ability to make accurate suggestions of shadows to be incorporated
into a revised version of a PCE upon evolution of the base-code.

4.1     Implementation

   We implemented our algorithm as a plug-in to the popular Eclipse IDE¹. Eclipse abstract syntax trees (ASTs)
with source symbol bindings were used as an intermediate program representation. The intention graph is
constructed with the aid of the JayFX fact extractor, extended for Java 1.5 and AspectJ, which generates facts (using
class hierarchy analysis, CHA) pertaining to structural properties and relationships, e.g., field accesses, method calls.
Furthermore, we leveraged the AJDT compiler for the implementation of the Enabled relation described in Chapter 3,
which associates PCEs with both vertices and labeled arcs depending on the kind of shadows. The actual path matching
described in Chapter 3 is implemented via the Drools² rules engine.
   To increase applicability to real-world applications, we relaxed several assumptions described in Chapter 3.
For example, we conservatively assume that dynamic advice, i.e., advice bound to a pointcut containing run-time
predicates, is always applied. If the tool encountered inter-type declarations or any other form of static
crosscutting, the associated aspect was still processed but these constructs were ignored. For further details on our
implementation, we refer readers to our tool demonstration [17].

4.2     Experimental Evaluation

  The experiment is separated into two phases: analysis and rejuvenation. The analysis phase (I) explores the
quality of intention patterns produced, while the rejuvenation phase (II) investigates the ability of these patterns to
make accurate suggestions.

4.2.1     Phase I: Correlation Analysis

  To evaluate the quality of the patterns produced by our inferencing algorithm, we ran the analysis phase of
our algorithm described in Chapter 3 on 23 AspectJ applications, benchmarks, and libraries³, which are listed in
Table 4.1. The second and third columns of the table show the varying size of each benchmark, from small (e.g.,
Quicksort) to large (e.g., MySQL Connector/J). For each benchmark, the analysis was executed five times on
a 2.16 GHz Intel Core 2 Duo machine with 2GB RAM and a maximum heap size of 1GB. On average, the analysis
time (column time in Table 4.1) was 8.22 seconds per KLOC and 4.80 seconds per advice, which is practical even for
large applications.

   ¹ http://www.eclipse.org
   ² http://www.jboss.org/drools
   ³ All available on our web site: http://tinyurl.com/6ewl2r

benchmark             LOC  class.  adv.  shad.   patt.  abs.  σabs.     α    σα     β    σβ  conf.  time (s)
AJHotDraw           21750     298    32     90    3362  0.61   0.01  0.32  0.12  0.06  0.25   0.62       101
Ants                 1572      33    22    297    1254  0.63   0.02  0.15  0.11  0.23  0.41   0.62        43
Bean                  121       2     2      4      16   0.6      0  0.24     0  0.23     0   0.53         4
Contract4J          10722     199    15    350    1809  0.57   0.03  0.26  0.09  0.44  0.23    0.3       115
DCM                  1680      29     8    343    2472  0.57   0.02  0.15  0.14  0.45  0.25    0.4         4
Figure                 94       5     1      6      22  0.61      0  0.11     0  0.45     0   0.44         8
Glassbox            25940     430    55    208    2620  0.61   0.02  0.28   0.1  0.13  0.29   0.59       228
HealthWatcher        5716      76    27    122    1004   0.6   0.02  0.21   0.2  0.16  0.38   0.63        22
Jakarta Cactus       7573      93     4    222    2151   0.6   0.01  0.21  0.07  0.52  0.19   0.27         8
LawOfDemeter         1586      29     5    164     540  0.58   0.01  0.15  0.09  0.41  0.41   0.44        46
MobilePhoto          3806      52    25     25     775   0.6   0.02  0.23  0.13     0     0   0.77        11
MySQL Connector/J   44016     187     2   3016   17564  0.58      0  0.12     0  0.58     0   0.31       379
NullCheck            1474      27     1    112      92   0.6      0  0.17     0  0.55     0   0.28       293
N-Version             552      15     4      9      80  0.58   0.01  0.19  0.13  0.24  0.06   0.57         1
Quicksort              73       3     4      7      56  0.61   0.01  0.19  0.07  0.15  0.29   0.66         3
RacerAJ               576      13     4      9      15  0.54   0.01  0.23  0.16  0.09   0.2   0.68         5
RecoveryCache         222       3     4     14      72  0.58      0  0.11   0.1  0.21  0.11   0.68         6
Spacewar             1415      21     9     58     225  0.62   0.01  0.15  0.11  0.22  0.34   0.63        37
StarJ-Pool          38218     511     1      3      67  0.66      0  0.25     0     0     0   0.75        75
Telecom               277      10     4      5      32  0.62   0.01  0.21  0.07  0.02  0.06   0.77         7
Tetris               1043       8    18     27     498  0.59   0.04  0.16  0.23  0.01  0.05   0.82        14
TollSystem           5195      88    35     85    1677  0.59   0.03  0.26  0.14  0.06  0.26   0.68        20
Tracing               366       5    16    132     676  0.59   0.01  0.17  0.09   0.4  0.13   0.44         1
Totals:            173987    2137   298   5308   37079  0.59   0.01  0.18  0.06  0.16  0.23   0.66      1431

                                Table 4.1. Phase I: Analysis experimental results.
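
The averages quoted for Table 4.1 follow directly from its Totals row. The short check below recomputes them; the variable names are ours, and rounding to two decimals is an assumption about how the reported figures were produced.

```python
# Totals row of Table 4.1
total_loc = 173987
total_advice = 298
total_shadows = 5308
total_patterns = 37079
total_time_s = 1431

seconds_per_kloc = total_time_s / (total_loc / 1000)    # analysis time per KLOC
seconds_per_advice = total_time_s / total_advice        # analysis time per advice
patterns_per_shadow = total_patterns / total_shadows    # patterns derived per shadow

print(round(seconds_per_kloc, 2), round(seconds_per_advice, 2),
      round(patterns_per_shadow, 2))
```

The three values round to 8.22, 4.8, and 6.99, matching the per-KLOC and per-advice times and the patterns-per-shadow average reported in the discussion of the table.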
   Columns adv., shad., and patt., respectively, show the number of advice, shadows currently advised, and
intention patterns that were derived from the shadows (averaging 6.99 per shadow). For this experiment, we fixed the
maximum analysis depth parameter, which corresponds to the maximum number of arcs a pattern may consist of,
at 2, which clearly has an influence on the abs. column describing the abstractness of the patterns as defined in
equation (3.3), Figure 3.9; column σabs. indicates the corresponding standard deviation. Recall that the value
produced by equation (3.3) is a ratio of the number of wildcard elements to the number of concrete elements contained
within a pattern. Altering the depth parameter would directly influence this characteristic and may have significant
effects on the overall results. Although setting this value to 1 would theoretically improve the performance of
the tool, we chose a value greater than 1 since CCCs (e.g., logging) tend to affect a wide variety of programming
elements originating from heterogeneous functional modules of the architecture. Due to such diversity, it may
often become necessary to “peel back” layers of the architecture in order to infer the developer’s true intentions
behind creating a given PCE. Clearly, further experimentation would be required to truly validate this claim, as
well as to discover whether increasing the parameter to a value greater than 2 would have a notable effect. We leave
such endeavors to future work.
                                      benchmark        vers.  adv.  ∆pce  orig.    +    −
                                      Contract4J           5    25    13    260  272  215
                                      HealthWatcher        8    15     6     40   32   36
                                      Jakarta Cactus       6    20     4    553  200  245
                                      MobilePhoto          7    39    30     30   33   29
                                      Total:              26    99    53    883  537  525

                               Table 4.2. Phase II: Rejuvenation experimental subjects.

              benchmark         PPV  σPPV    FPR   σFPR   RCR  σRCR   RTR  σRTR     %    σ%  time (s)
              Contract4J       0.48  0.23  0.006  0.005  0.81  0.35  0.77  0.44  0.05  0.06      1046
              HealthWatcher    0.44  0.46  0.003  0.003  1.00  0.00  1.00  0.00  0.10  0.18       146
              Jakarta Cactus   0.76  0.13  0.019  0.027  1.00  0.00  1.00  0.01  0.04  0.03       370
              MobilePhoto      0.70  0.45  0.001  0.001  0.97  0.18  1.00  0.00  0.02  0.09       266
              Total:           0.62  0.40  0.004  0.009  0.94  0.23  0.94  0.23  0.04  0.10      1828

                                Table 4.3. Phase II: Rejuvenation experimental results.

   Columns α and σα refer to the average error α rate portrayed in equation (3.1) and its standard deviation,
respectively, for all derived patterns. Both α and σα values are weighted by the number of patterns produced
from the associated benchmarks; thus, we valued a higher precision for benchmarks containing more advised
shadows from which the patterns were produced⁴. The low error α value in Table 4.1 suggests a high correlation
of structural relationships amongst program elements corresponding to shadows captured by a particular PCE.
Columns β and σβ refer to the average error β rate portrayed in equation (3.2) and its corresponding standard
deviation for the same patterns. The low value here suggests not only that the correlation is high, as indicated by
the α column, but also that it is relatively widespread. Unlike the α value in the previous column, β and σβ values are
calculated using the arithmetic mean across all benchmarks. This is due to the error β rate being indicative of how
well the pattern matches elements corresponding to all shadows captured by the analyzed pointcut. We felt that
it was not particularly important for a single pattern to achieve this feat but that all patterns derived should do
so. Thus, the value located in the corresponding total row is weighted by the number of pointcuts analyzed, i.e.,
column adv. These values are then combined using the average pattern abstractness from equation (3.3) to form
the average confidence from equation (3.4) of all derived patterns in column conf.
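
The two weighting schemes described above can be checked against the per-benchmark rows of Table 4.1. The sketch below is only a consistency check under our reading of the text: it weights α by the patt. column and β by the adv. column, and compares the results with the Totals row after rounding to two decimals (the rounding is an assumption).

```python
# (benchmark, adv., patt., α, β) copied from Table 4.1
rows = [
    ("AJHotDraw", 32, 3362, 0.32, 0.06), ("Ants", 22, 1254, 0.15, 0.23),
    ("Bean", 2, 16, 0.24, 0.23), ("Contract4J", 15, 1809, 0.26, 0.44),
    ("DCM", 8, 2472, 0.15, 0.45), ("Figure", 1, 22, 0.11, 0.45),
    ("Glassbox", 55, 2620, 0.28, 0.13), ("HealthWatcher", 27, 1004, 0.21, 0.16),
    ("Jakarta Cactus", 4, 2151, 0.21, 0.52), ("LawOfDemeter", 5, 540, 0.15, 0.41),
    ("MobilePhoto", 25, 775, 0.23, 0.0), ("MySQL Connector/J", 2, 17564, 0.12, 0.58),
    ("NullCheck", 1, 92, 0.17, 0.55), ("N-Version", 4, 80, 0.19, 0.24),
    ("Quicksort", 4, 56, 0.19, 0.15), ("RacerAJ", 4, 15, 0.23, 0.09),
    ("RecoveryCache", 4, 72, 0.11, 0.21), ("Spacewar", 9, 225, 0.15, 0.22),
    ("StarJ-Pool", 1, 67, 0.25, 0.0), ("Telecom", 4, 32, 0.21, 0.02),
    ("Tetris", 18, 498, 0.16, 0.01), ("TollSystem", 35, 1677, 0.26, 0.06),
    ("Tracing", 16, 676, 0.17, 0.4),
]


def weighted_mean(values, weights):
    """Mean of `values` in which each entry counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


advice = [r[1] for r in rows]
patterns = [r[2] for r in rows]
alpha_total = weighted_mean([r[3] for r in rows], patterns)  # weighted by patt.
beta_total = weighted_mean([r[4] for r in rows], advice)     # weighted by adv.
```

Rounded to two decimals, the results are 0.18 and 0.16, which agree with the α and β entries of the Totals row.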

4.2.2     Phase II: Expression Recovery

   To evaluate how our tool helps developers recover PCEs, we rejuvenated pointcuts within 4 popular, open source
AspectJ projects selected from the benchmarks listed in Table 4.1. Table 4.2 gives an overview of the number of
versions analysed (vers.), affected advice (adv.), textual modifications to PCEs (∆pce), original shadows present
(orig.), and shadows added (+) or removed (−) throughout the various versions.
   In this part of the experiment, we aimed to determine how our tool would aid developers transitioning from
one version of a program P to the subsequent version P′. The modifications we targeted required recovering a
particular PCE A_pce, resulting in a revised PCE A′_pce. The experiment outlined in this section aims to assess how
our tool would have mechanically helped in discovering the join points that were to be included in A′_pce.
   Figure 4.1 illustrates the relationships among the shadows and PCEs between versions. The outer region
represents the union of the shadows contained within the original and new versions of the base-code. The region labeled
A_pce portrays where A_pce applies to P. Likewise, the region labeled A′_pce portrays where A′_pce applies to P′.
The relationships between shadows and versions are denoted by the 3 lines that cut through the regions. The line
labeled I designates the region containing shadows that are only in the original version of the base-code, while the line
labeled II designates the region containing shadows that are in both the original and revised versions of the base-code.
Conversely, the line in Figure 4.1 labeled III designates the region containing shadows that are only in the revised
version of the base code. The activity of discovering shadows that fall in this category is denoted as recovery.

    ⁴ Note that, by the nature of the algorithm, there is a direct correlation between the number of advised shadows and the number of
patterns produced.

               Figure 4.1. Venn diagram depicting relationships between PCEs in subsequent software versions.

   Table 4.3 depicts the results of the evaluation of our tool in its ability to automatically identify shadows that
were both retained and recovered due to a new version of the base-code. Column PPV and its corresponding
standard deviation σPPV signify a form of the average positive predictive value per PCE rejuvenation, adapted for
the assessment of ranked results, also known as r-precision, a stringent metric in information retrieval [23]. The
PPV is a ratio which signifies how well the tool was able to select positive results, i.e., shadows which fell into
regions II (retained shadows) and III (recovered shadows) as depicted in Figure 4.1. Conceptually, it is a ratio of
the count of shadows which were captured by both the manually revised version of the PCE and suggested by the
tool to the sum of the same count and that of the shadows suggested by the tool which were not captured by the
manually revised PCE. In other words, the numerator is the count of the shadows that the tool correctly inferred,
while the denominator is the count of all suggestions made by the tool; the closer the PPV is to 1, the better the
tool’s performance. The ranked version of PPV used in this experiment is quite stringent, and the average value of
0.62, as Table 4.3 depicts, indicates a respectable value in this context.
   While PPV measures the tool’s performance in predicting positive results, column FPR and its corresponding
standard deviation σFPR denote a form of the average false positive rate, also known as fall-out, per rejuvenated
PCE, similarly adapted for ranked results. FPR is defined as the ratio of the count of false positives to the total
count of all shadows which are not contained within any PCE, i.e., the sum of the count of false positives and true
negatives. False positives are shadows suggested by the tool that were not included in the manually
revised version of the PCE. Again, since the results are ranked, we only consider the top |A′_pce ∩ Ω_P′| results;
the closer the FPR is to 0, the better the tool’s performance.
             the FPR assessment. Unlike the PPV, the closer the FPR value advice body (possibly the one it is points associ-
                                                                       ated with classes, interfaces, and other Java types. We
                Figure 4.2 plots the obtained PPV against the FPR for each rejuvenation that aspects are indeed separate also
                                                                 adopt the perspective experiment; the resulting chart is                 from
                                                                       assume Light shaded points represent the ranked result
             also known as a receiver operating characteristic (ROC) plot. that we are able to statically identify all references
                                                                 the base-code;entities may only apply tobase-code. associ-  join points
             assessment, i.e., those summarized in Table 4.3, while the dark shaded advicecontainedsame results but without This as-
                                                                       to program points represent the within the
                                                                         with classes, runs through and other the use of
                                                                 ated The line y could interfaces, the through Java types. We also
             the confidence ranking solely for comparison purposes.sumption = x thatbe invalidated ROC space represents reflection
                                                                 assume that weclassable to statically identify all that the
             the “line of no discrimination;” any point that falls on this line is indicative of a “random” guess. Correspondingly, references
                                                                       and custom are loaders. Moreover, we assume
             points falling to the north west of this line represent predictions that are better than random, the point (0, 1)
                                                                       original source contained within the base-code. AJDT
                                                                 to program entitiescode successfully compiles under aThis as-
             being a perfect prediction, while points falling to the south east represent predictions that are worst than random.
                                                                 sumption they are to invalidated through the use of
                                                                        closer could be
             The closer the results are to the north west corner, the 1.6 compiler. mimicking the developer’s manual PCE reflection
                                                                            2

                                                                      Recall that join point refers to a we assume that the
                                                              and custom classa loaders. Moreover, well-defined point in
                                                                  the source of the base-code; thus, the definition a join
                                                              originalexecutioncode successfully compiles underof aAJDT
                                                              1.62point is dynamic in nature. A join point shadow, on the
                                                                   26
                                                                   compiler.
                                                                  other hand, refers to base-code corresponding to a join point
                                                                 Recall that a join point refers to a well-defined point in
                                                                  [50], i.e., a point in the program text where the compiler
                                                              the execution of the base-code; thus, the definition of a join
                                                                    may actually perform the weaving [39]. Whether or not the
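   [Figure: ROC plot with FPR on the x-axis and PPV on the y-axis, both ranging from 0 to 1. Legend: ranked result,
   unranked result. The diagonal is labeled "line of no discrimination (random guess)"; the region above it is labeled
   "better", the region below it "worse", and the point (0, 1) "perfect".]

               Figure 4.2. Rejuvenation results: Receiver operating characteristic (ROC) plot.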



revisions perfectly. Out of 99 PCEs rejuvenated, 94% of these points fell above the line of no discrimination.
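
The two ratios defined above can be illustrated with a small sketch. The Python below is not the authors' implementation: the function name `ppv_and_fpr`, its parameters, and the toy shadow names are hypothetical, and the `top_k` cutoff merely stands in for the ranked variant's restriction to the top |Apce ∩ ΩP| results. It also assumes that the negatives are all shadows lying outside the manually revised PCE.

```python
def ppv_and_fpr(ranked_suggestions, revised_pce, all_shadows, top_k=None):
    """Return (PPV, FPR) for one rejuvenated PCE.

    ranked_suggestions: shadows suggested by the tool, highest confidence first.
    revised_pce: set of shadows captured by the manually revised PCE.
    all_shadows: set of every shadow in the revised base-code.
    top_k: if given, only the first top_k suggestions are assessed
           (an assumed stand-in for the ranked variant).
    """
    considered = ranked_suggestions if top_k is None else ranked_suggestions[:top_k]
    suggested = set(considered)

    true_positives = suggested & revised_pce    # suggested and manually selected
    false_positives = suggested - revised_pce   # suggested but not selected
    negatives = all_shadows - revised_pce       # false positives + true negatives

    ppv = len(true_positives) / len(suggested) if suggested else 0.0
    fpr = len(false_positives) / len(negatives) if negatives else 0.0
    return ppv, fpr


# Toy data (hypothetical): eight shadows, three of them in the revised PCE.
shadows = {"a", "b", "c", "d", "e", "f", "g", "h"}
revised = {"a", "b", "c"}
ranking = ["a", "b", "d", "c", "e"]

unranked_ppv, unranked_fpr = ppv_and_fpr(ranking, revised, shadows)
ranked_ppv, ranked_fpr = ppv_and_fpr(ranking, revised, shadows, top_k=len(revised))
```

On this toy input the ranked assessment yields an FPR of 0.2 and a PPV of about 0.67, a point that lies above the line y = x in ROC space.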
   Figure 4.3 portrays the associated ROC curve, commonly used for the assessment of machine learning and
data mining techniques, for the results shown in Table 4.3. Similar to the ROC plot shown in Figure 4.2, the
chart portrays the line of no discrimination y = x; however, unlike the plot, only the ranked results are displayed.
Furthermore, the curve can be seen as plotting the fraction of true positives (PPV) vs. the fraction of false positives
(FPR). As shown, as the PPV, i.e., the ability of the tool to automatically predict shadows manually selected by
the developer, converges to perfect, the likelihood of predicting shadows that were not selected by the developer
also increases. However, what is interesting in this result is the slope of the curve between the origin and the FPR
of approximately 0.1. In this region of the curve, the tool is able to select positive results at a very low false positive
rate. Correspondingly, the area under the curve (AUC), an impressive 0.94 yielded by our experiments, can be
interpreted as the probability that the tool will rank a randomly chosen positive instance (i.e., a developer-selected
shadow) higher than a randomly chosen negative one (i.e., a shadow not selected by the developer) [10]. To place
this notion into our context, imagine that we randomly chose a single shadow from our experimental space. There
exist two possibilities for this shadow: (i) it was captured in the revised version of the PCE, or (ii) it was
not captured by the revised version of the PCE. Our results indicate that if the shadow falls into the first category,
the tool is 94% likely to assign this shadow a higher confidence than one that falls into the second category.
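
The probabilistic reading of the AUC given above can be checked directly by comparing every positive/negative pair. The following Python is a minimal sketch under stated assumptions: `auc_by_pairwise_ranking` is a hypothetical helper, ties are counted as half a win (a common convention, not something the report specifies), and the confidence values are invented for illustration.

```python
def auc_by_pairwise_ranking(confidences, selected):
    """Estimate AUC as the fraction of (positive, negative) pairs in which
    the positive shadow received the higher confidence.

    confidences: confidence value assigned to each shadow.
    selected: parallel list of booleans, True if the developer selected the shadow.
    """
    positives = [c for c, s in zip(confidences, selected) if s]
    negatives = [c for c, s in zip(confidences, selected) if not s]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative shadow")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie: assumed to count as half a win
    return wins / (len(positives) * len(negatives))


# Invented confidences for five shadows; three were selected by the developer.
auc = auc_by_pairwise_ranking(
    [0.9, 0.8, 0.7, 0.6, 0.5],
    [True, True, False, True, False],
)
```

Here five of the six positive/negative pairs are ordered correctly, so the estimate is 5/6.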
   Returning to the remaining columns of Table 4.3, recall that we defined recovery as the activity of identifying
shadows in a revised version of the base-code that should be captured by an existing PCE. Columns RCR and
corresponding standard deviation σRCR designate the recovery rate at which the tool was able to successfully infer
new shadows that were intended to be applicable to an existing pointcut as the software evolved. Relating this
concept to the Venn diagram in Figure 4.1, these shadows would fall into the region labeled III. Our tool was able
to discover 94% of such shadows.
   Recall that we defined retention to be the activity of identifying shadows that should remain captured by a
PCE despite evolution of the base-code. Columns RTR and corresponding standard deviation σRTR represent the
retention rate at which the tool was able to suggest shadows that were originally captured by an existing PCE.


   [Figure: ROC curve for the ranked results, with FPR on the x-axis and PPV on the y-axis, both ranging from 0 to 1;
   AUC = 0.94.]

                             Figure 4.3. Receiver operating characteristic (ROC) curve.



These are shadows that would fall into region II of the Venn diagram depicted in Figure 4.1. The results indicate
that the rejuvenation process was able to largely retain these original shadows.
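
As a rough illustration of the two rates, the sketch below computes them from sets of shadows. It is not the tool's code: the function and variable names are hypothetical, and it assumes that region II of the Venn diagram corresponds to shadows captured by both the original and the revised PCE, and region III to shadows captured only by the revised PCE. The report's own definitions, given earlier, take precedence over this reading.

```python
def recovery_and_retention(original_pce, revised_pce, suggestions):
    """Return (RCR, RTR) for one PCE, or None for a rate whose region is empty.

    original_pce: set of shadows captured by the PCE before evolution.
    revised_pce: set of shadows captured by the manually revised PCE.
    suggestions: set of shadows suggested by the tool.
    """
    to_retain = original_pce & revised_pce    # assumed region II
    to_recover = revised_pce - original_pce   # assumed region III

    rcr = len(suggestions & to_recover) / len(to_recover) if to_recover else None
    rtr = len(suggestions & to_retain) / len(to_retain) if to_retain else None
    return rcr, rtr


# Invented example: m3 drops out of the pointcut, m4 and m5 are new shadows.
rcr, rtr = recovery_and_retention(
    original_pce={"m1", "m2", "m3"},
    revised_pce={"m1", "m2", "m4", "m5"},
    suggestions={"m1", "m2", "m4", "m6"},
)
```

In this example both retained shadows are suggested (RTR of 1.0), while only one of the two new shadows is recovered (RCR of 0.5).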
   Further recall that the suggestions made by the tool are sorted by decreasing order of confidence. Columns
% and corresponding standard deviation σ% represent the average percentile (or position) at which the positive
suggestions (shadows that were both recovered and retained) fell in the list. We found that the
shadows that were both suggested by the tool and manually included by the developer fell, on average, in the top
4% of the tool's output. Column time depicts the time (in seconds) it took the tool to produce these results, averaging
approximately 15 seconds per KLOC, indicating that the tool is practical to use.
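
The average-percentile figure can be pictured with a short sketch. The Python below assumes that a suggestion's percentile is its 1-based position in the ranked list divided by the list length; that formula, the helper name `average_percentile`, and the shadow names are all illustrative assumptions rather than details taken from the report.

```python
def average_percentile(ranked_suggestions, revised_pce):
    """Average list position, as a percentage, of the positive suggestions.

    ranked_suggestions: shadows suggested by the tool, highest confidence first.
    revised_pce: set of shadows captured by the manually revised PCE.
    """
    total = len(ranked_suggestions)
    percentiles = [
        100 * position / total
        for position, shadow in enumerate(ranked_suggestions, start=1)
        if shadow in revised_pce
    ]
    if not percentiles:
        raise ValueError("no positive suggestions in the ranked list")
    return sum(percentiles) / len(percentiles)


# Ten invented suggestions; the developer kept the 1st and the 3rd.
ranking = ["s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8", "s9", "s10"]
avg = average_percentile(ranking, {"s1", "s3"})
```

A lower value means the positive suggestions sit nearer the top of the list; in this example they average out at the 20th percentile.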




Chapter 5

Related Work

5.1   Pointcut Fragility

   It is claimed that current PCE languages are not sufficiently expressive to represent the developer's true
intentions in capturing join points corresponding to a PCE [22], these difficulties being rooted in the inherent
fragility of typical PCE languages [21]. Several approaches [25, 9, 15, 20, 6, 31, 29] attempt to add expressiveness
to help combat this problem by altering or abstracting the underlying join point model. Others [5, 14] go even
further by proposing approaches that combat fragility in these models. The proposal presented in this paper
approaches the problem from a fundamentally different perspective, i.e., it combats pointcut fragility for current
languages (e.g., AspectJ [18]) by essentially maintaining a rich join point model underneath the current one. In
this view, the tool makes suggestions based on this rich model while affording the developer the luxury of using a
familiar AO language.

5.2   Aspects and Refactoring

   A technique for automatic update of PCEs upon various refactorings of the base code is presented in [34].
However, the associated tool updates PCEs only when predefined refactorings are invoked. Moreover, unlike the
tool presented in this paper, the approach is not able to update PCEs due to additions of new shadows introduced
in the new version of the base code.
   Given a set of join points, [3] clusters the join points based on common characteristics in the method names,
using lexical matching, for refactoring non-AO software to use aspects. However, the proposed approach does
not consider the PCE maintenance involved due to base-code evolution. Nevertheless, we foresee an interesting
scenario where the tool proposed by [3] may be integrated with our rejuvenation tool to automatically
cluster suggested shadows.

5.3   Automated Aspect-Oriented Software Development

   Several techniques, e.g., [26, 32, 2], aim to automate AOP-based development. However, they focus on analyzing
the changes in shadows between software versions so that the developer fully understands the impact of the
alteration of the base-code on advice behavior. In contrast, the focus of our approach is to infer shadows that likely
belong in a new version of the PCE based upon those changes.
   Automated tools such as AJDT and PointcutDoctor [36] display join points that currently and almost, respectively,
match a given PCE, but do not analyze the differences in properties exhibited by join points between versions
of the base-code.


5.4      Concern Traceability

  Techniques such as [27, 7] track and manage concerns in source code during evolution, respectively. While they
do not specifically deal with AOP, our approach may be seen as an adaptation of these approaches to the paradigm.
Also, these approaches are centered on generating intensional1 descriptions of the locations of concerns in source
code, whereas our approach is focused on inferring the developers' intent in creating PCEs.




  1 Note that the distinction made between intension and intention is deliberate in this context.


Chapter 6

Conclusion and Future Work

We have proposed an automated approach which limits the problems associated with pointcut fragility by assisting
the developer in rejuvenating pointcuts as the base-code evolves. Join points whose associated program elements
exhibit common patterns in the intentional structure of the base-code are suggested for incorporation into existing
PCEs. Not only do these results reveal the usefulness of such a tool, but they also provide insights into the design of
pointcut expressions.
   Our current research prototype is publicly available for download1. In its current state, the tool presents
the user with the new suggested join points for manual integration; however, in future versions of the tool, once
the selection is final, the pointcut will be rewritten using existing refactoring support [4] adapted for AspectJ
constructs.
   Potential future work entails incorporating aspect types and semantics of inter-type declarations into the
construction of the extended concern graph. Also, a more accurate assessment of the dynamic applicability of advice
may be an interesting avenue to explore, possibly using dynamic traces in the initial analysis. Dynamic analysis
may also be valuable in more accurately estimating the truth values associated with the relationships depicted by
the concern graph.




  1 http://tinyurl.com/6fdz55


Acknowledgments

We would like to thank Barthelemy Dagenais, Tao Xie, Alexander Egyed, Martin Robillard, Alfred V. Aho, Marc
Eaddy, and Linton Ye for their answers to our many technical and research related questions and for referring us
to related work.




Bibliography

 [1] J. Aldrich. Open modules: Modular reasoning about advice. In Eur. Conf. Object-Oriented Programming, 2005.
 [2] P. Anbalagan and T. Xie. APTE: Automated pointcut testing for AspectJ programs. In WTAOP, 2006.
 [3] P. Anbalagan and T. Xie. Automated inference of pointcuts in aspect-oriented refactoring. In International Conference
     on Software Engineering, 2007.
 [4] D. Bäumer, E. Gamma, and A. Kiezun. Integrating refactoring support into a Java development tool. In Conference on
     Object-Oriented Programming Systems, Languages, and Applications, 2001.
 [5] M. Braem, K. Gybels, A. Kellens, and W. Vanderperren. Automated pattern-based pointcut generation. In Software
     Composition, 2006.
 [6] W. Cazzola, S. Pini, and M. Ancona. Design-based pointcuts robustness against software evolution. In Reflection, AOP,
     and Meta-Data for Software Evolution, 2006.
 [7] B. Dagenais, S. Breu, F. W. Warr, and M. P. Robillard. Inferring structural patterns for concern traceability in evolving
     software. In Int. Conf. Automated Software Engineering, 2007.
 [8] J. Dean, D. Grove, and C. Chambers. Optimization of object-oriented programs using static class hierarchy analysis.
     In Eur. Conf. Object-Oriented Programming, 1995.
 [9] M. Eichberg, M. Mezini, and K. Ostermann. Pointcuts as functional queries. In Asian Symposium on Programming
     Languages and Systems, 2004.
[10] T. Fawcett. An introduction to ROC analysis. Pattern Recognition Letters, 2006.
[11] S. Fickas and B. R. Helm. Knowledge representation and reasoning in the design of composite systems. IEEE Trans.
     Softw. Eng., 1992.
[12] W. Griswold, K. Sullivan, Y. Song, M. Shonle, N. Tewari, Y. Cai, and H. Rajan. Modular software design with
     crosscutting interfaces. IEEE Software, 2006.
[13] S. Gudmundson and G. Kiczales. Addressing practical software development issues in AspectJ with a pointcut interface.
     In Advanced Separation of Concerns, 2001.
[14] K. Gybels and J. Brichau. Arranging language features for more robust pattern-based crosscuts. In Int. Conf. Aspect-
     Oriented Software Development, 2003.
[15] A. Kellens, K. Mens, J. Brichau, and K. Gybels. Managing the evolution of aspect-oriented software with model-based
     pointcuts. In Eur. Conf. Object-Oriented Programming, 2006.
[16] R. Khatchadourian, J. Dovland, and N. Soundarajan. Enforcing behavioral constraints in evolving aspect-oriented
     programs. In Foundations of Aspect-Oriented Languages, 2008.
[17] R. Khatchadourian and A. Rashid. Rejuvenate pointcut: A tool for pointcut expression recovery in evolving aspect-
     oriented software. To appear in the IEEE International Working Conference on Source Code Analysis and
     Manipulation, 2008.
[18] G. Kiczales, E. Hilsdale, J. Hugunin, M. Kersten, J. Palm, and W. Griswold. An overview of AspectJ. In Eur. Conf.
     Object-Oriented Programming, 2001.
[19] G. Kiczales, J. Lamping, A. Mendhekar, C. Maeda, C. Lopes, J. Loingtier, and J. Irwin. Aspect oriented programming.
     In Eur. Conf. Object-Oriented Programming, 1997.
[20] K. Klose and K. Ostermann. Back to the future: Pointcuts as predicates over traces. In Foundations of Aspect-Oriented
     Languages, 2005.
[21] C. Koppen and M. Stoerzer. PCDiff: Attacking the fragile pointcut problem. In Eur. Int. Workshop on Aspects in
     Software, 2004.
[22] M. Lippert and C. Lopes. A study on exception detection and handling using AOP. In International Conference on
     Software Engineering, 2002.



[23] C. D. Manning, P. Raghavan, and H. Schütze. Introduction to Information Retrieval. Cambridge University Press,
     2008.
[24] H. Masuhara, G. Kiczales, and C. Dutchyn. A compilation and optimization model for aspect-oriented programs. In
     International Conference on Compiler Construction, 2003.
[25] K. Ostermann, M. Mezini, and C. Bockisch. Expressive pointcuts for increased modularity. In Eur. Conf. Object-
     Oriented Programming, 2005.
[26] M. A. Perez-Toledano, A. Navasa, J. M. Murillo, and C. Canal. Titan: a framework for aspect oriented system evolution.
     In Software Engineering Advances, 2007.
[27] M. P. Robillard. Tracking concerns in evolving source code: An empirical study. In Int. Conf. Software Maintenance,
     2006.
[28] M. P. Robillard and G. C. Murphy. Concern graphs: finding and describing concerns using structural program
     dependencies. In International Conference on Software Engineering, 2002.
[29] K. Sakurai and H. Masuhara. Test-based pointcuts: a robust pointcut mechanism based on unit test cases for software
     evolution. In Linking aspect technology and evolution, 2007.
[30] L. M. Seiter. Role annotations and adaptive aspect frameworks. In Linking Aspect Technology and Evolution, 2007.
[31] J. Sillito, C. Dutchyn, A. D. Eisenberg, and K. D. Volder. Use case level pointcuts. In Eur. Conf. Object-Oriented
     Programming, 2004.
[32] M. Stoerzer and J. Graf. Using pointcut delta analysis to support evolution of aspect-oriented software. In Int. Conf.
     Software Maintenance, 2005.
[33] K. Sullivan, W. Griswold, Y. Song, Y. Cai, M. Shonle, N. Tewari, and H. Rajan. Information hiding interfaces for
     aspect-oriented design. In ACM SIGSOFT Symposium on the Foundations of Software Engineering, 2005.
[34] J. Wloka, R. Hirschfeld, and J. Hänsel. Tool-supported refactoring of aspect-oriented programs. In Int. Conf. Aspect-
     Oriented Software Development, 2008.
[35] G. Xu and A. Rountev. Ajana: a general framework for source-code-level interprocedural dataflow analysis of AspectJ
     software. In Int. Conf. Aspect-Oriented Software Development, 2008.
[36] L. Ye and K. D. Volder. Tool support for understanding and diagnosing pointcut expressions. In Int. Conf. Aspect-
     Oriented Software Development, 2008.



