APPROACH TO DETERMINING AN EXTERNAL PROBLEM FOR SELF-HEALING

Ubiquitous Computing and Communication Journal
Special Issue on Ubiquitous Computing Security Systems

                     Jeongmin Park, Joonhoon Lee, Hyunsang Youn, and Eunseok Lee
                           School of Information and Communication Engineering
                          Sungkyunkwan University, Suwon 440-746, South Korea

               Self-healing is a methodology used for constructing a system that can detect faults,
               recover by itself, and return from an abnormal state to a normal state. Much
               attention has recently been focused on the self-healing ability to recognize problems
               arising in a target system. However, providing self-healing functionality imposes a
               considerable burden, such as target system analysis and system environment analysis
               for external problems. Thus, this paper proposes a self-healing approach that uses
               the deployment diagram to determine problems arising in the external environment.
               The UML deployment diagram is widely used for the resource specification of a system
               and is generally designed in the system design phase. The approach consists of
               1) analyzing associations between software and hardware; 2) generating a monitor
               using constraints in deployment diagrams; and 3) adding the monitor to the component
               after adapting it to the specific software architecture. As proof of the approach,
               we automatically generated a resource monitor and applied it to a video conference
               system. We illustrate how the method detects anomalies using this example.

               Keywords: External problem, Problem Determination, External state

1   INTRODUCTION

    The complexity of the software execution environment poses new challenges for software
developers. When computer systems operate abnormally, detecting and resolving the problem
requires much time and effort. Therefore, software should adapt without human intervention
to achieve a self-healing ability. Self-healing is concerned with the ability of the system
to automatically recover from faults [1,2].
    Self-healing components have been the subject of several studies. For constructing a
system that facilitates self-healing, Shin et al. [3,4,5] propose a self-healing component
architecture. From the system's point of view, faults can be divided into two types: faults
that occur in software and faults that arise from resources such as CPU usage, RAM usage,
and bandwidth.
    However, their approach does not focus on faults arising from resources. The monitor for
self-healing in the Healing Layer must be implemented by the developer, which requires
additional effort in the software development process.
    In this paper, we describe an approach to generate the resource monitor automatically by
using a UML deployment diagram. The approach consists of the following steps:

•   Analyzing associations between software and hardware in the UML deployment diagram of a
    component.
•   Generating the resource monitor using constraints specified by the designer in the diagram.
•   Adding the monitor to the component after adapting the component's structure.

    Through our approach, the resource monitor can be generated automatically from the
deployment diagram of a target system. This is useful when implementing the resource monitor
for a component because it reduces the additional work needed for monitoring the resources.
    Component developers only need to modify parts of the automatically generated monitor
for adaptation, and they can easily add healing strategies to it. To illustrate the approach,
we tested our method by applying it to a video conference system. The monitor generated by
our method worked correctly when a resource problem occurred.
    The next section of the paper describes related work. Section 3 presents the approach in
more detail. Section 4 describes the evaluation of the approach. The paper ends with a
summary in Section 5.
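As a rough illustration of the three steps listed in this introduction, the sketch below analyzes a deployment description, generates a monitor from its constraints, and adds that monitor to a component. It is a minimal sketch under stated assumptions, not the paper's implementation (which was written in C#): the XML layout in `SAMPLE_DEPLOYMENT` is a simplified, hypothetical stand-in for real XMI, and the names `analyze_deployment`, `generate_monitor`, and `Component` are invented for this example. The limit values are the ones used in the evaluation (CPU max 80%, memory max 70%, bandwidth min 50 KB/s).

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the XMI exported from a deployment
# diagram. Real XMI is far more verbose; this layout is an assumption.
SAMPLE_DEPLOYMENT = """
<deployment>
  <node name="client">
    <constraint resource="cpu" max="0.8"/>
    <constraint resource="memory" max="0.7"/>
    <constraint resource="bandwidth" min="50"/>
  </node>
</deployment>
"""


def analyze_deployment(xml_text):
    """Step 1: collect the resource constraints attached to each node."""
    root = ET.fromstring(xml_text)
    result = {}
    for node in root.findall("node"):
        limits = {}
        for item in node.findall("constraint"):
            bounds = {}
            for key in ("min", "max"):
                if item.get(key) is not None:
                    bounds[key] = float(item.get(key))
            limits[item.get("resource")] = bounds
        result[node.get("name")] = limits
    return result


def generate_monitor(limits):
    """Step 2: build a monitor function from the constraints of one node."""
    def monitor(measured):
        # Return the names of resources whose measured value breaks a bound.
        violated = []
        for resource, bounds in limits.items():
            value = measured.get(resource)
            if value is None:
                continue  # nothing measured for this resource
            too_high = "max" in bounds and value > bounds["max"]
            too_low = "min" in bounds and value < bounds["min"]
            if too_high or too_low:
                violated.append(resource)
        return violated
    return monitor


class Component:
    """Step 3: a component that the generated monitor is added to."""

    def __init__(self, name):
        self.name = name
        self.monitor = None

    def attach(self, monitor):
        self.monitor = monitor
```

A caller would run `analyze_deployment(SAMPLE_DEPLOYMENT)`, pass the entry for `"client"` to `generate_monitor`, and attach the result to a `Component`; calling the attached monitor with measured values then returns the list of violated resources.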

UbiCC Journal – Volume 4                                                                                     670

2    RELATED WORK

   In this section, we present a self-healing component architecture [3,4,5] and an
Autonomic Failure-Detection algorithm [6], which is one of the failure detection methods.

2.1. Layered software architecture for a self-healing component

   Each self-healing component consists of a healing layer and a service layer [3,4,5]. The
service layer performs tasks requested by another task or component in the system. It
contains active objects, connectors, and passive objects, which are accessed by active
objects. An active object can execute another active object or a passive object. In
contrast, a passive object is called only by an active object; it cannot perform
independently unless another object calls it. The connectors transfer messages to or from
tasks and synchronize them.
   When the healing layer decides that an object in the service layer of the component has
become sick, the healing process is launched via connectors. The healing layer is composed
of six objects, as follows.

•   Component Monitor: This module observes the behavior of each object through messages
    from connectors in the service layer.
•   Component Reconfiguration Plan Generator: This module produces reconfiguration plans
    for when a fault occurs in the service layer. It also has information about the objects
    in the service layer.
•   Component Repair Plan Generator: This module constructs self-healing strategies for
    faulty objects. It has recovery plans for each object in the service layer.
•   Component Reconfiguration Executor, Component Repair Executor: These modules execute
    plans generated by the plan generators.
•   Component Self-healing Controller: This module controls the five modules above.

    This architecture has the following features.
•   The architecture can identify an object with faults.
•   Healing strategies for each object are
•   The architecture does not allow detailed mistakes.
•   Only faults that occurred in the component can be detected.

2.2. Autonomic Failure-Detection Algorithm

   Mills et al. [6] proposed an algorithm that detects failures automatically. In this
approach, objects and devices that need to be observed periodically send a signal to the
monitor, similar to a human heartbeat. The monitor can manage many components. It
determines whether the object or device has a problem by checking the signal over time. Let
H_p represent the period of a signal. The maximum time for detecting faults will then also
be H_p. However, faults can occur at any time during the signal period, so the average time
for detecting faults is H_p / 2. This algorithm can identify whether an object has problems
or not in a very short time. However, it has an overhead cost because it requires frequent
communication to exchange the signal between the monitor and the objects.

3   PROPOSED APPROACH

   In this paper, we present an improved self-healing component architecture that can
recover from resource problems. We do not focus on inner problems in this paper because
they are covered by Shin et al. [3,4,5]. The resource in this case could be independent of
the software. The monitor measures the state of resources periodically and decides whether
self-healing policies should be adopted or not. For this, we used a modified "heartbeat"
algorithm, in which the monitor sends the signal to the resources. Through this mechanism,
the resource monitor can measure values and detect anomalies.

3.1. Architecture for generating resource monitor

   The architecture can be divided into an analyzing phase and a generation phase. Figure 1
illustrates the flow of the structure.

•   UML Deployment Diagram: This is the input of the architecture. The diagram is
    transformed into XMI (XML Metadata Interchange) [7,8].


•   XMI Parser: The XMI parser analyzes the resource constraints and the associations with
    the resource. In the analyzing phase, the outputs are monitoring targets and
    constraints. These outputs are expressed in XML format.
•   Monitor Template Generator: The monitor template generator uses the output of the XMI
    parser. It generates a monitor template, which detects problems in the devices or
    resources selected for monitoring. This template is implemented in a specific language.
•   Configuring: The monitoring template code needs to be modified for adaptation. The
    software developer configures it for the structure of the software.
•   Resource Monitor: The resource monitor generated by the approach can be adapted to the
    software directly.

        Figure 1: Architecture for generating resource monitor

3.2. Process of approach
   We present the process, composed of 4 steps, in this section (Fig. 2).

•   Step1 - Specifying the system using a UML deployment diagram: Initially, the software
    developer creates a deployment diagram (Fig. 3). The deployment diagram represents a
    static aspect of the system in the UML design model and illustrates associations among
    components. Constraints proposed within the method are shown in Table 1. Method means
    the linking technique of the network or physical devices and is used for detecting
    abnormal terminations. Duration means the time that elapses until a fault is detected.
    It can also be regarded as the waiting time in the method; initially, its value is
    1 second. This value is used as a setting value for the experiments and can be changed
    for any system environment.

                    Table 1: Constraints List
    Contents        Input                           Unit
    CPU usage       0.0 ~ 1.0                       Percent
    Memory usage    0.0 ~ 1.0                       Percent
    Heartbeat       0.1 ~ 1.0                       Second
    Bandwidth       User defined minimum            KB/s
    Method          User defined connection type
    Duration        Duration time for detecting     Second

•   Step2 - Analyzing diagram: First, the nodes (for example, client and server) are
    identified in the system. Next, constraints for resources, such as the constraints on
    CPU, memory, bandwidth, and heartbeat rate, are identified. The Parsing Engine parses
    XMI information (Fig. 4) and generates XML about the two

        Figure 2: Process of approach (4 steps)

•   Step3 - Generating monitor template: In this step, the template for an executable resource


    monitor is generated by using the information analyzed in the previous step. The
    Template Generator (TG) generates the monitor by analyzing the XML produced by the
    Parsing Engine. It also generates fault processing and anomaly detection routines for
    each constraint (Fig. 2).

        Figure 3: Deployment diagram example

        Figure 4: XMI information and constraints model derived from a deployment diagram

•   Step4 - Composing monitor: In this step, a developer modifies the resource monitor
    according to the software environment. The fault processing handler or guidelines are
    actually implemented at the monitor template generation level by the approach. The
    developer also customizes the parts that are needed or that must be modified.
    Afterwards, a resource monitor is added to the self-healing layer or

3.3. Problem detection algorithm

   In this section, we describe how the autonomic failure-detection algorithm was adapted
for our approach (Fig. 5). The resource monitor in the self-healing layer judges the state
of the system as abnormal if a request is sent to the devices or resources and the reply
does not return within the period. The state is also regarded as abnormal if the values of
the resource violate a constraint. In this case, the self-healing layer should construct a
reconfiguration plan and perform it. Unlike in related works, L_max and L_avg, the maximum
and average fault-detection latencies, are 1.5 times longer than before because the monitor
sends the signal first. The monitor determines that a resource is still in the normal state
if a fault has occurred just after the resource replied to the monitor. At this time, the
monitor sends a signal that starts a new cycle for the resource. However, the resource is
actually faulty, and a cycle is wasted because the resource is already in trouble.
Therefore, our approach takes more time to detect faults than related works.

        Figure 5: Error detection algorithm

3.4. Self-healing components including resource
   Resource monitoring is illustrated in Fig. 6. The figure shows the devices and the
self-healing component architecture extended with resource monitoring. The modified
architecture adds devices and resource monitoring to the self-healing component
architecture designed by E. Shin [2, 3].
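The detection rule of Section 3.3 can be sketched roughly as follows: the monitor asks a resource for its current value, treats a missing reply within the waiting time as abnormal, and otherwise compares the value against the constraint. This is a minimal sketch under stated assumptions rather than the generated C# monitor itself. The names `Constraint`, `check_resource`, and `find_abnormal`, the `probe` callback convention (return the measured value, or `None` when no reply arrives in time), and the status strings are all hypothetical; the limits (CPU max 80%, bandwidth min 50 KB/s) and the 1 second default waiting time are the values stated in the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Constraint:
    """A resource limit taken from the deployment diagram (hypothetical type)."""
    name: str
    minimum: Optional[float] = None
    maximum: Optional[float] = None

    def violated_by(self, value: float) -> bool:
        if self.minimum is not None and value < self.minimum:
            return True
        if self.maximum is not None and value > self.maximum:
            return True
        return False


# A probe sends the signal to a resource and waits up to `timeout` seconds.
# It returns the measured value, or None if no reply came back in time.
Probe = Callable[[float], Optional[float]]


def check_resource(probe: Probe, constraint: Constraint, timeout: float = 1.0) -> str:
    """Classify one resource for one monitoring cycle."""
    value = probe(timeout)
    if value is None:
        return "abnormal: no reply"
    if constraint.violated_by(value):
        return "abnormal: constraint violated"
    return "normal"


def find_abnormal(probes: Dict[str, Probe],
                  constraints: Dict[str, Constraint],
                  timeout: float = 1.0) -> List[str]:
    """Run one cycle over all resources and list those judged abnormal."""
    abnormal = []
    for name, probe in probes.items():
        status = check_resource(probe, constraints[name], timeout)
        if status != "normal":
            abnormal.append(name)
    return abnormal
```

In this sketch the healing layer would call `find_abnormal` once per cycle and build a reconfiguration plan for every resource it returns. Because the monitor initiates each exchange, a fault that occurs just after a reply is only noticed in a later cycle, which is the source of the longer detection latency discussed above.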


        Figure 6: Proposed self-healing component architecture

    In this architecture, three objects used for detecting resource problems and
reorganizing are added to the six objects used for healing components. The added objects
are divided into three parts: External Resource Monitor, External Resource Reconfiguration
Plan Generator, and External Resource Reconfiguration Executor.
   The External Resource Monitor checks the status of external devices and resources. The
External Resource Reconfiguration Plan Generator makes organizational plans for service
levels in accordance with external situations. The External Resource Reconfiguration
Executor executes the plans.
    The purpose of the External Resource Reconfiguration Plan Generator is to make plans
that prevent other well-operating objects from being affected by other resources by
isolating objects that are easily influenced by resources, similar to the organization of
the component
   The Self-healing Controller, which controls the objects in the self-healing layer,
directs the resource reconfiguration executor to perform a reconfiguration of the service
layer. When it comes to external errors, it performs in the same way and allows anomalies
of the service layer by minimizing resources.

4   IMPLEMENTATION AND EVALUATION

   To evaluate the algorithm, we used the basic design of a video-based conference system.
The purpose of this system is to successfully conduct a video-based conference. During the
meeting, the client should not be interrupted by problems external to the software. In this
paper, the purpose was to check whether the client detected errors that arose from the
software's external problems after the resource monitor was generated automatically by the
approach and applied to the client in the video-based conference system.

        Figure 7: Parsing Engine Prototype

        Figure 8: Template Generator Prototype

4.1. Environments

   To evaluate this approach, we implemented clients of a video conferencing system based
on .NET Framework 2.0. We used C# for the implementation on MS Windows XP. We used Borland
Together for UML modeling. The server was


implemented with Java2 SDK 1.4. The client additionally used DirectShow.NET for the video
device. A deployment analyzer and the resource monitor template were also implemented in C#.
Fig. 3 illustrates the deployment diagram that we used. Fig. 7 and Fig. 8 illustrate the
Parsing Engine prototype and the Template Generator.

4.2. Normal case

   The resource monitor continues to monitor the resource as long as the resource performs
its work without any anomalies.

4.3. Abnormal case

   The monitor detects an abnormal state when the measured value is outside the normal
range or the connection with the other resources is accidentally terminated. Figure 9
illustrates the case when the CPU usage was in excess of 80%. In this paper, we did not
focus on self-healing strategies. Therefore, strategies for healing the faulty state were
generated by the administrator.

        Figure 9: Detection of anomalies of CPU by monitor

4.4. Objective of evaluation and the results

   The purpose of the evaluation is to determine whether the approach recognizes error
situations within a designated time in the target system, and to compare a target system to
which the approach is applied with one to which it is not, when errors occur in the
resources. We used tools such as a benchmarking program and forced server termination to
create extreme situations in the system. Additionally, we added a routine that immediately
reports the time when errors occur in the client, and a routine that prints the error time
in the resource monitor, in order to evaluate the Failure-Detection Latency accurately. The
detection results for various constraints are listed in Table 2. The error detection time,
which was measured 10 times for the CPU, is shown in Fig. 10.

        Table 2: Experimental results of the monitoring
    Check list         Constraints                     Success of detection
    CPU usage          Max 80%                         Success
    Memory usage       Max 70%                         Success
    Bandwidth usage    Min 50KB/s                      Success
                       Abnormal network termination    Success

        Figure 10: Error detection time of resource monitor

   As a result of the evaluation, the resource monitor detected all four items for which
constraints were set up. Even though there were differences in the average fault detection
time, we were able to verify that the resource monitor could detect each fault within the
maximum fault detection time.

5   CONCLUSION
   This paper proposed an approach to reduce the efforts of a self-healing developer and
offered a software architecture that detects the resources available. The production of
resource monitors can be automated by using the deployment diagram. The advantages are
listed below.


•   The resource monitor production is automated.
•   A strategy is in place in the case of faults in resources.

   Until now, developers have had to spend extra effort to implement the monitor that
checks resources for the software. In this study, we confirmed that resource monitors that
can be included in a self-healing component can be generated automatically from a
deployment diagram. To evaluate this, we built a prototype component and confirmed that the
detection monitor operated correctly when an abnormal situation occurred.
   However, we could not overcome the high overhead, since signals must be exchanged
frequently if errors are to be detected. To solve this problem, a study that investigates
self-regulating cycles of exchanging signals between monitors is needed. The study of
automation in self-healing strategies for recovering from faulty states remains future work.

   This work was supported by the Korea Science and Engineering Foundation (KOSEF) grant
funded by the Korea government (MEST) (No. 2009-0077453) and by the Faculty Research Fund
(2008) of Sungkyunkwan University. Corresponding author: Eunseok Lee.

[1] B. Topol, D. Ogle, D. Pierson, J. Thoensen, J. Sweitzer, M. Chow, M. A. Hoffmann,
    P. Durham, R. Telford, S. Sheth, T. Studwell, "Automating problem determination: A
    first step toward self-healing computing system", IBM white paper (2003).
[2] D. Ghosh, R. Sharman, H. R. Rao, S. Upadhyaya, "Self-healing - survey and synthesis",
    Decision Support Systems in Emerging Economies, Vol. 42, Issue 4, pp. 2164-2185 (2007).
[3] Michael E. Shin, "Self-healing component in robust software architecture for concurrent
    and distributed systems", Science of Computer Programming, Vol. 57, No. 1, pp. 27-44
    (2005).
[4] Michael E. Shin and Jung Hoon An, "Self-Reconfiguration in Self-Healing Systems",
    Proceedings of the Third IEEE International Workshop on EASE'06, pp. 89-98 (2006).
[5] Michael E. Shin, Jung Hoon An, "Self-reconfiguration in self-healing systems",
    Proceedings of the Third IEEE International Workshop on EASE'06, pp. 106-116 (2006).
[6] K. Mills, S. Rose, S. Quirolgico, M. Britton, C. Tan, "An autonomic failure-detection
    algorithm", ACM SIGSOFT Software Engineering Notes, Vol. 29, Issue 1, pp. 79-83 (2004).
[7] G. Booch, J. Rumbaugh, I. Jacobson, "The Unified Modeling Language User Guide", Addison
    Wesley, pp. 100-150 (1999).
[8] XMI Online Document,


Description: UBICC, the Ubiquitous Computing and Communication Journal [ISSN 1992-8424], is an international scientific and educational organization dedicated to advancing the arts, sciences, and applications of information technology. With a world-wide membership, UBICC is a leading resource for computing professionals and students working in the various fields of Information Technology, and for interpreting the impact of information technology on society.