					Middleware Services for Dynamic Clustering of Application Servers (Ph.D. Thesis)



                 Giorgia Lodi




         Technical Report UBLCS-2006-06

                  March 2006




         Department of Computer Science
             University of Bologna
             Mura Anteo Zamboni 7
             40127 Bologna (Italy)
The University of Bologna Department of Computer Science Research Technical Reports are available in
PDF and gzipped PostScript formats via anonymous FTP from the area ftp.cs.unibo.it:/pub/TR/UBLCS
or via WWW at URL http://www.cs.unibo.it/. Plain-text abstracts organized by year are available in
the directory ABSTRACTS.

Middleware Services for Dynamic Clustering of Application Servers (Ph.D. Thesis)

Giorgia Lodi1



Technical Report UBLCS-2006-06

March 2006


Abstract
Nowadays, middleware is a well-established technology developed to ease the implementation of
distributed applications. Among these applications, this thesis focuses on the so-called enterprise
applications.
    Usually, these applications exhibit stringent Quality of Service (QoS) requirements, which are to be
met in order to enable them to carry out their tasks effectively.
    QoS has been widely defined in the literature; for the purposes of this thesis it is intended to be a set of
non-functional application requirements that include availability, scalability, reliability and timeliness.
    In current industry practice, these requirements are usually specified within so-called Service Level
Agreements (SLAs) that, in the context of this work, are contracts that regulate the relationships
between application owners and middleware platform providers.
    Application owners own applications that can be deployed, run and maintained using component-based
technologies termed application servers. These technologies support clustering of application server in-
stances for scalability, load balancing and fault-tolerance purposes; however, current clustering mech-
anisms can only partially meet the above-mentioned non-functional (i.e., QoS) requirements of the
applications they host, as these mechanisms are not designed to be QoS-aware. In this thesis, QoS-aware
application servers (QaASs) are component-based technologies whose clustering mechanisms are
capable of meeting QoS requirements specified within SLAs. Hence, the thesis proposes
the design, implementation and experimental evaluation of an open source middleware architecture, con-
structed out of QoS-aware middleware services, that extends current application server technology so
as to create QaASs.
    Specifically, the thesis focuses on three principal middleware services, termed Configuration Service
(CS), Monitoring Service (MS) and Load Balancing Service (LBS). According to the QoS specifications
included in SLAs, the CS, MS and LBS are responsible for (i) configuring clusters of application
servers in order to meet the SLA requirements; (ii) monitoring and adapting those clusters in case the
QoS delivered by the cluster resources deviates from that required by the applications; and (iii)
distributing the client requests among application server instances in the clusters so as to honor SLAs.
    Experimental evaluations of the QoS-aware middleware services described in this thesis show that
these services can be used effectively to extend current application server technology so as to enable that
technology to meet its SLAs.
    Part of the work described in this thesis has been developed within the context of the EU funded project
TAPAS [TAP] and deployed in production by the German company Adesso AG [ade], which partici-
pated in the TAPAS project.



1. Department of Computer Science, University of Bologna, Mura Anteo Zamboni 7, 40127 Bologna, Italy.


Chapter 1

Introduction

This thesis proposes the design, implementation, and experimental evaluation of a collection of
QoS-aware middleware services, named Configuration Service (CS), Monitoring Service (MS) and
Load Balancing Service (LBS). These services are intended to be part of a generic open source mid-
dleware architecture, termed QoS Management subsystem, that this thesis describes in its design
and implementation. The principal objective of this middleware architecture is to extend cur-
rent application server technologies in order to make them QoS-aware, that is, in order to make
application servers capable of honoring SLAs.
    The contribution of this work is then to demonstrate that the aforementioned services can
configure, monitor and dynamically adapt clusters of application servers, in order to provide
distributed applications with such QoS guarantees as scalability, availability, reliability and time-
liness. The applications considered in this thesis, and generally hosted by application servers,
are enterprise applications; usually, enterprise applications are responsible for carrying out such
business functions as managing supply chains, managing customer relationships, and so forth.
Examples of these types of applications include e-commerce, automated stock-trading, and bank
asset management systems.
    The motivations for this work can be summarized as follows.


1     Motivations
A middleware platform is generally used as an architectural component for supporting the de-
velopment and the execution of distributed applications. Its main role is to create a level of
abstraction so as (i) to present a unified programming model to application developers and (ii)
to mask out problems of system and network heterogeneity.
   Middleware can be composed of multiple layers. Four principal levels can be identified
[Sch02]:
    • Host Infrastructure Middleware: encapsulates and enhances native operating system communication and concurrency mechanisms to create portable and reusable network programming components;
    • Distribution Middleware: defines higher-level distributed programming models whose reusable APIs and mechanisms automate the native operating system network programming capabilities encapsulated by the previous level;
    • Common Middleware Services: the services of this level augment the distribution middleware layer by defining higher-level, domain-independent components that allow application designers to concentrate on the application logic only;
    • Domain-specific Middleware Services: these services are tailored to the requirements of a specific application domain and embody knowledge of that domain.





                [Figure 1. Levels of QoS Integration: Application Level; Middleware Level (with QoS facilities); Operating System and Communication Level.]



     Nowadays, middleware technology is widely adopted in order to ease the development of
distributed applications; however, it is important that the middleware remains effective for
applications (e.g., enterprise applications) that impose demands in terms of resource availability,
adaptivity, reliability, scalability, and timeliness. In fact, these applications must operate under
changing environmental conditions and present stringent Quality of Service (QoS) requirements
that must be met in order to guarantee their correct behavior.
     QoS has been widely defined in the literature; it usually refers to the collective effects
and performance of services that determine the degree of end-user satisfaction in using
those services. That satisfaction is generally associated with a set of non-functional requirements
(i.e., QoS requirements) that include dependability, reliability, timeliness, throughput and secu-
rity.
     Issues of QoS have been principally addressed in the design of mechanisms that allow pro-
grammers to control communication parameters such as network throughput, packet delay, mes-
sage loss and delay jitter (e.g. RSVP [ea97], IntServ [BCS94], DiffServ [ea98]) over QoS-enabled
communication technologies (e.g., ATM [Vet95]) [FLPS03]. These parameters indeed affect the
user-perceived QoS of distributed applications. However, further QoS requirements emerge at
the application level and they cannot be fully met at the communication level, as they fall outside
the responsibility of this level (rather this level may provide support which can be crucial for
meeting them) [Ghi01]. Thus, QoS can be thought of as a pervasive system property, which is to
be preserved through a set of QoS functionalities (e.g., QoS negotiation, monitoring, adaptation,
etc). These functionalities are to be integrated in every subsystem of the software infrastructure,
from the communication subsystem level, up to the application level (end-to-end QoS).
     Figure 1 depicts the levels of the software infrastructure at which a QoS management system
should be provided. For example, at the operating system level, there should be mechanisms
for reserving such resources as CPU, memory and threads; the communication level should
provide applications with mechanisms for network monitoring and reservation; the middleware
level should be constructed out of services for QoS negotiation, monitoring and adaptation;
finally, QoS monitoring and adaptation can be applied at the application level as well, by allowing
this level to monitor and adapt the QoS it may require.
     This thesis focuses on the middleware level of the stack shown in Figure 1, so as to provide
distributed applications that exhibit QoS requirements with a collection of QoS services embodied
in that level.

1.1 Service Level Agreements and QoS Specifications
In current industry practice, QoS requirements are commonly specified within legally binding
contracts termed Service Level Agreements (SLAs) [Lod04].
    In particular, SLAs are used to define the service guarantees an application hosting environ-
ment has to provide its hosted application with, and the metrics to assess the quality of service
delivered by that environment. The definition of such SLAs is a complex task, and is outside the
scope of this thesis (relevant works include [SLE04], [MJSCG04], [Con]). However, it is worth
mentioning here the principal requirements that can be defined in industry SLAs. These require-
ments fall into seven principal categories, summarized below.
    • Performance: defines a quantitatively characterized service. The quantitative parameters
      are typically arrival rate, service time, service rate and timeliness of the service;

    • Availability: defines the duration during which a service or application is guaranteed to
      be available;
    • Reliability: expressed in terms of mean time between failures of the service;
    • Maintainability: defines the maximum time required to recover from a failure of the service;

    • Security: defines the level of security required;
    • Monitoring: necessary in order to produce the service level statistics;
    • Penalty: includes all the penalty clauses to be paid in case one of the two parties
      involved in the SLA cannot fulfill the contract.
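The categories above could, for instance, be encoded as a simple data structure. The following Python sketch is purely illustrative: the field names and values are hypothetical, not taken from any actual industry SLA, and the compliance check is deliberately minimal.

```python
from dataclasses import dataclass

@dataclass
class SLA:
    """Hypothetical SLA capturing one quantitative parameter per category."""
    max_response_time_ms: float   # Performance: timeliness of the service
    min_availability: float       # Availability: fraction of time the service is up
    min_mtbf_hours: float         # Reliability: mean time between failures
    max_recovery_time_s: float    # Maintainability: max time to recover from a failure
    penalty_per_violation: float  # Penalty: cost paid when the contract is not fulfilled

    def is_met(self, rt_ms, avail, mtbf_h, recovery_s):
        """Return True if every observed metric satisfies the contract."""
        return (rt_ms <= self.max_response_time_ms
                and avail >= self.min_availability
                and mtbf_h >= self.min_mtbf_hours
                and recovery_s <= self.max_recovery_time_s)

sla = SLA(max_response_time_ms=500, min_availability=0.999,
          min_mtbf_hours=720, max_recovery_time_s=60,
          penalty_per_violation=100.0)
```

In practice an SLA also specifies the monitoring regime and the timeframe over which each metric is evaluated; those aspects are omitted here.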


2     General Approach
In light of the above observations, the principal aim of this work is to demonstrate that
the middleware level of Figure 1 can be extended with a collection of QoS services capable of
meeting the QoS requirements specified in SLAs and exhibited by distributed applications (e.g.,
enterprise applications).
    It is worth observing that most such distributed applications are generally developed by
following a component-based approach. With this approach, applications can be constructed out
of reusable software components (e.g., client, web, enterprise components), each specialized to
carry out specific business functions. Usually, these component-based applications are deployed,
run and maintained by means of specific middleware technologies, which assume a relevant role
in the context of this thesis. These technologies, termed application servers, are indeed middleware
architectures that consist of a collection of middleware services useful for the development and
deployment of component-based applications [FR03]. There exists a variety of such middleware
architectures; for example, those implementing the Java 2 Enterprise Edition (J2EE) specifications
(e.g., JBoss [jbo], JOnAS [jon], WebSphere [web]), the CORBA Component Model (CCM) [WSO00]
and .Net [Pro02]; among them, this thesis discusses those that implement the J2EE specifications.
    Thus, J2EE application servers allow developers to construct distributed applications out
of reusable and interoperable software components (e.g., commercial operating systems, com-
munication protocols, middleware services). In addition, they support clustering of application
server instances in order to host the aforementioned component-based applications. Therefore,
they provide distributed applications with clustering solutions for scalability, load balancing and
fault-tolerance purposes.








                                         Figure 2. Scenario



    However, these middleware technologies present a severe limitation: their clustering mech-
anisms can only partially meet the aforementioned non-functional requirements (i.e., QoS
requirements) of the applications they host, as they are not fully instrumented for meeting those
requirements (i.e., they are not designed to be QoS-aware).
    Therefore, the principal objective of this thesis is to design, implement and evaluate a middle-
ware architecture, constructed out of QoS-aware middleware services, that extends current applica-
tion servers in order to create so-called QoS-aware application servers (QaASs).
    QaASs are defined as standard application servers extended with clustering mechanisms that
can meet QoS requirements specified with SLAs (i.e., clustering mechanisms capable of honoring
SLAs).
    The above QoS-aware middleware services can be located in the Common Middleware Ser-
vices layer, according to the previous definition, and are capable of providing application servers
with mechanisms for dynamic resource configuration, monitoring, adaptation and QoS-aware
load balancing. They enforce strategies for assessing the correct amount (and the characteristics)
of the resources necessary to meet the QoS application requirements included in SLAs.
This assessment entails determining the amount of resources needed to deliver the required QoS
levels, so as to honor those SLAs.
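As an illustration of such an assessment, a back-of-the-envelope capacity estimate can relate the offered load to the number of instances required. The sketch below is an assumption-laden simplification (homogeneous nodes, a fixed per-node utilization ceiling chosen arbitrarily at 70%), not a strategy taken from this thesis.

```python
import math

def min_nodes(arrival_rate, service_time, max_utilization=0.7):
    """Estimate the minimum number of identical application server
    instances needed to sustain the offered load.

    arrival_rate    -- client requests per second arriving at the cluster
    service_time    -- mean seconds one instance needs to serve one request
    max_utilization -- per-node utilization ceiling kept below 1.0 so that
                       response times stay bounded (an arbitrary assumption)
    """
    offered_load = arrival_rate * service_time        # total work to absorb
    return max(1, math.ceil(offered_load / max_utilization))

# e.g. 100 req/s at 50 ms per request with a 70% utilization cap
nodes_needed = min_nodes(100, 0.05)
```

Real assessment strategies would also account for heterogeneous node capacities and for the specific QoS metric (e.g., a response time bound) stated in the SLA.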
    In general, applications hosted in application servers (e.g., web applications, web services)
are characterized by high load variance; hence, the amount of resources needed to meet their
SLAs may vary notably over time. Thus, in order to ensure that the SLA of an application is
not violated, one can adopt a resource over-provision policy; this policy can be based on evaluating
(e.g., via application modeling or application benchmarking) the amount of resources the appli-
cation may require in the worst case, and statically allocating these resources to that application.
Typically, this policy leads to a largely suboptimal utilization of the hosting environment
resources, as allocated resources may remain unused at run time.
    In the vast majority of state-of-the-art application servers, the hosting environment resources
are represented by application server instances, deployed in several nodes (i.e., workstations
connected through the same LAN) for scalability and fault tolerance purposes.
    Figure 2 illustrates the scenario considered in this work. As shown in Figure 2, clients, typi-
cally connected to a network, use an application that can be deployed in a clustered application
server.
    Within such a cluster, one machine is dedicated to the load balancing process (LBS in the
above Figure), that is, that machine receives the incoming client requests and dispatches them to
the appropriate clustered machine that has been selected by the LBS through a predefined load
balancing policy.



    In the above scenario, the client requests are addressed to the LBS and filtered by means of a
gateway, which is used in order to cope effectively with a possible crash of the LBS itself.
    In this context, an optimal resource utilization can be achieved by allocating to an application,
and maintaining at run time, the minimum number of clustered nodes required to meet the ap-
plication SLA. To this end, a dynamic cluster configuration mechanism is required that can adapt
the cluster configuration to possible variations of the application load by adding (or removing)
nodes in the cluster, at run time [LPRT05]. Specifically, in the scenario illustrated in Figure 2,
nodes can be added to the cluster, in order to cope with unexpected client load, or released from
the cluster when no longer necessary.
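This add/remove behavior can be sketched as a simple control loop. The thresholds below are hypothetical policy knobs for illustration only; they are not the actual reconfiguration strategy implemented by the CS.

```python
def resize_cluster(nodes, observed_load, capacity_per_node,
                   scale_up_threshold=0.8, scale_down_threshold=0.5):
    """Return the new cluster size given the currently observed load.

    nodes             -- current number of clustered QaAS instances
    observed_load     -- measured aggregate load (e.g., requests/s)
    capacity_per_node -- load one node can carry while honoring the SLA
    The thresholds are assumed values: grow when average node utilization
    exceeds scale_up_threshold; shrink when the load would still stay
    below scale_down_threshold with one node fewer.
    """
    utilization = observed_load / (nodes * capacity_per_node)
    if utilization > scale_up_threshold:
        return nodes + 1          # acquire a spare node on demand
    if nodes > 1 and observed_load / ((nodes - 1) * capacity_per_node) < scale_down_threshold:
        return nodes - 1          # release a node no longer necessary
    return nodes
```

The gap between the two thresholds provides hysteresis, preventing the cluster from oscillating between configurations under a fluctuating load.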
    This behavior can take advantage of service provision models such as utility computing
[IBM04] or on-demand computing (or even grid computing [FK98]), where resources are made
available as needed. Hence, the new nodes can be acquired on demand by either data-centers or
specific clusters of spare resources that can be allocated for this purpose, as shown in Figure 2.
    In this thesis, the QoS Management subsystem middleware architecture provides such a
dynamic clustering technique, which enables effective resource management. The designed
middleware architecture thus enables a clustered QoS-aware environment, termed QoS-aware
clustering, in which every clustered node is an instance of the created QaAS.
    The QaAS nodes of the cluster cooperate with one another, so as to avoid resource over-
provision and provide applications with an optimal resource utilization.

2.1 QoS-aware Middleware Services
To this end, each QaAS in the cluster embodies a collection of QoS-aware middleware services.
Specifically, this collection of services includes a Configuration Service, a Monitoring Service and
a Load Balancing Service.
   The principal functionalities carried out by these services are summarized as follows.

Configuration Service The main responsibilities of the CS consist of (i) configuring the application
server cluster at application deployment time, so as to meet an SLA, and (ii) possibly reconfig-
uring the cluster at run time, if the QoS the cluster delivers deviates from that specified in that
SLA. The reconfiguration consists of varying dynamically the cluster configuration at run time,
by adding or removing nodes.

Monitoring Service The MS is in charge of monitoring the aforementioned QoS-aware clustering
in order to detect, at run time, variations of the QoS delivered by the clustered servers. If
variations occur, the MS cooperates with the CS for cluster reconfiguration purposes. The MS
carries out its activity based on specific thresholds, termed warning and breaching points.
Specifically, the breaching points can be derived from the requirements included in the SLA. In
contrast, the warning thresholds can be calculated based on, for example, the expected load, or
even specific policy decisions concerning the risks one wishes to assume as to the violation of an
SLA.
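The interplay of the two thresholds can be illustrated as follows. The 20% warning margin is an assumed policy choice standing in for the risk-based calculation described above, not a value prescribed by the thesis.

```python
def classify_qos(observed_response_ms, breaching_ms, warning_margin=0.2):
    """Classify an observed QoS value against the MS thresholds.

    breaching_ms   -- breaching point, derived from the SLA requirement
    warning_margin -- hypothetical policy choice: the warning point sits
                      20% below the breaching point, reflecting how much
                      risk of SLA violation one is willing to take
    """
    warning_ms = breaching_ms * (1.0 - warning_margin)
    if observed_response_ms >= breaching_ms:
        return "breached"   # SLA requirement violated: reconfiguration needed
    if observed_response_ms >= warning_ms:
        return "warning"    # approaching violation: trigger proactive adaptation
    return "ok"
```

Acting on the "warning" state lets the CS add nodes before the breaching point is crossed, so reconfiguration can complete while the SLA still holds.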

Load Balancing Service The LBS described in this thesis is a service that has been included in
J2EE application servers and implemented at the middleware level, in order to balance the load
of HTTP client requests among the clustered QaASs. The motivation for implementing load
balancing at the middleware level is twofold: first, implementing load balancing at this level
of abstraction makes it independent of any underlying operating system; second, the designed
LBS can easily be made aware of specific application server conditions, such as server response
time and throughput, as well as the cluster membership configuration.
    However, an HTTP load balancing service is not a standard component of the J2EE architec-
ture; hence, J2EE deployments typically resort to hardware or software solutions to address this
problem.




    Hardware load balancing services are implemented by network devices (e.g., Cisco load
balancing [Bala], F5/BIG-IP Traffic Management [hlb]) that use only static load balancing policies
to distribute client requests.
    Software load balancing services can be implemented in a variety of different ways, namely
as (i) operating system components (e.g., the Linux Virtual Server [Ser] is a highly scalable and
highly available server built on a cluster of real servers, with the load balancing running on the
Linux operating system), (ii) stand-alone applications (e.g., Apache mod_jk [Mod], ZXTM Load
Balancer [Balb]), and (iii) components integrated in the application server technology (e.g., IBM
WebSphere Edge Components Load Balancer). All these software solutions are typically more
flexible than their hardware counterparts and enable different load distribution policies.
    However, these software (and also hardware) solutions, with some exceptions (e.g., IBM Web-
Sphere), principally adopt static load balancing strategies, which do not consider the actual com-
putational load of the clustered nodes when dispatching the client requests among those nodes.
Static load distribution may affect the QoS delivered by clustered nodes; to address this, the designed
LBS embodies adaptive load balancing policies that can cope effectively with run time variations
of both the computational load of the clustered nodes and the cluster configuration, in dispatching
the client requests.
    The architecture of the developed LBS can be thought of as a reverse proxy server which
interfaces the clients of an application to the nodes hosting that application, and includes both (i)
support for a request-based load balancing approach (each individual client request is intercepted
by the LBS, and dispatched to an application server for processing) and (ii) a session-based load
balancing approach (a specific client session is processed by the same node; this client-server
correspondence is termed session affinity).
    Thus, the LBS is mainly responsible for (i) intercepting each HTTP client request, (ii) selecting
a target node that can serve that request, by using specific load balancing policies, and (iii) ma-
nipulating the client request appropriately, in order to forward it to the selected target node, and
to enable it to return the reply to the LBS itself.
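These responsibilities can be sketched as a dispatcher that combines session affinity with an adaptive, response-time-based selection policy. The class and method names are hypothetical, and the exponential smoothing is an illustrative stand-in for the adaptive policies actually embodied in the LBS.

```python
class LoadBalancer:
    """Minimal sketch of the LBS dispatch logic: session-affine when a
    session id is present, otherwise adaptive request-based balancing."""

    def __init__(self, nodes):
        self.response_time = {n: 0.0 for n in nodes}  # smoothed per-node RT
        self.sessions = {}                            # session id -> node

    def select_node(self, session_id=None):
        """Pick the target node for a request (responsibility (ii))."""
        if session_id is not None and session_id in self.sessions:
            return self.sessions[session_id]          # session affinity
        # adaptive policy: pick the node with the lowest observed response time
        node = min(self.response_time, key=self.response_time.get)
        if session_id is not None:
            self.sessions[session_id] = node          # bind the new session
        return node

    def report(self, node, measured_rt, alpha=0.5):
        """Feed back a measured response time via exponential smoothing."""
        self.response_time[node] = ((1 - alpha) * self.response_time[node]
                                    + alpha * measured_rt)
```

Interception and forwarding of the HTTP request (responsibilities (i) and (iii)) are omitted here; only the target selection step is shown.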

2.2 Middleware services evaluation
An experimental evaluation of a first prototype of the above middleware services has been car-
ried out. This evaluation shows that the services enable the construction of a clustered application
hosting environment that can effectively meet non-functional application requirements and
optimize clustered resource utilization.
     However, a careful analysis of the obtained results led to the design and implementation of a
second prototype of the middleware architecture, described in this thesis. The purpose has
been to reduce as much as possible the number of reconfigurations carried out by the CS
(i.e., adding/releasing clustered nodes), owing to their cost in terms of performance, and to improve
the preliminary results concerning clustered resource utilization.
     In addition, a further study of commercial SLAs has been carried out. This study showed
that in common industry practice the QoS requirements specified within SLAs
are allowed to be breached a certain number of times over a predefined timeframe [BCL+ 04].
Hence, the first SLA management model proposed in this thesis, in which no SLA violations
were permitted, has been relaxed so as to tolerate a limited number of violations of the QoS
requirements during a certain time period.
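The relaxed model, in which at most a bounded number of breaches is tolerated within a timeframe, can be sketched with a sliding window of breach timestamps. The bound and window size are contract parameters; the values used below are hypothetical.

```python
from collections import deque

class ViolationWindow:
    """Tolerate up to max_violations QoS breaches within window_s seconds;
    only when that budget is exhausted is the SLA considered broken."""

    def __init__(self, max_violations, window_s):
        self.max_violations = max_violations
        self.window_s = window_s
        self.breaches = deque()   # timestamps of recent breaches

    def record_breach(self, now):
        """Record a breach at time `now`; return True if the SLA still holds."""
        self.breaches.append(now)
        while self.breaches and now - self.breaches[0] > self.window_s:
            self.breaches.popleft()   # forget breaches outside the window
        return len(self.breaches) <= self.max_violations
```

Under this model the CS only needs to reconfigure when the violation budget is close to exhaustion, which reduces the number of costly reconfigurations compared with the original zero-violation model.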
     It is worth noting that the changes in the SLA requirements caused changes in the design
of the SLA Monitoring, which has therefore been redesigned and reimplemented. Moreover,
as to the adaptive load balancing policy of the LBS, different techniques have been evaluated in
order to find a simpler and more precise estimation of the parameters used to adaptively dispatch
the HTTP client requests.
     The second prototype has been evaluated through a further experimental evaluation. This
evaluation confirmed some of the positive results obtained by testing the first prototype and
also further improved the utilization of the clustered resources.





3    Thesis Organization
The remainder of this thesis is structured as follows.
    Chapter 2 discusses in detail the designed middleware services, used to provide distributed
applications with application servers capable of honoring SLAs.
    Chapter 3 describes the implementation of a first prototype of the earlier mentioned middle-
ware services.
    Chapter 4 presents the principal results of an experimental evaluation of the first prototype
of the middleware services. The obtained results show the effectiveness of the proposed services
in terms of overhead introduced, capability of honoring SLAs and optimal resource utilization.
    Chapter 5 describes a new prototype of the middleware architecture in terms of its design,
implementation and experimental evaluation. In addition, it discusses some experimental results
obtained by deploying the clustered QaASs in a wide area network environment.
    Chapter 6 presents the state-of-the-art concerning end-to-end QoS architectures, architectures
for resource clustering and load balancing. It compares and contrasts the state-of-the-art ap-
proaches to the problem faced in this thesis with the proposed solution.
    Finally, Chapter 7 discusses some concluding remarks and future directions.




Chapter 2

QoS-aware Middleware Services

In order to construct a QoS-aware application server (QaAS), a standard application server (e.g.,
JBoss, JOnAS, WebSphere) has been extended with a collection of middleware services, which
provide distributed applications with such QoS management mechanisms as resource configura-
tion, QoS monitoring, adaptation and QoS-aware load balancing.
    These services are part of a middleware architecture termed QoS Management subsystem that
allows one to make standard application servers QoS-aware (Figure 1). Note that, in the absence
of this subsystem, one is left with a standard application server constructed out of component
middleware (e.g., J2EE). Moreover, from the applications’ point of view, the QoS Management
subsystem operates transparently; that is, no changes to the application level specifications are
necessary in order to provide the applications with the above QoS management mechanisms.
    The QoS Management subsystem is constructed out of three principal middleware services,
termed Configuration Service (CS), Monitoring Service (MS) and Load Balancing Service (LBS),
respectively.
    This Chapter presents the design of each of those middleware services by describing their
principal functionalities and the interactions between them.


1     General Architecture
Figure 2 depicts the overall system architecture that has been designed in order to honor SLAs.
This architecture, termed QoS-aware application hosting environment (QoS-aware clustering for
short), consists of a number of clustered QaASs.
    Within the QoS-aware clustering, each clustered QaAS (i.e., node) incorporates the QoS Man-
agement subsystem illustrated in Figure 1. As the QoS Management subsystem is constructed out
of the aforementioned CS, MS and LBS, these services are in turn replicated in each node of the
QoS-aware clustering.
    However, only one node in the cluster is actually responsible for SLA enforcement, moni-
toring, and load balancing. This node, termed the Leader of the cluster (the bold circle in Figure
2), receives the client requests which, by means of the Leader’s LBS, are dispatched to the
remaining nodes of the QoS-aware clustering. Hence, as depicted in Figure 2, there exists only
one CS, MS and LBS Leader at a time. The remaining CS, MS and LBS instances, deployed in the
other clustered nodes (i.e., the slave nodes), are used as backup copies in case the Leader crashes;
they receive from the Leader both the client requests for processing and the cluster configuration
state as soon as an event causes modifications to that state (i.e., every slave node has a consistent
view of the cluster configuration state).
    The cluster configuration state maintained by the Leader consists of the list of QaAS nodes
that are part of the QoS-aware clustering and the SLA related to the application that is currently
deployed in the cluster.
    The Leader maintains the state on the slave nodes consistent by carrying out a specific proto-
col (further details about this protocol are discussed in Section 3) that makes use of a collection of
primitives made available, in turn, by an underlying reliable group communication mechanism,
as shown in Figure 2. Specifically, that mechanism provides the QaAS nodes with reliability
properties that include lossless message transmission, message ordering, and atomicity.


                         Figure 1. The QoS-aware Application Server (QaAS)

     In the design of the above middleware architecture, issues concerning such group communi-
cation mechanisms are outside the scope of this thesis. Hence, a reliable group communication
protocol, already included in standard application servers, has been used (in the implementation
within the JBoss application server the reliable group communication protocol is JGroups [jgr]).
     The principal activities performed by the above three middleware services of the Leader can
be summarized as follows.
     The CS is responsible for configuring the QoS-aware clustering so that it effectively meets
the hosting SLA of a customer application. Thus, in essence, as shown in Figure 3, the CS takes
as input a customer application hosting SLA, and discovers the available system resources that
can honor that SLA. Provided that the discovered resources are sufficient to meet that input SLA,
the CS reserves those resources, generates a resource plan (see below) for the hosted application,
and sets up the QoS-aware clustering for that application. In case the CS discovers that there
are insufficient resources to host that application, it returns an exception (typically, such an
exception can be handled either by rejecting the application hosting request, or by offering a
reduced service, depending on the policy implemented by the owner of the hosting
environment).
     In essence, the resource plan mentioned above specifies the resources that are to be used
in order to construct the QoS-aware clustering capable of meeting the input hosting SLA. In
addition, for each resource, the resource plan includes a positive value, termed the load balancing
factor: a percentage used to dispatch the client requests during the load balancing process,
when adaptive load balancing is enabled.
     The CS carries out its principal activities in cooperation with another service of the QoS Man-
agement subsystem termed Monitoring Service (MS). The MS is in charge of monitoring the QoS-
aware clustering at application run time, so as to detect possible (i) variations of the cluster mem-
bership, (ii) variations of the load of each clustered node and (iii) violations of the hosting SLA. In
order to prevent these latter violations, the MS is designed so that it takes appropriate actions if it
discovers that the QoS-aware clustering is not able to cope effectively with new operational con-
ditions (e.g., node overloading conditions, node crashes). Thus, for example, the MS can make
use of an “overload” warning threshold in order to detect dangerous load conditions that may
lead to node overloading. In case the input hosting SLA is close to being violated, the MS invokes
its related CS, and requires that the QoS-aware clustering be reconfigured appropriately, so that
it can adapt to the new load conditions, and continue to honor the application hosting SLA.
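As an illustration, the warning-threshold check described above can be sketched as follows; the class name, the 80 percent warning fraction and the method signature are assumptions made for the sake of the example, not part of the actual MS implementation.

```java
// Hypothetical sketch of the MS "overload" warning threshold: the cluster is
// considered close to an SLA violation when the measured average response
// time reaches a predefined fraction of the SLA response time bound.
public class OverloadCheck {
    // Assumed warning fraction: warn at 80% of the SLA response time bound.
    static final double WARNING_FRACTION = 0.8;

    // Returns true when the MS should ask the CS to reconfigure the cluster.
    static boolean nearViolation(double measuredAvgRespTime, double slaAvgRespTime) {
        return measuredAvgRespTime >= WARNING_FRACTION * slaAvgRespTime;
    }

    public static void main(String[] args) {
        // SLA bound of 3.0s (as in the "Login" clause of Figure 4):
        System.out.println(nearViolation(2.5, 3.0)); // measured 2.5s -> warn
        System.out.println(nearViolation(1.0, 3.0)); // measured 1.0s -> ok
    }
}
```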
     Finally, the LBS is used in order to (i) enable effective clustering of application servers and
(ii) collect parameters that are useful for cluster performance monitoring (monitoring of the
cluster response time, for example); hence, the LBS operates strictly in conjunction with the above

   [Figure: the QoS-aware clustering. Client requests reach the Leader QaAS (bold circle), whose
   LBS dispatches them to the slave QaASs; every node hosts its own CS, MS and LBS on top of a
   Reliable Group Communication layer.]

                                           Figure 2. QoS-aware Clustering




   [Figure: the middleware services within the Leader. The CS gets the SLA (<<get SLA>>), sets
   the resource plan on the LBS (<<set Resource Plan>>) and gets monitoring information from the
   MS (<<get monitoring info>>); the LBS receives the client requests and cooperates with the MS.]

                                           Figure 3. Middleware Services


CS and MS. Typically, this service can contribute to meeting the hosting SLA both by preventing
the occurrence of node overload, and by avoiding the use of resources which have become unavail-
able (e.g., failed) at run time; it distributes the client request load appropriately amongst the
clustered nodes according to some specified load balancing policy (see Section 5).
    The next Subsections describe in detail the concept of Service Level Agreement and the de-
sign and protocols of the CS and MS. Following the description of these two services, separate
Subsections introduce both the design of the LBS and the adaptive load balancing policy.


2     Service Level Agreement
A Service Level Agreement (SLA) is a legally binding contract that specifies the QoS guarantees
an application hosting environment, such as an application server, has to provide its hosted ap-
plications with, under specific load conditions, and the metrics to assess the QoS delivered by
that environment [Con].
    Although SLAs are widely used in industry practice, there is no standard for their definition;
rather, service providers adopt their own methodologies to specify the QoS requirements they
commit to their customers. Thus, defining an SLA is a complex task, and current research is
investigating the definition of languages to represent it. Examples of this research
include [SLE04, KL03]. The detailed description of these works is outside the scope of this thesis;
however, it is worth mentioning here a language, named SLAng [SLE04], that has been used
in order to derive some parts of the example SLAs of this thesis. With SLAng, a collection of
contractual clauses has been represented in an XML form.
    Specifically, as depicted in Figure 4, the SLA derived by using SLAng is an XML file termed
hosting SLA. In the context of this thesis, hosting SLAs are defined as the SLAs that are used to
bind a QoS-aware application server to the applications it hosts.
    This hosting SLA, as illustrated in Figure 4, may consist of two principal parts, namely the
Client Responsibilities and the Server Responsibilities parts, which define the obligations of the client
and the server, respectively, that are to be respected in order to honor the SLA.
    For both client and server, the SLA may include different levels of the required quality of
service, each related to all (or some) of the operations of the application that is bound to the host-
ing environment. In the example, a bookshop application has been used and the operations,
which can be performed by the clients, are such operations as "login", "catalog", "bookDetails",
"AddToCart" and so on.
    Hence, from a client point of view, the application operations can be classified according to
the maximum number of requests a client is allowed to send to the application server, within a
specified interval of time (the XML requestRate attribute of Figure 4).
    In contrast, from a server point of view, other responsibilities are to be fulfilled, concerning the
principal hosting obligations an application server should provide its applications with. Thus, as
included in the example SLA of Figure 4, an application server has to guarantee a specified sys-
tem availability; this parameter, expressed as a percentage, represents the proportion of time for
which the hosted application is accessible with predictable response times (e.g., the daily availability
of the bookshop application will be no less than 70 percent).
    Moreover, yet again the application operations may be classified according to some quality
of service attribute. In the example of Figure 4, these attributes are the response time, i.e.,
the time elapsed between the delivery of a client request, for a specified operation of the appli-
cation, to the server and the transmission of the reply from the server back to the client, and the
throughput, i.e., the number of requests served by the application server per second.


3     Configuration Service
The hosting SLA described above is the principal object of the system architecture that is used
to trigger the execution of the Configuration Service (CS). Hence, the main responsibilities of the
CS consist of (i) configuring the cluster at the time the hosting SLA is effectively deployed in the





          <?xml version="1.0" encoding="UTF-8"?>
          <SLAng xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
            <Parties>
              <Client>
                <Name>Facilitare Inc.</Name>
                <Address>Frankfurt</Address>
              </Client>
              <Server>
                <Name>Subito Inc.</Name>
                <Address>Stockholm</Address>
              </Server>
            </Parties>
            <SLS>
              <Hosting>
                <ClientResponsibilities>
                  <ContainerServiceUsage clauseId="Login" requestRate="100/s">
                    <Operations>
                      <Operation name="login.jsp"/>
                      <Operation name="login.do"/>
                    </Operations>
                  </ContainerServiceUsage>
                  <ContainerServiceUsage clauseId="CreateAuction" requestRate="50/s">
                    <Operations>
                      <Operation name="visualizeAuction"/>
                      <Operation name="createAuction"/>
                    </Operations>
                  </ContainerServiceUsage>
                </ClientResponsibilities>
                <ServerResponsibilities serviceAvailability="0.70">
                  <OperationPerformance clauseId="Login" avgResponseTime="3.0s" avgThroughput="50/s">
                    <Operations>
                      <Operation name="LoginCtl"/>
                    </Operations>
                  </OperationPerformance>
                  <OperationPerformance clauseId="catalog.jsp" avgResponseTime="2.0s" avgThroughput="50/s">
                    <Operations>
                      <Operation name="catalog"/>
                    </Operations>
                  </OperationPerformance>
                  <OperationPerformance clauseId="Checkout" avgResponseTime="1.0s" avgThroughput="50/s">
                    <Operations>
                      <Operation name="CheckoutCtl"/>
                    </Operations>
                  </OperationPerformance>
                  <OperationPerformance clauseId="AddToCart" avgResponseTime="2.0s" avgThroughput="50/s">
                    <Operations>
                      <Operation name="AddToCart"/>
                    </Operations>
                  </OperationPerformance>
                  <OperationPerformance clauseId="RemoveFromCart" avgResponseTime="3.0s" avgThroughput="50/s">
                    <Operations>
                      <Operation name="RemoveFromCart"/>
                    </Operations>
                  </OperationPerformance>
                </ServerResponsibilities>
              </Hosting>
            </SLS>
          </SLAng>


                                         Figure 4. An example SLA


QoS-aware clustering (i.e., at SLA deployment time), and (ii) possibly reconfiguring the cluster at
run time, if the QoS the cluster delivers deviates from that specified in the hosting SLA [LP04].
The skeleton code in Figure 5 illustrates the protocol performed by the CS.
    As shown in Figure 5, the CS is created and started at QaAS start-up time. The actual cluster
configuration begins at hosting SLA deployment time.
    Typically, the CS instance of the QaAS node where an SLA deployment occurs becomes the
CS Leader. Note that a possible crash of the Leader during the configuration (or run-time
reconfiguration) is detected by the backup CS instances, deployed in the slave nodes, through
their MSs (see Section 4). These MSs are notified of the Leader crash by the underlying
group communication. Specifically, there exists an MS primitive named membershipChanged (see
Section 4.1) that is invoked by the group communication on the MSs as soon as a change of the
cluster membership configuration occurs.
    The group communication’s reliability properties of lossless message transmission, message
ordering and atomicity assure that each MS on the slave nodes, and consequently its local CS,
has a consistent view of the membership changes. Hence, in case of Leader crash, the following
simple recovery protocol is performed by each CS instance deployed in the slave nodes.
    Every CS is identified by a unique identifier (ID), namely the IP address of the machine where
the CS is deployed. Note that the CSs share a consistent cluster configuration state object that
consists of the list of the IP addresses of the available clustered nodes.
    When the crash of the Leader is detected by the MSs on the slave nodes, as previously de-
scribed, all the available CSs are informed by their related MSs so as to enable the CSs to start the
election of the new Leader. This election consists of taking the IP addresses of the available nodes
from the cluster configuration state, interpreting them as strings and computing the minimum of
these strings (by means of a lexicographical order of the strings).
    It is worth noticing that, thanks to the aforementioned reliability properties of the group com-
munication paradigm in use, all the available clustered nodes agree on electing, as the new cluster
Leader, the CS instance deployed on the machine with the smallest IP address.
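A minimal sketch of this election step is given below; the class and method names are hypothetical, and the membership list stands for the consistent cluster configuration state each CS holds.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the Leader election: every CS computes the
// lexicographical minimum of the member IDs (IP addresses interpreted as
// strings); since all nodes hold the same membership view, they all elect
// the same Leader without exchanging further messages.
public class LeaderElection {
    static String electLeader(List<String> memberIds) {
        // String's natural ordering is lexicographical, as required.
        return Collections.min(memberIds);
    }

    public static void main(String[] args) {
        List<String> members = List.of("130.136.1.20", "130.136.1.3", "130.136.1.11");
        // Note: the order is lexicographical, not numerical.
        System.out.println(electLeader(members)); // prints "130.136.1.11"
    }
}
```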
    This CS Leader takes as input the SLA, and parses it in order to extract from it the relevant
QoS parameters that guide and determine the required cluster configuration (e.g., in the above
SLA example, these parameters are such SLA attributes as the client request rate, the service
availability, and so on).
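The extraction of the serviceAvailability parameter, for instance, can be sketched with the standard Java XML APIs; the class name and the reduced SLA fragment are assumptions made for illustration, not the actual CS parser.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical sketch of the SLA parsing step performed by the CS Leader.
public class SlaParser {
    static double serviceAvailability(String slaXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(slaXml.getBytes(StandardCharsets.UTF_8)));
        Element resp = (Element) doc.getElementsByTagName("ServerResponsibilities").item(0);
        return Double.parseDouble(resp.getAttribute("serviceAvailability"));
    }

    public static void main(String[] args) throws Exception {
        String sla = "<SLAng><SLS><Hosting>"
                   + "<ServerResponsibilities serviceAvailability=\"0.70\"/>"
                   + "</Hosting></SLS></SLAng>";
        System.out.println(serviceAvailability(sla)); // prints 0.7
    }
}
```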
    The cluster configuration mainly consists of finding the other QaAS peers in the cluster (i.e.,
discovering the initial cluster configuration) in order to check whether or not that initial configu-
ration is suitable for meeting the input hosting SLA. The initial cluster configuration is obtained
by invoking the MS deployed in the Leader node, as depicted in Figure 3.
    Once the initial cluster configuration has been obtained, a specific SLA parameter, the serviceAvailability
attribute of Figure 4, is examined so as to verify whether further nodes are to be added to the ini-
tial cluster configuration or excluded from it. This can be crucial in the light of the principal
objective of this work, namely providing optimal resource utilization: only the minimum number
of clustered nodes which can contribute to meeting the input hosting SLA should be used.
    Thus, the CS computes the system availability at the time the cluster configuration starts, and
compares the resulting value with the serviceAvailability of the SLA. If the current system
availability is less than that specified in the SLA, new nodes are added to the cluster.
    In contrast, if the current system availability is higher than that of the SLA, the CS attempts to
exclude one node at a time, and computes the remaining system availability (i.e., the availability
after the exclusion of the node). If the remaining system availability is higher than that included
in the SLA, the node is effectively excluded from the cluster configuration; that node is then
included in the pool of spare resources. Otherwise, the node remains part of the initial
cluster.
    Hence, let N be the number of nodes that form the initial cluster configuration; the system
availability is computed as follows:

                           System Availability = 1 − ∏_{i=1}^{N} (1 − A_i)                     (1)





           // QaAS START UP TIME

                create-CS();
                start-CS();

           // SLA DEPLOYMENT TIME

                enableConfiguration(SLA) {
                 // Elect cluster leader: the node with smallest ID
                 leaderElection(SLA);

                    // Configure cluster
                    clusterConfiguration(SLA, getMembership);

                    // Set new state on slave nodes
                    setClusterState();
                }

                clusterConfiguration(SLA, membership) {
                     //Compute system availability
                     systemAvail = computeSystemAvailability();
                     if ( systemAvail < SLAAvailability) {
                        addNewInstancesReconfiguration();
                     }else {
                        //Check whether or not to exclude instances
                        if (excludeNodes())
                         excludeNodesReconfiguration();
                     }
                }

           // RUN TIME

                  addNewInstancesReconfiguration() {
                   addNode();
                }
                excludeNodesReconfiguration() {
                    // find node and exclude it
                }




                               Figure 5. CS Skeleton code






    Thus, assuming independent repair of the clustered machines, the combined availability of
the system is calculated as 1 minus the product of the probability that a machine is unavailable,
for all the clustered machines. The motivations for using the formula (1) are as follows.
    A cluster of nodes can be considered completely failed if and only if all its nodes fail. In
contrast, the cluster is operational if at least one of its nodes is available. Hence, a cluster of
nodes can be modeled as a system constructed out of distinct parallel components. Thus, from
dependability theory [Tri02, Bob03], the availability of such a system can be computed
by making use of equation (1).
    A_i in the formula above is the estimated steady-state availability of the single machine i,
which deploys the clustered QaAS instance i; this availability value is obtained by the CS from a
specific configuration file of that machine.
    Note that issues concerning the per-machine estimated steady state availability are outside
the scope of this thesis. However, it is worth observing that this value can be either included
in the technical machine specifications, provided by the machine manufacturer, or computed via
benchmarking.
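Equation (1) translates directly into code; the sketch below (class and method names are hypothetical) computes the combined availability from the per-machine steady-state availabilities A_i.

```java
// Hypothetical sketch of the availability computation of equation (1):
// the cluster fails only if every node fails, so its availability is one
// minus the product of the per-node unavailabilities.
public class SystemAvailability {
    static double compute(double[] nodeAvailabilities) {
        double allDown = 1.0; // probability that all nodes are unavailable
        for (double a : nodeAvailabilities) {
            allDown *= (1.0 - a);
        }
        return 1.0 - allDown;
    }

    public static void main(String[] args) {
        // Two machines with A_i = 0.90 each: 1 - (0.10 * 0.10) = 0.99.
        System.out.println(compute(new double[] {0.90, 0.90}));
    }
}
```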
    In constructing the cluster configuration, the CS produces the resource plan object that in-
cludes, for each clustered node of the built cluster configuration, the above mentioned load bal-
ancing factor. As stated before, the load balancing factor is a positive value computed by using
the server response time, the server throughput, and the size of the free memory in (the JVM of)
each QaAS node of the resource plan object (further details about the load balancing factor are
described in Section 5).
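For illustration, the normalization of per-node scores into percentage load balancing factors can be sketched as follows; the scoring itself, which combines response time, throughput and free memory, is described in Section 5, and the uniform scores used here are an assumption.

```java
// Hypothetical sketch of how per-node scores could be turned into the
// percentage load balancing factors of the resource plan: each node's
// factor is its share of the total score.
public class LoadBalancingFactors {
    static double[] toPercentages(double[] scores) {
        double total = 0.0;
        for (double s : scores) total += s;
        double[] factors = new double[scores.length];
        for (int i = 0; i < scores.length; i++) {
            factors[i] = 100.0 * scores[i] / total;
        }
        return factors;
    }

    public static void main(String[] args) {
        // A node scoring twice as high receives twice the share of requests.
        double[] f = toPercentages(new double[] {1.0, 1.0, 2.0});
        System.out.println(java.util.Arrays.toString(f)); // [25.0, 25.0, 50.0]
    }
}
```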
    Once the cluster configuration is complete, the CS protocol is executed at run time whenever
the MS requires a reconfiguration of the cluster. The reconfiguration consists of either (i) adding
new nodes to the cluster or (ii) releasing clustered nodes, and (iii) updating the aforementioned
load balancing factor.
    The activity (i) can be necessary both in case a cluster is to be augmented with additional
resources (e.g., in order to cope with dynamically increasing load), and in case a clustered node
fails and is to be replaced by one (or more) operational node(s). In both these cases, it may be
convenient to maintain a pool of spare servers available for inclusion in the cluster, at any time,
so as to eliminate the overhead induced by bootstrapping a new application server.
    The activity (ii) is carried out in order to optimize the resource utilization. Specifically, this
entails that, if the load for a hosted application decreases significantly (below predefined warning
thresholds), the resources allocated to that application can be deallocated dynamically.
    Finally, as to activity (iii), the load balancing factor associated with each clustered node is used
by the LBS in order to distribute the load among the clustered nodes, when an adaptive load
balancing policy is deployed.


4     Monitoring Service
The architecture of the MS is depicted in Figure 6. As shown in Figure 6, the MS consists of two
principal components, namely the Membership Interceptor and the SLA Monitoring, described in
the following separate Subsections.

4.1 Membership Interceptor
The Membership Interceptor is an MS component used to retrieve information about the cluster
membership. It uses underlying group communication primitives in order to detect new mem-
bers that join the cluster, and dead members that leave it. In addition, for logging purposes, it
saves these data in a file whenever changes occur to the membership; to this end, another
component of the MS, termed Measurement Service, is invoked by the Membership
Interceptor, as depicted in Figure 6.
    The pseudo code in Figure 7 illustrates the above activities performed by this monitoring
component. As shown in Figure 7, at QaAS start up time, the Membership Interceptor is started






   [Figure: the MS architecture. Client requests pass through the Request Interceptor and are sent
   to the Cluster Performance Monitor, which invokes the Evaluation and Violation Detection
   Service (<<evaluate SLA>>); these three components form the SLA Monitoring. Both the SLA
   Monitoring and the Membership Interceptor save their data through the Measurement Service.]

                                         Figure 6. The MS architecture



up by running the monitoring thread. This thread is in charge of monitoring the cluster member-
ship and saving the obtained data in stable storage. Note that these activities are performed
by the thread once every predefined time interval, termed the monitoring frequency, which has been
fixed during the experimental evaluation of the QoS-aware middleware services.
    At run time, the Membership Interceptor operates as a listener in order to detect changes to
the cluster membership. To this end, as discussed earlier, the Membership Interceptor makes
available a primitive, named membershipChanged, for use by the underlying group com-
munication. This primitive allows the Membership Interceptor to detect changes to the cluster
membership, as it is invoked as soon as those changes occur in the cluster.
    It is worth observing that, in case of a member crash, the Membership Interceptor checks
whether or not the dead member is the QaAS Leader of the cluster. If the dead member is not the
Leader, the Membership Interceptor deployed in the Leader node invokes the SLA Monitoring
(the SLAEvaluation method of Figure 8), in order to check the adherence to the hosting SLA after
the membership change.
    In contrast, if the dead member is the Leader of the cluster, the recovery protocol discussed
in the previous Section is carried out so as to elect the new Leader of the cluster, which is responsible
at that point for monitoring the adherence to the hosting SLA of the new cluster configuration.

4.2 SLA Monitoring
The SLA Monitoring component of the Monitoring Service is responsible for monitoring the clus-
ter performance. It consists of three additional components, named Client Request Interceptor,
Cluster Performance Monitor, and Evaluation and Violation Detection Service. These components are
described below, in isolation.

The Client Request Interceptor This interceptor is used to intercept the client requests in order to
evaluate the cluster performance for specific application requests; the cluster performance con-
sists of the set of throughput and response time values that allow one to detect whether or not the
nodes of the cluster are overloaded. This interceptor is also used in the Load Balancing architecture
described in the next Section, where it is named Request Interceptor.







                //QaAS START UP TIME

                 start-MS();

            //SLA DEPLOYMENT TIME AND RUN-TIME

                 run() {
                   while (true) {
                      // Get cluster membership
                      getMembership();
                      // Save logs in stable storage
                      MeasurementService.saveMembership();
                      sleep(monitoringFrequency);
                   }
                 }

                 membershipChanged(deadMembers, newMembers, allMembers)
                 {
                   // Record deadMembers, newMembers and allMembers
                    MeasurementService.saveMembership();
                    getDeadMembers();
                    getNewMembers();
                 }

                 getDeadMembers() {
                   if (Leader is dead) {
                     // Elect new Leader
                   } else {
                     // Leader calls SLA evaluation with EVDS
                     EVDS.SLAEvaluation();
                   }
                 }

                 getNewMembers() {
                   //set state to the new member
                 }




                       Figure 7. Membership Interceptor skeleton code





The Cluster Performance Monitor This component cooperates with the Evaluation and Violation
Detection Service in order to assess the actual response time and throughput delivered by
the application server cluster, and to detect possible SLA violations. Specifically, the Cluster Per-
formance Monitor (i) obtains from the client request interceptor the data required to assess
response time and throughput (e.g., the number of requests received and the number of requests
served, for specific application operations), (ii) computes the average request response time and
throughput, and (iii) sends the results of its computations to the Evaluation and Violation
Detection Service, which is invoked in order to check the adherence of those values to the SLA
requirements.
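The two computations performed at step (ii) can be sketched as follows. This is a minimal illustration under stated assumptions: the average response time is derived from the total service time and the number of served requests, and the throughput from the served requests over the monitoring interval; the class name is hypothetical.

```java
// Hypothetical sketch of the aggregate computations performed by the
// Cluster Performance Monitor over one monitoring frequency interval.
public class ClusterPerformanceSketch {
    // Average response time: total time spent serving requests divided by
    // the number of served requests (milliseconds per request).
    public static double averageResponseTime(long totalServiceTimeMs, long servedRequests) {
        return servedRequests == 0 ? 0.0 : (double) totalServiceTimeMs / servedRequests;
    }

    // Throughput: served requests per second over the monitoring interval.
    public static double throughput(long servedRequests, long intervalMs) {
        return intervalMs == 0 ? 0.0 : servedRequests * 1000.0 / intervalMs;
    }
}
```

Both values are then passed to the Evaluation and Violation Detection Service for comparison against the SLA bounds.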

The Evaluation and Violation Detection Service This service is responsible for checking whether the
QoS delivered by the clustered application servers meets the hosting SLA requirements. Specifi-
cally, this component detects, at run time, variations in the operational conditions of the clustered
nodes which may affect the QoS delivered by the cluster, and triggers the cluster reconfiguration,
if necessary.
    The skeleton code of the above components is depicted in Figure 8. As shown in Figure
8, the SLA Monitoring records and computes, through the Cluster Performance Monitor, the
cluster performance as soon as an incoming client request is intercepted during the monitoring
frequency interval. When the monitoring frequency interval expires, the Cluster Performance
Monitor invokes the Evaluation and Violation Detection Service in order to check the adherence
to the hosting SLA of both the response time and the throughput exhibited by the cluster.
    Hence, if the computed response time is higher than the related SLA bound, or the computed
throughput is lower than its SLA bound, the SLA is violated and an exception is raised at the
application level. In contrast, if both the computed response time and throughput satisfy the SLA,
the Evaluation and Violation Detection Service checks whether or not an adaptation is to be applied,
which either adds new QaAS instances to the cluster or releases nodes that are no longer necessary.
    To this end, predefined thresholds are used in order to compute so-called warning points.
There exist two types of warning points, namely a high load warning point and a low load warn-
ing point. The former is used in order to verify whether or not the cluster is overloaded; in this
case, in order to avoid SLA violations, the CS is invoked so as to add new instances to the cluster.
    In contrast, the latter indicates that the cluster is responding well to the injected client
load, and clustered nodes can be released as no longer necessary.
    Note that in the design of the SLA Monitoring a number of warning points (e.g., a throughput
warning point, a response time warning point) are used, as illustrated in Figure 8. However, it is
also worth observing that the Evaluation and Violation Detection Service invokes a CS reconfig-
uration that adds new instances to the cluster if there exists at least one computed value (either
response time or throughput) that is equal to or higher than the related high load warning point. In
contrast, the Evaluation and Violation Detection Service invokes a CS reconfiguration that re-
leases nodes from the cluster if all the values computed during the monitoring frequency interval
are equal to or lower than the related low load warning points.
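The warning-point decision described above can be sketched, for the response time case, as a small Java routine. The class, enumeration, and threshold names are illustrative only; the throughput case follows the same pattern with the violation test inverted.

```java
// Hypothetical sketch of the warning-point decision taken by the Evaluation
// and Violation Detection Service for the response time metric.
public class WarningPointCheck {
    public enum Action { VIOLATION, ADD_NODES, RELEASE_NODES, NONE }

    // The SLA is violated above the SLA bound; nodes are added at or above
    // the high load warning point (hlwp) and released at or below the low
    // load warning point (llwp).
    public static Action evaluateResponseTime(double computedRT, double slaRT,
                                              double hlwp, double llwp) {
        if (computedRT > slaRT) return Action.VIOLATION;
        if (computedRT >= hlwp) return Action.ADD_NODES;
        if (computedRT <= llwp) return Action.RELEASE_NODES;
        return Action.NONE;
    }
}
```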


5     Load Balancing Service
In the design of the Load Balancing Service (LBS) for clustered QoS-aware application servers,
two load balancing approaches have been evaluated, namely a “request-based” (or “per-request”)
load balancing approach, and a “session-based” (or “per-session”) load balancing approach.

5.1 Per-request Load Balancing
In “request-based” load balancing, each individual client request is processed by the LBS, and
dispatched to an application server, according to some specific load distribution policy (see be-
low). Thus, two consecutive requests from the same client may be dispatched to two different
servers for processing.






                //RUN TIME (Cluster Performance Monitoring)

                  // Get client requests via client request interceptor
                  getClientRequests();
                  /* Invoke Cluster Performance Monitor to compute RT and
                      Throughput */
                  ClusterPerformanceMonitor.computeRespTime();
                  ClusterPerformanceMonitor.computeThroughput();

                  class ClusterPerformanceMonitor {
                     run() {
                       while(true) {
                          // Record number of arrived requests
                          // Record number of served requests
                          // Compute response time
                          // Compute throughput
                          /* Evaluate response time via EvaluationViolationDetection
                             Service (EVDS) */
                          EVDS.SLAEvaluationRT(computedRT);
                          // Evaluate throughput via EVDS
                          EVDS.SLAEvaluationTh(computedTh);
                          sleep(monitoringFrequency);
                       }
                     }
                  }

                  class EvaluationViolationDetectionService {
                      SLAEvaluation(membership) {
                         if (membership.size < membershipRequiredSLA) {
                            // Invoke CS addNewInstancesReconfiguration
                         } else {
                            // No need to add replica; invoke CS rearrangeAgreedQoS
                         }
                      }
                      SLAEvaluationRT(computedRT) {
                         getRTThresholds();
                         if (computedRT > SLART) {
                            // SLA VIOLATED; return exception
                         } else {
                              if (computedRT >= HLWPRT)
                                 /* RespTime high load warning point reached; invoke
                                     CS that adds new nodes */
                              else if (computedRT <= LLWPRT)
                                 /* RespTime low load warning point reached; invoke
                                     CS that releases nodes */
                              else
                                 // No need to reconfigure
                         }
                      }
                      SLAEvaluationTh(computedTh) {
                         getThThresholds();
                         if (computedTh < SLATh) {
                            // SLA VIOLATED; return exception
                         } else {
                              if (computedTh >= HLWPTH)
                                 /* Throughput high load warning point reached; invoke
                                     CS that adds new nodes */
                              else if (computedTh <= LLWPTH)
                                 /* Throughput low load warning point reached;
                                     invoke CS that releases nodes */
                              else
                                 // No need to reconfigure
                         }
                      }
                  }




                                  Figure 8. SLA MS skeleton code





    In principle, the “request-based” load balancing is used when state replication among the
clustered servers is enabled. This replication mechanism allows one to construct a hosting en-
vironment that can support highly available applications. Hence, provided that the application
state be maintained mutually consistent across the clustered application servers, the crash of a
server can be masked by routing the client requests to a different server in the same cluster, trans-
parently to the client program. In practice, the cost of maintaining consistency among clustered
servers can be very high, in terms of both performance and memory overheads caused by the
consistency algorithms (note that, in this case, state replication among the clustered servers is re-
quired, as any of these servers can be used, without distinction, to serve incoming client requests
at any time) [RT04].

5.2 Per-session Load Balancing
In contrast, using “session-based” load balancing, a specific client session (i.e., a sequence of
client requests) is created in one of the clustered application servers, at the time a client program
requires access to the application hosted by that server; every subsequent request from that
client, within a specified timeframe, will be processed by that server (this client-server binding is
termed session affinity). Thus, the LBS intercepts each client request and, depending on the session
affinity, dispatches it to the appropriate server.
    “Session-based” load balancing can be implemented so as to impose no consistency require-
ments among the clustered servers, as all the client requests belonging to the same session are
served by the same server. However, if a server crashes while it is serving a client session, that
session cannot be recovered at the application server level. Thus, either the client program im-
plements its own recovery mechanisms, or this program may have to start its session again (at
the risk of causing multiple executions of earlier requests). It is worth mentioning that session
replication across multiple replica servers may well be implemented at the application server
level, and used to provide clients with transparent fault tolerant support to server crashes. How-
ever, the cost of maintaining mutually consistent client session replicas across multiple servers
may make this solution as impractical as that mentioned above, in the case of request-based load
balancing.

5.3 The Load Balancing Architecture
In view of the above observations, the architecture of the LBS includes support for both request-
based load balancing, and session-based load balancing; this architecture is depicted in Figure 9,
and summarized in the following.
    The designed LBS can be thought of as a reverse proxy server (the LBS deployed in the Leader
node of the cluster) which interfaces the clients of an application to the clustered nodes host-
ing that application. This LBS can accommodate a number of load balancing policies; typically,
this Service must be configured at cluster configuration time so as to use, at run time, one
specific load balancing policy out of those it incorporates.
    The architecture of the LBS consists of the following four principal components: namely, the
Request Interceptor, the Load Balancing Scheduler (LBScheduler), the HTTP Request Manager,
and the Sticky Session Manager, illustrated in Figure 9. Note that the LBScheduler component
in Figure 9 embodies three load balancing policies, namely the Round Robin, the Random and
the Workload. Only one of these policies can be selected at cluster configuration time; that policy
will be used at run time for load balancing purposes. It may be worth mentioning that the load
balancing policies shown in Figure 9 are those actually incorporated in the first implementation
prototype of the LBS; however, the Service is structured so that the set of policies embodied by
the LBScheduler can be extended easily with additional ones.
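The pluggable structure of the LBScheduler can be sketched as follows. The interface name is hypothetical, and only the Round Robin policy is shown; the Random and WorkLoad policies would implement the same interface, which is what allows the set of policies to be extended easily.

```java
// Hypothetical sketch of the pluggable policy structure of the LBScheduler:
// a common interface plus a Round Robin implementation.
import java.util.List;

interface LoadBalancingPolicy {
    // Selects one node out of the currently available clustered nodes
    // (the resource plan supplied by the Configuration Service).
    String selectNode(List<String> availableNodes);
}

public class RoundRobinPolicy implements LoadBalancingPolicy {
    private int next = 0;

    @Override
    public synchronized String selectNode(List<String> availableNodes) {
        // Cycle through the available nodes in order.
        String node = availableNodes.get(next % availableNodes.size());
        next++;
        return node;
    }
}
```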
    As mentioned earlier, the LBS can balance the client request load either on a per-request basis
or on a per-session basis. Yet again, the decision as to which “granularity” (i.e., either per-request,
or per-session) of load balancing to use is to be taken at cluster configuration time.
    If per-request load balancing has been enabled, the Request Interceptor intercepts each client
request, and invokes the HTTP Request Manager. This Manager firstly interrogates the LBSched-
uler in order to obtain the address of a target node that can serve that request. The LBScheduler





                            Figure 9. Load Balancing Service Architecture



returns the requested address, based on the specific load balancing policy it has been configured
to use. Once the HTTP Request Manager obtains the target node address, it manipulates the client
request appropriately in order to forward it to this target node, and to enable it to return the reply
to the Load Balancing Service itself. When the Load Balancing Service receives a response to a client
request, it forwards it to that client. The next request from the same client will be dealt with in the
same way; hence, there is no guarantee that the same target node will be selected for serving it.
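A minimal sketch of the request manipulation step, under the assumption that retargeting amounts to rewriting the Host header of the intercepted HTTP request, might look as follows; the class name is hypothetical.

```java
// Hypothetical sketch of the manipulation performed by the HTTP Request
// Manager: once the LBScheduler has chosen a target node, the Host header
// of the intercepted request is rewritten so that the request can be
// forwarded to that node, with the reply returned through the LBS.
public class HttpRequestManagerSketch {
    // Rewrites the Host header of a raw HTTP request to point at the target node.
    public static String retarget(String rawRequest, String targetNode) {
        StringBuilder out = new StringBuilder();
        for (String line : rawRequest.split("\r\n", -1)) {
            if (line.toLowerCase().startsWith("host:")) {
                out.append("Host: ").append(targetNode);
            } else {
                out.append(line);
            }
            out.append("\r\n");
        }
        // Drop the extra CRLF appended after the final split token.
        out.setLength(out.length() - 2);
        return out.toString();
    }
}
```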
    In contrast, if the LBS has been configured to use a per-session load balancing, the Sticky
Session Manager cooperates with the LBScheduler to identify a target node that will serve the
client requests for the entire session. To this end, a unique cookie is created and included in
the HTTP requests. This cookie identifies the selected node, is generated at the time a client
session starts, and is maintained until that session ends (see below). The use of cookies in HTTP requests
is a common mechanism for managing HTTP state information [RFC97]. In fact, by means of
cookies, it is possible to (i) carry state information (a sequence of client requests, for example)
between the client and a clustered node and (ii) easily identify the client who originated the
request. Hence, owing to these advantages, the above cookie-based solution has been favored.
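The cookie-based affinity can be sketched as follows. The cookie name and class structure are hypothetical and serve only to illustrate how the first request of a session binds it to the node chosen by the LBScheduler, with the binding carried back to the client in a cookie.

```java
// Hypothetical sketch of the session affinity kept by the Sticky Session
// Manager: the first request of a session is bound to the node chosen by
// the LBScheduler, and the binding is carried in an HTTP cookie.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class StickySessionManagerSketch {
    static final String COOKIE_NAME = "QAAS_NODE"; // illustrative name
    private final Map<String, String> sessionToNode = new ConcurrentHashMap<>();

    // Returns the node bound to the session, binding it on first use to the
    // node the scheduler would currently choose.
    public String resolve(String sessionId, String schedulerChoice) {
        return sessionToNode.computeIfAbsent(sessionId, id -> schedulerChoice);
    }

    // Cookie value attached to the HTTP reply so the affinity is kept by the client.
    public String cookieFor(String sessionId) {
        return COOKIE_NAME + "=" + sessionToNode.get(sessionId);
    }
}
```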
    A few observations concerning the design of the LBS are in order. Firstly, it is worth mention-
ing that a LBS architecture based on a reverse proxy may indeed show some limitations, in terms
of scalability and robustness, for example.
    However, as to robustness, in order to overcome the single point of failure shortcoming,
the LBS has been replicated in each slave node (as described in Section 1); this allows a replica to take
over in case of failure of the LBS Leader (i.e., in case of failure of the reverse proxy). Neverthe-
less, it is worth observing that this mechanism is effective because the implementation assumes the
use of dynamic IP address translation (or IP takeover), which preserves transparency with respect
to the clients accessing the hosted application. Hence, if the LBS crashes, a new node in the cluster
is elected as LBS Leader, by following the protocol described earlier, and the IP address of the
crashed node is assigned to the new one. Then, a proper hardware device, such as the gateway
illustrated in Figure 2, is used in order to redirect the incoming client requests toward the new
LBS Leader. This mechanism totally masks the failures from the end-users,



for whom the application remains available.
    In addition, such a reverse proxy architecture has the merit of enabling the development
of a Load Balancing Service which is fully transparent to both the clients and the clustered QaASs.
The client need only enable its browser to accept cookies, in case per-session load balancing
is used. Hence, on balance, the reverse proxy based architecture introduced above has been
favored.
    Secondly, it is worth observing that the LBS has been designed so as to be completely inde-
pendent of the policy used to select the target node. In fact, the Service architecture developed
allows developers to plug into it the load balancing policies of their choice.
    Finally, note that the LBScheduler itself does not monitor the membership of the cluster.
Rather, the LBScheduler receives the list of available clustered nodes from the CS (i.e., the LB-
Scheduler receives from the CS the resource plan object, as depicted in Figure 3).

5.4 The Load Balancing Policies: The WorkLoad policy
Typically, open-source J2EE platforms use software load balancing services that embody either
“Random” or “Round Robin” load balancing policies for balancing the incoming client requests
within a cluster of application server instances. However, these policies are non-adaptive, i.e.,
they are unable to adapt to variations of the QoS delivered by the clustered application servers.
As one of the focuses of this thesis was to evaluate adaptive load balancing policies, a WorkLoad
policy has been developed for this purpose.
    This policy enables the LBScheduler to select the most lightly loaded node among the clus-
tered nodes, in order to serve either a request or a client session. To this end, the LBScheduler
uses the above mentioned load balancing factor and selects a clustered QaAS with a probability
that is proportional to the load balancing factor value. For each QaAS, this value is computed
by considering three QoS parameters: the server response time, i.e., the time elapsed between the
delivery of a client request to an application server and the transmission of the reply from that
server; the server throughput, i.e., the number of client requests served by the application server
within a specific time interval; and the server available memory, i.e., the size of the free memory
in (the JVM of) each application server.
    Therefore, the load balancing factor of a node i is computed by taking into account three prin-
cipal factors (i.e., the so-called RespTimeFactor, ThFactor and MemFactor below) that represent
the share of the total cluster response time, throughput and JVM memory, respectively, that node
i offers within the cluster.
    Hence, let N be the number of clustered QaASs. At cluster configuration time, for each node i,
the percentage of the total computational load that i is supporting, in terms of i's response time,
throughput, and available memory, is calculated as follows.

        $RespTimeFactor_i = \dfrac{(1/RespTime_i) \times 100}{TotRespTime}$, where $TotRespTime = \sum_{i=1}^{N} RespTime_i$    (2)

        $ThFactor_i = \dfrac{Th_i \times 100}{TotalTh}$, where $TotalTh = \sum_{i=1}^{N} Th_i$    (3)

        $MemFactor_i = \dfrac{Mem_i \times 100}{TotalMem}$, where $TotalMem = \sum_{i=1}^{N} Mem_i$    (4)

    Thus, in essence, the higher the response time of a node i (i.e., RespTime_i in formula (2)),
the lower that node's percentage of the total response time (i.e., the above RespTimeFactor_i; note
that this is captured by the use of the inverse of the node response time).
    In contrast, the larger the throughput (or the available memory) of a node i, the larger the
percentage of the total cluster throughput (i.e., the above ThFactor_i) that node delivers (or the
percentage of the total memory made available by that node in the cluster, i.e., MemFactor_i).




    The load balancing factor (i.e., lbf) can then be obtained as the arithmetic mean of the three
factors above, assuming that each QoS parameter has the same weight in the construction of this factor.

                    $lbf_i = (RespTimeFactor_i + ThFactor_i + MemFactor_i)/3$                     (5)
    In conclusion, it is worth observing that the load balancing factor is initialized to a default
value in each QaAS, and dynamically adjusted at cluster configuration and reconfiguration time.
The dynamic variations of the load balancing factor reflect the changes of the operational condi-
tions of the QoS-aware clustering. Hence, the WorkLoad policy enables adaptation of the hosting
environment as variations of the QoS delivered by this environment occur, causing cluster recon-
figuration.
    In essence, the usage of the load balancing factor in the load balancing process can be useful
both to optimize the user-perceived response time and to augment the number of requests
served by the cluster per second (i.e., the throughput of the QoS-aware clustering environ-
ment).
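Formulas (2)-(5) can be translated directly into code. The following sketch assumes the per-node response times, throughputs, and free memory sizes have already been collected; the class and method names are illustrative.

```java
// Hypothetical sketch of the WorkLoad policy computations of formulas (2)-(5).
public class WorkLoadFactors {
    // Formula (2): percentage based on the inverse of the node response time,
    // so that a slower node obtains a smaller factor.
    public static double respTimeFactor(double[] respTimes, int i) {
        double tot = 0;
        for (double rt : respTimes) tot += rt;
        return (1.0 / respTimes[i]) * 100.0 / tot;
    }

    // Formula (3): node i's percentage of the total cluster throughput.
    public static double thFactor(double[] throughputs, int i) {
        double tot = 0;
        for (double th : throughputs) tot += th;
        return throughputs[i] * 100.0 / tot;
    }

    // Formula (4): node i's percentage of the total JVM memory made available.
    public static double memFactor(double[] mems, int i) {
        double tot = 0;
        for (double m : mems) tot += m;
        return mems[i] * 100.0 / tot;
    }

    // Formula (5): lbf as the arithmetic mean of the three factors.
    public static double lbf(double respTimeFactor, double thFactor, double memFactor) {
        return (respTimeFactor + thFactor + memFactor) / 3.0;
    }
}
```

The LBScheduler would then select a node with probability proportional to its lbf value, e.g., by weighted random sampling over the lbf values of the clustered nodes.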


6     Application Deployment
Issues of application deployment and application component replication in a clustered hosting
environment can play a relevant role in the design of the earlier described CS. This Section
introduces these two issues, and examines three alternative approaches in the design of the CS.
    As to application deployment, a Homogeneous Application Deployment (HOAD) entails that
each clustered application server runs identical services and application components. In contrast,
a Heterogeneous Application Deployment (HEAD) entails that each application server in the cluster
may run a different set of services and application components.
    It is worth observing that, in practice, this latter form of deployment proves particularly
expensive, as it requires a number of issues to be further investigated and carefully dealt with,
such as state propagation, cluster-wide configuration management, and so on. However, I
believe that heterogeneous deployment, in contrast with the homogeneous one, can be particu-
larly attractive when applied to component-based applications, as it enables the distribution of
the application components across clusters of machines in a controlled manner.
    Typically, enterprise applications adopt a distributed multi-tier paradigm, in which the appli-
cation consists of separate components, namely client, web and business components. Generally,
the business components are those components used to interact with database systems and that
contain the logic of the application itself (this can vary depending on the specific business do-
main whereby they are utilized). In general, the Web components of an application are directly
exposed to the clients, so as to mask the business tier in which the business components are lo-
cated. In this context, the clustered application servers can be specialized; thus, for example, one
application server instance can be configured for optimum performance for the support of busi-
ness components, whereas another application server instance can be configured for optimum
performance in the support of web components.
    In addition, if one assumes a rather conventional scenario in which client-server communica-
tions are enabled via wide area networks, as clients can be located geographically far away from
the servers, it may well be convenient to distribute the application so that its Web components
are as close as possible to the clients. Moreover, in order to reduce the application response times,
it can be desirable to distribute the application business components so that those directly con-
nected to the database are located as close as possible to the clustered database servers. This can
be done both at deployment time, when the CS acquires the application QoS requirements (i.e.,
the SLA), and at run time, when the application SLA is close to being violated (as reported by the
MS).
    Finally, the two approaches above can be combined (Homogeneous and Heterogeneous
Application Deployment (HHAD)) as follows. The homogeneous deployment can first be used by
replicating application components among the application servers of the cluster. At application



run time, if the SLA is close to being violated, the re-configuration activity of the CS can decide
whether to (i) add more replicas of the entire application, (ii) add a certain number of machines
to the initial cluster, i.e., to augment the number of available resources, or (iii) migrate application
components from overloaded machines to underutilized ones.
    The third option may lead to having different components of the same application deployed
and running onto different machines (i.e., it may lead to heterogeneous application deployment).
However, this option may have a notable impact on the overall performance of the clustered
application servers, as migrating application components within the cluster can be very costly.
Hence, this option can be used only when some particular hosting conditions occur (e.g., critical
thresholds are reached within the hosting environment, and detected during the SLA Monitoring
phase).
    However, the approach adopted in this thesis is to use a HOAD deployment, through which
every clustered QaAS node deploys the same application components and the same middleware
services. In fact, after a detailed analysis of the different techniques above, this kind of deployment
turned out to be the simplest and least expensive one amongst those available.
    Nevertheless, issues of HHAD (i.e., homogeneous and heterogeneous) deployment can be
very attractive in the light of providing the QoS-aware middleware services proposed in this
thesis in a wide area network clustered environment; hence, such a deployment can be con-
sidered a possible extension of the work presented here.




Chapter 3

Prototype architecture

A first prototype of the designed middleware architecture has been implemented by using the
Java programming language [Mic02b]. Therefore, every QaAS of the QoS-aware clustering pro-
vides distributed applications with Java-based QoS-aware middleware services.
    Specifically, these services are intended to be part of Java-based application server technolo-
gies. Among these technologies, a specific application server has been chosen, in order to assess
the effectiveness of the designed QoS-aware clustering. That application server is JBoss;
JBoss has been extended with the collection of middleware services described earlier, thereby
producing a first prototype architecture.
    In that prototype, two distinct levels of abstraction have been identified, namely the macro
and the micro levels. In particular, the proposed middleware services carry out their protocols
by exercising their control over such resources as the nodes that form the cluster; hence, these
services can be conveniently thought of as operating at the macro level (or at the cluster level).
However, it is worth observing that some of the above macro level services (e.g., the Configu-
ration Service) need information concerning each single QaAS instance of the cluster (i.e., micro
level information), in order to compute the load balancing factor, for example. To this end, a mi-
cro level monitoring service capable of probing such instances and retrieving node-related data
(e.g., node response time, node throughput) can be required by the services that operate at the
macro level.
    For this purpose, in the implementation of the first prototype architecture, an additional ser-
vice, termed MicroResourceManager, has been used and included in JBoss. This service embod-
ies a micro level monitor, is open source and available for download at the SourceForge
web site [TAP], and its specifications are described in [FSE04].
    By means of the MicroResourceManager, the designed CS and LBS are capable of obtaining
information concerning each clustered QaAS node; this information is a set of parameters,
including the response time, the throughput and the available JVM memory of each node, necessary
for the computation of the load balancing factor used by the LBS.
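The shape of this per-node information can be illustrated as follows; the class is hypothetical and does not reproduce the actual MicroResourceManager API.

```java
// Hypothetical sketch of the micro level, per-node data that the CS and LBS
// obtain for each clustered QaAS node.
public class NodeMetrics {
    public final double responseTimeMs;  // node response time
    public final double throughput;      // requests served per second
    public final long freeJvmMemory;     // free memory in the node's JVM

    public NodeMetrics(double responseTimeMs, double throughput, long freeJvmMemory) {
        this.responseTimeMs = responseTimeMs;
        this.throughput = throughput;
        this.freeJvmMemory = freeJvmMemory;
    }

    // A local probe of the free JVM memory, as a micro level monitor might take it.
    public static long probeFreeJvmMemory() {
        return Runtime.getRuntime().freeMemory();
    }
}
```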
    However, as the principal aim of this thesis is to extend application server technologies with
middleware services that operate at the cluster level in order to honor SLAs, the aforementioned
micro level monitoring has not been implemented; rather, the existing open source
MicroResourceManager described earlier has been used for this purpose. Therefore, this Chapter
focuses only on the implementation of the services that operate at the cluster level (i.e., the QoS-aware
middleware services discussed in the previous Chapter). In addition, this Chapter concludes by
discussing an experimental evaluation of the macro level services, carried out in order to assess
their effectiveness.






1     Open Source Application Servers
The application server technologies mentioned in this thesis are middleware platforms that
implement the Java 2 Enterprise Edition (J2EE) [Sha03] specification; they adopt a component-
based approach in the development of distributed applications, and provide application devel-
opers with services for clustering and component replication.
   The J2EE specifications define a set of application components that make up a general J2EE
application, namely the Web components (Servlets and Java Server Pages (JSPs)) and the business
components (i.e., the so-called Enterprise Java Bean (EJB) components).
    As one of the thesis objectives is to develop an open source platform, in the following two open
source application servers are compared, namely JOnAS and JBoss, so as to decide which one
can better fulfill the requirements of the thesis.

1.1 The JOnAS Application Server
JOnAS is a Java open source application server, which adds the following features to the imple-
mentation of the J2EE 1.4 standard [jon]:
    • Management JOnAS server management uses JMX and provides a servlet-based manage-
      ment console named Jadmin.
    • Service It allows one to apply a component model approach at the middleware level and
      eases the integration of new modules. It also allows starting only the services needed by a
      particular application, thus saving system resources.
    • Scalability JOnAS integrates several optimization mechanisms for increasing the server
      scalability. These mechanisms include a pool of stateless session beans, a pool of message-
      driven beans, a pool of threads, a cache of entity beans, the activation/passivation of entity
      beans, a pool of connections (for JDBC and JMS), and storage access optimizations.
    • Distribution JOnAS can work with two distributed processing environments, namely RMI
      (Remote Method Invocation), using the Sun proprietary protocol JRMP, and Jeremie. This
      latter environment is the RMI implementation of an Object Request Broker named Jonathan.
      When used with Jeremie, JOnAS benefits from transparent optimization of local RMI calls.
    JOnAS consists of a collection of services. A service typically provides system resources to
containers. Most of the components of the JOnAS application server are pre-defined JOnAS ser-
vices. However, it is possible for JOnAS users to define their own services and integrate them
into JOnAS. The main principle for defining a new JOnAS service is to encapsulate it in a class
whose interface is known to JOnAS; more precisely, such a class must enable the initialization,
starting, and stopping of the service. In order to make JOnAS aware of a new service, some
configuration files must be changed accordingly.
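The lifecycle contract just described can be sketched as follows. The `ServiceLifecycle` interface and the `LoggingService` class below are hypothetical illustrations of the principle; the actual JOnAS interface and package names differ.

```java
import java.util.Properties;

// Hypothetical sketch of the contract a JOnAS-style service class must
// expose: initialization, starting, and stopping. The real JOnAS
// interface differs; this only illustrates the principle.
interface ServiceLifecycle {
    void init(Properties config); // read configuration parameters
    void start();                 // acquire resources, begin serving
    void stop();                  // release resources
}

class LoggingService implements ServiceLifecycle {
    private String logFile;
    private boolean running;

    public void init(Properties config) {
        // Parameters would come from the server configuration files.
        logFile = config.getProperty("logging.file", "jonas.log");
    }
    public void start() { running = true; }
    public void stop()  { running = false; }

    public boolean isRunning() { return running; }
    public String getLogFile() { return logFile; }
}
```

A server aware of this interface can then initialize, start, and stop the service generically, without knowing its concrete class.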
    With this structure, JOnAS allows high modularity and configurability of the application
server, starting only the services that the user needs, and parameterizing each module.
    The JOnAS architecture is depicted in Figure 1. In particular, this Figure shows two types of
containers: the EJB Container, hosting EJB components (i.e., the business components of an appli-
cation, described in Chapter 2 of this work), and the Web Container, hosting Web components
such as servlets, JSPs, and the Jadmin administrative console used to administer the JOnAS
server.
    Both the EJB Container and the Web Container can make use of the set of services depicted in
Figure 1, and summarized below:
    • Registry service This service is used for launching either the RMI registry or the Jeremie
      registry, depending on the JNDI configuration. There are several Registry launching modes;
      e.g., in the same JVM or in a separate one, automatically if not already running.
    • EJB Container service This service is responsible for loading the EJB components and their
      containers.

UBLCS-2006-06                                                                                   27
                                                               1   Open Source Application Servers




                                Figure 1. The JOnAS architecture



   • Web Container service It is in charge of running a Servlet/JSP Engine in the JVM of the
     JOnAS server, and of loading web applications within this engine. Currently, this service
     can be configured to use Tomcat or Jetty.
   • EAR service It is used for deploying complete J2EE applications, i.e. applications packaged
     in EAR files. It takes care of creating the appropriate class loaders in order for the J2EE
     application to execute properly.
   • Transaction service This is a Java Transaction Monitor, called JOnAS JTM. It is a mandatory
     service which handles distributed transactions. It provides transaction management for
     EJB components, as defined in their deployment descriptors. The JTM may be distributed
     across one or more JOnAS servers.

   • Database service This service is responsible for handling Datasource objects. A Datasource
     is a standard JDBC administrative object that handles the connections to a database.
   • Security service This service implements the authorization mechanisms for accessing EJB
     components, as specified in the EJB specification. Currently, to the best of my knowledge,
     this service does not fully implement the JAAS standard.
   • JMS service This service supports message-driven beans and JMS operations in order to
     allow application components to send asynchronous messages to other application compo-
     nents. The JMS service is responsible for launching the integrated JMS server, which may
     or may not run in the same JVM as JOnAS.
   • JCA Resources service This service is responsible for deploying JCA compliant Resource
     Adapters (connectors) on the JOnAS server.
   • JMX service The JMX service is needed in order to administrate the JOnAS server from the
     JOnAS Jadmin console.
   • Mail service This service provides the necessary resources to application components in
     order to read or send emails.








                                  Figure 2. The JBoss architecture (the JTS/JTA, Security, JCA,
         Remote Management, Mail, EJB Container, Databases, Java Server Pages, JMS, and JNDI
                           services built around the JMX implementation)



1.2 The JBoss Application Server
JBoss consists of a collection of middleware services for communication, persistence, transactions
and security. These services interoperate by means of a microkernel termed Java Management eX-
tensions (JMX) [Mic02a], as depicted in Figure 2. Specifically, JMX provides Java developers with
a common software bus that allows them to integrate components such as modules, containers
and plug-ins. These components are declared as Managed Beans (MBeans) services, which can
be loaded into JBoss, and can be administered by the JMX software bus. The MBeans are the
implementation of all the manageable resources in the JBoss server; they are represented by Java
objects that expose interfaces consisting of methods to be used for invoking the MBeans.
   The JMX architecture is structured in the following three hierarchical levels [jbo]:
   • Instrumentation Level This is the lowest level of the JMX architecture; it provides the spec-
     ification for implementing JMX manageable resources. A JMX manageable resource can
     be virtually anything, including applications, service components, devices, and so on. The
     JMX user provides the instrumentation of a given resource through the implementation of
     one or more Managed Beans (MBeans). MBeans are Java objects implementing one of the
     standard MBean management interfaces; a standard MBean management interface consists
     of metadata classes covering attributes, operations, notifications, and constructors.
   • Agent Level This level is constructed on top of the Instrumentation level, introduced above.
     The Agent level incorporates agents; i.e., MBeans that control the resources defined at the
     Instrumentation level. In addition, it defines a standard MBean Server agent. The MBean
     Server is a registry for MBeans that makes the registered MBean management interfaces
     available for use from remote management applications. Note that an MBean never exposes
     directly the MBean object it implements; rather, its management interface is made available
     through the metadata and operations defined in the MBean Server. Thus, the Agent Level
     implements a loose coupling between the management applications and the MBeans they
      manage. Moreover, the Agent level embodies support services (e.g., dynamic class loading,
      monitors, timers), and communication connectors and adaptors. Connectors and adaptors
      enable access to the MBean Server agent from outside the agent JVM. Specifically, the adap-
      tors make visible all the MBeans registered in the MBean Server to the remote management
      applications (e.g., an HTML adaptor can allow one such an application to display MBeans
      using a Web browser). In contrast, a connector is an interface, for use from management
      applications, that supports a common API for accessing the MBeanServer, independently
      of the underlying communication protocol.
   • Distributed Services Level This level is implemented on top of the Agent level. It can be pop-
     ulated by (either JMX compliant or proprietary) remote management applications and Web
     browsers. Remote management applications interact with the agents and their managed
     objects via connectors; Web browsers via adaptors, instead.
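As a concrete illustration of the Instrumentation and Agent levels, the following sketch defines a standard MBean and registers it in an MBean server through the standard `javax.management` API; the `Counter` resource itself is a made-up example, not a JBoss component.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class JmxCounterDemo {

    // Standard MBean naming convention: the management interface of the
    // class Counter must be named CounterMBean and be public.
    public interface CounterMBean {
        int getCount();   // exposed as the read-only attribute "Count"
        void increment(); // exposed as an operation
    }

    public static class Counter implements CounterMBean {
        private int count;
        public int getCount() { return count; }
        public void increment() { count++; }
    }

    // Register a Counter (Instrumentation level) in the MBean server
    // (Agent level) and interact with it only through the server, as a
    // management application would.
    public static int registerAndRead() throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("thesis.example:type=Counter");
        server.registerMBean(new Counter(), name);
        server.invoke(name, "increment", null, null);
        return (Integer) server.getAttribute(name, "Count");
    }
}
```

Note that the caller never touches the `Counter` object directly: all access goes through the `ObjectName` and the MBean server, which is exactly the loose coupling described above.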

   The following basic services are integrated in JBoss, as illustrated in Figure 2:
   • The Java Naming and Directory Interface (JNDI) service This service supports a common interface to
     a variety of existing naming services such as DNS, LDAP, Active Directory, RMI registry,
     COS registry, NIS, and file systems.
   • The JTA transaction service (JBossTX) This service allows one to integrate in JBoss the im-
     plementation of any Java Transaction API (JTA) transaction manager. Thus, JBoss is fully
     independent of the actual transaction manager implemented and used, as this manager can
     be implemented as an MBean.
   • The EJB Container Configuration and Architecture This service defines and provides all the
     elements and operations needed to manage Enterprise Java Beans (EJBs).
   • The Java Message Service (JMS) Configuration and Architecture This service can be used by
     the application components to send asynchronous messages to other application compo-
     nents.
   • The JCA Configuration and Architecture (JBossCX) It is a resource manager integration API
     whose goal is to standardize access to non-relational resources, in the same way as the JDBC
     API standardizes access to relational data.
   • The JBoss Security Configuration and Architecture (JBossSX) It is a service which provides
     both support for the role-based declarative J2EE security model, and integration with cus-
     tom security via a security proxy layer.

   • Integration Servlet and JSP Containers It is used to integrate a third party Web container
     into the JBoss application server framework. A Web container is a J2EE server component
     that enables access to servlets and JSP pages. Examples of servlet containers used by JBoss
     include Tomcat and Jetty.

1.3 Load Balancing and Failover in JBoss
A number of JBoss application servers can be clustered in a network. A JBoss cluster (or parti-
tion) consists of a set of nodes; a node, in a JBoss cluster, is a JBoss application server instance.
There can be different partitions on the same network; each cluster is identified by an individual
name. A node may belong to one or more clusters (i.e., clusters may overlap); moreover, clusters
may be further divided into sub-clusters. Clusters are generally used for load distribution
purposes, whereas sub-partitions may be used for fault-tolerance purposes (it is worth
mentioning that, to the best of my knowledge, there are no implementations of the sub-partition
abstraction, as of today).
    A JBoss cluster can be used for either homogeneous or heterogeneous application deploy-
ment. (However, heterogeneous deployment is not encouraged, for the reasons described in
Chapter 2.)






                      Figure 3. JBoss Clustering Framework (the HA-JNDI, HA-RMI, and HA-EJB
           services on top of the Distributed Replicant Manager and Distributed State services,
                              the HA Partition Framework, and JGroups)



    The JBoss Clustering service [LBtJG04] is based on the framework depicted in Figure 3. This
framework consists of a number of hierarchically structured services, and incorporates a reliable
group communication mechanism, at its lowest level. The current implementation of this mecha-
nism uses JGroups [jgr], a toolkit for reliable multicast communications. This toolkit consists of a
flexible protocol stack that can be adapted to meet specific application requirements. The reliabil-
ity properties of the JGroups protocols include lossless message transmission, message ordering,
and atomicity.
    Specifically, JGroups provides its users with reliable unicast and multicast communication
protocols, and allows them to integrate additional protocols (or to modify already available pro-
tocols) in order to tune the communication performance and reliability to their application re-
quirements. JGroups guarantees both message ordering (e.g., FIFO, causal, total ordering), and
lossless message transmission; moreover, a message transmitted in a cluster is either delivered to
each and every node in that cluster, or none of those nodes receives that message (i.e., it supports
atomic message delivery). In addition, JGroups enables the management of the cluster member-
ship, as it allows one to detect the starting up, leaving and crashing of clustered nodes. Finally, as
state transfer among nodes is required when nodes are started up in a cluster, this state transfer
is carried out maintaining the cluster-wide message ordering.
    The HighAvailable Partition (HAPartition) service is implemented on top of the JGroups re-
liable communications; this service provides one with access to basic communication primitives
which enable unicast and multicast communications with the clustered services. In addition, the
HAPartition service provides access to such data as the cluster name, the node name, and infor-
mation about the cluster membership, in general. Two categories of primitives can be executed
within a HAPartition, namely state transfer primitives, and Remote Procedure Calls (RPCs).
    The HAPartition service supports the Distributed Replicant Manager (DRM), and the Dis-
tributed State (DS) services. The DRM service is responsible for managing data which may differ
within a cluster. Examples of this data include the list of stubs for a given RMI server. Each node
has a stub to share with other nodes. The DRM enables the sharing of these stubs in the cluster,
and allows one to know which node each stub belongs to.
     The DS service, instead, manages data, such as the replicated state of a Stateful Session Bean,
which is uniform across the cluster, and supports the sharing of a set of dictionaries in the cluster.
For example, it can be used to store information (e.g., settings and parameters) useful to all con-
tainers in the cluster. The highest JBoss Clustering Framework level incorporates the HA-JNDI,
HA-RMI, and HA-EJB services. The HA-JNDI service is a global, shared, cluster-wide JNDI Con-
text used by clients for object look up and binding. It provides clients with a fully replicated
naming service, and local name resolution. When a client executes an object lookup by means
of the HA-JNDI service, this service firstly looks for the object reference in the global context it
implements; if the reference is not found, it requires the local JNDI to return that object refer-
ence. The HA-RMI service is responsible for the implementation of the smart proxies of the JBoss
clustering.
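The HA-JNDI resolution order described above, the global replicated context first and the local JNDI on a miss, can be sketched as follows; the class and method names are illustrative, not the actual JBoss ones.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch of the HA-JNDI lookup order: the cluster-wide
// replicated context is consulted first; on a miss, the lookup falls
// back to the node-local JNDI context. All names are hypothetical.
class HaJndiContext {
    private final Map<String, Object> globalContext = new ConcurrentHashMap<>();
    private final Map<String, Object> localContext  = new ConcurrentHashMap<>();

    void bindGlobal(String name, Object ref) { globalContext.put(name, ref); }
    void bindLocal(String name, Object ref)  { localContext.put(name, ref); }

    Object lookup(String name) {
        Object ref = globalContext.get(name);          // 1. replicated context
        if (ref == null) ref = localContext.get(name); // 2. local JNDI fallback
        return ref;
    }
}
```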
     Finally, the HA-EJB service provides mechanisms for clustering different types of EJBs, namely
the Stateless Session Beans, the Stateful Session Beans, and the Entity Beans (no clustered imple-
mentation of the Message Driven Beans is currently available in JBoss 3.x). The clustered version
of the Stateless Session Beans is easy to manage, as no state is associated with those beans; the
state management of the Stateful Session Beans is implemented. In contrast, the state man-
agement of the Entity Beans in a cluster is a rather complex issue, which is currently addressed
only at the level of the database the Entity Beans interface with.
     The JBoss Clustering service implements load balancing of RMIs, and failover of crashed
nodes (i.e., when a clustered JBoss node crashes, all the affected client calls are automatically
redirected to another node in the cluster); these mechanisms are included into the client stub.
Specifically, a client gets references to a remote EJB component using the RMI mechanism; con-
sequently, a stub to that component is downloaded to the client. The clustering logic, including
the load balancing and failover mechanisms, is contained in that stub. In particular, the stub em-
bodies both the list of target nodes that the client can access, and the load balancing policy it can
use.
     Moreover, if the cluster topology changes, the next time the client invokes a remote compo-
nent, the JBoss server hosting that component piggybacks a new list of target nodes as part of the
reply to that invocation. The list of target nodes is maintained by the JBoss Server automatically,
using JGroups. Thus, in general, following a client RMI, the client stub receives a reply from the
invoked server, unpacks the list of target nodes from that reply, updates the current list of target
nodes with the received one, and terminates the client RMI.
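The stub behaviour just described can be reduced to the following sketch, in which a reply optionally piggybacks an updated target list; all names here are illustrative, not JBoss APIs.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Sketch of the client-side smart stub: each reply may piggyback a
// fresh list of target nodes, which the stub adopts as its new view
// of the cluster topology. Names are hypothetical.
class SmartStub {
    private final List<String> targets = new CopyOnWriteArrayList<>();

    SmartStub(List<String> initialTargets) { targets.addAll(initialTargets); }

    // A reply carries the invocation result plus, optionally, an
    // updated target-node list (null when the topology is unchanged).
    static class Reply {
        final Object result;
        final List<String> updatedTargets;
        Reply(Object result, List<String> updatedTargets) {
            this.result = result;
            this.updatedTargets = updatedTargets;
        }
    }

    Object unpack(Reply reply) {
        if (reply.updatedTargets != null) {
            targets.clear();
            targets.addAll(reply.updatedTargets); // adopt the new topology
        }
        return reply.result;
    }

    List<String> currentTargets() { return List.copyOf(targets); }
}
```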
     This approach has the advantage of being completely transparent to the client. The client just
invokes a method on a remote EJB component, and the stub implements all the above mecha-
nisms. From outside, the stub looks like the remote object itself; it implements the same interface
(i.e., business interface), and forwards the invocations it receives to its server-side counterpart.
When the stub interface is invoked, the invocation is translated from a typed call to a de-typed
call.
     Currently, JBoss 3.2.x implements the following four load balancing policies, which can be
specified into the EJB deployment descriptors:

   • Random Robin each call is dispatched to a randomly selected node.
   • Round Robin each call is dispatched to a new node; the first target node is randomly
     selected from the target node list.
   • First Available each stub elects one of the available target nodes as its own target node
     for every call (this node is chosen randomly). When the list of the target nodes changes, a
     new target node is elected only if the earlier elected one is no longer available.
   • First Available Identical All Proxies this policy is the same as the First Available policy above;
     however, the elected target node is shared by a proxy family, i.e., a set of stubs that direct
     invocations to the same target node.
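The node-selection logic of two of these policies can be sketched as follows; in JBoss the equivalent logic lives inside the downloaded stub, so this is only an illustration of the policies, not the actual implementation.

```java
import java.util.List;
import java.util.Random;

// Reduced sketches of the Round Robin and First Available policies.
interface LoadBalancePolicy {
    String chooseTarget(List<String> targets);
}

class RoundRobin implements LoadBalancePolicy {
    private int next;
    RoundRobin(int firstIndex) { next = firstIndex; } // randomly chosen in JBoss
    public String chooseTarget(List<String> targets) {
        String node = targets.get(next % targets.size());
        next++;
        return node;
    }
}

class FirstAvailable implements LoadBalancePolicy {
    private String elected;
    public String chooseTarget(List<String> targets) {
        // Re-elect a (random) node only if the sticky target left the cluster.
        if (elected == null || !targets.contains(elected)) {
            elected = targets.get(new Random().nextInt(targets.size()));
        }
        return elected;
    }
}
```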




   The above load balancing policies are defined at deployment time, inside the EJB deployment
descriptors. In particular, a load balancing policy can be specified for each bean, for the home
and remote proxies.
   Note that the load balancing policies currently available in JBoss implement non-adaptive
load balancing within the cluster; hence, at run time, they may select a target node in the cluster
which may well be overloaded or close to being overloaded. This limitation cannot be over-
come by those policies as they operate with no knowledge of the effective load of the clustered
machines, at run time.




                               Figure 4. JBoss client-side interceptors


    However, additional load balancing policies can be incorporated in a JBoss application server
by plugging them into the last interceptor of the chain, as illustrated in Figure 4. In particular,
the load balancing in JBoss can be augmented with user-defined, adaptive strategies that select
the target machines at run time, based on the actual computational load of those machines.
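A user-defined adaptive policy of this kind could, for instance, pick the least loaded node from a per-node load report; the load map below is an assumed input, not a JBoss API.

```java
import java.util.Map;

// Sketch of an adaptive policy that could be plugged into the last
// client-side interceptor: given a load report per node (e.g., a value
// in [0,1]), it picks the least loaded one. The load map and its
// source are illustrative assumptions.
class AdaptivePolicy {
    String chooseTarget(Map<String, Double> loadByNode) {
        String best = null;
        double bestLoad = Double.MAX_VALUE;
        for (Map.Entry<String, Double> e : loadByNode.entrySet()) {
            if (e.getValue() < bestLoad) {
                bestLoad = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }
}
```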
    In addition, the latest JBoss versions incorporate a so-called HTTPLoadBalancer Service which
implements a response time based adaptive load balancing strategy, for HTTP sessions, only. This
strategy is implemented at a higher level of abstraction than the RMI level mentioned above, and
operates regardless of any hosting SLA.

1.4 The Chosen Application Server
From the assessment of these two application servers, it has emerged that both can be good candi-
dates to be used in the implementation of the earlier described QoS-aware middleware services.
However, due to its popularity, its wide range of services for clustering and load balancing, and
its microkernel structure that uses JMX as a software bus, the JBoss application server appears
to be the one that best meets the objectives of this thesis. Therefore, the case study presented
here uses the JBoss application server, version 3.2.5.
    Hence, a new JBoss server configuration, termed QaAS, has been created that provides dis-
tributed applications with the QoS Management subsystem earlier discussed (the Macro Level
and Load Balancing services of Figure 5). In particular, the Macro Level includes the CS and MS
instances described in Chapter 2. In contrast, the Micro Level implements the monitoring part of
the above introduced MicroResourceManager.
    Then, the QoS-aware clustering environment consists of a cluster of Linux machines, each
running an instance of the newly created JBoss QaAS configuration.
    As shown in Figure 6, within such a QoS-aware clustering, clients, typically browsers, use
a distributed application, homogeneously deployed in the cluster (the earlier described HOAD
deployment approach), by issuing HTTP requests. These requests are effectively intercepted by
the node Leader, and dispatched to the node with the highest load balancing factor, via the LBS.
    Each node runs both the Web Container, which deploys web components, and the EJB Con-
tainer, which deploys EJB components (this configuration is termed colocation); in addition, each





                        Figure 5. The JBoss QaAS (the standard JBoss services, such as JTS/JTA,
      Security, Data Sources, Remote Management, EJB Container, Databases, Java Server Pages,
           and JMS, around the JMX implementation, extended with the Macro Level,
                              Micro Level, and Load Balancing services)



node incorporates the standard JBoss middleware services (e.g., naming, transaction, and so on)
and the aforementioned added services (in Figure 6, MRM represents the Macro Level, the LBS
is the Load Balancing and mRM represents the used Micro Level).


2    The Macro Level Implementation
The Macro Level is implemented by an MBean termed MacroResourceManager, which incorpo-
rates the CS and MS; Figure 7 shows the UML diagram of MBeans used at this level.
    The MacroResourceManager uses the following two MBeans: the MeasurementService MBean,
which saves periodically the cluster state, and the SLADeployer MBean, which transforms the
input hosting SLA, specified in a XML form, into a Java object.
    The MacroResourceManager implements the MS based on the monitoring architecture de-
scribed in Chapter 2. This implementation uses the above mentioned MeasurementService, and
two specific classes termed Macro Resource Monitoring and Evaluation and Violation Detection
Service, as illustrated in the UML diagram of Figure 8.
    The Evaluation and Violation Detection Service is responsible for monitoring, at run time, the
adherence of the QoS-aware clustering to the SLA, i.e., it detects whether the run time environ-
ment conditions (obtained from the Measurement Service) are close to violating the SLA, and
decides the cluster reconfiguration strategy to be performed, if necessary.
    The Macro Resource Monitoring is enabled by the MacroResourceManager, which starts the
monitoring thread. This thread detects (i) the current view of the cluster membership, (ii) new
members that join the cluster, (iii) dead members that leave the cluster, and (iv) the performance
status of the cluster, in terms of throughput and response time (note that these parameters allow one
to detect whether or not the nodes of the cluster are overloaded).
    In order to carry out its task, the Macro Resource Monitoring uses the JGroups communication
interface available in JBoss, and the JBoss clustering framework.
    Finally, the current implementation of the Macro Resource Monitoring sends periodically the
data about the cluster membership, obtained from the JGroups framework, to the Measurement
Service. This latter Service maintains these data in stable storage for logging purposes.





                 Figure 6. The QoS-aware clustering deployment platform (browsers in the Client
        Tier issue HTTP requests to a J2EE Tier consisting of a cluster of Extended JBoss nodes,
         each running the Web and EJB Containers together with the MRM, LBS, and mRM services)



    The CS in the MacroResourceManager MBean implements a distributed cluster configuration
protocol, which can be summarized as follows.
    Assume that a JBoss cluster is set up, and that homogeneous application deployment (HOAD)
is to be carried out within that cluster. Each JBoss node in that cluster embodies a CS instance
in its own MacroResourceManager MBean; this MBean is identified by a cluster-wide unique
identifier (ID), assigned by the JGroups view management protocol.
    In order to configure the application hosting environment (i.e., the QaAS), the actual applica-
tion deployment is preceded by what is termed an SLA deployment phase [Lod04]. In this phase,
an SLADeployer is provided with an application hosting SLA.
    The above mentioned SLADeployer then enables its local MacroResourceManager, which be-
comes the MacroResourceManager Leader of the cluster configuration. This Leader examines the
input hosting SLA, and contacts its peer MacroResourceManagers in the cluster in order to (i)
discover the resource availability at these MacroResourceManagers’ nodes, and (ii) construct a
suitable cluster of nodes that can meet the input SLA.
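The Leader's selection step can be sketched as follows, under the simplifying assumption that resource availability and the per-node SLA requirement are reduced to single capacity figures; all names and fields are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch of the cluster configuration step: the Leader queries the
// resource availability of its peer MacroResourceManagers and selects
// the nodes whose free capacity can honor the hosting SLA. The capacity
// figures and SLA fields are illustrative assumptions.
class ClusterConfigurator {
    static List<String> buildCluster(Map<String, Integer> freeCapacityByNode,
                                     int requiredPerNode, int minNodes) {
        List<String> cluster = new ArrayList<>();
        for (Map.Entry<String, Integer> e : freeCapacityByNode.entrySet()) {
            if (e.getValue() >= requiredPerNode) cluster.add(e.getKey());
        }
        if (cluster.size() < minNodes) {
            // Not enough qualifying nodes: the SLA cannot be honored.
            throw new IllegalStateException("SLA cannot be honored");
        }
        return cluster;
    }
}
```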
    Note that the nodes in a cluster will host identical instances of the application, as homoge-
neous deployment is being carried out; hence, each node in that cluster should be capable of
honoring the application SLA. As the cluster is started up, the actual application can be deployed
and run. Clients can issue RMIs to any node in the cluster, transparently. If a failure occurs, (e.g.,
the crash of a JBoss node in the cluster), the standard failover mechanism in JBoss redirects the
client RMIs, addressed to the crashed node, to another active node in the cluster. In the standard
JBoss clustering service, this node will be selected according to one of the four load balancing
policies introduced earlier, specified at deployment time. As these policies select a target node
with no knowledge of the run time computational load of that node, it is possible that the RMI
redirection following a node failure in a cluster leads to overloading another node in that cluster.
In principle, this process may continue until all nodes in that cluster are brought to an overloaded
state, as a sort of domino effect. (Note that this process may defeat the adaptive HTTPLoadBalancer
as well.)






                                 Figure 7. UML MBeans Diagram



    In order to overcome this problem, the CS implementation aims at maintaining a fair
distribution of the computational load among the clustered nodes. To this end, in case a node
failure or an overload exception is raised by the MS within a cluster, the CS first attempts to
reconfigure that cluster by integrating into it a spare node that replaces the faulty one; that spare
node can possibly be obtained from another cluster (or from a pool of resources reserved for this
purpose, for example). Secondly, if no spare node is available and the above reconfiguration
cannot be carried out, the CS raises an exception to be dealt with at a higher level of abstraction
(e.g., at the application level, by adapting the application rather than the environment).
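This two-step reconfiguration strategy can be sketched as follows; the spare pool and the exception type are illustrative assumptions.

```java
import java.util.Deque;
import java.util.List;

// Sketch of the CS reconfiguration strategy: on a node failure or
// overload, draw a spare node from a reserved pool; if none is
// available, escalate by raising an exception for a higher level of
// abstraction. All names are hypothetical.
class ConfigurationService {
    static void replaceNode(List<String> cluster, String faultyNode,
                            Deque<String> sparePool) {
        cluster.remove(faultyNode);
        String spare = sparePool.poll(); // possibly drawn from another cluster
        if (spare == null) {
            // Left to a higher level (e.g., adapt the application instead).
            throw new IllegalStateException("no spare node available");
        }
        cluster.add(spare);
    }
}
```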
    The Micro Level is implemented by an MBean termed MicroResourceManager. Its monitoring
system periodically checks the response time, the throughput and the free memory of the QaAS
node in which the MicroResourceManager is deployed. To this end, a set of interceptors that
operate at the container level (specifically, at the EJB container level) has been used, in order to
compute the response time and the throughput before calling methods on the EJBs. This monitored
data is periodically saved, for logging purposes, in stable storage by using the aforementioned
MeasurementService MBean, exactly in the same way as the MacroResourceManager does.
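The measuring interceptor can be sketched with a dynamic proxy standing in for the JBoss EJB container interceptor chain; the names are illustrative.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of a container-level interceptor that measures response time
// and call count around EJB method invocations. A dynamic proxy stands
// in for the real JBoss interceptor chain; names are hypothetical.
class TimingInterceptor implements InvocationHandler {
    private final Object target;
    final AtomicLong calls = new AtomicLong();
    final AtomicLong totalNanos = new AtomicLong();

    TimingInterceptor(Object target) { this.target = target; }

    public Object invoke(Object proxy, Method m, Object[] args) throws Throwable {
        long start = System.nanoTime();       // before the EJB call
        try {
            return m.invoke(target, args);    // delegate to the bean
        } finally {
            totalNanos.addAndGet(System.nanoTime() - start);
            calls.incrementAndGet();          // throughput counter
        }
    }

    @SuppressWarnings("unchecked")
    static <T> T wrap(Class<T> iface, TimingInterceptor handler) {
        return (T) Proxy.newProxyInstance(TimingInterceptor.class.getClassLoader(),
                new Class<?>[] { iface }, handler);
    }
}
```

The collected counters would then be periodically flushed to stable storage, in the same spirit as the MeasurementService described above.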


3    Load Balancing Service Implementation
The Load Balancing Service implemented is an HTTP load balancer; that is, the service is in
charge of dealing with incoming HTTP requests. In order to implement one such load balancing
mechanism, the HTTP load balancing architecture proposed in the JBoss clustering documentation
has first been evaluated. This architecture makes use of the Apache web server with the mod_jk
module, in order to balance the HTTP client requests at the web tier. The mod_jk module is
based on the notion of "workers"; these are the nodes of the cluster toward which the mod_jk
module forwards the HTTP requests. The module must be statically configured, so as to specify
the number of workers available in the cluster, and its configuration cannot be changed
dynamically (i.e., at run time).
    This is a severe shortcoming of this architecture that prevents its usage, for the purposes of
this thesis, as the main objective is to enable dynamic cluster reconfiguration in order to recover
from a node failure, for example, or to replace overloaded nodes at run time.
    Moreover, if one of the Web servers of the Web tier crashes, mod_jk will not fail over to
the remaining active nodes; thus, the active client sessions with the crashed node will be lost.
(JBoss looks after the failover of the HTTP client sessions.) Finally, in case of replacement of a
crashed node with a new, operational one, a manual reconfiguration of the mod_jk module is
required in order to allow this module to include the new node in the current cluster
configuration.
    Owing to the above limitations of the recommended JBoss load balancing, the Load Balancing
Service architecture described in Chapter 2 has been implemented. Figure 9 illustrates the UML
diagram of its implementation.
    This architecture is constructed out of the following two principal JBoss Services (MBeans)
that are deployed in each node of the cluster (for survivability purposes):
   • Load Balancing Scheduler: this service is responsible for choosing the clustered target node
     to which to forward a client request. The service has been designed and implemented to
     be as general as possible, so as to be used by any kind of load balancer (e.g., HTTP load
     balancer, RMI load balancer). Moreover, in order to choose the target node, it uses different
      scheduler managers that can be plugged into the architecture dynamically.


                       Figure 9. UML Diagram of the Load Balancing Service


      The implementation of the first prototype of the LBS includes (i) the Adaptive Scheduler Manager, which
     selects the target node based on the load balancing factor earlier discussed, (ii) the Ran-
     dom Scheduler Manager, (iii) the First Available Node Scheduler Manager, and (iv) the
      RoundRobin Scheduler Manager. In addition, if session-based load balancing is enabled
      (this can be done at configuration time by properly changing an XML configuration file we
      have termed loadBalancing-service.xml), the Sticky Session Manager is invoked by the Load
      Balancing Scheduler in order to obtain the node to which to forward the client requests
      related to a specific session. To this end, the Sticky Session Manager checks the unique
      StickyHost cookie included in the HTTP requests in order to identify the selected host.
   • HTTP Load Balancer: this service (i.e., the implementation of the HTTP Request Manager in
     the earlier Figure 9) is used to balance the HTTP request load only. It uses the Load Balancing
     Scheduler MBean to obtain the address of the target node that is to serve those requests,
     and implements the request manipulation protocol (i.e., the construction and forwarding
     of a new HTTP request) introduced earlier. To this end, the HTTP Load Balancer Service
     makes use of the AdaptiveLoadBalancer MBean. This MBean actually manages each HTTP
     request by directly invoking the Load Balancer Scheduler and building the new HTTP re-
     quests. (In practice, the AdaptiveLoadBalancer is a reverse proxy deployed in each JBoss
     node).
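The pluggable scheduler-manager design described above can be sketched as follows; the interface and class names are assumptions for this example, not the thesis source code.

```java
import java.util.List;
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;

// Pluggable policy: the Load Balancing Scheduler delegates node selection to
// whichever manager is currently plugged in. Names are illustrative.
interface SchedulerManager {
    // Chooses the clustered target node for the next client request.
    String selectNode(List<String> clusterNodes);
}

class RoundRobinSchedulerManager implements SchedulerManager {
    private final AtomicInteger next = new AtomicInteger();
    public String selectNode(List<String> nodes) {
        // Cycle through the current cluster membership in turn.
        return nodes.get(Math.floorMod(next.getAndIncrement(), nodes.size()));
    }
}

class RandomSchedulerManager implements SchedulerManager {
    private final Random rnd = new Random();
    public String selectNode(List<String> nodes) {
        return nodes.get(rnd.nextInt(nodes.size()));
    }
}

class FirstAvailableSchedulerManager implements SchedulerManager {
    public String selectNode(List<String> nodes) {
        // Always pick the first node of the current membership.
        return nodes.get(0);
    }
}

// The scheduler itself is policy-agnostic: a new manager can be plugged in
// at run time without touching the dispatching logic.
class LoadBalancingScheduler {
    private volatile SchedulerManager manager;
    LoadBalancingScheduler(SchedulerManager m) { this.manager = m; }
    void plug(SchedulerManager m) { this.manager = m; }
    String dispatch(List<String> nodes) { return manager.selectNode(nodes); }
}
```

This separation is what allows the LBS to be reused by any kind of load balancer (HTTP, RMI) while swapping scheduling policies dynamically.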
    The client requests are intercepted by the Request Interceptor of the LBS architecture. This
component is currently implemented by using the Servlet Filter technology [Fil], provided by
the JBoss embedded Tomcat web container [Tom]. To this end, a class named LoadBalancingFilter
has been created that intercepts the client requests and enables the other three MS components
responsible for monitoring the cluster performance. These three MS components are implemented
by making use of simple Java classes, whose names are the names of the related MS components
introduced earlier.
    All the newly added MBeans implement their own interfaces and are plugged into the
aforementioned JMX software bus, so as to expose them for use by the other JBoss middleware
services that are registered with JMX.
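The registration step can be sketched with the standard javax.management API shipped with the JDK; the service, attribute, and ObjectName values below are illustrative, not the thesis code.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.StandardMBean;

// Sketch of plugging a service into the JMX software bus so that other
// middleware services can look it up and invoke it by name.
public class JmxBusExample {

    // Management interface exposed on the bus (attribute "PolicyName").
    public interface SchedulerServiceMBean {
        String getPolicyName();
    }

    // The service implementation behind the MBean.
    public static class SchedulerService implements SchedulerServiceMBean {
        public String getPolicyName() { return "RoundRobin"; }
    }

    public static ObjectName register(MBeanServer server, String name) throws Exception {
        ObjectName objectName = new ObjectName(name);
        // StandardMBean wraps the implementation with an explicit management
        // interface, sidestepping the <Class>MBean naming convention.
        server.registerMBean(
                new StandardMBean(new SchedulerService(), SchedulerServiceMBean.class),
                objectName);
        return objectName;
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = register(server, "qaas:service=LoadBalancingScheduler");
        // Any other service registered with JMX can now query it by name.
        System.out.println(server.getAttribute(name, "PolicyName"));
    }
}
```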


4     Experimental Evaluation
In order to assess the effectiveness of the implemented QoS-aware middleware services, an ex-
perimental evaluation of the prototype architecture introduced earlier has been carried out.
    The evaluation aims at showing some principal capabilities of the designed middleware ser-
vices; these capabilities include:
   • overhead introduced by QaAS. In the thesis, the overhead is measured in terms of two
     application-level parameters, namely the response time and the throughput exhibited by
     the cluster. These parameters may influence the ability of QaAS to meet the SLA
     requirements; thus, in the following, only the impact of the introduction of the QoS-aware
     middleware services on such parameters has been assessed. Nevertheless, additional
     experiments may be carried out in order to also evaluate the overhead related to the traffic
     generated by the interactions among the above services; however, as yet, this latter
     evaluation has not been performed;
   • the ability of QaAS to honor hosting SLAs;
   • the ability of QaAS to optimize the usage of the clustered resources;
   • the benefits of the QaAS adaptive load balancing policy compared to static load balancing
     policies.
    To demonstrate the above capabilities of the implemented middleware, a cluster of dedicated
machines has been used; these machines were located at the Department of Computer Science of
the University of Bologna.
    This cluster consists of 5 application server instances running in 5 dedicated Linux machines,
interconnected by a 1 Gb dedicated Ethernet LAN. Each machine is based on a 2.66 GHz dual Intel
Xeon processor, and is equipped with 2 GB of RAM. In the experiments carried out, one of these
machines was dedicated to running the LBS only; the other four machines were used to serve
the client requests. The client program was the JMeter program [jme] running on a G4 processor
under Mac OS X; this client machine was connected to the clustered servers via a (non-dedicated)
100 Mb Ethernet subnet.
    The clustered servers were homogeneously hosting a digital bookshop application developed
in the above Department. This application provides its clients with such operations as “choose
book”, “add book to cart”, “confirm order”, and so on, which access and manipulate an application
database, termed Hypersonic, already included in the JBoss application server as a JMX
service. As the main scope of the preliminary evaluation described in this Chapter was to assess the
performance of the middleware services only, this application database was replicated locally to
each clustered node, in order to prevent it from becoming a bottleneck. (In practice, with most
applications, advanced caching techniques can be deployed in order to minimize database
contention overheads.)

4.1 Overhead evaluation
The scope of the first experiment was to assess the overhead, in terms of response time, caused
by the overall QaAS middleware, compared to that of the JBoss standard middleware. This
experiment consisted of 50 clients accessing the same bookshop application introduced earlier,




              [Bar chart — Response Time, JBoss and QaAS, 50 threads, 10 loops. Response
              time in milliseconds vs. number of instances (JBoss with mod_jk vs. QaAS):
              1 instance: 3479 vs. 3577.67; 2 instances: 1607.2 vs. 1751.67;
              3 instances: 1099.67 vs. 1161; 4 instances: 905 vs. 821.33]


                       Figure 10. Overhead in terms of Response Time



and performing 11 operations among those mentioned, each of which was repeated 10 times.
Specifically, the experiment consisted of the following two tests.
    The first test was performed on a standard JBoss cluster (including no QaAS middleware),
with Apache+mod_jk as load balancer; the second test was performed using the QaAS
middleware with the previously described LBS. Both tests measured the user-perceived response time
(expressed in milliseconds) and the application throughput (expressed as requests per second).
The results of these tests are compared and contrasted in Figures 10 and 11.
    Figure 10 shows that the overhead, in terms of response time, caused by the overall QaAS
middleware, compared to the JBoss standard middleware, is negligible. Specifically, Figure 10
shows that the QaAS middleware introduces a delay that ranges approximately from 2% to 8%
in the case of one, two, or three nodes serving the client requests. In contrast, in the case of four
nodes, the QaAS middleware exhibits better performance than the standard JBoss (by
approximately 10%).
    Figure 11 shows the throughput that the QaAS middleware and the standard JBoss can provide
the clients with. Figure 11 shows that in the case of two, three, and four nodes serving the client
requests, the QaAS middleware provides higher throughput than the standard JBoss. Specifically,
the throughput generated by the standard JBoss is up to approximately 18% lower than
that produced by the JBoss application server extended with the QaAS middleware. Instead, in
the case of a single node serving the client requests, the QaAS middleware throughput is lower
than that generated by the standard JBoss. However, in this case the difference is negligible
(approximately 2%).
    Note that the aforementioned tests have been carried out using, in both JBoss configurations,
load balancing that balances client sessions (i.e., load balancing with sticky sessions).
Hence, the HTTP session replication provided by JBoss has been properly disabled.

4.2 Ability to honor SLAs
The previous experiments have shown that the QaAS middleware does not introduce a significant
overhead with respect to the standard JBoss configuration. The experiments in this Section show
that the designed middleware architecture can honor the response time specified in the SLA.
    Two tests have been performed; in both tests the SLA specifies that the user-perceived HTTP
request response time is to be below 1s.
    In the first test, a load of 10, 20 and 30 clients has been imposed on a QaAS cluster consisting
of one node. As the load increases, the QaAS middleware reconfigures the cluster by adding new
instances (up to 4), in order to honor the SLA. The same load has also been imposed on a JBoss




              [Bar chart — Throughput, JBoss and QaAS, 50 threads, 10 loops. Throughput in
              requests per second vs. number of instances (JBoss with mod_jk vs. QaAS):
              1 instance: 13.6 vs. 13.3; 2 instances: 27.06 vs. 27.57;
              3 instances: 37.67 vs. 40.9; 4 instances: 44.87 vs. 53.3]


                            Figure 11. Overhead in terms of Throughput




                 Figure 12. QaAS approach vs JBoss approach: One node in the cluster



cluster composed of a single node. In contrast to the QaAS middleware, the standard JBoss does
not reconfigure the cluster, and maintains only the initial instance during the experiment.
    As can be seen from Figure 12, if the load is low (10 clients) the response time is similar in
both configurations. A higher load (20 and 30 clients) shows a significant difference between
the two configurations. With 20 and 30 clients, the JBoss response time is, respectively, 124% and
362% higher than that obtained by the QaAS middleware.
    Observing Figure 12, it is worth noting that the response time of the QaAS middleware
with 10 and 20 clients is very similar; instead, with 30 clients it is lower (444 ms) than the previous
two. This is due to the fact that, during the test, new instances are automatically added to the
initial configuration, and the requests load is balanced amongst a number of nodes larger than
that of the initial configuration.
    The second test performed is similar to the one above, with the exception that the initial
configuration is a cluster of two nodes (testing both the standard JBoss and the QaAS middleware).
A load of 10, 20, 30, 40, 50 clients has been imposed. As in the previous test, the JBoss standard
cluster maintains the initial instances only; in contrast, the QaAS middleware adds instances
dynamically (up to 4) in order to meet the response time requirement.

              [Bar chart — JBoss and QaAS, initial configuration of two nodes in both clusters.
              Response time in milliseconds vs. number of clients (JBoss vs. QaAS):
              10 clients: 299 vs. 292; 20 clients: 643 vs. 627; 30 clients: 975 vs. 597;
              40 clients: 1249 vs. 643; 50 clients: 1607 vs. 804]


                Figure 13. QaAS approach vs JBoss approach: Two nodes in the cluster


    As expected, Figure 13 shows that for loads of 10 and 20 clients the same response time has
been obtained in both JBoss and QaAS (this is because both configurations have the same number
of instances), while when the load increases (30, 40, 50 clients) the response time of the QaAS
middleware is respectively 39%, 48%, and 50% lower than that of the standard JBoss.

4.3 Resource utilization evaluation
For the purposes of this evaluation, it has been assumed that a standard JBoss application server
adopts a resource over-provisioning policy in order to ensure that the application hosting SLA be
honored, as a standard JBoss does not include any dynamic clustering mechanisms. The purpose
then of this evaluation is to show that, in contrast with this policy, the resource utilization can
vary dynamically as needed, without causing hosting SLA violations, if JBoss extended with the
QaAS middleware services is used (i.e., the use of the QoS Management subsystem’s services
optimizes the clustered resources utilization).
    In addition, it has been assumed that the hosting SLA prescribes a maximum response time of
1s for the bookshop operation termed “catalog” (i.e., the operation that lists the available books
that a user can buy), and a minimal throughput of 20 requests per second (rps). Thus, the following
two principal warning points were set by the Configuration Service at cluster configuration time:
   • a High Load warning point HL = 90%, indicating that when either the response time or the
     throughput reach this percentage of their relative SLA values (i.e. their breaching points),
     dynamic cluster reconfiguration is required, and a new node is to be added to the current
     cluster;
   • a Low Load warning point LL = 30%, indicating that when either the response time or the
     throughput reach this percentage of their relative SLA values, dynamic cluster reconfigu-
     ration is required, and resources (i.e., nodes) can be released from the cluster as they are no
     longer necessary.
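The two warning points can be sketched as a simple decision rule; the class name and the exact treatment of the throughput floor below are assumptions based on the description above, not the thesis source code.

```java
// HL/LL warning-point check driving dynamic cluster reconfiguration.
// The rule for the throughput floor (load grows as throughput drops toward
// the SLA minimum) is an assumption for this sketch.
public class WarningPointPolicy {
    public enum Action { ADD_NODE, RELEASE_NODE, NONE }

    private final double slaMaxResponseMs;  // response-time breaching point (e.g., 1000 ms)
    private final double slaMinThroughput;  // throughput breaching point (e.g., 20 rps)
    private final double highLoad;          // HL warning point, e.g., 0.90
    private final double lowLoad;           // LL warning point, e.g., 0.30

    public WarningPointPolicy(double maxRespMs, double minTput, double hl, double ll) {
        this.slaMaxResponseMs = maxRespMs;
        this.slaMinThroughput = minTput;
        this.highLoad = hl;
        this.lowLoad = ll;
    }

    public Action evaluate(double responseMs, double throughputRps) {
        // Fraction of each breaching point currently consumed (1.0 = SLA breach).
        double respLoad = responseMs / slaMaxResponseMs;
        double tputLoad = slaMinThroughput / throughputRps;
        double load = Math.max(respLoad, tputLoad);
        if (load >= highLoad) return Action.ADD_NODE;     // close to breaching: grow the cluster
        if (load <= lowLoad) return Action.RELEASE_NODE;  // lightly loaded: release a node
        return Action.NONE;
    }
}
```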
    Based on the above assumptions, two tests have been performed in order to carry out this
evaluation. In both these tests, initially 10 clients issued concurrently the entire sequence of
bookshop operations. After a fixed time interval (i.e., a period of 2 and 7 minutes, respectively,
in each test), the number of clients increased to 30, and then, after the same period in each test,
it increased to 50 (needless to say, the clients always issued the entire sequence of bookshop
operations). Then, the client load decreased, firstly from 50 to 30 clients, and then from 30 to 10
clients.


                          Figure 14. Resource Utilization 10’: Round Robin

    The results of these tests are shown in Figures 14 and 15; in both these Figures, the client
load distribution is depicted by the bold line. Figure 14 shows a 10-minute snapshot of a 90-minute
test in which the client load varied with a 2-minute period. This Figure shows that the
standard JBoss allocates all the 4 available nodes, and maintains them allocated to the hosted
application for the entire duration of the test, regardless of the actual client load. In contrast,
the QaAS extended JBoss dynamically augments the cluster size, as the client load increases, and
dynamically releases clustered nodes, as the client load decreases.




                          Figure 15. Resource Utilization 35’: Round Robin


    Figure 15 shows a 35-minute snapshot of a 140-minute test in which the client load varied
with a 7-minute period. It can be seen that the results shown in this Figure are coherent with
those discussed above, and illustrated in Figure 14. Namely, Figure 15 shows that, yet again,
the QaAS extended JBoss optimizes resource allocation by dynamically adding resources to the
cluster when necessary, and releasing those resources as they become unnecessary. In contrast,
the standard JBoss cluster configuration needs resource over-provisioning, in order to guarantee
that the SLA be honored.


                         Figure 16. Percentage of used nodes: 10 minutes test

    In summary, the last two tests above show that the QaAS middleware uses, on average, from
2 to 3 nodes, saving approximately 30% of the available resources when compared to the
standard JBoss; Figures 16 and 17 illustrate these results.
    The following experiment aims at evaluating the load balancing mechanisms implemented by
the QaAS extended JBoss. Specifically, this experiment assesses the number of used nodes when
Round Robin (RR) load balancing is implemented by the QaAS extended JBoss. The result is
compared and contrasted with that obtained when the WorkLoad load balancing policy is used,
instead.
    This experiment is based on the assumptions concerning the SLA and the warning points
introduced earlier, and has consisted of two tests. The first test compared and contrasted the
Round Robin load balancing policy with the WorkLoad policy, by running the test previously
described and illustrated in Figure 15 (i.e., from 10 to 50 clients accessing the electronic bookshop,
with a 7-minute period of load variation).




                      Figure 18. Resource utilization: Adaptive load balancing


    Figure 18 shows the results obtained by using the WorkLoad adaptive load balancing policy.
    Figures 15 and 18 show that the average number of used nodes is approximately the same
with either strategy. Hence, in order to assess the effectiveness of the WorkLoad adaptive
strategy, additional load was injected in one of the clustered nodes (as the adaptive load balancing
operates based on the actual load of each node). Thus, a second test has been carried out. The
test case was the same as that introduced above, with the exception that, on one of the clustered
nodes, additional load was injected by 5 clients using a computationally intensive application
running in that particular node. The resulting average resource utilization is shown in Figures
19 and 20.




               Figure 19. Resource utilization: RR load balancing with additional load


    Figure 19 shows that the Round Robin policy in the QaAS extended JBoss may cause poor
clustered resource allocation. After allocating all the available resources, the number of allocated
nodes is maintained the same as that used in the case of maximum load, even though the load
decreases. It has been observed that this occurs because the response time (not shown in Figure 19)
experienced by the clients never falls below the LL warning point (30% of the SLA-specified
response time). In fact, the Round Robin policy selects each node in turn with no knowledge
of the node load conditions. Hence, if a node is overloaded, as in this specific case, the overall
cluster response time is influenced by such conditions and never falls below the LL threshold.




             Figure 20. Resource utilization: Adaptive load balancing with additional load


    In contrast, Figure 20 shows that the use of the WorkLoad policy, in order to balance the
client load in a cluster of QaAS extended JBoss nodes, optimizes the allocation of those nodes. In
fact, using the WorkLoad policy, which operates based on the load conditions of each node, the
overload state of one of the clustered nodes is detected and that node is not used by the LBS. This
LBS behavior influenced the overall cluster response time which, in this test, decreased (below
the LL threshold) as the load decreased; hence, resources were released, as they were no longer
necessary.

4.4 Static vs Adaptive load balancing policies
The last experiment was carried out in order to assess the effect of the RR and the adaptive
WorkLoad load balancing strategies on the cluster response time and throughput. The cluster
configuration for this experiment consisted of 3 nodes only, implementing the QaAS extended
JBoss; specifically, 2 nodes were serving the client requests, and one node was used as a dedicated
LBS. In addition, during the test, we injected into one of the two nodes serving the client requests
the additional load generated by 10 clients accessing the computationally intensive application
mentioned earlier.
    The results of this experiment are depicted in Figures 21 and 22. These Figures show
that both the response time and the throughput of the QaAS clustered nodes can improve (by
approximately 20%) if an adaptive, rather than round robin, load balancing is enabled within
the cluster.

4.5 Conclusions
To conclude, it has emerged that the QoS-aware clustering introduces negligible overheads, in
terms of both the user perceived response time and the application throughput, compared to the
standard JBoss application server. Moreover, the designed middleware architecture provides the
hosted applications with a dynamic mechanism that allows the services described in this thesis to
honor hosting SLAs and to support effective resource management by avoiding resource
over-provisioning; in fact, it has been demonstrated that QaAS uses the minimum number of
resources necessary to honor the hosting SLAs.
    Finally, the LBS’s adaptive load balancing policy can effectively support the QoS-aware clus-
tering in providing distributed applications with an optimal resource utilization, compared to
standard static load balancing policies such as Round Robin or Random policies.
    However, the prototype architecture discussed in this Chapter presents some limitations that
lead to re-design and implement the architecture. The next Chapter describes these limitations
and the evolution of the prototype.





              [Bar chart — QaAS RR vs QaAS Adaptive, with additional load on one node of the
              cluster. Response time: QaAS RR 2752 ms, QaAS Adaptive 2287 ms]


                          Figure 21. QaAS RR vs QaAS WorkLoad: Response time




                                              Figure 22. QaAS RR vs QaAS WorkLoad: Throughput




Chapter 4

Evolution of the prototype
architecture

From the evaluation of the QaAS, discussed in Section 4, it has emerged that the QoS-aware
clustering alternates the use of one/two clustered nodes in the resource utilization experiments,
when a Round Robin load balancing policy is enabled and the imposed load is of the order of 10
clients. This means that, in this case, the middleware continuously attempts to add and release
nodes in the cluster so as to avoid SLA violations. This behavior is influenced by many factors,
among which the values of the HL and LL warning thresholds being used in the experiments.
    Although the run time CS reconfiguration that adds new clustered nodes can be an attractive
solution when the hosting SLA is close to being breached, it is also true that this operation exhibits
costs in terms of performance; this is due to the time spent by the newly added nodes to reach a
steady state in serving the client requests.
    To this end, this thesis evaluates a new design of the QaAS, in which the above mentioned
thresholds are not fixed a priori, but determined at run time, based on the past and current
operational conditions of the QoS-aware clustering. The purpose here is to improve the obtained
results, concerning the number of used nodes saved by QaAS compared to standard application
servers, while still maintaining the advantages of the previous experimental evaluation; namely,
the negligible overhead introduced by the proposed middleware architecture, and the better
performance obtained by enabling an adaptive load balancing policy in the cluster.
    Moreover, it is worth observing that in common industry SLAs there exists a specific param-
eter that allows the specified QoS requirements to be violated over a certain timeframe.
    Owing to the above observations, both the SLA and the SLA Monitoring have been revisited.
This Chapter first describes the changes applied to the example SLA discussed in Chapter 2,
and then the redesign and re-implementation of the MS.
    Note that the basic functionalities of the CS are unchanged. Rather, in the new design the
load balancing factor is no longer considered. In fact, the experimental evaluation carried out to
test the effectiveness of the new prototype showed that an adaptive load balancing policy, which
simply considers the number of pending requests of each clustered node, can be as attractive as
the previously described WorkLoad policy. In addition, this policy, which can be easily plugged
into the LBS architecture, has the advantage that the information it requires to dispatch the client
requests can be obtained by the LBS itself, without involving other middleware services (i.e., the
CS), as in the case of the WorkLoad policy. Needless to say, the interaction between the CS and
LBS is still maintained, in order for the LBS to obtain the cluster membership configuration
dynamically at run time.
    This Chapter concludes by describing the different techniques evaluated for the design of
the LBS’s adaptive load balancing policy, and discussing an experimental evaluation of the newly
implemented prototype of the middleware architecture.





1     Revisited SLA
As previously mentioned, the hosting SLA has been revisited in order to take into account com-
monly used characteristics of commercial SLAs. Figure 1 depicts the new hosting SLA.
    As shown in this Figure, additional parameters have been introduced in the hosting SLA
XML file. Specifically, the server has to guarantee that the SLA QoS requirements are met a certain
number of times during a specified period (the efficiency and efficiencyValidity
attributes in Figure 1).
    A few observations concerning these parameters are in order. First, the term efficiency is
used in some commercial SLA contracts, as stated in [BCL+ 04], to indicate the effectiveness of
the provided service (i.e., efficiency) over a specific timeframe (i.e., efficiency validity).
    Second, the above two parameters can also be read as the number of times the application
server is allowed to violate the obligations stated in the SLA, over a predefined time interval
(e.g., the SLA guarantee will be no less than 95 percent over 2 minutes).
    To conclude, it is worth noticing that the application operations of the Server Responsibil-
ities part of the hosting SLA are classified according to the server response time only (the
maxResponseTime attribute in Figure 1). In fact, the response time, rather than the
throughput, is currently widely used in a large number of commercial SLAs as an ef-
fective metric of the responsiveness of the service a provider delivers to its clients
[Agr, Wus02].


2     Revisited Monitoring Service
The part of the Monitoring Service that has been re-designed and re-implemented is that respon-
sible for monitoring the SLA, in order to detect whether or not the SLA contractual clauses are
close to being violated. As stated in the introduction of this Chapter, the new Monitoring Ser-
vice carries out its activities by using warning thresholds that are dynamically determined at run
time, based on the past and current operational conditions and SLA violations of the QoS-aware
clustering (see below).
    Figure 2 shows the revisited Monitoring Service architecture. As illustrated in this Figure, the
basic MS architecture is not changed; however, an additional component (i.e., SLA Violations, the
bold component in Figure 2) has been added to the SLA Monitoring.


3     SLA Monitoring
Cluster performance is monitored by making use of four additional MS components, named
Request Interceptor, SLA Violations, Evaluation and Violation Detection Service, and Cluster Perfor-
mance Monitor. Each component is described in the following separate paragraphs.

The Request Interceptor is used to intercept the client requests in order to collect the cluster per-
formance for specific application requests. The cluster performance consists of the set of client
request rate and response time values that allow one to detect whether or not the cluster is over-
loaded. Yet again, this interceptor is part of the LBS, as shown in Figure 3 (see next Subsection).

The SLA Violations component is responsible for monitoring the adherence to the SLA efficiency
attribute, whose characteristics are described in Section 1.
    Specifically, this MS component computes (i) the violation rate during the SLA efficiency va-
lidity period, and (ii) the average violation rate during a specified timeframe (the timeframe is
the fixed monitoring interval, termed monitoring frequency). The former is obtained by dividing
the number of requests per operation that have violated the SLA by the total number of served
requests. In contrast, the latter is computed as a moving average over the monitoring frequency,
in order to smooth the violation rate trend of the QoS-aware clustering.
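    The computation of the first of these two rates can be sketched as follows; the class and
method names are illustrative and are not taken from the QaAS sources.

```java
// Illustrative sketch of the violation rate computed by the SLA Violations
// component over the efficiency validity period; names are hypothetical.
public class SlaViolations {
    private long violating; // requests that exceeded the SLA response time
    private long served;    // total served requests in the validity period

    public void record(boolean violatedSla) {
        served++;
        if (violatedSla) violating++;
    }

    // (i) violation rate over the SLA efficiency validity period
    public double violationRate() {
        return served == 0 ? 0.0 : (double) violating / served;
    }
}
```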





          <?xml version="1.0" encoding="UTF-8"?>
          <SLAng xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
                   <Parties>
                           <Client>
                                    <Name>Facilitare Inc.</Name>
                                    <Address>Frankfurt</Address>
                           </Client>
                           <Server>
                                    <Name>Subito Inc.</Name>
                                    <Address>Stockholm</Address>
                           </Server>
                   </Parties>
                   <SLS>
                           <Hosting>
                                    <ClientResponsibilities>
                                            <ContainerServiceUsage name="HighPriority" requestRate="100/s">
                                                    <Operations>
                                                        <Operation path="catalog.jsp"/>
                                                        <Operation path="AddToCart"/>
                                                        <Operation path="checkout.jsp"/>
                                                        <Operation path="CheckoutCtl"/>
                                                    </Operations>
                                            </ContainerServiceUsage>
                                            <ContainerServiceUsage name="MediumPriority" requestRate="50/s">
                                                    <Operations>
                                                        <Operation path="bookDetails.jsp"/>
                                                        <Operation path="RemoveFromCart" />
                                                    </Operations>
                                            </ContainerServiceUsage>
                                            <ContainerServiceUsage name="LowPriority" requestRate="20/s">
                                                    <Operations>
                                                        <Operation path="index.jsp"/>
                                                        <Operation path="login.jsp"/>
                                                        <Operation path="LoginCtl"/>
                                                    </Operations>
                                            </ContainerServiceUsage>
                                    </ClientResponsibilities>
                                    <ServerResponsibilities serviceAvailability = "0.30" efficiency = "0.95"
          efficiencyValidity = "2">
                                            <OperationPerformance name="HighPriority" maxResponseTime="1.0s">
                                                    <Operations>
                                                        <Operation path="catalog.jsp"/>
                                                        <Operation path="AddToCart"/>
                                                        <Operation path="checkout.jsp"/>
                                                        <Operation path="CheckoutCtl"/>
                                                    </Operations>
                                            </OperationPerformance>
                                            <OperationPerformance name="MediumPriority"
          maxResponseTime="4.0s">
                                                    <Operations>
                                                        <Operation path="bookDetails.jsp"/>
                                                        <Operation path="RemoveFromCart" />
                                                    </Operations>
                                            </OperationPerformance>
                                            <OperationPerformance name="LowPriority" maxResponseTime="5.0s">
                                                    <Operations>
                                                        <Operation path="index.jsp"/>
                                                        <Operation path="login.jsp"/>
                                                        <Operation path="LoginCtl"/>
                                                    </Operations>
                                            </OperationPerformance>
                                    </ServerResponsibilities>
                           </Hosting>
                   </SLS>
          </SLAng>




                                         Figure 1. The revisited SLA






          [Figure 2 diagram: client requests enter the SLA Monitoring through the Request
          Interceptor, which sends them to the Cluster Performance Monitor; the Evaluation
          and Violation Detection Service evaluates the SLA and gets the SLA violation rate
          from the SLA Violations component, which records the violations; the SLA Monitoring
          components and the Membership Interceptor save their monitoring, violation, and
          membership data through the Measurement Service of the Monitoring Service.]



                                 Figure 2. The revisited MS architecture



    In particular, the moving average of the violation rate at time t is a function of (i) the
moving average of the violation rate obtained within the previous monitoring interval (i.e., the
moving average calculated at time t − 1), and (ii) the current violation rate.
    Therefore, let ma_vr be the moving average of the violation rate and vio_rate the current
violation rate; the following formula is used to compute ma_vr:

                          ma_vr(t) = α * ma_vr(t−1) + (1 − α) * vio_rate(t)                                              (1)

    The constant α weighs the moving average against the current violation rate. The higher α,
the lower the weight assumed by the current violation rate in the computation of the moving
average; conversely, the lower α, the higher the weight of the current violation rate in the final
average value.
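    Formula (1) is an exponentially weighted moving average, and can be sketched as follows;
the class and method names are illustrative, not taken from the QaAS sources.

```java
// Exponentially weighted moving average of the violation rate, per formula (1):
// alpha weighs the history against the current sample.
public class MovingAverage {
    public static double update(double maPrev, double vioRate, double alpha) {
        // high alpha -> the history dominates; low alpha -> the current sample dominates
        return alpha * maPrev + (1.0 - alpha) * vioRate;
    }
}
```

    For instance, with α = 0.8, a previous average of 0.10 and a current violation rate of 0.30
yield a smoothed value of 0.14, dampening the sudden spike.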
    It is worth noticing that in this new prototype of the middleware architecture, the SLA might
be violated, provided that the violation rate is kept below the SLA efficiency limit during the
SLA efficiency validity period.

The Evaluation and Violation Detection Service is in charge of monitoring whether the QoS deliv-
ered by the clustered nodes meets the hosting SLA. Specifically, this component detects, at run
time, variations in the operational conditions of the clustered nodes, which may affect the QoS
delivered by the cluster, and triggers the cluster reconfiguration, if necessary. This activity is
carried out by using a specific overload warning threshold described below.

The Cluster Performance Monitor cooperates with the Evaluation and Violation Detection Service
in order to assess the actual response time, the average response time and the client request
rate, delivered by the cluster. In particular, the Cluster Performance Monitor obtains the data
required to assess the response time and the client request rate from the Request Interceptor; it
then derives and maintains the client request rate and a set of indexes, each of which is described
in the following.





    The client request rate is calculated over the monitoring frequency by taking into account the
number of requests sent by the clients.
    In addition, the Cluster Performance Monitor maintains a set of indexes that are used to keep
track of the cluster’s SLA violation trend at run time. Specifically, there exist two indexes
in the new design, the so-called efficiency index and the node efficiency. The former is used to
check whether new nodes are to be added to the current cluster configuration, as it records the
current operational conditions of the QoS-aware clustering; in contrast, the latter is used to
verify whether or not a specific node might be excluded from the cluster.
    In particular, the efficiency index indicates whether the current cluster operational conditions
are respecting the hosting SLA. In other words, the efficiency index is used to verify whether
or not the SLA violation rate, exhibited by the cluster and related to the current cluster load
conditions, is critical compared to the SLA efficiency. Therefore, the index is initialized to
0 and, during the monitoring frequency, can be incremented or decremented depending on the
SLA response time violations.
    Specifically, let E be the SLA efficiency; the efficiency index is updated as

                        efficiency_index = efficiency_index − 1/(1 − E)                        (2)

if the actual request response time violates the related maxResponseTime of the SLA; other-
wise it is incremented by 1, as long as its value is not equal to 0. Note that the efficiency index
always assumes a value in the range [−Ω, 0], where Ω is a constant value set at deployment time.
    Hence, if the efficiency index ranges between 0 and −1/(1 − E), the actual violation rate is
lower than (1 − E) (the fraction of times the hosting SLA can be violated); in this case, the cluster
is positively responding to the injected client load. In contrast, if the index falls below −1/(1 − E),
the cluster exhibits a violation rate that is close to breaching the (1 − E) bound of the SLA.
    For example, the SLA of Figure 1 states that the contractual clauses can be breached at most 5%
of the time, over a predefined timeframe. Thus, in case of an SLA violation, the efficiency index
is decremented by a value equal to 20; that is, one out of every 20 client requests is allowed to
violate the SLA. If the efficiency index ranges in [−20, 0], then the violation rate is lower than 5%;
in this case the cluster is responding to the client load without breaching the SLA efficiency.
If the index falls below −20, the current violation rate of the cluster is close to the
SLA limit.
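    The update rule of formula (2), with the increment-toward-0 behavior and the [−Ω, 0] bound,
can be sketched as follows; the names are illustrative, not taken from the QaAS sources.

```java
// Sketch of the efficiency index update (formula (2)); omega is the
// deployment-time bound on the index. Names are hypothetical.
public class EfficiencyIndex {
    public static double update(double index, boolean violated,
                                double efficiency, double omega) {
        if (violated) {
            index -= 1.0 / (1.0 - efficiency); // e.g. -20 per violation for E = 0.95
        } else if (index < 0) {
            index += 1.0;                      // recover toward 0, never above it
        }
        return Math.max(index, -omega);        // keep the index within [-omega, 0]
    }
}
```

    With E = 0.95, a single violation moves the index from 0 to −20, and each subsequent
compliant request moves it back by 1, matching the example above.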
    In this latter case some corrective actions are to be taken in order to prevent the SLA efficiency
from being breached. To this end, an overload warning threshold is used and maintained by the
earlier mentioned Evaluation and Violation Detection Service. This threshold allows one to gauge
the responsiveness of the cluster in proportion to its ability to meet the SLA efficiency obligation.
In fact, its principal purpose is to record the “history” of the SLA violations exhibited by the
cluster.
    Thus, the overload warning threshold, which ranges in [∆, Ω], is dynamically computed
based on the violations of the SLA (yet again, ∆ and Ω are constant values set at deployment
time).
    Specifically, the overload warning threshold is firstly initialized to a default value, and then
computed by considering both the moving average of the violation rate, within the monitoring
frequency, and the general violation rate of the SLA over the efficiency validity period, all pro-
vided by the SLA Violations component.
    Hence, let ma_vr be the moving average of the violation rate, E the SLA efficiency, and vr the
general violation rate; the overload warning threshold (OWT in the formulas) can be calculated
as follows.

                    OWT = OWT * 2    if max(ma_vr, vr) < (1 − E) * δ,  with 0 < δ < 1                    (3)

                    OWT = OWT / 2    if max(ma_vr, vr) > (1 − E)                                         (4)
    Formula (3) doubles the threshold if the worse of vr and ma_vr is lower than the fraction of
times the SLA can be violated, multiplied by a constant value δ (see below). This
means that the current operational conditions of the cluster are such that the QoS-aware cluster-
ing is not close to breaching the (1 − E) limit imposed by the hosting SLA.
    In the new design, δ represents how close one wishes to get to breaching the SLA, de-
pending on the risk one wishes to assume as to the violation of the contract. Hence, the higher δ,
the higher the tolerated violation rate and consequently that risk; the lower δ, the lower the
violation rate and the risk one wishes to assume.
    In contrast, formula (4) halves the overload warning threshold when the worse of the above
two values is higher than the fraction of times the hosting SLA can be violated; in this
case, in fact, the cluster of nodes is not positively responding to the client load and the threshold
must be made more stringent.
    The second index maintained by the Cluster Performance Monitor is the node efficiency. This
index is maintained for each node of the cluster (it is actually included in the resource plan object
maintained by the CS), and is used to exclude a specific node from the cluster in case of low load
conditions. Its main objective is to keep track of the operational conditions of a specific clustered
node.
    Hence, let i be the i-th node of the current cluster configuration, N the number of nodes used
in the cluster configuration, and E the SLA efficiency; the node efficiency ne_i, initially equal to
0, is computed as follows, as soon as an incoming client request is intercepted by the MS.

                        ∀ i ∈ N:  ne_i = ne_i − 1/(1 − E)                                                (5)
    Thus, ne_i is computed by using (5) if the estimated response time, delivered by the cluster
configuration that one would obtain by excluding the i-th node, is higher than the related
SLA maxResponseTime. In contrast, if the estimated response time is lower than the SLA
bound, the node efficiency is incremented by 1, as long as its value is not equal to 0, as in the case
of the efficiency index. Yet again, the node efficiency ranges in [−Ω, 0].
    The estimated response time, obtained by excluding a specific node from the cluster, considers
the number of served requests of each node during the monitoring frequency, if the new adaptive
load balancing policy (see below) is enabled; its value is given by applying the formula (6) below.
    Therefore, let N be the number of nodes used in the current cluster configuration, RT the
actual response time of the incoming request, and sreqs the requests served by a node over the
monitoring frequency; the estimated response time RT_C(N−i) of the cluster configuration that
one would obtain by excluding i is:

                  RT_C(N−i) = RT * (Σ_{k=1..N} sreqs_k) / (Σ_{j=1..N, j≠i} sreqs_j)                      (6)
    In contrast, if a static load balancing policy is used (e.g., Round Robin), the estimated response
time is calculated in a different way. In this case, it is necessary to evaluate the estimated
response time of each clustered node since, in a static load balancing policy such as Round
Robin, each node is in turn chosen for serving the client requests. Hence, for
each clustered node k, the estimated response time of k that one would obtain by excluding,
from the cluster configuration, the aforementioned i-th node is computed. Thus, let average_RT
be the average response time of a node; in this case, the following formula is used, instead, in the
computation of k’s estimated response time:

          ∀ k ∈ N, k ≠ i:  (RT_k)_C(N−i) = (RT * average_RT_k / (average_RT_i * N)) / (N − 1)            (7)
    In this case, the node efficiency ne_i is updated with formula (5) if there exists at least one
node k ∈ N that exhibits an estimated response time higher than the SLA maxResponseTime;
otherwise, as before, ne_i is incremented by 1 as long as its value is not equal to 0.
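    The estimation of formula (6) can be sketched as follows; the class and method names are
illustrative, not taken from the QaAS sources.

```java
// Sketch of formula (6): response time estimated for the cluster configuration
// obtained by excluding node i, under the adaptive load balancing policy.
// sreqs holds the requests served by each node over the monitoring frequency.
public class EstimatedRt {
    public static double excludeNode(double rt, long[] sreqs, int i) {
        long total = 0, withoutI = 0;
        for (int j = 0; j < sreqs.length; j++) {
            total += sreqs[j];
            if (j != i) withoutI += sreqs[j];
        }
        // the excluded node's share of the load is redistributed over the rest
        return rt * total / withoutI;
    }
}
```

    For instance, if three nodes served 10, 20, and 30 requests and the current response time is
1.0 s, excluding the first node inflates the estimate to 1.0 * 60/50 = 1.2 s.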



    Moreover, in case the WorkLoad policy, which uses the load balancing factor to dispatch the
requests, is enabled in the cluster, the estimated response time is calculated by taking into account
the load balancing factor, as this factor determines the load supported by the clustered nodes.
Hence, in case of WorkLoad policy, the estimated response time of the formula (6) is given by
substituting the sreqs of a node with the load balancing factor lbf of that node.
    To conclude, it can be noticed that the node efficiency mimics the behavior of the efficiency
index, when a CS reconfiguration that releases clustered nodes is to be applied.
    Once the Cluster Performance Monitor has collected and computed the aforementioned
information, it makes these results available to the Evaluation and Violation Detection
Service.
    The skeleton code of Figure 3 shows the protocol performed by these MS components.
    As shown in Figure 3, the Cluster Performance Monitor starts its thread at SLA deployment
time; the thread invokes the Evaluation and Violation Detection Service in order to check the
values the Cluster Performance Monitor has collected and computed (the saveData function
of Figure 3) against the hosting SLA. Note that this activity is repeated at the
aforementioned monitoring frequency.
    The Evaluation and Violation Detection Service evaluates the SLA. The evaluation consists of
checking the SLA Client Responsibilities (i.e., in Figure 3 the SLAClientRequestRateEvaluation
function) and the SLA Server Responsibilities (i.e., the SLARTEvaluation of Figure 3), in order.
As to the former, if the client sends a higher number of requests than allowed, the client is
indeed violating the SLA and no corrective actions are performed to reconfigure the cluster; in
this case, an exception is raised at the application level (typically, this situation requires the client
to pay the penalties stated in the SLA).
    In contrast, if the client respects the SLA, the Evaluation and Violation Detection Service mon-
itors the adherence to the SLA Server Responsibilities. This activity consists of comparing the
aforementioned efficiency index, provided by the Cluster Performance Monitor, with the ear-
lier described overload warning threshold maintained by the Evaluation and Violation Detection
Service itself.
    Specifically, if the efficiency index is lower than the threshold, the CS is invoked so as to
reconfigure the cluster by adding new nodes. In fact, in this case the violation rate is close to
violating the SLA limit and a reconfiguration is necessary in order to adapt the cluster to the new
load conditions.
    Note that the CS can augment the cluster with one node at a time or with more than one.
When adding one node at a time, a waiting time is spent between consecutive CS reconfigu-
rations; this time may be useful in order to deal with the transient phase (or induction phase) of
a newly added node, that is, the time elapsed before that node reaches a steady state in serving
the client requests.
    In contrast, adding more than one node at a time may be useful in order to cope with flash
crowds. In fact, this case may not be completely resolved by adding just one node at a time to
the cluster, owing to the aforementioned induction phase of the new node.
    If the efficiency index is higher than the overload warning threshold, the Evaluation and
Violation Detection Service verifies whether or not some clustered nodes can be released. In fact,
in this case, the cluster is positively responding to the injected client load and some nodes might
be excluded from the cluster as no longer necessary. To this end, both the efficiency index and
the node efficiency are used; specifically, if the efficiency index is equal to 0 and there exists at
least one node with a node efficiency equal to 0, that node is actually excluded from the cluster
by the CS, and included in the pool of spare resources.


4     Load Balancing Policies
As to the LBS, the principal functionalities of its architecture have been maintained; however,
an additional adaptive policy, termed Pending Requests Queue, has been introduced. This policy
takes into account the number of pending requests (i.e., requests that are waiting for processing)





           // SLA DEPLOYMENT TIME
                 startCPMThread();
           // RUN TIME
                class ClusterPerformanceMonitor {
                  saveData() {
                    /* Record number of arrived and served requests when an
                       incoming client request is intercepted */
                    // Compute response time of incoming request
                    if (rt <= SLArt) {
                       efficiencyIndex++;
                    }else {
                       // Record violation with SLA violations component
                       SLAViolations.recordViolation();
                       // Decrement efficiency index
                    }
                    // for each node, update node efficiency
                  }

                     run() {
                       while(true) {
                         // compute client request rate and evaluate SLA
                         EVDS.SLAEvaluation();
                         sleep(monitoringFrequency);
                       }
                     }
                 }

                 class EvaluationViolationDetectionService {
                      SLAEvaluation() {
                        // Update warning load threshold
                        if (SLAClientRequestRateEvaluation())
                           SLARTEvaluation();
                        reconfigure();
                      }

                        reconfigure() {
                          if (excludeInstances()) {
                              CS.excludeNodesReconfiguration();
                          } else if (addNewInstances()) {
                              // Add new nodes
                              CS.addNewInstancesReconfiguration();
                          } // else: no need to reconfigure
                        }
                }




                         Figure 3. New SLA Monitoring Skeleton code








                Figure 4. CPU utilization






in the LBS queues, one per clustered node. Hence, when an incoming client request is
intercepted by the LBS, this policy forwards that request toward the clustered node with the
shortest pending requests queue. The motivations for using such a load balancing policy are
as follows.
    Chapter 2 describes the WorkLoad policy as a load balancing strategy that uses a load balanc-
ing factor to dispatch the incoming client requests in the cluster. That factor is computed as a
linear combination of three node-related parameters, namely the node response time, the
throughput, and the JVM available memory. These parameters, as seen before, are obtained by
interrogating the MicroResourceManager service, added to QaAS for this purpose. Although the
experimental evaluation has demonstrated that the WorkLoad policy so computed is beneficial
compared to a Round Robin policy, the calculation of the load balancing factor in formula (5)
exhibits the following principal shortcoming.
    The MicroResourceManager computes the node response time and throughput at the EJB con-
tainer level only, by using a collection of interceptors attached to that container; thus, the load
generated by the Web container is not taken into account. In a scenario of colocation, such as the
deployment platform shown in Figure 6, the impact of this shortcoming may be limited, owing
to the local calls enabled between the two containers. However, in a more realistic scenario,
in which the Web level of the J2EE architecture is deployed separately from the EJB level, the
WorkLoad policy may not be as beneficial as shown in the previous experimental evaluation.
In fact, in this case, additional overhead is introduced by the consequent remote communication
between the above two levels. Needless to say, this overhead may notably influence the client-
perceived response time and the throughput. Hence, as the WorkLoad policy uses these two
parameters, which are not computed by considering the load generated by the Web level, in this
case of non-colocation of the two containers the final value of the load balancing factor of each
node may not correspond to the real load of the clustered nodes. This prevents exploiting the
advantages offered by an adaptive load balancing policy that forwards the requests toward the
less loaded nodes.
    In addition, the available JVM memory turned out to be an unstable metric for estimating
the actual load of the nodes, owing to its uncontrolled behavior.
    To this end, other approaches have been evaluated that use (i) low-level monitoring data
(i.e., CPU utilization) to determine the actual load of each clustered node, and (ii) high-level
monitoring data (i.e., client perceived response time) computed by taking into account the load
generated by both the Web and the EJB containers. However, these approaches exhibited worse
performance than the Pending Requests Queue policy that was eventually developed (see below).
The reasons why the aforementioned techniques were not successful can be summarized as
follows.
    As to low-level monitoring data, the CPU utilization metric has been evaluated. It has been
observed that, if the CPU utilization of a machine increases, the response time of that machine
increases too. However, as shown in Figure 4, the response time grows roughly linearly, whereas
the CPU utilization reaches its maximum value before a significant degradation of the response
time occurs; CPU utilization is therefore a poor indicator of impending overload.
    Owing to this limitation, a high-level monitoring metric, the client perceived response time,
has been considered as a parameter in the computation of the load balancing factor. To this end,
two different approaches have been compared, namely one that uses the moving average of the
response time, and a second one that simply takes the average of the response time computed in
every monitoring interval. In either case, the principal aim was to keep the response time
exhibited by each clustered node balanced.
    As to the former approach, the load balancing factor has been calculated by taking into
account the moving average of the response time. Specifically, at cluster configuration time it
has been assumed that every node i delivers the same response time to the clients; hence, node
i's load balancing factor, in percentage, was obtained by dividing 100 by the total number of
nodes composing the initial cluster configuration. At run time, node i's moving average of the
response time has been measured by using formula (1); the load balancing factor was then
computed in percentage by considering both node i's moving average of the response time and the





         Figure 5. Response Time: login.jsp                     Figure 6. Response Time: catalog.jsp



i’s load balancing factor obtained in the previous monitoring interval (it has been imposed that
the load balancing factor change within every monitoring interval, for this purpose).
     As to the second technique, the load balancing factor has been computed in percentage by
considering the average of the response time within the monitoring interval.
     Hence, the load balancing factor lbf of each clustered node has been computed by using
formula (8) below:

           ∀i ∈ N    lbf_i(t) = [ ((Σ_{j=1}^{N} resp_time_j) / N) · lbf_i(t−1) ] / resp_time_i        (8)
    where N is the total number of clustered nodes in the current cluster configuration, and
resp_time_i is node i's response time, namely either its moving average or its average computed
within the monitoring interval, depending on which of the two approaches above is enabled.
    Note that the load balancing factor at instant t has been computed by taking into account
the load balancing factor of the previous monitoring interval t − 1 (i.e., the load sent to each
clustered node until that time), in order to reduce the so-called “bouncing effect” between the
values of the load balancing factors.
    For example, suppose there are two clustered nodes and that, at time instant t − 1, the load
balancing factor of one node is significantly higher than that of the other. In this case, the LBS
that enables the WorkLoad policy sends more requests to the node with the highest load balancing
factor, overloading that node. At time t, when the load balancing factor is computed again, the
CS assigns the lower load balancing factor to the node that is now overloaded, and the higher
factor to the other clustered node. Needless to say, at time t + 1 the values of the load balancing
factors are likely to be inverted again, producing what I termed the “bouncing effect”.
    Hence, owing to these observations, in formula (8) the load previously sent to each clustered
node (i.e., its load balancing factor) is also considered at every monitoring interval, as the load
balancing factor indeed turned out to be an influential parameter in reducing the effect described
above.
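    As a concrete illustration of formula (8), the following sketch (hypothetical code, not part of the prototype) computes the new load balancing factors of a two-node cluster from the measured response times and the factors of the previous interval. Note that factors obtained directly from formula (8) do not necessarily sum to 100, so an implementation working in percentages would presumably renormalize them.

```java
class LoadBalancingFactor {
    // Formula (8): lbf_i(t) = (mean(resp_time) * lbf_i(t-1)) / resp_time_i.
    // A node whose response time exceeds the cluster mean sees its factor
    // reduced; the previous factor damps the "bouncing effect".
    static double[] computeFactors(double[] respTime, double[] prevLbf) {
        int n = respTime.length;
        double mean = 0.0;
        for (double r : respTime) mean += r;
        mean /= n;
        double[] lbf = new double[n];
        for (int i = 0; i < n; i++) {
            lbf[i] = (mean * prevLbf[i]) / respTime[i];
        }
        // The actual implementation may renormalize these values so that
        // they sum to 100 (percentages); that step is omitted here.
        return lbf;
    }

    public static void main(String[] args) {
        // Two nodes, initial factors 100/N = 50 each; node 0 responds
        // twice as slowly as node 1.
        double[] prev = {50.0, 50.0};
        double[] resp = {200.0, 100.0};
        double[] lbf = computeFactors(resp, prev);
        // mean = 150; lbf0 = 150*50/200 = 37.5; lbf1 = 150*50/100 = 75.0
        System.out.println(lbf[0] + " " + lbf[1]);
    }
}
```

Running the example shows the slower node receiving a smaller factor than the faster one, which is the intended effect of the formula.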
    During the experimental evaluation of the new prototype of the middleware architecture,
it has emerged that both the aforementioned approaches perform better than a static load
balancing policy (e.g., Round Robin). Nevertheless, as depicted in Figures 5 and 6, the response
time turned out to be an unstable metric for calculating the actual load of each clustered node.
Specifically, Figure 5 illustrates the response time distribution of the “login” operation of the
bookshop application, obtained by imposing on the cluster a load that consisted of 10 clients
accessing the application for 30 seconds. As can be seen from that Figure, the response time has
a distribution characterized by many peaks.
    Figure 6 shows the response time distribution of the “catalog” operation obtained by
performing the earlier mentioned test. In this case, the distribution is even more irregular than
in the first case; this is caused by the “catalog” operation itself, which accesses a non in-memory
database (see below) in order to retrieve information concerning the catalog of the books that
clients can buy.


[Figure 7 depicts the new deployment platform: browsers in the Client Tier issue HTTP requests
to a J2EE Tier of Extended JBoss instances, each hosting a Web container and an EJB container,
supported by the QoS-aware clustering services (JMS, JTA, MRM, JMX implementation, JDBC,
Security, LBS).]

                                  Figure 7. The new Deployment Platform


    Therefore, in essence, the load balancing factor computed by using the response time metric
proved unstable during the experimental evaluation.
    Owing to the above observations, the Pending Requests Queue policy introduced earlier has
been developed. Note that, when this policy is enabled, the CS no longer computes the load
balancing factor, as it is no longer necessary. In this way, it has been possible to balance the
incoming client requests based on information owned by the LBS itself, without the additional
interaction between the LBS and other QoS-aware middleware services (i.e., the CS) required by
the WorkLoad policy described in Chapter 2. This policy turned out to be the best one in the
further experimental evaluation that has been carried out.


5    Revisited Implementation
Following the earlier discussion, the QaAS implementation has been revisited; in particular,
Figure 7 depicts the new deployment platform, in which each clustered QaAS has been extended
with two principal services only, avoiding the use of the MicroResourceManager mentioned
before.
    Within such an environment, the implementations of the LBS and MacroResourceManager
(MRM in Figure 7) have been partially changed. In particular, as to the MacroResourceManager,
the principal changes have been applied to the SLA component of the Monitoring Service, for




which a new class (i.e., SLAViolations) has been added to its previous implementation.
Moreover, new methods have been implemented for both the CS and MS instances included in
the MRM service.
   As to the Load Balancing Service, a new scheduler manager termed QueueSchedulerManager
has been incorporated in the architecture, in order to implement the aforementioned adaptive
Pending Requests Queue policy. Like all the other LBS scheduler managers, the
QueueSchedulerManager implements an interface through which its principal method
getNextNode() is invoked.
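    The behavior of getNextNode() under the Pending Requests Queue policy can be sketched as follows. Only the class name QueueSchedulerManager and the method getNextNode() come from the text; the queue bookkeeping, the node representation and the helper methods are hypothetical reconstructions, not the thesis code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the Pending Requests Queue policy: each clustered node has a
// counter of requests dispatched to it and not yet answered; a new request
// goes to the node with the shortest pending-requests queue.
class QueueSchedulerManager {
    private final Map<String, AtomicInteger> pending = new ConcurrentHashMap<>();

    void addNode(String node) {
        pending.putIfAbsent(node, new AtomicInteger(0));
    }

    // Called by the LBS for each intercepted client request: returns the
    // node with the fewest pending requests and records the dispatch.
    String getNextNode() {
        String best = null;
        int min = Integer.MAX_VALUE;
        for (Map.Entry<String, AtomicInteger> e : pending.entrySet()) {
            int len = e.getValue().get();
            if (len < min) { min = len; best = e.getKey(); }
        }
        if (best != null) pending.get(best).incrementAndGet();
        return best;
    }

    // Called when a node's response has been returned to the client.
    void requestCompleted(String node) {
        AtomicInteger q = pending.get(node);
        if (q != null) q.decrementAndGet();
    }
}
```

This design needs no interaction with the CS or MS: the LBS already observes every dispatch and every response, so the queue lengths are local knowledge.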


6     Experimental Evaluation
The new experimental evaluation used a cluster configuration that consisted of nine Linux
machines interconnected by a dedicated 1Gb Ethernet LAN. Each machine is based on a 2.66GHz
dual Intel Xeon processor and is equipped with 2GB of RAM. Yet again, this cluster was
located at the Department of Computer Science of the University of Bologna. In the experiments
described below, one of these machines was dedicated to hosting the implementation of the LBS;
the other machines were used to host both the application server instances that served the client
requests and the client program used to generate artificial load in the cluster. To this end, the
client program simulated a collection of clients accessing an application hosted in the cluster; for
the purposes of this new evaluation, a client program implemented in the above Department was
used, rather than the JMeter tool of the previous experimental evaluation. In fact, JMeter
presented some usage limitations that have been overcome by this program, which allows one to
(i) specify a variety of client load distributions, (ii) specify different request rates for the artificial
clients, and (iii) simulate the typical behavior of common browsers, by enabling a form of
caching of the static contents of the HTTP client requests.
    The clustered servers hosted (homogeneously) the aforementioned digital bookshop
application. However, the database used in these experiments was the PostgreSQL database [Pos].
As the principal scope of this evaluation was to assess the performance of the QoS-aware
middleware services only, this database was replicated and instantiated locally on each clustered
machine, in order to avoid database bottlenecks; issues of database consistency were again
ignored, as outside the scope of this evaluation.
    Hence, four principal experiments have been carried out, each consisting of a variety of tests.
In all the experiments, the constants described earlier, required for SLA monitoring purposes,
were set at deployment time to the following values:
    • Ω = 8192
    • ∆ = 64
    • δ = 0.3
    In addition, in the experiments with homogeneous clustered machines, a constant number of
clients has been used. Each client sent HTTP requests with no waiting time between successive
requests; in this case, the clients simply waited for the server responses before issuing further
requests. In contrast, in the experiments with heterogeneous machines, each client sent HTTP
requests at a fixed rate (see below). Finally, as to the approach enabled in the LBS, in all the
experiments described in this Section a “per-session” load balancing was again used, with the
Pending Requests Queue policy enabled; hence, HTTP session replication on the clustered nodes
that served the client requests was disabled.
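    The “per-session” dispatching mentioned above can be sketched as follows: the underlying balancing policy is consulted only for the first request of a session, and subsequent requests of the same session are pinned to the node chosen then, which is why HTTP session replication on the serving nodes could be disabled. All names below are hypothetical illustrations, not the thesis code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch of per-session load balancing: the policy (e.g. the
// Pending Requests Queue scheduler) is invoked once per session; later
// requests of that session are routed to the same node, so the HTTP session
// state never needs to be replicated across nodes.
class PerSessionDispatcher {
    private final Map<String, String> sessionToNode = new ConcurrentHashMap<>();
    private final Supplier<String> policy; // yields the next node to use

    PerSessionDispatcher(Supplier<String> policy) {
        this.policy = policy;
    }

    // Returns the node serving this session, choosing one via the policy
    // only on the session's first request.
    String nodeFor(String sessionId) {
        return sessionToNode.computeIfAbsent(sessionId, id -> policy.get());
    }
}
```

The trade-off, consistent with disabling session replication, is that a session's state lives on a single node; if that node fails, its sessions are lost.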

6.1 Overhead Evaluation
The first concern has been to assess whether the new prototype of the QoS-aware middleware
services was adding unnecessary overheads to the cluster response time (measured in ms) and
throughput (measured in pages per second (pag/s)), in the absence of failures. To this end, the





     Figure 8. QaAS’s overhead: Response Time           Figure 9. QaAS’s overhead: Throughput




      Figure 10. LBS Scalability: Response Time         Figure 11. LBS Scalability: Throughput



new QaAS was instantiated in the cluster above. The test, performed over a 2-minute timeframe,
consisted of 20 clients accessing the application operations (this test was run numerous times;
the results shown are the average values obtained from the various runs).
    In order to evaluate the overhead introduced, two different configurations were compared
and contrasted. The first configuration consisted of one standard application server instance
with no LBS; in this case, all the client requests were dispatched directly to that server instance.
The second configuration consisted of one QaAS instance and the LBS used to dispatch the
incoming client requests to that QaAS instance.
    The results of these tests are depicted in Figures 8 and 9. As shown in these Figures, QaAS
introduces a negligible overhead to both the cluster response time and the throughput
(approximately 5-6%) compared to the configuration with no QaAS being used.
    It is worth observing that this overhead is obtained by comparing a one-node cluster, with no
load balancing service enabled, against a cluster configuration that deploys the QoS-aware
middleware services and the LBS component. Thus, the overhead is principally caused by the
introduction of this load balancing component, necessary in a clustered environment, rather
than by the deployment of the CS and MS.

6.2 LBS scalability
The second experiment has been carried out in order to evaluate the scalability of the LBS in
terms of number of both clustered nodes and supported client requests. To this end, some tests



were performed.
    In these tests, firstly one node and 20 clients issuing HTTP requests were used. These
requests were intercepted by the LBS and dispatched to that node. Then, the number of servers
was increased to 2 with 40 clients, 3 with 60 clients and finally 4 nodes with 80 clients,
respectively.
    The obtained results are illustrated in Figures 10 and 11. Figure 10 shows the response time
of the cluster. As can be noticed, the response time in the tests remains approximately constant
as both the number of nodes in the cluster and the injected load increase.
    Figure 11 depicts the throughput of the cluster. As shown in this Figure, as the number of
clustered nodes and imposed clients increases, the throughput of the cluster increases linearly.
    In essence, Figures 10 and 11 demonstrate that the LBS scales well as both the number of
nodes and the number of clients in the cluster grow.

6.3 Static vs Adaptive Load Balancing policies
The scope of the third experiment, described in this Section, was to assess the effectiveness of the
new adaptive load balancing policy (i.e., the Pending Requests Queue) compared to a static one,
i.e., the Round Robin (RR) policy. To this end, the performance of the QoS-aware clustering was
evaluated, in terms of response time and throughput, with the adaptive load balancing policy
and the static load balancing policy in turn enabled in the cluster.
     It is worth observing that, for this experiment, firstly two clustered QaAS nodes serving the
client requests were used, each deployed on one of two identical dual-processor machines. A
dedicated node hosted the LBS, and two other machines hosted the client program (note that, in
this experimental evaluation, the client program itself has been replicated in the cluster in order
to prevent it from becoming a bottleneck).
     Each client program simulated 20 clients, imposing a total of 40 clients on the cluster,
sending 90 HTTP requests per second toward the cluster.
     Within such an environment, some tests have been carried out. The first test enabled the RR
policy in the LBS in order to dispatch the incoming client requests. In contrast, the second test
enabled the Pending Requests Queue policy. The obtained results are shown in Figures 12 and 13.
     As shown in these two Figures, the performance exhibited by the two policies is very similar
for both the response time (Figure 12) and the throughput (Figure 13) of the cluster. This is due to
the fact that, when machines of equal power are deployed in the cluster, the principal advantages
of an adaptive policy, which operates based on the actual load of each clustered node, cannot be
fully exploited.




    Figure 12. RR vs Adaptive: Response Time          Figure 13. RR vs Adaptive: Throughput (iden-
    (identical clustered nodes)                       tical clustered nodes)


   Thus, a further test has been carried out. This latter test case was the same as that introduced
above, with the exception that one of the QaAS nodes serving the client requests was deployed




on a single-processor machine. In this way, the test was able to simulate different operational
conditions of the machines of the cluster, so as to differentiate the ability of the machines
themselves to respond to the injected client load.




    Figure 14. RR vs Adaptive: Response Time           Figure 15. RR vs Adaptive: Throughput (single
    (single and dual processor machines)               and dual processor machines)


    The results of this second test are illustrated in Figures 14 and 15.
    Specifically, Figure 14 shows that the response time of the cluster when the adaptive policy
is used is notably shorter (by approximately 47%) than when the RR policy is enabled. The same
holds for the throughput of the cluster: as shown in Figure 15, the throughput when the Pending
Requests Queue policy is enabled is notably higher (by approximately 33%) than that obtained
with the RR policy.
    Thus, this experiment demonstrates the effectiveness of the new adaptive load balancing
policy when the machines of the cluster exhibit different operational conditions.
    In essence, the new results confirm the conclusion of the preliminary evaluation of this thesis,
namely that an adaptive load balancing policy can in general be more beneficial than a static one.

6.4 Resource Utilization Evaluation
The last experiment aims at showing the ability of the new prototype to optimize the usage of
the resources (i.e., nodes) in the cluster without causing hosting SLA violations. For the
purposes of this evaluation, it has been assumed that if no dynamic clustering techniques, such
as those enabled by QaAS, are provided, a resource over-provisioning policy is used instead,
which allocates as many nodes as possible to ensure that the hosting SLA is honored. In
addition, it has been assumed that a maximum of 4 nodes can be allocated in the cluster.
    Therefore, with an over-provisioning policy, all 4 available nodes are used; in contrast, QaAS
can dynamically allocate from a minimum of 1 node up to 4 nodes in the cluster, depending on
the load imposed in different time intervals, in order to honor a specific hosting SLA.
    In all the tests below, the adaptive load balancing policy (i.e., the Pending Requests Queue)
has been used, as its effectiveness compared to the static ones has been demonstrated.
    Based on these assumptions, this last experiment, which consisted of two principal tests, has
been carried out.
    In the first test, it has been assumed that the SLA efficiency requirement (the efficiency
attribute illustrated in Figure 1) was set to 95%; that is, the SLA requirements can be violated 5%
of the time over a predefined timeframe.
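    The meaning of the efficiency attribute can be illustrated with the following sketch; the names are hypothetical, and the violation count, which the thesis computes within the monitoring frequency, is simplified here to a batch of observed response times checked against the SLA maximum.

```java
// Hypothetical check of the SLA efficiency requirement: given the response
// times observed in a timeframe and the SLA maximum response time, verify
// that the fraction of violating requests stays within the allowed margin
// (5% for a 95% efficiency, 2% for a 98% efficiency).
class SlaEfficiency {
    static boolean honored(double[] responseTimesMs, double maxResponseMs,
                           double efficiencyPercent) {
        int violations = 0;
        for (double rt : responseTimesMs) {
            if (rt > maxResponseMs) violations++;
        }
        double violationRate = 100.0 * violations / responseTimesMs.length;
        return violationRate <= (100.0 - efficiencyPercent);
    }
}
```

For example, with a 1s maximum response time, a single violating request out of 100 (a 1% violation rate) honors a 95% efficiency, but would not honor a 99.5% one.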
    Hence, this test has first been run for 35 minutes, using a client distribution similar to that
adopted in the experimental evaluation of Section 4. Note that each resource utilization graph
illustrated in this Section is to be read paired with its own underlying graph. Specifically,
the graphs labeled (a) show how the clustered nodes are allocated by QaAS at different





Figure 16. Resource Utilization: SLA efficiency   Figure 17. Resource Utilization: SLA efficiency
95%, 35 minutes test                             95%, 13 minutes test




Figure 18. Resource Utilization: SLA efficiency   Figure 19. Resource Utilization: SLA efficiency
95%, 39 minutes test                             95%, 65 minutes test






time intervals, depending on the load injected in the cluster. The imposed load distribution is
illustrated in the graphs labeled (b).
    Thus, initially 16 clients have been imposed, then 32, 56, again 32, and finally 16 clients, each
group repeating the bookshop operations for 7 minutes. The results of this test are shown
in Figure 16. As depicted in that Figure, the distribution of the resource utilization follows the
imposed client load distribution.
    The same test has then been run by subjecting the cluster to a smoother client distribution.
Figures 17, 18, and 19 show the results obtained by running the test for 13, 39 and 65 minutes,
respectively. In this case, initially 8 clients were injected, concurrently issuing the entire
sequence of bookshop operations. After a fixed time interval (a period of 1’, 3’, and 5’ in
Figures 17, 18, and 19, respectively), the number of clients increased to 16, and then, after the
same period in each test, to 24, and then to 32, 40, 48 and finally 56 clients (the clients always
issued the entire sequence of bookshop operations). The client load then decreased, first from 56
to 48, and then to 40, 32, 24, 16 and finally 8 clients again.
    As shown in the above three Figures, QaAS optimizes resource allocation by dynamically
adding resources to the cluster when necessary, and releasing them as they become unnecessary.
In addition, it can be seen that, in the longer snapshot tests (for example, the 39-minute or
65-minute tests), QaAS releases the resources in a way that closely follows the load distribution
shown in graphs (b). This behavior of the middleware architecture is less evident in the shorter
snapshot tests (13 minutes, for instance), as the time spent by QaAS releasing the clustered
nodes weighs, in percentage, significantly on the overall snapshot of the test.
    In the second test, a more stringent SLA, in which the efficiency attribute was set to 98%,
has been considered. This means that the SLA requirements could be violated only 2% of the
time over a predefined timeframe.
    As in the previous test, this second test has first been run for 35 minutes, by imposing 16, 32,
56, 32 and finally 16 clients, and then for 13, 39 and 65 minutes with the smoother client
distribution previously described. The results obtained are shown in Figures 20, 21, 22 and 23,
respectively. As can be noticed, with the first imposed client distribution QaAS uses more
clustered nodes than in the previous test with the SLA efficiency equal to 95%. This is caused by
the lower margin of error allowed by the SLA, which forces QaAS to use more clustered nodes in
order to cope with the stringent SLA requirements that are to be met. In contrast, with the
smoother client distribution, the considerations discussed for the previous test remain valid,
even when a stringent SLA efficiency is to be fulfilled.
    However, it is worth observing that, in the case of stringent SLA efficiency, the QoS-aware
clustering alternates between one and two nodes in the cluster when the imposed load is about
16-24 clients. This is caused by the values of the earlier described warning threshold, efficiency
and node efficiency indexes. Specifically, with a stringent SLA efficiency the warning threshold,
described in Section 3, is smaller; the efficiency index therefore falls below the threshold, causing
the CS reconfiguration that adds new clustered nodes to be invoked.
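    The reconfiguration behavior just described can be illustrated with a hypothetical decision rule: nodes are added when the observed violation rate crosses a warning threshold derived from the SLA efficiency attribute, and released when the rate is comfortably below it. The names, the threshold derivation and the release condition below are assumptions for illustration, not the thesis code, but they capture why a more stringent efficiency (98% instead of 95%) makes QaAS allocate nodes sooner.

```java
// Hypothetical sketch of the CS reconfiguration trigger: a stricter SLA
// efficiency leaves a smaller allowed violation margin, hence a smaller
// warning threshold, hence earlier node additions.
class ReconfigurationTrigger {
    enum Action { NONE, ADD_NODE, RELEASE_NODE }

    static Action decide(double violationRatePct, double slaEfficiencyPct,
                         int nodes, int minNodes, int maxNodes) {
        double allowed = 100.0 - slaEfficiencyPct;   // e.g. 5% or 2%
        double warning = 0.5 * allowed;              // assumed warning margin
        if (violationRatePct > warning && nodes < maxNodes) return Action.ADD_NODE;
        if (violationRatePct < 0.1 * allowed && nodes > minNodes) return Action.RELEASE_NODE;
        return Action.NONE;
    }
}
```

Under this rule a 2% violation rate triggers a node addition at 98% efficiency (warning margin 1%) but not at 95% (warning margin 2.5%), mirroring the observed alternation between one and two nodes.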
    Figures 24, 25, 26, and 27 compare the number of nodes used when an SLA efficiency of 95%
is specified with the number used when the SLA efficiency is 98%.
    As expected, these Figures show that, when the SLA requirements are more stringent (98%),
the number of clustered nodes used is higher, in order to cope with the smaller percentage of
allowed SLA violations.
    In summary, the results of the above experiments, shown in Figures 28, 29, 30 and 31,
indicate that QaAS uses on average about 2 nodes, saving approximately 33% to 50% of the
available resources when compared to a standard application server configuration in which an
over-provisioning policy is to be used in order to honor hosting SLAs. Hence, the new prototype
allows one to save more nodes than the previously described approach.
    To conclude, as the principal objective of this work was to honor hosting SLAs, Figures 32 and
33 show the percentage of SLA violations in a 65-minute snapshot test. In order to obtain these
results, specific SLA values have been fixed during the experiment; specifically, a maximum
response time of 1s has been specified for the “catalog” operation of the bookshop application.





Figure 20. Resource Utilization: SLA efficiency   Figure 21. Resource Utilization: SLA efficiency
98%, 35 minutes test                             98%, 13 minutes test




Figure 22. Resource Utilization: SLA efficiency   Figure 23. Resource Utilization: SLA efficiency
98%, 39 minutes test                             98% , 65 minutes test








Figure 24. Resource Utilization: SLA efficiency       Figure 25. Resource Utilization: SLA efficiency
98% vs 95%, 35 minutes test                          98% vs 95%, 13 minutes test




Figure 26. Resource Utilization: SLA efficiency      Figure 27. Resource Utilization: SLA efficiency
98% vs 95%, 39 minutes test                         98% vs 95%, 65 minutes test



    Thus, Figure 32 illustrates the number of violations of the above SLA requirement when the
SLA efficiency is set to 95%; Figure 33 depicts the violations obtained when the imposed SLA
efficiency is equal to 98%. Hence, in Figures 32 and 33, the number of SLA violations must not
exceed the 5% and 2% limits allowed, respectively. In fact, in either case it can be noticed that,
even though QaAS violates the SLA maximum response time for the “catalog” operation, the
actual total percentage of SLA violations computed within the monitoring frequency remains
significantly below the maximum percentage allowed by the hosting SLA, preventing breaches
of the SLA efficiency.


7     Wide Area Network (WAN) case study
A further experimental evaluation of the new prototype of the middleware architecture has also
been carried out in a Wide Area Network environment. The purpose here was to verify whether
or not the approach provided by this thesis can be applied in a geographical context as well.
    In such an environment, one would expect that a flat approach, like that discussed so far,
exhibits performance degradation, and therefore SLA efficiency violations, owing to the higher
network delays introduced between different geographical regions.
    Owing to the above observation, further experiments have been performed in order to verify
whether the presented approach is indeed effective in a WAN environment. Specifically, for this
purpose, two different clusters of machines have been constructed; within these clusters a
collection of experiments have been carried out, and the obtained results are discussed in the
following separate Subsections.







Figure 28. QaAS vs standard JBoss: Used              Figure 29. QaAS vs standard JBoss: Used
nodes, SLA efficiency 95%, 35 minutes test            nodes, SLA efficiency 98%, 35 minutes test




Figure 30. QaAS vs standard JBoss: Used              Figure 31. QaAS vs standard JBoss: Used
nodes, SLA efficiency 95%, 65 minutes test            nodes SLA efficiency 98%, 65 minutes test



7.1 Bologna-Cesena cluster
The first cluster consisted of two LAN clusters, namely the LAN cluster of the previous
experiments, located in Bologna (BO), and another LAN cluster located in a different Italian city,
Cesena (FC), at the Computer Science Laboratory of the University of Bologna. This Laboratory
is four network hops away from the Department in Bologna. The connection between the
Department in Bologna and this Laboratory has a limited bandwidth of 2 Mbps, and is
characterized by a rather high packet loss rate (between 2% and 9%).
    The cluster in Cesena consisted in turn of two non-dedicated Linux machines interconnected
by a non-dedicated 100Mbps Ethernet LAN. One machine of this cluster is based on a 400Mhz In-
tel Pentium II processor, equipped with a 512MB RAM. The other machine embodies a 699Mhz
Intel Pentium III processor, and is equipped with a 512MB RAM. The database used in these
machines was the Hypersonic database enabled in the discussed preliminary experimental eval-
uation.
    In order to enable communication among the nodes in these two clusters, the aforementioned reliable group communication protocol provided by JBoss (i.e., JGroups) has been properly configured to operate in such a geographical context. To this end, the default UDP with IP multicast configuration, enabled in the experiments described so far, has been replaced by a configuration that uses the TCP transport protocol. This allows one to configure clusters of JBoss application server instances in a WAN environment, with different networks that communicate with one another.
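To illustrate, a JGroups protocol stack of this kind replaces the UDP/IP-multicast transport with TCP and a static discovery protocol (TCPPING) listing the initial hosts of both LAN clusters. The fragment below is only a sketch of such a configuration: the host names, ports and timeout values are illustrative, not those used in the experiments.

```xml
<config>
    <!-- TCP transport replaces the default UDP/IP-multicast transport -->
    <TCP start_port="7800"/>
    <!-- Static discovery: initial_hosts lists nodes of both LAN clusters
         (bo-host1 and fc-host1 are placeholder names) -->
    <TCPPING initial_hosts="bo-host1[7800],fc-host1[7800]"
             port_range="3" timeout="3500" num_initial_members="3"/>
    <!-- Failure detection and suspicion verification -->
    <FD timeout="2500" max_tries="5"/>
    <VERIFY_SUSPECT timeout="1500"/>
    <!-- Reliable, ordered delivery and group membership -->
    <pbcast.NAKACK gc_lag="100" retransmit_timeout="600,1200,2400,4800"/>
    <pbcast.STABLE desired_avg_gossip="20000"/>
    <pbcast.GMS print_local_addr="true" join_timeout="5000"/>
</config>
```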
    Figure 34 below depicts the scenario used in this case study. As shown in that Figure,
the client program and the QaAS Leader instance that deploys the LBS are hosted in the cluster
of Bologna. The other remaining machines that serve the client requests are located in both the
LAN clusters.





Figure 32. SLA violations: SLA efficiency 95%,         Figure 33. SLA violations: SLA efficiency 98%,
65 minutes test                                       65 minutes test




                              Figure 34. Bologna-Cesena WAN cluster



    In this case, only the experiments concerning the resource utilization and the SLA efficiency violations have been performed.
    In particular, as to the resource utilization, the results obtained are depicted in Figures 35 and 36. These Figures show the distribution of the resource utilization exhibited by QaAS when the SLA efficiency is set to 95%, in two snapshot tests, namely the 39 and 65 minutes tests, respectively. As illustrated in these Figures, QaAS uses more nodes (approximately 8% more) than in the related LAN experiments shown in Figures 18 and 19. In fact, in this case, the delays of the network link between the two clusters, experienced during working hours, and the usage of the non-dedicated machines located in Cesena influenced the results, as confirmed by the SLA violations tests (see Figures 37 and 38). As illustrated in those Figures, the SLA violations are approximately double those observed in the experiments with the LAN cluster in Bologna (note that, yet again, a value of 1s for the maximum response time of the “catalog” operation has been specified in the SLA, in order to detect the SLA violations).
    However, it is worth noticing that although the SLA violations have increased, compared to the LAN case study of the previous experimental evaluation, these violations are still under the maximum allowed by the SLA (i.e., under the 5% imposed limit).
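The SLA efficiency check just described amounts to keeping the fraction of response-time violations below 1 − efficiency (e.g., below 5% for a 95% SLA efficiency). The following Java sketch illustrates this bookkeeping; the class and method names are illustrative and do not belong to the actual QaAS prototype.

```java
// Illustrative sketch of the SLA efficiency bookkeeping described above;
// names are invented for the example, not taken from the QaAS prototype.
class SlaEfficiencyMonitor {
    private final double efficiency;      // e.g., 0.95 -> at most 5% violations
    private final long maxResponseTimeMs; // e.g., 1000 ms for the "catalog" operation
    private long observed;                // total observed invocations
    private long violations;              // invocations exceeding the threshold

    SlaEfficiencyMonitor(double efficiency, long maxResponseTimeMs) {
        this.efficiency = efficiency;
        this.maxResponseTimeMs = maxResponseTimeMs;
    }

    /** Records one invocation; returns true if it violated the response-time clause. */
    boolean record(long responseTimeMs) {
        observed++;
        if (responseTimeMs > maxResponseTimeMs) {
            violations++;
            return true;
        }
        return false;
    }

    /** True while the fraction of violations stays within the SLA-allowed limit. */
    boolean slaHonored() {
        return observed == 0 || (double) violations / observed <= 1.0 - efficiency;
    }
}
```

With efficiency set to 0.95, the SLA is honored as long as at most 5% of the observed invocations exceed the 1s response-time bound.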

7.2 Newcastle-Bologna-Cesena cluster
The scenario used in this case study is illustrated in Figure 39. Within the WAN cluster shown in that Figure (dashed circle), the same experiments as those described in the previous Subsection have been carried out.
    The new WAN cluster consisted of four Linux machines serving the client requests. One of





Figure 35. Resource Utilization: BO-FC, SLA        Figure 36. Resource Utilization: BO-FC, SLA
efficiency 95%, 39 minutes snapshot test            efficiency 95%, 65 minutes snapshot test



these machines was located in Bologna at the Department of Computer Science of the University of Bologna (BO) and belonged to the aforementioned LAN cluster; another machine was located in England at the Department of Computing Science of the University of Newcastle upon Tyne (NCL); the remaining two machines were located in Cesena (FC) at the Laboratory of Computer Science of the University of Bologna. Hence, in this case, another geographical network link has been added between the machines that serve the client requests; this link has been connected with the Bologna LAN cluster, as shown in Figure 39, and it was 19 hops away from Bologna. As before, the QaAS Leader that deployed the LBS has been located in Bologna, as has the client program that generated the load in the cluster.
    The machine in Newcastle was based on a 1Ghz Intel Pentium III processor, equipped with 512MB of RAM and running the Fedora version 3.0 Linux operating system. The machines in Cesena were the same as in the experiments described earlier.
    The results obtained concerning the resource utilization are depicted in Figures 40 and 41. As can be noticed from these Figures, QaAS uses more clustered nodes compared to both the experiments carried out by using the LAN cluster in Bologna only, and the experiments described in the previous Subsection.
    Hence, in this case, in order to cope with a higher number of SLA violations (due to the higher WAN network delays), QaAS uses more clustered nodes, saving only approximately 17% of the nodes.
    In addition, as depicted in Figures 42 and 43, the SLA violations are higher than those of the previous experiments, and this leads to an SLA breach, since QaAS cannot fulfill the SLA efficiency requirement. In fact, as can be seen in Figures 42 and 43, the number of violations exhibited by QaAS is kept under the maximum allowed in the first part of both tests (i.e., the aforementioned 39 and 65 minutes tests); however, when the number of imposed client requests increases, the number of violations increases too, exceeding the maximum allowed by the SLA.
    To conclude, even in a WAN environment QaAS can be beneficial in terms of used clustered nodes, compared to an over-provisioning based approach. In fact, the results obtained demonstrate that, in both the enabled WAN cluster configurations, QaAS allows one to save resources in the cluster.



Figure 37. SLA violations: BO-FC, SLA effi-          Figure 38. SLA violations: BO-FC, SLA effi-
ciency 95%, 39 minutes snapshot test                ciency 95%, 65 minutes snapshot test




                        Figure 39. Newcastle-Bologna-Cesena WAN cluster


    However, although the Bologna-Cesena cluster exhibits a degradation of the performance, compared to that obtained by using the Bologna LAN cluster only, this degradation is acceptable: QaAS does not violate the SLA efficiency requirement, even if the percentage of SLA response time violations increases, compared to the related experiments carried out within the Bologna LAN cluster.
    Nevertheless, when the geographical distances between the nodes in the cluster increase, as in the discussed Newcastle-Bologna-Cesena case, the performance degradation is significant; in fact, QaAS violates the SLA efficiency requirement when the number of imposed client requests in the cluster increases. Hence, future work is needed in this latter case (see Section 1) so as to adopt, in a geographical context, a possible hierarchical and recursive approach capable of exploiting the benefits obtained by using the QaAS’s dynamic clustering mechanisms at the LAN cluster level.








Figure 40. Resource Utilization: NCL-BO-FC,   Figure 41. Resource Utilization: NCL-BO-FC,
SLA efficiency 95%, 39 minutes snapshot test   SLA efficiency 95%, 65 minutes snapshot test




Figure 42. SLA violations: NCL-BO-FC, SLA     Figure 43. SLA violations: NCL-BO-FC, SLA
efficiency 95%, 39 minutes snapshot test       efficiency 95%, 65 minutes snapshot test




Chapter 5

Related Work

This thesis has discussed a number of issues that arise in the field of application server technologies. Specifically, the principal challenges investigated by the designed and implemented prototype architecture can be summarized as follows. The prototype is capable of:
    1. providing distributed enterprise applications with guarantees concerning the end-to-end QoS requirements those applications may require. In the context of this work, as these application QoS requirements are specified within SLAs, this characteristic entails monitoring and enforcing SLAs in order to fulfill their contractual clauses;
    2. providing distributed enterprise applications with a dynamic clustering service that can optimize the usage of the available resources; this entails using the minimum number of those resources that allows the prototype itself to honor SLAs;
    3. enabling an adaptive load balancing service to be used within the aforementioned dynamic clustering service;
    4. extending open source application server technologies in order to support the above three points.
    In view of these observations, for each point listed earlier, this Chapter goes over a number of relevant research efforts, which have notably influenced the approach to the design of the described QoS-aware application server, comparing and contrasting them with the solution discussed in this thesis. Thus, this Chapter firstly provides a survey of end-to-end QoS architectures; this study has contributed to deriving some design principles for the development of the QoS-aware middleware services. Secondly, it describes some research proposals about resource clustering, and about SLA enforcement and monitoring in the context of both Web Services and application servers. Finally, this Chapter provides some concluding remarks concerning the assessment carried out.


1     End-to-End QoS Architectures
A large body of research has investigated the design and development of so-called end-to-end QoS architectures, aimed at providing platforms that effectively support distributed applications characterized by end-to-end QoS requirements.
    Relevant examples of these architectures are described in the following.

RT CORBA [FDG+ 97]: It is a standard architecture for real-time management of distributed
objects; this architecture has been developed in order to support fixed-priority CORBA applica-
tions. RT CORBA prescribes a number of functionalities that must be integrated and managed
by ORB end-systems in order to ensure the predictable behavior of the activities carried out by



CORBA clients and servers [SK00]. The above mentioned functionalities include: (i) Communication infrastructure resource management: an RT CORBA end-system must exploit policies and mechanisms in the underlying communication infrastructure that support resource guarantees; (ii) OS scheduling mechanisms: the RT CORBA specification is targeted at operating systems that allow applications to specify scheduling priorities and policies; (iii) Real-Time ORB end-systems: these must provide applications with standard interfaces that allow them to specify their resource requirements; (iv) Real-Time services and applications: RT CORBA ORBs must guarantee efficient, scalable and predictable behavior of higher-level services and application components.
     In order to manage these functionalities, RT CORBA defines standard interfaces and QoS
policies that improve an application’s ability to configure and control (i) the processor resources
through pools of threads, priority mechanisms, intra-process mutual exclusion mechanisms and
a global scheduling service, (ii) the communication resources via protocol properties and explicit
bindings of clients to servers and (iii) the memory resources via request queues and bounded
thread pools.
    In essence, RT CORBA specifies a set of relevant features that allow the application to control thread priorities and scheduling; however, unlike the approach proposed in this thesis, it does not provide high-level primitives for constructing adaptive QoS management mechanisms.

TAO [SLM97]: It is an open-source high performance RT CORBA compliant ORB. TAO imple-
ments a rich set of middleware mechanisms that can support applications with both deterministic
and statistical QoS requirements, and applications with best-effort requirements.
   This ORB consists of the following four major components that provide applications with
end-to-end QoS guarantees:

   • ORB: this component supports real-time behavior by optimizing (i) code generation, and (ii) the use of system components such as the memory management system and the network protocols.
   • Scheduling Service: this service provides real time scheduling of client requests, and sup-
     ports both static scheduling, based on off-line schedulability analysis, and dynamic schedul-
     ing, via admission control policies.
   • Event Service: this service implements real-time scheduling of CORBA events, and provides
     filtering and correlation mechanisms that allow consumers to select the events they receive.
   • Real-Time I/O (RIO) subsystem: this subsystem runs in the OS kernel and is designed to take
     advantage of ATM network features.
    It is worth observing that TAO is primarily targeted at static hard real-time applications. Thus, it assumes that, once the ORB is initially configured, its strategies will remain in place until it completes its execution. Therefore, there is very little support for on-the-fly reconfiguration.
    In contrast, the approach presented in this thesis proposes CS reconfiguration strategies to be applied at run time, whenever the operational conditions of the clustered resources may lead to violations of the hosting SLA.
    Moreover, programming TAO’s low-level real-time mechanisms in order to meet specific end-to-end QoS requirements can be complex and error-prone, particularly for large-scale, QoS-enabled distributed applications. Therefore, higher-level middleware capabilities for end-to-end QoS specification and control, as well as on-the-fly reconfiguration functionalities, are required.
    In order to meet these requirements, other architectural frameworks such as dynamicTAO and Quality Objects (QuO) have been developed. These frameworks are introduced below.

DynamicTAO [KRL+ 00]: It is a reflective CORBA ORB, built as an extension of TAO to support
on-the-fly reconfiguration while assuring that the ORB engine remains consistent [RKC01]. Thus,
it allows inspection and reconfiguration of its internal engine by exporting an interface for (i)





                                Figure 1. Dynamic TAO Components



transferring components across the distributed system, (ii) loading and unloading modules into
the ORB runtime, and (iii) inspecting and modifying the ORB configuration state.
    Reification in dynamicTAO is achieved by means of a collection of entities, termed compo-
nent configurators. A component configurator holds the dependencies between a certain compo-
nent and other system components. The dynamicTAO framework consists of two Configurators,
namely the Domain Configurator and the TAO Configurator. The former is responsible for main-
taining references to instances of the ORB and to servants running in each process. The latter,
instead, is included in each ORB instance and contains hooks to which implementations of dy-
namicTAO strategies are attached.
    The principal components of the dynamicTAO framework are the following (Figure 1): (i) A
Persistent Repository: It stores category implementations in the local file system. Furthermore, it
offers methods for manipulating categories and the implementations of each category; (ii) Net-
work Broker: It receives reconfiguration requests from the network and forwards them to the
Dynamic Service Configuration. The latter includes the above mentioned DomainConfigurator and
supplies common operations for dynamic configuration of components at runtime. Moreover, it
delegates some of its functions to specific component configurators (e.g., a TAOConfigurator or a
certain ServantConfigurator).
    In conclusion, dynamicTAO provides a basis for supporting safe dynamic reconfiguration of scalable, high-performance distributed systems. The approach’s main focus is on applying adaptation mechanisms at the application level (e.g., changing a video application’s frame rate) rather than at the middleware level, as in the case of the adaptation used in this thesis. In addition, dynamicTAO is no longer developed and maintained; it has been superseded by other systems such as the Universally Interoperable Core (UIC)-CORBA [ZBS97], a reconfigurable component-based reflective middleware targeting environments with limited resources (e.g., handheld devices).

QuO [LSZB98]: It is a framework designed to support the development of distributed applica-






                                     Figure 2. QuO Architecture



tions characterized by QoS requirements [VZL+ 98]. It provides the application designer with the
ability to (i) specify, monitor, and control QoS aspects of a distributed object application, and (ii)
define and implement application adaptation mechanisms that must be enabled in response to
changing system conditions (e.g., an intrusion aware application detecting an attack to a server
object may decide to break the connection to that server, locate a server that has not been attacked,
and reconfigure itself to use that latter server).
    In the QuO framework, a method call made by a client on a remote object, through its func-
tional interface, is a superset of a traditional CORBA call. This superset includes the additional
components itemized below and depicted in Figure 2:
   • Contracts between clients and objects: the Contract component specifies the level of service
     required by a client, the level of service a remote object is expected to provide its clients
     with, the operating regions indicating the possible measured QoS, and actions to take when
     the QoS level changes.
   • Delegates of remote objects: a Delegate component is a wrapper of a remote object, and pro-
     vides a functional interface identical to that of the remote object it wraps.

   • System condition objects: these components implement interfaces to resources, mechanisms, objects and ORBs in the system that need to be measured and controlled by QuO contracts.
    Moreover, the QuO toolkit includes: (i) a Quality Description Language (QDL), used for de-
scribing contracts, system condition objects, and the adaptive behavior of the objects and the
delegates; (ii) the QuO kernel, which coordinates the evaluation of contracts and the monitoring
of the system condition objects; (iii) Code Generators, which weave together the QDL descrip-
tions, the QuO kernel code, and the client code to produce a single program.
    When the client invokes a method on the remote object, it is actually invoking that method on the local delegate, which triggers contract evaluation. The contract component gets the actual values of the system conditions, in order to determine the current operating region. The delegate then chooses a behavior based on the current regions; for instance, the delegate might (i) choose between alternative methods, (ii) block when QoS has degraded, or (iii) pass the method invocation through to the remote object. The remote object is then invoked, performs its method and returns a value. At this point, the delegate performs similar processing upon the method return, i.e., it






                                    Figure 3. AQuA Architecture



evaluates the contract to obtain the QoS regions and selects a behavior, then passing the return value back to the client [VZL+ 98].
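The call flow just described can be illustrated with a small Java sketch of a contract-evaluating delegate. The interfaces and region names below are invented for the example; they are not part of the actual QuO API.

```java
// Illustrative sketch of the QuO-style delegate/contract interplay described
// above; all names are hypothetical, not taken from the QuO toolkit.
import java.util.function.Supplier;

enum Region { NORMAL, DEGRADED }

interface Contract {
    /** Reads the system condition objects and returns the current operating region. */
    Region evaluate();
}

class Delegate<T> {
    private final Supplier<T> remoteCall;   // invocation on the wrapped remote object
    private final Supplier<T> alternative;  // cheaper behavior for degraded QoS
    private final Contract contract;

    Delegate(Supplier<T> remoteCall, Supplier<T> alternative, Contract contract) {
        this.remoteCall = remoteCall;
        this.alternative = alternative;
        this.contract = contract;
    }

    /** A client call triggers contract evaluation before and after the invocation. */
    T invoke() {
        Region region = contract.evaluate();
        T result = (region == Region.NORMAL) ? remoteCall.get() : alternative.get();
        contract.evaluate();   // re-evaluated upon the method return, as in QuO
        return result;
    }
}
```

The delegate exposes the same functional interface as the wrapped object, so the client remains unaware of the behavior selection performed on its behalf.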
    In summary, QuO provides a middleware platform capable of supporting end-to-end QoS. Its important features are the QuO Contracts, and the QDL implemented to describe these Contracts. However, the QuO monitoring and adaptation mechanisms do not appear to be sufficiently flexible, as they can be activated only when an invocation on a remote method is made by a client, or when the result of that invocation is returned to that client.
    To conclude, it is worth noticing that the QuO project shares some similarities with the approach of this thesis. For example, the QuO contracts are similar to the concept of the resource plan maintained by the CS. However, whereas the QuO contracts specify which corrective actions are to be taken in case of QoS changes, the resource plan of this thesis, maintained by the CS, does not include any information about the reconfiguration strategies that are applied at run time in case the hosting SLA is close to being violated.
    However, the main difference between QaAS and QuO concerns the monitoring and adaptation activities; specifically, in the designed QaAS these activities are not enabled only on single client-server interactions, as in the case of QuO. In fact, in this thesis, the monitoring activity is continuously performed at run time in order to detect changes in the hosting environment that may lead to violations of the SLAs; in case an SLA is close to being breached, the adaptation (i.e., a reconfiguration of the hosting environment) is triggered by the Monitoring Service and carried out by the Configuration Service.
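The continuous monitor/adapt cycle just described can be sketched as follows. The method names and the margin-based trigger condition are illustrative; they do not reproduce the actual QaAS interfaces.

```java
// Sketch of the periodic monitor/adapt cycle described above; names and the
// margin-based trigger are illustrative, not the actual QaAS implementation.
interface ConfigurationService {
    void reconfigure();   // e.g., acquire or release clustered nodes
}

class MonitoringService {
    private final ConfigurationService cs;
    private final double efficiency;   // SLA efficiency, e.g., 0.95
    private final double margin;       // trigger before the limit is actually hit

    MonitoringService(ConfigurationService cs, double efficiency, double margin) {
        this.cs = cs;
        this.efficiency = efficiency;
        this.margin = margin;
    }

    /** Invoked periodically at run time, not on each client-server interaction. */
    void check(long violations, long observed) {
        if (observed == 0) return;
        double rate = (double) violations / observed;
        // Adaptation is triggered when the SLA is *close* to being breached,
        // i.e., before the violation rate actually reaches 1 - efficiency.
        if (rate >= (1.0 - efficiency) - margin) {
            cs.reconfigure();
        }
    }
}
```

With an SLA efficiency of 95% and a 1% margin, reconfiguration is requested as soon as the observed violation rate reaches 4%, leaving the Configuration Service time to act before the 5% limit is exceeded.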

AQuA [CRS+ 98]: It is a framework for building dependable, distributed, object-oriented systems, which supports adaptation to both faults and changes in an application’s dependability requirements.
     Figure 3 illustrates the components of the AQuA architecture. In AQuA, fault tolerance is
achieved through the replication of objects. All replicas of an object form a group. Messages
communicated among different objects are sent through groups. To provide fault tolerance at the
most basic level, the AQuA system uses the Ensemble group communication system to ensure
reliable communication between groups of processes, to ensure that totally ordered messages
are delivered to the members in a group, to maintain group membership based on the virtual
synchrony model, and to detect and exclude from the group members that fail by crashing.
    In order to provide a way for an application to specify its dependability requirements, the aforementioned QuO middleware platform can be used. Specifically, in the AQuA approach, QuO is used to transmit application dependability requirements to Proteus, which attempts to configure the system to achieve the desired level of dependability. QuO also provides an adaptation mechanism that is used when Proteus is unable to provide the specified level of dependability.
     In AQuA, Proteus provides adaptive fault tolerance. It consists of a replicated dependability



manager, a set of object factories, and gateway handlers. The dependability manager determines
a system configuration based on reports of faults and desires of application objects. An object
factory that resides on each host is used to create and kill objects, as well as to provide load and
other information about the host to the dependability manager.
    Communication between all architecture components (i.e., applications, the QuO runtime,
object factories, and dependability managers) is done using gateways, which translate CORBA
object invocations into messages that are transmitted via Maestro/Ensemble. Furthermore, the
handlers in a gateway implement multiple replication schemes and communication mechanisms.
The handlers are also used to detect application value faults, and to report value faults and group
membership changes to the dependability managers.
    Hence, AQuA is a CORBA-based framework that adds replication mechanisms to standard ORBs in order to cope with object failures and site crashes. In particular, the replication is enhanced by adding specific QoS attributes to the object replication. The QoS requirements are managed by using the aforementioned QuO framework; therefore, the above considerations about QuO also hold for the AQuA architecture.

Agilos (Agile QoS): It is a middleware architecture designed to provide services that support so-called application-aware QoS adaptation mechanisms, for use by distributed applications structured as a set of CORBA objects. The Agilos adaptation mechanisms are fully configurable to the applications’ needs, and aware of the application-specific semantics. These mechanisms monitor the system resources, and maintain system-wide adaptation properties of the applications. As they are implemented at the middleware level, they do not require tight integration with, or modifications to, the services implemented in the OS kernel and network protocol stack.
    Agilos is designed as a three-tier architecture, as illustrated in Figure 4. The first tier embodies two components, namely the adaptors and the observers, which maintain tight relationships with individual resources, and support low-level resource adaptation by reacting to changes in the availability of those resources. The second tier consists of application-specific configurators, responsible for deciding when and which application functions are to be invoked in a client-server application (based upon on-the-fly user preferences and application-specific rules). Moreover, this tier includes so-called QualProbes components that provide QoS probing and profiling services, so that application-specific adaptation rules can be either derived from measurements, or specified explicitly by the user. The third tier, on both clients and servers, consists of a centralized gateway and multiple negotiators that control the adaptation behavior of an application constructed out of multiple clients and servers, so that dynamic reconfigurations of the client-server mapping are possible, and can be tuned to the application.
    It is worth observing that both the middleware components and the actual QoS-aware applications may be reconfigured to adapt to the changing environment. Thus, as pointed out in [Bao00], two distinct approaches exist to the design of application-aware QoS adaptation mechanisms. One approach, adopted by the QuO middleware described earlier, is to dynamically reconfigure the middleware itself so that it can provide a stable and predictable operating environment to the application, transparently to the application itself. This approach is attractive as it does not require any modifications to the application; hence, any legacy application can be deployed with little effort and with a certain level of QoS assurance. However, since it can only provide a generic solution to all applications, highly application-specific requirements cannot be addressed. In contrast, the middleware can be active, and exert strict control over the adaptation behavior of the QoS-aware applications, so that these applications adapt and reconfigure themselves under such control. This approach has the advantage of knowing exactly what the application-specific adaptation priorities and requirements are, so that appropriate adaptation choices can be made to address them. The Agilos middleware architecture implements this latter approach.
    In contrast, this thesis proposes that the adaptation be carried out at the middleware level rather than at the application level. This allows the distributed applications to remain unchanged, and the application developers to concentrate on the functional application requirements only.




                                 Figure 4. The Agilos Architecture


The Real-Time Specification for Java (RTSJ) [BG00]: The RTSJ extends the Java programming lan-
guage and the Java Virtual Machine specifications in order to provide an Application Program-
ming Interface that enables the creation, verification, analysis, execution and management of
real-time Java threads (i.e., Java threads whose correctness criteria include timeliness) [BFH+ 00].
    The Real-Time for Java Experts Group identified the following seven areas of interest that re-
quired Java language extensions; namely, thread scheduling, memory management, thread syn-
chronization, asynchronous event handling, asynchronous control transfer, asynchronous thread
termination, and physical memory management.
    Thus, in essence, RTSJ consists of a set of primitives that extend the Java programming language so as to allow the programmer of real-time Java applications to deal with the seven areas of interest mentioned above.
    To conclude, RTSJ can be useful in case stringent QoS requirements are to be met, as it provides guarantees concerning memory management, thread synchronization and scheduling, and so on. However, it is worth noticing that RTSJ is an extension of a programming language rather than a collection of middleware services for QoS management.

Real-Time Adaptive Resource Management (RTARM) [CJCP00]: It is a system that implements a
general middleware architecture for adaptive management for integrated services, aimed to real-
time mission-critical distributed applications. The RTARM uses an innovative approach, which
unifies heterogeneous resources and their management functions into a hierarchical uniform ab-
stract service model. It provides the following basic features: (i) scalable end-to-end criticality-
based QoS contract negotiation that permits distributed applications to share common resources
while maximizing their utilization and execution quality; (ii) end-to-end QoS adaptation that
dynamically adjusts application resource utilization, according to their availability while opti-
mizing application QoS; (iii) integrated services for CPU and network resources with end-to-end
QoS guarantees; (iv) real-time application QoS monitoring for integrated services and, finally, (v)
plug-and-play architecture components for easy extensibility with new services.
    A prototype of the system has been implemented as a middleware layer above the operating
system and network resources. The prototype is composed of RTARM servers, developed in


UBLCS-2006-06                                                                                     79




                                 Figure 5. ControlWare Architecture



C++, running as user-level processes on three Windows NT workstations, and exporting a CORBA
interface to clients and applications. In the RTARM system, clients and applications are two
different components: A client is any entity that issues a request for services and negotiates a QoS
contract; an application consumes services reserved by a client on its behalf and continuously
cooperates with the resource management system to achieve the best available QoS.
    In conclusion, the RTARM system appears consistent with the objectives of this thesis, even
though it does not deal with the concepts of Service Level Agreements and application servers
as this work does. Moreover, RTARM is implemented in C++ rather than Java, and it manipulates
such resources as CPU and networks.

ControlWare [ZLAS02]: It is a middleware platform developed to provide Internet services with
performance guarantees. This platform applies methods derived from control theory for
system configuration and control purposes. Specifically, using ControlWare, the controlled sys-
tem can be an Internet service, and the control goal is to provide that service with QoS guarantees.
    ControlWare provides the Internet service developer with software tools and library rou-
tines for converting QoS specifications (i.e., the required QoS guarantees) into so-called feedback
control loops; these loops are stored in service configuration files, and implement the service
performance control mechanisms. In addition, ControlWare implements a convenient interface
between the service software and the feedback control loops mentioned above. This interface,
termed SoftBus, is responsible for monitoring the controlled system and for adapting it to possible
variations of the execution environment. The SoftBus, as shown in Figure 5, is a distributed
protocol running across multiple machines and address spaces, forming a virtual application
backbone into which applications, performance sensors, and actuators can plug in. Sensors and
actuators measure the system performance and implement the required adaptation strategies.
    In essence, the feedback control theory used by ControlWare has influenced the approach to
monitoring presented in this thesis.
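To make the notion of a feedback control loop concrete, the following minimal sketch shows a proportional controller that compares a measured response time against a target and adjusts an admission-rate actuator. This is a hypothetical illustration of the general technique, not ControlWare code; all class, field, and method names are invented.

```java
/**
 * Minimal proportional feedback controller, in the spirit of the
 * feedback control loops used by ControlWare. Illustrative only:
 * the names and the single-actuator model are assumptions.
 */
public class FeedbackLoop {
    private final double target;  // desired response time (ms)
    private final double gain;    // proportional gain
    private double admissionRate; // actuator: fraction of requests admitted

    public FeedbackLoop(double target, double gain, double initialRate) {
        this.target = target;
        this.gain = gain;
        this.admissionRate = initialRate;
    }

    /** One control iteration: sensor reading in, actuator setting out. */
    public double update(double measuredResponseTime) {
        double error = measuredResponseTime - target; // positive: too slow
        admissionRate -= gain * error;                // admit less when overloaded
        admissionRate = Math.max(0.0, Math.min(1.0, admissionRate));
        return admissionRate;
    }
}
```

In a real deployment the sensor reading would come from the monitoring infrastructure (the SoftBus, in ControlWare's case) and the actuator setting would be pushed back to the service.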


2     Component-based architectures
In addition to these architectures, it is worth mentioning that component-based technologies,
such as Java 2 Enterprise Edition (J2EE) [Sha03], Microsoft .NET [Pro02] or CORBA Component
Model (CCM) [WSO00], are currently assuming a preeminent role in developing multi-tier dis-
tributed applications constructed out of software components.
    These technologies support the specification of the functional component interfaces; how-
ever, they only partially support the definition of the non-functional properties (i.e., the QoS) of the



component execution.
   Thus, some works have recently begun to emerge that integrate those non-functional
properties into containers, i.e., the run-time environments, promoted by these technologies, that
host application components. In particular, Conan et al. [CPFD01] use containers together with
aspect-oriented software development techniques to plug in different non-functional behaviors.
DeMiguel [DeM01] extends specific containers, i.e., Enterprise JavaBeans (EJB) containers, to
support a new interface, named QoSContext, that allows the exchange of QoS-related information
with component instances. Therefore, to take advantage of the introduced QoS management, a
component must implement two specific interfaces, the QoSBean and the QoSNegotiation inter-
faces.
   The above two approaches differ from the proposed approach in that they provide new types of
application components, and a new run-time environment for those components, in order to meet
non-functional application requirements. In contrast, this thesis has described the design and
implementation of a middleware architecture that extends an open source application server,
augmenting that server with QoS capabilities transparently to the application components.
   IBM Lotus [lot] provides clustering features that can be applied to wide area network based
clusters for such Lotus services as Lotus Instant Messaging and Web Conferencing. How-
ever, the solution proposed by Lotus does not include the notion of Service Level Agreement as
a means for specifying QoS requirements.


3     Architectures for resource clustering
Issues of resource clustering have been widely investigated in the literature for a number of
years. Recent relevant work includes the following.
    [SYC03] presents the design and the implementation of the Neptune middleware system, which
provides clustering support and replication management for scalable network services. To this
end, Neptune targets partitionable network services, i.e., services whose persistent data can be
divided into a large number of independent partitions, and whose accesses can each be delivered
independently on a single partition, or be an aggregate of a set of sub-accesses. The Neptune
architecture encapsulates an application-level network service through a service access interface
which contains several RPC-like access methods. In addition, Neptune employs a functionally
symmetric approach in constructing the server cluster: every node in the cluster contains the
same Neptune components, and all nodes are functionally equivalent. Each cluster node can be
elected to host service components with certain data partitions, and it can also access service
components hosted at other nodes in the cluster. The nodes in the cluster are loosely connected
by a publish/subscribe communication system.
    Figure 6 illustrates the architecture of a Neptune node. As depicted in Figure 6, a request
dispatcher directs each incoming request to the hosted service instance; if the chosen node is
busy, subsequent requests are queued. The availability publisher periodically announces the locally
hosted service component, the data partitions and the access interface to the other nodes in the
cluster. The availability listener monitors those announcements from other nodes and maintains
a local service yellow page, which lists the available service components and their locations in
the cluster.
    Moreover, the Neptune architecture uses load balancing in order to balance the load among
the nodes of the cluster. To this end, each node can poll the load indexes of other nodes through
a local load polling agent; the polls are responded to by load index servers at the polled nodes. The
load balancing strategy used is a random polling-based policy.
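A random polling-based policy of this kind can be sketched as follows. This is an illustrative simplification, not Neptune code: the class and method names are invented, and the poll size is an assumed parameter.

```java
import java.util.List;
import java.util.Random;

/**
 * Sketch of a random polling-based dispatch policy: poll a small
 * random subset of nodes and pick the least loaded among them.
 * Illustrative only; names and the poll size are assumptions.
 */
public class RandomPollingBalancer {
    private final Random rng;
    private final int pollSize; // how many candidate nodes to poll per request

    public RandomPollingBalancer(int pollSize, long seed) {
        this.pollSize = pollSize;
        this.rng = new Random(seed);
    }

    /** Poll pollSize random nodes and return the index of the least loaded one. */
    public int selectNode(List<Double> loadIndexes) {
        int best = rng.nextInt(loadIndexes.size());
        for (int i = 1; i < pollSize; i++) {
            int candidate = rng.nextInt(loadIndexes.size());
            if (loadIndexes.get(candidate) < loadIndexes.get(best)) {
                best = candidate;
            }
        }
        return best;
    }
}
```

Polling only a few random nodes, rather than all of them, keeps the per-request overhead constant as the cluster grows, which is the usual motivation for this class of policies.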
    From the performance evaluation results obtained for Neptune it emerges that clustering
support and replication management are carried out only within local area server clusters; thus,
to the best of my knowledge, no wide area network experiments have been carried out on the
proposed architecture. However, [SYC03] demonstrates that Neptune's multilevel replication consis-
tency model ensures that client accesses are serviced progressively within a specific soft staleness
bound, which is sufficient for many network services.





                        Figure 6. Component Architecture of a Neptune node



    Once again, this framework is based on the CORBA middleware, rather than on specific
application server technologies as the approach of this thesis is.
    [KSC03] proposes an adaptive framework that dynamically selects replicas to service a client’s
request, based on the prediction made by probabilistic models. These models use the feedback
from online performance monitoring of the replicas to provide probabilistic guarantees for meet-
ing client’s QoS specifications. The designed framework has been implemented in the earlier de-
scribed AQuA CORBA-based middleware that supports transparent replication of objects across
LAN. In order to design the framework, the authors of [KSC03] addressed three main issues: (i)
the organization of the replicas, (ii) the development of protocols that implement different con-
sistency semantics, and (iii) the development of a mechanism to dynamically select the replicas to
service a client, based on the client's QoS requirements.
    In [USR02] techniques are described for the provision of CPU and network resources in
shared hosting platforms (i.e. platforms constructed out of clusters of servers), running poten-
tially antagonistic third-party applications. In particular, the architecture proposed in this paper
provides applications with performance guarantees by overbooking clustered resources, and by
combining this technique with commonly used resource allocation mechanisms.
    In addition, [USR02] presents techniques for empirically deriving an application’s resource
needs. To this end, the QoS requirements of an application are defined on a per-capsule basis.
A capsule is a specific application component running on a single node of the above mentioned
shared hosting platform. For each capsule, the QoS requirements specify the intrinsic rate of
resource usage, the variability in the resource usage, the time period over which the capsule
desires resource guarantees, and the level of overbooking that the application is willing to tolerate.
    Then, a platform node can accept a new application capsule so long as the resource require-
ments of existing capsules are not violated, and sufficient unused resources exist to meet the
requirements of the new capsule. However, if the node resources are overbooked, another re-
quirement is added: the overbooking tolerances of the individual capsules already placed on the
node must not be exceeded as a result of accepting the new capsule.
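The admission test just described can be sketched as follows. This is a hedged simplification: the capsule fields mirror the per-capsule requirements listed above, but the single-resource model, the units, and all names are illustrative assumptions, not the [USR02] implementation.

```java
import java.util.List;

/**
 * Sketch of a capsule admission test with overbooking, in the
 * spirit of [USR02]: accept a new capsule if the demand fits, or
 * if the resulting overbooking stays within the tolerance of every
 * capsule involved. Fields and units are illustrative.
 */
public class AdmissionControl {
    public static class Capsule {
        final double demand;               // fraction of node capacity needed
        final double overbookingTolerance; // max overbooking the capsule accepts
        public Capsule(double demand, double tolerance) {
            this.demand = demand;
            this.overbookingTolerance = tolerance;
        }
    }

    /** Returns true if the new capsule can be placed on the node. */
    public static boolean admit(List<Capsule> resident, Capsule incoming,
                                double capacity) {
        double total = incoming.demand;
        for (Capsule c : resident) total += c.demand;
        if (total <= capacity) return true; // fits without any overbooking
        double overbooking = (total - capacity) / capacity;
        if (overbooking > incoming.overbookingTolerance) return false;
        for (Capsule c : resident) {
            // a resident capsule's tolerance must not be exceeded either
            if (overbooking > c.overbookingTolerance) return false;
        }
        return true;
    }
}
```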
    This [USR02] work is similar to that discussed in [ADZ00], which describes a comprehensive
framework for resource management in web servers, with the aim of delivering predictable QoS
and differentiated services. In [ADZ00], however, application overload is detected when re-
source usage exceeds some predetermined thresholds; the middleware architecture designed in
this thesis also uses thresholds to determine when a resource is overloaded, but without consid-
ering only web servers as the environment in which to apply resource management mechanisms. In [USR02],




instead, application overload is detected by observing the tail of recent resource usage distributions.
    In [STYC02] the design and implementation of an integrated resource management frame-
work for cluster-based network services is investigated. Specifically, an adaptive multi-queue
scheduling scheme is employed inside each node of a clustered environment, in order both to
achieve efficient resource utilization under quality constraints, and to provide service differenti-
ation.
    These researches differ from the presented approach in that this thesis mainly investigates mecha-
nisms of resource management at the middleware level and, specifically, at the level of application
server technologies. Moreover, the QoS requirements considered by this work are specified within
so-called SLAs, not on a per-capsule basis as in [USR02]. In contrast, the above mentioned works
enable clustering mechanisms located at a lower level of abstraction.

3.1 Resource clustering in Web services
The research activity on QoS enforcement, monitoring, and resource clustering, in the context
of both Web services and application server technologies, has been very active in recent years.
This activity has produced a number of relevant proposals which have notably influenced the
approach to the design of a QoS-aware application server.
     In [TIRL03] the authors discuss the design of a framework that enables the development of
composite Web services. Their proposal is based on the use of forward error recovery techniques,
at the compositional level, in the specification of the behavior of composite Web services in the
presence of failures. These services can be structured as coordinated atomic actions which have
a well-defined behavior, both in the presence and absence of failures.
     In [ISP00] the principal issues involved in the design of high-volume, highly dependable Web
services are discussed. In particular, this paper describes a number of both hardware (e.g., Cisco’s
DistributedDirector), and network-based (e.g., NAT, HTTP redirect, DNS round-robin) solutions
that can be deployed in order to meet Web service scalability requirements, and summarizes a
variety of software-implemented techniques (e.g., transactions on objects and process groups)
that can be used in order to meet Web service fault-tolerance requirements.
     Pacifici et al. [PSTY02] present an architecture and prototype implementation of a perfor-
mance management system for cluster-based Web services, capable of allocating server resources
dynamically so as to maximize a given cluster utility function. This function is used to encapsu-
late business values, in the face of SLAs and fluctuating offered load. The architecture consists
of (i) a L4 switch used for load balancing purposes (a round robin policy is applied at this level),
(ii) a set of gateways responsible for controlling the amount of server resources allocated for each
traffic class of Web Services, (iii) a collection of server nodes wherein the Web services are exe-
cuted, and (iv) a management console and a global resource manager. The global resource manager
is in charge of solving an optimization problem and of tuning the parameters of the gateways'
mechanisms.
     The main difference between the above works and the application server design and imple-
mentation discussed in this thesis is that the proposed QoS-aware middleware services can support
generic, dependable application hosting in the context of component-based technologies,
rather than just Web services.
     [AS99] describes the Walrus system architecture that allows a single logical Web server to be
replicated to several clusters of identical servers where each cluster resides in a different part of
the Internet. An important aspect of Walrus is its ability to transparently direct the web browser
to the best replica without any changes to the web server, web client, and network infrastructure.
Best is a relative term, dependent on where the client is located on the network, and the load on
each replica.
     The principal architectural components of Walrus are illustrated in Figure 7. As shown in Figure
7, the architecture contains four components: the Replicator, the Reporter, the Director, and the
Controller.
     Each geographical area that is to be controlled by Walrus has one name server. In the Walrus
system, this is known as an area. Each area has one or more Web servers assigned to it, collec-





                             Figure 7. The Walrus System Architecture



tively known as a cluster. Thus, every area is assigned to a cluster, and every cluster is located
near an area. If an area is assigned to its nearby cluster, then that cluster is its home cluster. A
threshold is a value above which a server is deemed to be overloaded and not to be used for new
clients. Given a choice between no service at all and overloaded service, Walrus will elect to use
the slower, yet still functional, overloaded server.
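Walrus' preference for an overloaded replica over no replica at all can be sketched as follows. This is an illustrative sketch of the threshold rule, not the Walrus implementation; all names are invented.

```java
/**
 * Sketch of a threshold-based replica choice in the spirit of
 * Walrus: prefer the least loaded non-overloaded server, but fall
 * back to the least loaded overloaded one rather than refuse
 * service. Illustrative only.
 */
public class ReplicaSelector {
    /** Returns the index of the server to use, given per-server loads. */
    public static int select(double[] loads, double threshold) {
        int best = -1;
        for (int i = 0; i < loads.length; i++) {
            // consider only servers at or below the overload threshold
            if (loads[i] <= threshold && (best < 0 || loads[i] < loads[best])) {
                best = i;
            }
        }
        if (best >= 0) return best; // a non-overloaded replica exists
        // all servers are overloaded: slow service beats no service,
        // so pick the least loaded one anyway
        best = 0;
        for (int i = 1; i < loads.length; i++) {
            if (loads[i] < loads[best]) best = i;
        }
        return best;
    }
}
```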
    A cluster is assigned to an area if the name server for that area points to one or more of the
Web servers within that cluster. To ensure that all of the different Web servers return equivalent
responses, they must be kept in a consistent state; the Replicator component shown in Figure 7
manages this task by reconciling changes among all of the replicated servers.
    In addition, every Web server has a background process (or daemon) running on it known as
a Reporter. The Reporter monitors the status of the local Web server and reports it to the rest of
the system so decisions on how to balance the system can be made.
    To direct the actions of the system, there are one or more programs known collectively as
Controllers. These programs use the information the Reporters send to control which parts of
which cluster are in use, and which parts should (for reason of load or otherwise) be taken out of
use. The location of the Controllers does not have to be related to the different areas. To prevent
multiple incarnations of the Controller from interfering with each other, only one Controller may
be active at any one time and there may be any number of backup Controllers. As a practical
matter, optimum service occurs when there is at least one Controller in each area. This would
ensure that if a portion of the network becomes isolated, every area is still guaranteed to have a
Controller.
    The Director defines the method used by the Controller to cause the client browser to go to
the appropriate server.
    The main difference between the above work and that discussed in this thesis is that the
developed services, at the middleware level, can effectively extend application server tech-
nologies so as to provide applications with dynamic clustering techniques through which the
cluster configuration can be varied at run time, depending on the load of the cluster and the SLA
requirements that are to be met.
    In contrast, Walrus appears to be an effective architecture for managing Web servers deployed
over wide area networks, by directing the client requests toward the “best” replica of the
cluster.






4     Architectures for SLA enforcement and monitoring
Issues of SLA enforcement and monitoring are discussed in [KL03] and [DK03]. These papers
propose specific SLAs for Web services, and focus on the design and implementation of an SLA
compliance monitor, possibly owned by Trusted Third Parties (TTPs).
    Specifically [KL03] presents a framework for defining and monitoring SLAs for Web Services
that consists of a flexible and extensible language based on XML Schema; it also includes a run-
time architecture comprising several SLA monitoring services, which may be outsourced to third
parties. The framework allows service providers and their customers to define the quality of
service aspects of a service, and Web Services in particular. In order to avoid the potential ambi-
guity of higher-level SLA parameters, the framework also allows the parties to define precisely
how resource metrics are measured and how composite metrics are computed. Moreover, [KL03]
describes in detail the prototype of an SLA compliance monitoring system; this prototype has
influenced the design of the MS proposed in this work.
    [DK03] extends the work presented in [KL03], recasting the approach of [KL03] in terms of
the Common Information Model (CIM). CIM is a language-independent
programming model that uses object-oriented techniques to describe an enterprise; it is designed
to present a consistent view of logical and physical objects in a management environment [For99].
    In addition, [ZK02] presents an architecture for the coordinated enforcement of resource shar-
ing agreements (i.e., SLAs) among applications using clustered resources. The [ZK02] architec-
ture exploits a uniform application-independent representation of agreements, and combines it
with efficient time-window based coordinated queuing algorithms running on multiple nodes.
The architecture is then implemented in two different network layers in the context of web ser-
vices: an application layer HTTP redirector and a transport layer packet redirector, both located
between clients and a clustered server system accessed using HTTP.
    The approach to the design of this architecture shares a number of similarities with the ap-
proach of this thesis (e.g., the resource sharing among the applications is governed by the SLAs
these applications have with their hosting environment). However, prototype implementations
of this architecture have been developed both at the HTTP level, and at the transport level, rather
than at the middleware level, as in the approach of this work.
    Another work that has notably influenced the approach proposed in this thesis is the Oceano
[AFF+ 01] IBM project. Oceano is a prototype of a highly available, scalable, and manageable in-
frastructure for an e-business computing utility. Oceano manages the resources of the computing
utility in an automatic and dynamic manner, by reassigning resources to distributed applications
in order to meet specific SLAs. To this end, Oceano includes SLA driven monitoring, event cor-
relation, network topology discovery and automatic network reconfiguration.
    The approach used by Oceano is very close to the proposed solution; however, Oceano's
load balancing is performed at the network level rather than at the middleware level, as in our case.
    Finally, in [PSS+ 05], the authors present a platform, implemented as an extension of the Web-
Sphere application server, that can manage complex multi-tiered applications where each request
uses multiple resources distributed over multiple tiers. The objective of the above platform is to
manage the response time of different web applications. To this end, the platform allows a ser-
vice provider to specify a policy that divides all possible requests of different web applications
into service classes and assigns a performance goal to each class. The performance goal describes
the target response time and the importance level that, in turn, specifies the relative importance
of meeting, exceeding, or failing to meet the target specified within the SLAs. The performance
goals and the importance levels of each hosted web application determine how to allocate
resources to those applications.
    Figure 8 illustrates the overall system architecture proposed in [PSS+ 05]. As shown in this Fig-
ure, different replicas (i.e., application instances) of different web applications can be deployed
in several nodes (i.e., machines) of a WebSphere cluster. Within such a hosting environment,
the system principally manages the placement of the application replicas and computes signals,
termed resource requests, that guide the dynamic adjustment of that placement (i.e., how many
replicas to use for each web application, and which server machines are suitable for running





                        Figure 8. The extended WebSphere application server



those replicas).
    In order to achieve the above objectives, the system makes use of a Proxy tier that divides
and controls the flow of requests. In addition, a Resource controller monitors the request response
time and other performance metrics, and periodically recomputes the resource allocation
parameters used by the Proxy tier.
    Thus, when a request arrives at the proxy, the proxy examines the request attributes and clas-
sifies the request according to the user's SLA. Depending on the service class the request belongs to,
the proxy inserts the request into the queue associated with that service class. A scheduler
mechanism is then responsible for removing a request from a queue and forwarding it to a load
balancer mechanism, which selects a suitable node that can process it. The load balancer
mechanism uses a weighted round robin policy in order to dispatch the client requests.
    The resource controller is responsible for monitoring the response time experienced by the
requests and computing the amount of resources to allocate to each service class. In particular,
the monitored performance is mapped to a continuous utility function and the adaptation is
constantly performed in order to optimize the overall system utility.
    In essence, the work described in [PSS+ 05] is very similar to the work presented in this thesis:
in both cases, an extension of an application server technology is presented. However, some
differences can be identified and summarized as follows.
    In this thesis, QaAS uses warning thresholds in order to dynamically adapt the hosting en-
vironment to the changes that occur at run time. In the first prototype architecture presented in
this thesis, these thresholds were statically determined, whereas in the discussed evolution of
the prototype they were computed by taking into account the actual operational conditions of
the cluster. Based on these thresholds, QaAS starts adapting the hosting environment in order to
prevent the cluster from reaching the breaching points specified in the SLAs.
    In contrast, in [PSS+ 05] the observed performance is mapped to a continuous utility function
and the adaptation is constantly enabled in order to maximize the overall utility function.
    QaAS uses a load balancing mechanism that can be considered a J2EE service of the appli-
cation server in which it is integrated; it can be configured so as to use a variety of static as well
as dynamic load balancing strategies. In contrast, to the best of my knowledge, the load balancing
mechanism incorporated in the above described Proxy tier uses a weighted round robin



policy only.
    To conclude, it is worth noticing that the work presented in [PSS+ 05] addresses a further
complexity, as multiple web applications with different SLAs can be deployed. In contrast, in the
presented experimental evaluation of QaAS, it has been assumed that a single application is
deployed in the cluster. However, work is in progress to enable the deployment of multiple
applications in the QoS-aware clustering environment (further details concerning this
deployment can be found in Section 1).


5    Architectures for Load Balancing
As to the load balancing activity, in the following some works in this research field are com-
pared and contrasted with the solution proposed by this thesis.
     In [OOS01] and [BSDO04] the authors present a novel adaptive load balancing service that
can be implemented using standard CORBA features. The key features of this load balancing ser-
vice, named Cygnus, are its ability to make load balancing decisions based on application-defined
load metrics, to dynamically (re)configure load balancing strategies at run-time, and to transparently
add load balancing support to client and server applications.
     The approach discussed in these two references is similar to that presented so far. However,
the service is principally integrated in CORBA and exploits such basic CORBA features and com-
ponents as the “location forward GIOP” message (used to forward client requests to other servers
transparently, portably and interoperably) and the Portable Object Adapter (POA). Moreover, in
their proposed adaptive load balancing policy, a load metric threshold is used in order to
determine whether a particular location is overloaded. The load metrics supported are (i) a
requests-per-second metric, which calculates the average number of requests arriving per second
at each server, (ii) a CPU run queue length metric, which returns the load expressed in terms of
the number of processes in the OS run queue over a configurable time period in seconds, and finally
(iii) a CPU utilization metric, which returns the load as the CPU usage percentage. These met-
rics are computed by a specific monitoring activity included in the Cygnus load balancing
system.
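The first of these metrics, combined with a threshold test, can be sketched as follows. This is an illustrative sketch under assumed names and window handling, not the Cygnus implementation.

```java
/**
 * Sketch of a requests-per-second load metric with a threshold
 * test, in the spirit of the first Cygnus metric. Illustrative
 * only: the windowing scheme and all names are assumptions.
 */
public class RequestRateMetric {
    private long requests = 0;
    private long windowStartMillis;

    public RequestRateMetric(long nowMillis) {
        this.windowStartMillis = nowMillis;
    }

    /** Called once per incoming request. */
    public void recordRequest() {
        requests++;
    }

    /** Average requests/second since the window started; resets the window. */
    public double sample(long nowMillis) {
        double seconds =
            Math.max(1e-3, (nowMillis - windowStartMillis) / 1000.0);
        double rate = requests / seconds;
        requests = 0;
        windowStartMillis = nowMillis;
        return rate;
    }

    /** Threshold test: is the sampled rate above the overload threshold? */
    public static boolean overloaded(double rate, double threshold) {
        return rate > threshold;
    }
}
```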
     In contrast, the approach of this thesis is applied in a Java environment, specifically in a
component-based technology such as J2EE.
     In the J2EE context, to the best of my knowledge, widely used non open source application
servers such as WebSphere and WebLogic provide the users with HTTP load balancing mecha-
nisms similar to the solution of this work.
     Specifically, the WebLogic platform consists of a WebLogic Server equipped with a set of
Web server plug-ins. These plug-ins are modules that can be added to third party Web
Servers and configured appropriately to enable the interactions between the WebLogic Server
and the application components hosted inside proprietary Web Servers. In general, the plug-ins
are used to distribute the HTTP client requests among the WebLogic clustered servers; their static
load balancing policies are (i) Round Robin, which cycles through the list of WebLogic Server
instances in order, and (ii) Weight-based, which improves the previous round robin algorithm
by taking into account a pre-assigned weight for each server [bea03]. No adaptive policies are
provided.
     Furthermore, it is worth observing that the plug-ins use a first list of target WebLogic Server
instances as the starting point for load balancing among the members of the cluster. After the first
request is routed, a dynamic list of servers is returned. This mechanism allows WebLogic to
provide the users with a load balancing technique that is dynamic in terms of variations in the clus-
ter membership configuration (this is also provided by the CS). However, WebLogic's load
distribution mechanism is not part of the J2EE architecture, as the LBS is.
     In the WebSphere application server, load balancing can take place at two different levels
that are described as follows.
     The first solution proposed by WebSphere [ea04b] is to include the load balancing mechanism
inside a plug-in for the Web container. The plug-in distributes the HTTP client requests based on



two different algorithms, namely the Round Robin with Weighting, which cycles through the list
of WebSphere instances and decrements their weight by 1 when a server is first selected, and the
Random algorithm. Note that these two load balancing strategies are static; they are unable to
cope with variations in the computational load of the servers, as the WorkLoad policy does. In
addition, the list of the available WebSphere instances is not dynamically determined, as it is in
both the LBS's and WebLogic's cases, but is included in a specific configuration file.
    The second solution proposed by WebSphere [ea04a] is to use an IBM software package named WebSphere Edge Components, which includes a Load Balancer. This Load Balancer consists of different components; the most important one, for the purposes of the current discussion, is the Dispatcher component.
    The Dispatcher distributes the load it receives to a set of servers contained in the cluster, and decides which server will handle each HTTP request based on the weight of each server in that cluster. The weight can be set to a fixed value (it does not change regardless of the conditions of the balanced servers) or it can be dynamically computed by the Dispatcher. In the latter case, the list of available clustered servers is consequently obtained dynamically.
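The two weighting modes of the Dispatcher can be illustrated with a minimal sketch. The class name, the load metric, and the weight formula below are assumptions for illustration only, not the actual Edge Components API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch of the Dispatcher's two weighting modes: a fixed,
// administrator-set weight, or a weight recomputed from observed server
// conditions (here, a normalized load metric). All names are hypothetical.
class Dispatcher {
    private final Map<String, Integer> weights = new LinkedHashMap<>();

    // Fixed mode: the weight never changes, whatever the server conditions.
    void setFixedWeight(String server, int weight) {
        weights.put(server, weight);
    }

    // Dynamic mode: recompute a server's weight from a measured load in
    // [0.0, 1.0]; lightly loaded servers receive a higher weight.
    void updateFromLoad(String server, double load) {
        int weight = (int) Math.round((1.0 - load) * 10);
        weights.put(server, Math.max(weight, 1)); // keep every server eligible
    }

    // Route the next request to the currently heaviest-weighted server.
    String select() {
        return weights.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .orElseThrow()
                .getKey();
    }
}
```

In the dynamic mode the set of candidate servers is whatever has been reported recently, which is why the list of available clustered servers is obtained dynamically in that case.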
    To conclude, it is worth observing that the approach designed in this thesis is similar to the second solution proposed by WebSphere. The main difference is that the LBS is a J2EE component fully integrated into the application server, rather than being part of a specific architecture such as the aforementioned IBM WebSphere Edge Components.


6     Conclusions
From this assessment, some key issues for the purposes of this thesis have been identified; they are summarized in the following.
    All the above architectures support object-oriented paradigms in the form of remote services or remote objects.
    In addition, the state of the art on end-to-end QoS consists of architectures constructed out of QoS-aware middleware services that provide applications with end-to-end QoS guarantees. These services are typically responsible for meeting the non-functional application requirements mentioned in this thesis, so as to allow application developers to concentrate on the implementation of the functional aspects of their applications.
    The distinguishing features of these architectures can be summarized as follows. Most of the
aforementioned works (RT-Corba, TAO, QuO, AQuA, RTSJ, RTARM, UIC) address issues of end-
to-end QoS in multimedia and mission critical distributed applications, incorporating reservation
and monitoring mechanisms.
    The QuO/AQuA/TAO approach to monitoring and adaptation focuses on single client-server interactions and does not directly support resource management for a large number of interactions, as is the case in this thesis.
    Moreover, following the middleware layer classification of the Introduction of this work, architectures such as RT-Corba, TAO, UIC, and dynamicTAO provide a collection of middleware services and can be located at the Distribution layer. In contrast, QuO (and consequently AQuA, which uses QuO) provides services for monitoring and adaptation at the Common Middleware Services layer, as is the case for the QoS-aware middleware services described in this thesis.
    Most of the examined frameworks are CORBA-based middleware platforms. They all use CORBA as the distribution middleware layer and, on top of it, they develop and use QoS-aware middleware services. These services exploit the underlying CORBA functionalities for developing and executing distributed applications.
    In contrast, only a few architectures are available in Java for QoS management in distributed systems. In addition, most of them concentrate their efforts either on the lowest levels of the distributed system (i.e., at the ORB level) or on the programming language level. Hence, there are works in the literature that (i) propose solutions to extend the Java Remote Method Invocation (Java-RMI) to make it timely predictable [DeM01, Bor03] and (ii) propose solutions to extend the Java
programming language with further specifications, in order to provide the application program-
mers with certain language features (e.g., the aforementioned RTSJ).
    In addition, it has been seen that current open source J2EE [Sha03] application server technologies are Java-based middleware platforms that include middleware services for transaction, persistence, and communication management. However, they are not fully instrumented to meet such non-functional requirements as availability, reliability, timeliness, and scalability, and they lack middleware services for resource configuration, monitoring, and adaptation.
    Nevertheless, in a non-open source commercial application server (i.e., WebSphere), work is in progress to integrate the aforementioned middleware services for resource monitoring and adaptation. In particular, a system architecture [PSS+ 05] very similar to the work presented in this thesis has been discussed; in fact, in both cases the developed architectures target the same objectives. However, as described before, the approach used by QaAS to achieve the objectives of this thesis is different from that included in WebSphere.
    In addition, for the purpose of this thesis, the Oceano system seems to propose a solution that is very close to the one described in this work. However, as previously noted, there are some differences between the two approaches, principally concerning the load balancing policies adopted.
    To conclude, this assessment has confirmed that a number of recommendations and design principles adopted in this thesis are necessary in order to achieve the discussed objectives. These include recommendations such as the need to incorporate a resource monitoring service, in order to assess the resource state at run time, as well as design principles such as those derived from control theory, in order to deploy adaptation facilities.
    Moreover, clustering provides a valuable approach to the design and development of adequate support for highly available distributed applications, as these applications can be replicated across multiple machines in a controlled manner. However, to the best of my knowledge, none of the clustering mechanisms cited above is based on augmenting existing open source component-based technologies with middleware services for cluster-wide SLA enforcement, configuration, run-time monitoring of the delivered QoS, and adaptation.




Chapter 6

Concluding remarks and future
directions

Advanced resource management techniques are becoming increasingly necessary in a variety of different technological contexts. For example, in Grid computing [FK98], resources may have to be dynamically discovered and made available to the application; in Utility Computing [IBM04], resources (e.g., in a data center) may have to be allocated to different applications on demand; in Application Service Provision [Con], resources may have to be allocated to, and optimized for, different hosted applications. These techniques are of crucial importance in order to optimize resource usage, avoid resource over-provisioning, and ensure that the QoS requirements of applications are effectively met.
    This thesis has discussed the design, implementation, and experimental evaluation of a collection of QoS-aware middleware services that provide distributed applications with such a resource management technique in the context of application server technologies. By using these services, which are part of a generic open source middleware architecture, one can implement QoS-aware, dynamic clustering of application servers.
    In particular, two different stages in the design of such QoS-aware middleware services have been discussed. In the first stage, a preliminary implementation (QaAS v 1.0, available for download at the SourceForge web site [TAP]) was released and evaluated. The results obtained from the preliminary experimental evaluation have shown the ability of QaAS to honor hosting SLAs and to optimize the usage of the clustered resources, without incurring resource over-provisioning. Moreover, it has been shown that an adaptive load balancing approach can be beneficial compared to a static load balancing service; in fact, the results obtained show improvements in both the performance perceived by the clients and the utilization of the clustered resources when an adaptive, rather than a static, load balancing service is enabled in the cluster.
    The second stage in the design of QaAS introduced (i) a new model used to dynamically compute the thresholds that trigger the run-time re-configuration of the cluster, and (ii) the possibility of violating the hosting SLAs, provided that the violations are kept below a predefined limit specified in the SLA itself. Once again, the new QaAS has been implemented by producing a new version of the software, which has been evaluated.
    The experimental results obtained from a second validation exercise of this new prototype of the middleware services confirm the adequacy of the proposed approach; specifically, these results show that the QoS-aware middleware services enable the implementation of an application hosting environment that improves upon the results of the preliminary evaluation concerning the optimization of the clustered resources. In addition, it has been shown that the middleware can still honor hosting SLAs, as it keeps the SLA violations significantly below the imposed SLA limit.
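The violation-limit idea of this second stage can be illustrated with a small sketch. The `SlaMonitor` class and its counters are hypothetical, not the actual QaAS implementation; they only show how a violation rate might be checked against the limit declared in the SLA itself.

```java
// Hypothetical sketch: track the fraction of SLA-violating requests and
// check it against the violation limit declared in the hosting SLA.
class SlaMonitor {
    private long total = 0;
    private long violations = 0;
    private final double violationLimit; // e.g. 0.05 = at most 5% of requests

    SlaMonitor(double violationLimit) {
        this.violationLimit = violationLimit;
    }

    // Record one completed request and whether it met its SLA.
    void record(boolean metSla) {
        total++;
        if (!metSla) violations++;
    }

    // True while the observed violation rate is within the agreed limit;
    // a false return would trigger a re-configuration of the cluster.
    boolean withinLimit() {
        if (total == 0) return true;
        return (double) violations / total <= violationLimit;
    }
}
```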
    However, it has also been demonstrated that in a WAN context, QaAS is not as beneficial as in the case of a LAN environment. Within such a wide area context, the mechanisms described in this thesis for advanced resource management can exhibit scalability challenges that must be addressed. In fact, with thousands of geographically deployed application server instances and issued client requests, a flat approach such as the one presented in this work can prove inappropriate.

                    [Figure 1. Geographical clusters of nodes: a WAN cluster, possibly containing nested WAN clusters and LAN clusters, whose leaves are single machines deploying QaAS nodes]


1     Future directions
Hence, future work includes extending the QoS-aware clustering mechanism to operate effectively across wide area networks, and investigating issues of both service provision and component-based application deployment in geographically distributed, clustered environments.
    Thus, in order to enable effectively scalable QoS-aware clustering, a hierarchical approach can be adopted.
    To this end, clusters of application servers can be combined so as to create a recursive, hierarchical clustering structure. Therefore, a possible extension of the work presented in this thesis identifies two principal hierarchical levels in the QoS-aware clustering; namely, a new level that can be termed the Wide Area Network (WAN) level, and the Local Area Network (LAN) level already used in this thesis.
    Figure 1 depicts a preliminary idea of such a geographical hierarchical model. As shown in Figure 1, a geographical cluster can be used to host distributed applications.
    That cluster is a WAN cluster, possibly constructed out of other WAN clusters (dashed circles in Figure 1) and LAN clusters, which may in turn consist recursively of other LAN clusters (dotted circles in Figure 1). The leaves of this hierarchical structure are single machines (the smallest circles with diamonds in Figure 1) that deploy the discussed QaAS nodes.
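The recursive WAN/LAN structure of Figure 1 could be represented, for instance, by a simple tree whose leaves are the machines deploying QaAS nodes. The class and method names below are illustrative assumptions, not part of the QaAS design.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the hierarchy of Figure 1: a node is either a
// single machine (leaf) or a WAN/LAN cluster containing further clusters.
class ClusterNode {
    enum Kind { WAN, LAN, MACHINE }

    private final Kind kind;
    private final String name;
    private final List<ClusterNode> children = new ArrayList<>();

    ClusterNode(Kind kind, String name) {
        this.kind = kind;
        this.name = name;
    }

    ClusterNode add(ClusterNode child) {
        children.add(child);
        return this;
    }

    // Count the QaAS-hosting machines, i.e. the leaves of the hierarchy.
    int machineCount() {
        if (kind == Kind.MACHINE) return 1;
        int n = 0;
        for (ClusterNode c : children) n += c.machineCount();
        return n;
    }
}
```

Such a structure makes the recursion explicit: a WAN cluster may contain other WAN clusters, LAN clusters, or both, down to the single machines.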
    Within such a context, it is necessary to design specific components capable of analyzing the clusters, in order to select the geographical cluster configuration that can best fulfill the QoS requirements specified in SLAs.
    Hence, one can envisage the discussed QoS-aware middleware services operating at the WAN cluster level and cooperating with those deployed at the LAN level, in order to provide applications with advanced geographical resource management. Thus, a Configuration Service at the WAN level may select the LAN clusters that compose a geographical QoS-aware cluster, depending on specific information concerning each cluster, or it may require the installation of Virtual Private Network (VPN) technologies [Dav04] amongst the selected clusters.

Multiple application deployment Another possible extension of the work presented in this thesis concerns the ability of the QoS-aware middleware services to apply dynamic resource management techniques to a large number of applications deployed in the cluster; these applications can have their own hosting SLAs to be honored and may compete for the use of the clustered resources. In this case, work is in progress that consists of assigning priorities to the various hosting SLAs and monitoring those SLAs, so as to configure the clustered nodes by preferring the applications with the highest SLA priorities.
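This in-progress priority scheme could be sketched, under the assumption of a simple integer priority per hosting SLA, as follows. The `HostingSla` record and its fields are hypothetical, introduced only for illustration.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: when several hosted applications compete for the
// clustered nodes, order their hosting SLAs by priority and configure the
// nodes in favor of the highest-priority applications first.
record HostingSla(String application, int priority) {

    // Highest priority first; ties broken by application name for determinism.
    static List<HostingSla> byPriority(List<HostingSla> slas) {
        return slas.stream()
                .sorted(Comparator.comparingInt(HostingSla::priority).reversed()
                        .thenComparing(HostingSla::application))
                .toList();
    }
}
```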






References
[ade]       ”Adesso AG”. Available at http://www.adesso.de.
[ADZ00]     M. Aron, P. Druschel, and W. Zwaenepoel. ”Cluster Reserves: A Mechanism for Re-
            source Management in Cluster-based Network Servers”. In Proceedings of the ACM
            SIGMETRICS Conference, Santa Clara, CA, June 2000.
[AFF+ 01]   K. Appleby, S. Fakhouri, L. Fong, G. Goldszmidt, M. Kalantar, S. Krishnakumar, D. P.
            Pazel, J. Pershing, and B. Rockwerger. “Oceano-SLA Based Management of a Com-
            puting Utility”. In Proceedings of the International Symposium on Integrated Network
            Management, pages 14–18, Seattle WA, May 2001.
[Agr]       Service Level Agreement. Available at http://www.wilsonmar.com/1websvcs.htm.
[AS99]      Y. Amir and D. Shaw. ”Walrus: A Low Latency, High Throughput Web Service us-
            ing Internet-wide Replication”. In Proceedings of the 19th IEEE ICDCS Workshop on
            Electronic Commerce and Web-based Applications, pages 31–40, Austin, May 1999.
[Bala]      The Cisco LocalDirector Load Balancing. Available at http://www.cisco.com/.
[Balb]      ZXTM Load Balancing. Available at http://www.zeus.com/products/zlb/.
[Bao00]     L. Baochun. ”Agilos: A Middleware Control Architecture for Application- Aware Quality of
            Service Adaptations”. PhD thesis, University of Illinois at Urbana-Champaign, Illinois,
            2000.
[BCL+ 04]   M. J. Buco, R. N. Chang, L. Z. Luan, C. Ward, J. L. Wolf, and P. S. Yu. ”Utility
            computing SLA management based upon business objectives”. IBM Systems Jour-
            nal, 43(1):159–178, 2004.
[BCS94]     R. Braden, D. Clark, and S. Shenker. ”Integrated Services in the Internet Archi-
            tecture: An Overview”. Technical report, IETF RFC 1633, 1994. Available at
            http://www.ietf.org/rfc/rfc1633.txt.
[bea03]     ”BEA WebLogic Server - Using WebLogic Server Clusters (v. 8.1)”. Available at
            http://e-docs.bea.com/wls/docs81/cluster/, 20 October 2003.
[BFH+ 00]   G. Bollella, S. Furr, D. Hardin, P. Dibble, J. Gosling, M. Turnbull, and R. Belliardi. The
            Real-Time Specification for Java. Addison-Wesley, 2000.
[BG00]      G. Bollella and J. Gosling. ”The Real-Time Specification for Java”. IEEE Computer,
            33(6):47–54, June 2000.
[Bob03]     A. Bobbio. ”Dependability Theory and Methods”. Lecture notes for BISS-2003 Berti-
            noro International School for Graduate Studies in Computer Science. Available at
            http://www.mfn.unipmn.it/∼bobbio, March 10-14 2003.
[Bor03]     A. Borg. ”A Real-Time RMI Framework for the RTSJ”. In Proceedings of the 15th
            Euromicro Conference on Real-Time Systems (ECRTS’03), Porto, Portugal, 2-4 July 2003.
[BSDO04]    J. Balasubramanian, D. C. Schmidt, L. Dowdy, and O. Othman. ”Evaluating the Per-
            formance of Middleware Load Balancing Strategies”. In Proceedings of the 8th Interna-
            tional IEEE Enterprise Distributed Object Computing Conference (EDOC’04), Monterey,
            California, September 20-24 2004.
[CJCP00]    I. Cardei, R. Jha, M. Cardei, and A. Pavan. ”Hierarchical Architecture for Real-Time
            Adaptive Resource Management”. In Proceedings of IFIP/ACM International Con-
            ference on Distributed Systems Platforms and Open Distributed Processing (Middleware
            2000), LNCS 1795, pages 415–434, New York, USA, April 2000.



[Con]       ASP Industry Consortium. ”SLA for Application Service Provisioning”. White Pa-
            pers, Available at http://allaboutasp.org.
[CPFD01]    D. Conan, E. Putryez, N. Farcet, and M. A. DeMiguel. ”Integration of Non-
            Functional Properties in Containers”. In Proceedings of the 6th International Workshop
            on Component Oriented Programming, Budapest, Hungary, 2001.

[CRS+ 98]   M. Cukier, J. Ren, C. Sabnis, D. Henke, J. Pistole, W. H. Sanders, D. E. Bakken, M. E.
            Berman, D. A. Karr, and R. E. Schantz. ”AQuA: An Adaptive Architecture That
            Provides Dependable Distributed Objects”. In Proceedings of the 17th IEEE Symposium
            on Reliable Distributed Systems (SRDS’98), pages 245–253, West Lafayette, Indiana,
            October 20-23 1998.

[Dav04]     R. Davoli. ”VDE: Virtual Distributed Ethernet”. Technical report, University of
            Bologna, Dept. of Computer Science TR-UBLCS-2004-12, June 2004.
[DeM01]     M. A. DeMiguel. ”Solutions to Make Java-RMI Time Predictable”. In Proceedings of
            the 4th IEEE International Symposium on Object-Oriented Real-Time Distributed Comput-
            ing, pages 379–386, 2001.

[DK03]      M. Debusmann and A. Keller. ”SLA-driven Management of Distributed Systems us-
            ing the Common Information Model”. In Proceedings of the 8th International IFIP/IEEE
            Symposium on Integrated Management (IM 2003), Colorado Springs, CO, USA, March
            2003.
[ea97]      R. Braden et al. ”Resource ReSerVation Protocol (RSVP) Version 1 Functional
            Specification”. Technical report, IETF RFC 2205, September 1997. Available at
            http://www.ietf.org/rfc/rfc2205.txt.
[ea98]      M. Carson et al. ”An Architecture for Differentiated Services”. Technical report, IETF
            RFC 2475, December 1998. Available at http://www.ietf.org/rfc/rfc2475.txt.

[ea04a]     B. Roehm et al. IBM WebSphere Application Server V6 Scalability and Performance Hand-
            book. IBM Redbooks, IBM Corporation, 2004.
[ea04b]     B. Roehm et al. IBM WebSphere V5.1 Performance, Scalability, and High Availability:
            WebSphere Handbook Series. IBM Redbooks, IBM Corporation, 2004.
[FDG+ 97]   V. Fay, L. C. DiPippo, R. Ginis, M. Squadrito, S. Wohlever, I. Zykh, and R. Johnston.
            ”Real-Time CORBA”. In Proceedings of the 3rd IEEE Real-Time Technology and Applica-
            tions Symposium, Montreal, Canada, June 1997.
[Fil]       The Servlet Filters. Available at http://java.sun.com/products/servlet/Filters.html.
[FK98]      I. Foster and C. Kesselman. ”The Grid: Blueprint for a New Computing Infrastruc-
            ture”. Morgan Kaufmann Publishers, 1998.
[FLPS03]    G. Ferrari, G. Lodi, F. Panzieri, and S. K. Shrivastava. ”The TAPAS Architecture:
            QoS-enabled Application Servers”. In TAPAS Deliverable D7 Trusted and QoS-Aware
            Provision of Application Services IST Project No: IST-2001-34069, Brussels, April 2003.
[For99]     Distributed Management Task Force. ”Common Information Model (CIM) Version
            2.2 Specification”, June 1999. Available at http://www.dmtf.org/standards/cim spec v22/.
[FR03]      M. Fleury and F. Reverbel. ”The JBoss Extensible Server”. In ACM/IFIP/USENIX
            International Middleware Conference, Rio de Janeiro, Brazil, 16-20 June 2003.





[FSE04]     G. Ferrari, S. K. Shrivastava, and P. Ezhilchelvan. ”An Approach in Adaptive Per-
            formance Tuning of Application Servers”. In Proceedings of the 1st IEEE International
            Workshop on Quality of Service in Application Servers (QoSAS 2004), in conjunction with
            23rd Symposium on Reliable Distributed Systems (SRDS 2004), pages 7–12, Jurere Beach
            Village, Santa Catarina Island, Brazil, 17 October 2004.
[Ghi01]     V. Ghini. ”QoS-Adaptive Middleware Services”. PhD thesis, University of Bologna,
            Italy, 2001.
[hlb]       The    F5   BIG-IP   hardware    load           balancing.             Available     at
            http://www.f5.com/products/bigip/.
[IBM04]     IBM. Utility Computing, volume 43. IBM Systems Journal, 2004.

[ISP00]     D. B. Ingham, S. K. Shrivastava, and F. Panzieri. ”Constructing Dependable Web
            Services”. IEEE Internet Computing, 4(1):25–33, January-February 2000.
[jbo]       ”The JBoss Application Server”. Available at http://www.jboss.org.
[jgr]       ”The JGroups reliable group communication”. Available at http://javagroups.com.

[jme]       ”Apache JMeter”. Available at http://jakarta.apache.org/jmeter.
[jon]       ”The JOnAS Application Server”. Available at http://www.objectweb.org.
[KL03]      A. Keller and H. Ludwig. ”The WSLA Framework: Specifying and Monitoring Ser-
            vice Level Agreements for Web Services”. Journal of Network and Systems Management,
            Special Issues on E-Business Management, Plenum Publishing Corporation, 11(1), March
            2003. Preprint available as IBM Research Report RC22456.
[KRL+ 00]   F. Kon, M. Roman, P. Liu, J. Mao, T. Yamane, L. C. Magalhes, and R. H. Campbell.
            ”Monitoring, Security, and Dynamic Configuration with the dynamicTAO Reflective
            ORB”. In IFIP/ACM International Conference on Distributed Systems Platforms and Open
            Distributed Processing (Middleware 2000), LNCS 1795, pages 121–143, New York, 3-7
            April 2000.
[KSC03]     S. Krishnamurthy, W. H. Sanders, and M. Cukier. ”An Adaptive Quality of Service
            Aware Middleware for Replicated Services”. IEEE Transactions on Parallel and Dis-
            tributed Systems, 14(11):1112–1125, November 2003.

[LBtJG04]   S. Labourey, B. Burke, and the JBoss Group. ”JBoss AS Clustering”. JBoss Inc, 3340
            Peachtree Road, Suite 1225, Atlanta, GA 30326 USA, May 2004.
[Lod04]     G. Lodi. ”Dependable Application Hosting”. In Proceedings of the 2004 International
            Conference on Dependable Systems and Networks (Student Forum Track), Florence, Italy,
            June 28 - July 1 2004.
[lot]       ”IBM Lotus”. Available at http://www-106.ibm.com/developerworks/lotus.
[LP04]      G. Lodi and F. Panzieri. ”QoS-aware Clustering of Application Servers”. In Pro-
            ceedings of the 1st IEEE International Workshop on Quality of Service in Application
            Servers (QoSAS 2004), in conjunction with 23rd Symposium on Reliable Distributed Sys-
            tems (SRDS 2004), pages 1–6, Jurere Beach Village, Santa Catarina Island, Brazil, 17
            October 2004.
[LPRT05]    G. Lodi, F. Panzieri, D. Rossi, and E. Turrini. ”Experimental Evaluation of a QoS-
            aware Application Server”. In Proceedings of the 4th International IEEE Symposium on
            Network Computing and Applications (IEEE NCA05), Marriott Hotel, Kendal Square,
            Cambridge, MA USA, July 27-29 2005.



[LSZB98]    J. P. Loyall, R. E. Schantz, J. A. Zinky, and D. E. Bakken. ”Specifying and Measuring
            Quality of Service in Distributed Object Systems”. In Proceedings of ISORC’98, Kyoto,
            Japan, 20-22 April 1998.
[Mic02a]    Sun Microsystems. ”Java Management eXtension: Instrumentation and Agent Specifica-
            tion v.1.1”, 2002. Available at http://java.sun.com/jmx/.

[Mic02b]    Sun Microsystems. ”Java Tutorial”, 2002. Available at http://java.sun.com.
[MJSCG04] C. Molina-Jimenez, S. K. Shrivastava, J. Crowcroft, and P. Gevros. ”On the Monitor-
          ing of Contractual Service Level Agreements”. In Proceedings of the 1st International
          IEEE Workshop on Electronic Contracting (WEC), San Diego, July 2004.
[Mod]       Mod jk. Available at http://jakarta.apache.org/tomcat/connectors-doc/.
[OOS01]     O. Othman, C. O’Ryan, and D. C. Schmidt. ”The Design and Performance of an
            Adaptive CORBA Load Balancing Service”. IEEE Distributed Systems Online, 2(3),
            March 2001.
[Pos]       PostgreSQL. Available at http://www.postgresql.org/.

[Pro02]     J. Prosise. Programming Microsoft .NET, Core Reference. ISBN: 0-7356-1376-1. Microsoft
            Press, 2002.
[PSS+ 05]   G. Pacifici, W. Segmuller, M. Spreitzer, M. Steinder, A. Tantawi, and A. Youssef.
            ”Managing the Response Time for Multi-tiered Web Applications”. Technical report,
            IBM RC 23651, 2005.

[PSTY02]    G. Pacifici, M. Spreitzer, A. Tantawi, and A. Youssef. “Performance management for
            web services”. Technical report, RC22676 IBM Research, August 2002.
[RFC97]     RFC-2109. HTTP State Management Mechanism, February 1997.
[RKC01]     M. Roman, F. Kon, and R. H. Campbell. ”Reflective Middleware: From Your Desk to
            Your Hand”. IEEE Distributed Systems Online Journal Special Issue on Reflective Middle-
            ware, July 2001.
[RT04]      D. Rossi and E. Turrini. “Testing J2EE clustering performance and what we found
            there”. In Proceedings of the 1st IEEE International Workshop on Quality of Service in
            Application Servers, in conjunction with 23rd Symposium on Reliable Distributed Systems
            (SRDS ’04), pages 13–18, Florianopolis, Brazil, 2004.
[Sch02]     D. C. Schmidt. ”Middleware for Real-Time and Embedded Systems”. Communica-
            tions of the ACM, 45(6):43–48, June 2002.
[Ser]       Linux Virtual Server. Available at http://www.linuxvirtualserver.org/.

[Sha03]     B. Shannon. ”Java 2 Platform Enterprise Edition v. 1.4”, November 24 2003. Sun Mi-
            crosystems. Available at http://java.sun.com.
[SK00]      D. C. Schmidt and F. Kuhns. ”An Overview of the Real-time CORBA Specification”.
            IEEE Computer Special Issue on Object-Oriented Real-Time Distributed Computing, June
            2000.

[SLE04]     J. Skene, D. Lamanna, and W. Emmerich. ”Precise Service Level Agreements”. In
            Proceedings of the 26th International Conference on Software Engineering (ICSE’04), Edin-
            burgh, Scotland, UK, 25 May 2004.
[SLM97]     D. C. Schmidt, D. L. Levine, and S. Mungee. ”The Design of the TAO Real-Time
            Object Request Broker”. Computer Communication Journal, Summer 1997.



[STYC02]    K. Shen, H. Tang, T. Yang, and L. Chu. ”Integrated Resource Management for
            Cluster-based Internet Services”. In Proceedings of the 5th Symposium on Operating
            Systems Design and Implementation, USENIX Association, Boston, Massachusetts,
            USA, 9-11 December 2002.
[SYC03]     K. Shen, T. Yang, and L. Chu. ”Clustering Support and Replication Management
            for Scalable Network Services”. IEEE Transactions on Parallel and Distributed Systems,
            14(11):1168–1179, November 2003.
[TAP]       TAPAS. Available at http://tapas.sourceforge.net.
[TIRL03]    F. Tartanoglu, V. Issarny, A. Romanovsky, and N. Levy. ”Coordinated Forward Error
            Recovery for Web Services”. In Proceedings of the 22nd Symposium on Reliable Dis-
            tributed Systems (SRDS’2003), Florence, Italy, October 2003.
[Tom]       Tomcat. Available at http://jakarta.apache.org/tomcat/.
[Tri02]     K. Trivedi. Probability and Statistics with Reliability Queueing and Computer Science
            Applications- Second Edition. Wiley, 2002.

[USR02]     B. Urgaonkar, P. Shenoy, and T. Roscoe. ”Resource Overbooking and Application
            Profiling in Shared Hosting Platforms”. In Proceedings of the 5th Symposium on Oper-
            ating Systems Design and Implementation, USENIX Association, Boston, Massachu-
            setts, USA, 9-11 December 2002.
[Vet95]     R. J. Vetter. ATM concepts, architectures, and protocols, volume 38. ACM Press, New
            York, NY, USA, February 1995.
[VZL+ 98]   R. Vanegas, J. A. Zinky, J. P. Loyall, D. A. Karr, R. E. Schantz, and D. E. Bakken.
            ”QuO’s Runtime Support for Quality of Service in Distributed Objects”. In Pro-
            ceedings of IFIP International Conference on Distributed Systems Platforms and Open Dis-
            tributed Processing (Middleware ’98), September 1998.
[web]       ”The IBM WebSphere Application Server”.                  Available at http://www-
            306.ibm.com/software/webserver/appserv.
[WSO00]     N. Wang, D. C. Schmidt, and C. O’Ryan. ”An Overview of the CORBA Component
            Model”, chapter in Computer-Based Software Engineering. Addison-Wesley, Mas-
            sachusetts, 2000.
[Wus02]     E. Wustenhoff. ”Service Level Agreement in the Data Center”. Sun Blueprints, April
            2002.
[ZBS97]     J. A. Zinky, D. E. Bakken, and R. D. Schantz. ”Architectural Support for Quality of
            Service for CORBA Objects”. Theory and Practice of Object Systems, 3(1), 1997.
[ZK02]      T. Zhao and V. Karamcheti. ”Enforcing Resource Sharing Agreements among Dis-
            tributed Server Clusters”. In Proceedings of the 16th International Parallel and Dis-
            tributed Processing Symposium (IPDPS), April 2002.
[ZLAS02]    R. Zhang, C. Lu, T. F. Abdelzaher, and J. A. Stankovic. ”ControlWare: A Middleware
            Architecture for Feedback Control of Software Performance”. In Proceedings of the
            International Conference on Distributed Computing Systems, July 2002.



