Toward Automatic State Management for Dynamic Web Services
Geoff C. Berry, Jeffrey S. Chase, Geoff A. Cohen, Landon P. Cox, and Amin Vahdat
Department of Computer Science
Duke University

This work is supported by the National Science Foundation under grants CCR-96-24857 and CDA-95-12356. Geoff Cohen was supported in part by an IBM Cooperative Graduate Fellowship.

Abstract

A key challenge in the development of the Internet is to simplify construction of scalable wide-area services. One approach to scaling wide-area services is to deploy generic computing power and storage in the network, and use it to absorb service load through dynamic resource recruitment, active caching, or dynamic service replication. Each of these approaches introduces distributed state and an accompanying burden on the programmer to manage that state.

This paper develops an approach to automatic state management for replicated services, a key step toward the goal of automatically converting unscalable service implementations into scalable ones. We demonstrate a prototype implementation of automatic state management, called Ivory. Ivory transforms the bytecodes of a Java-based service to trap updates to its data structures and propagate modified objects among peer replicas. We demonstrate our approach in the context of a service caching framework that replicates service code and data on demand, and present measurements of an example Web portal application that show the overhead and scalability benefits of service replication using Ivory.

1 Introduction

Dynamic Web content is becoming increasingly important as Internet services continue to evolve. A dynamic Web service generates custom documents on-the-fly by executing code over internal service state often extracted from a database. The generated response may depend on arguments in the request, session history, or user information accessed through a cookie, and the service may update its internal state as a side effect of the request. The Common Gateway Interface (CGI), Microsoft's Active Server Pages (ASP), Java servlets, JavaServer Pages (JSP), and other technologies for dynamic Web content are central to the continuing evolution of "Web servers" into "Web application servers" supporting personalized content (e.g., my.*.com), electronic commerce and auctions, online financial services, and communication services such as Web-based mail. The technology is also gaining popularity as a vehicle for delivering outsourced application services, e.g., billing or personnel management for small businesses, and to associate code with static content, e.g., to track user access patterns.

Scaling these dynamic Web services is a key challenge for the continuing development of the Internet. Increasing bandwidths enable more advanced dynamic applications, but these applications are interactive and must respond quickly. Faster networks alone cannot overcome latencies imposed by server load or the speed of light, or outages caused by server failures or network glitches. The solution is to use caching and replication to push application data and processing out into the network and closer to the end users. Unfortunately, dynamic content defeats current Web proxy caches. Dynamically generated documents are not cacheable because they may change each time they are accessed. This presents a fundamental limit to the effectiveness of Web document caching. For dynamic services to benefit, caching and replication strategies must be extended to replicate some or all of the service itself (its code and internal state) rather than merely caching or replicating the documents that it generates.

Several frameworks exist for managing service replication in the Web, including research systems [VAD+98, RRRA99] and emerging Web hosting providers (e.g., Akamai and Sandpiper Networks). One difficulty with replicating service state is that the service must maintain consistency among its replicas; updates originating at any site must propagate to all replicas using the modified state. The specific way to address the consistency problem depends on the representation of the data, its internal consistency requirements, and the nature of the updates. For example, some or all of the server state may be stored in a collection of files, a relational database, or data structures generated by programs. Each requires a different level of system support to maintain consistency.

This paper presents techniques for automatically managing replica consistency, taking an important step toward transparent replication of dynamic Web services. We focus specifically on services based on server-side Java technology, as one example technology for producing dynamic Web content. Our approach is based on a toolkit, called Ivory, that leverages Java binary rewriting tools to instrument the compiled bytecodes for the service implementation, adding new instructions to capture and propagate object updates.

While the general problem of replica consistency is extremely difficult, our solution is promising for state represented as Java object structures with no concurrent write sharing of any individual object among the replicas. Our system provides no means to order updates from multiple replicas. This is adequate for the large class of dynamic Web services in which content updates disseminate from the primary server and user updates are limited to state associated with a particular user (e.g., user profiles, shopping carts, mail boxes, account information) or with a group of clients bound to a single replica (e.g., a business using an outsourced application).

We illustrate our approach as the core of a service caching framework for dynamic Web services. This framework extends the Web proxy caching infrastructure to allow on-demand partial replication of services in Web application proxies. Service caching leverages the transportability of Java code and Java objects, and the partitioning of Java-based services into discrete code units (servlets). Service caching is superficially similar to Java applets in that application services are delivered by server-supplied code without the need for clients to install, administer, or maintain the application software. However, service caching differs in several fundamental respects.

    • There is little or no burden on the programmer to use service caching; bytecode rewriting transparently adapts the service code to run outside of the server.

    • The transformed service code can make use of generic processing power and storage in the network. In particular, it is easy to expand capacity by adding more application proxies.

    • Service caching with Ivory provides for regular, incremental, transparent, and consistent propagation of updates in both directions between each proxy and the primary server.

    • Web application proxies can exploit sharing of content within a client population, in a manner analogous to static Web proxy caching.

We demonstrate service caching with a dynamic portal application intended to be representative of commercial Web services providing personalized views of news, stock quotes, sports scores, weather, etc. In this paper, service caching and the portal example serve to illustrate the Ivory approach to automatic state management. In particular, this paper does not address important security, access control, and resource management issues for service migration and on-demand replication. For example, it would not be useful for a large search engine to download its entire index to a proxy. We view service caching as a specific instance of a general vision of migratable service code, which is addressed more comprehensively by recent research on WebOS [VAD+98] and Active Names [VDAA99]. Similarly, the Ivory approach to automatic state management is applicable within these more general frameworks.

This paper is organized as follows. Section 2 presents the Ivory approach to state management and its prototype implementation. Section 3 illustrates the use of Ivory in the service caching architecture. Section 4 sets our approach in context with related work. Section 5 presents experimental results showing the overhead of automatic state management and the performance benefits of service caching. Section 6 concludes.

2 Ivory Architecture

This section describes the structure of the Ivory system, the techniques and mechanisms used to manage replicated service state, and the issues raised by our approach. While the implementation described here is specific to Java, the underlying principles extend to any similar language. A Java-based service may instantiate its data from some external storage, e.g., files or a database, but Ivory
deals only with the data's representation as Java data structures.

Ivory consists of a state manager, a serializer, and a transformer program built using JOIE, a programmable bytecode rewriting toolkit [CCK98]. The transformer instruments the compiled service code with write barriers that capture object updates and notify the state manager, which propagates the modified object values. The serializer is an extended object serialization package used by the state manager to propagate updates incrementally. The following subsections explore the Ivory architecture in more detail.

2.1 Replicas and Views

To instantiate a new replica, a server serializes some set of objects into a TCP stream. The receiver unpacks the serialized objects to create a complete or partial replica of the service. The policies for selecting replica sites or objects to replicate are left unspecified; the service caching framework outlined in Section 3 illustrates one useful set of policies.

To simplify the exposition, we describe how the system manages state shared among a single pair of replicas. Generalization to multiple replicas is straightforward.

The set of objects shared by a pair of replicas is called the pair's view. Each replica's state manager maintains a table of references to objects shared with its peer, called a view table. The serializer uses the view table to propagate object updates incrementally, as explained in Section 2.4. If either replica loses state in a failure, it may be reestablished from the survivor's view table.

Connected replicas must agree on the set of objects contained within the shared view. Views are not static; either peer may create new objects or add objects to the view. In our current prototype each view contains a closed set of objects; any object that becomes reachable from other objects in the view is automatically added to the view and propagated to the peer, as described in Section 2.4.

2.2 View Consistency

Replicated data may include arbitrarily linked data structures with strong internal consistency requirements. Therefore, Ivory must propagate updates in such a way that each replica observes only internally consistent states. Our solution is to borrow the notion of an atomic commit from ACID transaction systems. The state manager records new object values only at well-defined commit points occurring at the completion of sequences of updates that transition the modified structures from one consistent state to another. When a service thread reaches a commit point, all objects modified since its last commit point are committed to the state manager. The state managers observe the following constraints, which are both necessary and sufficient to preserve consistent views: (1) the state manager never propagates an object that is dirty but uncommitted, (2) if the state manager sends to a peer any object modified in a given commit, then it sends all objects in the peer's view that were modified in that commit or in a preceding commit, and (3) the receiving replica applies updates from a given commit as an atomic unit, and never interleaves them with processing for a request that might access the modified objects.

Our approach leads to the following consistency guarantee for each replica's view. For each peer P, consider the set of locally replicated objects whose latest update originated at P. The objects in this set have the same state that they had at some commit point recently occurring on P. Thus this state is internally consistent, but it is permitted to be stale, i.e., subsequent commits may have established a more recent state not yet reflected in the replica. In general, a client bound to a single replica of a dynamic Web service cannot detect that its replica's data is stale, since any state presented to the client in a response could change before the client submits the next request. However, stricter session guarantees are needed if clients migrate between replicas. Also, if multiple sites may generate conflicting updates then the system must impose some safe ordering on these updates. Safely replicating this class of service requires an external concurrency control scheme [FCNyML94] or restricted data representations that can tolerate multiple update orderings [PST+97].

Rather than attempt to determine appropriate commit points automatically, we currently require the programmer to direct the placement of commits. Ivory defines a null interface called Consistent; methods of interfaces marked by the programmer as extending the Consistent interface are assumed to transform the data from one consistent state to another. The bytecode transformer instruments these methods to commit updates before returning.

Our Ivory prototype imposes coarse-grained concurrency control on the service to ensure that updates and requests do not interleave within any replica. Our approach establishes a global
reader/writer lock in the state manager. Consistent methods, which update replicated state, are instrumented to include a prologue that acquires the global lock in write mode. Request handlers acquire the lock in read mode, and may promote to write mode if the handler encounters a consistent method. The serializer holds the lock in read mode when it is propagating updates, and in write mode when it is applying received updates. The locking rules prevent any thread from observing a possibly inconsistent state while it is processing a request.

2.3 Propagating Updates

The state manager is responsible for propagating the new object values to replicas as needed, as determined by the lists of dirty objects passed to its commit method. Each site must track object membership in the views of connected replicas, so that it may propagate any modified object to all views that contain it. To meet this need, the state manager maintains a copyset table mapping each object to a list of views that contain the object. The copyset table is updated when the state manager adds or removes objects from a view.

Ivory accommodates both push and pull policies for propagating updates to peer replicas. To implement a push, the state manager simply serializes all committed objects into the streams bound to each containing view at commit time. The pull state manager retains records of dirty objects until the peer asks for them; it maintains with each view a list of objects that are dirty in the view, i.e., objects that have not been propagated since they were last modified. Each object that is dirty in the view is marked with a dirty bit in the object's view table record. On commit, modified objects that are not already dirty in the view are added to the view's dirty list. When the replica requests updates, the state manager serializes the view's dirty objects into the stream, clears the view's dirty bits for the objects in the list, and empties the list. The pull (lazy) model may impose a round-trip latency on some requests, but it consumes less network bandwidth, it matches the request/response structure of HTTP, and it supports the browser "refresh" button.

The state managers require state proportional to the sum of the number of objects in all the views. There are three hash table entries for each (object, view) pair, one in the copyset table and two in the view table (OID-to-reference and reference-to-OID). Each entry contains at most an object reference and a dirty bit. State management overhead for a dirty object is proportional to the number of views containing the object.

2.4 The Serializer

The state manager uses a new serialization package to propagate modified object values. The Ivory serializer is similar to Java Serialization in that it packs and unpacks object values into and out of a stream, in this case a network connection between a replica pair. However, the Ivory serializer differs in three key respects:

    • It is incremental: the serializer transmits only the objects that have been modified or added to the view, rather than reserializing the entire data structure. This minimizes the overhead to propagate updates, and it allows different sites to concurrently update different portions of a connected data structure.

    • It is iterative rather than recursive, so it is not vulnerable to stack overflows for deeply nested data structures.

    • It uses efficient class-specific serialization methods generated and installed in each class by the bytecode transformer, rather than using reflection to discover each object's internal structure at runtime.

The view tables are the basis for the Ivory serialization scheme. Each view table maintains a mapping between object references and integer object IDs (OIDs) that are unique within the view. The state manager initializes the mappings when it creates the view, and updates them by assigning new OIDs as objects are added to the view. View tables allow the peers on either side of a connection to agree on a common space of OIDs. OIDs enable incremental and iterative serialization because they provide a means to represent objects by reference in the serialized stream.

One role of the serializer is to automatically add objects to the view when those objects become reachable from other objects in the view. This can occur only when an object A already in the view is modified to include a reference to an object B that is outside of the view, causing B and its descendants to become reachable from the view. The serializer discovers and handles all such cases while serializing the updated A, when it queries the view table for an OID to represent the referenced object B. The sending serializer allocates an OID for B, adds the mapping to the view table, and transmits
    ALOAD_0                            // this
    GETFIELD Ivory_dirty
    IFNE @END                          // if true goto end
    ALOAD_0                            // this
    ICONST_1                           // true
    PUTFIELD Ivory_dirty
    ALOAD_0                            // this
    INVOKESTATIC Hamper::setDirty

Table 1: Bytecode transformer splice for write barrier on putfield operations.
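
Read at the source level, the splice in Table 1 amounts to a guarded update of the dirty bit followed by a call into the hamper. The Java below is only a sketch of that behavior under stated assumptions: the Tracked interface, the Account class, the field name ivoryDirty, and the commit() and size() helpers on Hamper are hypothetical names introduced for illustration. Only the dirty bit and Hamper::setDirty appear in Table 1, and placing the barrier before the field store is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical interface for objects that carry a transformer-added dirty bit. */
interface Tracked {
    void clearDirty();
}

/** Sketch of the globally shared dirty list ("hamper"); only setDirty is named in Table 1. */
final class Hamper {
    private static final List<Tracked> dirty = new ArrayList<>();

    private Hamper() {}

    /** Called by the write barrier when an object goes from clean to dirty. */
    static synchronized void setDirty(Tracked obj) {
        dirty.add(obj);
    }

    /** Assumed commit step: return the dirty objects, reset their bits, empty the hamper. */
    static synchronized List<Tracked> commit() {
        List<Tracked> committed = new ArrayList<>(dirty);
        for (Tracked obj : committed) {
            obj.clearDirty();
        }
        dirty.clear();
        return committed;
    }

    static synchronized int size() {
        return dirty.size();
    }
}

/** A service class as it might behave after the write barrier transformer has run. */
class Account implements Tracked {
    boolean ivoryDirty;      // dirty bit added by the transformer (field name assumed)
    private int balance;

    void setBalance(int newBalance) {
        // Source-level equivalent of the Table 1 splice:
        // test the dirty bit, skip if already set, else set it and enqueue this object.
        if (!ivoryDirty) {
            ivoryDirty = true;
            Hamper.setDirty(this);
        }
        balance = newBalance;  // the original field store (putfield)
    }

    int getBalance() {
        return balance;
    }

    @Override
    public void clearDirty() {
        ivoryDirty = false;
    }
}

class WriteBarrierDemo {
    public static void main(String[] args) {
        Account a = new Account();
        a.setBalance(10);
        a.setBalance(20);  // already dirty, so not enqueued a second time
        System.out.println("dirty objects before commit: " + Hamper.size());
        List<Tracked> committed = Hamper.commit();
        System.out.println("objects committed: " + committed.size());
        System.out.println("dirty objects after commit: " + Hamper.size());
    }
}
```

Running WriteBarrierDemo reports one dirty object before the commit, one object committed, and none afterwards. The guard is what keeps repeated writes to the same object cheap: only the clean-to-dirty transition touches the shared list.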

B together with its new OID. The receiver instantiates a new copy of B and installs the new mapping.

2.5 The Role of Bytecode Rewriting

Ivory uses bytecode rewriting to make state management automatic. The replication system is a general-purpose Java package; a service implementor could design the service to use the replication package by making explicit calls to the state manager. Our goal is to automate this process, providing a means to automatically convert unscalable service implementations into scalable ones. Bytecode transformation is a promising approach to injecting new system functionality into Java programs without modifying the application source code or the Java system itself (the language, compiler, and JVM).

The Ivory bytecode rewriter is built using JOIE [CCK98], a bytecode rewriting toolkit. JOIE bytecode rewriters are transformer classes written in Java using JOIE primitives for deconstructing and instrumenting compiled Java classes. JOIE transformers can be used to transform stored classfiles after compilation, or they may be applied on-the-fly by a transformer-enabled JOIE ClassLoader as it loads service classes into the JVM.

Ivory includes three JOIE transformers that illustrate simple tasks easily achieved with bytecode transformation. First, a serialization transformer injects code used by the incremental serializer for efficient, transparent serialization, as described in Section 2.4. Second, a consistency transformer identifies methods that implement the interface Consistent (Section 2.2), and injects a prologue and epilogue to synchronize the consistent method and to commit dirty objects on completion. Finally, a write barrier transformer installs write barriers to track and record object updates.

The write barrier transformer modifies each class to add a dirty bit to each instantiated object. It injects new instructions (a splice) into the bytecode to set an object's dirty bit each time one of its fields is modified with a putfield instruction. To capture all needed updates, the transformer must rewrite any class that updates any replicated objects, including classes that set a public field rather than calling a method. The spliced code sequence, shown in Table 1, collects modified objects on a dirty list (a globally shared hamper) as the dirty bit transitions from clean to dirty. The epilogue for Consistent methods commits the dirty list to the state manager, which records the dirty objects, resets their dirty bits, and empties the hamper.

For automatic state management to be practical, it is critical to minimize state management overheads in the transformed bytecodes. The simplest write barrier transformer instruments every putfield instruction, which is wasteful when multiple writes occur within the same code path. The JOIE toolkit provides primitives to partition the bytecode into basic blocks and to perform simple control flow analysis. We have used these primitives to implement two refinements of the simple write barrier transformer. The BasicBlock transformer updates the dirty bit at most once per basic block. The Dominator transformer performs a dominating path analysis to reduce dirty bit updates for basic blocks accessed through the same code path. The more sophisticated transformers trade off transformation speed for more efficient runtime performance. Section 5 presents performance results from these transformers.

3 Service Caching

This section outlines the structure of the service caching framework used in our experiments. The framework is designed to illustrate use of Ivory in conjunction with a simple scheme for replica creation and request routing. It allows on-demand
caching of service code and data in application proxy servers, which extend static proxy caching to support local execution of cached Java classes and data.

Our prototype enables transparent service caching for services implemented using the JavaServer Pages (JSP) standard [PLC99], a recent extension of the earlier Java Servlet standard [Dav99]. The JSP standard supports construction of dynamic Web services from static templates (e.g., HTML) containing embedded scripting code invoked at page-fetch time to fill in dynamic content. The JSP script code is written in Java, although future releases may support other scripting languages. JSP scripts access the service data using a registry that allows them to retrieve some subset of the underlying service objects by symbolic name. JSPs are compiled to generate Java servlets, which are Java classes implementing standard methods for handling HTTP requests.

The service caching framework uses JSP servlets and the retrieved objects as the granularity of caching. The proxies cache servlets by the URL name prefix (not including arguments). The proxy can then service any URL operation with a matching prefix by executing the cached JSP servlet locally. As the servlet retrieves objects by name, the Ivory-enabled proxy server contacts the home site to retrieve the objects and add them to its view. The proxy maintains a local cache over the name registry, so that it can serve repeat references by symbolic name from its object cache. Of course, the retrieved objects may reference other objects directly; the serializer in the home server automatically adds these descendant objects to the view as described in Section 2.4. Thus the local object cache may contain service objects that are not named symbolically.

This replication scheme results in a simple hierarchy in which each proxy has a connection for exchanging updates with a parent, e.g., the home site. Updates may flow in either direction. For example, in the portal application described in Section 5, the server notifies the proxies of updates on the content provider (e.g., news and stock quotes). Our prototype application proxy pulls updates from the server on each client request. A weaker but more efficient implementation could limit the pull rate according to some policy, e.g., pull at most once a minute unless the client hits the "refresh" button. In the other direction, the proxy may notify its parent of updates to user-specific information (e.g., a user profile). For security reasons a server may refuse to accept updates to service state from a proxy, but our prototype has no access control.

Service caching in application proxies can improve the scalability, availability, and response time of a dynamic Web service. The benefits accrue from several factors:

    • It offloads the processing cost of generating dynamic content. This allows the home server to support a larger community of clients before it saturates.

    • It exposes to the application proxies the "assembly" of dynamic documents from static and dynamic components, allowing the proxy to cache the static content. Requests that hit in the cache retrieve at most the objects needed to generate the dynamic components. In many but not all services this will reduce network bandwidth consumption to satisfy the request.

    • For requests that hit in the cache, a pull to refresh the cache state will transfer only the objects that were modified. This can substantially reduce overhead and network bandwidth demand for services with sufficiently low update rates. On lower bandwidth links the smaller transfer size can substantially reduce response time.

    • Many services can tolerate delays in the dissemination of content updates from the server. This can reduce the propagation frequency and the overhead of round-trips to query for updates, reducing latency and bandwidth demands. It also improves service availability, since a client may continue interacting with a replica if the home server fails or is unreachable.

    • Like static proxy caches, application proxies can deliver caching benefits from shared data brought into the cache by multiple users accessing the same Web sites.

4 Related Work

Replication for improving server performance is not a new technique, having analogs in file servers, databases, and web servers. Many of these systems propagate updates to replicated state incrementally. For example, Delis and Roussopoulos explore a log-based approach for updating client caches in a relational database system [DR92]. Updates are centralized at the server; before a client accesses a data item in its cache, it first contacts

[Figure 1: Request throughput for the portal application on a single server. Horizontal axis: demand (requests/second); vertical axis: throughput (requests/second); series: 1-view, 1-proxy.]

the server to retrieve log records generated since the cached copy of the
item was last updated. Relative to this large body of work, our contribution
lies in: (1) system support for incremental updates to replicated Java data
structures, (2) our use of bytecode rewriting to make state management
automatic for Java-based services, and (3) our use of automatic state
management to extend the Active Cache idea [CZB98] to handle Java-based
dynamic
   Our pull-based incremental update propagation is also similar to delta
encoding of updates to web pages [MDFK97]. HPP [DHR97] preprocesses web
pages to identify static versus dynamic portions. HPP could be used for our
simulated web portal application to cache portions of the web page. Ivory
extends these ideas to handle the Java state used to generate dynamic
content. In this sense, these systems are complementary to Ivory and can be
used together for efficient delivery of web content.
   Software write detection has been used previously for distributed shared
memory, fault isolation, garbage collection, and dynamic data race detection
[ZSB94, HM93, SG97]. It has been used for Java in the PSE persistent storage
engine from Object Design Inc. [Obj98], which includes a utility that
transforms Java classes to be storable in their persistence storage
infrastructure, but does not detect if the instances have become dirty.
PJama [ADJ+96], earlier called PJava, uses a modified JVM to supply
orthogonal persistence to user classes. Our transformation-based approach
provides the key elements of this functionality on a standard JVM.
   The Ivory prototype is applicable to pure Java-based services. However,
techniques similar to our own, for example, leveraging related work in the
database community on materialized views [GM95], are more generally
applicable to other kinds of services.

5 Experience
This section presents experimental performance results to illustrate the
benefits and quantify the costs of automatic state management using Ivory.
We experimented with a simple portal application to demonstrate the
potential of the service caching framework. The portal site is implemented
as a JSP servlet that generates "personalized" HTML viewing a selection of
content objects that are randomly generated and regularly updated. The
content mimics common information such as news categories and headlines,
stock quotes, weather, and sports scores. The content is stored in a variety
of Java data structures and referenced by randomly generated user
customization profiles.
   Personalized Web portals are illustrative of a growing class of Web
services that generate documents dynamically based on user preferences and
changing underlying content. We show how to improve the scalability of these
services by using

Figure 2: Request throughput for the portal application using Ivory-enabled
Web application proxies. (Plot of throughput in requests/second against
demand in requests/second; surviving series label: 3-proxy.)

Ivory to cache subsets of the underlying content objects at application
proxies; this allows the proxies to generate the documents locally,
contacting the server only to receive updates. In this way, proxies can
service requests for dynamically generated data, acting as logical
extensions of the service.
   The simulated web portal consists of a server (the portal site) and a
varying number of proxies and clients. The server and proxies are based on
TinyServer, a simple Java-based servlet engine developed for a distributed
systems class. A collection of servlets and associated classes implements
the application proxy cache engine, the server interface to the state
manager, the naming registries for the JSP service caching framework, and
the portal application. URL requests to the portal servlet specify a user by
name, demand loading the user profile and any referenced content objects if
they are not already resident in the cache.
   Figures 1 and 2 show the overhead costs and scaling benefits of Web
application proxies using Ivory for the portal application with
representative parameter settings. In these experiments a community of
client processes generates a stream of page view requests, with each process
using a different user profile. These are closed loop experiments in which
each client process issues a request, awaits the response, then sleeps for
five seconds before issuing the next request. The figures give delivered
throughput as a function of demand, the request arrival rate for an ideal
server that responds to each request with zero latency. The number of user
profiles scales with demand; in these experiments 500 profiles generate a
demand of 100 requests per second. Each profile references 60 items randomly
selected from a universe of 2500 content objects. While the total data size
is less than a megabyte, we stress the system by aggressively updating the
data: 15% of the objects are updated each second. All proxies and servers
are 167 MHz Sun Ultra 140 workstations running Solaris 2.6 and JDK 1.1.5,
interconnected with the clients by a switched 100 Mb/s network.

5.1 Service Overhead
Figure 1 quantifies the overhead of our prototype by showing the saturation
points of a single server in various configurations. The top two lines show
that transforming the service code to track object updates degrades its
saturation throughput by about 6%; Section 5.3 explores this cost in more
detail. In addition to the fixed cost of tracking object updates, servers
incur an additional cost to track the objects that are dirty in the view of
each peer replica. This cost scales with the number of peers and with the
number of updates recorded for the objects in each peer view. The next two
lines in Figure 1 show that this cost is significant in the prototype:
maintaining copy sets and dirty sets for each complete replica degrades
request throughput by about 3% to 4% in this experiment, primarily due to
the aggressive update rate.
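
The per-peer bookkeeping described in Section 5.1 can be pictured with a
small sketch. This is not Ivory's state manager: the class name
PeerViewTracker, its methods, and the use of integer object ids are
assumptions made purely for illustration. It only shows why the tracking
cost grows with the number of peers and with the number of recorded updates:
every tracked write is checked against each peer's copy set.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of per-peer copy sets and dirty sets (not Ivory's API). */
final class PeerViewTracker {
    // Per peer: ids of objects the peer caches (copy set) and the subset
    // that has changed since the peer last pulled (dirty set).
    private final Map<String, Set<Integer>> copySets = new HashMap<>();
    private final Map<String, Set<Integer>> dirtySets = new HashMap<>();

    /** Record that a peer now holds a copy of the given object. */
    void recordCopy(String peer, int objectId) {
        copySets.computeIfAbsent(peer, p -> new HashSet<>()).add(objectId);
        dirtySets.computeIfAbsent(peer, p -> new HashSet<>());
    }

    /** Would be invoked by a write barrier: one membership test per peer. */
    void markDirty(int objectId) {
        for (Map.Entry<String, Set<Integer>> entry : copySets.entrySet()) {
            if (entry.getValue().contains(objectId)) {
                dirtySets.get(entry.getKey()).add(objectId);
            }
        }
    }

    /** A pull returns the peer's dirty ids (sorted) and clears its dirty set. */
    Set<Integer> pullUpdates(String peer) {
        Set<Integer> dirty = dirtySets.get(peer);
        if (dirty == null) {
            return new TreeSet<>();
        }
        Set<Integer> result = new TreeSet<>(dirty);
        dirty.clear();
        return result;
    }

    public static void main(String[] args) {
        PeerViewTracker tracker = new PeerViewTracker();
        tracker.recordCopy("proxyA", 1);
        tracker.recordCopy("proxyA", 2);
        tracker.recordCopy("proxyB", 2);
        tracker.markDirty(2); // object 2 is cached by both peers
        tracker.markDirty(3); // object 3 is cached by no peer; nothing recorded
        System.out.println(tracker.pullUpdates("proxyA")); // [2]
        System.out.println(tracker.pullUpdates("proxyB")); // [2]
        System.out.println(tracker.pullUpdates("proxyA")); // []
    }
}
```

In this sketch an update to an object that no peer caches costs one lookup
per peer and records nothing, while a frequently updated, widely cached
object is added to every peer's dirty set.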

Figure 3: Transformation speed for Ivory bytecode rewriting. (Chart of
transformation speed in KB/sec for compress, jess, db, mtrt, and jack;
series: All Writes, Basic Blocks, Dominators.)
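
As a worked check of the load model used in the portal experiments: with
closed-loop clients that sleep for a fixed think time between requests,
demand is the number of clients divided by the think time, so 500 clients
with a five-second sleep give 100 requests per second. The helper below is
only an illustration; the class and method names are hypothetical, and the
1.25-second response time in the example is an arbitrary value, not a
measurement from the paper.

```java
/** Hypothetical helper for closed-loop load arithmetic (illustration only). */
final class ClosedLoopLoad {
    /** Demand: the arrival rate if every response took zero time. */
    static double demand(int clients, double thinkSeconds) {
        return clients / thinkSeconds;
    }

    /** Request rate when each response takes responseSeconds on average. */
    static double offeredRate(int clients, double thinkSeconds, double responseSeconds) {
        return clients / (thinkSeconds + responseSeconds);
    }

    public static void main(String[] args) {
        // 500 profiles, five-second sleep: the demand stated in the text.
        System.out.println(demand(500, 5.0));            // 100.0
        // Same population with an assumed 1.25 s mean response time.
        System.out.println(offeredRate(500, 5.0, 1.25)); // 80.0
    }
}
```

The second call shows why delivered throughput falls below demand as the
server approaches saturation: each client spends part of every cycle waiting
for a response rather than sleeping.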

   The lowest line in Figure 1 shows the steady-state request throughput
through a single proxy using the pull-based state manager with a two-second
update window. This gives a pessimistic estimate of the effect of update
propagation on request throughput. Once the proxy's cache has been loaded in
the first few seconds of the experiment, the proxy satisfies each request
from the cache unless the update window expires. On the first request after
the update window expires, the proxy queries the primary server for updates
to its view, which in this experiment returns new values for an average of
30% of the objects in the cache. The proxy cannot execute any more requests
until it has applied these updates, leading to a significant drop in request
throughput. This is a pessimistic test for two reasons. First, a practical
configuration would use a larger update window. Second, the proxy in this
experiment is serving requests for a single service, and these requests
cannot be overlapped with update propagation for the service. In practice,
each proxy would serve requests for multiple services, and would overlap
update propagation with request processing for other services.

5.2 Scaling Benefits
Figure 2 illustrates the scalability benefits of Web application proxies
using Ivory. Like Figure 1, Figure 2 gives request throughput as a function
of request demand. In these experiments, each client process is bound to a
Web application proxy, with the clients evenly distributed among varying
numbers of proxies backed by a single primary server. The proxies use a
five-second update window.
   Figure 2 shows that serving the portal application from Web application
proxies allows it to scale to larger numbers of clients. Aggregate request
throughputs at saturation scale almost linearly as proxies are added. In
principle, proxies can be added and will deliver linear scalability up to
the point at which the primary server saturates in delivering updates to the
proxies.
   This experiment is conservative in that it does not reflect the cost to
fetch server data over a wide-area link, which often carries higher latency
and lower bandwidth than the link to the proxy. Of course, the performance
delivered in practice will also depend on application parameters including
the size of the generated content, the size of the objects used to generate
the content, the update rate for those objects, the processing cost to
generate the content, client bandwidths to the proxy and the server, and the
degree of sharing among multiple clients bound to the same proxy. We leave a
more complete exploration of the parameter space to future work.

5.3 Write Barrier Overhead
We ran a second set of experiments to better approximate the cost of
bytecode transformation for tracking updates. We transformed five programs
from the SpecJVM98 suite (compress, jess, db,

Figure 4: Runtime overhead for code instrumented with write barriers. (Chart
of normalized execution time for compress, jess, db, mtrt, and jack; series:
Untransformed, All Writes, Basic Blocks, Dominators.)

mtrt, and jack) using the three versions of the write barrier transformer
outlined in Section 2.5. We measured both the speed of the transformation
and the slowdown of the transformed bytecode. Figure 3 and Figure 4 show
results from a 300 MHz UltraSPARC-IIi processor running Solaris 5.7 and JDK
1.2. The transformers process and rewrite bytecode at between 280 KB/s and
500 KB/s, with the more sophisticated transformers running slightly slower
but producing more efficient transformed code. The slowdown of the
transformed code is under 10% using the Dominator transformer for three of
the five benchmarks, with only compress showing a significant slowdown of
22%. The benefit of control flow analysis is typically modest but is
significant for some applications. For mtrt, Dominator reduces the slowdown
from 12% with AllWrites to 6%.

6 Conclusion
Caching and replication are key techniques for scaling Web services.
Unfortunately, state replication introduces a difficult state management
problem, since service state must be kept consistent across replicas. This
is a challenging problem for services with dynamically generated content,
which is increasingly prominent.
   This paper describes the design and implementation of Ivory, a system
that automates state management for dynamic services based on server-side
Java technology. Ivory uses bytecode rewriting to instrument compiled
service code with hooks into a state management package, taking a step
toward automatically converting centralized service implementations into
scalable, replication-aware, wide-area applications. We illustrate use of
Ivory in a service caching framework for dynamic services based on
JavaServer Pages (JSPs). The JSP standard is well-suited to service caching
because it imposes a partitioning on the service code and data, and its
naming infrastructure is easily extended to demand-fault service objects
referenced by symbolic name. Our approach takes a significant step toward
generalizing Web caching and replication infrastructures to handle dynamic
content. This can significantly improve scalability, response times, and
consumed wide-area bandwidth for dynamic Web services.

References
[ADJ+96]    M.P. Atkinson, L. Daynes, M.J. Jordan, T. Printezis, and
            S. Spence. An Orthogonally Persistent Java. ACM SIGMOD Record,
            December 1996.

[CCK98]     Geoff A. Cohen, Jeffrey S. Chase, and David L. Kaminsky.
            Automatic Program Transformation with JOIE. In USENIX 1998
            Annual Technical Conference, pages 167-178, June 1998.

[CZB98]     Pei Cao, Jin Zhang, and Kevin Beach. Active Cache: Caching
            Dynamic Contents on the Web. In Proceedings of Middleware, 1998.

[Dav99]     James Duncan Davidson. Java Servlet API: Version 2.2. Technical
            report, Sun Microsystems, June 1999.

[DHR97]     Fred Douglis, Antonio Haro, and Michael Rabinovich. HPP: HTML
            Macro-Preprocessing to Support Dynamic Document Caching. In
            Proceedings of the 1997 Usenix Symposium on Internet
            Technologies and Systems, Monterey, California, December 1997.

[DR92]      A. Delis and N. Roussopoulos. Performance and Scalability of
            Client-Server Database Architectures. In Proceedings of the 18th
            International Conference on Very Large Databases, pages 610-623,
            August 1992.

[FCNyML94]  Michael J. Feeley, Jeffrey S. Chase, Vivek R. Narasayya, and
            Henry M. Levy. Integrating coherency and recoverability in
            distributed systems. In Proceedings of the First Symposium on
            Operating System Design and Implementation, pages 215-227,
            November 1994.

[GM95]      Ashish Gupta and Inderpal Singh Mumick. Maintenance of
            Materialized Views: Problems, Techniques, and Applications. In
            Data Engineering Bulletin, June 1995.

[HM93]      Antony L. Hosking and J. Eliot B. Moss. Protection traps and
            alternatives for memory management of an object-oriented
            language. In SOSP93, pages 106-119, December 1993.

[MDFK97]    Jeffrey Mogul, Fred Douglis, Anja Feldmann, and Balachander
            Krishnamurthy. Potential Benefits of Delta Encoding and Data
            Compression for HTTP. In Proceedings of ACM SIGCOMM, pages
            181-194, August 1997.

[Obj98]     Object Design Inc. ObjectStore PSE Resource Center, 1998.
            http: content products PSEHome.html.

[PLC99]     Eduardo Pelegri-Llopart and Larry Cable. JavaServer Pages
            Specification: Version 1.1. Technical report, Sun Microsystems,
            August 1999.

[PST+97]    Karin Petersen, Mike J. Spreitzer, Douglas B. Terry, Marvin M.
            Theimer, and Alan J. Demers. Flexible update propagation for
            weakly consistent replication. In Proceedings of the Sixteenth
            ACM Symposium on Operating System Principles (SOSP), pages
            288-299, October 1997.

[RRRA99]    M. Rabinovich, I. Rabinovich, R. Rajaraman, and A. Aggarwal. A
            Dynamic Object Replication and Migration Protocol for an
            Internet Hosting Service. In IEEE Int. Conf. on Distributed
            Computing Systems, May 1999.

[SG97]      Daniel J. Scales and Kourosh Gharachorloo. Towards Transparent
            and Efficient Software Distributed Shared Memory. In Proceedings
            of the Sixteenth ACM Symposium on Operating Systems Principles
            (SOSP), pages 157-169, October 1997.

[VAD+98]    Amin Vahdat, Thomas Anderson, Michael Dahlin, Eshwar Belani,
            David Culler, Paul Eastham, and Chad Yoshikawa. WebOS: Operating
            System Services for Wide-Area Applications. In Proceedings of
            the Seventh IEEE Symposium on High Performance Distributed
            Systems, Chicago, Illinois, July 1998.

[VDAA99]    Amin Vahdat, Michael Dahlin, Thomas Anderson, and Amit Aggarwal.
            Active Names: Flexible Location and Transport of Wide-Area
            Resources. In Proceedings of the USENIX Symposium on Internet
            Technologies and Systems (USITS), October 1999.

[ZSB94]     Matthew J. Zekauskas, Wayne A. Sawdon, and Brian N. Bershad.
            Software Write Detection for Distributed Shared Memory. In
            Proceedings of the First USENIX Symposium on Operating Systems
            Design and Implementation (OSDI), pages 87-100, November 1994.
