									                        Grid Systems enter the Web:
                 The Open Grid Service Architecture
                                      Bachelors Thesis (2002/2003)
                                   Reinier Timmer (rjtimmer@cs.vu.nl)

                                      Vrije Universiteit, Amsterdam

                                        Supervisor: Thilo Kielmann




Abstract
   Grid computing systems are an emerging class of distributed systems. They are already widely
used in scientific research, where they enable resource sharing of all kinds rather than only the
distribution of number-crunching tasks. To make the move to the commercial world, however, a
middleware system is required that integrates seamlessly with existing technologies, allowing
organisations not only to design their own grid system, but also to cooperate with other grids.
   A new architecture that is designed with commercial applications in mind is the Open Grid
Service Architecture (OGSA). It builds on existing technologies from both the grid world (the Globus
Toolkit) and the world of Web Services (WSDL, SOAP). Both the Globus Toolkit and the Web
Services framework have proven themselves, and the designers of OGSA believe the cooperation
between these technologies can lead to a breakthrough in the field of grid systems. So far, there have
not been any official releases, and as a result, grids have not yet proved to be a worthy addition to
commercial settings.

Table of Contents

 1 Introduction
    1.1 Grid Systems
    1.2 Issues in designing grid systems
 2 OGSA: an Open Grid Services Architecture
    2.1 Basic OGSA characteristics
    2.2 The Globus Toolkit
    2.3 Web Services
       2.3.1 Exchanging information: SOAP
       2.3.2 Describing services: WSDL
       2.3.3 Service Discovery: UDDI & WSIL
    2.4 The need for OGSA
 3 OGSA functionality
 4 Using OGSA for writing grid applications: The Globus Toolkit version 3.0
    4.1 Programming model
    4.2 Programming a server
    4.3 Programming a client
    4.4 Larger grid applications using GT3
 5 Conclusions
 6 References
    Literature
    Software References

1 Introduction
    The time when a single computer system could manage all the processes in an organisation lies
far behind us. Distributed systems are constructed from single systems, tied together by some kind
of hardware and software connection to distribute the work over various computers. For
organisations, this approach has many advantages. For example, a database can be managed on a
single system, making the same information available to all connected computers. This idea can also
be applied to the file system, with a single system (or a cluster of systems) managing the user
files. This has the advantage that it does not matter which system a user works on: the user always
has access to his or her own files. Another advantage is that upgrades or repairs can be performed
without shutting down the entire distributed system, so that the largest part of it stays on-line
most of the time. A last example could be an order processing system that is shared by several
clerks, each using a different computer to access the central system. For the users it is important
that they are confronted with this complex interconnection mechanism as little as possible. In other
words: the distribution must be as transparent as is reasonably possible. These are essential
properties of distributed systems for easy and reliable use.
    Earlier distributed systems (and other forms of parallel computers) were often very tightly-coupled.
They were often specially designed for an organisation, which is not only very expensive, but also
reduces the possibilities of performing system upgrades. One of the most challenging tasks is to devise
a distributed system’s middleware that would allow all kinds of systems to participate. These systems
would possibly provide us with relatively easy methods to build larger distributed systems, which
would be easily upgradeable at reasonable costs.
    In a distributed system, all single computers need to have some ways to communicate with other
systems and to access distributed resources. As all systems need to share a single system view to
execute distributed applications, an extra software layer is required. This middleware is placed on top
of the local operating system (OS). Instead of making its system calls directly to the
operating system, the application passes them to the middleware. This software abstraction then
performs all the necessary communication and hides as much of it as possible from the users.
Middleware is essential for (transparent) distributed systems. For example, have a look at Figure 1.
This shows how the
middleware bridges the gap between the applications and the local operating system: the distributed
application only communicates with local and remote services through the middleware.




                  Figure 1: Middleware in a distributed system, taken from [11]
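
As a minimal sketch of this layering, the Java fragment below lets an application call services by name through a middleware object, without knowing whether a service is local or remote. All names (MiddlewareSketch, Middleware, Service, call) are hypothetical and chosen for illustration only, and the remote service is simulated in-process where a real middleware would send a network message.

```java
import java.util.HashMap;
import java.util.Map;

public class MiddlewareSketch {

    /** A service the application wants to use, wherever it happens to live. */
    interface Service {
        String invoke(String request);
    }

    /** The middleware layer: the application only ever talks to this class. */
    static final class Middleware {
        private final Map<String, Service> local = new HashMap<>();
        private final Map<String, Service> remote = new HashMap<>();

        void registerLocal(String name, Service service) { local.put(name, service); }

        // Assumption: a remote service is simulated by an in-process object.
        void registerRemote(String name, Service service) { remote.put(name, service); }

        /** Looks the service up by name; local services are preferred over remote ones. */
        String call(String name, String request) {
            Service service = local.get(name);
            if (service == null) {
                service = remote.get(name);
            }
            if (service == null) {
                throw new IllegalArgumentException("unknown service: " + name);
            }
            return service.invoke(request);
        }
    }

    public static void main(String[] args) {
        Middleware middleware = new Middleware();
        middleware.registerLocal("clock", request -> "local clock handled " + request);
        middleware.registerRemote("database", request -> "remote database handled " + request);

        // The application code is identical for both calls.
        System.out.println(middleware.call("clock", "now"));
        System.out.println(middleware.call("database", "lookup 42"));
    }
}
```

The application never sees where a service runs, which is the transparency property described above.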


   1.1 Grid Systems
   The focus of this article will be on grids, which are a specific class of distributed systems. The
grid architecture is aimed at sharing all kinds of (specialized) resources across a distributed
system, possibly over widespread locations. One of the things that make grids interesting is this
possibility of sharing multiple resource types. The shared resources can vary from relatively simple
types like storage space, databases or even entire software programs to highly specialized (and often
expensive) scientific instruments, such as devices that perform astronomical measurements. This
sharing both increases the availability of high-end instruments and keeps costs at a reasonable
level. For example, an organization that requires only a fraction of the time and functionality of
such a device does not need to purchase the entire instrument; it can use the device (if it is
part of the grid) as if it were local to its own system. In theory, it would not even matter how far
away the device is located. Similarly, organisations that are located all across the world need to
invest only once in such expensive devices. The notion of a grid first appeared in the scientific
world, where instruments and other high-end resources were to be shared. In such a system,
knowledge about certain research could also be pooled for global access. Science often requires this
high data availability and a lot of computing power for performing simulations of all kinds.
    To function reasonably well, there are some desirable properties these grids must support. In short,
grids are designed to allow “dependable, consistent and pervasive access to (high-end) resources”. It
must be dependable in the sense that it can provide guarantees about the performance of the particular
parts of the system such as response time. Also, the system must guarantee that the work that is done
is correct and conforms to the user’s demands. The need for a consistent system means the access to
different resource types is done using a standard (uniform) interface. It would be impossible to create
an easy-to-use system if resource access differed for each resource class. It would also make the
system hard to upgrade, as a new access mechanism would have to be developed for each new type of
resource. Pervasiveness would allow all kinds of systems to “plug in” from anywhere.
    In many respects grid systems are similar to the power grid: it offers a simple access mechanism
for all kinds of machines, it is always there (except when a power failure occurs) and users neither
know nor need to know exactly where the power is produced. If a single power plant needs maintenance
or is offline for some reason, others can take over. Last but not least, it has an excellent
price/performance ratio. Power plants are very expensive, but when millions of people use the
produced power, each of them pays only a small share of the costs.
    If such an infrastructure existed, it would not only provide us with a lot of raw
computing power, it could also offer a whole new range of applications, based on the coupling of
people, computers, instruments etc. Think for instance of collaborative engineering, in which two
experts in different professions jointly design a new engine (for example) by both running the same
instance of a software design package, in which both parties in turn can add something to the design.
They do not need to be present at the same location; one of them can be on the other side of the
world. They can both share the same visual output of the program and modify the design at the same
time. The display can be a simple monitor, but it can also be some sort of 3D environment display.
This is only one of the many possible applications that sharing all kinds of resources can allow.
Grid developers expect the grid to be a success story similar to the Internet, which is widely
accepted and adopted and has radically changed the way we share and collect information. In the same
way, grids are expected to change the way we execute applications.

   1.2 Issues in designing grid systems
   To get the most benefits of a grid system, a few issues need to be taken care of in the grid
architecture:
   1. The allocation and the releasing of remote resources must be manageable in a flexible way.
       This is a requirement if grids are to support all kinds of applications that dynamically allocate
       and free resources whenever they are required. The owner of the resources must have the
       opportunity to define exactly what (part of the) resources can be shared, who (a group or an
       individual) has access and under what conditions these resources are shared. A set of
       individuals and/or institutions defined by such sharing rules is also called a Virtual
       Organisation (VO) [4].
   2. Grids must have a high performance, or at least provide some guarantees about the degree of
       performance. Only then can systems decide what actions to place at what locations, or just do it
       themselves if that is the fastest solution.
   3. Another important issue is heterogeneity: systems of all kinds must be able to participate in the
       grid, even if their hardware architecture is different (or even incompatible) with the
       architecture of other systems in the grid. This is an important element for creating scalable
       systems that are relatively easy to upgrade.
   4. Issues concerning security, policy and some sort of payment for resources have to be dealt
      with. Also, grids must be scalable, easy to configure and be relatively fault tolerant.
   5. Grids do not need (or want to use) global control: systems just execute applications whenever
      needed, so there is no single application running at a time, initiated by a single master (or group
      of controlling systems). Most systems are autonomous in this sense.
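
The sharing rules from the first issue can be pictured as a small data structure. The sketch below is only an illustration under stated assumptions: the class SharingRule, its fields and the hour-of-day condition are hypothetical and do not come from OGSA or the Globus Toolkit. It records what is shared, who has access, and under what condition.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.IntPredicate;

public class SharingRuleSketch {

    /** One sharing rule of a resource owner: what is shared, with whom, and when. */
    static final class SharingRule {
        private final String resource;          // what is shared
        private final Set<String> members;      // who has access
        private final IntPredicate allowedHour; // condition on the hour of day (0-23)

        SharingRule(String resource, Set<String> members, IntPredicate allowedHour) {
            this.resource = resource;
            this.members = new HashSet<>(members);
            this.allowedHour = allowedHour;
        }

        /** Returns true only if the resource, the user and the condition all match. */
        boolean permits(String user, String requestedResource, int hourOfDay) {
            return resource.equals(requestedResource)
                    && members.contains(user)
                    && allowedHour.test(hourOfDay);
        }
    }

    public static void main(String[] args) {
        // Hypothetical rule: alice and bob may use the telescope at night only.
        SharingRule rule = new SharingRule(
                "telescope",
                new HashSet<>(Arrays.asList("alice", "bob")),
                hour -> hour >= 20 || hour < 6);

        System.out.println(rule.permits("alice", "telescope", 22)); // true
        System.out.println(rule.permits("carol", "telescope", 22)); // false: not a member
        System.out.println(rule.permits("alice", "telescope", 12)); // false: outside hours
    }
}
```

In these terms, the set of users named by such rules corresponds to the Virtual Organisation mentioned in the first issue.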

    It is quite obvious that connecting all kinds of different resources and presenting components in a
uniform way requires a lot of work. There must be some kind of grid middleware present to hide the
internal differences of the shared resources and offer uniform access to all shared resources. The focus
of this article will be on grid systems, and how to write applications for them, no matter how different
these systems are from a hardware perspective. Therefore we will have a look at an Open Grid Service
Architecture (OGSA), as well as the most important underlying technologies for building large-scale,
open grid systems.

2 OGSA: an Open Grid Services Architecture
    The focus of this article will primarily be on the Open Grid Service Architecture (OGSA). OGSA
is an architecture for building large-scale grid systems, and it is largely based on two
existing technologies, namely the Globus Toolkit and Web Services. In this chapter, OGSA and the
technologies it is based on are described.

   2.1 Basic OGSA characteristics
    OGSA is a new architecture for building grids that is meant to set a standard in the grid world.
It is described in [3], which presents the grid components as well as how OGSA-based systems should
work. From the Globus Toolkit, the main high-level system components are reused and refactored. Web
services are used as the standard interaction infrastructure, which can be used by all systems in
the grid. The exact definition is described in [11]. All in all, OGSA can be seen as an evolution of the
Globus Toolkit, using standard communication mechanisms to allow all kinds of systems to participate.
A comparison can be seen in Figure 2.




                      Figure 2: The Globus architecture compared to OGSA

    At the high level, the “user-level” components are offered. These are interfaces that are similar
to the interfaces presented by the Globus Toolkit (described later). They provide access to
functions like resource management (“factories”) and resource reservation management. The same
interfaces are offered by the OGSA framework; the main difference lies in the transport protocol
(for communication between computers). Older versions of Globus use special protocols,
whereas OGSA uses more general mechanisms (Web Services) for communication.
    Basically, the same interfaces and components are used in OGSA, but they are refactored to fit in
the OGSA design. Below that, web services (basically XML documents) are used to build the
communication messages. These can be transported using a standard mechanism (like HTTP or other
Internet protocols).
    One of the most important concepts of OGSA is that it is service-oriented. In OGSA, all machines
need to interact with other machines, which can be based on different hardware architectures. To allow
these machines to understand each other, information exchange must be done using some kind of
high-level abstraction (web services in this case) that is understandable for both parties. Services
are network-enabled entities that provide some capability through the exchange of messages.
    Leaving out machine information allows very different kinds of systems to interact with each other.
The difficulty here is how to represent such data at a high level. If a machine has a certain number of
CPU cycles to share or a number of megabytes for storing data, these numbers can have different
meanings on the two machines. Therefore, other aspects of the service must be addressed, such as
response time or security. Services are to be defined in terms of Quality-of-Service rather than
computing power. This service-oriented view allows virtualization: the composition of a higher-level
service from lower-level services. A client can access a single machine that also carries the
information of the lower-level machines behind it. The client does not need to be aware of the fact
that services are created by composing several smaller-scale systems. All in all, service orientation
is the key to interoperability in OGSA.
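
The composition of a higher-level service from lower-level ones can be sketched as follows. This is only an illustration under stated assumptions: the interface StorageService and the classes NodeStorage and VirtualStorage are hypothetical names, not part of OGSA, and a stored item is never split across nodes. The client works against one interface and cannot tell whether it talks to a single node or to a composite.

```java
import java.util.ArrayList;
import java.util.List;

public class VirtualStorageSketch {

    /** What a client sees: a service described by what it offers, not by its hardware. */
    interface StorageService {
        long freeMegabytes();
        boolean store(String name, long sizeMb);
    }

    /** A lower-level service backed by one machine. */
    static final class NodeStorage implements StorageService {
        private long free;

        NodeStorage(long capacityMb) { this.free = capacityMb; }

        public long freeMegabytes() { return free; }

        public boolean store(String name, long sizeMb) {
            if (sizeMb <= 0 || sizeMb > free) {
                return false;
            }
            free -= sizeMb;
            return true;
        }
    }

    /** A higher-level service composed from lower-level ones. */
    static final class VirtualStorage implements StorageService {
        private final List<StorageService> parts = new ArrayList<>();

        VirtualStorage(List<StorageService> parts) { this.parts.addAll(parts); }

        public long freeMegabytes() {
            long total = 0;
            for (StorageService part : parts) {
                total += part.freeMegabytes();
            }
            return total;
        }

        public boolean store(String name, long sizeMb) {
            for (StorageService part : parts) {
                if (part.store(name, sizeMb)) {
                    return true; // the first part with enough room takes the item
                }
            }
            return false;
        }
    }

    public static void main(String[] args) {
        List<StorageService> nodes = new ArrayList<>();
        nodes.add(new NodeStorage(100));
        nodes.add(new NodeStorage(50));
        StorageService grid = new VirtualStorage(nodes);

        System.out.println(grid.freeMegabytes());   // 150
        System.out.println(grid.store("run1", 80)); // true
        System.out.println(grid.freeMegabytes());   // 70
    }
}
```

Because VirtualStorage itself implements StorageService, composites can in turn be combined into still larger services.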

   2.2 The Globus Toolkit
    The Globus Toolkit essentially provides the middleware for grid systems; it is a central component
for managing grid infrastructures. The toolkit provides a set of services and libraries that address
issues like resource discovery, defining sharing rules, managing resources and data, communication,
fault detection and portability [4].
    The Globus Toolkit is already widely used in a number of grid projects worldwide: a testbed
called GUSTO has been running since 1998. The third version of the Globus Toolkit (GT3) will be
based on the OGSA model. The previous version (GT2) already supports some of the key concepts of
OGSA, but in a more specialized and less flexible way. OGSA is more general and has the benefits
of web-based middleware (for better interoperability). In the OGSA-based implementation (GT3) the
most important design elements of the Globus Toolkit are redesigned to fit into the OGSA framework.
To give some understanding of what OGSA uses from Globus, the high-level services relevant for
OGSA are discussed in this section.

   One of the most important toolkit components in Globus is the GRAM (Grid Resource Allocation
and Management) protocol. It includes a “gatekeeper” service: a trusted process
that provides secure and reliable creation and management of services. This is essentially a factory
service which can allocate resources for user processes.
   The Meta Directory Service (MDS-2) provides the framework for the discovery of (and access to) the
shared resources and services. It does so through soft-state registration, data modeling and a local
registry (the “GRAM reporter”), which is another important service of GRAM (the monitoring service).
The GRAM reporter monitors the internal state of the node and publishes this data in a registry.
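
Soft-state registration means that a registry entry disappears by itself unless the registered party refreshes it in time. The sketch below illustrates only this general idea. It is not the MDS-2 interface: the class name, the method names and the fixed lifetime are assumptions, and the current time is passed in explicitly to keep the example deterministic.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class SoftStateRegistrySketch {

    private final long lifetimeMs;
    private final Map<String, Long> expiresAt = new HashMap<>();

    public SoftStateRegistrySketch(long lifetimeMs) {
        this.lifetimeMs = lifetimeMs;
    }

    /** Registers a service, or refreshes it if it is already known. */
    public void register(String service, long nowMs) {
        expiresAt.put(service, nowMs + lifetimeMs);
    }

    /** Drops entries that were not refreshed in time and returns the live ones, sorted. */
    public Set<String> lookup(long nowMs) {
        Iterator<Map.Entry<String, Long>> it = expiresAt.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() <= nowMs) {
                it.remove();
            }
        }
        return new TreeSet<>(expiresAt.keySet());
    }

    public static void main(String[] args) {
        SoftStateRegistrySketch registry = new SoftStateRegistrySketch(1000);
        registry.register("storage-node", 0);
        registry.register("compute-node", 0);
        registry.register("compute-node", 800);    // refreshed before it expires

        System.out.println(registry.lookup(500));  // [compute-node, storage-node]
        System.out.println(registry.lookup(1500)); // [compute-node]
        System.out.println(registry.lookup(2000)); // []
    }
}
```

A node that crashes simply stops refreshing, so it vanishes from the registry without any explicit deregistration step.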
   GSI, the Grid Security Infrastructure, allows clients to sign on to the network service. They have
to be authenticated only once, after which they get a proxy credential (an X.509 proxy certificate [3])
for requesting services.

   2.3 Web Services
   One of the goals of OGSA is to allow different kinds of systems to interact with each other, using
standardized protocols. Inventing new interaction mechanisms is an option, but this has some
disadvantages: new protocols have to be created, and they have to be standardized, which
takes a lot of time (devising protocols, gaining support, etc.). A better solution is to
embrace protocols that are already widespread and standardized. Consider, for example, protocols
such as FTP and HTTP: almost every computing system that is connected to the Internet supports these
protocols and can use them to interact with all kinds of systems across the world. This
interoperability was one of the key features that made the Internet successful, and is likely to be of
great importance to the success of grid systems.
   Web services are mechanisms that provide standard means of describing services and transporting
service data. To achieve this, the description details are formatted in a language that is readable by
machines regardless of the underlying hardware (as is done with HTML on the Internet). For this,
XML documents are used. Like HTML, these are written in plain text, so computer-specific
details are of no importance. These XML documents can be transported by the equally widely used
HTTP (but they are not restricted to HTTP). HTTP has the advantage that it is supported by a lot
of different computer systems, so it can always be used if no better alternative is
available. Web services are used in OGSA for the distribution, description and discovery of
service data. Each of the corresponding protocols is discussed below.
   The explanation of the individual components of Web services requires some basic understanding
of XML. Please note that most examples are rather simplified, as the focus of this article is on
OGSA, and not on details like how exactly to assemble correct WSDL files. They are only meant to
give an insight into how things work. Details about XML namespaces, XSD (XML Schema definitions),
SOAP headers etc. are mostly omitted.

   2.3.1 Exchanging information: SOAP
   The Simple Object Access Protocol (SOAP) [6] is an XML-messaging format. It can either transmit
a document or a Remote Procedure Call (RPC). The RPC is particularly interesting for distributed
computing, as it is a basic mechanism for the execution of program code on remote machines. To
realize such a remote call, most of the time some kind of information needs to be supplied, and
possibly a result value has to be returned (as is the case with local procedure calls in single computer
systems). Information between the calling procedure and the target procedure is transmitted in the
form of parameters (values). These can be simple (integers, strings etc) or more complex
(combinations of parameters, like a record in Pascal, or a struct in C/C++).
   With local procedure calls, the called procedure receives the parameters in its own, native, data
format. Different machines, however, can have different representations of similar data types.
Therefore, parameters are wrapped in a SOAP document, which is an XML document and so it is at
least readable for both participants. They can both convert the values to their internal representation
for their own use. A SOAP request-response chain for a 2-way RPC is shown in Figure 3.




                        Figure 3: SOAP message exchange (two-way RPC)

   The appeal of this protocol lies in its simplicity and its widespread support (almost every vendor
provides support, and it is even used in major B2B standards [5]). Its simplicity provides
great flexibility: it can be transported using various communication protocols (HTTP, even e-mail,
etc.), for multiple purposes (document-oriented or carrying parameter values). It is not even limited
to request-response messages: only single messages are defined, so an RPC can consist of two
separately defined SOAP messages, one for the input and one for the output. If no output is supplied,
no output message needs to be defined (this is up to the application). No matter how complex the
message exchange patterns are, they can all be composed from one-way messages.
   Although its name suggests some object-oriented link, SOAP does not extend the object-oriented
model: it does not support inheritance and other advanced object-related concepts, nor does it allow
the transport of operations on its data (methods); it can only carry data. Things are indeed kept
really simple, as otherwise its use could be limited to applications that use exactly the same ideas
as SOAP.

   SOAP Message structure
   SOAP messages are just XML documents. They are used to wrap the information (parameters) in a
generally understandable format. An example of a SOAP message is shown in Example 1:

<soap:Envelope>

  <soap:Body>
    <m:getAccountInfo>
      <accountNr> 304121077 </accountNr>
      <key> 2094967244 </key>
    </m:getAccountInfo>
  </soap:Body>

</soap:Envelope>
                               Example 1: SOAP message envelope

   The body of this SOAP message contains the parameters for a web service operation called
getAccountInfo (also used in section 2.3.2). The target service can use the supplied parameters
to carry out the operation, after which it possibly sends a return message. Additionally, a
SOAP header can be used, which can contain additional information for intermediate
systems that are passing the message on.
   As can be seen, this does not yet define any message chains. It is up to the higher-level
descriptions (WSDL in this case) to combine single messages into request-response chains. This will
be dealt with in the next section on WSDL.
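
To show how a receiver might unwrap such a message, the following sketch parses the envelope of Example 1 with the standard Java DOM parser and extracts the operation name and its parameters. It is a sketch under stated assumptions: the class SoapBodyReader and its methods are hypothetical, the namespace declarations that Example 1 omits are added, and the URI urn:example:bank for the m prefix is made up. The SOAP namespace shown is the one of SOAP 1.1.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class SoapBodyReader {

    // Namespace URI of the SOAP 1.1 envelope.
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    /** Returns the first element child of parent, or null if there is none. */
    static Element firstElementChild(Node parent) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                return (Element) n;
            }
        }
        return null;
    }

    /** Parses the message and returns the element that represents the call in the body. */
    static Element readCall(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList bodies = doc.getElementsByTagNameNS(SOAP_NS, "Body");
        if (bodies.getLength() == 0) {
            throw new IllegalArgumentException("no SOAP body");
        }
        Element call = firstElementChild(bodies.item(0));
        if (call == null) {
            throw new IllegalArgumentException("empty SOAP body");
        }
        return call;
    }

    /** Collects the parameters of the call as name/value pairs, in document order. */
    static Map<String, String> readParameters(Element call) {
        Map<String, String> params = new LinkedHashMap<>();
        for (Node n = call.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                params.put(n.getLocalName(), n.getTextContent().trim());
            }
        }
        return params;
    }

    public static void main(String[] args) throws Exception {
        // Example 1 with namespace declarations added; urn:example:bank is hypothetical.
        String message =
              "<soap:Envelope xmlns:soap='" + SOAP_NS + "' xmlns:m='urn:example:bank'>"
            + "<soap:Body><m:getAccountInfo>"
            + "<accountNr> 304121077 </accountNr><key> 2094967244 </key>"
            + "</m:getAccountInfo></soap:Body></soap:Envelope>";

        Element call = readCall(message);
        System.out.println(call.getLocalName());  // getAccountInfo
        System.out.println(readParameters(call)); // {accountNr=304121077, key=2094967244}
    }
}
```

The parameter values arrive as text; converting them to native types is a separate step that depends on the type information in the service description.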

    2.3.2 Describing services: WSDL
    The Web Services Description Language (WSDL) [2] is used in OGSA as the standard way of
describing services. WSDL is an XML-based language used to describe web services and how to access
them. What WSDL defines is basically what kind of messages (and in what format) can
be sent to the service provider, and what results (messages) can be expected in return: WSDL
describes the I/O interactions between the service provider and the clients of the web service.
Because WSDL is XML-based, it is a document format readable by various machines; each system
can decide for itself how to interpret these messages.
    WSDL has an important benefit: the service description and the actual implementation (at
the server side) are forced to be separated. This is the only way to accomplish full interoperability,
as all system details are excluded from the service description.
    OGSA uses this by defining Grid Services as WSDL documents that conform to a given
set of standard conventions, so all systems know how to extract the information relevant to the grid
system from the WSDL document. First, I will give a short introduction to the basics of WSDL, after
which we will look at a short example to clarify the concept.

   WSDL basics
   A WSDL document offers the interface to the application code, by describing the supported
operations, the associated message formats, type definitions and protocol bindings (like SOAP). All
these parts are expressed in an XML document using WSDL-specified XML tags [9]. The most
important tags for this purpose are:
    • <portType>, containing a number of <operation> tags. This is one of the most important
        elements in WSDL. Here the web service is defined, including all of the operations and the
        required messages. A port type can be compared to a function library or similar in more
        traditional languages.
    • <message>, describing a single message. The operations in the WSDL portType part are
        defined as series of successive messages (like input, output). Each of these messages must be
        described in detail, telling how many and what kind of parameters are accepted by a function
        call (and thus must be present in the message).
    • <types>, used to express type definitions in an understandable language. WSDL does not
        introduce its own type definitions; instead it supports the XML Schema specification (XSD),
        which provides a set of standard XML type definitions.
    • <binding>, a specification of the protocol, data format or structure for a particular portType.
        It is used to attach these specific protocols etc. to an abstract message. The separation of
        protocol binding and abstract operation definitions allows the abstract definitions to be
        reused in another binding.
    • <port>. WSDL ports, or endpoints, are used to specify the services (services are described as
        collections of endpoints). A port itself is defined as a combination of a binding and a
        network address.
    • <service>, containing a collection of related endpoints.

   There also exists a <documentation> tag, which is optional and can be used to contain human-
readable information about certain messages.

     A simple scenario
   To illustrate how WSDL works, I will use the following example. Suppose we have a
bank that for some reason decides to make some of its banking services available on some kind of
network using web services. The services that are to be made addressable can be written in any
programming language, Java in this example. The server that supplies this service has access to the
BankingService class, which supports two operations: retrieving the account information (using the
getAccountInfo function) and updating the account information (updateAccount). This could
look something like Example 2. In this case, the BankingService can be seen as the web service
with operations getAccountInfo and updateAccount.

 public class BankingService {

         // Placeholder value so that the class compiles; a real service
         // would look the amount up per account.
         private double someAmount = 0.0;

         public double getAccountInfo(String accountNr, int key) {
             /* request account information */
             return someAmount;
         }

         public double updateAccount(String accountNr, int key, double amount) {
             /* update the account with the double value 'amount' */
             return someAmount;
         }
 }
                                     Example 2: The Banking Service

   Now remember that the machines that issue a request have no knowledge of the implementation
details like what kind of programming language is used, including the details of how the data types are
represented at the server side. To allow access through WSDL, all this information needs to be
represented in an ‘intermediate’ standardized form. The use of standardized XSD types allows both
server and client to read the value and convert it to some understandable format.
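
As a minimal sketch of this conversion step, the fragment below turns the text of an XML element into a Java value based on its declared XSD type. The class XsdValues and the method fromXsd are hypothetical names. Only the three types used in the banking example are handled, and special lexical forms of xsd:double such as INF are not covered.

```java
public class XsdValues {

    /**
     * Converts the text of an XML element to a Java value, given its XSD type.
     * Only xsd:string, xsd:int and xsd:double are handled in this sketch.
     */
    static Object fromXsd(String xsdType, String text) {
        switch (xsdType) {
            case "xsd:string":
                return text;                            // kept as is
            case "xsd:int":
                return Integer.parseInt(text.trim());   // 32-bit signed integer
            case "xsd:double":
                return Double.parseDouble(text.trim()); // 64-bit floating point
            default:
                throw new IllegalArgumentException("unsupported type: " + xsdType);
        }
    }

    public static void main(String[] args) {
        // The values arrive as text, possibly padded with whitespace.
        int key = (Integer) fromXsd("xsd:int", " 2094967244 ");
        double amount = (Double) fromXsd("xsd:double", "12.50");

        System.out.println(key);    // 2094967244
        System.out.println(amount); // 12.5
    }
}
```

A client written in another language would apply the same mapping to its own native types, which is why neither side needs to know how the other represents data internally.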

     To describe this service using WSDL, some things must be defined:
          • The message formats. Remember that a WSDL document basically defines input and
              output messages (e.g. a SOAP request of form X results in a response message of
              form Y).
          • The supported operations. Only simple, high-level operations should be freely
              accessible, so each allowed operation is defined explicitly in a WSDL <operation>
              field.
          • The supported operation bindings, port types, type definitions, etc.
     The messages are defined using the WSDL <message> fields as shown in Example 3.



 <message name='BankingService.getAccountInfo'>
   <part name='accountNr' type='xsd:string'/>
   <part name='key' type='xsd:int'/>
 </message>

 <message name='BankingService.getAccountInfoResult'>
   <part name='result' type='xsd:double'/>
 </message>

 <message name='BankingService.updateAccount'>
   <part name='accountNr' type='xsd:string'/>
   <part name='key' type='xsd:int'/>
   <part name='amount' type='xsd:double'/>
 </message>

 <message name='BankingService.updateAccountResult'>
   <part name='result' type='xsd:double'/>
 </message>
                                    Example 3: Message definitions

   The first two messages define request and response messages for getting the account information.
The next two messages are used to update the account information (like adding some amount of
money or setting a new value). Merely defining these messages does not yet supply the information
about request-response chains of messages. To define these, the <operation> fields are used, in
which the operations are defined in terms of their input and output (if any) messages. How this can be
done is shown in Example 4:

 <portType name='BankingServicePort'>

    <operation name='getAccountInfo'>
      <input message='wsdlns:BankingService.getAccountInfo'/>
      <output message='wsdlns:BankingService.getAccountInfoResult'/>
    </operation>

    <operation name='updateAccount'>
      <input message='wsdlns:BankingService.updateAccount'/>
      <output message='wsdlns:BankingService.updateAccountResult'/>
    </operation>

    <operation name='mutedUpdateAccount'>
      <input message='wsdlns:BankingService.updateAccount'/>
    </operation>

 </portType>
                                    Example 4: Defining operations

    The first operation, getAccountInfo, defines which input message is used and what result is
to be expected after the command has completed: a message of type getAccountInfo results in a
response of the form getAccountInfoResult. The second operation is similar to the first one, except
that the updateAccount function is invoked. A request without a response is also possible; to this
end the third operation is used, which only defines an input message. All of the allowed operations
are collected in a <portType> field.
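
    The relation between operations and messages can also be sketched in Java. The classes
Operation and PortTypeSketch below are hypothetical and only mirror the structure of the port type
described above: every operation has an input message, and a missing output message marks a one-way
operation.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical model of a WSDL operation: an input message name and an
// optional output message name (absent for one-way operations).
class Operation {
    final String name;
    final String input;
    final Optional<String> output;

    Operation(String name, String input, String output) {
        this.name = name;
        this.input = input;
        this.output = Optional.ofNullable(output);
    }

    // A one-way operation defines an input message but no output message.
    boolean isOneWay() {
        return !output.isPresent();
    }
}

// A port type is simply the collection of allowed operations.
class PortTypeSketch {
    static final List<Operation> BANKING_SERVICE_PORT = List.of(
            new Operation("getAccountInfo",
                    "BankingService.getAccountInfo", "BankingService.getAccountInfoResult"),
            new Operation("updateAccount",
                    "BankingService.updateAccount", "BankingService.updateAccountResult"),
            new Operation("mutedUpdateAccount",
                    "BankingService.updateAccount", null));
}
```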

   Protocol Bindings
   The last part that is interesting for grid technologies is the binding. In the binding field, the type of
communication protocol is defined. This ensures that the Web service can be accessed by diverse
communication mechanisms: it is not limited to SOAP, but could also support HTTP-GET, for
instance. The <binding> field is used to describe for each operation which protocol(s) are
supported. If more than one is supported, then a separate binding part needs to be described for each
access mechanism. The binding could look like this:

<binding type='BankingServicePort' name='BankingSoapBinding'>
  <soap:binding style='rpc' />
    <operation name='getAccountInfo'>
      <soap:operation
          soapAction='http://banking.org/actions/getAccountInfo' />

          <input>
             <soap:body use='encoded' namespace='http://banking.org/msg'
                  encodingStyle='http://schemas.xmlsoap.org/soap/encoding/' />
          </input>

          <output>
             <soap:body use='encoded' namespace='http://banking.org/msg'
                 encodingStyle='http://schemas.xmlsoap.org/soap/encoding/' />
          </output>

    </operation>
</binding>
                             Example 5: Binding operations to SOAP

The binding type refers to the previously described portType “BankingServicePort”. This binding
uses an RPC-style message exchange, because this is the description of a function call. The
soapAction field of the <soap:operation> contains a URI. What such URIs and namespaces exactly
are will not be described in this article; the value basically acts as a unique operation identifier. It is
not a reference to a website at the given location, but merely a network-wide way to distinguish this
operation from other operations with similar names, which can reside at other servers.
    Within the <input> and <output> a <soap:body> element is used to define the encoding
scheme. Without going into too much detail about SOAP encoding, it is sufficient to say that this
message uses standard SOAP encoding. How this is actually done is considered to be a bit too
“low-level” for this article: most of these XML-type documents are invisible to the users most of the
time anyway.
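
    To give an impression of what such a normally invisible document looks like, the following sketch
builds a simplified SOAP 1.1 request for the getAccountInfo operation. The class SoapRequestSketch
is hypothetical, the element layout is an assumption based on the RPC style and the namespace used in
the binding above, and details such as the xsi:type attributes of encoded parameters and the escaping
of special characters are left out.

```java
// Hypothetical helper that assembles a simplified RPC-style SOAP 1.1 request.
// Assumption: parameters are plain child elements named after the message parts.
class SoapRequestSketch {
    static String getAccountInfoRequest(String accountNr, int key) {
        // Note: accountNr is inserted as-is; a real client would escape XML characters.
        return "<soap:Envelope xmlns:soap='http://schemas.xmlsoap.org/soap/envelope/'"
             + " soap:encodingStyle='http://schemas.xmlsoap.org/soap/encoding/'>"
             + "<soap:Body>"
             + "<m:getAccountInfo xmlns:m='http://banking.org/msg'>"
             + "<accountNr>" + accountNr + "</accountNr>"
             + "<key>" + key + "</key>"
             + "</m:getAccountInfo>"
             + "</soap:Body>"
             + "</soap:Envelope>";
    }
}
```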

    2.3.3 Service Discovery: UDDI & WSIL
    Within grid systems, the need for so-called service discovery also exists, otherwise the published
service information could be hard to find. There are two possible service publication methods
supported in OGSA. One of them uses a centralized mechanism (a service registry) where all of the
registered service information can be stored. This approach is based on the Universal Description,
Discovery and Integration (UDDI) specification [14]. The other supported way of finding information
is not centralized and is based on the Web Services Inspection Language (WSIL, or WS-Inspection)
[1]. WSIL documents are stored at the server so service providers can also be queried for service
information. Both of them will be discussed in a little more detail below.

   Centralized service descriptions: UDDI
   The UDDI specifications define a way to publish and discover information about Web Services.
UDDI uses the notion of a distributed registry of businesses/organisations and their associated
services, which can be searched by other organisations that are interested in a specific kind of
service. These service descriptions are the core components in UDDI; they are called UDDI business
registrations and are XML files conforming to the UDDI specification. See Figure 4 for an overview of
the individual parts of the registration & discovery procedure.




                              Figure 4: using UDDI to locate services

   The UDDI registry, where all the registrations are stored, logically consists of three parts: the
“white pages”, which include contact information such as addresses and identifiers; the “yellow
pages”, in which services are categorized based on industry; and the “green pages”, which contain
the technical information, such as service descriptions.
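
   The three logical parts can be pictured with a small in-memory stand-in. The classes
BusinessRegistration and UddiRegistrySketch below are hypothetical and do not follow the actual UDDI
data structures or API; they only show how one registration carries white, yellow and green page
information and how a search by industry could work.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, strongly simplified stand-in for a UDDI business registration.
class BusinessRegistration {
    final String businessName; // "white pages": contact information
    final String industry;     // "yellow pages": categorisation by industry
    final String wsdlUrl;      // "green pages": technical service description

    BusinessRegistration(String businessName, String industry, String wsdlUrl) {
        this.businessName = businessName;
        this.industry = industry;
        this.wsdlUrl = wsdlUrl;
    }
}

// Hypothetical central registry: publish registrations, search by industry.
class UddiRegistrySketch {
    private final List<BusinessRegistration> entries = new ArrayList<>();

    void publish(BusinessRegistration registration) {
        entries.add(registration);
    }

    List<BusinessRegistration> findByIndustry(String industry) {
        List<BusinessRegistration> result = new ArrayList<>();
        for (BusinessRegistration entry : entries) {
            if (entry.industry.equals(industry)) {
                result.add(entry);
            }
        }
        return result;
    }
}
```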

   Local service descriptions: WS-Inspection
   WSIL, the Web Services Inspection Language [1], is based on another concept of service
discovery. Whereas UDDI uses a centralized mechanism for the collection of service descriptions,
WSIL basically defines a set of conventions to directly query a server for its service data.
   The WS-Inspection specification has two primary functions. The first one is to define standard
conventions. To make the data easy to find, the inspection document is always placed at a standard
location, so that it can be found if only the name/location of the service provider is known. At
the web point of presence of a service, a document called inspection.wsil can be placed, which
contains the WS-Inspection information. Documents of this type can be retrieved using standard
mechanisms (such as HTTP-GET).
   Secondly, WS-Inspection defines an XML format for listing references to existing service
descriptions. A WSIL document can contain a collection of service descriptions and links to other
sources of service descriptions [3]. Most of the time, these are just URLs to WSDL documents, but a
reference can also be a pointer to an entry in a UDDI registry.
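
   A client that has retrieved such an inspection document could extract the referenced service
descriptions roughly as follows. This is a minimal sketch: the class WsilSketch is hypothetical, and
it assumes that each reference is a <description> element with a location attribute, ignoring
namespaces and all other parts of the WS-Inspection format.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: lists the 'location' attributes of all <description>
// elements in an inspection document that was already fetched as a string.
class WsilSketch {
    static List<String> descriptionLocations(String wsilXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(wsilXml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("description");
            List<String> locations = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                locations.add(((Element) nodes.item(i)).getAttribute("location"));
            }
            return locations;
        } catch (Exception e) {
            // Parsing problems are reported as an invalid argument.
            throw new IllegalArgumentException("Not a readable inspection document", e);
        }
    }
}
```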

   2.4 The need for OGSA
    Whereas existing grids and their infrastructures are used mostly in scientific environments, another
application area has emerged. E-business is an area in which services also need to be integrated
across distributed platforms. Business-to-business (B2B) collaborations often require cross-enterprise
applications to function, thus requiring a supporting infrastructure. Also, organisations are offloading
processes to service providers (SPs) to achieve a better price/performance ratio.
    To manage cross-grid applications, organisations are in general concerned with seamless
integration of infrastructure. Using standard mechanisms, like OGSA, is a must for a smooth
installation of such systems: OGSA uses proven and widely supported technologies, which is a great
advantage for its standardization.


3 OGSA functionality
   To enable the use of the mechanisms discussed in the previous chapter, they need to be adapted to
the grid systems architecture. To this end, OGSA defines a grid service, which is basically a web
service that conforms to a set of rules to fit the grid framework. The grid service describes a set of
conventions (for naming and upgradeability) for web services and defines a set of interfaces for the
functions that OGSA based grid systems must support. These interfaces address the following
functionality:
       • Service discovery. This allows systems to find a service provider for a certain grid service.
       • Dynamic service creation. This allows all kinds of service instances to be created for a
            remote resource.
       • Lifetime management. This manages the lifetime of a service and increases fault tolerance
            (services automatically time out if the requestor of a service fails to keep the service
            alive).
       • Notification. This is necessary to inform a system that previously requested a service that
            the requested service is available at a certain location.

    Before going into the details of these main functions of OGSA-based systems, first some
explanation will be given about the OGSA service model. In OGSA, shared resources of any kind are
all represented in the form of a grid service. The services are defined by the type(s) of service they can
perform; more specifically, by the interfaces that they implement (and offer to the clients). Interfaces
are implemented in the form of WSDL port types. The GridService is the only required port type, the
other ones are optional. In other words, a web service is a grid service if it implements the GridService
port type. Descriptions can be found in [12] and [3]. The grid service interface offers operations, for
example, to retrieve service data and to destroy the grid service.
    If different instances offer the same service (implement the same interface) there must be some
way to recognize which occurrence of the service is used or required. To this end, grid services
maintain an internal state. Also, each grid service instance is given a globally unique name, called a
Grid Service Handle (GSH). GSHs have in theory an infinite lifetime, but grid services must be
upgradeable. To achieve this, a Grid Service Reference (GSR) is used. A GSH and a GSR are both
network-wide pointers to a service instance. The difference lies in their lifetime and in the specific
information they carry.
    The GSH for a service instance is always the same, so it must not carry too much protocol-specific
information. Using a handle-to-reference mapper (the grid service interface handleMap is used for
this), the GSH can be converted to a GSR, which has a limited lifetime. The only thing that is left
to be done is to register a GSH with at least one mapper (called the home handleMap). Because a grid
service that implements this interface does not need to know the location of the service, a GSH must
always be mapped at the home handleMap. To find this home, the URL to the service instance is
included in the GSH.
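
    The difference between a handle and a reference can be made concrete with a small sketch. The
classes ServiceReference and HandleMapSketch are hypothetical and do not reproduce the OGSI
handleMap interface; they only show the idea of a permanent handle that is resolved to a reference
with a limited lifetime. Time is passed in explicitly (in milliseconds) to keep the example
deterministic.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical reference: a protocol-specific endpoint plus an expiry time.
class ServiceReference {
    final String endpoint;
    final long expiresAtMillis;

    ServiceReference(String endpoint, long expiresAtMillis) {
        this.endpoint = endpoint;
        this.expiresAtMillis = expiresAtMillis;
    }

    // A reference may only be used before its expiry time.
    boolean isValid(long nowMillis) {
        return nowMillis < expiresAtMillis;
    }
}

// Hypothetical stand-in for a home handleMap: handles stay the same,
// while the endpoint behind a handle may change (for example after an upgrade).
class HandleMapSketch {
    private final Map<String, String> endpoints = new HashMap<>();
    private final long referenceLifetimeMillis;

    HandleMapSketch(long referenceLifetimeMillis) {
        this.referenceLifetimeMillis = referenceLifetimeMillis;
    }

    void register(String handle, String endpoint) {
        endpoints.put(handle, endpoint);
    }

    // Converts a handle into a fresh, time-limited reference.
    ServiceReference resolve(String handle, long nowMillis) {
        String endpoint = endpoints.get(handle);
        if (endpoint == null) {
            throw new IllegalArgumentException("Unknown handle: " + handle);
        }
        return new ServiceReference(endpoint, nowMillis + referenceLifetimeMillis);
    }
}
```

    When a reference has expired, the client simply resolves the unchanged handle again and may
receive a reference to a new location.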

   Service discovery
    Because of the lack of global control, each machine must have its own mechanisms to find the
services it requires, as OGSA is based completely on services and the dynamic creation of services.
So there must be standard mechanisms available for describing the services, as well as mechanisms
for finding them.
    Coupled with a grid service instance is a set of service data (a description of what services the
instance is able to handle). Service data is encapsulated in so-called service data elements, which are
just another collection of XML parts. A grid service that supports discovery is called a registry, and
implements the Registry service interface. A registry allows GSHs to be registered with it.
If a remote party wishes to query the service instance, it can issue the FindServiceData operation,
which is part of the GridService interface. In this way the GSHs registered in the registry are made
available to the outside world.
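
    A much simplified picture of this mechanism is given below. The class GridRegistrySketch is
hypothetical; its method names are only loosely modelled on the operations mentioned above, and
service data elements are reduced to plain name/value pairs instead of XML.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical registry: stores handles together with their service data
// and answers simple "element name equals value" queries.
class GridRegistrySketch {
    // handle -> service data elements (name -> value)
    private final Map<String, Map<String, String>> registered = new LinkedHashMap<>();

    void registerService(String handle, Map<String, String> serviceData) {
        registered.put(handle, serviceData);
    }

    // Returns the handles of all services whose service data matches the query.
    List<String> findServiceData(String elementName, String expectedValue) {
        List<String> handles = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> entry : registered.entrySet()) {
            if (expectedValue.equals(entry.getValue().get(elementName))) {
                handles.add(entry.getKey());
            }
        }
        return handles;
    }
}
```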

   Service creation
   Dynamic service creation is another key functionality implemented in OGSA. In order to use a
remote service, the creation of such a service is handled by a special service called a Factory. If
an instance implements the CreateService operation of the Factory interface, it can create remote
services. The factory then returns a GSH for the newly created service and an initial GSR. How
services are created is not specified by the interface; it depends on the implementation.
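
   The factory pattern described here could be sketched as follows. The classes CreationResult and
FactorySketch are hypothetical, and the way handles and references are formed is purely an
assumption made for this illustration; how instances are actually created is left to a constructor
function supplied by the caller, in line with the remark that creation is implementation dependent.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical result of a creation request: a permanent handle (GSH)
// and an initial reference (GSR) for the new instance.
class CreationResult {
    final String handle;
    final String initialReference;

    CreationResult(String handle, String initialReference) {
        this.handle = handle;
        this.initialReference = initialReference;
    }
}

// Hypothetical factory: creates instances on request and names them uniquely.
class FactorySketch<T> {
    private final String baseUrl;
    private final Supplier<T> constructor;
    private final Map<String, T> instances = new HashMap<>();
    private int nextId = 0;

    FactorySketch(String baseUrl, Supplier<T> constructor) {
        this.baseUrl = baseUrl;
        this.constructor = constructor;
    }

    CreationResult createService() {
        T instance = constructor.get();
        String handle = baseUrl + "/instance/" + (nextId++);
        instances.put(handle, instance);
        // Assumption: the initial reference is the handle plus binding information.
        return new CreationResult(handle, handle + "?binding=soap");
    }

    T lookup(String handle) {
        return instances.get(handle);
    }
}
```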




   Lifetime management
    Most of the services in OGSA are transient. Instead of using a fixed set of long-lasting services,
they are dynamically manageable. This means that if a service is no longer needed, it can be destroyed
so that the resources it was using can be freed for use by others. A service is typically created to
perform a required task, pass the results back to the requestor and then terminate. Another possibility
is to explicitly destroy services whenever they are not needed anymore (the GridService interface
offers the Destroy and SetTerminationTime operations for this use), but unexpected things may
happen.
    Messages containing termination requests can get lost. If a service does not terminate, it continues
using resources that could otherwise be used by others. With heavyweight processes this is certainly
undesirable. Service lifetime is therefore managed with so-called keepalive messages. This (to a
certain extent) provides reliability and prevents services from continuing to execute if the requestor
dies. The service requestor sends the service a message once in a while to inform it that the
requestor is still up and running. If the message stream stops, the service may decide to terminate,
because it is doing work that nobody is interested in and using up resources that could be freed.
    A service may need some additional time to complete (extra) service requests. If it times out,
information can get lost. Requestors can therefore request a lifetime extension so that the service will
not yet terminate.
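
    The keepalive idea can be sketched with a service that carries a termination time. The class
TransientServiceSketch is hypothetical; its method names follow the Destroy and SetTerminationTime
operations only loosely, and time is passed in explicitly (in milliseconds) so that the behaviour is
easy to follow.

```java
// Hypothetical transient service with a termination time.
class TransientServiceSketch {
    private long terminationTimeMillis;
    private boolean destroyed = false;

    TransientServiceSketch(long initialTerminationTimeMillis) {
        this.terminationTimeMillis = initialTerminationTimeMillis;
    }

    // Keepalive: the requestor moves the termination time further into the future.
    void setTerminationTime(long newTerminationTimeMillis) {
        if (!destroyed) {
            terminationTimeMillis = newTerminationTimeMillis;
        }
    }

    // Explicit destruction, for when the requestor no longer needs the service.
    void destroy() {
        destroyed = true;
    }

    // The service is alive until it is destroyed or its termination time passes.
    boolean isAlive(long nowMillis) {
        return !destroyed && nowMillis < terminationTimeMillis;
    }
}
```

    If the requestor dies, no further calls to setTerminationTime arrive and the service stops being
alive once the last agreed termination time has passed, even if an explicit destroy request was lost.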

   Notification
    Notification lets clients show their interest in receiving particular messages. For example, this
can be used whenever a client is trying to request a service of which no instance is available yet, or
when it is for other reasons interested in a certain topic. These messages are sent and delivered
asynchronously and one-way, so a client that has subscribed to a certain topic can expect
notification messages to come in at any time. Messages are said to flow from a notification
source to a notification sink. If a service wishes to receive any kind of notification messages, it must
implement the NotificationSink interface (which provides the DeliverNotification operation). In order
for a service to be able to send notification messages, it must implement the NotificationSource
interface and its operations SubscribeToNotificationTopic and UnSubscribeToNotificationTopic.
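
    The interplay between source and sink resembles the well-known observer pattern. The sketch
below is a simplified, synchronous illustration: the type names NotificationSinkSketch and
NotificationSourceSketch are hypothetical, the method names are only modelled on the operations
mentioned above, and the publish method is an assumed helper that is not part of the text above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sink: receives one-way notification messages.
interface NotificationSinkSketch {
    void deliverNotification(String topic, String message);
}

// Hypothetical source: keeps track of subscribers per topic.
class NotificationSourceSketch {
    private final Map<String, List<NotificationSinkSketch>> subscribers = new HashMap<>();

    void subscribeToNotificationTopic(String topic, NotificationSinkSketch sink) {
        subscribers.computeIfAbsent(topic, t -> new ArrayList<>()).add(sink);
    }

    void unsubscribeFromNotificationTopic(String topic, NotificationSinkSketch sink) {
        List<NotificationSinkSketch> sinks = subscribers.get(topic);
        if (sinks != null) {
            sinks.remove(sink);
        }
    }

    // Assumed helper: pushes a message to every sink subscribed to the topic.
    void publish(String topic, String message) {
        List<NotificationSinkSketch> sinks = subscribers.get(topic);
        if (sinks == null) {
            return;
        }
        // Iterate over a copy so that sinks may unsubscribe while being notified.
        for (NotificationSinkSketch sink : new ArrayList<NotificationSinkSketch>(sinks)) {
            sink.deliverNotification(topic, message);
        }
    }
}
```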

4 Using OGSA for writing grid applications:
  The Globus Toolkit version 3.0
    After the introduction of all theoretical and architectural components, it is a good idea to
have a look at how a real grid middleware works, how applications can be written that use these
services and how the language bindings to WSDL are made. For this, the latest version of the Globus
Toolkit (version 3.0, hereafter simply called “GT3”) will be discussed. GT3 is an evolution of the
previous versions of the toolkit (“GT2” and earlier), but it is different in the sense that GT3 is the
first version of the Globus Toolkit that is based on the OGSA architecture (as described in section 2.1).
    GT3 also implements the Open Grid Service Infrastructure (OGSI) framework, which is described
in [12]. The difference between OGSI and OGSA lies mainly in what exactly they describe. OGSA
describes what architectural facilities need to be present for a system to be compatible with OGSA.
These facilities can be components like factories or registries. OGSI defines how the WSDL format is
used for the description of a grid service: how different OGSA systems can communicate and
understand what other services have to offer. So, OGSI specifies the description format, by specifying
which of the (WSDL) interfaces a service must support in order to fit in the OGSA structure, so that a
system can correctly interpret the standard Grid Services and knows how to extract OGSA-related
functionality like lifetime management and service invocation mechanisms.
    Currently there are no full-version GT3 releases available yet, but the first alpha release was
published in January 2003. The intent behind GT3 is “to deliver an open source implementation of
OGSI, several OGSI-compliant services and the ability to create new OGSI-compliant services” [13];
it can be seen as an early reference implementation. As mentioned, GT3 is based on earlier versions of
the Globus Toolkit and it offers most of the high-level services GT2 already offered, but in a more
flexible way because it uses the OGSI transport mechanisms (the web services framework). The
implementation of OGSI allows GT3 to collaborate with other E-business and E-science projects that
also conform to this specification, which was one of the main motivations for designing OGSA in the
first place.
   The example listings are part of the programmer’s guide that comes with GT3. The version used
for this article is “technology preview version 4.0”, which is not an official release.

   4.1 Programming model
   Before explaining how to write a server, we will first have a look at what the OGSA programming
model looks like. One of the most important parts is translating the server-side and client-side code to
WSDL files conforming to OGSI (and vice versa). Also, SOAP messages that arrive while the
program is running must be translated into calls on the running code. This task, creating a language
binding between the code and the WSDL interfaces, can be done in various ways.
   Generating the bindings can be done at compile-time or at program runtime. Compile-time creation
of interfaces is mostly done at the time the server-side code is written. When the program interfaces
(written in Java or other programming languages) are ready, the WSDL interfaces can be created
immediately. However, this is not possible for all parts of the request-response chain where translation
between WSDL and machine-understandable code is necessary. For example, when a client issues a
request (probably a SOAP message) the information in that message needs to be interpreted at runtime
by the server process.
   GT3 currently provides a Java programming model. The interfaces in the GT3 API (Application
Programming Interface) can be called directly in the program. In GT3, JAX-RPC (Java API for
XML-based Remote Procedure Call) [17] from Sun provides the runtime WSDL-to-Java mapping model
(and vice versa). The remainder of this chapter briefly discusses how to implement
servers and clients using the Java API provided by GT3.
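
   What a runtime binding has to do can be illustrated with plain Java reflection. This is not how
JAX-RPC or GT3 are implemented; the class RuntimeDispatchSketch is hypothetical and only shows the
basic step of turning an operation name and an argument, as they might be extracted from an incoming
request, into a method call on a server object.

```java
import java.lang.reflect.Method;

// Hypothetical dispatcher: looks up a public method by name at runtime
// and invokes it with a single int argument.
class RuntimeDispatchSketch {
    static Object invoke(Object service, String operation, int argument) {
        try {
            Method method = service.getClass().getMethod(operation, int.class);
            return method.invoke(service, argument);
        } catch (ReflectiveOperationException e) {
            // Unknown operations and failing calls are reported to the caller.
            throw new IllegalArgumentException("Cannot invoke operation " + operation, e);
        }
    }
}
```

   For example, calling invoke(list, "get", 1) on a java.util.ArrayList has the same effect as the
direct call list.get(1); a compile-time binding would instead generate such a call in advance.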

   4.2 Programming a server
    Before describing how to actually write a service that is to be deployed for use on the grid, it is
a good idea to have a look at how the server-side API is structured. The general structure is shown
in Figure 5. As can be seen, services written in Java inherit from a ServiceSkeleton, which handles the
common behaviour of all grid services, including the most general manipulation methods (create,
destroy etc.). To implement a persistent service, the PersistentServiceSkeleton can be used. Persistent
services are statically installed and do not need to be dynamically created using a factory. A factory is
a static service and can as such be constructed from the persistentService class.
    Further, all services have one ServiceDataContainer in which the service description is stored, to
be accessed by other processes (clients). This can be done using a client-initiated approach (pulling)
by invoking findServiceData(). It is also possible to use a push-based approach (subscribing or
notification) by using the addListener() method (not shown in Figure 5).




                  Figure 5: Grid Service Server Programming Model (Java) [8]

   The general steps a programmer must take to deploy the service can be found in the GT3
documentation. This programmer’s guide for creating a server consists of the following steps:

   1.   Provide a Service Interface
   2.   Generate Service Stubs
   3.   Implement the Service
   4.   Deploy the Service

   The service interface can be provided by writing a Java interface for the service. A Java interface
usually consists of method definitions (names, parameters, types) and class definitions (which classes
are to be inherited and implemented). This is enough information for generating the WSDL interfaces
because for this the actual implementation is not important. See Example 6 for the Java interface of the
server used in this chapter.

package org.globus.ogsa.guide.impl;

public interface Counter {
    public int add(int val);
    public int subtract(int val);
    public int getValue();
}

                    Example 6: Java server side interface for a simple counter

   The next step is the creation of the WSDL interfaces. They can be written by hand, or by using a
special tool that can convert Java interfaces to WSDL. An example of a tool that can be used is the
Java2WSDL tool [16]. This generates WSDL and JAX-RPC compatible bindings that can be used on
the client and the server side (in this case a CounterPortType). After this, the Java version of the
service interface can be implemented, after which it can be deployed (registered in a Grid Hosting
Environment, by adding it to registries and such).
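
   As an indication of what the third step (implementing the service) involves, the sketch below
gives a plain Java implementation of the counter. It is not the implementation shipped with GT3: the
class name CounterImpl is hypothetical, the GT3 skeleton classes are left out, the interface of
Example 6 is repeated without its package declaration to keep the sketch self-contained, and it is
an assumption that add and subtract return the new counter value.

```java
// Repeats the interface of Example 6 (without the package declaration)
// so that the sketch compiles on its own.
interface Counter {
    int add(int val);
    int subtract(int val);
    int getValue();
}

// Hypothetical implementation; a real GT3 service would additionally
// build on the skeleton classes provided by the toolkit.
class CounterImpl implements Counter {
    private int value = 0;

    // Assumption: the operations return the counter value after the update.
    public synchronized int add(int val) {
        value += val;
        return value;
    }

    public synchronized int subtract(int val) {
        value -= val;
        return value;
    }

    public synchronized int getValue() {
        return value;
    }
}
```

   The methods are synchronized because several clients may invoke the same service instance at
the same time.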




   4.3 Programming a client
   Client applications can be written directly on top of the JAX-RPC client APIs. A number of utility
classes for simplifying GSH-to-GSR resolution are provided with GT3. Basically, the only essential
knowledge JAX-RPC lacks is awareness of Grid Service Handles and References.
   Normally, a client receives a GSH from a registry or a well-known location. The handle can then
be given to the ServiceLocator, which constructs a proxy or stub that has the responsibility of making
the function calls while respecting the binding information as described in the WSDL service
description. The service locators for the specific types are automatically generated when the WSDL
bindings are generated, so in the case of Example 6, a CounterServiceLocator is generated.
   A number of examples of services and clients are supplied with GT3. The counter server and client
are among them, as well as simpler examples that demonstrate parts of the functionality of GT3.
The following example of a simple counter is also described in the GT3 programmer’s guide [15]. The
service interface (the result of step 1 in section 4.2) can be used to generate all interfaces (for WSDL
examples see section 2.3.2).




                   Figure 6: Grid Service Client Programming Model (Java) [8]

    Example 6 lists the server-side interface for a simple counter. This interface can be used to generate
all necessary server-side interfaces to deploy the service (as described in section 4.2). Example 7
shows a possible client-side application for using this service, which comes with GT3 [15].
    In this example, the most important parts take place in the try{} part of the program. This is
where the service is located, after which the client program can invoke the methods on the object. A
CounterServiceLocator is used to get the client stub for the counter service, which can then be used as
if it were the real object. The stub transports the request to the server, where it is interpreted and the
result is produced; a return message is sent, after which the client method returns. The client can use
the CounterPortType object the locator returns without having to worry about the object location. This
example does not use GSH and GSR constructs, for which a FactoryPortType object has to be used.
This creates a GSR which can be mapped to a handle by also using a CounterServiceLocator. For
more examples (also about this counter) see the GT3 programmer’s guide [15].




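import java.net.URL;

// CounterServiceLocator and CounterPortType are the stub classes generated
// from the WSDL bindings; they must be available on the classpath.
public class CounterClient {
    // Expected arguments: <service URL> <operation> <value>
    public static void main (String[] args) {

           int val = Integer.parseInt(args[2]);
           try {
               // Locate the service and obtain a client stub for it
               CounterServiceLocator counterLocator = new
                                     CounterServiceLocator();
               URL url = new URL(args[0]);
               CounterPortType counter =
                                     counterLocator.getCounterService(url);

               // Invoke the requested operation on the stub
               if (args[1].equalsIgnoreCase("add")) {
                   System.out.println("Counter add:" + counter.add(val));
               }
               else if (args[1].equalsIgnoreCase("...")) {
                   /* other functions based on supplied arguments */
               }
           } catch (Exception e) {
               e.printStackTrace();
           }
     }
}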
         Example 7: implementation of a client side program using the Counter service

    4.4 Larger grid applications using GT3
   Now how do things work when the applications get more complex? Examples such as these do not
go far beyond the “Hello world” level. More advanced applications are of course possible: grids are
designed to share various kinds of resources and OGSA even goes beyond ‘simple’ grids, as it is
also aimed at cross-grid applications.
   For a non-expert in the grid field, it would probably be quite challenging to extract example listings
for such complex environments that are still easy to understand. The OGSA framework is relatively
new and, as a result, not many “real” applications are available for it yet. This is not very surprising: at
the moment of writing, GT3 is still only available in alpha releases.
   This article was intended to provide an overview of how web services and grid systems relate to
each other, and what could possibly be done with this kind of architecture. How cross-grid services
can best be implemented, and how they really work, is something that will have to be learned by
experience, and it will probably still take some time before we can get a good view of that.

5 Conclusions
    As we have seen, grid systems are expected to be the next big step in the evolution of (networks of)
computers. Of course, it is hard to predict the future: it depends on more factors than just the
available technology. Middleware packages like Globus will first need to prove to be a worthy
addition to existing commercial settings, rather than serving just the purpose of science. However,
grids are still relatively new, and what the future will look like cannot be predicted from the
technology alone: it is also necessary for the business world to become aware of the possibilities that
lie in using (cross-grid) applications. In addition, toolkits like Globus need to provide an easy-to-use
solution that offers high performance with a good price-performance ratio.
    For the intended applications it is at least necessary to have an infrastructure that can be installed
almost everywhere. OGSA makes this possible by building on web services. However, solutions like
GT3 will probably not be used on a large scale until the technology has fully matured and there is a
wide range of support and tools that make programming grids relatively effortless. Otherwise, a lot
of research will probably be required to write such advanced applications, which could be a serious
problem for a lot of programmers.

6 References
 Literature
 [1] Brittenham, P. An overview of the Web Services Inspection Language. 2001.
     www.ibm.com/developerworks/webservices/library/ws-wsilover.

 [2] E. Christensen et al., Web Services Description Language (WSDL) 1.1. W3C Note, 15 March
     2001; http://www.w3.org/TR/wsdl

 [3] Foster, I., Kesselman, C., Nick, J.M. and Tuecke, S. The Physiology of the Grid. Tech report
     of the Globus project. Draft version 2/17/2002. www.globus.org/research/papers/ogsa.pdf

 [4] Foster, I., Kesselman, C. and Tuecke, S. The Anatomy of the Grid: Enabling scalable virtual
     organisations. Intl J. High Performance Computing Applications, vol. 15, no. 3, 2001, pp.
     200-222; http://www.globus.org/research/papers/anatomy.pdf.

 [5] Fremantle, P., Weerawarana, S. and Khalaf, R. Enterprise Services. ISSN: 0001-0782

 [6] Mitra, N. SOAP Version 1.2 Part 0: Primer. W3C working draft 26 June 2002.
     http://www.w3.org/TR/2002/WD-soap12-part0-20020626

 [7] Nagy, W.A., Ballinger, K. The WS-Inspection and UDDI relationship. November 2001,
     http://www-106.ibm.com/developerworks/webservices/library/ws-wsiluddi.html

 [8] Sandholm, T., Seed, R. and Gawor, J. OGSI Technology Preview Core – A Grid Service
     Container Framework. http://www.globus.org

 [9] Shohoud, Y. Introduction to WSDL. Copyright© 2000-2001 DevXpert Corporation.
     http://www.learnxmlws.com/tutors/wsdl/wsdlprint.asp

 [10] Stal, M. Web Services: beyond component-based computing. ISSN: 0001-0782

 [11] Tanenbaum, A.S., van Steen, M. Distributed Systems, principles and paradigms. Prentice Hall
      2002. ISBN: 0-13-088893-1

 [12] Tuecke S, et al., Grid Service Specification. Draft 3 (7/17/2002)
      http://www.gridforum.org/ogsi-wg

 [13] Globus Toolkit™ 3.0 FAQ, November 19, 2002. http://www.globus.org/about/faq.html

 [14] UDDI specification, available at http://www.uddi.org

 Software References
 [15] For the examples, the Technology Preview 4.0 of the Globus Toolkit 3 was used. Technology
      previews and alpha releases (including documentation) are available for download at
      http://www.globus.org/ogsa/releases/alpha/
      and http://www-unix.globus.org/ogsa/docs/alpha3/java_programmers_guide.html

 [16] The Java2WSDL tool is part of the Jakarta Ant package. It is available at
      http://jakarta.apache.org/ant/

 [17] The JAX-RPC package as well as supporting documentation can be found at
      http://java.sun.com/xml/jaxrpc/
