An Overview of Computational Grid Technologies

   Marlon Pierce
   Community Grids Laboratory
   Indiana University
Grids in I533 Context

   [Figure: layered stack of Grid services. Labels, top to bottom:
    Client Environments (Portals, Taverna, etc.);
    Workflow, Information, Sharing, Ontology;
    PubChem, Gaussian, Data Mining, Logical File Systems, etc.;
    General Data Services, General Exec Services, General File Services;
    Web Service Core Specifications.
    Side label: Security, Reliability, etc.]
   (Verbal description on next slide)
Grids in I533 Context
   I533 covers a diverse set of topics.
   (Web) Services are the core abstraction
     Execution Services: computational chemistry, data mining, text
     Data Services: PubChem, OGSA-DAI

     Information and metadata services: Ontologies, information
       discovery and sharing.
     Orchestration services (workflow): Taverna, BPEL, etc.

   Grids are collections of services with some glue
     Decentralized security, information system agreements (from
       monitoring to metadata), abstract execution protocols, etc.
     Service Oriented Architecture
    Brief History of Grids
   The term “Grid Computing” was coined by Dr. Larry Smarr, then
    director of NCSA, back in 1992.
   The original concept: computing power should be available on
    demand, for a fee.
       Just like the electrical power grid.
   Today, Grids are thought of as federations of services that span
   Grids are usually driven by science applications.
       Most core funding from the DOE, NSF, UK e-Science, and other scientific
        agencies in the EU, Japan, China, Korea, etc.
           These agencies all cooperate to some degree.
       DOD has its own version of things, the Global Information Grid, that is
        currently unrelated.
       IBM, MS, Oracle, Sun, etc. have varying degrees of interest.

Grid Computing Research
   Historically, grid computing has been targeted at simplifying
    access to high performance computing and giant scientific data
     Example: NSF TeraGrid includes both hardware and software
       along with a common administration infrastructure.
           IU is one of the partners.
   There are many overviews of Grid computing.
       See for example Globus World presentations from 2004, 2005
       Show lots of “gee whiz” pictures of big science problems using
        the Grid.
       Usually mention seti@home, and more recently, Google and
       These annoy me.
           Seti@home has nothing to do with Grid computing.
Grid Computing Research
   Grid computing is large scale
    distributed computing
       “Middleware”
   It’s not the pervasive computing power Grid
    originally envisioned.
   As long as it’s research, we get
    to keep working on it.
   I’ll examine some key
    technologies for building a Grid
    installation, but not “the” Grid.
   [Image caption: “There is no Grid!” Dr. Dave Semeraro has his doubts.]
    Some Desirable Grid Characteristics
   Grids are collections of services.
     Accessing computational facilities to run codes.

     Accessing remote databases, data warehouses and file systems.

     Transferring large data sets.

     Accessing remote instruments and sensors.

   Collections are created from multiple partners: Virtual Organizations
     Must support decentralized management.

     Common security abstraction layer
           Authentication: required and solved.
           Authorization: Research 4Ever!
       Common information infrastructure
           Monitoring hardware and networks: required and solved
           Finding resources (i.e. “Semantic Grid”) Research 4Ever!
     Ex: TeraGrid combines NCSA, SDSC, IU, TACC, ORNL, Purdue, ...
   Generations
     Generation 1: UNIX daemons, command-line clients, protocol-
     Generation 2: Based on Web Service standards
   [Figure: Virtual Organization View of Deployment. Several Virtual
    Organisations are drawn across several Physical Organisations.]
   I. Foster,
   Grid Computing Software Examples
Software (origin)                 Description
Globus Toolkit (ANL, ISI)         Job managers for science applications, Grid security
                                  frameworks, file management tools, etc.
Condor (UW)                       A job scheduler and cycle scavenger optimally
                                  running applications on available resources. “High
                                  throughput computing”
Storage Resource Broker (SDSC)    Middleware that provides a uniform interface for
                                  connecting to heterogeneous data resources over a
                                  network and accessing replicated data sets.
OMII                              UK e-Science program’s software arm.
OGSA-DAI (U. Edinburgh)           From UK e-Science program. Wraps XML and
                                  relational databases as Grid services and provides a
                                  workflow client library for query processing.
Making Interoperable Tools
   There are a large number of Grid-related research
    projects and tools.
   They need some common protocols
       Not just wire protocols but also security procedure
   Two most important
       GSI: A global security system
       GRAM: a global method for executing remote operations.
   Grid standards and would-be standards are defined
    through the Global Grid Forum.
   We will concentrate on the Globus Toolkit in these
    lectures, but GSI and GRAM are important to
    several other projects.
       Condor, SRB, Sun Grid Engine, etc.
    Globus Services Landscape

Grid Security Infrastructure

    An overview
Grid Security Infrastructure Keywords
    Public Key Infrastructure (PKI)
      Most Grids use asymmetric encryption keys

      Based on OpenSSL but with GSSAPI extensions

      Users have a public key and a private key.
            Public keys can decrypt messages encrypted by private keys and
             vice versa.
            Public key: encrypts a message
            Private key: signs a message. Only you have the private key, so
             only you can generate that specific signature.
        I encrypt with your public key and sign with my private key.
            Only you can decrypt, and you know it came from me.
    PKI tools are part of Java’s SDK, so try them out.
    Certificate Authorities: establishing trust.
      Can you trust a public key?

      Yes, if you trust the signer.
      Large Grids have CAs.

      You can run your own with SimpleCA.

      CAs can be hierarchical.
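The "encrypt with your public key, sign with my private key" exchange can be sketched with textbook RSA. This is a toy under loud assumptions: the primes are tiny, there is no padding or hashing, and the names (make_keypair, MY_PUBLIC, etc.) are invented for illustration. It is not how GSI or OpenSSL implement PKI, and it needs Python 3.8+ for the modular inverse.

```python
# Toy textbook RSA: tiny primes, no padding. For illustration only.

def make_keypair(p, q, e=17):
    """Return ((e, n), (d, n)): a public key and its matching private key."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)          # private exponent (Python 3.8+)
    return (e, n), (d, n)

def apply_key(value, key):
    """RSA is the same operation for both keys: value^exponent mod n."""
    exponent, n = key
    return pow(value, exponent, n)

# Hypothetical key pairs for the two parties.
MY_PUBLIC, MY_PRIVATE = make_keypair(61, 53)
YOUR_PUBLIC, YOUR_PRIVATE = make_keypair(89, 97)

message = 42                                   # must be smaller than both moduli

# Sender: encrypt with the recipient's public key, sign with own private key.
ciphertext = apply_key(message, YOUR_PUBLIC)
signature = apply_key(message, MY_PRIVATE)

# Recipient: decrypt with own private key, check signature with sender's public key.
recovered = apply_key(ciphertext, YOUR_PRIVATE)
signature_ok = apply_key(signature, MY_PUBLIC) == recovered
```

Only the holder of YOUR_PRIVATE can recover the message, and only the holder of MY_PRIVATE could have produced a signature that checks out against MY_PUBLIC.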
More Keywords: GSS API
   Generic Security Service API (GSSAPI)
     PKI is slow and symmetric keys are much faster.

     GSSAPI establishes a “context” between two communicators by
      sharing a secret symmetric session key.
     Very similar protocol to WS-SecureConversation

   Java implementation part of standard SDK release.
     Try it out, but it requires Kerberos

   GSI uses the GSSAPI to establish security contexts.
   We will see how to program clients in the next lecture.
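Why a shared session key helps can be sketched as follows. The handshake itself is not shown: establish_context below simply generates a random key and stands in for whatever negotiation the real GSSAPI mechanism performs. The function names are hypothetical, and the per-message protection here is a plain HMAC, chosen only to show that symmetric operations are cheap once the context exists.

```python
import hashlib
import hmac
import secrets

TAG_LENGTH = 32  # SHA-256 digest size in bytes

def establish_context():
    """Stand-in for the handshake: both sides end up holding this key."""
    return secrets.token_bytes(32)

def protect(session_key, payload):
    """Prefix the payload with an integrity tag computed from the session key."""
    tag = hmac.new(session_key, payload, hashlib.sha256).digest()
    return tag + payload

def unprotect(session_key, wire_bytes):
    """Check the tag and return the payload, or raise if it was altered."""
    tag, payload = wire_bytes[:TAG_LENGTH], wire_bytes[TAG_LENGTH:]
    expected = hmac.new(session_key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("message failed integrity check")
    return payload

session_key = establish_context()
wire = protect(session_key, b"submit job")
received = unprotect(session_key, wire)
```

The slow public key step happens once per context; every later message only costs a symmetric operation.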
Single Sign On and Delegation
   Single Sign On
       A “Grid” implies that you can access lots of machines, but
        not necessarily anonymously.
           Charged for usage: supercomputer centers issue allocations.
       SSO is the ability to login once, get a ticket, and access
        many machines without constantly providing username and
       GSI is very similar to a somewhat older system called
        Kerberos, which you can still get.
   Delegation is the security concept that supports this.
       In practice, GSI handles delegation by resigning
       Take advantage of hierarchical CA organization for trust.
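Delegation by signing down a chain can be sketched like this: a CA signs the user's public key, the user signs a short-lived proxy key, and a verifier that trusts only the CA walks the chain. Everything here is an assumption made for illustration: toy RSA with tiny primes, a dictionary standing in for a certificate, and invented names (issue, verify_chain). Real GSI uses X.509 proxy certificates.

```python
import hashlib

def make_keypair(p, q, e=17):
    """Toy RSA key pair: ((e, n), (d, n)). Tiny primes, illustration only."""
    n, phi = p * q, (p - 1) * (q - 1)
    return (e, n), (pow(e, -1, phi), n)

def digest(subject, public_key, modulus):
    """Hash the certificate contents and reduce into the signer's modulus."""
    data = f"{subject}|{public_key[0]}|{public_key[1]}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % modulus

def issue(subject, public_key, issuer_private):
    """The issuer signs (subject, public_key) with its private key."""
    d, n = issuer_private
    return {"subject": subject, "pub": public_key,
            "sig": pow(digest(subject, public_key, n), d, n)}

def verify_chain(chain, trusted_ca_public):
    """Each certificate must be signed by the key of the one before it."""
    signer_public = trusted_ca_public
    for cert in chain:
        e, n = signer_public
        if pow(cert["sig"], e, n) != digest(cert["subject"], cert["pub"], n):
            return False
        signer_public = cert["pub"]   # next link is signed by this key
    return True

ca_pub, ca_priv = make_keypair(61, 53)
user_pub, user_priv = make_keypair(89, 97)
proxy_pub, proxy_priv = make_keypair(107, 109)

user_cert = issue("user", user_pub, ca_priv)        # CA vouches for the user
proxy_cert = issue("proxy", proxy_pub, user_priv)   # user delegates to a proxy
chain_ok = verify_chain([user_cert, proxy_cert], ca_pub)
```

The remote service never sees the user's long-term private key; it only needs the proxy key plus a chain that leads back to a CA it trusts.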
Credential Delegation in GSI

Butler et al,
    A Public Key
more usercert.pem
Bag Attributes
   localKeyID: 01 00 00 00
   Pierce 64229
issuer= /DC=org/DC=DOEGrids/OU=Certificate
   Authorities/CN=DOEGrids CA 1
----------------------[Stuff deleted]---------------------------------
    A Private Key
more userkey.pem
Bag Attributes
   localKeyID: 01 00 00 00 Microsoft Enhanced Cryptographic Provider v1.0
   friendlyName: 6f50c542f27d23ca349e371673b2ff8d_2586cc29-aa58-
Key Attributes
   X509v3 Key Usage: 10
Proc-Type: 4,ENCRYPTED
DEK-Info: DES-EDE3-CBC,42533BEF0D5016EB

-----------------------[Stuff Deleted]-----------------------------------------------------------
    MyProxy Credential Repository
   Private keys are troublesome
    and dangerous.
       You need to put one on every
        machine that you may use for
        initial login.
       This increases chance it will
        get stolen.
       Can be placed on expensive
        smart cards.
   Solution: MyProxy Server
       On-line credential repository.
       Issues short-term keys to any
        client that knows the
        username and password.
       Very convenient for Web
        portal applications.

               J. Basney,
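The idea of an online repository that hands out short-term credentials in exchange for a username and password can be sketched as below. This is not MyProxy's protocol or API: the class, its methods, and the dictionary used as a "credential" are all hypothetical, and a random token stands in for a real short-lived proxy key.

```python
import hashlib
import hmac
import secrets

class ToyCredentialRepository:
    """In-memory stand-in for an online credential repository."""

    def __init__(self):
        self._accounts = {}   # username -> (salt, password hash)

    @staticmethod
    def _hash(password, salt):
        return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

    def register(self, username, password):
        salt = secrets.token_bytes(16)
        self._accounts[username] = (salt, self._hash(password, salt))

    def get_short_term_credential(self, username, password, now, lifetime_s=3600):
        """Return a credential that expires lifetime_s seconds after now."""
        if username not in self._accounts:
            raise PermissionError("unknown user")
        salt, stored = self._accounts[username]
        if not hmac.compare_digest(stored, self._hash(password, salt)):
            raise PermissionError("bad password")
        return {"user": username,
                "token": secrets.token_hex(16),   # placeholder for a proxy key
                "expires": now + lifetime_s}

def is_valid(credential, now):
    """A short-term credential is only usable before its expiry time."""
    return now < credential["expires"]

repo = ToyCredentialRepository()
repo.register("alice", "s3cret")
cred = repo.get_short_term_credential("alice", "s3cret", now=1000, lifetime_s=60)
```

A Web portal would call something like get_short_term_credential at login and then act for the user until the credential expires, so the long-term key never leaves the repository.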
Grid as a Virtual Organization

   Now that we have an SSO, we can set this up
    across many different partner sites.
   Use one super-CA or at least mutually trust our
    partner CAs.
       That is, my org will trust messages signed by your CA.
   This is the beginnings of a “Virtual Organization”.
   Real organizations contribute resources to the VO.
   VOs can be long-lived.
       TeraGrid, Open Science Grid
   Ad-hoc Grids are more of a research issue.
    GSI in Action: GridFTP
   GSI is not a service itself.
   You use it to build secure services.
   These services inherit several capabilities
       They can authenticate to each other.
       Messages are secure
           Encrypted, non-repudiated, tamper-proof, replay-proof, etc.
       You can delegate two remote services to take an action on your
   GridFTP is an example of a GSI enabled service.
       File operations and transfers, based on standard IETF FTP protocol.
       Supports parallel TCP
       Supports striping: several GridFTP servers can act as a logical
        GridFTP server, each working on a different data subset.
   A nice summary:
  GridFTP Third Party Transfer Cartoon
   [Figure: a Client holding a Credential issues the request “Move File X
    to Host B.” Host A runs the GridFTP Source Server and Host B runs the
    GridFTP Destination Server.]
GridFTP Clients
   Command line clients
       globus-url-copy
       uberftp
   Programming interfaces: build your own
       Java and Python CoG Kits
       Java CoG reviewed next lecture.
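A third party transfer with the command line client can be sketched by building the globus-url-copy argument list from Python. The host names and paths are made up, and the sketch only constructs the command; running it would need a Globus installation and a valid proxy credential. The -p option is used here on the assumption that it sets the number of parallel streams.

```python
def third_party_copy_command(source_url, destination_url, parallel_streams=None):
    """Build the argv for a globus-url-copy transfer between two servers."""
    command = ["globus-url-copy"]
    if parallel_streams:
        command += ["-p", str(parallel_streams)]   # assumed: parallel TCP streams
    command += [source_url, destination_url]
    return command

# Hypothetical hosts: move File X from Host A to Host B.
argv = third_party_copy_command(
    "gsiftp://hosta.example.org/data/fileX",
    "gsiftp://hostb.example.org/data/fileX",
    parallel_streams=4,
)
command_line = " ".join(argv)
```

The list form can be handed to subprocess.run(argv, check=True) on a machine where the client is installed.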
Grid Resource Allocation
Management (GRAM)
What Is GRAM?
   GRAM is a protocol for mapping generic user requests to specific
   Heritage: must execute jobs on supercomputers.
     Interactive: use Unix fork.

     Queue Systems: PBS, LSF, Condor, Sun Grid Engine, etc.

   This must take place as the user.
     Allocation accounting, logging, general peace of mind at stodgy
       HPC centers.
   Note this is very different from e-Business.
     You don’t need a database account to buy something from
  Pre-Web Service GRAM Components
   [Figure: pre-Web Service GRAM components.
    The Client makes MDS client API calls to the MDS Grid Index Info Server
    to locate resources, and to the MDS Grid Resource Info Server to get
    resource info.
    Across the site boundary, through the Globus Security Infrastructure,
    GRAM client API calls reach the Gatekeeper to request resource
    allocation and process creation.
    The Gatekeeper creates a Job Manager, which parses the request with the
    RSL Library, has the Local Resource Manager allocate and create
    processes, and monitors and controls each Process.
    GRAM client API state change callbacks go back to the Client.
    The Grid Resource Info Server queries the current status of the resource.]

GRAM Job Specifications

   The major purpose of GRAM is to execute one or
    more remote commands on the user’s behalf.
       Abstract UNIX shell, PBS, Condor, etc.
   So how do you specify the command?
   Pre-Web Service Grids (i.e. based on Globus 2)
    use the Resource Specification Language (RSL).
   Web Service Grids (i.e. based on Globus 4) use the
    XML Job Description Language.
GRAM Client Tools
   You can execute remote commands using client tools
   We will develop Java clients next time.
   GT 2 command line examples (with RSL)
     globusrun: all purpose client

     globus-job-run: interactive jobs
     globus-job-submit: batch jobs

     globus-job-cancel: stop batch jobs

   GT 4 command line examples (with JDL)
     globusrun-ws: all purpose client
     globus-job-run-ws: interactive job submission

     globus-job-submit-ws: batch job submission

     globus-job-clean-ws: stop batch jobs.
Sample RSL String
   The following runs the UNIX echo and the
   This is an argument to globusrun.
   Use this to execute “echo” and “mpi-hello”.
(* Multijob Request *)
+(&(executable = /bin/echo)
   (arguments = Hello, Grid From Subjob 1)
   (resource_manager_name =
  (count = 1)
( &(executable = mpi-hello)
  (arguments = Hello, Grid From Subjob 2)
  (resource_manager_name =
   (count = 2)
   (jobtype = mpi)
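The shape of a multijob RSL string can be sketched with a small generator: each subjob is a conjunction (&) of (attribute = value) relations, and a leading + joins the subjobs into one request. The helper names are invented, and the resource manager contact is deliberately left out because the value is not shown above.

```python
def rsl_subjob(attributes):
    """Format one subjob as (&(attr = value)(attr = value)...)."""
    relations = "".join(f"({name} = {value})" for name, value in attributes.items())
    return f"(&{relations})"

def rsl_multijob(subjobs):
    """Join subjobs into a single multijob request with a leading +."""
    return "+" + "".join(rsl_subjob(subjob) for subjob in subjobs)

request = rsl_multijob([
    {"executable": "/bin/echo",
     "arguments": "Hello, Grid From Subjob 1",
     "count": 1},
    {"executable": "mpi-hello",
     "arguments": "Hello, Grid From Subjob 2",
     "count": 2,
     "jobtype": "mpi"},
])
```

The resulting string is what would be passed as the argument to globusrun, once each subjob also names its resource manager.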
      A Very Simple Job Description
       <argument>this is an example string </argument>
        <name>PI</name> <value>3.141</value>
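A complete small job description can be sketched by generating the XML from Python. Only the argument, name, and value elements appear in the fragment above; the job, executable, and environment element names are assumptions about the GT4 schema and should be checked against the schema itself before use.

```python
import xml.etree.ElementTree as ET

def job_description(executable, arguments, environment):
    """Build a job description document (element names partly assumed)."""
    job = ET.Element("job")                                # assumed root element
    ET.SubElement(job, "executable").text = executable     # assumed element name
    for argument in arguments:
        ET.SubElement(job, "argument").text = argument
    for name, value in environment.items():
        variable = ET.SubElement(job, "environment")       # assumed wrapper
        ET.SubElement(variable, "name").text = name
        ET.SubElement(variable, "value").text = value
    return ET.tostring(job, encoding="unicode")

xml_text = job_description(
    "/bin/echo",
    ["this is an example string "],
    {"PI": "3.141"},
)
```

Writing xml_text to a file gives a document of the kind that the GT 4 command line clients take as input.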
More Details on Job Submission
   The full Job Description Schema is here:
   You can do much more complicated things.
     Run sequences of jobs.

     Stage files with GridFTP.

     Delegate jobs to other GRAMs.

   But this is controversial.
     Lots of people have worked on job management workflow
     Several based on Apache Ant, for example.

     BPEL is the Web Service standard.
Grids and Web Services
Grids and Web Services
   The requirements of Grids are very similar to those of
    Service Oriented Architecture-based systems.
   Grid and Web Service integration began in 2002.
     Open Grid Services Architecture: “Physiology of the Grid”
       paper by Foster et al.
     Aborted start in Globus Toolkit 3, OGSI

     Current Globus Toolkit 4 much more successful.

   OGSA-DAI, Condor, and SRB all have Web Service
   Many UK e-Science projects also follow a similar approach.
     Sometimes referred to as the “WS-I+” approach to distinguish
       it from the Globus/IBM approach.
     See
     See OMII releases
         GT4 GRAM Structure: WSRF/WSN
         Poster Child
   [Figure: service host(s) and compute element(s). Surviving labels:
    GT4 Java Container with GRAM services, Delegation, and RFT File;
    Local job control; Compute element with GRAM, Local scheduler, and
    User job; two GridFTP boxes with FTP and FTP data links; storage;
    request.]
      Reliable File Transfer: Third Party Transfer

   Fire-and-forget transfer
   Web services interface
   Many files & directories
   Integrated failure recovery
   [Figure: an RFT Client sends SOAP Messages to the RFT Service and
    receives Notifications (Optional). The RFT Service drives two GridFTP
    Servers. Each server has a Master DSI, a Protocol Interpreter, and a
    Data Channel, plus an IPC Link to an IPC Receiver with a Slave DSI and
    its own Data Channel. The Data Channels of the two servers face each
    other.]
Grid Web Service Extensions

   WSDL and SOAP form the core of Grid
   WS-Addressing and WS-Security family are
   Globus and friends are working to extend
    core Web Service standards through OASIS.
       WS-Resource Framework (WSRF): modeling
        stateful resources.
       WS-Notification: Web Service version of one-to-
        many messaging.
Stateful Resources and Grids
   Web Service Architectures and thus Grids are really message
    oriented, not RPC based.
     All state should be in the SOAP message.

     This allows messages to go through many SOAP intermediaries.

    Request/response does not really map to Grid requirements.
     Services may take hours or days to complete, so need callbacks.

           Ex: computational chemistry codes on TeraGrid, RFT for many TB of
       Services may need to push information to listeners.
           “Big file 1 is done, now move big file 2”
   Grid resources may also come and go.
     Instruments typically generate data at scheduled times.

     Down for maintenance, upgrades, reconfiguration, etc.

   WSRF and WS-Notification attempt to solve these Grid
Web Service Resource Framework

   WSRF is a collection of WSDL specifications
    and associated messages.
       WS-Resource
       WS-ResourceProperties
       WS-ResourceLifetime
       WS-ServiceGroup
       WS-BaseFault
   See http://www.oasis-

   The WS-Resource decouples a (stateful) resource
    from the Web Service that accesses it.
   For example, a database is a resource that may be
    accessed through a Web Service.
   The resource may be defined by metadata.
       Our database needs to provide clues to the type of data it
       Need this for discovery.
       This metadata is contained in WS-ResourceProperties
Goals of WS-ResourceProperties
   Provide a metadata property framework for describing resources.
   Provide a Web Service interface for performing operations on these
    properties.
       Query and retrieve properties.
       Update values on a resource (controversial).
       Subscribe to property
   Use XML Schemas to hold WSDL message definitions that define the
    resource properties.
   Associate these messages with WSDL portTypes.
   The actual values of the Schema are in an XML document.
       Store it in memory, put it in a database, derive it at
        query time, ...
   This requires some understanding of WSDL and SOAP.
       Upcoming lecture will cover this.
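The split between a resource's property document and the operations on it can be sketched as follows. The XML element names (DatabaseProperties, DataType, RecordCount) and the class are invented for illustration, and plain method calls stand in for the SOAP operations a real service would expose.

```python
import xml.etree.ElementTree as ET

# Hypothetical property document for a database resource, held in memory.
PROPERTIES_XML = """
<DatabaseProperties>
  <DataType>chemical structures</DataType>
  <RecordCount>1200</RecordCount>
</DatabaseProperties>
"""

class ResourcePropertiesDocument:
    """Query and update operations over one resource's property document."""

    def __init__(self, xml_text):
        self._root = ET.fromstring(xml_text)

    def get_property(self, name):
        element = self._root.find(name)
        if element is None:
            raise KeyError(name)
        return element.text

    def set_property(self, name, value):
        element = self._root.find(name)
        if element is None:
            element = ET.SubElement(self._root, name)
        element.text = str(value)

properties = ResourcePropertiesDocument(PROPERTIES_XML)
data_type = properties.get_property("DataType")
properties.set_property("RecordCount", 1201)
```

A discovery client would only need the query side: it asks for DataType to decide whether this database is worth contacting.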
Goals of WS-ResourceLifetime

   Resources may have lifetimes.
       For example, your quantum chemistry calculation
        may take a few hours.
   This may be associated with a WS-Resource.
   WS-ResourceLifetime defines methods for
       Destroying a resource at some future time (and
        t=0 allowed).
       Learning the lifetime of a resource.
       Extending the lifetime of a resource.
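The three lifetime operations can be sketched on a plain object. The class and method names are hypothetical, times are plain numbers supplied by the caller, and nothing here follows the actual WS-ResourceLifetime message formats.

```python
class ManagedResource:
    """A resource with an optional scheduled termination time."""

    def __init__(self, name, termination_time=None):
        self.name = name
        self.termination_time = termination_time   # None: no scheduled end

    def set_termination_time(self, when):
        """Schedule destruction, or extend the lifetime by passing a later time."""
        self.termination_time = when

    def destroy(self, now):
        """Immediate destruction is the t=0 case of scheduled destruction."""
        self.termination_time = now

    def get_termination_time(self):
        """Let a client learn the lifetime of the resource."""
        return self.termination_time

    def is_alive(self, now):
        return self.termination_time is None or now < self.termination_time

# A calculation expected to take a few hours (times in seconds).
job = ManagedResource("quantum-chemistry-run", termination_time=3 * 3600)
job.set_termination_time(5 * 3600)    # the run needs longer: extend it
alive_at_four_hours = job.is_alive(4 * 3600)
```

A client that loses interest in the result would call destroy instead of waiting for the scheduled time to pass.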
WS-Notification Core Specs

   WS-BaseNotification
       Specs for controlling publications and subscriptions of
        events (i.e. resource property changes.)
       Subscribers subscribe directly to publishers.
   WS-Topics
       Topics are used to organize messages.
       You may publish or subscribe to a topic rather than a
        specific resource endpoint.
   WS-BrokeredNotification
       Brokers decouple publishers from subscribers.

   Stateful resources will need to notify one or
    more listeners when their state changes.
   For example, a Web lecture has many
       Beginning and end of the lecture.
       Changes in slides.
       To my knowledge, no one has tried this.
   Real examples based on WS-GRAM, RFT.
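Topic-based notification through a broker can be sketched in a few lines. The class, the topic string, and the callback signature are all made up, and in-process function calls stand in for the SOAP messages that WS-Notification would actually exchange.

```python
from collections import defaultdict

class NotificationBroker:
    """Decouples publishers from subscribers by routing on topic name."""

    def __init__(self):
        self._subscribers = defaultdict(list)   # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver the message to every subscriber; return how many got it."""
        callbacks = list(self._subscribers[topic])
        for callback in callbacks:
            callback(topic, message)
        return len(callbacks)

broker = NotificationBroker()
received = []

# A listener that reacts to transfer events on a hypothetical topic.
broker.subscribe("transfer/state", lambda topic, message: received.append(message))

delivered = broker.publish("transfer/state", "Big file 1 is done")
ignored = broker.publish("lecture/slide-changed", "slide 12")
```

The publisher never learns who is listening, which is the point of the brokered variant; in the direct variant the publisher would hold the subscriber list itself.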
A Skeptical View of WSRF
   WSRF has several independent implementations.
     WSRF.NET (UV), Python (LBL), Perl (UK), C/C++ (ANL), ...

   But is this critical mass?
     What about MS, Oracle, and other big Web Service players?

   OASIS specification approval is glacial.
     Many specs, even if approved, have died on the vine for lack of
     Many more are a mess because of complicated dependencies.
         WS-Addressing has released many versions, screwing up many
          dependent specs.
   Competing specs exist.
     MS’s WS-Eventing, for example.

   “Semantic Grid” using an entirely different approach for
     RDF, OWL provide more natural modeling of metadata than tree-
        based XML Schemas.
   Ignores UDDI as an information system.
   I ran out of room.
Future Challenges

   Real time interaction
   Joy of use
   Intuitive user interface
   Global scalability
       1000s of simultaneous users
       Addictive
   (Observation courtesy Prof. Fran Berman)