Lecture 3 - Grid and Middleware

					        Grid Computing (3)
(Special Topics in Computer Engineering)


            Veera Muangsin
           13 February 2004

                                           1
                   Outline
•   High-Performance Computing
•   Grid Computing
•   Grid Applications
•   Grid Architecture
• Grid Middleware
• Grid Services




                                 2
            Before the Grid
[Diagram: User and Application sit above the Network, which connects Site A and Site B]
The user is responsible for resolving the complexities of the environment.
• independent sites
• independent hardware and software
• independent user ids
• security policy requiring local connection to the machine

                                              3
                   First Step to the Grid
[Diagram: User and Application sit above the Network and a Centralized Scheduler and file staging layer, which serve Site A and Site B]
A layer of abstraction is added that hides some of the complexities associated with running jobs in a distributed computing environment; however, limitations exist.
Metacenter
• Two or more resources connected in a controlled user environment
Constraints
• common architecture
• single name space
• common scheduler
                                                                   4
                             The Grid Today
[Diagram: User and Application sit above the Grid Middleware, Infrastructure, and Network layers, which connect Site A and Site B]
1 Request info from the grid
2 Get response
3 Make selection and submit job
The underlying infrastructure is abstracted into defined APIs, thereby simplifying developer and user access to resources; however, this layer is not intelligent.
Common Middleware
- abstracts independent hardware, software, and user ids into a service layer with defined APIs
- comprehensive security
- allows for site autonomy
- provides a common infrastructure based on middleware

                                                                          5
                   The Near Future Grid
[Diagram: User and Application sit above Intelligent, Customized Middleware; Grid Middleware - Infrastructure APIs (service oriented); Infrastructure; and Network, which connect Site A and Site B]
Resources are accessed via various intelligent services that access infrastructure APIs.
The result: the Scientist and Application Developer can focus on science and not on systems management.
Customizable Grid Services built on defined Infrastructure APIs
• automatic selection of resources
• information products tailored to users
• accountless processing
• flexible interface: web based, command line, APIs

                                                                      6
       Layered Grid Architecture
  (By Analogy to Internet Architecture)
Grid layer     Role                                                         Internet Protocol Architecture
Application                                                                 Application
Collective     “Coordinating multiple resources”: ubiquitous                Application
               infrastructure services, app-specific distributed services
Resource       “Sharing single resources”: negotiating access,              Application
               controlling use
Connectivity   “Talking to things”: communication (Internet                 Transport, Internet
               protocols) & security
Fabric         “Controlling things locally”: access to, & control of,       Link
               resources

                                                             7
                               Grid Components
Grid Apps. (Applications and Portals)
  Scientific, Engineering, Collaboration, Prob. Solving Env., …, Web enabled Apps
Grid Tools (Development Environments and Tools)
  Languages, Libraries, Debuggers, Monitoring, Resource Brokers, …, Web tools
Grid Middleware (Distributed Resources Coupling Services)
  Comm., Sign on & Security, Information, Process, Data Access, …, QoS
Grid Fabric
  Local Resource Managers: Operating Systems, Queuing Systems, Libraries & App Kernels, …, TCP/IP & UDP
  Networked Resources across Organisations: Computers, Clusters, Storage Systems, Data Sources, …, Scientific Instruments

                         Example:
             High-Throughput Computing System

App                   High Throughput Computing System
Collective (App)      Dynamic checkpoint, job management, failover, staging
Collective (Generic)  Brokering, certificate authorities
Resource              Access to data, access to computers,
                      access to network performance data
Connect               Communication, service discovery (DNS),
                      authentication, authorization, delegation
Fabric                Storage systems, schedulers
[Side diagram: API/SDK, C-point Protocol, Checkpoint Repository; API/SDK, Access Protocol, Compute Resource]


                                                           9
                          Example:
                    Data Grid Architecture

App                   Discipline-Specific Data Grid Application
Collective (App)      Coherency control, replica selection, task management,
                      virtual data catalog, virtual data code catalog, …
Collective (Generic)  Replica catalog, replica management, co-allocation,
                      certificate authorities, metadata catalogs
Resource              Access to data, access to computers, access to network
                      performance data, …
Connect               Communication, service discovery (DNS),
                      authentication, authorization, delegation
Fabric                Storage systems, clusters, networks, network caches, …

                                                                     10
             Globus Toolkit
• Grid computing middleware
  – Software between the hardware and high-level
    services
  – Basic libraries, services, command-line programs
• Most common middleware used in grids
• Integrated with Web Services




                                                 11
            Globus Software Architecture
• Monitoring and Discovery Service (MDS), built on LDAP (distributed directory service)
  – information about resources and services
• Grid SSH
  – login
  – execute commands
  – copy files
• Grid FTP
  – get and put files
  – 3rd party copy
  – interactive file management
  – parallel transfers
• Globus Resource Allocation Manager (GRAM), layered on job management systems (PBS, LSF, fork/exec)
  – execute remote applications
  – stage executable, stdin, stdout, stderr
• Grid Security Infrastructure (GSI), built on X.509 Certificates and SSL/TLS; underlies all of the above
  – credentials for users, services, hosts
  – authentication
  – secure communication
  – single sign on
  – delegation of credentials
  – authorization
                                                                                      12
   Globus Deployment Architecture
• Globus client system
  – Used by the User, a user application/tool, or a Web portal
  – Grid FTP Client, GRAM Client, Grid SSH Client, MDS Client
  – Clients are programs and libraries
• MDS server system
  – MDS GIIS
• Globus server system (two shown)
  – Grid FTP Server, GRAM Server, Grid SSH Server, MDS GRIS
  – Local job management system: PBS on one server system, LSF on the other



                                                                                                       13
           Globus Toolkit™
• A software toolkit addressing key technical
  problems in the development of Grid enabled
  tools, services, and applications
  – Offer a modular “bag of technologies”
  – Enable incremental development of grid-enabled
    tools and applications
  – Implement standard Grid protocols and APIs
  – Make available under liberal open source license


                                                14
            General Approach
• Define Grid protocols & APIs
  – Protocol-mediated access to remote resources
  – Integrate and extend existing standards
  – “On the Grid” = speak “Intergrid” protocols
• Develop a reference implementation
  – Open source Globus Toolkit
  – Client and server SDKs, services, tools, etc.
• Grid-enable wide variety of tools
  – Globus Toolkit, FTP, SSH, Condor, SRB, MPI, …

                                                    15
           Four Key Protocols
• The Globus Toolkit™ centers around four
  key protocols
  – Connectivity layer:
     • Security: Grid Security Infrastructure (GSI)
  – Resource layer:
     • Resource Management: Grid Resource Allocation
       Management (GRAM)
     • Information Services: Grid Resource Information
       Protocol (GRIP)
     • Data Transfer: Grid File Transfer Protocol
       (GridFTP)
                                                         16
The Globus Toolkit™:
Security Services
The Globus Project™
  Argonne National Laboratory
USC Information Sciences Institute

      http://www.globus.org


                                     17
       Why Grid Security is Hard
• Resources are often located in distinct administrative
  domains
   – Each resource has own policies & procedures
• Set of resources used by a single computation may be
  large, dynamic, and unpredictable
   – Not just client/server, requires delegation
• It must be broadly available & applicable
   – Standard, well-tested, well-understood protocols; integrated
     with wide variety of tools



                                                            18
    Grid Security Infrastructure (GSI)
• Extensions to standard protocols & APIs
  – Standards: SSL/TLS, X.509 & CA, GSS-API
  – Extensions for single sign-on and delegation
• Globus Toolkit reference implementation of GSI
  – SSLeay/OpenSSL + GSS-API + SSO/delegation
  – Tools and services to interface to local security
     • Simple ACLs; SSLK5/PKINIT for access to K5, AFS; …
  – Tools for credential management
     •   Login, logout, etc.
     •   Smartcards
     •   MyProxy: Web portal login and delegation
     •   K5cert: Automatic X.509 certificate creation
                                                        19
                                   GSI in Action
     “Create Processes at A and B that Communicate & Access Files at C”
1. User: single sign-on via “grid-id” & generation of proxy cred.
   (or retrieval of proxy cred. from online repository); the User Proxy
   holds the proxy credential
2. User Proxy sends remote process creation requests* to the GSI-enabled
   GRAM servers at Site A (Kerberos) and Site B (Unix)
   – Each GRAM server: authorize, map to local id, create process,
     generate credentials
3. The process on the Site A computer runs with a local id, a Kerberos
   ticket, and a restricted proxy; the process on the Site B computer runs
   with a local id and a restricted proxy
4. The processes at A and B communicate*
5. A remote file access request* goes to the GSI-enabled FTP server at
   Site C (Kerberos)
   – FTP server: authorize, map to local id, access file on the storage system
* With mutual authentication
                                                                                     20
                 Review of
          Public Key Cryptography
• Asymmetric keys
  – A private key, kept secret by its owner, is used to encrypt (sign) data.
  – The matching public key can decrypt (verify) data encrypted with the
    private key.
• An X.509 certificate includes…
  – Someone’s subject name (user ID)
  – Their public key
  – A “signature” from a Certificate Authority (CA) that:
     • Proves that the certificate came from the CA.
     • Vouches for the subject name
     • Vouches for the binding of the public key to the subject

                                                                  21
Public Key Based Authentication
• User sends certificate over the wire.
• Other end sends user a challenge string.
• User encodes the challenge string with their private key.
   – Possession of the private key means you can authenticate as
     the subject in the certificate
• Other end uses the public key from the certificate to decode the response.
   – If the result matches the original challenge, the user holds the
     private key belonging to the subject
• Treat your private key carefully!!
   – Private key is stored only in well-guarded places, and only
     in encrypted form
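
The challenge/response exchange above can be sketched in a few lines of Python. This is a toy illustration only: it uses textbook RSA with tiny primes (p=61, q=53) and a SHA-256 digest reduced to the modulus, all of which are assumptions made for readability. Real GSI authentication relies on SSL/TLS and X.509, not on this code.

```python
import hashlib
import secrets

# Toy RSA key pair built from small textbook primes (p=61, q=53).
# These numbers are illustrative assumptions and offer no real security.
N = 61 * 53   # modulus (3233)
E = 17        # public exponent, as would be published in the certificate
D = 2753      # private exponent, kept secret by the user


def digest(challenge: bytes) -> int:
    """Reduce a challenge to an integer smaller than the toy modulus."""
    return int.from_bytes(hashlib.sha256(challenge).digest(), "big") % N


def encode_with_private_key(challenge: bytes, d: int = D) -> int:
    """User side: encode the challenge digest with the private key."""
    return pow(digest(challenge), d, N)


def decode_with_public_key(response: int, e: int = E) -> int:
    """Verifier side: decode the response with the certificate's public key."""
    return pow(response, e, N)


def authenticate(respond) -> bool:
    """Send a fresh random challenge and check the decoded response."""
    challenge = secrets.token_bytes(16)
    return decode_with_public_key(respond(challenge)) == digest(challenge)


# The genuine key holder passes the check.
print(authenticate(encode_with_private_key))
```

A fresh random challenge is used on every attempt so that a recorded response cannot simply be replayed.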


                                                          22
                  User Proxies
• Minimize exposure of user’s private key
• A temporary, X.509 proxy credential for use
  by our computations
  –   We call this a user proxy certificate
  –   Allows process to act on behalf of user
  –   User-signed user proxy cert stored in local file
  –   Created via “grid-proxy-init” command
• Proxy’s private key is not encrypted
  – Rely on file system security, proxy certificate file
    must be readable only by the owner
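
Because the proxy's private key is unencrypted, a tool that loads it can first check the file permissions. The sketch below assumes a POSIX file system; the helper name and the file name are hypothetical and are not part of the Globus Toolkit.

```python
import os
import stat
import tempfile


def is_private_to_owner(path: str) -> bool:
    """Return True if group and others have no permission bits on the file."""
    mode = os.stat(path).st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


# Demonstration on a throwaway file standing in for a proxy credential.
with tempfile.TemporaryDirectory() as tmp:
    proxy_path = os.path.join(tmp, "proxy_credential")  # hypothetical file name
    with open(proxy_path, "w") as f:
        f.write("placeholder, not a real credential\n")

    os.chmod(proxy_path, 0o644)  # group and others can read: unsafe
    print(is_private_to_owner(proxy_path))

    os.chmod(proxy_path, 0o600)  # owner read/write only: acceptable
    print(is_private_to_owner(proxy_path))
```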
                                                     23
                Delegation
• Remote creation of a user proxy
• Results in a new private key and X.509
  proxy certificate, signed by the original key
• Allows remote process to act on behalf of
  the user
• Avoids sending passwords or private keys
  across the network

                                             24
            GSI Applications
• Globus Toolkit™ uses GSI for authentication
• Many Grid tools, directly or indirectly, e.g.
  – Condor-G, SRB, MPICH-G2, Cactus, GDMP, …
• Commercial and open source tools, e.g.
  – ssh, ftp, cvs, OpenLDAP, OpenAFS
  – SecureCRT (Win32 ssh client)
• And since we use standard X.509 certificates,
  they can also be used for
  – Web access, LDAP server access, etc.
                                            25
      The Globus Toolkit™:
Resource Management Services
       The Globus Project™
        Argonne National Laboratory
      USC Information Sciences Institute

            http://www.globus.org


                                           26
                  The Challenge
• Enabling secure, controlled remote access to
  heterogeneous computational resources and
  management of remote computation
   –   Authentication and authorization
   –   Resource discovery & characterization
   –   Reservation and allocation
   –   Computation monitoring and control
• Addressed by new protocols & services
   – GRAM protocol as a basic building block
   – Resource brokering & co-allocation services
   – GSI for security, MDS for discovery
                                                   27
         Resource Management
• The Grid Resource Allocation Management
  (GRAM) protocol and client API allows
  programs to be started on remote resources,
  despite local heterogeneity
• Resource Specification Language (RSL) is
  used to communicate requirements
• A layered architecture allows application-
  specific resource brokers and co-allocators to be
  defined in terms of GRAM services
  – Integrated with Condor, PBS, MPICH-G2, …
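
To make the role of RSL concrete, here is a minimal sketch that renders a set of job requirements as an RSL-style conjunction of (attribute=value) relations. The helper functions are hypothetical, and the quoting rule is a simplifying assumption rather than the full RSL grammar.

```python
def quote(value) -> str:
    """Quote values containing whitespace or punctuation (simplified rule)."""
    text = str(value)
    if any(ch in text for ch in ' \t()=&|+"\''):
        return '"' + text.replace('"', '""') + '"'
    return text


def to_rsl(requirements: dict) -> str:
    """Render a dict of requirements as one RSL-style conjunction."""
    clauses = []
    for name, value in requirements.items():
        if isinstance(value, (list, tuple)):
            rendered = " ".join(quote(v) for v in value)
        else:
            rendered = quote(value)
        clauses.append(f"({name}={rendered})")
    return "&" + "".join(clauses)


job = {"executable": "/bin/hostname", "count": 4, "arguments": ["-f"]}
print(to_rsl(job))  # &(executable=/bin/hostname)(count=4)(arguments=-f)
```

A broker or co-allocator could rewrite such a dict (for example, filling in a concrete host) before passing the resulting string on to GRAM.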
                                               28
     Resource Management Architecture

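[Diagram, top to bottom]
• Application passes RSL to a Broker
• Broker performs RSL specialization, exchanging queries & info with the Information Service
• Broker (or the Application directly) passes ground RSL to a Co-allocator
• Co-allocator passes simple ground RSL to the GRAM services
• Each GRAM fronts a local resource manager: LSF, Condor, NQE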

                                                                       29
 Globus Toolkit Implementation
• Gatekeeper
  – Single point of entry
  – Authenticates user, maps to local security
    environment, runs service
  – In essence, a “secure inetd”
• Job manager
  – A gatekeeper service
  – Layers on top of local resource management
    system (e.g., PBS, LSF, etc.)
  – Handles remote interaction with the job
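
The gatekeeper's "map to local security environment" step can be pictured as a lookup from an authenticated certificate subject to a local account. The sketch below assumes a simple two-column mapping file (quoted subject name, then local user), modeled loosely on the Globus grid-mapfile; the subjects, accounts, and function names are hypothetical.

```python
import shlex

MAPFILE = """\
# subject name (quoted)          local account
"/O=Grid/O=Example/CN=Jane Doe" jdoe
"/O=Grid/O=Example/CN=John Roe" jroe
"""


def parse_mapfile(text: str) -> dict:
    """Map certificate subject names to local account names."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)
        if len(parts) != 2:
            raise ValueError(f"malformed mapping line: {line!r}")
        subject, local_user = parts
        mapping[subject] = local_user
    return mapping


def map_to_local_id(subject: str, mapping: dict) -> str:
    """Return the local account for a subject, or refuse if it is not listed."""
    try:
        return mapping[subject]
    except KeyError:
        raise PermissionError(f"subject not authorized: {subject}") from None


mapping = parse_mapfile(MAPFILE)
print(map_to_local_id("/O=Grid/O=Example/CN=Jane Doe", mapping))  # jdoe
```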
                                                 30
                         GRAM Components
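[Diagram, described as a flow; the Client sits outside the site boundary]
• Client to MDS: Grid Index Info Server
  – MDS client API calls to locate resources
• Client to MDS: Grid Resource Info Server
  – MDS client API calls to get resource info
  – The Grid Resource Info Server queries the current status of the resource
    from the Local Resource Manager
• Client to Gatekeeper, through the Grid Security Infrastructure
  – GRAM client API calls to request resource allocation and process creation
• Gatekeeper creates a Job Manager
  – Job Manager parses the request using the RSL Library
  – Job Manager sends the request to the Local Resource Manager, which
    allocates & creates processes
  – Job Manager monitors & controls the processes
• Job Manager to Client
  – GRAM client API state change callbacks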

                                                                                       31
     Job Submission Interfaces
• Globus Toolkit includes several command
  line programs for job submission
  – globus-job-run: Interactive jobs
  – globus-job-submit: Batch/offline jobs
  – globusrun: Flexible scripting infrastructure
• Others are building better interfaces
  – General purpose
     • Condor-G, PBS, GRD, Hotpage, etc
  – Application specific
     • ECCE’, Cactus, Web portals
                                                   32
  The Globus Toolkit™:
Information Services
  The Globus Project™
    Argonne National Laboratory
  USC Information Sciences Institute

        http://www.globus.org


                                       33
     Grid Information Services
• System information is critical to operation
  of the grid and construction of applications
  – What resources are available?
     • Resource discovery
  – What is the “state” of the grid?
     • Resource selection
  – How to optimize resource use?
     • Application configuration and adaptation
• We need a general information
  infrastructure to answer these questions
                                                   34
 Examples of Useful Information
• Characteristics of a compute resource
  – IP address, software available, system
    administrator, networks connected to, OS
    version, load
• Characteristics of a network
  – Bandwidth and latency, protocols, logical
    topology
• Characteristics of the Globus infrastructure
  – Hosts, resource managers

                                                35
 Grid Information: Facts of Life
• Information is always old
  – Time of flight, changing system state
  – Need to provide quality metrics
• Distributed state hard to obtain
  – Complexity of global snapshot
• Components will fail
• Scalability and overhead
• Many different usage scenarios
  – Heterogeneous policy, different information
    organizations, etc.
                                                36
       Grid Information Service
• Provide access to static and dynamic
  information regarding system components
• A basis for configuration and adaptation in
  heterogeneous, dynamic environments
• Requirements and characteristics
  –   Uniform, flexible access to information
  –   Scalable, efficient access to dynamic data
  –   Access to multiple information sources
  –   Decentralized maintenance
                                                   37
         The GIS Problem: Many
    Information Sources, Many Views

[Figure: many resources (R) spread across three virtual organizations (VO A, VO B, VO C), with question marks (?) scattered among them]


                                                       38
       Information Protocols
• Grid Resource Registration Protocol
  – Support information/resource discovery
  – Designed to support machine/network failure
• Grid Resource Inquiry Protocol
  – Query resource description server for
    information
  – Query aggregate server for information
  – LDAP V3.0 in Globus 1.1.3
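
One common way for a registration protocol to tolerate machine and network failure is soft state: a resource must keep re-registering, and the directory forgets it when the registration lapses. The sketch below illustrates that idea under stated assumptions. The class, its methods, and the injectable clock are hypothetical, and no claim is made that this matches the protocol's actual message format.

```python
import time


class Registry:
    """Aggregate directory that forgets resources which stop re-registering."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._expiry = {}  # resource name -> time at which registration lapses

    def register(self, name: str, ttl: float) -> None:
        """Record (or refresh) a registration that is valid for ttl seconds."""
        self._expiry[name] = self._clock() + ttl

    def live_resources(self) -> list:
        """Drop lapsed registrations and return the names still alive."""
        now = self._clock()
        self._expiry = {n: t for n, t in self._expiry.items() if t > now}
        return sorted(self._expiry)


# A fake clock makes the example deterministic.
now = [0.0]
registry = Registry(clock=lambda: now[0])
registry.register("hostA", ttl=30)
registry.register("hostB", ttl=30)
now[0] = 20.0
registry.register("hostA", ttl=30)  # hostA refreshes; hostB has gone silent
now[0] = 40.0
print(registry.live_resources())    # ['hostA']
```

No explicit "unregister" message is needed: a crashed host simply disappears from the directory once its last registration times out.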


                                                  39
            GIS Architecture

[Figure: Users query the Customized Aggregate Directories (A) using the Inquiry Protocol; the Standard Resource Description Services (R) register with the aggregate directories using the Registration Protocol]
                                                40
  Metacomputing Directory Service
• Uses LDAP as the inquiry protocol
• Access information in a distributed directory
   – Directory represented by collection of LDAP servers
   – Each server optimized for particular function
• Directory can be updated by:
   – Information providers and tools
   – Applications (i.e., users)
   – Backend tools which generate info on demand
• Information dynamically available to tools and
  applications

                                                           41
    Two Classes Of MDS Servers
• Grid Resource Information Service (GRIS)
   – Supplies information about a specific resource
   – Configurable to support multiple information providers
   – LDAP as inquiry protocol
• Grid Index Information Service (GIIS)
   – Supplies collection of information which was gathered from
     multiple GRIS servers
   – Supports efficient queries against information which is
     spread across multiple GRIS servers
   – LDAP as inquiry protocol


                                                          42
  Grid Resource Information Service
• Server which runs on each resource
   – Given the resource DNS name, you can find the GRIS server
     (well known port = 2135)
• Provides resource specific information
   – Much of this information may be dynamic
      • Load, process information, storage information, etc.
      • GRIS gathers this information on demand
• “White pages” lookup of resource information
   – Ex: How much memory does the machine have?
• “Yellow pages” lookup of resource options
   – Ex: Which queues on the machine allow large jobs?

                                                               43
  Grid Index Information Service
• GIIS describes a class of servers
   – Gathers information from multiple GRIS servers
   – Each GIIS is optimized for particular queries
      • Ex1: Which Alliance machines are >16 process SGIs?
      • Ex2: Which Alliance storage servers have >100Mbps bandwidth to
        host X?
   – Akin to web search engines
• Organization GIIS
   – The Globus Toolkit ships with one GIIS
   – Caches GRIS info with long update frequency
      • Useful for queries across an organization that rely on relatively
        static information (Ex1 above)
• Can be merged into GRIS
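
The relationship between a GIIS and its GRIS servers can be sketched as an index that pulls data from per-resource providers on demand and caches it for a configurable period. Everything below (class name, provider callables, the "cpus" attribute) is a hypothetical illustration; the real servers speak LDAP.

```python
import time


class CachingIndex:
    """Index server that caches per-resource information for max_age seconds."""

    def __init__(self, providers: dict, max_age: float, clock=time.monotonic):
        self._providers = providers  # resource name -> callable returning a dict
        self._max_age = max_age
        self._clock = clock
        self._cache = {}             # resource name -> (fetched_at, info)

    def info(self, name: str) -> dict:
        """Return cached info, refreshing from the provider when it is stale."""
        now = self._clock()
        cached = self._cache.get(name)
        if cached is None or now - cached[0] > self._max_age:
            cached = (now, self._providers[name]())
            self._cache[name] = cached
        return cached[1]

    def search(self, predicate) -> list:
        """Return names of resources whose info satisfies the predicate."""
        return sorted(n for n in self._providers if predicate(self.info(n)))


calls = []  # records every time a provider is actually contacted


def make_provider(name, cpus):
    def provider():
        calls.append(name)
        return {"cpus": cpus}
    return provider


index = CachingIndex(
    {"big": make_provider("big", 64), "small": make_provider("small", 4)},
    max_age=3600,
)
print(index.search(lambda info: info["cpus"] > 16))  # ['big']
print(index.search(lambda info: info["cpus"] > 2))   # ['big', 'small']
print(len(calls))                                    # 2
```

The second query is answered from the cache, which is why each provider is contacted only once. A long `max_age` suits relatively static attributes such as processor count, while load figures would need a much shorter one.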
                                                                       44
       Logical MDS Deployment
[Figure: a hierarchy with GIIS servers (Grads and Gusto, with ISI below them) at the top and the GRISes at the bottom]
                                45
 Example: Discovering CPU Load

• Retrieve CPU load fields of compute resources
% grid-info-search -L "(objectclass=GlobusComputeResource)" \
                dn cpuload1 cpuload5 cpuload15
dn: hn=lemon.mcs.anl.gov, ou=MCS, o=Argonne National Laboratory,
 o=Globus, c=US
cpuload1: 0.48
cpuload5: 0.20
cpuload15: 0.03

dn: hn=tuva.mcs.anl.gov, ou=MCS, o=Argonne National Laboratory,
 o=Globus, c=US
cpuload1: 3.11
cpuload5: 2.64
cpuload15: 2.57
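
Output in this form is easy to post-process. The sketch below parses the two records shown above (copied into a string) and picks the host with the lowest 5-minute load. The parser is a minimal assumption-based reading of the "attribute: value" layout, with a leading space treated as a continuation line; it is not a complete LDIF parser.

```python
SAMPLE = """\
dn: hn=lemon.mcs.anl.gov, ou=MCS, o=Argonne National Laboratory,
 o=Globus, c=US
cpuload1: 0.48
cpuload5: 0.20
cpuload15: 0.03

dn: hn=tuva.mcs.anl.gov, ou=MCS, o=Argonne National Laboratory,
 o=Globus, c=US
cpuload1: 3.11
cpuload5: 2.64
cpuload15: 2.57
"""


def parse_entries(text: str) -> list:
    """Parse 'attribute: value' records separated by blank lines."""
    entries, current, last_key = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            if current:
                entries.append(current)
            current, last_key = {}, None
        elif line.startswith(" ") and last_key is not None:
            # A leading space marks a continuation of the previous line.
            current[last_key] += line[1:]
        else:
            key, _, value = line.partition(":")
            last_key = key.strip()
            current[last_key] = value.strip()
    if current:
        entries.append(current)
    return entries


entries = parse_entries(SAMPLE)
least_loaded = min(entries, key=lambda e: float(e["cpuload5"]))
print(least_loaded["dn"])
```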
                                                                46
    The Globus Toolkit™:
Data Management Services
     The Globus Project™
      Argonne National Laboratory
    USC Information Sciences Institute

          http://www.globus.org


                                         47
 Data Intensive Issues Include …
• Harness [potentially large numbers of] data,
  storage, network resources located in
  distinct administrative domains
• Respect local and global policies governing
  what can be used for what
• Schedule resources efficiently, again subject
  to local and global constraints
• Achieve high performance, with respect to
  both speed and reliability
• Catalog software and virtual data
                                             48
    Desired Data Grid Functionality
•   High-speed, reliable access to remote data
•   Automated discovery of “best” copy of data
•   Manage replication to improve performance
•   Co-schedule compute, storage, network
•   “Transparency” wrt delivered performance
•   Enforce access control on data
•   Allow representation of “global” resource
    allocation policies
                                            49
A Model Architecture for Data Grids
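[Diagram, described as a flow]
• Application sends an attribute specification to the Metadata Catalog
  – Gets back a logical collection and logical file name
• Application passes the logical name to the Replica Catalog
  – Gets back multiple locations
• Replica Selection picks the selected replica
  – Uses performance information & predictions from MDS and NWS
• Application opens a GridFTP control channel; data moves over the GridFTP data channel
  – Replica Location 1: Disk Array
  – Replica Location 2: Disk Cache, Tape Library
  – Replica Location 3: Disk Cache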
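
The replica selection step in this architecture can be sketched as choosing the copy with the smallest predicted transfer time. The cost model (startup latency plus size divided by predicted bandwidth), the replica names, and the numbers are all illustrative assumptions. In the architecture above, the predictions would come from services such as MDS and NWS.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    location: str          # physical location of this copy
    bandwidth_mbps: float  # predicted bandwidth to the client, in Mbit/s
    latency_s: float       # predicted startup latency, in seconds


def predicted_transfer_time(replica: Replica, size_mb: float) -> float:
    """Estimate the seconds needed to fetch size_mb megabytes."""
    return replica.latency_s + (size_mb * 8) / replica.bandwidth_mbps


def select_replica(replicas: list, size_mb: float) -> Replica:
    """Pick the replica with the lowest predicted transfer time."""
    if not replicas:
        raise LookupError("no replica registered for this logical file")
    return min(replicas, key=lambda r: predicted_transfer_time(r, size_mb))


replicas = [
    Replica("site1-disk-array", bandwidth_mbps=100, latency_s=0.1),
    Replica("site2-tape-library", bandwidth_mbps=400, latency_s=30),
    Replica("site3-disk-cache", bandwidth_mbps=50, latency_s=0.05),
]
print(select_replica(replicas, size_mb=10).location)    # site1-disk-array
print(select_replica(replicas, size_mb=1000).location)  # site2-tape-library
```

The "best" copy depends on the request: the tape library's long startup delay rules it out for a small file, but its higher bandwidth wins for a large one.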

                                                                        50
   Globus Toolkit Components
Two major Data Grid components:

1. Data Transport and Access
   Common protocol
      Secure, efficient, flexible, extensible data movement
   Family of tools supporting this protocol

2. Replica Management Architecture
   Simple scheme for managing:
     multiple copies of files
     collections of files                              51
 Access/Transport Protocol Requirements
• Suite of communication libraries and related tools
  that support
  – GSI, Kerberos security
  – Third-party transfers
  – Parameter set/negotiate
  – Partial file access
  – Reliability/restart
  – Large file support
  – Data channel reuse
  – Integrated instrumentation
  – Logging/audit trail
  – Parallel transfers
  – Striping (cf. DPSS)
  – Policy-based access control
  – Server-side computation
  – Proxies (firewall, load balancing)
• All based on a standard, widely deployed protocol
                                                        52
 And The Protocol Is … GridFTP
• Why FTP?
   – Ubiquity enables interoperation with many commodity
     tools
   – Already supports many desired features, easily extended
     to support others
   – Well understood and supported
• We use the term GridFTP to refer to
   – Transfer protocol which meets requirements
   – Family of tools which implement the protocol
• Note GridFTP > FTP
• Note that despite name, GridFTP is not restricted to
  file transfer!
                                                          53
       GridFTP: Basic Approach
• FTP protocol is defined by several IETF RFCs
• Start with most commonly used subset
  – Standard FTP: get/put etc., 3rd-party transfer
• Implement standard but often unused features
  – GSS binding, extended directory listing, simple
    restart
• Extend in various ways, while preserving
  interoperability with existing servers
  – Striped/parallel data channels, partial file, automatic
    & manual TCP buffer setting, progress monitoring,
    extended restart                                   54
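A minimal sketch of the idea behind partial file access, parallel data channels, and restart: the file is cut into byte ranges, each range can travel on its own channel, and after a failure only the unfinished ranges are resent. This is not GridFTP code. The function names are hypothetical and the "transfer" is an in-memory copy, used only to show that the ranges cover the file exactly once.

```python
def split_into_ranges(file_size, num_channels):
    """Divide [0, file_size) into contiguous (offset, length) ranges."""
    if file_size < 0 or num_channels < 1:
        raise ValueError("need file_size >= 0 and num_channels >= 1")
    base, extra = divmod(file_size, num_channels)
    ranges, offset = [], 0
    for i in range(num_channels):
        # The first `extra` channels carry one additional byte.
        length = base + (1 if i < extra else 0)
        if length:
            ranges.append((offset, length))
        offset += length
    return ranges


def remaining_after_restart(ranges, completed):
    """Return the ranges still to send, given the set already finished."""
    return [r for r in ranges if r not in completed]


def copy_in_ranges(data, num_channels):
    """Copy `data` range by range, as independent channels would."""
    out = bytearray(len(data))
    for offset, length in split_into_ranges(len(data), num_channels):
        out[offset:offset + length] = data[offset:offset + length]
    return bytes(out)
```

Because each range is described by an offset and a length, the same bookkeeping serves partial file access (request one range), parallel transfer (one range per channel), and restart (skip ranges already received).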
           Replica Management
• Maintain a mapping between logical names
  for files and collections and one or more
  physical locations
• Important for many applications
  – Example: CERN HLT data
     •   Multiple petabytes of data per year
     •   Copy of everything at CERN (Tier 0)
     •   Subsets at national centers (Tier 1)
     •   Smaller regional centers (Tier 2)
     •   Individual researchers will have copies

                                                   55
             Replica Catalog Structure:
            A Climate Modeling Example
[Figure: Replica Catalog hierarchy]
Replica Catalog
  Logical Collection: CO2 measurements 1998
    Filename: Jan 1998
    Filename: Feb 1998
    …
    Location: jupiter.isi.edu
      Filename: Mar 1998
      Filename: Jun 1998
      Filename: Oct 1998
      Protocol: gsiftp
      UrlConstructor: gsiftp://jupiter.isi.edu/nfs/v6/climate
    Location: sprite.llnl.gov
      Filename: Jan 1998
      …
      Filename: Dec 1998
      Protocol: ftp
      UrlConstructor: ftp://sprite.llnl.gov/pub/pcmdi
    Logical File Parent
      Logical File: Jan 1998 (Size: 1468762)
      Logical File: Feb 1998
  Logical Collection: CO2 measurements 1999
                                                                 56
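The catalog in this example can be pictured as a nested mapping from a logical collection to its locations. The sketch below is an illustration only, not the Globus replica catalog API. It assumes that a physical URL is the location's UrlConstructor followed by the URL-quoted logical file name, and it lists only the file names that appear on the slide (the elided entries are left out).

```python
from urllib.parse import quote

# Hypothetical in-memory model of the climate example above.
catalog = {
    "CO2 measurements 1998": [
        {
            "host": "jupiter.isi.edu",
            "protocol": "gsiftp",
            "url_constructor": "gsiftp://jupiter.isi.edu/nfs/v6/climate",
            "filenames": {"Mar 1998", "Jun 1998", "Oct 1998"},
        },
        {
            "host": "sprite.llnl.gov",
            "protocol": "ftp",
            "url_constructor": "ftp://sprite.llnl.gov/pub/pcmdi",
            # Only the names shown on the slide; the rest are elided there.
            "filenames": {"Jan 1998", "Dec 1998"},
        },
    ],
}


def physical_urls(catalog, collection, filename):
    """Return one URL per location that holds a copy of `filename`."""
    urls = []
    for location in catalog.get(collection, []):
        if filename in location["filenames"]:
            base = location["url_constructor"].rstrip("/")
            # Assumption: physical name = constructor + quoted logical name.
            urls.append(base + "/" + quote(filename))
    return urls
```

A lookup for "Mar 1998" yields a single gsiftp URL on jupiter.isi.edu, while a name held at no location yields an empty list.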
     Replica Catalog Services
  as Building Blocks: Examples
• Combine with information service to build
  replica selection services
  – E.g. “find best replica” using performance info
    from NWS and MDS
  – Use of LDAP as common protocol for info and
    replica services makes this easier
• Combine with application managers to build
  data distribution services
  – E.g., build new replicas in response to frequent
    accesses                                       57
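A "find best replica" service can be sketched as a ranking of candidate locations by estimated transfer time. The code below is a hedged illustration: the bandwidth and latency figures are invented placeholders standing in for the kind of measurements and predictions that NWS and MDS would supply, and the simple cost model (latency plus size over bandwidth) is an assumption, not the method used by any Globus component.

```python
def estimate_transfer_seconds(size_bytes, bandwidth_bytes_per_s, latency_s):
    """Simple cost model: startup latency plus streaming time."""
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return latency_s + size_bytes / bandwidth_bytes_per_s


def best_replica(size_bytes, replicas):
    """Pick the replica with the lowest estimated transfer time."""
    return min(
        replicas,
        key=lambda r: estimate_transfer_seconds(
            size_bytes, r["bandwidth"], r["latency"]
        ),
    )


# Placeholder performance data (hypothetical values, bytes/s and seconds).
replicas = [
    {"host": "jupiter.isi.edu", "bandwidth": 10_000_000, "latency": 0.05},
    {"host": "sprite.llnl.gov", "bandwidth": 2_000_000, "latency": 0.01},
]

choice = best_replica(1_468_762, replicas)
```

With these placeholder numbers the higher-bandwidth site wins for a file of this size; for a very small file the lower-latency site would be chosen instead.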
    Replica Catalog Directions
• Many data grid applications do not require
  tight consistency semantics
  – At any given time, you may not be able to
    discover all copies
  – When a new copy is made, it may not be
    immediately recognized as available
• Allows for much more scalable design
  – Distributed catalogs: local catalogs which
    maintain their own LFN -> PFN mapping
  – Soft-state updates as basis for building various
    configurations of global catalogs              58
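Soft-state updates can be sketched as entries that expire unless the local catalog refreshes them. The class below is a hypothetical illustration (its name, the time-to-live parameter, and the explicit `now` argument are all assumptions made to keep the example deterministic). It shows why the global view may lag behind: a copy is visible only after a refresh arrives and disappears once refreshes stop.

```python
class SoftStateIndex:
    """Global index of LFN -> PFN entries that expire without refresh."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._expiry = {}  # (lfn, pfn) -> time at which the entry lapses

    def refresh(self, lfn, pfns, now):
        """Called by a local catalog to (re)announce its mappings."""
        for pfn in pfns:
            self._expiry[(lfn, pfn)] = now + self.ttl

    def lookup(self, lfn, now):
        """Return the PFNs for `lfn` whose entries are still live."""
        return sorted(
            pfn
            for (name, pfn), expiry in self._expiry.items()
            if name == lfn and now < expiry
        )


index = SoftStateIndex(ttl_seconds=60)
index.refresh("Jan 1998", ["ftp://sprite.llnl.gov/pub/pcmdi/Jan%201998"], now=0)
```

No explicit delete message is needed: if a local catalog goes away, its entries simply time out, which is what makes the looser consistency acceptable and the design easier to scale.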
           Virtual Data in Action
• Data request may
  – Access local data
  – Compute locally
  – Compute remotely
  – Access remote data
• Scheduling subject to
  local & global policies
• Local autonomy
[Figure: a data request ("?") served across three tiers: major archive
facilities, network caches & regional centers, and local sites]
                                                              59
         Evolution of Grid Technologies
• Initial exploration (1996-1999; Globus 1.0)
   – Extensive appln experiments; core protocols
• Data Grids (1999-??; Globus 2.0+)
   – Large-scale data management and analysis
• Open Grid Services Architecture (2001-??, Globus 3.0)
   – Integration w/ Web services, hosting environments, resource
     virtualization
   – Databases, higher-level services
• Radically scalable systems (2003-??)
   – Sensors, wireless, ubiquitous computing

                                                           60
                              Grids and Open Standards
[Figure: stages plotted against time (horizontal axis) and increased
functionality and standardization (vertical axis)]
• Custom solutions
• Globus Toolkit: de facto standards, built on X.509, LDAP, FTP, …
  (GGF: GridFTP, GSI)
• Open Grid Services Architecture: built on Web services
  (GGF: OGSI, … + OASIS, W3C); multiple implementations,
  including Globus Toolkit
• App-specific services
                                                                 61
                  “Web Services”
• Increasingly popular standards-based framework for
  accessing network applications
   – W3C standardization; Microsoft, IBM, Sun, others
• WSDL: Web Services Description Language
   – Interface Definition Language for Web services
• SOAP: Simple Object Access Protocol
   – XML-based RPC protocol; common WSDL target
• WS-Inspection
   – Conventions for locating service descriptions
• UDDI: Universal Desc., Discovery, & Integration
   – Directory for Web services
                                                        62
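To make the SOAP bullet concrete, the sketch below builds a SOAP 1.1 request envelope with Python's standard library. The envelope namespace is the standard SOAP 1.1 one; the operation name, service namespace, and parameter are hypothetical and do not correspond to any real Grid service.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope


def build_request(operation, service_ns, params):
    """Return a SOAP envelope (as text) calling `operation` with `params`."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical operation and namespace, for illustration only.
request = build_request(
    "findReplica", "urn:example:replica", {"logicalFileName": "Jan 1998"}
)
```

In practice the operation and message shapes would come from the service's WSDL description rather than being written by hand.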
          The Need to Support
       Transient Service Instances
• “Web services” address discovery & invocation of
  persistent services
   – Interface to persistent state of entire enterprise
• In Grids, must also support transient service instances,
  created/destroyed dynamically
   – Interfaces to the states of distributed activities
   – E.g. workflow, video conf., dist. data analysis
• Significant implications for how services are managed,
  named, discovered, and used
   – In fact, much of our work is concerned with the management
     of service instances                                 63
Open Grid Services Architecture
• Service orientation to virtualize resources
• From Web services:
   – Standard interface definition mechanisms: multiple
     protocol bindings, multiple implementations, local/remote
     transparency
• Building on Globus Toolkit:
   –   Grid service: semantics for service interactions
   –   Management of transient instances (& state)
   –   Factory, Registry, Discovery, other services
   –   Reliable and secure transport
• Multiple hosting targets: J2EE, .NET, “C”, …
                                                          64
     Open Grid Services Architecture
[Figure: OGSA layers, top to bottom]
• More specialized & domain-specific services (other schemas)
• OGSA services: registry, authorization, monitoring, data access,
  management, etc. (OGSA schemas)
• Open Grid Services Infrastructure
• Web Services
• Hosting environment & protocol bindings: hosting environments,
  transport protocol
Priorities:
• Data access and integration
• Security
• SLA negotiation
• Manageability
• Monitoring
• …
                                                              65
          OGSA Service Model
• System comprises (typically a few) persistent
  services & (potentially many) transient services
• All services adhere to specified Grid service
  interfaces and behaviors
  – Reliable invocation, lifetime management,
    discovery, authorization, notification,
    upgradeability, concurrency, manageability
• Interfaces for managing Grid service instances
  – Factory, registry, discovery, lifetime, etc.
=> Reliable, secure mgmt of distributed state
                                                   66
              The Grid Service
• A (potentially transient) Web service with
  specified interfaces & behaviors, including
  –   Creation (Factory)
  –   Global naming (GSH) & references (GSR)
  –   Lifetime management
  –   Registration & Discovery
  –   Authorization
  –   Notification
  –   Concurrency
  –   Manageability
                                                67
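The Factory and lifetime-management behaviors can be sketched as follows. This is a toy model, not the OGSI interfaces: the class name, method names, and handle format are hypothetical, and time is passed in explicitly so the example is deterministic. It shows a factory that creates transient instances, each with a termination time that the client must extend to keep the instance alive.

```python
import itertools


class ServiceFactory:
    """Creates transient service instances with soft-state lifetimes."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")
        self._ids = itertools.count(1)
        self._termination = {}  # handle -> time at which the instance ends

    def create(self, now, lifetime_s):
        """Create an instance and return its (hypothetical) handle."""
        handle = f"{self.base_url}/instances/{next(self._ids)}"
        self._termination[handle] = now + lifetime_s
        return handle

    def is_alive(self, handle, now):
        end = self._termination.get(handle)
        return end is not None and now < end

    def extend(self, handle, now, lifetime_s):
        """Keep-alive: push the termination time further out."""
        if not self.is_alive(handle, now):
            raise KeyError(f"no live instance for {handle}")
        self._termination[handle] = now + lifetime_s

    def sweep(self, now):
        """Destroy expired instances and return their handles."""
        expired = [h for h, end in self._termination.items() if end <= now]
        for handle in expired:
            del self._termination[handle]
        return expired


factory = ServiceFactory("http://example.org/analysis")
handle = factory.create(now=0, lifetime_s=30)
factory.extend(handle, now=20, lifetime_s=30)  # now ends at t=50
```

If a client disappears without cleaning up, its instances are reclaimed once their lifetimes lapse, so no explicit destroy call is required for correctness.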
       Use of Web Services (1)
• A Grid service interface is a WSDL portType
• A Grid service definition is a WSDL
  extension (serviceType) containing:
  – A set of one or more portTypes supported by the
    service
  – portType & serviceType compatibility statements,
    to support upgradability
     • For discovery of compatible services when interfaces
       are upgraded
  – Implementation version information
                                                         68
          Use of Web Services (2)
• A GSR is a WSDL document with extensions:
   – Extension to service element to reference serviceType
   – Service element extensions to carry the GSH, and the
     expiration time of the GSR
• A GSH is a URL, with the following properties:
   – Globally unique for all time
   – HTTP GET on GSH + “.wsdl” returns GSR
   – Can derive GSH to Mapper from it
• Registry returns WS-Inspection documents


                                                             69
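The handle-to-reference step can be sketched as a small resolver: append ".wsdl" to the GSH, fetch the GSR, and reuse it until its expiration time passes. The class and the injected `fetch` callable are hypothetical. A stand-in fetch function is used here so the example runs without network access; a real client would perform an HTTP GET instead.

```python
class HandleResolver:
    """Resolves a GSH to a GSR, caching each GSR until it expires."""

    def __init__(self, fetch):
        # fetch(url) -> (gsr_document, expiration_time); assumed interface.
        self._fetch = fetch
        self._cache = {}  # gsh -> (gsr_document, expiration_time)

    def resolve(self, gsh, now):
        cached = self._cache.get(gsh)
        if cached is not None and now < cached[1]:
            return cached[0]
        document, expires = self._fetch(gsh + ".wsdl")
        self._cache[gsh] = (document, expires)
        return document


calls = []


def fake_fetch(url):
    """Stand-in for an HTTP GET; records the URL it was asked for."""
    calls.append(url)
    return ("<definitions/>", 100)  # placeholder GSR, expires at t=100


gsh = "http://example.org/analysis/instances/1"
resolver = HandleResolver(fake_fetch)
first = resolver.resolve(gsh, now=0)     # fetched
second = resolver.resolve(gsh, now=50)   # served from cache
third = resolver.resolve(gsh, now=150)   # GSR expired, fetched again
```

The handle stays fixed for the life of the service, while the reference it resolves to may be replaced, which is why the GSR carries an expiration time.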
Grids: An Emerging, Common Computing and Data Infrastructure
for Science and Engineering
[Figure: layered Grid architecture, top to bottom]
• Portals: Web portal access to application and Grid services;
  specialized portal access (high performance displays, PDAs, etc.); ...
• Services: data management (replication and metadata), resource
  brokering, fault management, workflow management, accounting,
  applications
• Encapsulation as Web services, for script based services, and as
  Java based services
• Basic Grid functions: resource discovery, scheduling and access to
  computing, uniform data access, monitoring and events
• Grid communication functions: transport services, security services
• Communications: Internet, optical networks, space-based networks, ...
• Distributed resources: national supercomputer facilities, clusters,
  Condor pools of workstations, tertiary storage, scientific instruments
• Operational support
                                                              70
Grids: A Common Computing and Data Infrastructure for
Science and Engineering
[Figure: layered view, top to bottom]
• Portals: services presented to the users to accomplish tasks
  – Application domain independent portals: user environment portals,
    collaboration portals
  – Application domain specific portals: STS/SLI mission analysis,
    ISS training, ES modeling, MER/CIP, aviation capacity
• Grid Web services: Grid functions and application functions
  packaged for building portals
  – Groups: domain independent Grid Web services; domain specific
    Web services (encapsulated applications)
  – Services shown: workflow management, data management, monitoring
    events, programming services, experiment management,
    collaboration services, visualization, data processing &
    analysis, computational simulation, flight simulation,
    instrument & sensor gateways, system models, archive gateways,
    zooming, coupling
• Grid common services: uniform access, security, and management of
  compute, data, and instrument resources
• Multi-site compute, data, and instrument resources
                                                              71
Combining Grid and Web Services
[Figure: columns from clients to resources, left to right, with the
connecting protocols shown in parentheses]
• Clients: X Windows, Web browser, PDA
  (http, https, etc.)
• Application portals
  – Discipline / application specific portals (e.g. SDSC TeleScience)
  – Problem solving environments (AVS, SciRun, Cactus)
  – Environment management (LaunchPad, HotPage)
  – Composition frameworks (e.g. XCAT)
  – Python, Java, etc., JSPs; Apache SOAP, .NET, etc.
  (XML / SOAP over Grid Security Infrastructure)
• Web services
  – Job submission / control, file transfer, data management,
    monitoring, events, ……, credential management, workflow management
  – Other services: visualization, interface builders, collaboration
    tools, numerical grid generators, etc.
  – CoG Kits implementing Web services in servlets, servers, etc.
  – Apache Tomcat & WebSphere & Cold Fusion = JVM + servlet
    instantiation + routing
  (Grid Protocols and Grid Security Infrastructure)
• Grid services: collective and resource access
  – Grid ssh, CORBA, GRAM, Condor-G, SRB/Metadata Catalogue, GridFTP,
    Data Replica and Metadata Catalog, Grid Monitoring Architecture,
    Grid X.509 Certification Authority, MPI, Secure Reliable Group
    Comm., Grid Information Service, Grid Web Service Description
    (WSDL) & Discovery (UDDI)
  (Grid Protocols and Grid Security Infrastructure)
• Resources: compute (many), storage (many), communication,
  instruments (various)
                                                              72
         For More Information
• Globus Project™
  – www.globus.org
• Grid Forum
  – www.gridforum.org
• Book (Morgan Kaufmann)
  – www.mkp.com/grids




                                73

				