					   Grid Middleware Service


              Nov. 9, 2002


            Chan-Hyun Youn
Information and Communications University
                                       Int’l DataGrid Workshop


              Contents
• Grid and Middleware Services
• Architectural Model for Resource
  Management
   – Hierarchical Resource Management
   – Abstract Owner
   – Market Model
• Scheduling Algorithms in Economy Grid
• Example of Application level Scheduler
• Concluding Remarks
                   chyoun@icu.ac.kr                       2
Architecture of a Grid
• Discipline Specific Portals and Scientific Workflow Management Systems
• Toolkits: Visualization, data publish/subscribe, etc.
• Applications: Simulations, Data Analysis, etc.
• Grid Common Services: Standardized Services and Resources Interfaces
   – Grid Information Service, Uniform Resource Access, Brokering,
     Global Queuing, Co-Scheduling, Global Event Services, Data Cataloguing,
     Uniform Data Access, Collaboration and Remote Instrument Services,
     Network Cache, Communication Services, Authentication, Authorization,
     Security Services, Auditing, Monitoring, Fault Management
   – Legend in figure: "= Globus services"
• Resources: clusters, national supercomputer facility, Condor pools,
  tertiary storage, national user facilities, network caches
• High-speed networks and communications services
Source: IPG (Johnston)

Heterogeneous Computing: IPG Milestone Completed 10/2000
- Two problem solving environments use IPG services for uniform access to
  heterogeneous resources.

1) Condor Workstation Pool mgr.
   • Molecular design application for nanotechnology devices and materials
   • Uses 0.5 million otherwise idle CPU hours/year scavenged from 60-100
     Sun and SGI workstations, a subset of the NAS Condor pool
   • The Condor system is an IPG middleware service

2) Parameter Study Manager
   • ILab aerospace design parameter study manager uses IPG to access
     distributed computing and data resources

[Figure labels: IPG Grid Common Services: Standardized services and uniform resource access; study concept; study object; results; IPG managed compute and data management resources]

Online Instrumentation: Real-time Experiment Interaction
[Figure: Unitary Plan Wind Tunnel; real-time collection; real-time experiment control; multi-source data analysis; desktop & VR clients with shared controls; computer simulations; archival storage]

Grid from Services View
• Applications: Chemistry, Biology, Cosmology, High Energy Physics, Environment
• Application Toolkits
   – Distributed Computing Applications Toolkit
   – Data-Intensive Applications Toolkit
   – Collaborative Applications Toolkit
   – Remote Visualization Applications Toolkit
   – Problem Solving Applications Toolkit
   – Remote Instrumentation Applications Toolkit
• Grid Services (Middleware): Resource-independent and application-independent services
   – E.g., authentication, authorization, resource location, resource allocation,
     events, accounting, remote data access, information, policy, fault detection
• Grid Fabric (Resources): Resource-specific implementations of basic services
   – E.g., transport protocols, name servers, differentiated services, CPU schedulers,
     public key infrastructure, site accounting, directory service, OS bypass


 Middleware
• A layered collection of middleware services that gives applications
  uniform views of distributed resource components and the mechanisms
  for assembling them into systems
   – Grid Workload Management, Data Management, Monitoring services
   – Management of the Local Computing Fabric
   – Mass Storage


• Services extend both “up and down” through the various layers
  of the computing and communications infrastructure





Functions in Middleware
•   Workload management
     – The workload is chaotic – unpredictable job arrival rates, data access patterns
     – The goal is maximising the global system throughput (events
        processed per second)
•   Data management
     – Management of petabyte-scale data volumes, in an environment with
        limited network bandwidth and heavy use of mass storage (tape)
     – Caching, replication, synchronisation, object database model
•   Application monitoring
     – Tens of thousands of components, thousands of jobs and individual
        users
     – End-user - tracking of the progress of jobs and aggregates of jobs
     – Understanding application and grid level performance
     – Administrator – understanding which global-level applications were
        affected by failures, and whether and how to recover



Middleware (in Local Fabric)
•   Effective local site management of giant computing fabrics
     – Automated installation, configuration management, system
        maintenance
     – Automated monitoring and error recovery - resilience,
        self-healing
     – Performance monitoring
     – Characterisation, mapping, management of local Grid resources

•   Mass storage management
     – Multi-petabyte data storage
     – “Real-time” data recording requirement
     – Active tape layer – 1,000s of users
     – Uniform mass storage interface
     – Exchange of data and meta-data between mass storage systems

Technical Approach in Layered Network
• Applications need uniform views of resources, and middleware must deal with
  the fact that most “real” resources are “locally” owned
[Figure layers, top to bottom:
   – Applications
   – Global Middleware Services: Network Cache, QoS Broker, Access Control,
     Resource Scheduling, Monitoring & Management
   – Local Services: LBNL, NCAR, ANL, Ames; Supercomputer, Cluster, Wind Tunnel,
     Cache, Tertiary (mass) storage
   – Networks: Internet, ESNet, Internet 2, vBNS, IDREN, GigaPop, Campus]
Source: Grid’98 Workshop (Johnston)

Operation Model (1)
• Middleware must actually reach well!
• Some services are provided in the middleware
• Most services drill down to institutional resources
• Some services drill down to the various network layers
[Figure layers, top to bottom:
   – Applications
   – Global Middleware Services: Network Cache, QoS Broker, Access Control,
     Resource Characteristics, Resource Scheduling, Monitoring & Management,
     Data Catalogues
   – Local Services: LBNL, NCAR, ANL, Ames; Supercomputer, Cluster, Wind Tunnel,
     Cache, Tertiary (mass) storage
   – Networks: Internet, ESNet, Internet 2, vBNS, IDREN, GigaPop, Campus]
Source: Grid’98 Workshop (Johnston)

Operation Model (2)
• Middleware layer and infrastructure to provide the transparent access for
  applications!
• Some services are provided in the middleware
[Figure layers, top to bottom:
   – Applications
   – Global Middleware Services: Network Cache, QoS Broker, Access Control,
     Resource Characteristics, Resource Scheduling, Monitoring & Management,
     Data Catalogues
   – Added functions: Proxy management for multi-site resources, Configure,
     Analyzer, Monitor, Re-configure
   – Local Services: LBNL, NCAR, ANL, Ames; Supercomputer, Cluster, Wind Tunnel,
     Cache, Tertiary (mass) storage
   – Networks: Internet, ESNet, Internet 2, vBNS, IDREN, GigaPop, Campus]
Source: Grid’98 Workshop (Johnston)



Middleware Approach

• Toolkit and services addressing key technical
  problems
   – Modular “bag of services” model
   – Not a vertically integrated solution
   – can be applied to many application domains
• Inter-domain issues, rather than clustering
   – Integration of intra-domain solutions





Globus






Globus Approach
• A software toolkit addressing key technical problems
   – Offer a modular bag of technologies
   – Enable incremental development of grid-enabled tools and
     applications
   – Define and standardize grid protocols and APIs

• Focus is on inter-domain issues, not clustering
   – Supports collaborative resource use spanning multiple
     organizations
   – Integrates cleanly with intra-domain services
   – Creates a collective service layer



Globus Approach

• Focus on architecture issues
   – Provide implementations of grid protocols and APIs as basic
     infrastructure
   – Use to construct high-level, domain-specific solutions
• Design principles
   – Keep participation cost low
   – Enable local control
   – Support for adaptation
[Figure: layered stack, top to bottom: Applications; Diverse global services; Core Globus services; Local OS]


 Four Key Protocols

• The Globus Toolkit centers around four key
  protocols
   – Connectivity layer:
      • Security: Grid Security Infrastructure (GSI)
   – Resource layer:
      • Resource Management: Grid Resource Allocation
        Management (GRAM)
      • Information Services: Grid Resource Information
        Protocol (GRIP)
      • Data Transfer: Grid File Transfer Protocol (GridFTP)


Grid Security Infrastructure in Action
• User: single sign-on via “grid-id” and generation of a proxy credential,
  or retrieval of a proxy credential from an online repository
• User Proxy: holds the proxy credential and issues remote process creation
  requests*
• GSI-enabled GRAM servers at Site A (Kerberos) and Site B (Unix):
  authorize, map to local id, create process, generate credentials
   – Site A computer process: local id, Kerberos ticket, restricted proxy
   – Site B computer process: local id, restricted proxy
• Communication* between the processes
• Remote file access request* to the GSI-enabled FTP server at Site C (Kerberos):
  authorize, map to local id, access file on the storage system
* With mutual authentication

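The "authorize" and "map to local id" steps in the GSI picture can be illustrated with a small sketch. It assumes a grid-mapfile-style text layout (a quoted certificate subject followed by a local account name); the subject names, account names, and function names below are hypothetical, and real deployments may allow more than one account per subject.

```python
import shlex


def parse_gridmap(text):
    """Parse lines of the form: "<certificate subject>" <local account>."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # keeps the quoted subject as one token
        if len(parts) != 2:
            raise ValueError(f"malformed entry: {raw!r}")
        subject, local_user = parts
        mapping[subject] = local_user
    return mapping


def authorize(subject, mapping):
    """Return the local id for an authenticated subject, or None if unknown."""
    return mapping.get(subject)


# Hypothetical entries, for illustration only.
EXAMPLE = '''
# subject                              local account
"/O=Grid/OU=Example/CN=Alice Kim" akim
"/O=Grid/OU=Example/CN=Bob Lee" blee
'''
GRIDMAP = parse_gridmap(EXAMPLE)
```

Authentication itself (verifying the proxy credential) is outside this sketch; the lookup only shows how a site keeps local control over which account a grid identity runs under.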


Resource Management

• The Grid Resource Allocation Management (GRAM)
  protocol and client API allow programs to be started on
  remote resources, despite local heterogeneity
• Resource Specification Language (RSL) is used to
  communicate requirements
• A layered architecture allows application-specific resource
  brokers and co-allocators to be defined in terms of GRAM
  services
   – Integrated with Condor, MPICH-G2, …
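As a rough illustration of how requirements are communicated in RSL, the sketch below renders a Python dict as a conjunction of RSL relations such as `&(executable="/bin/hostname")(count=4)`. The helper names are hypothetical, the quoting rule (double an embedded quote) is an assumption about RSL string syntax, and only a simple conjunction is covered.

```python
def _quote(value):
    """Render one RSL value; quote anything that is not a plain token."""
    text = str(value)
    if text.isalnum():
        return text
    # Assumption: an embedded double quote is escaped by doubling it.
    return '"' + text.replace('"', '""') + '"'


def to_rsl(request):
    """Render {attribute: value} as '&(attr=value)(attr=value)...'."""
    relations = []
    for name, value in request.items():
        if isinstance(value, (list, tuple)):
            rendered = " ".join(_quote(v) for v in value)
        else:
            rendered = _quote(value)
        relations.append(f"({name}={rendered})")
    return "&" + "".join(relations)


# Example request: run /bin/hostname with one argument on four processors.
REQUEST = {"executable": "/bin/hostname", "count": 4, "arguments": ["-f"]}
RSL_TEXT = to_rsl(REQUEST)
```

A broker or co-allocator layered on GRAM would refine such a string before it reaches a specific resource; this sketch stops at producing the text.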






Resource Management Issues for Grid Computing
 • Site autonomy
    – Resources owned by different organizations, in different
       administrative domains
    – Local policies for use, scheduling, security
 • Heterogeneous substrate
    – Different local resource management systems
 • Policy extensibility
    – Local sites need ability to customize their resource management
       policies
 • Co-allocation
    – May need resources at several sites
    – Mechanism for allocating multiple resources, initiating
       computation, monitoring and managing
 • On-line control
    – Adapt application requirements to resource availability
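The co-allocation issue above can be sketched as an all-or-nothing reservation across sites: if any site cannot satisfy its part, everything already reserved is released. The `Site` class and `co_allocate` function are hypothetical names for illustration, and real co-allocators must also handle timeouts, failures during start-up, and site policy.

```python
class Site:
    """A toy site with a fixed pool of CPUs under local control."""

    def __init__(self, name, free_cpus):
        self.name = name
        self.free_cpus = free_cpus

    def reserve(self, cpus):
        """Reserve CPUs if available; the site may always refuse."""
        if cpus > self.free_cpus:
            return False
        self.free_cpus -= cpus
        return True

    def release(self, cpus):
        self.free_cpus += cpus


def co_allocate(requests):
    """requests: list of (site, cpus). Reserve all parts or none."""
    granted = []
    for site, cpus in requests:
        if site.reserve(cpus):
            granted.append((site, cpus))
        else:
            # Roll back everything obtained so far.
            for got_site, got_cpus in granted:
                got_site.release(got_cpus)
            return False
    return True
```

The rollback step is what distinguishes co-allocation from submitting independent requests: a computation that needs several sites at once gains nothing from a partial allocation.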

Resource Management Architecture
[Figure, flow from top to bottom:
   – Application passes RSL to a Broker
   – Broker performs RSL specialization, using queries & info from the
     Information Service, and produces ground RSL
   – Co-allocator turns ground RSL into simple ground RSL
   – GRAM instances hand requests to local resource managers: LSF, EASY-LL, NQE]


Local Resource Managers

•   Implemented with Globus Resource Allocation Manager
    (GRAM)
    –   Processing RSL specifications representing resource requests
        • Deny request
        • Create one or more processes (jobs) that satisfy request
    –   Enable remote monitoring and management of jobs
    –   Periodically update MDS information service with current
        availability and capabilities of resources
•   GRAM is responsible for
    –   Parsing and processing RSL
    –   Job monitoring
    –   MDS update




Grid Information Services
• System information is critical to operation of the grid and
  construction of applications
   – What resources are available?
       • Resource discovery
   – What is the “state” of the grid?
       • Resource selection
   – How to optimize resource use?
       • Application configuration and adaptation
• We need a general information infrastructure to answer
  these questions
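The first two questions, discovery and selection, can be sketched against an in-memory list of resource descriptions. The attribute names (`arch`, `os`, `free_cpus`, `load`) and the sample entries are hypothetical; a real information service would be queried over a directory protocol and its data may be stale.

```python
def discover(resources, **required):
    """Resource discovery: keep entries whose attributes equal the required ones."""
    return [
        r for r in resources
        if all(r.get(key) == value for key, value in required.items())
    ]


def select(candidates, min_free_cpus):
    """Resource selection: among usable candidates, pick the least loaded."""
    usable = [r for r in candidates if r["free_cpus"] >= min_free_cpus]
    return min(usable, key=lambda r: r["load"], default=None)


# Hypothetical directory contents.
DIRECTORY = [
    {"name": "cluster-a", "arch": "x86", "os": "linux", "free_cpus": 16, "load": 0.7},
    {"name": "cluster-b", "arch": "x86", "os": "linux", "free_cpus": 32, "load": 0.2},
    {"name": "sp2", "arch": "power", "os": "aix", "free_cpus": 64, "load": 0.1},
]
```

Discovery uses slowly changing attributes, while selection depends on the current state, which is why the two are listed as separate questions.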



GIS Architecture
[Figure: Users query Customized Aggregate Directories (A) through the Enquiry Protocol; Standard Resource Description Services (R) are linked to the aggregate directories through the Registration Protocol]



A Model Architecture for Data Grids
[Figure:
   – Application sends an attribute specification to the Metadata Catalog and
     obtains a logical collection and logical file name
   – Replica Catalog maps these to multiple locations
   – Replica Selection picks the selected replica using performance
     information & predictions from MDS and NWS
   – Data is moved over the GridFTP control channel and GridFTP data channel
   – Replica Location 1: disk array; Replica Location 2: disk cache and
     tape library; Replica Location 3: disk cache]

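The replica selection step in the data grid architecture can be sketched as choosing the location with the smallest predicted transfer time. The cost model (start-up latency plus size divided by bandwidth), the field names, and all numbers are assumptions for illustration; in the architecture the predictions would come from services such as MDS and NWS.

```python
def estimate_seconds(size_bytes, replica):
    """Predicted time: start-up latency plus transfer time at predicted bandwidth."""
    return replica["latency_s"] + size_bytes / replica["bandwidth_bytes_per_s"]


def select_replica(size_bytes, replicas):
    """Return the replica with the lowest predicted transfer time."""
    if not replicas:
        raise ValueError("no replica locations registered")
    return min(replicas, key=lambda r: estimate_seconds(size_bytes, r))


# Hypothetical predictions for three replica locations.
REPLICAS = [
    {"site": "location-1-disk-array", "latency_s": 0.1, "bandwidth_bytes_per_s": 10e6},
    {"site": "location-2-tape-library", "latency_s": 120.0, "bandwidth_bytes_per_s": 50e6},
    {"site": "location-3-disk-cache", "latency_s": 0.05, "bandwidth_bytes_per_s": 2e6},
]
```

With these made-up numbers a 1 GB file is best fetched from the disk array, while a 10 GB file favors the tape library despite its long start-up delay, which shows why selection needs both file size and current performance data.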


GridFTP: Basic Approach

• FTP protocol is defined by several IETF RFCs
• Start with most commonly used subset
   – Standard FTP: get/put etc., 3rd-party transfer
• Implement standard but often unused features
   – GSS binding, extended directory listing, simple restart
• Extend in various ways, while preserving interoperability with
  existing servers
   – Striped/parallel data channels, partial file, automatic &
     manual TCP buffer setting, progress monitoring, extended
     restart
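Parallel data channels and partial file transfer both rest on addressing a file as (offset, length) ranges. The sketch below only shows that bookkeeping, splitting a file into near-equal ranges, one per stream; the function name is hypothetical and nothing here models the GridFTP wire protocol.

```python
def split_ranges(file_size, streams):
    """Split [0, file_size) into up to `streams` contiguous (offset, length) ranges."""
    if file_size < 0 or streams < 1:
        raise ValueError("file_size must be >= 0 and streams >= 1")
    base, extra = divmod(file_size, streams)
    ranges = []
    offset = 0
    for index in range(streams):
        # The first `extra` streams carry one additional byte.
        length = base + (1 if index < extra else 0)
        if length:
            ranges.append((offset, length))
        offset += length
    return ranges
```

The same range bookkeeping is what makes restart after a failure cheap: only the ranges not yet received need to be requested again.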


Striped GridFTP Server
[Figure:
   – GridFTP client talks to the GridFTP server master over the GridFTP
     Control Channel
   – The master uses mpirun and a control socket to drive the GridFTP Server
     Parallel Backend
   – Backend: MPI (Comm_World) processes, each with a Control and a Plug-in
     component; MPI (Sub-Comm); MPI-IO onto a Parallel File System
     (e.g. PVFS, PFS, etc.)
   – GridFTP Data Channels run to the client or to another striped GridFTP server]





Condor






What is Condor?
• Condor converts collections of distributively
  owned workstations and dedicated clusters into a
  distributed high-throughput computing facility.
   – All jobs: resource finder, batch queue manager, scheduler
   – Jobs linked with the Condor library: checkpoint/restart,
     process migration, remote system calls


Layered Design
[Figure: layers from top to bottom: Resource; Access Control; Match-Making; Request Agent; Application RM; Application. Condor spans the middle layers. Roles shown alongside: Resource Owner, System Administrator, Customer/User]



Unique Mechanisms

• Checkpointing
  – Enables Preemptive Resume Resource Allocation (essential
    in an opportunistic environment)
• Remote I/O
  – Enables computation across administrative domains
    (essential for HTC)
• ClassAds
  – Enables flexible resource matchmaking (essential in a
    distributively owned environment)
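ClassAd-style matchmaking can be sketched as a two-sided test: a job and a machine match only if each one's requirements accept the other, and the job's rank then orders the acceptable machines. The dictionaries, attribute names, and lambdas below are hypothetical stand-ins, not ClassAd syntax.

```python
def matches(job, machine):
    """Symmetric match: both sides' requirements must accept the other ad."""
    return job["requirements"](machine) and machine["requirements"](job)


def best_match(job, machines):
    """Among matching machines, return the one the job ranks highest."""
    candidates = [m for m in machines if matches(job, m)]
    return max(candidates, key=job["rank"], default=None)


# Hypothetical job ad: needs an INTEL machine with enough memory, prefers speed.
JOB = {
    "owner": "alice",
    "image_size_mb": 512,
    "requirements": lambda m: m["arch"] == "INTEL" and m["memory_mb"] >= 512,
    "rank": lambda m: m["mips"],
}

# Hypothetical machine ads: each owner states who may use the workstation.
MACHINES = [
    {"name": "ws1", "arch": "INTEL", "memory_mb": 1024, "mips": 300,
     "requirements": lambda j: j["owner"] != "mallory"},
    {"name": "ws2", "arch": "INTEL", "memory_mb": 2048, "mips": 500,
     "requirements": lambda j: j["image_size_mb"] <= 256},
    {"name": "ws3", "arch": "SUN4u", "memory_mb": 4096, "mips": 400,
     "requirements": lambda j: True},
]
```

Here the fastest machine is not chosen because its owner's policy rejects the job, which is the point of matchmaking in a distributively owned environment.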



Condor System Structure
[Figure:
   – Central Manager: Negotiator (N) and Collector (C)
   – Submit Machine: Customer Agent (CA) with job descriptions [...A], [...B], [...C]
   – Execution Machine: Resource Agent (RA)]

      Job Submission Machine                                           Int’l DataGrid Site
                                                                     Job Execution Workshop
                 Persistant
                 Job Queue

                                                                       Globus Daemons
End User                                                                        +
Requests
                                                                      Local Site Scheduler
                               Condor-G
                              GridManager
  Condor-G                                                               [See Figure 1]
  Scheduler      Fork           GASS
                                Server
      Fork




                              Condor-G
                              Collector
                                                        Resource           Job
                                                       Information
                                                                            Condor
                                                                           Daemons
     Condor                                      Transfer Job X
    Shadow




                                                                              Fork
   Process for
      Job X
                                                   Redirected                Job X
                                                   System Call         Condor System Call
                                                      Data
                                                                      Trapping & Checkpoint
                                                                             Library








TENT






TENT
• A distributed workflow management and
  integration system for engineering
  applications developed by

  – German Aerospace Center (DLR), Simulation and
    Software Technology (SISTEC)
    http://www.sistec.dlr.de

  – German National Research Center for Information
    Technology (GMD), Institute for Algorithms and
    Scientific Computing (SCAI) http://www.gmd.de/scai


TENT - The Integration Framework




                  visualization



TENT Packages





TENT - Software architecture








Architectural Models for Resource
    Management in the Grid







Typical Grid Computing Environment

                 Grid Information Service
                                                                              Grid Resource Broker


                                      R2                                                       Application
                                                                   database
                                                              R3                R4

                                R5                       RN
Grid Resource Broker
                                            R6
                                                                                  R1
                                                                                          Resource Broker




                                        Grid Information Service




Sources of Complexity in Grid Resource
Management
• No single administrative control.
• No single ownership policy:
   – Each resource owner has their own policies or scheduling
     mechanisms
   – Users must honour them (particularly external Grid users)
• Heterogeneity
   – resources: PCs, workstations, clusters, supercomputers, instruments,
     databases, software …
   – fabric management systems and management policies
   – application requirements
• Dynamic availability – may appear and disappear…


Sources of Complexity in Grid Resource
Management

 • Unreliable resources – may disappear from view
 • No uniform cost model – varies from one user’s
   resource to another and with the time of day
 • No single access mechanism – Web, custom
   interfaces, command line…



Grid Resource
Management Issues
•Authentication (once).
•Specify (code, resources, etc.).
•Discover resources.
•Negotiate authorization, acceptable use, cost, etc.
•Acquire resources.
•Schedule jobs.
•Initiate computation.
•Steer computation.
•Access remote data-sets.
•Collaborate with results.
•Account for usage.
[Diagram labels: Domain 1, Domain 2]
                                                           Rajkumar Buyya (Monash Univ.)

Data Access for Resource Management
                                                         Data Disseminator
    Status update
    message in           Grid Status                                   Update message out
                          Registry
                          Manager

   Gridspace update
   message in            Grid Space                              Grid Status
                          Manager
                                                 Gridspace        Registry



                        Request                                    Request
                                                 Gridspace
                        Router/                                    Router/
                      Allocator (1)                Cache         Allocator (2)
    Resource request                                                   Route or Allocation
    message in                                                         message out


        Route or allocation with single choice




Architectural Models for RM
MODEL                 REMARKS                          SYSTEMS
Hierarchical          Captures the model followed      Globus, Legion, CCS,
                      in most contemporary systems.    AppLeS, NetSolve, Ninf.

Abstract Owner (AO)   An order-and-delivery model      Expected to emerge; most
                      that focuses on long-term        peer-to-peer computing
                      goals.                           systems are likely to be
                                                       based on it.

Market Model          Follows an economic model for    GRACE, Nimrod/G,
                      resource discovery, sharing,     JavaMarket, Mariposa.
                      and scheduling.

Hierarchical RM
                               Access/Admission
                                Control Agent                                Grid
 user                                                                    information
                                                                            service
                            Global           Global
                           Scheduler        Scheduler

   Persistent
   Job control                    Connection
                                  Connection
   agent                            control
                                     control
                                                                                monitor
                    Global                               Global
                   Scheduler          Global            Scheduler
                                     Scheduler
                                                                Deployment
                                                                Agent



                                                                     Domain Resource
                                                                         manager
 Local Scheduler                                                      or control agent



 resource                                Control domain
                   task




 Resource Management in Globus
• The Grid Resource Allocation Management (GRAM)
  protocol and client API allow programs to be started on
  remote resources, despite local heterogeneity
• Resource Specification Language (RSL) is used to
  communicate requirements
• A layered architecture allows application-specific resource
  brokers and co-allocators to be defined in terms of GRAM
  services
   – Integrated with Condor, MPICH-G2, …
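
To make the role of RSL more concrete, the sketch below formats a flat set of requirements as an RSL conjunction of `(attribute=value)` relations. It is only an illustration: `to_rsl` is a hypothetical helper, not part of any Globus API, the quoting rule is stated as I understand it, and the attribute values are made-up examples.

```python
def to_rsl(request: dict) -> str:
    """Format a flat dict as an RSL conjunction: &(attr=value)(attr=value)...

    Hypothetical helper for illustration; it handles only simple
    attribute=value relations, not nested or multi-request RSL.
    """
    relations = []
    for attr, value in request.items():
        if isinstance(value, str):
            # Quote string values; an embedded double quote is escaped by
            # doubling it (assumed RSL quoting rule).
            value = '"' + value.replace('"', '""') + '"'
        relations.append(f"({attr}={value})")
    return "&" + "".join(relations)


# Example request: run /bin/hostname as 4 processes for at most 10 minutes.
rsl = to_rsl({"executable": "/bin/hostname", "count": 4, "maxTime": 10})
print(rsl)
```

A broker in the layered architecture would refine a loosely specified request into a string like this before handing it to a specific GRAM service.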





 Resource Management Architecture in
              Globus
                                        Broker
                                                                 RSL
                       RSL                                       specialization

                                                       Queries     Information
   Application
                                                       & Info        Service
                       Ground RSL

                                     Co-allocator

                                    Simple ground RSL
Local        GRAM                        GRAM                         GRAM
resource
managers         LSF                   EASY-LL                         NQE



         Local Resource Managers
•   Implemented with Globus Resource Allocation Manager
    (GRAM)
    –   Processes RSL specifications representing resource requests and either
        • Denies the request, or
        • Creates one or more processes (jobs) that satisfy the request
    –   Enable remote monitoring and management of jobs
    –   Periodically update MDS information service with current
        availability and capabilities of resources
•   GRAM is responsible for
    –   Parsing and processing RSL
    –   Job monitoring
    –   MDS update
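
The responsibilities listed above can be sketched as a toy local resource manager that denies or accepts a request, tracks job state, and records what it would publish to the information service. This is a simplified model under stated assumptions, not GRAM's implementation: the class, its methods, and the use of a CPU count as the only resource are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class LocalResourceManager:
    """Toy model of a GRAM-like manager (all names are hypothetical)."""
    total_cpus: int
    jobs: Dict[int, str] = field(default_factory=dict)        # job id -> state
    cpus_by_job: Dict[int, int] = field(default_factory=dict)  # CPUs held per job
    published: List[int] = field(default_factory=list)         # stand-in for MDS updates
    next_id: int = 1

    def free_cpus(self) -> int:
        return self.total_cpus - sum(self.cpus_by_job.values())

    def publish(self) -> None:
        """Record current availability, as an information-service update would."""
        self.published.append(self.free_cpus())

    def submit(self, count: int) -> Optional[int]:
        """Deny the request (return None) or create a job and return its id."""
        if count <= 0 or count > self.free_cpus():
            return None
        job_id = self.next_id
        self.next_id += 1
        self.jobs[job_id] = "ACTIVE"
        self.cpus_by_job[job_id] = count
        self.publish()
        return job_id

    def status(self, job_id: int) -> str:
        """Remote monitoring: report the state of a job."""
        return self.jobs[job_id]

    def finish(self, job_id: int) -> None:
        """Mark a job done, release its CPUs, and publish the new availability."""
        self.jobs[job_id] = "DONE"
        del self.cpus_by_job[job_id]
        self.publish()


rm = LocalResourceManager(total_cpus=8)
first = rm.submit(6)      # accepted
second = rm.submit(4)     # denied: only 2 CPUs are free
rm.finish(first)
print(first, second, rm.status(first), rm.published)
```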




          Globus/MPICH-G2 components
                          MDS client API calls
                          to locate resources
         MPI Apps                                 MDS: Grid Index Info Server
    Process MPI           MDS client API calls                                 Local site
     messages             to get resource info                                 boundary
        MPICH-G2
                                            MDS: Grid Resource Info Server
    Client API calls to
request resource allocation                                  Query current status
   and process creation.                                     of resource
                           Provide state change
                            callbacks to client  Globus Resource Manager
        Globus Security                                                    Allocate &
         Infrastructure                              Request
                                                                        create processes
                                  Globus-job-manager
                    Launch                                        Process
                                 Parse               Monitor &
         Globus                                       control     Process
       Gatekeeper                  RSL Library



High throughput workload management
system architecture (simplified design)
                                                     Resource
                                                     Discovery
             Submit jobs         Master                                GIS
             (using Class-Ads)          condor_submit
                                        (Globus Universe)



                    Condor-G                                             Information on
                                                                         characteristics and
                                                                         status of local resources
                                                     globusrun



            GRAM                 GRAM                        GRAM

            CONDOR                LSF                            PBS

    Site1
                                             Site2                     Site3

                   Condor Globus Universe
[Diagram, two sides]
Job Submission Machine: End User Requests; Condor-G Scheduler with Persistent Job Queue;
   Fork; Condor-G GridManager; GASS Server
Job Execution Site: Globus GateKeeper; Fork (twice); Globus JobManager (two instances);
   Submit (twice); Site Job Scheduler (PBS, Condor, LSF, LoadLeveler, NQE, etc.);
   Job X; Job Y

                 AO General Model
   Order       Pickup     Order               Pickup          Job             Result
  window       Window    window               Window



    Abstract Owner
                                                                     Job shop
                                  Manager                  (Estimator & Execution)
(a) External view of
    AO model
     Order      Pickup   Sales Rep.     Delivery Rep.
   window      Window
                                                                AO for Grid

                                          AO3
                                                        (d) Job scheduling step AO
  Resource Manager                      AO2

                                  AO1
                                                         Estimator     list    Executor
  Physical Resource            (c ) AO is broker


  (b) AO is Resource Owner                                     (e) Job Shop



          AO is owner or broker
• User negotiates with AO through “order window”
• That AO may own some resources, and/or it may
  broker with other AOs for those resources
• After negotiation, resources are delivered
  through “pickup window”
[Diagram: the User sends Requests to the AO's Order Window and receives Resources
 from its Pickup Window. Behind the AO are either a Resource Manager with a
 Physical Resource, or a Manager with Sales and Delivery dealing with AO1, AO2,
 AO3, each with its own Order and Pickup windows.]

                      AO Resources
• Resources are objects
• Classes are
   – Instrument
       • Data source, sink, transform
       • e.g. programs, people, files,
         data collection devices
   – Channel
       • Moves data among instruments
   – Complexes of above
• Attributes define sizes, times,
  connections, etc.
[Diagram: Instruments (File, Program, Person, Telescope) connected through Channels]
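
The resource classes above map naturally onto a small object model. The sketch below is one possible reading, not the AO authors' design: the class names follow the slide, while the fields, the `downstream` helper, and the example attribute names are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Instrument:
    """A data source, sink, or transform (program, person, file, device)."""
    name: str
    kind: str  # e.g. "file", "program", "person", "telescope"
    attributes: Dict[str, object] = field(default_factory=dict)


@dataclass
class Channel:
    """Moves data from one instrument to another."""
    source: Instrument
    sink: Instrument
    attributes: Dict[str, object] = field(default_factory=dict)


@dataclass
class Complex:
    """A composite resource built from instruments and channels."""
    instruments: List[Instrument]
    channels: List[Channel]

    def downstream(self, inst: Instrument) -> List[str]:
        """Names of instruments that receive data directly from `inst`."""
        return [c.sink.name for c in self.channels if c.source is inst]


# Hypothetical example: a telescope feeds a reduction program, which writes a file.
scope = Instrument("scope", "telescope", {"aperture_m": 2.0})
reduce_prog = Instrument("reduce", "program")
archive = Instrument("archive", "file", {"size_gb": 10})
pipeline = Complex(
    [scope, reduce_prog, archive],
    [Channel(scope, reduce_prog), Channel(reduce_prog, archive)],
)
print(pipeline.downstream(scope))
```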



       Negotiating with an AO
       Make dummy resource
         (with attributes set to
        constants, variables, or       Pick one,             Assign tasks
USER         “don’t care”)             Try again,            to resource,
       + bid + delivery plan           Or give up            use, relinquish
       + variable constraints
                                                  Perhaps
                                                  later...

                                                                Delivery
           Order Window
                                                                Window
                     Resource candidates
AO              (values for variables/attributes         Resource
                   + asking price for each)
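
The exchange in this diagram can be sketched as two functions: the AO filters its inventory against the user's "dummy resource", and the user picks a candidate, tries again with a higher bid, or gives up. This is a minimal sketch under assumptions; the bid-raising strategy, the inventory, and all attribute names are hypothetical and not taken from the AO model itself.

```python
ANY = object()  # marker for a "don't care" attribute


def candidates(inventory, request):
    """AO side: resources whose attributes satisfy every constant or constraint."""
    result = []
    for res in inventory:
        ok = True
        for attr, want in request.items():
            if want is ANY:
                continue                      # "don't care"
            have = res["attrs"].get(attr)
            if callable(want):                # variable with a constraint
                ok = have is not None and want(have)
            else:                             # constant
                ok = have == want
            if not ok:
                break
        if ok:
            result.append(res)
    return result


def negotiate(inventory, request, bid, max_rounds=3, raise_by=1.5):
    """User side: pick the cheapest affordable candidate, else raise the bid
    and try again, and give up (return None) after max_rounds."""
    for _ in range(max_rounds):
        offers = [r for r in candidates(inventory, request) if r["price"] <= bid]
        if offers:
            return min(offers, key=lambda r: r["price"])
        bid *= raise_by
    return None


# Hypothetical inventory held by one AO, with an asking price per resource.
inventory = [
    {"name": "clusterA", "attrs": {"arch": "x86", "cpus": 16}, "price": 12.0},
    {"name": "clusterB", "attrs": {"arch": "x86", "cpus": 64}, "price": 30.0},
    {"name": "sp2", "attrs": {"arch": "power", "cpus": 128}, "price": 20.0},
]

# Dummy resource: arch is a constant, cpus a constrained variable, site "don't care".
request = {"arch": "x86", "cpus": lambda n: n >= 8, "site": ANY}
deal = negotiate(inventory, request, bid=10.0)
print(deal["name"] if deal else "gave up")
```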


    Economic Models for Trading
•   Commodity Market Model
•   Posted Prices Models
•   Bargaining Model
•   Tendering (Contract Net) Model
•   Auction Model
•   Proportional Resource Sharing Model
•   Shareholder Model
•   Partnership Model



Economy Grid = Globus + GRACE
                                  Applications
Science      Engineering        Commerce         Portals         …           ActiveSheet
                                                                                               Grid
                                                                                               Apps.
GlobusView        High-level Services and Tools                         Grid Status
                                                                                              Grid
 DUROC          MPI-G       MPI-IO         CC++        Nimrod/G              globusrun
                                                                                              Tools

                Heartbeat         Core Services
  Nexus          Monitor                                     GRAM             GRACE-TS
                                       Globus
                                      Security
                                                                                            Grid
 MDS      GASS          DUROC        Interface       GARA           GMD          GBank      Middleware


Condor    GRD       QBank          Local                   JVM        TCP         UDP          Grid
                                  Services                                                     Fabric
LSF       PBS       eCash                             Linux           Irix        Solaris


                                                                 Source: Rajkumar Buyya (Monash Univ.)

  Grid Architecture for Computational
  Economy
                                                                 Grid Market       Information
                                                                  Services          Server(s)
                                                  Sign-on                                              Health
                                                                                                       Monitor
                                                   Info ?
                            Grid Explorer                          Grid Node N

                                                   Secure
               Job
Application    Control                                           Grid Node1
                           Schedule Advisor         QoS
               Agent                                                                       Pricing
                                                                    Trade Server          Algorithms
                                                  Trading
                            Trade Manager                                                 Accounting
                                                                     Resource
                                                                    Reservation          Misc. services

                                                     …
                     Deployment Agent
                                                  JobExec          Resource Allocation

Grid User          Grid Resource Broker            Storage
                                                                    R1        R2     …         Rm
                                            Grid Middleware
                                                Services                 Grid Service Providers
Source: Rajkumar Buyya (Monash Univ.)


GRACE components

• A resource broker (e.g., Nimrod/G)
• Various resource trading protocols for different economic
  models
• A mediator for negotiating between users and grid service
  providers (Grid Market Directory)
• A deal template for specifying resource requirements and
  service offers
• Grid Trading Server
• Pricing policy specification
• Accounting (e.g., QBank) and payment management
  (GridBank, not yet implemented)

 Flow Diagram for Pricing, Accounting, Allocations
 and Job Scheduling
[Diagram: Pricing Policy, Trade Server (TS), GRID Bank (digital transactions),
 QBank (QB), DB@Each Site, Resource Manager (IBM-LL/PBS/…), Compute Resources
 (clusters/SGI/SP/...); the numbered arrows correspond to the steps below.]
0. Make deposits, transfers, refunds, queries/reports.
1. Client negotiates for access cost.
2. Negotiation is performed per owner-defined policies.
3. If client is happy, TS informs QB about access deal.
4. Job is submitted.
5. Check with QB for “go ahead”.
6. Job starts.
7. Job completes.
8. Inform QB about resource utilization.
Source: Rajkumar Buyya (Monash Univ.)
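
The numbered steps can be traced in a small simulation. The sketch below is not QBank's or GRACE's interface: the class and method names, the per-CPU-second pricing, and the owner policy of a floor and an asking rate are all assumptions chosen to keep the example short.

```python
class QBank:
    """Toy accounting service (hypothetical interface)."""

    def __init__(self):
        self.balance = {}   # user -> credit
        self.deals = {}     # user -> agreed price per CPU-second

    def deposit(self, user, amount):                 # step 0
        self.balance[user] = self.balance.get(user, 0.0) + amount

    def record_deal(self, user, rate):               # step 3
        self.deals[user] = rate

    def go_ahead(self, user, est_cpu_seconds):       # step 5
        rate = self.deals.get(user)
        if rate is None:
            return False
        return self.balance.get(user, 0.0) >= rate * est_cpu_seconds

    def charge(self, user, cpu_seconds):             # step 8
        cost = self.deals[user] * cpu_seconds
        self.balance[user] -= cost
        return cost


class TradeServer:
    """Negotiates access cost under an owner-defined policy (steps 1-3)."""

    def __init__(self, bank, asking_rate, floor_rate):
        self.bank = bank
        self.asking_rate = asking_rate
        self.floor_rate = floor_rate

    def negotiate(self, user, offer):
        if offer < self.floor_rate:      # owner policy rejects the offer
            return None
        rate = min(offer, self.asking_rate)
        self.bank.record_deal(user, rate)
        return rate


def run_job(bank, user, est_cpu_seconds, actual_cpu_seconds):
    """Steps 4-8: submit, check with the bank, run, and report usage."""
    if not bank.go_ahead(user, est_cpu_seconds):
        return None
    # Steps 6-7: the resource manager would start and complete the job here.
    return bank.charge(user, actual_cpu_seconds)


bank = QBank()
bank.deposit("alice", 100.0)
ts = TradeServer(bank, asking_rate=0.5, floor_rate=0.2)
rejected = ts.negotiate("alice", 0.1)
agreed = ts.negotiate("alice", 0.25)
cost = run_job(bank, "alice", est_cpu_seconds=300, actual_cpu_seconds=280)
print(rejected, agreed, cost, bank.balance["alice"])
```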

Nimrod/G : A Grid Resource Broker
• A resource broker for managing, steering, and executing task
  farming (parametric sweep/SPMD model) applications on Grid
  based on deadline and computational economy.
• Based on users’ QoS requirements, our Broker dynamically leases
  services at runtime depending on their quality, cost, and
  availability.
• Key Features
   – A single window to manage & control experiment
   – Persistent and Programmable Task Farming Engine
   – Resource Discovery
   – Resource Trading
   – Scheduling & Predictions
   – Generic Dispatcher & Grid Agents
   – Transportation of data & results
   – Steering & data management
   – Accounting                   Source: Rajkumar Buyya (Monash Univ.)
                          A Glance at Nimrod-G Broker
                      Nimrod/G Client       Nimrod/G Client        Nimrod/G Client




                                   Nimrod/G Engine
                                                                                  Schedule Advisor


              Grid                                                                   Trading Manager
              Store

                                    Grid Dispatcher                                   Grid Explorer
Grid Middleware
   Globus, Legion, Condor, etc.                                  TM      TS
                                                                                              GE        GIS


                                                                                 Grid Information Server(s)
                                                 RM & TS
            RM & TS                                           RM & TS
                             G
                                                                         C
                                             L
        G
                                            Legion enabled
Globus enabled node.          L             node.
                                                                                              G     C L
                                  RM: Local Resource Manager, TS: Trade Server                    Condor enabled node.

                                                                             Source: Rajkumar Buyya (Monash Univ.)


                        Nimrod/G Interactions
                Nimrod-G Grid Broker        Grid Info
                                             Server

                                Grid
                              Scheduler    Grid Trade
  Grid Tools        Task                     Server
     And          Farming
 Applications
                   Engine                                      Local
                                             Process                           Nimrod        User
                                 Grid                         Resource         Agent        Process
                              Dispatcher      Server          Manager


Do this in 30
min. for $10?

                                  File                  File access
                                 Server

                  User Node                        Grid Node                    Compute Node

                                                                  Source: Rajkumar Buyya (Monash Univ.)



       Adaptive Scheduling Steps

[Diagram: boxes Discover Resources, Establish Rates, Compose & Schedule,
 Discover More Resources, Evaluate & Reschedule, and Distribute Jobs, with the
 decision "Meet requirements? Remaining Jobs, Deadline, & Budget?"]
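
The adaptive scheduling steps named on this slide (discover resources, establish rates, compose and schedule, distribute jobs, evaluate and reschedule) can be read as a loop that re-plans every time step against the remaining jobs, deadline, and budget. The sketch below is one simple cost-minimising interpretation, not Nimrod/G's actual algorithm; the discrete ticks, the integer costs, and the `discover` callback are assumptions for illustration.

```python
def plan(resources, jobs_left, ticks_left):
    """Compose a schedule: add the cheapest resources until the remaining
    jobs fit into the remaining time (or every resource is in use)."""
    chosen, capacity = [], 0
    for res in sorted(resources, key=lambda r: r["cost_per_job"]):
        if capacity >= jobs_left:
            break
        chosen.append(res)
        capacity += res["jobs_per_tick"] * ticks_left
    return chosen


def adaptive_schedule(discover, jobs, deadline, budget):
    """Re-plan each tick until the jobs, the deadline, or the budget run out."""
    spent, tick = 0, 0
    while jobs > 0 and tick < deadline:
        resources = discover(tick)            # discover (more) resources and rates
        for res in plan(resources, jobs, deadline - tick):
            affordable = (budget - spent) // res["cost_per_job"]
            n = min(res["jobs_per_tick"], jobs, affordable)
            jobs -= n                         # distribute jobs
            spent += n * res["cost_per_job"]
        tick += 1                             # evaluate, then reschedule
    return {"jobs_left": jobs, "ticks": tick, "spent": spent}


def discover(tick):
    """Hypothetical resource pool: a faster, pricier resource appears at tick 2."""
    pool = [{"name": "cheap", "jobs_per_tick": 2, "cost_per_job": 1}]
    if tick >= 2:
        pool.append({"name": "fast", "jobs_per_tick": 5, "cost_per_job": 3})
    return pool


result = adaptive_schedule(discover, jobs=20, deadline=5, budget=50)
print(result)
```

With a smaller budget the same loop stops distributing jobs once the money runs out, so the caller can see how many jobs remain and decide whether to relax the deadline or the budget.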




                                            Source: Rajkumar Buyya (Monash Univ.)


           Concluding Remarks
•   Restrictions in Grid middleware
    –   Difficulties in distributed computing and resource
        management policy
    –   Difficulties in implementing the middleware required for
        heterogeneous systems in a meta-computing infrastructure
•   Globus, Condor, TENT, PARIS, Cactus, …
•   Difficulties of resource management in Grid computing
•   Models for Grid resource management architecture
    –   Hierarchical, AO, and Market model