The LA Grid Meta-Scheduling Project

Team: Liana Fong (1), S. Masoud Sadjadi (2), Yanbin Liu (1), Ivan Rodero (3), David Villegas (2), Selim Kalayci (2), Norman Bobroff (1), and Julita Corbalan (3)
1: IBM T. J. Watson Research Center, Yorktown Heights, NY 10598
2: Florida International University, Miami, FL 33199
3: Barcelona Supercomputing Center, Barcelona, Spain

I: Objectives

Objective
• Support interoperation and cooperation of a network of distributed schedulers

Strategic Importance
• Enhance usability: a common job control language for different resource domains
• Drive interoperability of schedulers: proprietary and open-source
• Provide integrated scheduling views for enterprise and grid customers

Technology Benefits
• Meet various user service objectives: policy driven (e.g. capability based, response time based)
• Maximize resource availability to users with transparency of locations
• Optimize utilization of resources across domains

II: P2P Meta scheduling

[Figure: the FIU, BSC and IBM meta-schedulers connected peer-to-peer. Domains shown: FIU-GCB (SGE) and LAGrid (Fork) at FIU; CEPBA (LL/Fork) and BSCgrid (Fork) at BSC; IBM-USA (TDWB) and IBM-India (TDWB) at IBM. Legend: C -> P: job flow is from C to P, resource info flow is from P to C.]

Some key aspects of the Metascheduler Protocol:
• Heterogeneous sites; the inner structure of domains doesn't affect the functionality of the protocol.
• Site autonomy; each metascheduler is responsible for its own site, and offers as much information as it wants to other sites.
• Peer-to-peer; no centralized body, no single point of failure.

III: Related Work

Centralized model:
• Meta-scheduling has direct information of all resources available at the various institutes of the virtual organization
• Responsible for scheduling job execution on all resources
• Local schedulers at individual institutes will act as job dispatchers.

Hierarchical model:
• Meta-scheduling has no direct access to resources in the virtual organization
• Assigns jobs to the local schedulers of the various institutes
• Local schedulers will match jobs to resources.

Distributed model:
• Multiple local schedulers, each with a companion meta-scheduling functional entity
• Local schedulers can submit jobs to each other through their respective meta-scheduling functional entities.

[Figures: diagrams of the models, showing a meta-scheduler over three local schedulers, a meta-scheduler over three local dispatchers, and three paired meta-scheduler/local-scheduler entities. Legend: job flow, info flow.]

Ref: “Distributed job scheduling on computational grid using multiple Simultaneous Requests” by Vijay Subramani, Rajkumar Kettimuthu, Srividya Srinivasan, and P. Sadayappan
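
The consumer/provider relationship from section II (jobs flow from a consumer C to a provider P, resource information flows from P to C) can be illustrated with a minimal Python sketch. Everything named here (the MetaScheduler class, its fields, the slot-count advertisement, and the "most advertised slots first" ordering) is a hypothetical illustration and not part of the LA Grid protocol itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class MetaScheduler:
    """Hypothetical peer meta-scheduler; names and fields are illustrative only."""
    name: str
    free_slots: int
    # Resource info received from providers: provider name -> advertised free slots.
    known_providers: Dict[str, int] = field(default_factory=dict)
    accepted_jobs: List[str] = field(default_factory=list)

    def advertise_to(self, consumer: "MetaScheduler") -> None:
        # Info flow: provider (self) -> consumer. Site autonomy: the provider
        # chooses what to reveal; this sketch reveals only a slot count.
        consumer.known_providers[self.name] = self.free_slots

    def accept(self, job: str) -> bool:
        # A provider may refuse a job; the advertised count can be stale.
        if self.free_slots == 0:
            return False
        self.free_slots -= 1
        self.accepted_jobs.append(job)
        return True

    def route(self, job: str, peers: Dict[str, "MetaScheduler"]) -> Optional[str]:
        # Job flow: consumer (self) -> provider. Assumed policy: try the
        # provider that advertised the most free slots first.
        ordered = sorted(self.known_providers,
                         key=self.known_providers.get, reverse=True)
        for provider_name in ordered:
            if peers[provider_name].accept(job):
                return provider_name
        return None  # no provider accepted the job
```

Because no peer holds global state, a consumer in this sketch only knows what each provider chose to advertise, and a provider can still decline a job, which mirrors the site-autonomy point above.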



IV: System architecture

Connection API
• Establish and terminate connections between domain meta-schedulers.
• Negotiate roles and connection parameters using the interface
   • Provider roles: provide resources for job execution; responsible for sending out resource information
   • Consumer roles: use resources provided by providers; route job requests to providers.
• Send heart beats: exchanged to guarantee the healthy state of the connection.

Resource exchange API
• Exchange the scheduling capability and capacity of the domain controlled by the meta-scheduler
   • Exchanged information can be a complete or incremental set of data

Job management API
• Submit, re-route and monitor job executions across schedulers

[Figure: the FIU and IBM meta-schedulers each contain Connection Management, Job Management and Resource Management components, linked pairwise by the Connection API, the Job Mgmt API and the Resource exchange API. A WS Interface on an Apache Axis2 Server fronts the Resource Management, Job Management and Connection Management components.]

FIU

[Figure: FIU meta-scheduler. JSDL job descriptions enter through the User Client; components shown are the Global Scheduling manager, Resource manager, WS Client, Site scheduling manager, Gridway, and two Globus instances, one over SGE on the GCB cluster and one over Fork on the LAGrid cluster.]

1. User Client takes the job request from the local User. This request is forwarded to the Global Scheduling Manager (GSM).
2. GSM queries the Resource Manager (RM) for resources. RM stores information about local and remote resources.
3. If available resources are found on the local site, the job request is forwarded to the Site Scheduling Manager (SSM).
4. SSM leverages Gridway functionality to submit the job to the Grid Middleware (Globus).
5. If no resources are available locally, the job request is sent to a remote site through the WS Client.
6. Alternatively, job requests from other peers can be received from the WS layer.

IBM

[Figure: IBM meta-scheduler. Inputs shown are a Web Console, a Command-line interface and JSDL; the internal components are marked IBM Confidential.]

BSC

[Figure: BSC meta-scheduler. The eNANOS Client (JSDL, Command-line, Java API) feeds the eNANOS Resource Broker; a GT4 Container hosts the LAGrid Plugin and LAGrid RP next to a WS Client; Globus instances sit over LoadLeveler on the CEPBA resources and over Fork on the BSCGrid cluster.]

1. The eNANOS Client forwards the user requests to the eNANOS Broker.
2. Remote requests from the P2P infrastructure are managed by a regular WS (Axis2) acting as a wrapper to a GT4 service that implements the LAGrid APIs and protocols. Connections and other data are stored in Resource Properties.
3. Jobs and resources (aggregated data) obtained from local and remote sites are used in the eNANOS Resource Broker scheduling. Jobs are executed under the local domain through Globus services, or are forwarded to other meta-schedulers.
4. eNANOS provides its resource data, forwards jobs and performs other operations (such as sending heart beats) through a WS Client.
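
The FIU job path (steps 1 to 5 of the FIU list) amounts to a "local first, otherwise remote" routing decision. The following is a minimal Python sketch of that decision only. The class and function names, the CPU-count resource model, and the "rejected" outcome when no site fits are assumptions made for illustration; the poster does not specify them, and the real components are built on Gridway, Globus and web services rather than on in-process callbacks.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Job:
    job_id: str
    cpus: int  # assumed resource requirement; the poster does not define one


class ResourceManager:
    """Step 2: holds information about local and remote resources (free CPUs here)."""

    def __init__(self, local_free: int, remote_free: Dict[str, int]) -> None:
        self.local_free = local_free
        self.remote_free = dict(remote_free)

    def local_fits(self, job: Job) -> bool:
        return job.cpus <= self.local_free

    def remote_site_for(self, job: Job) -> Optional[str]:
        # Return the first known remote site that advertises enough free CPUs.
        for site, free in self.remote_free.items():
            if job.cpus <= free:
                return site
        return None


class GlobalSchedulingManager:
    """Steps 2, 3 and 5: query the RM, then hand off locally or remotely."""

    def __init__(self, rm: ResourceManager,
                 submit_local: Callable[[Job], None],
                 submit_remote: Callable[[str, Job], None]) -> None:
        self.rm = rm
        self.submit_local = submit_local    # stands in for the SSM (steps 3 and 4)
        self.submit_remote = submit_remote  # stands in for the WS Client (step 5)

    def schedule(self, job: Job) -> str:
        if self.rm.local_fits(job):
            self.submit_local(job)
            return "local"
        site = self.rm.remote_site_for(job)
        if site is not None:
            self.submit_remote(site, job)
            return site
        return "rejected"  # assumption: the poster does not describe this case
```

For example, with 4 free local CPUs and remote sites advertising 16 and 8 CPUs, a 2-CPU job stays local while an 8-CPU job is handed to the first remote site that fits. The sketch does not update the free-CPU counts after a submission; a real resource manager would refresh them through the Resource exchange API.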

				