Grid Resource Management and Scheduling
Review of Scheduling Concepts
       The multitasking function of any current OS needs the CPU
        and other resources to be shared and utilized among
        processes.
       Scheduling criteria
            throughput (jobs per time unit)
            turn-around time (from submission to completion)
            waiting time (total wait time in the ready queue)
            response time (for an interactive process, from issuing a
             command to the first response)

    2                                                               Grid Computing
Scheduling Algorithms
       First Come First Serve
       Shortest Job First (Assign CPU to the process that has
        smallest CPU burst in ready queue)
       Priority scheduling (Pick the highest priority in the
        queue to be executed next)
       Round Robin
       Random selection
       Etc.
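
The criteria and the first two algorithms above can be compared on a small example. The sketch below is only an illustration: it assumes all jobs arrive at time 0 and run without preemption, and the burst times and function names are made up for the example.

```python
def schedule_metrics(bursts):
    """Return (avg_waiting, avg_turnaround) for jobs run in the given order.

    Assumes every job arrives at time 0 and runs to completion.
    """
    clock = 0
    waits, turnarounds = [], []
    for burst in bursts:
        waits.append(clock)          # time spent in the ready queue
        clock += burst
        turnarounds.append(clock)    # submission (t=0) to completion
    n = len(bursts)
    return sum(waits) / n, sum(turnarounds) / n


def fcfs(bursts):
    # First Come First Serve: keep the submission order.
    return schedule_metrics(list(bursts))


def sjf(bursts):
    # Shortest Job First: run the smallest CPU burst first.
    return schedule_metrics(sorted(bursts))


bursts = [24, 3, 3]  # hypothetical CPU bursts, in arbitrary time units
print("FCFS:", fcfs(bursts))   # (17.0, 27.0)
print("SJF: ", sjf(bursts))    # (3.0, 13.0)
```

With these bursts, running the long job first makes the two short jobs wait behind it, which is why SJF gives a much lower average waiting time here.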

Job Scheduling Problem
       A Grid system needs efficient management of
        heterogeneous, geographically distributed, and
        dynamically available resources.
       “How can we execute a set of tasks T on a set of
        processors P, subject to some set of optimizing
        criteria C?”
       A job can be anything that needs a resource
           application(s), Web queries, …
       A job can consist of many tasks
       A resource can be anything that can be scheduled
           machine, storage space, bandwidth, …
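
One concrete instance of the question above, given only as a sketch: T is a set of independent tasks with known run times, P is a set of identical processors, and C is the makespan (the finish time of the busiest processor). The code uses the longest-processing-time-first heuristic, which is a common greedy approach and not an optimal solver; the task times are invented.

```python
import heapq


def lpt_schedule(task_times, num_procs):
    """Assign independent tasks to identical processors (LPT heuristic).

    Returns (makespan, assignment), where assignment[p] lists the task
    indices placed on processor p.
    """
    # Min-heap of (current load, processor id): least-loaded processor on top.
    heap = [(0, p) for p in range(num_procs)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_procs)]
    # Place the longest tasks first.
    order = sorted(range(len(task_times)),
                   key=lambda i: task_times[i], reverse=True)
    for i in order:
        load, p = heapq.heappop(heap)
        assignment[p].append(i)
        heapq.heappush(heap, (load + task_times[i], p))
    makespan = max(load for load, _ in heap)
    return makespan, assignment


makespan, assignment = lpt_schedule([7, 5, 4, 3, 1], num_procs=2)
print(makespan, assignment)   # 10 [[0, 3], [1, 2, 4]]
```

A real Grid adds heterogeneous processors, task dependencies, and changing availability, none of which this toy model covers.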
Complications in Scheduling
       Efficient application performance and efficient
        system performance are not necessarily the same
       It may not be possible to obtain optimal
        performance for multiple applications
       Load balancing may not provide the optimal
        scheduling policy
       Application and system environment must be
        modeled in some detail in order to determine a
        performance-efficient schedule

Scheduling Paradigms
       Centralized Scheduling
           Jobs are submitted to a central scheduler which dispatches
            them to appropriate nodes
           Advantage: up-to-date information leads to better
            scheduling decisions
           Disadvantage: scalability and reliability (the scheduler can
            become a bottleneck and a single point of failure)
       Distributed Scheduling
           Jobs are submitted to local schedulers, which interact with
            other local schedulers to dispatch jobs
           Advantage: scalability, reliability
           Disadvantage: out-of-date information leads to sub-optimal
            scheduling decisions
       Hierarchical Scheduling
           Jobs are submitted to a central scheduler which interacts
            with local schedulers to dispatch jobs.
           Advantage: global scheduler and local scheduler can have
            different policies in scheduling
           Disadvantage: central scheduler can be a bottleneck

How scheduling works
1.   Resource discovery (identify available resources)
      The Pull Model: a single daemon queries grid resources
      and collects state information (CPU load, available …)
      The Push Model: each resource sends its local state to the
      central scheduler.
2.   Resource selection (select the best resource based on the
     user constraints)
3.   Schedule generation (select jobs from the queue for
     execution)
4.   Job execution (submit jobs to the resources for
     execution)
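
The difference between the pull and push models in step 1 comes down to who initiates the state transfer. The following sketch uses hypothetical class names and made-up load values to show both directions ending with the same information at the scheduler.

```python
class Resource:
    """A toy grid resource with some local state (values are made up)."""

    def __init__(self, name, cpu_load):
        self.name = name
        self.cpu_load = cpu_load

    def report_state(self):
        return {"name": self.name, "cpu_load": self.cpu_load}

    def push_to(self, scheduler):
        # Push model: the resource sends its own state to the scheduler.
        scheduler.receive(self.report_state())


class CentralScheduler:
    def __init__(self):
        self.known_state = {}

    def receive(self, state):
        self.known_state[state["name"]] = state

    def pull_from(self, resources):
        # Pull model: one collector queries every resource in turn.
        for resource in resources:
            self.receive(resource.report_state())


resources = [Resource("node-a", 0.85), Resource("node-b", 0.10)]

pull_sched = CentralScheduler()
pull_sched.pull_from(resources)

push_sched = CentralScheduler()
for r in resources:
    r.push_to(push_sched)

print(pull_sched.known_state == push_sched.known_state)   # True
```

In practice the two models differ in load and freshness: pulling concentrates the query cost in one daemon, while pushing lets each resource decide when to report.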
Popular (local) schedulers
       Condor
       SGE (Sun Grid Engine)
       PBS (Portable Batch System)
       LSF (Load Sharing Facility)

Grid Scheduling (Broker)
     “The process of making scheduling decisions involving
      resources over multiple administrative domains.”

1.        Searching multiple administrative domains for resources
2.        Selecting resource set (1 or many resources on 1 or many
3.        Assigning tasks within that resource set

     A job is anything that needs a resource.
     A resource is anything that can be scheduled.
     Currently, the user is the most common “Grid scheduler”.
Grid Scheduler vs. Local Scheduler
    An operating system (like Windows and Linux) is
     responsible for scheduling and managing resources of
     a single computer.
    A local scheduler (or local resource manager) is
     responsible for scheduling and managing resources at
     a single site, or perhaps a single cluster.
    A Grid scheduler manages a cluster of clusters of
     computers. However, it does not have ownership of or
     control over the resources at a site.

No Control
    Unlike the local scheduler, a Grid scheduler does not own
     the resources at a site.
    Also, a Grid scheduler does not have control over local
    A Grid scheduler must make best-effort decisions and
     then submit the job to the selected resources or, more
     precisely, to their local schedulers.
    Often, a Grid scheduler does not know about all the
     jobs being sent to the resources it is considering using.
    Jobs are submitted to a local scheduler as a user.

How good can best-effort decisions be?
    The decisions a scheduler makes are only as good as
     the information provided to it.
    Many theoretical schedulers assume that all detailed
     information is accurately and instantly available, which
     is hardly ever true.
    Most of the time, we have only the highest level of
     information (no global, real-time, fine-grained information).

What kind of information?
    About an application, it may be known that
        It needs to run on Linux.
        It will produce between 20 and 30 MB of output files.
        It should take less than 3 hours, but may take up to 5 hours.
    About a machine, it may be known that
        It is running Linux.
        It had 500 MB of free storage 10 minutes ago.
        Its CPU utilization was 85% 10 minutes ago.
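
A scheduler working from this kind of coarse, aging information can often only say "probably" rather than "yes". The sketch below encodes the example values from this slide; the record types, the 15-minute staleness limit, and the 90% busy threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class JobEstimate:
    os: str
    max_output_mb: float      # upper end of the expected output size


@dataclass
class MachineSnapshot:
    os: str
    free_storage_mb: float    # as last reported
    cpu_utilization: float    # 0.0 to 1.0, as last reported
    age_minutes: float        # how old the report is


def assess(job, machine, max_age_minutes=15, busy_threshold=0.9):
    """Classify a machine as 'no', 'maybe', or 'likely' for a job."""
    if machine.os != job.os:
        return "no"           # static mismatch: never feasible
    if machine.free_storage_mb < job.max_output_mb:
        return "no"           # not enough space even when last seen
    if machine.age_minutes > max_age_minutes:
        return "maybe"        # report too old to trust (assumed limit)
    if machine.cpu_utilization > busy_threshold:
        return "maybe"        # machine looked heavily loaded (assumed limit)
    return "likely"


job = JobEstimate(os="Linux", max_output_mb=30)
machine = MachineSnapshot(os="Linux", free_storage_mb=500,
                          cpu_utilization=0.85, age_minutes=10)
print(assess(job, machine))   # likely
```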

Where to get information?
    Generally, Grid schedulers get information from a Grid
     Information System (GIS)
    GIS gathers information from individual local
    Examples of GIS systems include
        Globus Monitoring and Discovery Service (MDS2)
        Grid Monitoring Architecture (GMA)
    Common features
        Organizing sets of information providers in a Grid
        Gather information repeatedly
        Provide information via well-defined schema and protocol

Stages of Grid Scheduling
Phase One - Resource Discovery
       1. Authorization Filtering
       2. Application Definition
       3. Min. Requirement Filtering
Phase Two - System Selection
       4. Information Gathering
       5. System Selection
Phase Three - Job Execution
       6. Advance Reservation
       7. Job Submission
       8. Preparation Tasks
       9. Monitoring Progress
       10. Job Completion
       11. Clean-up Tasks

Phase 1: Resource Discovery
 Determining which resources are available to a given
  user and meet the minimal feasibility requirements for
  the job.
1. Authorization Filtering
2. Job Requirement Definition
3. Minimum Requirement Filtering

1. Authorization Filtering
   Determine the set of resources that the user submitting
    the job has access to.
   Without authorization, the job cannot run.
   At the end of this step, the user should have a list of
    machines or resources that they can access (that is, on
    which they have an account).
   Common solution: keep a list of account names,
    machines, and passwords
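
Authorization filtering amounts to a lookup against that list. A minimal sketch, assuming a hypothetical mapping from machine names to the accounts that exist on them (passwords are deliberately left out):

```python
def authorized_resources(user, accounts):
    """Return the sorted machine names on which `user` has an account.

    `accounts` maps machine name -> set of account names; it is a
    hypothetical stand-in for the user's own list of accounts and machines.
    """
    return sorted(machine for machine, users in accounts.items()
                  if user in users)


accounts = {
    "cluster-a": {"alice", "bob"},
    "cluster-b": {"bob"},
    "cluster-c": {"alice"},
}
print(authorized_resources("alice", accounts))   # ['cluster-a', 'cluster-c']
```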

2. Application Requirement Definition
   The user must be able to specify some minimal set of job
    requirements in order to further filter the set of feasible
    resources.
   Specify job requirements
        Static: architecture (Intel, SPARC, etc.), OS, software, etc.
        Dynamic: number of processors, memory, storage space, expected
         execution time, etc.
   Ideally:
        Smart tools to automatically generate information about
         application requirements at runtime
   Today’s systems:
        Generally user defined
        Defined in command line or submission script or Condor
        Often inaccurate, incomplete

On a Grid, …
   On a Grid, application requirements change according
    to the heterogeneity of the systems.
   For example, expected execution time depends on the
    performance of the machine that the job is assigned to.
   Executable programs and libraries may be made
    available for different architectures and OS’s
        They are also different in performance and resource
   Often, requirement estimates must be padded to compensate
    for estimation error (as much as 50%).
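
As a rough illustration of both points, the sketch below scales a reference run time by a machine's relative speed and then pads it. Linear scaling with speed is a simplifying assumption, and the function and parameter names are hypothetical; the default 0.5 margin mirrors the 50% figure above.

```python
def padded_runtime(reference_hours, relative_speed, error_margin=0.5):
    """Estimate a run time on a target machine and pad it for error.

    reference_hours: run time measured or guessed on a reference machine.
    relative_speed:  target speed / reference speed (2.0 = twice as fast).
    error_margin:    fractional padding added on top of the scaled estimate.
    """
    if relative_speed <= 0:
        raise ValueError("relative_speed must be positive")
    scaled = reference_hours / relative_speed   # assumes linear speed-up
    return scaled * (1 + error_margin)


# A 3-hour job on a machine assumed to be twice as fast as the reference.
print(padded_runtime(3.0, relative_speed=2.0))   # 2.25
```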

3. Minimum Requirement Filtering
   Filter out resources that do not meet the minimal job
    requirements.
   Mostly uses static data as the filter.
   The result is a reduced set of possible resources to
    investigate in more detail in the next step.
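
A minimal sketch of this filtering step, using only static attributes. The resource records, attribute names, and values are invented for the example.

```python
def meets_minimum(resource, requirements):
    """True if every static requirement is satisfied by the resource."""
    return (
        resource["arch"] == requirements["arch"]
        and resource["os"] == requirements["os"]
        # Every required package must be installed (subset test on sets).
        and requirements["software"] <= resource["software"]
    )


def filter_resources(resources, requirements):
    return [r["name"] for r in resources if meets_minimum(r, requirements)]


resources = [
    {"name": "m1", "arch": "Intel", "os": "Linux", "software": {"mpi", "python"}},
    {"name": "m2", "arch": "SPARC", "os": "Solaris", "software": {"mpi"}},
    {"name": "m3", "arch": "Intel", "os": "Linux", "software": set()},
]
requirements = {"arch": "Intel", "os": "Linux", "software": {"mpi"}}
print(filter_resources(resources, requirements))   # ['m1']
```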

Phase 2: System Selection
 Selecting a single resource set from possible resources
  that meet minimum requirements.
4. Dynamic Information Gathering
5. System Selection

4. Dynamic Information Gathering
   Dynamic searches to match resources with application
    requirements
   Information sources
        Grid Information System (GIS)
        Local scheduler
   Issue
        Local site policies may specify the percentage of the
         resources, in terms of capacity and time, to be allocated to
         the grid. These details must be considered as part of the
         dynamic collection of data.

Information Gathering
   What data do we need?
   What is the right way to collect it?
        Scalability (more accuracy needs more queries)
        Consistency (data being cached for faster search)
   How long will it remain valid?
        update rates
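
The trade-off between scalability and consistency can be seen in a small cache with a validity period. This is a sketch under stated assumptions: the class name, the 300-second validity period, and the injected clock are all hypothetical choices for the example.

```python
import time


class CachedInfo:
    """Cache resource state and re-query only when an entry is too old."""

    def __init__(self, query, ttl_seconds, clock=time.monotonic):
        self.query = query            # function: resource name -> state
        self.ttl = ttl_seconds        # how long an answer is treated as valid
        self.clock = clock            # injectable so the example can fake time
        self.cache = {}               # name -> (timestamp, state)
        self.queries_made = 0

    def get(self, name):
        now = self.clock()
        entry = self.cache.get(name)
        if entry is not None and now - entry[0] <= self.ttl:
            return entry[1]           # fresh enough: no new query
        state = self.query(name)
        self.queries_made += 1
        self.cache[name] = (now, state)
        return state


fake_now = [0.0]
info = CachedInfo(query=lambda name: {"cpu_load": 0.5}, ttl_seconds=300,
                  clock=lambda: fake_now[0])
info.get("node-a")            # first call queries the resource
info.get("node-a")            # served from the cache
fake_now[0] = 301.0
info.get("node-a")            # entry expired, queried again
print(info.queries_made)      # 2
```

A longer validity period means fewer queries but staler answers; a shorter one does the opposite.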

5. System Selection
   Decide which resources to use by matching between resources
    and application information.
   May involve 2 steps – choosing resource(s) and then mapping
    tasks within those choices
   What is needed:
        Matches based on current information, using variance information
         and other predictions
   User’s selection:
        Best estimate
   Today’s systems:
        Condor - matchmaking
        PBS - heuristic algorithms
        Maui/Silver - submit to local sites, evaluate
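
One way to use variance information in the selection, offered here only as an illustration and not as what any of the systems above do: rank each candidate by its predicted mean run time plus a penalty proportional to the spread of the prediction. The forecasts and the weighting are invented.

```python
def select_resource(predictions, risk_weight=1.0):
    """Pick the resource with the lowest risk-adjusted predicted run time.

    predictions: name -> (mean_hours, stddev_hours), hypothetical forecasts.
    risk_weight: how strongly uncertainty in the forecast is penalized.
    """
    def score(name):
        mean, stddev = predictions[name]
        return mean + risk_weight * stddev

    return min(predictions, key=score)


predictions = {
    "fast-but-erratic": (2.0, 2.0),
    "slow-but-steady": (3.0, 0.2),
}
print(select_resource(predictions))                    # slow-but-steady
print(select_resource(predictions, risk_weight=0.0))   # fast-but-erratic
```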

6. Advance Reservation (Optional)
   Reserve resources in advance.
   Users:
        Call up system administrator to reserve resources.
   Ideally:
        Automatically done when you submit a job based on user’s
   Current systems
        Enabled in PBSPro and Maui

7. Job Submission
   Submit the job to the selected resources
   No standards for job submission
   Current systems
        Local schedulers
            Scheduler-specific commands, e.g. qsub
            Job scripts (shell script with scheduler directive commands)
        Globus GRAM
            Wrap local scheduling submission
        Each has its own API

8. Preparation tasks
   May involve directory setup, staging, file transferring,
    claiming a reservation, or other actions needed to
    prepare the resources for applications.
   Users do this by hand or in job scripts
        scp, ftp, mkdir
   Ideally
        Automatically done as part of job submission
   Current systems
        Condor/DAGMan can do file staging

9. Monitoring Progress
   How is my job doing?
   Should I move it somewhere else?
   Users:
        qstat
        Moving is hard to do, so generally not done
   Ideally:
        System takes care of it based on intuitive knowledge of user
         requirements, and good prediction techniques
   Current Systems:
        Every local scheduler has a stat command
        globus-job-status command

10. Job Completion
    When the job is finished, the user needs to be notified.
    Submission scripts for parallel environments often
     include e-mail notification parameters.
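
As an example of such parameters, the sketch below builds the text of a PBS-style job script that asks for mail when the job ends. The directives shown (-N, -l walltime, -m, -M) are standard qsub options, but sites differ, so check the local scheduler's documentation; the job name, command, and address are placeholders.

```python
def pbs_script(job_name, command, walltime, email):
    """Build the text of a PBS-style job script with end-of-job mail."""
    lines = [
        "#!/bin/sh",
        f"#PBS -N {job_name}",             # job name
        f"#PBS -l walltime={walltime}",    # requested wall-clock limit
        "#PBS -m e",                       # send mail when the job ends
        f"#PBS -M {email}",                # address to notify
        command,
    ]
    return "\n".join(lines) + "\n"


script = pbs_script("demo", "./myprog", "03:00:00", "user@example.org")
print(script)
```

The resulting text would typically be saved to a file and handed to qsub.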

11. Clean-up Tasks
    After the job is finished, a user may need to
        Retrieve files from the resources for later data analysis.
        Remove temporary settings.
    Users generally do this by hand or by including the
     cleanup information in the job submission scripts.

Grid Scheduling - Condor
    Condor is a batch job scheduler that allows users to run jobs on
        dedicated computers
        computers that are not always available (non-dedicated)
    Condor addresses
        the need to move or remove jobs (preemption) before they
         are completed. Condor will checkpoint and preempt jobs
         when the owner needs the computer back.
        the need to deal with heterogeneous platforms through the
         matchmaking process.

Condor Pool

Submission Host
    Each submission host has a job queue.
    Each job can have one of the following statuses
        Idle - no activity
        Busy - running
        Suspended - job is currently suspended
        Vacating - job is currently checkpointing
        Killing - job is currently being killed

    ClassAds (Classified Advertisements) is a language that
     provides descriptions of jobs and resources.
    Policies and constraints can be expressed by users, owners,

    When users submit jobs to Condor, they don’t submit
     to the global queue.
    Condor is based on a decentralized model, where users
     submit jobs to a local queue on their computer.
    The local scheduler then interacts with the matchmaker.
    Thus, there are 3 entities involved
        User agent
        Owner agent
        Matchmaker

Condor-matchmaking and Claiming

Matchmaking and Claiming Process (1)
1.        A user submits a job to the user agent (stored in a queue).
2.        The user agent sends a ClassAd file to inform the
          matchmaker that it has a job to run. The ClassAd is re-sent
          every 5 minutes until the job is scheduled.
3.        Every 5 minutes, the owner agent also submits a ClassAd
          file that describes the computer it is responsible for.
4.        The matchmaker accepts ClassAds from both agents.
          (ClassAds are discarded if they are not re-submitted
          frequently enough.)
5.        The matchmaker attempts to find a match.

Matchmaking and Claiming Process (2)
6. When a match is found, both agents are informed. The
   user and owner agents then claim the match
   independently of the matchmaker.
7. The user agent contacts the owner agent. The ClassAds are
   re-checked in case the status has changed.
8. The user agent sends the job to the owner agent to begin
   execution.
9. The user agent monitors the progress.
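
The matchmaking part of this process can be sketched in a few lines. This is a simplified model and not Condor's implementation: real ClassAds are written in their own expression language, whereas here each ad carries a Python predicate, and the class names, attributes, and the 600-second expiry are assumptions for the example.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Ad:
    """A simplified stand-in for a ClassAd: attributes plus a requirement."""
    name: str
    attrs: dict
    requires: Callable[[dict], bool]   # predicate over the other party's attrs
    posted_at: float = 0.0


@dataclass
class Matchmaker:
    max_age: float = 600.0             # ads older than this are discarded
    jobs: dict = field(default_factory=dict)
    machines: dict = field(default_factory=dict)

    def advertise(self, ad, kind, now):
        ad.posted_at = now
        (self.jobs if kind == "job" else self.machines)[ad.name] = ad

    def expire(self, now):
        for pool in (self.jobs, self.machines):
            stale = [n for n, ad in pool.items()
                     if now - ad.posted_at > self.max_age]
            for name in stale:
                del pool[name]

    def match(self, now):
        """Return (job, machine) pairs whose requirements agree both ways."""
        self.expire(now)
        pairs, free = [], dict(self.machines)
        for job in self.jobs.values():
            for m in list(free.values()):
                if job.requires(m.attrs) and m.requires(job.attrs):
                    pairs.append((job.name, m.name))
                    del free[m.name]   # one job per machine in this sketch
                    break
        return pairs


mm = Matchmaker()
mm.advertise(Ad("job1", {"owner": "alice", "memory_mb": 512},
                lambda m: m["os"] == "Linux" and m["memory_mb"] >= 512),
             "job", now=0)
mm.advertise(Ad("pc1", {"os": "Linux", "memory_mb": 1024},
                lambda j: j["owner"] != "mallory"), "machine", now=0)
mm.advertise(Ad("pc2", {"os": "Windows", "memory_mb": 2048},
                lambda j: True), "machine", now=0)

first = mm.match(now=300)     # ads are still fresh
second = mm.match(now=1000)   # ads were never re-sent, so they expired
print(first, second)          # [('job1', 'pc1')] []
```

The matchmaker only proposes pairs; claiming and the final re-check happen between the two agents, which this sketch does not model.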

Condor-shadow process
    A shadow process is created when a job is started on a machine.
    The shadow is responsible for implementing Condor’s
     remote I/O capabilities.
    Its two main functions are
        Checkpointing to the computer the job was submitted from,
         or redirecting checkpoints to a specialized checkpoint server.
        Performing I/O operations on the computer that the
         job was submitted from.

Condor - DAGMan
 A Condor job can have many tasks, where the input/output or execution
 of one or more tasks depends on one or more other tasks.

 Directed Acyclic Graph Manager (DAGMan) is a meta-scheduler that
 submits jobs to Condor in the order represented by a DAG.
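
The ordering idea can be illustrated with Python's standard-library graphlib module (Python 3.9 or later). This is not DAGMan itself, only a sketch of dependency-ordered submission; the four-task "diamond" workflow is hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical workflow: each key lists the tasks it depends on.
dag = {
    "A": set(),
    "B": {"A"},
    "C": {"A"},
    "D": {"B", "C"},
}


def submission_waves(dag):
    """Group tasks into waves; every task in a wave has all parents finished."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()                 # raises CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)          # pretend the whole wave completed
    return waves


print(submission_waves(dag))   # [['A'], ['B', 'C'], ['D']]
```

Tasks B and C land in the same wave because neither depends on the other, so a scheduler would be free to run them in parallel.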



Condor - Summary
    Prepare job to run un-attended – Batch processing
    Select the Condor runtime environment (universe):
     serial jobs, parallel jobs, grid, and meta-scheduler
    Create a submit description file
    Submit the job
      Preemptive – Resume Scheduling
            Take advantage of resources that may only be available
            Handling of job priority
            Fair sharing

GRAM and other schedulers

    Jobs, via Globus, can be submitted to systems managed
     by other schedulers.
    GRAM implements a protocol for communicating with
     those schedulers.
    GRAM provides a web services interface for initiating,
     monitoring, and managing the execution of computations
     on remote computers, despite local heterogeneity.
    GRAM is mainly used to dispatch (a large number of)
     individual tasks to computational clusters.
    GRAM can also be used to deploy and manage services.
    It enables remote execution through a uniform interface.
    GT4 uses Job Description Document (JDD) and Resource
     Specification Language 2 (RSL-2) to communicate the
    A user can run remote jobs under a local user’s account

GRAM Client Examples
    The globus-job-run client is an example GRAM client
     that integrates GASS services for executable staging and
     standard I/O redirection, using command-line
     arguments rather than RSL.
        % globus-job-run /bin/ls
        % globus-job-run -s myprog
        % globus-job-run \
        -s myprog -stdin -s in.txt -stdout -s out.txt
