                    Advanced Operating Systems
                     Lecture 6: Scheduling

                      University of Tehran
             Dept. of EE and Computer Engineering
                   By: Dr. Nasser Yazdani
Univ. of Tehran        Distributed Operating Systems   1
          How to Use Resources Efficiently
   Sharing the CPU and other resources of the system.
   References
       Surplus Fair Scheduling: A Proportional-Share CPU
        Scheduling Algorithm for Symmetric Multiprocessors
       Scheduler Activations: Effective Kernel Support for
        User-Level Management of Parallelism
       Condor - A Hunter of Idle Workstations
       Virtual-Time Round-Robin: An O(1) Proportional Share
        Scheduler
       A SMART Scheduler for Multimedia Applications
       Linux CPU scheduling
      Outline
    Scheduling
    Scheduling policies.
    Scheduling on Multiprocessor
    Thread scheduling




        What is Scheduling?
   OS policies and mechanisms to allocate resources
    to entities.
   An O/S often has many pending tasks.
       Threads, async callbacks, device input.
   The order may matter.
       Policy, correctness, or efficiency.
   Providing sufficient control is not easy.
       Mechanisms must allow policy to be expressed.
   A good scheduling policy ensures that the most
    important entity gets the resources it needs.

       Why Scheduling?
   This topic was popular in the days of time
    sharing, when there was a shortage of
    resources.
   It seemed irrelevant in era of PCs and
    workstations, when resources were plenty.
   Now the topic is back from the dead to
    handle massive Internet servers with paying
    customers, where some customers are more
    important than others.

       Resources to Schedule?
   Resources you might want to schedule: CPU
    time, physical memory, disk and network
    I/O, and I/O bus bandwidth.
   Entities that you might want to give
    resources to: users, processes, threads, web
    requests, or MIT accounts.




        Key Problems?
   Gap between desired policy and available
    mechanism. The desired policies often include
    elements that are not implementable. Furthermore,
    there are often many conflicting goals (low latency,
    high throughput, and fairness), and the scheduler
    must make trade-offs between them.
   Interaction between different schedulers. One has
    to take a systems view. Just optimizing the CPU
    scheduler may do little for the overall desired
    policy.
    Scheduling Policy
    Examples
      Allocate cycles in proportion to money.
      Maintain high throughput under high
       load.
      Never delay a high-priority thread by > 1 ms.
      Maintain good interactive response.
      Can we enforce policy with the thread
       scheduler?


      General plan
   Understand where scheduling is occurring.
   Expose scheduling decisions, allow control.
   Account for resource consumption, to allow
    intelligent control.




         Parallel Computing

   Speedup - the final measure of success
       Parallelism vs Concurrency
            Actual vs possible by application
       Granularity
            Size of the concurrent tasks
            Reconfigurability
       Number of processors
       Communication cost
       Preemption v. non-preemption
       Co-scheduling
            Some things better scheduled together



    Best place for scheduling?
   Application is in best position to know its own
    specific scheduling requirements
       Which threads run best simultaneously
       Which threads are on the critical path
       But the kernel must make sure all play fairly
   MACH scheduling
       Lets a process provide hints to discourage running
       Possible to hand off the processor to another thread
            Makes it easier for the kernel to select the next thread
            Allows interleaving of concurrent threads
       Leaves low-level scheduling in the kernel
       Based on higher-level info from application space



       Example
   Give each process an equal share of CPU time. Interrupt every
    10 msec and select another process in round-robin fashion. This
    works if processes are compute-bound. What if a process gives up
    some of its 10 ms to wait for input?
   How long should the quantum be? Is 10 msec the right answer?
    A shorter quantum => better interactive performance, but lower
    overall system throughput.
   What if the environment computes for 1 msec and sends an
    IPC to the file server environment? Shouldn't the file server get
    more CPU time, because it operates on behalf of all other
    functions?
   Potential improvement: track "recent" CPU use (e.g., over the
    last second) and always run the environment with the least
    recent CPU use. (Still, if you sleep long enough, you lose.)
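The round-robin mechanics above can be sketched in a few lines. This is a hypothetical Python simulation, ignoring context-switch cost and I/O waits (so every job is compute-bound, the easy case the slide mentions):

```python
from collections import deque

def round_robin(bursts, quantum):
    """Simulate round-robin on compute-bound jobs; bursts[i] is the
    total CPU time job i needs. Returns each job's completion time.
    Illustrative sketch only: no context-switch cost, no I/O."""
    remaining = list(bursts)
    finish = [0] * len(bursts)
    ready = deque(range(len(bursts)))
    clock = 0
    while ready:
        job = ready.popleft()
        run = min(quantum, remaining[job])   # at most one quantum
        clock += run
        remaining[job] -= run
        if remaining[job] > 0:
            ready.append(job)                # back of the line
        else:
            finish[job] = clock
    return finish
```

For example, three jobs needing 30, 10, and 20 ms with a 10 ms quantum finish at 60, 20, and 50 ms respectively.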
    Scheduling is a System
    Problem
   Thread/process scheduler can’t enforce
    policies by itself.
   Needs cooperation from:
        All resource schedulers.
        Software structure.
   Conflicting goals may limit effectiveness.




          Goals
   Low latency
       People typing at editors want fast response.
       Network services can be latency-bound, not
        CPU-bound.
   High throughput
       Minimize context switches to avoid wasting CPU;
        avoid TLB misses, cache misses, even page faults.
   Fairness
        Scheduling Approaches
   FIFO
    + Fair
    - High latency
   Round robin
    + fair
    + low latency
    - poor throughput
   STCF/SRTCF (shortest time/remaining time to
    completion first)
    + low latency
    + high throughput
    - unfair: Starvation
          Shortest Job First (SJF)
   Two types:
       Non-preemptive
       Preemptive
   Requirement: the run time needs to be
    known in advance
   Optimal if all jobs are available
    simultaneously (provable)
   Is SJF optimal if all the jobs are not available
    simultaneously?

    Preemptive SJF
   Also called Shortest Remaining Time First
        Schedule the job with the shortest remaining
         time required to complete
   Requirement: the remaining run time needs to be
    known in advance




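Both SJF variants are easy to simulate once burst times are assumed known in advance. A sketch of the preemptive variant (SRTF) on one CPU, computing average turnaround time for jobs given as (arrival, burst) pairs:

```python
import heapq

def srtf(jobs):
    """Shortest Remaining Time First (preemptive SJF). jobs is a list
    of (arrival_time, burst_time); burst times are assumed known in
    advance, as the slide requires. Returns average turnaround time.
    Illustrative single-CPU sketch with no context-switch cost."""
    events = sorted((a, i, b) for i, (a, b) in enumerate(jobs))
    ready, done = [], {}
    clock, j, n = 0, 0, len(jobs)
    while len(done) < n:
        while j < n and events[j][0] <= clock:
            a, i, b = events[j]
            heapq.heappush(ready, (b, i))     # keyed by remaining time
            j += 1
        if not ready:                         # CPU idle until next arrival
            clock = events[j][0]
            continue
        rem, i = heapq.heappop(ready)
        next_arrival = events[j][0] if j < n else float('inf')
        run = min(rem, next_arrival - clock)  # run until done or preempted
        clock += run
        if rem == run:
            done[i] = clock
        else:
            heapq.heappush(ready, (rem - run, i))
    return sum(done[i] - jobs[i][0] for i in range(n)) / n
```

On the classic example of four jobs arriving at times 0..3 with bursts 8, 4, 9, 5, SRTF gives an average turnaround of 13 time units.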
          Interactive Scheduling
   Usually preemptive
       Time is sliced into quanta (time intervals)
       Decision made at the beginning of each quantum
   Performance criteria
       Minimum response time
       Best proportionality
   Representative algorithms:
       Priority-based
       Round-robin
       Multi Queue & Multi-level Feedback
       Shortest process time
       Guaranteed Scheduling
       Lottery Scheduling
       Fair Sharing Scheduling
           Priority Scheduling
   Each job is assigned a priority with FCFS
    within each priority level.
   Select highest priority job over lower ones.
   Rationale: higher-priority jobs are more
    mission-critical
       Example: DVD movie player vs. sending email
   Problems:
       May not give the best average waiting time (AWT)
       Indefinite blocking or starvation of a process
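A minimal sketch of the dispatch rule above (priority scheduling with FCFS within each level), assuming all jobs are present at time 0:

```python
import heapq
from itertools import count

def priority_schedule(priorities):
    """Run order under non-preemptive priority scheduling with FCFS
    within each priority level (lower number = higher priority).
    Jobs arrive in list order at time 0; illustrative sketch, not a
    real dispatcher."""
    arrival = count()                   # FCFS tie-breaker within a level
    heap = [(p, next(arrival), i) for i, p in enumerate(priorities)]
    heapq.heapify(heap)
    order = []
    while heap:
        _, _, i = heapq.heappop(heap)
        order.append(i)
    return order
```

With priorities [2, 1, 2, 1], the two priority-1 jobs run first in arrival order, then the two priority-2 jobs: order [1, 3, 0, 2].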
    Set Priority
   Two approaches
        Static (for systems with well-known and
         regular application behaviors)
        Dynamic (otherwise)
   Priority may be based on:
        Cost to user.
        Importance of user.
        Aging
        Percentage of CPU time used in last X hours.

        Pitfall: Priority Inversion
    •      Low-priority thread X holds a lock.
    •      High-priority thread Y waits for the lock.
    •      Medium-priority thread Z pre-empts X.
    •      Y is indefinitely delayed despite high priority.
          When a higher priority process needs to read or modify
           kernel data that are currently being accessed by a lower
           priority process.
          The higher priority process must wait!
           But the lower priority cannot proceed quickly due to
           scheduling.
       Solution: priority inheritance
          When a lower-priority process accesses a resource, it inherits
           high priority until it is done with the resource in question,
           and then its priority reverts to its natural value.

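The inheritance rule can be illustrated with a toy, single-threaded lock model (not a real mutex; the class names and layout are invented for illustration, with higher numbers meaning higher priority):

```python
class Thread:
    def __init__(self, base_priority):
        self.base_priority = base_priority   # "natural" priority
        self.priority = base_priority        # effective priority

class Lock:
    """Toy priority-inheritance lock: while contended, the holder
    runs at the top waiter's priority, so a medium-priority thread
    can no longer preempt it. Sketch only, not thread-safe."""
    def __init__(self):
        self.holder = None

    def acquire(self, thread):
        if self.holder is None:
            self.holder = thread
            return True
        # contended: boost the holder (priority inheritance)
        self.holder.priority = max(self.holder.priority, thread.priority)
        return False

    def release(self):
        t = self.holder
        t.priority = t.base_priority         # revert to natural value
        self.holder = None
```

In the slide's scenario: low-priority X holds the lock, high-priority Y tries to acquire it, so X temporarily runs at Y's priority and Z (medium) can no longer delay Y indefinitely.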
    Pitfall: Long Code Paths
   Large-granularity locks are convenient.
        Non-pre-emptable threads are an extreme
         case.
   May delay high-priority processing.




    Pitfall: Efficiency
   Efficient disk use requires unfairness.
        Shortest-seek-first vs FIFO.
        Read-ahead vs data needed now.
   Efficient paging policy creates delays.
        O/S may swap out my idle Emacs to free
         memory.
        What happens when I type a key?
   Thread scheduler doesn’t control these.



    Pitfall: Multiple Schedulers
   Every resource with multiple waiting
    threads has a scheduler.
   Locks, disk driver, memory allocator.
   The schedulers may not cooperate or
    even be explicit.




    Example: UNIX
   Goals:
        Simple kernel concurrency model.
             Limited pre-emption.
        Quick response to device interrupts.
   Many kinds of execution environments.
   Some transitions are not possible.
   Some transitions can’t be controlled.



  UNIX Environments
  [Diagram: process user halves and a user-only environment on top;
   their kernel halves below; timer and network soft interrupts next;
   device interrupts and the timer interrupt at the bottom.]
    UNIX: Process User Half
   Interruptable.
   Pre-emptable via timer interrupt.
        We don’t trust user processes.
   Enters kernel half via system calls, faults.
        Save user state on stack.
        Raise privilege level.
        Jump to known point in the kernel.
   Each process has a stack and saved registers.



Univ. of Tehran        Distributed Operating Systems   27
    UNIX: Process Kernel Half
   Executes system calls for its user process.
        May involve many steps separated by sleep().
   Interruptable.
        May postpone interrupts in critical sections.
   Not pre-emptable.
        Simplifies concurrent programming.
        No context switch until voluntary sleep().
        No user process runs if a kernel half is runnable.
   Each kernel half has a stack and saved
    registers.
   Many processes may be sleep()ing in the
    kernel.
Univ. of Tehran        Distributed Operating Systems          28
    UNIX: Device Interrupts
   Device hardware asks CPU for an interrupt.
        To signal new input or completion of output.
        Cheaper than polling, lower latency.
    Interrupts take priority over the user/kernel half.
        Save current state on stack.
        Mask other interrupts.
        Run interrupt handler function.
        Return and restore state.
   The real-time clock is a device.



    UNIX: Soft Interrupts
   Device interrupt handlers must be short.
   Expensive processing deferred to soft intr.
        Can’t do it in kernel-half: process not known.
        Example: TCP protocol input processing.
        Example: periodic process scheduling.
   Devices can interrupt soft intr.
   Soft intr has priority over user & kernel
    processes.
        But only entered on return from device intr.
        Similar to async callback.
        Can’t be high-pri thread, since no pre-emption.

   UNIX Environments
  [Diagram repeated, with the transitions between environments
   classified as: transfer with choice, transfer with limited
   choice, and transfer with no choice.]
    Pitfall: Server Processes
   User-level servers schedule requests.
        X11, DNS, NFS.
   They usually don’t know about kernel’s
    scheduling policy.
   Network packet scheduling also
    interferes.




    Pitfall: Hardware Schedulers
   Memory system scheduled among CPUs.
   I/O bus scheduled among devices.
   Interrupt controller chooses next
    interrupt.
   Hardware doesn’t know about O/S policy.
   O/S often doesn’t understand hardware.




    Time Quantum
   Time slice too large
       FIFO behavior
       Poor response time
   Time slice too small
       Too many context switches (overheads)
       Inefficient CPU utilization
  Heuristic: 70-80% of jobs block within
   time-slice
 Typical time-slice 10 to 100 ms

 Time spent in system depends on size of
   job.
          Multi-Queue Scheduling
   Hybrid between priority and round-robin
   Processes assigned to one queue permanently
   Scheduling between queues
       Fixed Priorities
       % CPU spent on queue
   Example
       System processes
       Interactive programs
       Background Processes
       Student Processes
    Addresses the starvation and infinite blocking problems
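One way to realize fixed per-queue CPU shares (e.g. the 50/30/20 split in the example figure) is a deterministic credit scan, stride-scheduling style. This is an illustrative sketch, not how any particular kernel implements it:

```python
def schedule_queues(shares, slices):
    """Pick which queue to service for each of `slices` time slices so
    that queue i receives shares[i] of the CPU over time. Each slice,
    every queue earns credit equal to its share; the richest queue
    runs and pays back one full round of credit. Sketch only."""
    credit = [0.0] * len(shares)
    order = []
    total = sum(shares)
    for _ in range(slices):
        for i, s in enumerate(shares):
            credit[i] += s
        i = max(range(len(shares)), key=lambda k: credit[k])
        credit[i] -= total                  # charge for the slice it got
        order.append(i)
    return order
```

Over 10 slices with shares 50/30/20, queue 0 is serviced 5 times, queue 1 three times, and queue 2 twice.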
  Multi-Queue Scheduling: Example
  [Figure: three queues receiving fixed CPU shares of 20%, 30%,
   and 50%.]
Multi-Processor Scheduling:
Load Sharing
     Decides
          Which process to run?
          How long does it run?
          Where to run it (on which CPU)?
    Multi-Processor Scheduling
    Choices
   Self-Scheduled
        Each CPU dispatches a job from the ready
         queue
   Master-Slave
        One CPU schedules the other CPUs
   Asymmetric
        One CPU runs the kernel and the others run
         the user applications.
        One CPU handles the network and the other
         handles applications.
Gang Scheduling for Multi-Processors
   A collection of processes belonging to one
    job
   All the processes are running at the same
    time
        If one process is preempted, all the
         processes of the gang are preempted.
   Helps to eliminate the time a process
    spends waiting for other processes in its
    parallel computation.

    Scheduling Approaches
   Multilevel feedback queues
       A job starts with the highest priority queue
       If time slice expires, lower the priority by one
        level
       If time slice does not expire, raise the priority
        by one level
       Age long-running jobs




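The promotion/demotion rule above can be stated directly. This is a sketch of just that rule; real MLFQ implementations also add the periodic boost the "age long-running jobs" bullet refers to:

```python
def mlfq_adjust(level, used_full_slice, levels=3):
    """New queue level for a job after one quantum in a multilevel
    feedback queue with `levels` queues (level 0 = highest priority).
    Demote a job that used its whole slice; promote one that blocked
    before the slice expired. Illustrative sketch of the slide's rule."""
    if used_full_slice:
        return min(level + 1, levels - 1)   # CPU-bound: lower priority
    return max(level - 1, 0)                # interactive: raise priority
```

So a CPU-bound job drifts toward the lowest queue, while an interactive job that blocks early climbs back to the top.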
    Lottery Scheduling
   Claim
       Priority-based schemes are ad hoc
   Lottery scheduling
       Randomized scheme
       Based on a currency abstraction
       Idea:
           Processes own lottery tickets
           The CPU randomly draws a ticket and executes the
            corresponding process


    Properties of Lottery
    Scheduling
   Guarantees fairness through probability
   Guarantees no starvation, as long as each
    process owns one ticket
   To approximate SRTCF
       Short jobs get more tickets
       Long jobs get fewer




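A single lottery draw is just a weighted random choice (sketch; the ticket counts below are invented for illustration):

```python
import random

def lottery_draw(tickets):
    """Pick the next process by lottery: a process holding t of the
    T outstanding tickets wins with probability t/T, which gives
    proportional shares in expectation and no starvation while it
    holds at least one ticket. Illustrative sketch."""
    total = sum(tickets.values())
    winner = random.randrange(total)
    for proc in sorted(tickets):        # deterministic iteration order
        winner -= tickets[proc]
        if winner < 0:
            return proc
```

A process with 3 of 4 tickets (as a short job might hold under the SRTCF approximation above) wins about 75% of the draws.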
    Partially Consumed
    Tickets
   What if a process is chosen, but it does
    not consume the entire time slice?
       The process receives compensation tickets
       Idea
           Get chosen more frequently
           But with shorter time slice




           Ticket Currencies
   Load insulation
       A process can dynamically change its ticketing
        policies without affecting other processes
   Need to convert currencies before transferring tickets
  [Figure: currency tree — base:3000 funds Alice:200 with 1000
   base tickets and Bob:100 with 2000; Alice funds process1:500
   with 200 of her tickets, Bob funds process2:100 with 100;
   process1's tickets are held 200 by thread1 and 300 by thread2,
   process2's 100 by thread3.]
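Converting between currencies just walks up the tree, scaling by each currency's backing/issued ratio. The dictionary layout below is a hypothetical encoding of the figure's numbers, for illustration:

```python
def to_base(amount, currency, currencies):
    """Convert a ticket amount in a local currency to base tickets.
    `currencies` maps each currency to (backing, parent, issued): it
    issues `issued` tickets funded by `backing` tickets of its parent.
    Sketch; the data layout is invented, not the paper's."""
    while currency != 'base':
        backing, parent, issued = currencies[currency]
        amount = amount * backing / issued   # scale into parent currency
        currency = parent
    return amount

# The slide's tree: Alice issues 200 tickets funded by 1000 base,
# Bob issues 100 funded by 2000 base, process1 issues 500 funded by
# 200 Alice tickets, process2 issues 100 funded by 100 Bob tickets.
CURRENCIES = {
    'alice':    (1000, 'base',  200),
    'bob':      (2000, 'base',  100),
    'process1': (200,  'alice', 500),
    'process2': (100,  'bob',   100),
}
```

So thread1's 200 process1 tickets are worth 400 base tickets, thread2's 300 are worth 600, and thread3's 100 process2 tickets are worth 2000.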
        Condor
 Identifies idle workstations and schedules
  background jobs on them
 Guarantees job will eventually complete

   Analysis of workstation usage patterns
       Workstations are busy only about 30% of the time
   Remote capacity allocation algorithms
       Up-Down algorithm
             Allow fair access to remote capacity
   Remote execution facilities
       Remote Unix (RU)
          Condor Issues
   Leverage: performance measure
       Ratio of the capacity consumed by a job
        remotely to the capacity consumed on the home
        station to support remote execution
 Checkpointing: save the state of a job so that
  its execution can be resumed
 Transparent placement of background jobs

 Automatically restart if a background job fails

 Users expect to receive fair access

 Small overhead



    Condor - scheduling
 Hybrid of centralized static and distributed
  approach
 Each workstation keeps own state

  information and schedule
 Central coordinator assigns capacity to

  workstations
        Workstations use capacity to schedule




         Real-time Systems
   Issues are scheduling and interrupts
        Must complete tasks by a particular deadline
        Examples:
             Accepting input from real-time sensors
             Process control applications
             Responding to environmental events
   How does one support real-time systems?
        If deadlines are short, often use a dedicated system
        Give real-time tasks absolute priority
        Do not support virtual memory
             Use early binding
    Real-time Scheduling
   To initiate, must specify
          Deadline
          Estimate/upper-bound on resources
   System accepts or rejects
          If accepted, agrees that it can meet the deadline
          Places job in calendar, blocking out the resources it
           will need and planning when the resources will be
           allocated
   Some systems support priorities
          But this can violate the RT assumption for already
           accepted jobs

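A concrete (much simplified) accept/reject test: for independent periodic jobs on one CPU, admission is safe under EDF as long as total utilisation stays at or below 1. The calendar-based reservation scheme above is more general; this sketch only shows the admission idea:

```python
def admit(accepted, candidate):
    """Admission control sketch for periodic real-time jobs. Each job
    is (cpu_time, period). Accept the candidate only if total CPU
    utilisation stays <= 1, the classic EDF schedulability bound for
    independent periodic tasks on a single CPU."""
    jobs = accepted + [candidate]
    return sum(c / p for c, p in jobs) <= 1.0
```

For example, with accepted jobs using 25% + 25% of the CPU, another 25% job is admitted; a job pushing utilisation to 112.5% is rejected, so the deadlines of already-accepted jobs are never violated.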
         User-level Thread
         Scheduling


Possible Scheduling
   50-msec process
    quantum
   run 5 msec/CPU burst




    Kernel-level Thread
    Scheduling
Possible scheduling
   50-msec process
    quantum
   threads run 5 msec/CPU
    burst




    Thread Scheduling
    Examples
   Solaris 2
        Priority-based process scheduling with four scheduling classes:
         real-time, system, time-sharing, interactive.
        A set of priorities within each class.
        The scheduler converts the class-specific priorities into global
         priorities and selects the thread with the highest global
         priority to run. The thread runs until (1) it blocks, (2) it uses
         up its time slice, or (3) it is preempted by a higher-priority
         thread.
   JVM
        Schedules threads using a preemptive, priority-based
         scheduling algorithm.
        Schedules the "runnable" thread with the highest priority. If
         two threads have the same priority, the JVM applies FIFO.
        Schedules a new thread to run if (1) the current thread exits
         the "runnable" state due to block(), exit(), suspend() or stop()
         methods, or (2) a thread with higher priority enters the
         "runnable" state.
          Surplus Fair Scheduling: Motivation
  [Figure: a server hosting web, streaming, and e-commerce
   applications, serving end-stations over the network.]

   Diverse web and multimedia applications popular
       HTTP, Streaming, e-commerce, games, etc.
   Applications hosted on large servers (typically
    multiprocessors)
   Key Challenge: Design OS mechanisms for Resource
    Management
          Requirements for OS
          Resource Management
   Fair, Proportionate Allocation
       E.g., 20% for HTTP, 30% for streaming, etc.
   Application Isolation
       Misbehaving/overloaded applications should not
        affect other applications
   Efficiency
       OS mechanisms should have low overheads
Focus: Achieving these objectives for CPU
 scheduling on multiprocessor machines
     Proportional-Share Scheduling
  [Figure: two applications with weights 2 and 1 receive 2/3 and
   1/3 of the CPU bandwidth.]

   Associate a weight with each application and
    allocate CPU bandwidth proportional to weight
   Existing Algorithms
        Ideal algorithm: Generalized Processor Sharing
        E.g.: WFQ, SFQ, SMART, BVT, etc.
   Question: Are the existing algorithms adequate
    for multiprocessor systems?
            Starvation Problem
   SFQ: start tag of a thread = (service / weight)
   Schedules the thread with the minimum start tag
  [Figure: on CPU 1, thread A (wt=100) runs continuously with start
   tags 0, 1, ..., 10, 11; on CPU 2, thread B (wt=1) accumulates
   tags 0, 100, 1000. When C (wt=1) arrives at time 100 with tags
   10, 110, ..., B starves.]
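The starvation pattern can be reproduced with a toy two-CPU SFQ simulation (illustrative only; the quantum is normalized to 1, so each run advances a thread's start tag by 1/weight):

```python
def sfq_mp(quanta_before, quanta_after):
    """Reproduce the slide's SFQ starvation on 2 CPUs (sketch).
    A (wt=100) and B (wt=1) first run alone, one per CPU, for
    `quanta_before` quanta. Then C (wt=1) arrives with a start tag
    equal to the current virtual time (the minimum tag); thereafter
    the two smallest-tag threads get the CPUs each quantum. Returns
    how often B runs after C's arrival."""
    wt = {'A': 100, 'B': 1}
    tag = {'A': 0.0, 'B': 0.0}
    for _ in range(quanta_before):              # A and B each own a CPU
        for t in ('A', 'B'):
            tag[t] += 1.0 / wt[t]
    wt['C'] = 1
    tag['C'] = min(tag.values())                # virtual time on arrival
    b_runs = 0
    for _ in range(quanta_after):
        for t in sorted(tag, key=tag.get)[:2]:  # two lowest tags run
            if t == 'B':
                b_runs += 1
            tag[t] += 1.0 / wt[t]
    return b_runs
```

After 100 quanta, B's tag is 100 while A's is about 1, so once C arrives at virtual time ~1, A and C monopolize both CPUs and B does not run again until C's tag catches up to 100.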
        Weight Readjustment
   Reason for starvation:
       Infeasible Weight Assignment (eg: 1:100 for 2
        CPUs)
       Accounting is different from actual allocation
   Observation:
       A thread can't consume more than one CPU's
        bandwidth
       A thread can be assigned at most (1/p) of the total
        CPU bandwidth
       Feasibility constraint: wi / Σj wj ≤ 1/p
      Weight Readjustment
      (contd.)
   Goal: Convert given weights to feasible
    weights
  [Figure: weights in decreasing order; the largest weights are
   matched against CPUs 1 through p.]
   Efficient: Algorithm is O(p)
   Can be combined with existing algorithms
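A sketch of the readjustment pass: scan weights in decreasing order and cap any weight that violates the feasibility constraint, i.e. that asks for more than one CPU's worth of bandwidth. (The paper's algorithm is O(p); this version recomputes suffix sums for clarity.)

```python
def readjust(weights, p):
    """Convert the given weights into feasible weights for p CPUs by
    capping each offending weight so that wi / sum(wj) <= 1/p holds.
    Returns the adjusted weights, sorted in decreasing order.
    Illustrative sketch of the readjustment idea."""
    w = sorted(weights, reverse=True)
    for i in range(min(p - 1, len(w))):
        rest = sum(w[i + 1:])
        cpus_left = p - i
        # w[i]/(w[i]+rest) > 1/cpus_left  <=>  w[i]*(cpus_left-1) > rest
        if w[i] * (cpus_left - 1) > rest:
            w[i] = rest / (cpus_left - 1)   # cap at its feasible share
    return w
```

The slide's infeasible 1:100 assignment on 2 CPUs becomes 1:1 (neither thread can use more than half the total bandwidth anyway), while already-feasible assignments are left untouched.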
       Effect of Readjustment
  [Figure: number of iterations (x10^5) over 50 s for threads
   A (wt=10), B (wt=1), and C (wt=1), under SFQ without and with
   weight readjustment.]
   Weight readjustment gets rid of the starvation problem.
       Short Jobs Problem
   Frequent arrivals and departures of short jobs
  [Figure: number of iterations (x10^5) over 40 s for J1 (wt=20),
   J2-J21 (wt=1 each), and J_short (wt=5), under the ideal
   scheduler vs. SFQ.]
   SFQ does unfair allocation!
        Surplus Fair Scheduling
               Surplus = Service_actual - Service_ideal
  [Figure: service received by thread i over time — the actual
   service curve vs. the ideal curve; the gap between them at
   time t is the surplus.]
   Scheduler picks the threads with the least surplus values
       Lagging threads get closer to their due
       Threads that are ahead are restrained
         Surplus Fair Scheduling
         (contd.)

   Start tag (Si) : Weighted Service of thread i
                         Si = Servicei / wi
   Virtual time (v) : Minimum start tag of all
    runnable threads
   Surplus (αi): αi = Servicei - Serviceideal = wi Si - wi v
   Scheduler selects threads in increasing order of
    surplus
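Putting the definitions together, selecting the next thread is just an argmin over surpluses. A sketch, where each thread is a (weight, start-tag) pair and v is the virtual time:

```python
def pick_by_surplus(threads, v):
    """Surplus fair selection sketch: for thread i with weight w_i and
    start tag S_i (weighted service), surplus = w_i*S_i - w_i*v, where
    v is the minimum start tag of all runnable threads. The scheduler
    runs threads in increasing order of surplus; this returns the
    index of the single thread to run next."""
    surplus = [w * (s - v) for w, s in threads]
    return min(range(len(threads)), key=surplus.__getitem__)
```

In the earlier starvation scenario, a heavy thread whose start tag has moved ahead of virtual time accumulates a large surplus, so the lagging light thread is picked instead.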
       Surplus Fair Scheduling with Short Jobs
  [Figure: the same short-jobs workload — J1 (wt=20), J2-J21
   (wt=1 each), J_short (wt=5) — under the ideal scheduler vs.
   Surplus Fair Scheduling.]
   Surplus Fair Scheduling does proportionate allocation.
      Proportionate Allocation
  [Figure: normalized processor shares received by two web servers
   under weight assignments 1:1, 1:2, 1:4, and 1:7.]
       Application Isolation
  [Figure: MPEG decoder frame rate (frames/sec) vs. number of
   background compilations, under Surplus Fair scheduling and
   time-sharing.]
        Scheduling Overhead

   [Figure: context-switch time (microsec) vs. number of processes,
    under Surplus Fair and Time-sharing]

      Context-switch time (~10 microsec) vs. quantum size (~100 ms)
          Summary
   Existing proportional-share algorithms
    inadequate for multiprocessors
   Readjustment Algorithm can reduce unfairness
   Surplus Fair Scheduling practical for
    multiprocessors
       Achieves proportional fairness, isolation
       Has low overhead
   Heuristics for incorporating processor affinity
   Source code available at:
    http://lass.cs.umass.edu/software/gms
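The selection rule behind proportional fairness can be illustrated with a small simulation. This is a simplified single-CPU sketch with assumed names, not the authors' SMP implementation (which also readjusts infeasible weights): task i with weight w_i and accumulated service S_i carries a surplus S_i - w_i * v, where v = min_j S_j / w_j, and each quantum the task with the least surplus runs.

```python
def surplus_fair_schedule(tasks, quanta):
    """Single-CPU sketch of surplus-based scheduling.

    tasks:  dict mapping task name -> weight
    quanta: number of fixed-size quanta to simulate
    Returns the total service (in quanta) given to each task.
    """
    service = {t: 0.0 for t in tasks}
    for _ in range(quanta):
        # Virtual time: the minimum weighted service among runnable tasks.
        v = min(service[t] / tasks[t] for t in tasks)
        # Surplus: service received beyond the ideal proportional share.
        surplus = {t: service[t] - tasks[t] * v for t in tasks}
        # Run the task with the least surplus (ties broken by name).
        chosen = min(tasks, key=lambda t: (surplus[t], t))
        service[chosen] += 1.0
    return service

# A 2:1 weight assignment yields a 2:1 split of the CPU.
print(surplus_fair_schedule({"A": 2.0, "B": 1.0}, 30))
# {'A': 20.0, 'B': 10.0}
```

Because the scheduler always picks the task furthest behind its ideal share, no task's surplus can grow without bound, which is the fairness property the slides above demonstrate.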
        Scheduler Activations

   In a multiprocessor system, threads
    could be managed in:
       User Space only
            Key feature: Cooperative
       Kernel Space only
            Key feature: Preemptive
       User Space on top of Kernel Space
                         Some User-Level Threads
                          -------------------------
                        Some Kernel-Level Threads
                          -------------------------
                                 Some CPUs

Scheduler activations
   User level scheduling of threads
          Application maintains scheduling queue
   Kernel allocates threads to tasks
          Makes upcall to scheduling code in application when
           thread is blocked for I/O or preempted
          Only user level involved if blocked for critical section
   User level will block on kernel calls
          Kernel returns control to application scheduler




        User-Level Thread Management
   Sample measurements were obtained on the Firefly
    multiprocessor running Topaz (times in microseconds):
 Procedure call: 7 microsecs.
 Kernel trap: 19 microsecs.




         Operation     | FastThreads | Topaz Threads | Ultrix Processes
         Null Fork     | 34          | 948           | 11300
         Signal-Wait   | 37          | 441           | 1840

  User-level on top of kernel threads

 Three layers:
    Some User-Level Threads (how many?)
 --------------
    Some Kernel-Level Threads (how many?)
 --------------
    Some CPUs (how many?)

 Problems caused by:
    Kernel threads are scheduled obliviously with respect to the
     user-level thread state
    Kernel threads block, resume, and are preempted without
     notification to the user level
         The way out: Scheduler
         Activation
   Processor allocation is done by the
    kernel
   Thread scheduling is done by each
    address space
   The kernel notifies the address space
    thread scheduler of every event
    affecting the address space
   The address space notifies kernel of the
    subset of user-level events that can
    affect processor allocation decisions
    Scheduler Activation (cont.)
   Goal
        Design a kernel interface and a user-level
         thread package that combine the
         functionality of kernel threads with the
         performance and flexibility of user-level
         threads.
   Secondary goal:
        If thread operations do not involve kernel
         intervention, the achieved performance should
         be similar to that of user-level threads.



          Scheduler Activation (cont.)
    The difficulty is in achieving all of the above:
         in a multiprogrammed multiprocessor system, the
          required control and scheduling information is
          distributed between the kernel and user space!


    To be able to manage the application’s
     parallelism successfully:
         user-level support routines (software) must be
          aware of kernel events (processor reallocations, I/O
          requests and completions, etc.); this is normally all
          HIDDEN from the application.

            Scheduler Activation (cont.)
1. Provide each application with a VIRTUAL
         MULTIPROCESSOR.
          each application knows how many processors are allocated.
          each application has total control over the processors and its own
           scheduling.
          The OS has control over the allocation of processors among address spaces
           and ability to change the number of processors assigned to an application
           during its execution.
2.       To achieve the above, the kernel NOTIFIES
          the address space thread scheduler of kernel events affecting it!
          the application has complete knowledge of its scheduling state.
3.  The user-space thread system NOTIFIES the kernel of
    user-level operations that may affect the allocation of
    processors (helping achieve good performance!).

Kernel mechanism that achieves the above: SCHEDULER ACTIVATIONS
         Scheduler Activation Data
         Structures
   Each scheduler activation maintains two execution stacks:
      One mapped into the kernel.

      Another mapped into the application address space.

   Each user-level thread is provided with its own stack at creation
    time.
   When a user-level thread calls into kernel, it uses its
    activation’s kernel stack.
   The user-level thread scheduler runs on the activation’s user-
    level stack.
    The kernel maintains:
       Activation control block: records the state of the scheduler
        activation’s thread when it blocks in the kernel or is
        preempted.
       Keeps track of which thread is running on each scheduler
        activation.
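A rough sketch of this per-activation state, with hypothetical field names (the real structures live in the kernel and the thread library, not in Python):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class SchedulerActivation:
    """Sketch of the per-activation state described above."""
    activation_id: int
    # Two execution stacks: one for kernel-mode execution, one for the
    # user-level thread scheduler (sizes here are arbitrary).
    kernel_stack: bytearray = field(default_factory=lambda: bytearray(8192))
    user_stack: bytearray = field(default_factory=lambda: bytearray(8192))
    # Which user-level thread (if any) is running on this activation.
    running_thread: Optional[int] = None
    # Activation control block: thread state saved when the thread
    # blocks in the kernel or is preempted.
    saved_state: Optional[dict] = None
```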
    When a new Program is
    started…
   Kernel creates a scheduler activation
   Assigns to it a processor
   Does an upcall to the user-space application (at a fixed
    entry point).
   THEN
    The user-level system
   Receives the upcall and uses the scheduler activation as the
    context to initialize itself and start running the first (main)
    thread!
   The first thread may ask for the creation of more threads
    and additional processors.
   For each processor the kernel will create a new activation
    and upcall the user-level to say that the new processor is
    there
   The user-level picks a thread and executes it on the new
    processor.

           Notify the user level of an
           event…
    The kernel creates a new scheduler activation
    Assigns to it a new processor and upcalls the user space.
   As soon as the upcall happens
        An event can be processed
        Run a user-level thread
        Trap and block into the kernel
   KEY DIFFERENCE between Scheduler Activation and Kernel Threads
    If an activation’s user-level thread is stopped by the kernel, the thread is never
    directly resumed by the kernel!
    INSTEAD a new scheduler activation is created, whose prime objective is to notify
     the user space that the thread has been suspended. Then…
         the user-level thread system removes the state of the thread from the “old”
         activation.
        Tells the kernel that the “old” activation can be reused.
        Decides which thread to run on the processor
   INVARIANT: number of activations = number of processors allocated to a job.


         Events that the Kernel vectors to user-Space as
         Activations
   Add-this-processor(processor #)
    /* Execute a runnable user-level thread */


   Processor-has-been-preempted(preempted activation # and its
    machine state)
    /* Return to the ready list the user-level thread that was
    executing in the context of the preempted scheduler activation */


   Scheduler-activation-has-blocked(blocked activation #)
    /* the blocked activation no longer uses its processor */


    Scheduler-activation-has-unblocked(unblocked activation
     # and its machine state)
     /* Return to the ready list the user-level thread that was executing in
     the context of the blocked scheduler activation */
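The four upcalls above can be sketched as handlers in a user-level thread system. This is an illustrative skeleton with hypothetical names, not the paper's implementation; threads are represented as opaque objects on a ready list:

```python
from collections import deque

class UserLevelScheduler:
    """Sketch of a user-level thread system receiving the four upcalls."""

    def __init__(self):
        self.ready = deque()   # runnable user-level threads
        self.running = {}      # activation id -> thread running on it

    def add_this_processor(self, activation_id):
        # Execute a runnable user-level thread, if any.
        if self.ready:
            self.running[activation_id] = self.ready.popleft()

    def processor_has_been_preempted(self, activation_id):
        # Return the preempted activation's thread to the ready list.
        thread = self.running.pop(activation_id, None)
        if thread is not None:
            self.ready.append(thread)

    def activation_has_blocked(self, activation_id):
        # The blocked activation no longer uses its processor; the
        # thread's state stays in the kernel until the unblock upcall.
        self.running.pop(activation_id, None)

    def activation_has_unblocked(self, activation_id, thread):
        # Return the previously blocked thread to the ready list.
        self.ready.append(thread)
```

The key design point the handlers illustrate: the kernel never resumes a user-level thread directly; it only delivers events, and the user-level scheduler alone decides which thread runs next.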
   [Four figures (times T1-T4) omitted:
    T1: I/O happens for Thread (1) -- kernel "Add Processor" upcalls
        create scheduler activations (A) and (B)
    T2: A's thread has blocked on an I/O request -- the kernel upcalls
        "A's thread has blocked" on a new activation (C)
    T3: A's thread's I/O completed -- the kernel notifies the user
        level via a new activation (D)
    T4: A's thread resumes on Scheduler Activation (D)]
  User-Level Events Notifying the Kernel
Add-more-processors (additional # of processor needed)
   /* allocate more processors to this address space and start them running scheduler
   activations */

This-processor-is-idle()
   /* preempt this processor if another address space needs it */



The kernel’s processor allocator can favor address spaces that use fewer processors
   (and penalize those that use more).




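The allocation policy mentioned above (favor address spaces that use fewer processors) can be sketched as follows; the function and field names are hypothetical, not taken from any real kernel:

```python
def pick_recipient(address_spaces):
    """Choose which address space receives a freed processor.

    address_spaces: dict name -> {"wants_more": bool, "allocated": int}
    Among spaces asking for more processors, favor the one currently
    holding the fewest (penalizing heavy users); return None if no
    space wants more.
    """
    wanting = [n for n, s in address_spaces.items() if s["wants_more"]]
    if not wanting:
        return None
    # Ties broken by name for determinism.
    return min(wanting, key=lambda n: (address_spaces[n]["allocated"], n))
```

Combined with the This-processor-is-idle() hint, this gives the kernel enough information to move processors toward address spaces that can actually use them.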
  Thread Operation Latencies (microsec.)

  [Table omitted]
  Speedup

  [Figure omitted]
       Next Lecture
 Concurrency
 References





				