					                       UNIVERSITY of WISCONSIN-MADISON
                          Computer Sciences Department

CS 537                                                   Andrea C. Arpaci-Dusseau
Introduction to Operating Systems                         Remzi H. Arpaci-Dusseau
                                                                Haryadi S. Gunawi

        Threads and Synchronization
 Questions answered in this lecture:
      OS support for threads (ULT, KLT, Hybrid)
      Inter-process communication
      Synchronization properties
      Implementing critical sections

A process: unit of resource allocation and
  • If a process crashes, it won’t affect other processes
  • If a thread crashes, the other threads in the same
    process will be affected
  • A thread tracks its own PC, stack pointer,
    registers, etc.
  • Threads in the same process share the heap area
Where is thread management code
 implemented?
  • User-level or kernel-level?
       OS Support for Threads
Three approaches for thread support
  • User-level threads
  • Kernel-level threads
  • Hybrid of User-level and Kernel-level threads

         Thread Model #1: ULT
User-level threads (ULT): Many-to-one thread mapping
  • Implemented by user-level runtime libraries (e.g. Project #5)
     – Create, destroy, schedule, synchronize threads at user-level
     – Saving and restoring thread contexts
  • OS is not aware of user-level threads
     – OS thinks each process contains only a single thread of control
  • Does not require OS support; Portable
     – Can run on any OS
  • Can tune scheduling policy to meet application demands
     – You can create several scheduling policies in your thread library
     – Use one that is appropriate for your users (e.g. FIFO, SJF),
       rather than using a general purpose scheduler
     – Unfortunately: cooperative scheduling (non-preemptive). Why?
  • Lower overhead thread operations since no system calls
     – Thread switching does not require kernel mode
     – All thread management data structures in user address space
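As a rough illustration of what "saving and restoring thread contexts" at user level can look like, the sketch below switches cooperatively between a main context and one worker using the POSIX `<ucontext.h>` calls (`getcontext`, `makecontext`, `swapcontext`). The names `run_ult_demo`, `ult_trace_at`, the trace array, and the 64 KB stack size are assumptions made for this example; they are not taken from the course's thread library.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

#define ULT_STACK_SIZE (64 * 1024)   /* assumed stack size for the worker */

static ucontext_t main_ctx, worker_ctx;
static int trace[4];                 /* records the order in which steps ran */
static int trace_len;

/* Body of the user-level thread: runs, yields voluntarily, then finishes. */
static void worker(void) {
    trace[trace_len++] = 1;
    swapcontext(&worker_ctx, &main_ctx);   /* cooperative yield back to main */
    trace[trace_len++] = 3;
    /* Returning resumes the context named in uc_link (main_ctx). */
}

/* Hypothetical accessor so callers can inspect the recorded order. */
int ult_trace_at(int i) { return trace[i]; }

/* Runs the worker in two slices and returns the number of recorded steps. */
int run_ult_demo(void) {
    char *stack = malloc(ULT_STACK_SIZE);
    assert(stack != NULL);
    trace_len = 0;

    int rc = getcontext(&worker_ctx);      /* initialize the context structure */
    assert(rc == 0);
    worker_ctx.uc_stack.ss_sp = stack;     /* each thread needs its own stack */
    worker_ctx.uc_stack.ss_size = ULT_STACK_SIZE;
    worker_ctx.uc_link = &main_ctx;        /* where to go when worker returns */
    makecontext(&worker_ctx, worker, 0);

    swapcontext(&main_ctx, &worker_ctx);   /* run worker until it yields */
    trace[trace_len++] = 2;
    swapcontext(&main_ctx, &worker_ctx);   /* resume worker until it returns */
    trace[trace_len++] = 4;

    free(stack);
    return trace_len;
}
```

Nothing in this sketch forces the worker to give up the processor: control only changes hands at the explicit `swapcontext` calls, which is the cooperative (non-preemptive) behavior noted above.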
       Thread Model #1: ULT
  • Cannot leverage multiprocessors
    – OS assigns one process to only one processor at a time
    – In ULT scheme, OS only manages processes
    – Ex: 2 processors, 1 process, 2 threads/process
  • Entire process blocks when one thread blocks
    – Your thread library does not know if a read file operation
      will block (perform I/O to the disk) or not (read from
      memory)
    – Reason: OS does not inform the application that the
      system call is blocked
    – Hence, the thread library cannot switch to running another
      thread
    – Ex: 1 process in the system, 10 threads in the process,
      and thread-1 calls read(file), and file is not in memory.
      While thread-1 is waiting, the thread library cannot
      run any of the other 9 threads
        Thread Model #2 (KLT)
Kernel-level threads (KLT): One-to-one thread mapping
  • OS provides each user-level thread with a kernel thread
    (Linux pthread implementation)
  • Each kernel thread scheduled independently
  • Thread operations (creation, scheduling, synchronization)
    performed by OS
  • Each kernel-level thread can run in parallel on a
    multiprocessor
  • When one thread blocks, other threads from process can be
    scheduled
  • Higher overhead for thread operations (10-30x slower)
     – Requires a mode switch to the kernel
     – Ex: running user code, time-slice expires, change to kernel
       mode to perform thread scheduling
  • OS must scale well with increasing number of threads
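For comparison with the user-level model, here is a minimal sketch of kernel-scheduled threads created through the POSIX threads API (`pthread_create`, `pthread_join`). The function `parallel_sum` and the `struct range` type are made up for this example; the point is only that each thread is created and scheduled by the OS, so the two halves may run in parallel on a multiprocessor.

```c
#include <assert.h>
#include <pthread.h>

/* Work description for one thread: sum the integers in [lo, hi]. */
struct range { long lo, hi, sum; };

static void *sum_range(void *arg) {
    struct range *r = arg;
    r->sum = 0;
    for (long i = r->lo; i <= r->hi; i++)
        r->sum += i;
    return NULL;
}

/* Sums 1..n using two threads, each handling one half of the range. */
long parallel_sum(long n) {
    struct range halves[2] = { {1, n / 2, 0}, {n / 2 + 1, n, 0} };
    pthread_t tids[2];

    for (int i = 0; i < 2; i++) {
        int rc = pthread_create(&tids[i], NULL, sum_range, &halves[i]);
        assert(rc == 0);
    }
    for (int i = 0; i < 2; i++)
        pthread_join(tids[i], NULL);   /* wait for both threads to finish */

    return halves[0].sum + halves[1].sum;
}
```

Each thread writes only to its own `struct range`, so this particular example needs no synchronization beyond the joins.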
        Thread Model #3: Hybrid
Hybrid of Kernel and user-level threads: m-to-n thread mapping
   • Application creates m threads
   • OS provides pool of n kernel threads
   • Few user-level threads mapped to each kernel-level thread
   • Can get best of user-level and kernel-level implementations
   • Works well given many short-lived user threads mapped to
     constant-size pool
   • Complicated (OS must export interfaces about the processors)
   • How to select mappings?
       – Users are lazy
   • How to determine the best number of kernel threads?
       – User specified
       – OS dynamically adjusts number depending on system load
                  Linux Threads
Unique solution:
   • Does not recognize a distinction between threads and
     processes
   • User-level threads are mapped into kernel-level processes
   • These processes share the same group ID
      – To share resources and avoid the need
      – Avoid context switch when scheduler switches among processes
        in the same group
   • When a new thread is created, a process is “cloned” (rather than
     forked)
      – Some clone flags to define shared elements
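One way to observe this on Linux is to compare the IDs seen by two threads of the same program. The sketch below is Linux-specific and assumes the raw `gettid` system call is reachable through `syscall(SYS_gettid)`; the function `thread_has_own_task_id` and `struct ids` are names invented for this example. On a typical Linux system, `getpid()` reports the shared group ID, while the per-thread kernel ID differs.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

struct ids { pid_t pid; pid_t tid; };

/* Records the process (group) ID and the kernel task ID of the caller. */
static void *record_ids(void *arg) {
    struct ids *out = arg;
    out->pid = getpid();
    out->tid = (pid_t)syscall(SYS_gettid);
    return NULL;
}

/* Returns 1 if a new thread shares the PID but has its own kernel task ID. */
int thread_has_own_task_id(void) {
    struct ids self, other;
    pthread_t t;

    record_ids(&self);                               /* IDs of the calling thread */
    int rc = pthread_create(&t, NULL, record_ids, &other);
    assert(rc == 0);
    pthread_join(t, NULL);

    return self.pid == other.pid && self.tid != other.tid;
}
```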
Check it out:
   • Create a program that forks, run it in the background, and
     sleep in both processes. Run ‘ps’, you’ll get two sequential
     PIDs
   • Create a program that creates two threads, and then forks.
     Put a sleep. Run ‘ps’, you’ll see that the two PIDs are not
     sequential
      Synchronization Properties
Critical section: Required Properties
   • Mutual exclusion
      – Only one thread in critical section at a time
   • Progress (deadlock-free)
      – If several simultaneous requests, must allow one to proceed
      – Must not depend on threads outside critical section
   • Bounded waiting (starvation-free)
      – Must eventually allow each waiting thread to enter
Desirable Properties
   • Efficient
      – Don’t consume substantial resources while waiting
      – Do not busy wait (i.e., spin wait)
   • Fair
      – Don’t make some processes wait longer than others
    Software implementation of C/S:
              Attempt #1
Idea: loads and stores are atomic
Code uses a single shared lock variable
   Boolean lock = false; // shared variable

     while (lock) /* spin-wait */ ;
     lock = true;
     /* critical section */
     lock = false;

Problems? Which principle is violated?
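   • Does not guarantee mutual exclusion
       – Both threads can read lock == false before either sets it to true,
         so both enter the critical section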
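The failure can be reproduced deterministically by simulating the two threads one step at a time instead of running real threads. In the sketch below, each character of a schedule string ('0' or '1') lets that thread execute its next step of Attempt #1: first the test of `lock`, then the store `lock = true`. The function `attempt1_threads_in_cs` and the step names are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Next step each simulated thread will execute in Attempt #1. */
enum a1_step { A1_TEST, A1_SET, A1_IN_CS };

/* Replays a schedule and returns how many threads end up in the
 * critical section at the same time. */
int attempt1_threads_in_cs(const char *schedule) {
    bool lock = false;                          /* shared variable */
    enum a1_step pc[2] = { A1_TEST, A1_TEST };

    for (const char *s = schedule; *s != '\0'; s++) {
        int tid = *s - '0';
        assert(tid == 0 || tid == 1);
        switch (pc[tid]) {
        case A1_TEST:                           /* while (lock) ; */
            if (!lock) pc[tid] = A1_SET;
            break;
        case A1_SET:                            /* lock = true; */
            lock = true;
            pc[tid] = A1_IN_CS;
            break;
        case A1_IN_CS:                          /* stays in the critical section */
            break;
        }
    }
    return (pc[0] == A1_IN_CS) + (pc[1] == A1_IN_CS);
}
```

With the schedule "0011" thread 0 finishes both steps before thread 1 tests the lock, so only one thread gets in; with "0101" both threads pass the test before either one sets the lock, and both end up inside.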

                     Attempt #2
Each thread has its own lock; lock indexed by tid (0, 1)
Boolean lock[2] = {false, false}; // shared

  lock[tid] = true;
  while (lock[1-tid]) /* wait */ ;
  // critical section
  lock[tid] = false;

Problems? Which principle is violated?
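   • Deadlock (mutual blocking)
       – If both threads set their own lock before either tests the other's,
         each waits forever for the other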
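The same step-by-step simulation idea shows the mutual blocking in Attempt #2. Each character of the schedule string lets that thread take one step: first `lock[tid] = true`, then one test of `lock[1-tid]`. As before, this is a sketch with invented names (`attempt2_threads_in_cs`), not a run of real threads.

```c
#include <assert.h>
#include <stdbool.h>

/* Next step each simulated thread will execute in Attempt #2. */
enum a2_step { A2_SET_FLAG, A2_WAIT, A2_IN_CS };

/* Replays a schedule and returns how many threads reached the
 * critical section. */
int attempt2_threads_in_cs(const char *schedule) {
    bool lock[2] = { false, false };            /* shared flags */
    enum a2_step pc[2] = { A2_SET_FLAG, A2_SET_FLAG };

    for (const char *s = schedule; *s != '\0'; s++) {
        int tid = *s - '0';
        assert(tid == 0 || tid == 1);
        switch (pc[tid]) {
        case A2_SET_FLAG:                       /* lock[tid] = true; */
            lock[tid] = true;
            pc[tid] = A2_WAIT;
            break;
        case A2_WAIT:                           /* while (lock[1-tid]) ; */
            if (!lock[1 - tid]) pc[tid] = A2_IN_CS;
            break;
        case A2_IN_CS:
            break;
        }
    }
    return (pc[0] == A2_IN_CS) + (pc[1] == A2_IN_CS);
}
```

Once both flags are set, as in the schedule "01010101", neither thread's wait condition can ever become false, no matter how many more steps are granted.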

     Software implementation of C/S:
               Attempt #3
Code uses “turn” to define who should get in
Thread 0:
   while (turn != 0); // spin loop
   /* c/s */
   turn = 1;
Thread 1:
   while (turn != 1); // spin loop
   /* c/s */
   turn = 0;
Problems? Which principle is violated?
   • (Good: solves the mutual exclusion problem)
   • Threads must strictly alternate
       – Pace of execution is set by the slower of the two threads
       – Ex: T0 uses c/s every 1 sec, T1 uses c/s every 1 hour: not efficient!
         (T0 has to spin-wait for 1 hour)
   • Too much dependence
       – If thread 0 hits an infinite loop outside the critical section, T1 never
         gets in: no progress (not deadlock-free)
        Peterson’s Algorithm:
      Solution for Two Threads
Combine approaches 2 and 3: Separate locks and turn
int turn = 0; // shared
Boolean lock[2] = {false, false};

  lock[tid] = true;
  turn = 1-tid;
  while (lock[1-tid] && turn == 1-tid) /* wait */ ;

  // critical section

  lock[tid] = false;

         Peterson’s Algorithm:
Mutual exclusion: Enter critical section if and
 only if
  • Other thread does not want to enter, or
  • Other thread wants to enter, but it is your turn
Progress: Both threads cannot wait forever at
  while() loop
  • Completes if other thread does not want to enter
  • Other thread (matching turn) will eventually finish
Bounded waiting
  • Each thread waits for at most one critical section
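A caveat when trying Peterson's algorithm on real hardware: plain loads and stores may be reordered by the compiler or the processor, which breaks the argument above. The sketch below therefore uses C11 `<stdatomic.h>` operations with their default sequentially consistent ordering for `lock[]` and `turn`. The harness (`peterson_demo`, the shared counter, the iteration count) is an assumption added for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool lock[2];     /* lock[tid]: thread tid wants to enter */
static atomic_int turn;         /* which thread must wait if both want in */
static long counter;            /* protected by the Peterson lock */
static long iterations;

static void peterson_lock(int tid) {
    atomic_store(&lock[tid], true);
    atomic_store(&turn, 1 - tid);               /* give priority to the other */
    while (atomic_load(&lock[1 - tid]) && atomic_load(&turn) == 1 - tid)
        ;                                       /* spin-wait */
}

static void peterson_unlock(int tid) {
    atomic_store(&lock[tid], false);
}

static void *peterson_worker(void *arg) {
    int tid = *(int *)arg;
    for (long i = 0; i < iterations; i++) {
        peterson_lock(tid);
        counter++;                              /* critical section */
        peterson_unlock(tid);
    }
    return NULL;
}

/* Two threads each increment the counter n times; returns the final value. */
long peterson_demo(long n) {
    pthread_t threads[2];
    int ids[2] = { 0, 1 };

    counter = 0;
    iterations = n;
    atomic_store(&lock[0], false);
    atomic_store(&lock[1], false);
    atomic_store(&turn, 0);

    for (int i = 0; i < 2; i++) {
        int rc = pthread_create(&threads[i], NULL, peterson_worker, &ids[i]);
        assert(rc == 0);
    }
    for (int i = 0; i < 2; i++)
        pthread_join(threads[i], NULL);
    return counter;
}
```

If mutual exclusion holds, no increment is lost and the result is exactly 2n.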

HW support for implementing C/S
To implement, need atomic operations
Atomic operation: No other instructions can be interleaved
Examples of atomic operations
    •   Loads and stores of words
         – Load r1, B
         – Store r1, A
    •   Code between interrupts on uniprocessors
         – Disable timer interrupts, don’t do any I/O (“I want to get in to the C/S, do not bug
           me while I’m in there”)
         – Bad: only works on uniprocessors, cannot guarantee mutual exclusion on
           multiprocessors (i.e. this does not solve: “can I get into the critical section now?”)
    •   Special hw instructions
         – Compare&Swap
           // the whole function is executed atomically by the hardware
           int compare_and_swap(int *addr, int testval, int newval) {
              int oldval = *addr;
              if (oldval == testval) *addr = newval;
              return oldval; // the thread that sees testval (the old value) is the winner
           }
         – Example: (lock is initially 0)
              while (compare_and_swap(&lock, 0, 1) == 1); /* do nothing. This is called spin-waiting */
              /* critical section */
              lock = 0;
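In portable C, a compare-and-swap instruction is usually reached through C11 `<stdatomic.h>`. The sketch below builds a spin lock on `atomic_compare_exchange_strong`, which differs slightly from the pseudocode above: it returns whether the swap happened and writes the observed value back into `expected`. The harness names (`cas_demo`, `cas_counter`, the limit of 8 threads) are assumptions for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define CAS_MAX_THREADS 8       /* assumed upper bound for the demo */

static atomic_int cas_lock;     /* 0 = free, 1 = held */
static long cas_counter;        /* protected by cas_lock */
static long cas_iterations;

static void spin_lock(void) {
    int expected = 0;
    /* On failure, expected is overwritten with the current value of the
     * lock, so it must be reset to 0 before retrying. */
    while (!atomic_compare_exchange_strong(&cas_lock, &expected, 1))
        expected = 0;           /* spin-waiting */
}

static void spin_unlock(void) {
    atomic_store(&cas_lock, 0); /* set the lock back to 0 */
}

static void *cas_worker(void *arg) {
    (void)arg;
    for (long i = 0; i < cas_iterations; i++) {
        spin_lock();
        cas_counter++;          /* critical section */
        spin_unlock();
    }
    return NULL;
}

/* nthreads threads each increment the counter n times; returns the total. */
long cas_demo(int nthreads, long n) {
    pthread_t threads[CAS_MAX_THREADS];
    assert(nthreads > 0 && nthreads <= CAS_MAX_THREADS);

    cas_counter = 0;
    cas_iterations = n;
    atomic_store(&cas_lock, 0);

    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&threads[i], NULL, cas_worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(threads[i], NULL);
    return cas_counter;
}
```

Unlike Peterson's algorithm, this works for any number of threads, but it still spin-waits and makes no promise about bounded waiting.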
Future lectures:
    •   HW instructions too low-level (e.g. must set lock back to 0 again, error-prone)
