threads and fork

					Systems Programming


         Fork Wait Exit Exec Threads
                Reference
• 10.3.2 Process Management System Calls in
  UNIX
• Modern Operating Systems
   2nd Edition, Andrew S. Tanenbaum
• Man pages available in course folder
               Process Creation
• Includes
   Build kernel data structures
   Allocate memory
• Reasons to create a process
     •   Submit a new batch job/Start program
     •   User logs on to the system
     •   OS creates on behalf of a user (printing)
     •   Spawned by existing process
       Unix Process Creation
• When the system starts up it is running in
  kernel mode
• There is only one process, the initial process.
• At the end of system initialization, the initial
  process starts up another kernel process.
• The init kernel process has a process identifier
  of 1.
           Process Creation
• These new processes may themselves go on
  to create new processes.
• All of the processes in the system are
  descended from the init kernel thread.
• You can see the family relationship between
  the running processes in a Linux system using
  the pstree command
• A new process is created by a fork() system
  call
         The fork() system call
At the end of the system call there is a new process
  waiting to run once the scheduler chooses it
• A new data structure is allocated
• The new process is called the child process.
• The existing process is called the parent process.
• The parent gets the child's pid returned to it.
• The child gets 0 returned to it.
• Both parent and child execute at the same point after
  fork() returns
      Unix Process Control
int pid;
int status = 0;

if (pid = fork()) {
     /* parent */
     ……
     pid = wait(&status);
} else {
     /* child */
     ……
     exit(status);
}

• The fork syscall returns a zero to the child and the child
  process ID to the parent.
• Fork creates an exact copy of the parent process.
• Parent uses wait to sleep until the child exits; wait
  returns child pid and status.
• Wait variants allow wait on a specific child, or
  notification of stops and other signals.
• Child process passes status back to parent on exit,
  to report success/failure.
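         Another example
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid;
    int rv = 0;    /* exit status chosen by the child */
    int status;    /* raw status word filled in by wait() */

    switch (pid = fork()) {
    case -1:
        perror("fork");   /* something went wrong */
        exit(1);          /* parent exits */
    case 0:
        printf(" CHILD: This is the child process!\n");
        printf(" CHILD: My PID is %d\n", (int) getpid());
        printf(" CHILD: My parent's PID is %d\n", (int) getppid());
        printf(" CHILD: Enter my exit status (make it small): ");
        if (scanf(" %d", &rv) != 1)
            rv = 1;       /* unreadable input: report failure */
        printf(" CHILD: I'm outta here!\n");
        exit(rv);
    default:
        printf("PARENT: This is the parent process!\n");
        printf("PARENT: My PID is %d\n", (int) getpid());
        printf("PARENT: My child's PID is %d\n", (int) pid);
        printf("PARENT: I'm now waiting for my child to exit()...\n");
        wait(&status);
        /* wait() fills in an encoded status word; decode it before printing */
        if (WIFEXITED(status))
            printf("PARENT: My child's exit status is: %d\n",
                   WEXITSTATUS(status));
        printf("PARENT: I'm outta here!\n");
    }
    return 0;
}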
         Child Process Inherits
•   Stack
•   Memory
•   Environment
•   Open file descriptors.
•   Current working directory
•   Resource limits
•   Root directory
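    Child Process Does NOT Inherit
•   Process ID (the child gets a new one)
•   Parent process ID (the child's parent is the forking process)
•   Process times (reset in the child)
•   File descriptor table (the child gets its own copy of the descriptors)
•   Resource utilization (initialized to zero)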
     How can a parent and child
      process communicate?
• Through any of the normal IPC mechanisms.
• But they also have special ways to communicate
• For example
   The variables are replicas
   The parent receives the exit status of the child
         The wait() System Call
• A child program returns a value to the parent, so
  the parent must arrange to receive that value
• The wait() system call serves this purpose
    pid_t wait(int *status)
    it puts the parent to sleep waiting for a child's result
    when a child calls exit(), the OS unblocks the parent
     and returns the value passed by exit() as a result of
     the wait call (along with the pid of the child)
    if the caller has no children at all, wait() returns
     immediately with -1 (errno is set to ECHILD)
    also, if there are zombies, wait() returns one of the
     values immediately (and deallocates the zombie)
           What is a zombie?
• In the interval between the child terminating
  and the parent calling wait(), the child is said to
  be a 'zombie'.
• Even though it's not running, it's taking up an
  entry in the process table.
• The process table has a limited number of
  entries.
            What is a zombie?
• If the parent terminates without calling wait(), the
  child is adopted by init.
The solution is:
• Ensure that your parent process calls wait(), waitpid(), or a
  similar call for every child process that terminates.
                   waitpid()
#include <sys/types.h>
#include <sys/wait.h>
pid_t waitpid(pid_t pid, int *stat_loc, int options);

waitpid(child_pid, (int *) 0, WNOHANG);
                                 exit()
void exit(int status);
• After the program finishes execution, it calls exit()
• This system call:
    takes the “result” of the program as an argument
    closes all open files, connections, etc.
    deallocates memory
    deallocates most of the OS structures supporting the
     process
    checks if parent is alive:
       • If so, it holds the result value until parent requests it, process does
         not really die, but it enters the zombie/defunct state
       • If not, it deallocates all data structures, the process is dead
                  execv()
• We usually want the child process to run some
  other executable
• For Example, ls
          The ls Command
[Figure: Steps in executing the command ls typed to the shell]
                            execv
• int execv(const char *path, char *const argv[]);
• replaces the current process image with a new process image.
• path is the filename to be executed by the child
  process
• When a C-language program is executed as a result
  of this call, it is entered as a C-language function call
  as follows:
    int main (int argc, char *argv[]);
• The argv array is terminated by a null pointer.
• The null pointer terminating the argv array is not
  counted in argc.
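#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
   pid_t pid;
   int status = 0;

   pid = fork();
   if (pid == -1)
   {
         perror("fork");
         exit(1);
   }
   if (pid > 0)
   {
         /* parent: sleep until the child running ls exits */
         pid = wait(&status);
   }
   else
   {
         char *newargv[3];

         newargv[0] = "ls";     /* argv[0] is the program name by convention */
         newargv[1] = NULL;     /* the argv array must end with a null pointer */

         execv("/bin/ls", newargv);
         /* execv() returns only if it failed */
         perror("exec: execv() failed");
         exit(1);
    }
    return 0;
}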
                            Threads
• process: address space + single thread of control
• sometimes want multiple threads of control (flow) in same
  address space
• quasi-parallel
• threads separate resource grouping & execution
• thread: program counter, registers, stack
• also called lightweight processes
• multithreading: avoid blocking when waiting for resources
    multiple services running in parallel
• state: running, blocked, ready, terminated
             Why threads?
• Parallel execution
• Shared resources → faster communication
  without serialization
• easier to create and destroy than processes
  (100x)
• useful if some are I/O-bound → overlap
  computation and I/O
• easy porting to multiple CPUs
            Thread variants
• POSIX (pthreads)
• Sun threads (mostly obsolete)
• Java threads
            Creating a thread
int pthread_create(pthread_t *tid, const
  pthread_attr_t *, void *(*func)(void *), void
  *arg);
• start function func with argument arg in new thread
• return 0 if ok, >0 if not
• careful with arg argument
      Network server example
• Lots of little requests (hundreds to thousands a
  second)
• simple model: new thread for each request → doesn't
  scale (memory, creation overhead)
• dispatcher reads incoming requests
• picks idle worker thread and sends it message with
  pointer to request
• if thread blocks, another one works on another
  request
• limit number of threads
             Worker thread
while (1) {
  wait for work(&buf);
  look in cache
  if not in cache
    read page from disk
  return page
}
           Leaving a thread
• threads can return value, but typically NULL
• just return from function (return void *)
• main process exits → all threads are killed
• pthread_exit(void *status)
      Thread synchronization
• mutual exclusion, locks: mutex
   protect shared or global data structures
• synchronization: condition variables
• semaphores
Threads Programming
  Threads vs. Processes

Creation of a new process using fork is
  expensive (time & memory).

A thread (sometimes called a lightweight
  process) does not require lots of
  memory or startup time.
            fork()
[Diagram: fork() gives the new Process B its own copies of Process A's
 global variables, code, and stack.]
      pthread_create()
[Diagram: pthread_create() adds Thread 2 to Process A with its own stack;
 the global variables and code are shared with Thread 1.]
           Multiple Threads

Each process can include many threads.

All threads of a process share:
     memory (program code and global data)
     open file/socket descriptors
     signal handlers and signal dispositions
     working environment (current directory, user ID,
      etc.)
Thread-Specific Resources
Each thread has its own:
  – Thread ID (integer)
  – Stack, Registers, Program Counter
  – errno (if not - errno would be useless!)


Threads within the same process can
  communicate using shared memory.
        Must be done carefully!
         Posix Threads
We will focus on Posix Threads - most
 widely supported threads programming
 API.
         Thread Creation
pthread_create(
    pthread_t *tid,
    const pthread_attr_t *attr,
    void *(*func)(void *),
    void *arg);


func is the function to be called.
When func() returns the thread is terminated.
         pthread_create()

• The return value is 0 for OK.
 positive error number on error.


• Does not set errno !!!


• Thread ID is returned in tid
             Thread IDs

Each thread has a unique ID; a thread can
  find out its ID by calling pthread_self().

Thread IDs are of type pthread_t, which is
  often an unsigned integer type (POSIX treats
  it as opaque). When debugging, it's often
  useful to do something like this:

printf("Thread %lu:\n", (unsigned long) pthread_self());
      Thread Arguments

When func() is called the value arg
 specified in the call to pthread_create()
 is passed as a parameter.


func can have only 1 parameter, and it
  can't be larger than the size of a void *.
  Thread Arguments (cont.)

Complex parameters can be passed by creating
 a structure and passing the address of the
 structure.

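The structure shouldn't be a local variable (of the
  function calling pthread_create) that can go out of
  scope while the thread is still using it!!
     - the thread runs on its own stack and may outlive
       the caller's stack frame!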
        Thread args example
struct ints2 { int x, y; };

void *blah( void *arg) {
  struct ints2 *foo = (struct ints2 *) arg;
  printf("%lu sum of %d and %d is %d\n",
       (unsigned long) pthread_self(), foo->x, foo->y,
       foo->x+foo->y);
  return(NULL);
}
         Thread Lifespan
Once a thread is created, it starts
 executing the function func() specified in
 the call to pthread_create().

If func() returns, the thread is terminated.

A thread can also be terminated by calling
  pthread_exit().

If main() returns or any thread calls exit(), all
   threads are terminated.
         Detached State
Each thread can be either joinable or
  detached.

Detached: on termination all thread
 resources are released by the OS. A
 detached thread cannot be joined.
         Joinable Thread
Joinable: on thread termination the thread
  ID and exit status are saved by the OS.

  One thread can "join" another by calling
  pthread_join - which waits (blocks) until
  a specified thread exits.

int pthread_join( pthread_t tid,
           void **status);
   Shared Global Variables
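#include <pthread.h>
#include <stdio.h>

int counter=0;      /* shared by all threads, unprotected */

void *pancake(void *arg) {
    counter++;      /* not atomic: this is a data race */
    printf("Thread %lu is number %d\n",
           (unsigned long) pthread_self(),counter);
    return NULL;
}

int main(void) {
    int i; pthread_t tid[10];
    for (i=0;i<10;i++)
        pthread_create(&tid[i],NULL,pancake,NULL);
    /* wait for the threads; returning from main() would kill them */
    for (i=0;i<10;i++)
        pthread_join(tid[i],NULL);
    return 0;
}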
    DANGER! DANGER!
       DANGER!
Sharing global variables is dangerous -
 two threads may attempt to modify the
 same variable at the same time.

Just because you don't see a problem
  when running your code doesn't mean
  it can't and won't happen!!!!
        Avoiding Problems

pthreads includes support for Mutual Exclusion
  primitives that can be used to protect against
  this problem.

The general idea is to lock something before
  accessing global variables and to unlock as
  soon as you are done.

Shared socket descriptors should be treated as
  global variables!!!
          pthread_mutex

A global variable of type pthread_mutex_t is
  required:

pthread_mutex_t counter_mtx=
     PTHREAD_MUTEX_INITIALIZER;

Initialization to PTHREAD_MUTEX_INITIALIZER
is required for a static variable!
       Locking and Unlocking
• To lock use:
int pthread_mutex_lock(pthread_mutex_t *mptr);

• To unlock use:
int pthread_mutex_unlock(pthread_mutex_t *mptr);
      Example Problem
       (Pop Tart Quiz)
A server creates a thread for each client.
  No more than n threads (and therefore
  n clients) can be active at once.

How can we have the main thread know
 when a child thread has terminated
 and it can now service a new client?
       pthread_join() doesn’t help
pthread_join (which is sort of like wait()) requires that
  we specify a thread id.

We can wait for a specific thread, but we can't wait for
 "the next thread to exit".
             Use a
        global variable?
When each thread starts up:
  – acquires a lock on the variable (using a
    mutex)
  – increments the variable
  – releases the lock.


When each thread shuts down:
  – acquires a lock on the variable (using a
    mutex)
  – decrements the variable
  – releases the lock.
  What about the main loop?
active_threads=0;
// start up n threads on first n clients
// make sure they are all running
while (1) {
      // have to lock/release active_threads
      if (active_threads < n)
               // start up thread for next client
      busy_waiting(is_bad);
}
     Condition Variables
pthreads support condition variables,
  which allow one thread to wait (sleep)
  for an event generated by any other
  thread.

This allows us to avoid the busy waiting
  problem.


pthread_cond_t foo =
  PTHREAD_COND_INITIALIZER;
    Condition Variables (cont.)

A condition variable is always used with a mutex.

pthread_cond_wait(pthread_cond_t *cptr,
                  pthread_mutex_t *mptr);

pthread_cond_signal(pthread_cond_t *cptr);


             don’t let the word signal confuse you -
             this has nothing to do with Unix signals
  Revised menu strategy
Each thread decrements active_threads
  when terminating and calls
  pthread_cond_signal to wake up the main
  loop.

The main thread increments active_threads
  when each thread is started and waits
  for changes by calling pthread_cond_wait.
  Revised menu strategy
All changes to active_threads must be
  inside the lock and release of a
  mutex.

If two threads are ready to exit at
   (nearly) the same time – the second
   must wait until the main loop
   recognizes the first.

We don’t lose any of the condition
 signals.
           Global Variables
// global variable the number of active
// threads (clients)
int active_threads=0;

// mutex used to lock active_threads
pthread_mutex_t at_mutex =
   PTHREAD_MUTEX_INITIALIZER;

// condition var. used to signal changes
pthread_cond_t at_cond =
    PTHREAD_COND_INITIALIZER;
     Child Thread Code
void *cld_func(void *arg) {
  ...
  // handle the client
  ...
  pthread_mutex_lock(&at_mutex);
  active_threads--;
  pthread_cond_signal(&at_cond);
  pthread_mutex_unlock(&at_mutex);
  return NULL;
}
             Main thread
// no need to lock yet
active_threads=0;
while (1) {
    pthread_mutex_lock(&at_mutex);
    while (active_threads < n ) {
        active_threads++;
        pthread_create(…);
    }
    // IMPORTANT! Must happen while the mutex lock is held.
    pthread_cond_wait( &at_cond, &at_mutex);
    pthread_mutex_unlock(&at_mutex);
}
  Other pthread functions
Sometimes a function needs to have
 thread specific data (for example, a
 function that uses a static local).

Functions that support thread specific
  data:
pthread_key_create()
pthread_once()
pthread_getspecific()
pthread_setspecific()

The book has a nice example creating a safe
  and efficient readline()
          Thread Safe library
              functions
• You have to be careful with libraries.

• If a function uses any static variables (or global
  memory) it's not safe to use with threads!

• The book has a list of the Posix thread-safe
  functions…
        Thread Summary
Threads are awesome, but dangerous.
You have to pay attention to details or it's
  easy to end up with code that is incorrect
  (doesn't always work, or hangs in
  deadlock).

Posix threads provides support for mutual
 exclusion, condition variables and thread-
 specific data.

				