Concurrency


          What is Concurrency
• Ability to execute two operations at the
  same time
• Physical concurrency
  – multiple processors on the same machine
  – distributing across networked machines
• Logical concurrency
  – the illusion of parallelism, or only partial parallelism
• Designer/programmer doesn’t care which!
            Real or Apparent
       depends on your point of view

• Multiple computers in distributed computing or
  multiprocessors on a computer
• Multiple clients on same or multiple computers
• Multiple servers on one or many machines
• Where does the complexity lie?
  – It depends on how you design the system
    Why is concurrency important?

• One machine is only capable of a limited speed
• Multiple machines/processors
  – share workload and gain better utilization
  – optimize responsibility/requirements to each
    machine’s ability
  – place secure processes in secure environments
  – parallel processors are tailored to the problem domain
• Single machine
  – OS can give limited parallelism through scheduling
   Concurrency considerations

• What level of concurrency to consider
• How it is handled on a single processor
  – understand process scheduling
• How to share resources
• How to synchronize activity
• How to synchronize I/O specifically
Process Scheduling
• A process is the activation of a program.
• Can have multiple processes of a program
• Entities associated with a process
  –   Instruction pointer
  –   user who owns it
  –   memory location of user data areas
  –   own run-time stack
         Process Scheduling
• Order of process execution is unpredictable
• Duration of execution is unpredictable
• Whether there is an appearance or real
  concurrency is not relevant to the designer.
• There will generally be a need to
  resynchronize activity between cooperating
  processes
            Context switch
• When OS switches running process, it
  manipulates internal process tables
• Loading/storing registers must be done
• Threads minimize that effort. Same
  process but a different stack.
• Necessary overhead but must balance
  overhead with advantage of concurrency
Operating Systems Scheduling

      Ready        Running        Blocked
       205          200            177
       198                         206
       201                         180

 Process 200 blocks when reading from the disk

      Ready        Running        Blocked
       198          205            177
       201                         206
                                   180
                                   200
      What other activities are
      important in scheduling?

• Jobs go from RUNNING to READY when
  they use up their time slice
• Jobs go from blocked to READY when the
  I/O operation they are waiting for completes
• Jobs go from RUNNING to being removed
  completely upon exit
    Concurrency at many levels
• Process level ..
  – Unix fork command…. (review example)
  – network client-server level
• Subprocess level (like a procedure) .. thread
• Statement level
• Scheduling level
        How do you get limited
       parallelism from the OS?
       [Timeline] While Process A runs on the CPU, the disk
       is reading for B (B is blocked). A then blocks on a
       read of its own; Process C runs for its time slice
       while the disk reads for A. Then Process B runs.

       There are times when both processors (the CPU and the
       disk controller) are busy, so real parallelism does
       occur, but not at the level of the CPU.
What if YOU have to create the concurrency?

         High-level Concurrency
             Unix and 95/98/NT

Unix
• process concurrency
• use fork and exec
• fork
  – clones self
  – parent and child differ by ppid
  – two processes
• exec
  – new process replaces original
  – single, different process

95/98/NT
• thread is part of the same process
• own copy of locals
• shared copy of globals
• shared resources like file descriptors
• each has its own activation record
• parent and child do not always begin/continue
  at the same spot as in fork
• thread ceases execution on return
• _beginthread() needs a procedure
• CreateProcess() is like fork/exec
                  Unix fork()
                  process level

void main()        produces
{ fork();                          HiHi
  cout << "Hi";
}

          Question is... who "Hi"ed first?
 After the fork there are two identical processes:

  Process 1                  Process 2
 void main()                void main()
 { fork();                  { fork();
   cout << "Hi";}             cout << "Hi";}

 Either process may print first, but the output over
 time is HiHi in both cases; only which process
 "Hi"ed first differs.
     Another Example
           cout << "a";
           cout << "b";

How many processes are generated?
How many possible outputs can you see?
       A more common
        unix Example

  STEP 1: /bin/sh forks a new shell.
  STEP 2: the new shell execs (replaces itself with)
          a new program.
  STEP 3: when the program finishes, it exits back to
          the original shell.
       Use in servers (and clients)
            (not important for us)
while (1)
   {   talksock = accept (listsock...
       cid = fork();
       if (cid > 0)
         { // this is parent code
           close (talksock);   // parent keeps using listsock
           …                   // and repeats the loop
         }
       else
         { // this is child code
           close (listsock);   // child uses talksock
           …
         }
   }
(windows style)

void main()
{ count(4);                      1
}                                2
void count(int i)                3
{int j;                          4
  for (j=1; j<=i; j++)
    cout << j<<endl;
}
 void main()
 { _beginthread( (void (*)(void *)) count, 0, (void *) 5 );
   count(5);    // main counts too, concurrently
 }
void count(int i)                                   1       1
{int j;                                             2       2
  for (j=1; j<=i; j++)                              1       1
    cout << j<<endl;                                2       2
}                                                   3       3
                                                    3       3
                                                    4       4
                                                    4       4
              Synchronization
  Concurrency frequently requires two kinds:

Cooperation Synchronization
   A is working on something
   B must wait for A to finish

   A does this      x = f + g;
   B does this      h = x + y;

Competition Synchronization
   A needs to read a stream
   B needs to read the stream
   Only one can read at a time

   A does this      instr >> a;
   B does this      instr >> b;
   We'll see how to do this later!
       The synchronization problem

  Shared Memory      Task A: T=T+1     Task B: T=T*2
      T=3            fetch T (3)
                     incr T  (4)
                     loses CPU         fetch T (3)
 time |                                double T (6)
      |    T=6                         store T
      v              gets CPU
           T=4       store T

         TRY THIS: What other combinations could occur?
    The essence of the problem

• There are times during which exclusive
  access must be granted
• These areas of our program are called
  critical sections
• Sometimes this is handled by disabling
  interrupts so process keeps processor
• Most often through a more controlled
  mechanism like a semaphore
          Where do we see it?
•   Database access
•   Any data structures
•   File/Printer/Memory resources
•   Any intersection of need between
    processing entities for data
  Semaphores: a synchronization
   solution for c/c++ (java later)
• One means of synchronizing activity
• Managed by the operating system
  – implementing one yourself will not work
    (cannot guarantee mutual exclusion)
• Typical wait and release functions called
  by applications
• Count associated
  – 0/1 binary implies only a single resource to manage
  – larger value means multiple resources
• Queue for waiting processes (blocked)
           wait and release

wait ( semA )
  { if semA > 0 then
      decr semA;
    else
      put job in semA queue; (block it)
  }

release ( semA )
  { if semA queue empty
      incr semA;
    else
      remove job from semA queue; (unblock it)
  }

        This represents what the operating system
        does when an application asks for access to
        the resource by calling wait or release
        on the semaphore.
             Standard example
semaphore fullspots, emptyspots;
fullspots.count = 0;                     shared resource
emptyspots.count = BUFLEN;

task consumer;
  loop
    wait (fullspots);
    FETCH(VALUE);
    release (emptyspots);
  end loop;
end consumer;

task producer;
  loop
    wait (emptyspots);
    DEPOSIT(VALUE);                Why do you need TWO
    release (fullspots);           semaphores? Are adding
  end loop;
end producer;                      and removing the same?
   Competition Synchronization
What if multiple processes want to put objects in the buffer?
   We might have a similar synchronization problem.
           Use BINARY semaphore for access
            COUNTING semaphores for slots

semaphore access, fullspots, emptyspots;
access.count=1; // BINARY
emptyspots.count= BUFLEN;

task consumer;                             task producer;
  loop                                       loop
    wait (fullspots);                          wait (emptyspots);
    wait (access);                             wait (access);
    FETCH(VALUE);                              DEPOSIT(VALUE);
    release (access);                          release (access);
    release (emptyspots);                      release (fullspots);
  end loop;                                  end loop;
end consumer;                              end producer;

               Remind you of a printer queue problem?

Parallel Fortran
Statement-level Parallel Process

     PARALLEL LOOP 20 K=1,20
     PRIVATE (T)
     DO 20 I = 1, 500
      T = 0
      DO 30 J = 1, 1500
  30 T = T + ( B(J,K)*A(I,J))
  20 C(I,K)=T
