					           Mutual Exclusion in Distributed Systems

 Single Processor Systems
     use semaphore, monitor, etc.

 Distributed Systems

     centralized algorithm
        • a central server coordinates the order in which processes enter the CS
        • can overload the central site
        • introduces a single point of failure in the system
           Mutual Exclusion in Distributed Systems

 decentralized algorithms
     non-token based algorithms
       • Lamport's algorithm
       • Ricart-Agrawala's algorithm
       • Maekawa's algorithm

     token based algorithms
        • token-ring algorithm
        • broadcast algorithm
        • tree-based algorithm

     self-stabilizing algorithm
                          Lamport's Algorithm

 Request the CS:
    1. Pi broadcasts request (ti, i) to all processors and puts the request in its local
       queue (in the order of timestamps t of the requests)
    2. Pj upon receiving the request (ti, i), puts the request in its local queue (in the
       order of timestamps t of the requests) and sends reply (tj, j) to Pi

 Enter the CS:
    1. if Pi has received reply messages from all sites with timestamps larger than ti
        and its request is at the top of the queue, then it enters the CS

 Release the CS:
    1. Pi, upon exiting CS, removes its request from the queue and sends release (ti)
       to all processors
    2. Pj, upon receiving the release message, removes Pi's request from its
       queue
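
The entry test above can be sketched as a small function. This is only an illustration: the names `can_enter`, `queue`, and `latest_ts` are hypothetical, requests are `(timestamp, pid)` tuples, and ties between equal timestamps are assumed to be broken by process id.

```python
def can_enter(queue, latest_ts, me):
    """Return True if process `me` may enter the CS under Lamport's rule.

    queue:     locally known requests as (timestamp, pid) tuples
    latest_ts: pid -> largest timestamp received from that process
               (assumed to hold an entry for every other process)
    """
    mine = [req for req in queue if req[1] == me]
    if not mine:
        return False
    my_req = mine[0]
    # Condition 1: our request is first in timestamp order (pid breaks ties).
    at_head = min(queue) == my_req
    # Condition 2: every other process has sent a message stamped later
    # than our request.
    heard_from_all = all(
        ts > my_req[0] for pid, ts in latest_ts.items() if pid != me
    )
    return at_head and heard_from_all
```

For example, with requests `(1, 0)` and `(2, 1)` queued, P0 may enter once it has a message from P1 with a timestamp above 1, while P1 must wait for P0's release.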
                Lamport's Algorithm -- Properties

 this algorithm requires
     a total ordering of events
     all sites to be alive
 requires 3(N−1) messages per request
 response time under very low load
       2T
       T: per-message communication latency
       assume no one is in the CS
       N−1 request messages sent in parallel (T)
       N−1 reply messages sent in parallel (T)
       so the requester enters the CS after 2T
                    Ricart-Agrawala's Algorithm

 Request the CS:
    1. Pi broadcasts request (ti, i) to all processors
    2. Pj, upon receiving the request
          a) sends reply (tj, j) to Pi if Pj is neither requesting nor executing in the
             CS
          b) sends reply (tj, j) to Pi if Pj is requesting the CS but the timestamp for
             Pj’s request is larger than ti
          c) defers the request otherwise

 Enter the CS:
    1. if Pi has received reply messages from all sites, then it enters the CS

 Release the CS:
    1. Pi, upon exiting the CS, sends reply (i) to every process whose request it deferred
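
The reply-or-defer rule at Pj can be written as a single decision function. This is a minimal sketch: the state names and the function `on_request` are hypothetical, and equal timestamps are assumed to be ordered by process id through tuple comparison.

```python
def on_request(state, my_request, incoming):
    """Decide how Pj handles an incoming request (ti, i).

    state:      "idle", "requesting", or "in_cs" (hypothetical labels)
    my_request: Pj's own pending request (tj, j), or None when idle
    incoming:   the received request (ti, i)
    Returns "reply" or "defer".
    """
    # Case a: Pj is neither requesting nor executing in the CS.
    if state == "idle":
        return "reply"
    # Case b: Pj is requesting, but its own request is later than Pi's.
    if state == "requesting" and my_request > incoming:
        return "reply"
    # Case c: Pj is in the CS, or its own request has priority.
    return "defer"
```

Deferred requests are answered only when Pj leaves the CS, which is why no separate release message is needed.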
                    Ricart-Agrawala's Algorithm

 this algorithm requires
     a total ordering of events
     all sites to be alive
 requires 2(N−1) messages per request
 response time under very low load
     2T
     N−1 request messages sent in parallel (T)
     N−1 reply messages sent in parallel (T)
                        Maekawa's Algorithm

 Request set
    each node has a request set
    when the node wants to enter the critical section, it sends its request to all
     nodes in its request set
    the request set of each node does not include all nodes in the system
    the intersection of any two request sets is non-empty


 Example
      consider three nodes, X, Y, and Z
      X’s request set includes nodes X and Y
      Y’s request set includes nodes Y and Z
      Z’s request set includes nodes Z and X
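
The non-empty intersection requirement can be checked mechanically. A minimal sketch, using a hypothetical helper `pairwise_intersecting` on the three-node example:

```python
from itertools import combinations

def pairwise_intersecting(request_sets):
    """True if every two request sets share at least one node."""
    return all(a & b for a, b in combinations(request_sets.values(), 2))

# The three-node example: each set contains its owner and one other node.
example = {
    "X": {"X", "Y"},
    "Y": {"Y", "Z"},
    "Z": {"Z", "X"},
}
```

`pairwise_intersecting(example)` holds for these sets. It would fail if, say, each node's set contained only the node itself.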
                          Maekawa's Algorithm

 Request the CS:
    1. Pi multicasts request (ti, i) to its request set, including itself
    2. Pj upon receiving the request
          a) if it is not currently locked, then locks itself and sends reply (j) to Pi
          b) otherwise, puts the request in a queue (in the order of the timestamp)
 Enter the CS:
    1. if Pi has received reply messages from all sites in its request set, then it
        enters the CS
 Release the CS:
    1. Pi upon exiting CS, sends release (ti) to all processors in its request set
    2. Pj upon receiving the message
          a) if the waiting queue is not empty then it removes the entry in the queue
             and sends reply (j) to that node
          b) otherwise, unlocks itself
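
The lock each member of a request set keeps can be sketched as a small class. This covers only the basic protocol above, without deadlock handling. The class `Arbiter` and its method names are hypothetical, and each method returns the process that should receive a reply, or `None`.

```python
import heapq

class Arbiter:
    """Lock state kept by one node Pj for the requests it mediates."""

    def __init__(self):
        self.locked_for = None   # pid currently holding this node's lock
        self.waiting = []        # min-heap of (timestamp, pid)

    def on_request(self, ts, pid):
        if self.locked_for is None:
            self.locked_for = pid        # lock and reply immediately
            return pid
        heapq.heappush(self.waiting, (ts, pid))   # queue by timestamp
        return None

    def on_release(self):
        if self.waiting:
            _, pid = heapq.heappop(self.waiting)  # earliest waiting request
            self.locked_for = pid
            return pid
        self.locked_for = None           # nobody waiting: unlock
        return None
```

Note that the first request to arrive gets the lock even if a request with a smaller timestamp arrives later, which is the root of the deadlock problem.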
              Maekawa's Algorithm -- Properties

 requires a total ordering of events
 requires about 3√N messages per request (K ≈ √N)
 response time under very low load
     2T
     K−1 request messages sent in parallel (T)
     K−1 reply messages sent in parallel (T)
 has the potential deadlock problem
   Potential Deadlock Problem in Maekawa's Algorithm

 requests reach different sites in different order
     consider nodes X, Y, Z, who issue requests to enter the critical section
     X’s request has the lowest timestamp, Z’s request has the highest
     A is the mediator of requests from X and Y
     B is the mediator of requests from Y and Z
     C is the mediator of requests from X and Z
     A received X’s request first and locked itself for X
     B received Y’s request first and locked itself for Y
     C received Z’s request first and locked itself for Z
    
     X will not get a reply from C
     Y will not get a reply from A
     Z will not get a reply from B
     deadlock
         Solution to the Potential Deadlock Problem

 detect the potential deadlock
     when a request with a smaller timestamp is received, while the node is
      locked for a request with a larger timestamp
 resolution
     ask the requester with a larger timestamp to give up its granted privilege if
      it has not already gotten all replies
         • for the previous example, C asks Z to give up the granted privilege
           Resolve the Potential Deadlock Problem

 Request the CS:
   1. Pi multicasts request (ti, i) to its request set, including itself
   2. Pz upon receiving the request
         a) if it is not currently locked, then locks itself and sends reply (z) to Pi
         b) if it is currently locked for Pk, then
               • if request from Pk has a smaller timestamp then puts the new
                   request in a waiting queue (in the order of the timestamp) and sends
                   failed (z) to Pi
               • otherwise (Pi 's request has a smaller timestamp), sends inquire (z)
                   to Pk
           Resolve the Potential Deadlock Problem

 Request the CS:
   3. Pk upon receiving inquire (z)
         a) if it has received a failed message then sends relinquish (k) to all sites in
            its request set
         b) if it has received all reply messages then ignores the inquire message
         c) otherwise, simply waits
   4. Pz, upon receiving relinquish (k),
         a) changes the lock to lock for Pi and sends reply (z) to Pi

 Property
    requires at most 5√N messages per request
    response time under very low load: 2T
                        Request Set Generation

 Assume
     total N nodes
 Let Si denote the request set for Pi, the request sets have to satisfy
     Si ∩ Sj ≠ ∅, for all i, j
     Si, for all i, always contains Pi
 additional desirable properties
     |Si| = |Sj| = K, for all i, j, and for some K
        • i.e., the request sets are of equal size, and each is of size K
     O(Pi) = O(Pj) = D, for all i and j
        • O(Pi) denotes the number of occurrences of Pi in all request sets
        • i.e., each node is involved in D request sets
                       Request Set Generation

 relationship between K and D
     N nodes, each has a request set of size K
     the request sets hold NK entries in total (nodes may repeat)
     since there are only N nodes, each node must appear D times: NK = ND
     ⇒ K = D


 request set size K
     consider the first request set: it has K nodes, and each of them can be in
      (K−1) other request sets
     every other request set must contain at least one of the nodes in the first
      request set
     so there are at most K(K−1) request sets other than the first one
     N = K(K−1)+1 ⇒ K ≈ √N
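
Solving N = K(K−1)+1 for K gives K = (1 + √(4N−3)) / 2. A minimal sketch that recovers K from N when an exact integer solution exists (the function name `request_set_size` is hypothetical):

```python
import math

def request_set_size(n):
    """Return K with K(K-1)+1 == n, or None if no such integer exists."""
    k = (1 + math.isqrt(4 * n - 3)) // 2
    return k if k * (k - 1) + 1 == n else None
```

For instance, N = 31 gives K = 6 and N = 7 gives K = 3, while N = 5 has no exact solution.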
                         Request Set Generation

 assume N = K(K−1) + 1 for some K, where K−1 is a prime number
 consider a matrix of size (K−1) by (K−1)
 it can generate K groups of K−1 nonintersecting sets
       K−1 nonintersecting rows
       K−1 nonintersecting columns
       (K−2) groups of (K−1) nonintersecting diagonals
       different diagonals: jump 1 on each row (the real diagonal), jump 2, ...,
        jump (K−1)−1
 each number (out of the first K numbers) can be combined with each of
  the K−1 nonintersecting sets in one group to produce K−1 sets that
  intersect one another in exactly one element
              Request Set Generation Example -- K=6


 N = 6 * 5 + 1 = 31, K = 6, matrix is 5 by 5

      1 2   3   4   5   6
        7   8   9   a   b
        c   d   e   f   g
        h   i   j   k   l
        m   n   o   p   q
        r   s   t   u   v

 the first K numbers 123456 form one set
 1 combined with each row forms one set per row
 2 combined with each column forms one set per column
 3 combined with each jump-1 diagonal
    • jump-1 diagonals: 7djpv, 8ekqr, 9flms, ....
 4 combined with each jump-2 diagonal
    • jump-2 diagonals: 7elnu, 8fhov, 9gipr, ....
 5 combined with each jump-3 diagonal
    • jump-3 diagonals: 7fiqt, 8gjmu, ....
 6 combined with each jump-4 diagonal
    • jump-4 diagonals: 7gkos, 8clpt, …, bfjnr
 total K(K−1)+1 = 31 sets
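
The construction can be sketched in code for any K where K−1 is prime. This is an illustration only: the function `request_sets` is hypothetical, and nodes are numbered 1..N instead of using the letters a..v (so d = 13, j = 19, p = 25, v = 31).

```python
from itertools import combinations

def request_sets(k):
    """Build K(K-1)+1 sets of size K; assumes K-1 is prime."""
    p = k - 1
    head = list(range(1, k + 1))          # the first K numbers
    # (K-1) x (K-1) matrix holding the remaining numbers K+1 .. K(K-1)+1
    m = [[k + 1 + r * p + c for c in range(p)] for r in range(p)]
    sets = [set(head)]
    # number 1 combined with each row
    sets += [{head[0], *m[r]} for r in range(p)]
    # number 2 combined with each column
    sets += [{head[1], *(m[r][c] for r in range(p))} for c in range(p)]
    # numbers 3..K combined with the jump-1 .. jump-(K-2) diagonals
    for jump in range(1, p):
        for start in range(p):
            diag = {m[r][(start + jump * r) % p] for r in range(p)}
            sets.append({head[jump + 1], *diag})
    return sets

def intersect_in_one(sets):
    """True if every two sets share exactly one element."""
    return all(len(a & b) == 1 for a, b in combinations(sets, 2))
```

For K = 6 this yields 31 sets of size 6, any two of which share exactly one node. The set 37djpv appears as {3, 7, 13, 19, 25, 31}.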
              Request Set Assignment Example -- K=6

 How to assign the 31 sets to the 31 nodes

      1 2   3   4   5   6
        7   8   9   a   b
        c   d   e   f   g
        h   i   j   k   l
        m   n   o   p   q
        r   s   t   u   v

 node 1 gets the first set: 123456
 the request set constructed from each row is assigned to the 2nd node in the set
    • e.g., request set 1789ab is assigned to node 7
 now, all nodes in the first column have their request sets
 node 2 gets the set of 2 and the first column
 the request set constructed from each column is assigned to the 2nd node in the set
    • e.g., node 8 has request set 28dins
    • note that set 27chmr is assigned to node 2, not 7
 now, the first node of each column and each row has its request set
 the jump-X diagonals will be assigned to the rest of the nodes:

            3   4   5   6
            d   e   f   g
            i   j   k   l
            n   o   p   q
            s   t   u   v
              Request Set Assignment Example -- K=6

      1 2   3   4   5   6
        7   8   9   a   b
        c   d   e   f   g
        h   i   j   k   l
        m   n   o   p   q
        r   s   t   u   v

 the request set constructed from each jump-1 diagonal is assigned to the 3rd
  node in the request set
    • e.g., request set 37djpv is assigned to node d
    • but, set 3bciou is assigned to node 3, not node c
 the request set constructed from each jump-2 diagonal is assigned to the 4th
  node in the request set
    • e.g., request set 47elnu is assigned to node l
    • but, set 48fhov is assigned to node 4, not node h
 the request set constructed from each jump-3 diagonal is assigned to the 5th
  node in the request set
    • e.g., request set 57fiqt is assigned to node q
    • but, set 58gjmu is assigned to node 5, not node m
 the request set constructed from each jump-4 diagonal is assigned to the last
  node in the request set
    • e.g., request set 67gkos is assigned to node s
    • but, set 6bfjnr is assigned to node 6, not node r
         Request Sets Generation Algorithm (Cont.)

 if K−1 is a power of a prime number
      it is possible to generate optimal request sets


 if K−1 is not a power of a prime number or N cannot be expressed as
  K(K−1)+1
      find the smallest integer M greater than N that can be expressed as
       K(K−1)+1 for some K, where K−1 is a power of a prime number
      generate the required sets for M processors
      replace numbers N+1..M by 1..M−N
      remove M−N sets

 same thing can be done for site failures
           Request Set Generation Example -- N=5

 find the closest larger number that can be written as K(K−1)+1
 N = 5 ⇒ M = 7 (K = 3)
 derive the sets from M = 7 and remove the duplicated nodes
        1 2 3
          4 5
          1 2 -- replace nodes 6 and 7 by 1 and 2
            – S1 = {1, 2, 3}
            – S4 = {1, 4, 5}
            – S6 = {1, 1, 2} ⇒ remove
            – S2 = {2, 4, 1}
            – S5 = {2, 5, 2} ⇒ {2, 5}
            – S7 = {3, 4, 2} ⇒ remove
            – S3 = {3, 5, 1}
                         Token Ring Algorithm

 a unique token is associated with the CS
 Pi enters CS only if it owns the token

 Request to enter CS:
    1. if Pj owns the token and does not need to enter the CS, then it passes the
        token to P(j+1) mod N
    2. Pi will sooner or later get the token

 Enter the CS:
    1. when Pi owns the token, it enters CS

 Release the CS:
    1. pass the token to the next processor
               Token Ring Algorithm -- Properties

 simple and no deadlock or starvation
 number of messages and response time
      if only one node needs the token, the token will traverse N/2 nodes on
       average
      best case: 0 messages (the node has the token) ⇒ 0 delay
      worst case: N−1 messages (sent sequentially) ⇒ (N−1)T delay
   tolerable overhead with small N
   cannot scale up for large N
   it is difficult to design a fault tolerant algorithm for this scheme
   The concept of token is similar to centralized control, however, the
    central site is moving
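
The best, worst, and average cases follow from counting how many hops separate the token holder from the requester. A minimal sketch, assuming the token only moves forward from Pj to P(j+1) mod N and no other node wants it (the function name `token_hops` is hypothetical):

```python
def token_hops(holder, requester, n):
    """Messages needed to move the token from `holder` to `requester`."""
    return (requester - holder) % n

def average_hops(n):
    """Mean hop count when the requester is equally likely to be any node."""
    return sum(token_hops(0, r, n) for r in range(n)) / n
```

With N = 8, the requester just behind the holder needs 7 hops (N−1), the holder itself needs 0, and the mean over all positions is 3.5, close to N/2.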
            Suzuki-Kasami's Broadcast Algorithm

 data structures:
     vector X: associated with the token
        • X[i]: the timestamp of the last request from Pi that has been served
     vector RTj: associated with node Pj
        • RTj[i]: the timestamp of the most current request from Pi known by Pj
     node j determines whether a node k has an outstanding request by checking
      whether RTj[k] > X[k]
            Suzuki-Kasami's Broadcast Algorithm

 Request the CS:
    Pi increases RTi[i] by 1 and broadcasts request (RTi[i], i) to all processors
    Pj, upon receiving request (T, i),
       a) updates RTj[i] to max (RTj[i], T)
       b) if it has the token, then executes (A)
       c) otherwise, does nothing
 A:
    Go through RT and X
       • The algorithm does not specify a starting point
       • Can start from (j+1) % N to avoid starvation
            – (Pj holds the token, so j is where the last check stopped)
    If RTj[k] > X[k] for some k, then
       • Pk has an outstanding request; send the token to Pk
             Suzuki-Kasami's Broadcast Algorithm

 Enter the CS:
     if Pi has received the token then it enters the CS


 Release the CS:
     Pi upon exiting CS, sets X[i]= RTi[i]
     execute (A)
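
Procedure (A) can be sketched as a scan over RTj and X. This is an illustration under the assumption that the scan starts at (j+1) % N, as suggested earlier. The function name `next_token_holder` is hypothetical.

```python
def next_token_holder(j, rt, x):
    """Procedure (A) as run by the token holder Pj.

    rt: Pj's vector RTj (latest known request number per process)
    x:  the token's vector X (last served request number per process)
    Returns the index of the process to send the token to, or None.
    """
    n = len(rt)
    for step in range(1, n):
        k = (j + step) % n           # start just after j to avoid starvation
        if rt[k] > x[k]:             # Pk has an outstanding request
            return k
    return None                      # no outstanding requests: keep the token
```

Starting the scan after j means a process that was just served is considered last on the next pass.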
    Suzuki-Kasami's Broadcast Algorithm -- Properties

 this algorithm gives better fault tolerance in the sense of handling
  requests
     as long as the request is received by some processors that will possess the
      token, the request will be processed
 however, the problem of missing token is still there
     e.g. the token is held by a dead processor or is sent to a dead processor
 requires N messages per request
     N−1 messages for broadcasting the request
     1 message for sending the token
     if the node that wants to enter the critical section happens to have the token,
      then there is no message needed
 response time
     in general, there is a delay of 2T
     in best case, there is no delay
                Raymond's Tree-Based Algorithm

 the processors are structured as a tree and the token is placed at the root
  node
 the tree restructures when the token moves

 Request the CS (going up the tree):
    1. Pi sends request (i) to its parent and puts the request in its queue if it does not
       hold the token
    2. Pj upon receiving the request
          a) puts the request in its queue
          b) if it has not sent a request to its parent then
                • sends request (j) to its parent
          c) otherwise (a request has already been sent to its parent for another
             child node)
                • does nothing
                 Raymond's Tree-Based Algorithm

 Request the CS (going down the tree):
     3. root site upon receiving the request
           a) puts the request in its queue
           b) executes (DTPR)
     4. Pj, upon receiving the token,
           a) if it was not requesting to enter CS or its request was not on the top of
              its queue then executes (DTPR)


   D. delete the top entry from its requesting queue
   T. send the token to the requesting child
   P. update parent pointer to point to the requesting child
   R. if its request queue is non-empty then send a request to the new
    parent
                Raymond's Tree-Based Algorithm

 Enter the CS:
    1. if Pi has received the token and its request is on the top of its queue then it
        enters the CS


 Release the CS:
    1. Pi upon exiting CS
          a) if its queue is not empty, then executes (DTPR)
      Raymond's Tree-Based Algorithm-- Example

 initial tree (token at node 1, the root):

              1
            /   \
          2       3
         / \     / \
        4   5   6   7

 1. token is at node 1; node 5 makes a request
 2. node 2 receives the request and sends a request to node 1
 3. node 4 also sends a request; node 2 receives it
 4. token is at node 2 now; node 2 becomes the root
 5. node 5 gets the token and enters the CS
 6. node 2 sends a request to node 5
       Raymond's Tree-Based Algorithm-- Example

 7. node 5 sends the token to node 2
 8. node 4 gets the token and enters the CS
 9. node 3 sends a request
 10. the request from node 2 comes to node 4
 11. node 3 gets the token and becomes the root
        Raymond's Tree-Based Algorithm -- Properties

 the node with the token is always the root node
 requires the nodes on the entire path, from requester to root, to be alive
  in order to process a request
     still has the lost token problem
 requires 2 logN messages per request on average
     longest path: 2 logN (when the root is at the leaf of the original tree)
     best case: 0 messages
     worst case: 4 logN messages (2 logN to the root, 2 logN back with token)
 response time
       the message passing has to be done sequentially
       the average response time: T logN
       the best case response time: 0
       the worst case response time: 4T logN
             Performance Comparisons


 algorithm          response time          # messages
 Lamport            2T+E                   3(N−1)
 Ricart-Agrawala    2T+E                   2(N−1)
 Maekawa            2T+E                   3√N to 5√N
 token-ring         [0 to NT]+E            0 to N
 broadcast          [0 or 2T]+E            0 or N
 tree-based         [0 to 4T logN]+E       0 to 4 logN
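
To get a feel for the message counts, the formulas can be evaluated for a concrete N. This is a sketch under stated assumptions: Maekawa's bounds are read as 3√N to 5√N (since K ≈ √N), the logarithm is taken base 2, and ranges are returned as (best, worst) pairs. The function name `messages_per_request` is hypothetical.

```python
import math

def messages_per_request(n):
    """Message counts per CS request for N = n nodes."""
    root = math.sqrt(n)
    return {
        "Lamport": 3 * (n - 1),
        "Ricart-Agrawala": 2 * (n - 1),
        "Maekawa": (3 * root, 5 * root),         # without / with deadlock handling
        "token-ring": (0, n),
        "broadcast": (0, n),
        "tree-based": (0, 4 * math.log2(n)),     # assumes log base 2
    }
```

For N = 16 this gives 45 messages for Lamport, 30 for Ricart-Agrawala, 12 to 20 for Maekawa, and at most 16 for each of the token-based schemes.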


        T: per message transmission time
        E: computation time
        response time: consider low load
        Self-stabilizing Algorithm for Mutual Exclusion

 Definition of self-stabilizing systems:
     the system can be in globally legitimate/illegitimate states
     in each legitimate state, at least one privilege is present
     a node with the privilege can make a move
     from each legitimate state, a legitimate move will bring the system to a
      legitimate state
     for any pair of legitimate states, there exists a sequence of moves
      transferring the system from one into the other
     a system is self-stabilizing iff, regardless of the initial state, the system
      will end up in a legitimate state after a finite number of moves
     Self-stabilizing Algorithm for Mutual Exclusion

 The Main Concept:
    the system, if unstable, should stabilize in a finite number of steps
       • when the system is in legitimate states, it assures mutual exclusion
       • when the system is in illegitimate states, it is possible for two processes
          to enter the critical section at the same time
       • from an illegitimate state, the system will become legitimate in a finite
          number of steps
    the system should be self-stabilizing: each node makes only local moves, and
     the global system state becomes legitimate
         Self-Stabilizing Mutual Exclusion Algorithm

   N nodes in the system, P0, ..., PN−1
   view the nodes as being placed in a ring with a bottom node
   S: the state of a node itself
   L: the state of the left neighbor
   K: the state value of each node is bounded by [0,K)


           top node                           bottom node

             P0        P1        P2     …        PN−1
                           K-State Algorithm

 the top node:
          if (L = S) then "enter CS"; S := (S+1) mod K;


 the other nodes:
          if (L ≠ S) then "enter CS"; S := L;



         top node                               bottom node

            P0         P1         P2      …        PN−1
           Mutual Exclusion Algorithm -- Example

 When the system starts from a legitimate state
    A) 3 3 3 3 2 2 -- node 4 gets the privilege
         3 3 3 3 3 2 -- node 5 gets the privilege
         3 3 3 3 3 3 -- node 0 gets the privilege
         4 3 3 3 3 3 -- node 1 gets the privilege
         4 4 3 3 3 3 -- node 2 gets the privilege
    B) 3 3 3 3 5 5 → 3 3 3 3 3 5 → 3 3 3 3 3 3 → 4 3 3 3 3 3 → 4 4 3 3 3 3
    C) 4 4 4 1 1 1 → 4 4 4 4 1 1 → 4 4 4 4 4 1 → 4 4 4 4 4 4 → 5 4 4 4 4 4


 When the system is at an illegitimate state
    D) 3 3 4 4 2 2 -- nodes 2 and 4 get the privilege, assume both make moves
         3 3 3 4 4 2 -- nodes 3 and 5 get the privilege
         3 3 3 3 4 4 -- back to legitimate state
    E) 3 3 4 4 1 3 → 4 3 3 4 4 1 → 4 4 3 3 4 4 → 5 4 4 3 3 4 → 5 5 4 4 3 3
         → 5 5 5 4 4 3 → 5 5 5 5 4 4 -- back to legitimate state
          Properties with Self-Stabilization Systems

 ideal for decentralized control
 can tolerate transient failures
 if a processor fails and then recovers, there is no need to re-establish its
  state, the self-stabilizing algorithm will automatically restore the system
  to a legitimate state
 cannot tolerate permanent failures
 has a vulnerable period

				