CPSC 668

Shared by: yurtgc548
posted: 1/23/2012

Mutual Exclusion in Shared Memory







Slides provided by

Prof. Jennifer Welch





1

Shared Memory Model



• Processors communicate via a set of

shared variables, instead of passing

messages.

• Each shared variable has a type,

defining a set of operations that can be

performed atomically.







2

Shared Memory Model Example



[Figure: processors p0, p1, p2 perform read and write operations on shared variables X and Y]









3

Shared Memory Model



• Changes to the model from the

message-passing case:

– no inbuf and outbuf state components

– configuration includes a value for each

shared variable

– only event type is a computation step by a

processor

– An execution is admissible if every

processor takes an infinite number of steps

4

Computation Step in Shared

Memory Model

• When processor pi takes a step:

– pi 's state in old configuration specifies

which shared variable is to be accessed

and with which operation

– operation is done: shared variable's value

in the new configuration changes

according to the operation's semantics

– pi 's state in new configuration changes

according to its old state and the result of

the operation
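
The three parts of a step can be made concrete with a small sketch. The Python below is an illustration only and is not taken from the slides: the names `read`, `write`, `choose`, `transition`, and `step` are hypothetical, and the example processor (read X, then write X+1 to Y) is invented for demonstration.

```python
def read(value):
    """Read operation: leaves the variable unchanged and returns its value."""
    return value, value

def write(new_value):
    """Write operation: replaces the variable's value and returns nothing."""
    return lambda _old: (new_value, None)

def choose(state):
    """The processor's old local state picks the variable and the operation."""
    if state[0] == "start":
        return "X", read
    if state[0] == "got":
        return "Y", write(state[1] + 1)
    return "X", read  # a finished processor keeps taking harmless read steps

def transition(state, result):
    """The new local state depends on the old state and the operation's result."""
    if state[0] == "start":
        return ("got", result)
    return ("done",)

def step(shared, local, i):
    """One computation step by processor i; returns the new configuration."""
    var, op = choose(local[i])
    new_value, result = op(shared[var])      # the single atomic access of this step
    new_shared = {**shared, var: new_value}
    new_local = local[:i] + [transition(local[i], result)] + local[i + 1:]
    return new_shared, new_local

shared, local = {"X": 5, "Y": 0}, [("start",), ("start",)]
shared, local = step(shared, local, 0)   # p0 reads X
shared, local = step(shared, local, 0)   # p0 writes X + 1 to Y
```

Only processor p0 took steps here, so p1's local state is unchanged in the final configuration.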



5

Observations on SM Model



• Accesses to the shared variables are

modeled as occurring instantaneously

(atomically) during a computation step,

one access per step

• Definition of admissible execution

implies

– asynchronous

– no failures



6

Mutual Exclusion (Mutex) Problem

• Each processor's code is divided into four

sections:

[Figure: the four sections form a cycle: remainder → entry → critical → exit → remainder]



– entry: synchronize with others to ensure mutually

exclusive access to the …

– critical: use some resource; when done, enter

the…

– exit: clean up; when done, enter the…

– remainder: not interested in using the resource

7

Mutual Exclusion Algorithms



• A mutual exclusion algorithm specifies

code for entry and exit sections to

ensure:

– mutual exclusion: at most one processor

is in its critical section at any time, and

– some kind of "liveness" or "progress"

condition. There are three commonly

considered ones…





8

Mutex Progress Conditions

• no deadlock: if a processor is in its entry

section at some time, then later some

processor is in its critical section

• no lockout: if a processor is in its entry

section at some time, then later the same

processor is in its critical section

• bounded waiting: no lockout + while a

processor is in its entry section, other

processors enter the critical section no more

than a certain number of times.

• These conditions are increasingly strong.
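
To make the bounded waiting condition more tangible, here is a minimal sketch that measures it on a recorded execution. The trace format (pairs of processor id and either "entry" or "cs") and the function name `max_overtakes` are assumptions made for this example. It follows the wording above literally and counts all critical section entries by other processors while a processor waits.

```python
def max_overtakes(trace):
    """Largest number of CS entries by others seen by any waiting processor."""
    waiting = {}   # pid -> CS entries by other processors since pid began waiting
    worst = 0
    for pid, event in trace:
        if event == "entry":          # pid moves into its entry section
            waiting[pid] = 0
        elif event == "cs":           # pid moves into its critical section
            waiting.pop(pid, None)
            for other in waiting:
                waiting[other] += 1
                worst = max(worst, waiting[other])
    return worst

# p1 enters the critical section twice while p0 is still waiting.
unfair = [(0, "entry"), (1, "entry"), (1, "cs"), (1, "entry"), (1, "cs"), (0, "cs")]
# First come, first served: p1 is overtaken only by p0, which arrived earlier.
fifo = [(0, "entry"), (1, "entry"), (0, "cs"), (1, "cs")]
```

A finite trace can only refute a proposed bound, not prove it: an algorithm satisfies k-bounded waiting only if the count stays at most k in every execution.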

9

Mutual Exclusion Algorithms



• The code for the entry and exit sections

is allowed to assume that

– no processor stays in its critical section

forever

– shared variables used in the entry and exit

sections are not accessed during the

critical and remainder sections







10

Complexity Measure for Mutex

• An important complexity measure for

shared memory mutex algorithms is

amount of shared space needed.

• Space complexity is affected by:

– how powerful is the type of the shared

variables

– how strong is the progress property to be

satisfied (no deadlock vs. no lockout vs.

bounded waiting)



11

Test-and-Set Shared Variable



• A test-and-set variable V holds one of two

values, 0 or 1, and supports two

(atomic) operations:

– test&set(V):

temp := V

V := 1

return temp

– reset(V):

V := 0





12

Mutex Algorithm Using Test&Set



• code for entry section:

repeat

t := test&set(V)

until (t = 0)

An alternative construction is:

wait until test&set(V) = 0





• code for exit section:

reset(V)
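
The entry and exit code can be exercised in a small simulation of the model. The sketch below is an assumption-laden illustration rather than the slides' own code: each processor is a Python generator, each `yield` marks the end of one computation step, and a seeded random scheduler (the hypothetical `run` function) picks who steps next. Atomicity of test&set holds here simply because the simulation is single-threaded.

```python
import random

def processor(pid, shared, in_cs, log, rounds):
    """One processor running the test&set algorithm; each yield ends a step."""
    for _ in range(rounds):
        while True:                            # entry section
            yield
            t, shared["V"] = shared["V"], 1    # test&set(V) as one atomic step
            if t == 0:
                break
        in_cs.add(pid)                         # critical section begins
        log.append(pid)
        yield
        in_cs.remove(pid)                      # critical section ends
        shared["V"] = 0                        # exit section: reset(V)

def run(n, rounds, seed):
    """Interleave n processors; return the CS entry log and peak CS occupancy."""
    rng = random.Random(seed)
    shared, in_cs, log = {"V": 0}, set(), []
    procs = {pid: processor(pid, shared, in_cs, log, rounds) for pid in range(n)}
    max_in_cs = 0
    while procs:
        pid = rng.choice(sorted(procs))        # scheduler picks the next step
        try:
            next(procs[pid])
        except StopIteration:
            del procs[pid]                     # this processor finished all rounds
        max_in_cs = max(max_in_cs, len(in_cs))
    return log, max_in_cs

log, max_in_cs = run(n=4, rounds=5, seed=1)
```

The peak occupancy never exceeds one, whatever seed is used, which matches the mutual exclusion claim; the order of entries in `log` depends entirely on the scheduler.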

13

Mutual Exclusion is Ensured



• Suppose not. Consider first violation,

when some pi enters CS but another pj is

already in CS





1. pj enters CS: sees V = 0, sets V to 1.
2. No processor leaves CS in between, so V stays 1.
3. pi enters CS: it would have to see V = 0, which is impossible!



14

No Deadlock

• Claim: V = 0 iff no processor is in CS.

– Proof is by induction on events in

execution, and relies on fact that mutual

exclusion holds.

• Suppose there is a time after which a

processor is in its entry section but no

processor ever enters CS.



– Eventually no processor is in CS (none stays in CS forever, and no processor enters CS).
– From then on V always equals 0, so the next t&s returns 0.
– That processor enters CS, contradiction!

15

What About No Lockout?



• One processor could always grab V

(i.e., win the test&set competition) and

starve the others.

• No Lockout does not hold.

• Thus Bounded Waiting does not hold.
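
A concrete schedule shows the starvation. The sketch below is illustrative (the function name `starvation_schedule` and the two-processor setup are assumptions): p1 is scheduled only while p0 holds V, so every test&set by p1 returns 1. The schedule is still admissible, because both processors keep taking steps.

```python
def starvation_schedule(rounds):
    """Adversarial schedule for p0 and p1 sharing the test&set variable V."""
    V = 0
    entries = {"p0": 0, "p1": 0}
    for _ in range(rounds):
        t, V = V, 1            # p0: test&set(V) sees 0 and enters CS
        if t == 0:
            entries["p0"] += 1
        t, V = V, 1            # p1: test&set(V) sees 1 while p0 is in CS
        if t == 0:
            entries["p1"] += 1
        V = 0                  # p0: exit section, reset(V)
    return entries
```

However many rounds are run, p1 never enters its critical section, while no deadlock still holds because p0 keeps entering.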









16

Read-Modify-Write (rmw) Shared

Variable

• Assume: The state of such a variable can

be of any size.

• Variable V supports the (atomic) operation

– rmw(V,f ), where f is any function

temp := V

V := f(V)

return temp

• This variable type is very “strong”: One

shared variable suffices to achieve “no

lockout”

17

Mutex Algorithm Using RMW

• Conceptually, the list of waiting processors is

stored in a circular queue of length n

• Each waiting processor remembers in its

local state its location in the queue (instead of

keeping this info in the shared variable)

• Shared RMW variable V keeps track of active

part of the queue with first and last pointers,

which are indices into the queue (between 0

and n-1)

– so V has two components, first and last



18

Conceptual Data Structure







The RMW shared object

just contains these two

"pointers"









19

Mutex Algorithm Using RMW



• Code for entry section:

// increment last to enqueue self (all index arithmetic is mod n)

position := rmw(V,(V.first,V.last+1))

// wait until first equals this value

repeat

queue := rmw(V,V)

until (queue.first = position.last)



• Code for exit section:

// dequeue self

rmw(V,(V.first+1,V.last))
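
The queue algorithm can be tried out in a step-by-step simulation. This is a sketch under stated assumptions, not the slides' code: processors are Python generators, each `yield` ends one computation step, the scheduler `run` is a hypothetical seeded random one, and index arithmetic is taken mod n. Each rmw is a read followed by a write inside a single step, which is atomic here because the simulation is single-threaded.

```python
import random

def processor(pid, n, shared, in_cs, log, rounds):
    for _ in range(rounds):
        yield
        first, last = shared["V"]               # position := rmw(V, (first, last+1))
        shared["V"] = (first, (last + 1) % n)
        position_last = last
        log["enqueued"].append(pid)
        while True:
            yield
            first, _ = shared["V"]              # queue := rmw(V, V)
            if first == position_last:
                break
        in_cs.add(pid)                          # head of the queue: enter CS
        log["entered"].append(pid)
        yield
        in_cs.remove(pid)
        first, last = shared["V"]               # exit: rmw(V, (first+1, last))
        shared["V"] = ((first + 1) % n, last)

def run(n, rounds, seed):
    rng = random.Random(seed)
    shared, in_cs = {"V": (0, 0)}, set()
    log = {"enqueued": [], "entered": []}
    procs = {p: processor(p, n, shared, in_cs, log, rounds) for p in range(n)}
    max_in_cs = 0
    while procs:
        pid = rng.choice(sorted(procs))
        try:
            next(procs[pid])
        except StopIteration:
            del procs[pid]
        max_in_cs = max(max_in_cs, len(in_cs))
    return log, max_in_cs

log, max_in_cs = run(n=4, rounds=3, seed=2)
```

In this simulation processors enter the critical section in exactly the order in which they enqueued themselves, which is the FIFO behavior the algorithm relies on.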

20

Correctness Sketch



• Mutual Exclusion:

– Only the processor at the head of the

queue (V.first) can enter the CS, and only

one processor is at the head at any time.

• n-Bounded Waiting:

– FIFO order of enqueueing, and fact that no

processor stays in CS forever, give this

result.





21

Space Complexity



• The shared RMW variable V has two

components in its state, first and last.

• Both are integers that take on values

from 0 to n-1, n different values.

• The total number of different states of V
thus is $n^2$.
• And thus the required size of V in bits is
$2\lceil \log_2 n \rceil$.
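
A quick calculation for concrete n, as a sketch: the function name `rmw_queue_space` is hypothetical, and it assumes each of the two components is stored separately in the smallest whole number of bits.

```python
def rmw_queue_space(n):
    """Return (number of states of V, bits needed) for the RMW queue algorithm."""
    states = n * n                           # n choices for first, n for last
    bits_per_index = (n - 1).bit_length()    # equals ceil(log2 n) for n >= 1
    return states, 2 * bits_per_index
```

For example, with n = 8 processors V has 64 states and fits in 6 bits; with n = 5 it has 25 states and, storing the two indices separately, also takes 6 bits.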

22

Spinning

• A drawback of the RMW queue algorithm is

that processors in entry section repeatedly

access the same shared variable

– called spinning

• Having multiple processors spinning on the

same shared variable can be very time-

inefficient in certain multiprocessor

architectures

• Alter the queue algorithm so that each waiting

processor spins on a different shared variable



23

RMW Mutex Algorithm With

Separate Spinning

• Shared RMW variables:

– Last : corresponds to last "pointer"

from previous algorithm

• cycles through 0 to n-1

• keeps track of index to be given to the

next processor that starts waiting

• initially 0





24

RMW Mutex Algorithm With

Separate Spinning

• Shared RMW variables:

– Flags[0..n-1] : array of binary variables

• these are the variables that processors

spin on

• make sure no two processors spin on the

same variable at the same time

• initially Flags[0] = 1 (proc "has lock") and

Flags[i] = 0 (proc "must wait") for i > 0







25

Mutex Using Separate Spinning

Initially Last = 0; Flags[0] = 1; Flags[i] = 0 for 1 ≤ i < n

• code for entry section:
my-place := rmw(Last, (Last+1) mod n)
wait until (Flags[my-place] = 1)
Flags[my-place] := 0

• code for exit section:
Flags[(my-place + 1) mod n] := 1
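
The algorithm can be simulated one computation step at a time. The sketch below is illustrative and rests on assumptions: processors are Python generators, each `yield` ends one step, and `run` is a hypothetical seeded random scheduler. It records which slot each processor used and checks that at most one flag is ever set.

```python
import random

def processor(pid, n, shared, in_cs, log, rounds):
    for _ in range(rounds):
        yield
        my_place = shared["Last"]                     # rmw(Last, (Last+1) mod n)
        shared["Last"] = (my_place + 1) % n
        while True:
            yield
            if shared["Flags"][my_place] == 1:        # spin on own flag only
                break
        yield
        shared["Flags"][my_place] = 0                 # last step of entry section
        in_cs.add(pid)
        log.append((pid, my_place))
        yield
        in_cs.remove(pid)
        shared["Flags"][(my_place + 1) % n] = 1       # exit: pass lock to next slot

def run(n, rounds, seed):
    rng = random.Random(seed)
    shared = {"Last": 0, "Flags": [1] + [0] * (n - 1)}
    in_cs, log = set(), []
    procs = {p: processor(p, n, shared, in_cs, log, rounds) for p in range(n)}
    max_in_cs = max_flags = 0
    while procs:
        pid = rng.choice(sorted(procs))
        try:
            next(procs[pid])
        except StopIteration:
            del procs[pid]
        max_in_cs = max(max_in_cs, len(in_cs))
        max_flags = max(max_flags, sum(shared["Flags"]))
    return log, max_in_cs, max_flags

log, max_in_cs, max_flags = run(n=4, rounds=3, seed=3)
```

The slots recorded in `log` advance 0, 1, 2, 3, 0, … in order, so the lock is handed around the array in the order the indices were drawn from Last.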

26

Overview of Algorithm

• entry section:

– get next index from Last and store in a local

variable myPlace

– spin on Flags[myPlace] until it equals 1

(means proc "has lock" and can enter CS)

– set Flags[myPlace] to 0 ("doesn't have lock")

• exit section:

– set Flags[(myPlace+1) mod n] to 1 (i.e., give the

priority to the next proc)





27

Question



• Do the shared variables Last and Flags

have to be RMW variables?



• Answer: The RMW semantics

(atomically reading and updating a

variable) are needed for Last, to make

sure two processors don't get the same

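index at overlapping times. The elements of Flags are only ever read or written by the algorithm, so read and write operations are all it needs on them.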



28

Invariants of the Algorithm



1. At most one element of Flags has

value 1 ("has lock")

2. If no element of Flags has value 1,

then some processor is in the CS.

3. If Flags[k] = 1, then exactly

(Last - k) mod n processors are in the

entry section, spinning on Flags[i], for i

= k, (k+1) mod n, …, (Last-1) mod n.
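
For small n, invariants like these can be checked mechanically by exploring every reachable configuration. The sketch below is an assumption-heavy illustration: the program-counter labels ("rem", "wait", "clear", "cs") and the function names are invented, a processor counts as being in the CS from the step that clears its flag until its exit step, and only invariants 1 and 2 plus mutual exclusion are checked (invariant 3 is left out).

```python
from collections import deque

def successors(state, n):
    """All configurations reachable by one step of one processor."""
    last, flags, procs = state
    for i, (pc, place) in enumerate(procs):
        new_last, new_flags = last, list(flags)
        if pc == "rem":            # entry: my-place := rmw(Last, (Last+1) mod n)
            new_pc, new_place = "wait", last
            new_last = (last + 1) % n
        elif pc == "wait":         # entry: spin until Flags[my-place] = 1
            new_pc = "clear" if flags[place] == 1 else "wait"
            new_place = place
        elif pc == "clear":        # entry: Flags[my-place] := 0, then in CS
            new_flags[place] = 0
            new_pc, new_place = "cs", place
        else:                      # exit: Flags[(my-place+1) mod n] := 1
            new_flags[(place + 1) % n] = 1
            new_pc, new_place = "rem", -1
        new_procs = procs[:i] + ((new_pc, new_place),) + procs[i + 1:]
        yield (new_last, tuple(new_flags), new_procs)

def invariants_hold(state):
    _, flags, procs = state
    in_cs = sum(1 for pc, _ in procs if pc == "cs")
    inv1 = sum(flags) <= 1                    # at most one flag has value 1
    inv2 = sum(flags) >= 1 or in_cs >= 1      # no flag set => someone is in CS
    return inv1 and inv2 and in_cs <= 1       # plus mutual exclusion

def explore(n):
    """Breadth-first search over all reachable configurations."""
    start = (0, (1,) + (0,) * (n - 1), (("rem", -1),) * n)
    seen, frontier = {start}, deque([start])
    while frontier:
        state = frontier.popleft()
        if not invariants_hold(state):
            return False, len(seen)
        for nxt in successors(state, n):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return True, len(seen)

ok, num_states = explore(3)
```

Exhaustive search of this kind only covers the chosen n; it complements an inductive proof and does not replace it.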

29

Typo in textbook: replace (k-Last-1) on page 69 first paragraph by (Last-k)

Correctness



• Those three invariants can be used to

prove:

– Mutual exclusion is satisfied

– n-Bounded Waiting is satisfied.









30

A lower bound on number of

shared memory states









31

Lower Bound on Number of

Memory States

Theorem (4.4): Any mutex algorithm with

k-bounded waiting (and no-deadlock)

uses at least n states of shared

memory.

Proof: Assume in contradiction there is

an algorithm using fewer than n states of

shared memory.





32

Lower Bound on Number of

Memory States

• Consider this execution of the algorithm:

C --(p0 p0 p0 …)--> C0 --(p1)--> C1 --(p2)--> C2 --> … --(pn-1)--> Cn-1
– C0: p0 in CS (reached by No-deadlock)
– C1: p1 in entry section
– C2: p2 in entry section
– Cn-1: pn-1 in entry section





• C0, …, Cn-1 are n configurations, but there are fewer than n shared memory states, so there exist i < j such that Ci and Cj
have the same state of shared memory.
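
The counting step is the pigeonhole principle, which can be checked by brute force for small n. In this sketch the function name `find_repeat` and the use of small integers or strings to stand for shared memory states are assumptions made for illustration.

```python
from itertools import product

def find_repeat(memory_states):
    """Return the first pair (i, j), i < j, with equal states, or None."""
    first_seen = {}
    for j, state in enumerate(memory_states):
        if state in first_seen:
            return first_seen[state], j
        first_seen[state] = j
    return None

# Every sequence of n = 4 configurations over only n - 1 = 3 memory states
# contains a repeat.
n = 4
always_repeats = all(
    find_repeat(seq) is not None for seq in product(range(n - 1), repeat=n)
)
```

With n distinct states available the guarantee disappears: a sequence such as 0, 1, 2, 3 has no repeat, which is why the argument needs fewer than n states.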



33

Lower Bound on Number of

Memory States

Same shared memory state in Ci and Cj; Cj is reached from Ci by steps of pi+1, pi+2, …, pj.
– In Ci: p0 in CS, p1-pi in entry, rest in remainder
– In Cj: p0 in CS, p1-pj in entry, rest in remainder
– ρ = sched. in which p0-pi take steps alternately
– Starting from Ci: by ND, some ph has entered CS k+1 times
– Starting from Cj: ph enters CS k+1 times while pi+1 is in entry, contradicting k-bounded waiting

34

ND = no deadlock, CS = critical section

Lower Bound on Number of

Memory States

• But why does ph do the same thing when executing the sequence of steps in ρ when starting from Cj as when starting from Ci?
• All the processes p0,…,pi do the same thing because:
– they are in same states in the two configs
– shared memory state is same in the two configs
– only differences between Ci and Cj are (potentially) the states of pi+1,…,pj and they don't take any steps in ρ



35

Discussion of Lower Bound

• The lower bound of n just shown on number of

memory states only holds for algorithms that

must provide bounded waiting in every

execution.

• Suppose we weaken the liveness condition to

just no-lockout in every execution: then

$\sqrt{2n} + \frac{1}{2}$ distinct shared memory

states is a lower bound

• And if liveness is weakened to just

no-deadlock in every execution, then the bound

is just 2 (see algo. using test&set: slide 13)

36

"Beating" the Lower Bound with

Randomization

• An alternative way to weaken the requirement

is to give up on requiring liveness in every

execution

• Consider Probabilistic No-Lockout: every

processor has non-zero probability of

succeeding each time it is in its entry section.

• Now there is an algorithm using O(1) states of

shared memory.



Recommended reading: Section 14.2

37


