THE PRICE OF STOCHASTIC ANARCHY

Christine Chung University of Pittsburgh

Katrina Ligett Carnegie Mellon University

Kirk Pruhs University of Pittsburgh

Aaron Roth Carnegie Mellon University

Load Balancing on Unrelated Machines

• n players, each with a job to run, each choose one of m machines to run it on.

[Figure: table of the time that each of Job 1, Job 2, and Job 3 needs on Machine 1 and on Machine 2; the values were not preserved.]

• Each player’s goal is to minimize her job’s finish time.
• NOTE: the finish time of a job is equal to the load on the machine where the job is run.
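
The cost model on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the function names and the 3-job, 2-machine numbers are hypothetical, chosen only to show that a job's finish time equals the total load on the machine it picked.

```python
def machine_loads(cost, assignment, m):
    """cost[i][j] is the time job i needs on machine j;
    assignment[i] is the machine chosen by the player owning job i."""
    loads = [0.0] * m
    for job, machine in enumerate(assignment):
        loads[machine] += cost[job][machine]
    return loads


def finish_times(cost, assignment, m):
    """Each job finishes when its machine has processed its whole load."""
    loads = machine_loads(cost, assignment, m)
    return [loads[machine] for machine in assignment]


# Hypothetical instance (illustrative numbers, not from the slides).
cost = [[2.0, 1.0],   # job 0 on machine 0 / machine 1
        [1.0, 3.0],   # job 1
        [4.0, 2.0]]   # job 2
assignment = [1, 0, 1]  # jobs 0 and 2 share machine 1

loads = machine_loads(cost, assignment, 2)
times = finish_times(cost, assignment, 2)
makespan = max(loads)
```

Here jobs 0 and 2 both finish at the load of machine 1, while job 1 runs alone on machine 0.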


Unbounded Price of Anarchy in the Load Balancing Game on Unrelated Machines

• Price of Anarchy (POA) measures the cost of having no central authority.
• Let an optimal assignment under centralized authority be one in which makespan is minimized.
• POA = (makespan at worst Nash)/(makespan at OPT)
• Bad POA instance: 2 players and 2 machines (L and R).

         L    R
job 1    δ    1
job 2    1    δ

• OPT here costs δ.
• Worst Nash costs 1.
• Price of Anarchy: (cost of worst Nash)/(cost at OPT) = 1/δ, which grows without bound as δ → 0.
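
The 1/δ ratio for the bad instance can be checked by brute force. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, δ is fixed at 0.01 for concreteness, and only pure-strategy Nash equilibria are enumerated.

```python
from itertools import product


def finish_time(cost, assignment, job):
    """Load on the machine that `job` chose under `assignment`."""
    machine = assignment[job]
    return sum(cost[j][machine]
               for j, chosen in enumerate(assignment) if chosen == machine)


def makespan(cost, assignment, m):
    loads = [0.0] * m
    for job, machine in enumerate(assignment):
        loads[machine] += cost[job][machine]
    return max(loads)


def is_pure_nash(cost, assignment, m):
    """No player can strictly lower her finish time by switching machines."""
    for job in range(len(assignment)):
        current = finish_time(cost, assignment, job)
        for alt in range(m):
            if alt == assignment[job]:
                continue
            deviated = list(assignment)
            deviated[job] = alt
            if finish_time(cost, deviated, job) < current:
                return False
    return True


def price_of_anarchy(cost, m):
    assignments = list(product(range(m), repeat=len(cost)))
    opt = min(makespan(cost, a, m) for a in assignments)
    worst_nash = max(makespan(cost, a, m)
                     for a in assignments if is_pure_nash(cost, a, m))
    return worst_nash / opt


delta = 0.01                      # assumed value, any 0 < delta < 1 works
cost = [[delta, 1.0],             # job 1: L costs delta, R costs 1
        [1.0, delta]]             # job 2: L costs 1, R costs delta
poa = price_of_anarchy(cost, 2)   # worst Nash has makespan 1, OPT has delta
```

Shrinking `delta` makes the ratio arbitrarily large, which is the sense in which POA is unbounded here.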

Drawbacks of Price of Anarchy

• A solution characterization with no road map.
• If there is more than one Nash, we don’t know which one will be reached.
• Strong assumptions must be made about the players: e.g., fully informed and fully convinced of one another’s “rationality.”
• Nash are sometimes very brittle, making POA results feel overly pessimistic.

Evolutionary Game Theory

• Young (1993) specified a model of adaptive play.

“I dispense with the notion that people fully understand the structure of the games that they play, that they have a coherent model of others’ behavior, that they can make rational calculations of infinite complexity, and that all this is common knowledge. Instead I postulate a world in which people base their decisions on limited data, use simple predictive models, and sometimes do unexplained or even foolish things.”
– P. Young, Individual Strategy and Social Structure, 1998

• Adaptive play allows us to predict which solutions will be chosen in the long run by self-interested decision-making agents with limited info and resources.

Adaptive Play Example

• In each round of play, each player uses some simple, reasonable dynamics to decide which strategy to play. E.g.,
  • imitation dynamics
    • Sample s of the last mem strategies I played.
    • Play the strategy whose average payoff was highest (breaking ties uniformly at random).
  • best response dynamics
    • Sample the other player’s realized strategy in s of the last mem rounds.
    • Assume this sample represents the probability distribution of what the other player will play the next round, and play a strategy that is a best response (minimizes my expected cost).
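
One round of imitation dynamics for a single player might look like the sketch below. The representation is an assumption made for illustration: each remembered round is stored as a hypothetical (strategy, cost) pair, and "highest payoff" is read as "lowest cost".

```python
import random


def imitation_choice(history, s, rng):
    """history: (strategy, cost) pairs for this player's last mem rounds.
    Samples s of them and returns the strategy with the lowest average cost,
    breaking ties uniformly at random."""
    sample = rng.sample(history, s)
    costs_by_strategy = {}
    for strategy, cost in sample:
        costs_by_strategy.setdefault(strategy, []).append(cost)
    averages = {strategy: sum(costs) / len(costs)
                for strategy, costs in costs_by_strategy.items()}
    best = min(averages.values())
    candidates = sorted(st for st, avg in averages.items() if avg == best)
    return rng.choice(candidates)


rng = random.Random(0)
# Hypothetical memory with mem = 4: one cheap round on L, three costly on R.
history = [("L", 0.1), ("R", 1.0), ("R", 1.0), ("R", 1.0)]
choice_all = imitation_choice(history, 4, rng)       # sees every round
choice_only_r = imitation_choice(history[1:], 3, rng)  # only R in memory
```

A player whose memory contains a single strategy can only imitate that strategy, which is why histories in which both players repeat one machine are fixed points of the unperturbed process.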


Adaptive Play Example: a Markov process

• Let mem = 4. (Then there are 2^8 = 256 total states in the state space.)

[Figure: chain of states, each a pair of length-4 histories (player 1 over player 2), running from LLLL/LLLL through LLLR/LLLL, LLLL/LLLR, ..., RRRR/LRRR to RRRR/RRRR, with example transition probabilities 3/4, 1/4, and 1 between states such as LLRR/LLLL and LLRL/LLLL; the exact arrows were not preserved.]

• If s = 3, each player randomly samples three past plays from the memory, and picks the strategy among them that worked best (yielded the highest payoff).
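
The state count can be reproduced directly. In this sketch a state is assumed to be a pair of strings, one per player, with the oldest play on the left and the newest appended on the right; that ordering and the function name are illustrative choices, not taken from the talk.

```python
from itertools import product

mem = 4
# Every length-mem history over the two machines L and R.
histories = ["".join(h) for h in product("LR", repeat=mem)]
# A state records both players' histories: 2^mem * 2^mem = 2^8 states.
states = [(h1, h2) for h1 in histories for h2 in histories]


def next_state(state, play1, play2):
    """Drop each player's oldest play and append the newest one."""
    h1, h2 = state
    return (h1[1:] + play1, h2[1:] + play2)


successor = next_state(("LLLL", "LLLL"), "R", "L")
```

Each state has at most four successors, one for each pair of plays in the current round.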

Absorbing Sets of the Markov Process

• An absorbing set is a set of states that are all reachable from one another, and from which no state outside the set can be reached.
• In our example, we have 4 absorbing sets, each a single state that returns to itself with probability 1:

player 1    RRRR    RRRR    LLLL    LLLL
player 2    RRRR    LLLL    RRRR    LLLL
                    NASH    OPT

• But which state we end up in depends on our initial state. Hence we perturb our Markov process as follows:
  • During each round, each player, with probability ε, does not use imitation dynamics, but instead chooses a machine at random.
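
The perturbation amounts to a thin wrapper around whatever the dynamics would have chosen. This is a minimal sketch; the function and parameter names are hypothetical.

```python
import random


def perturbed_choice(dynamics_choice, machines, epsilon, rng):
    """With probability epsilon ignore the dynamics and pick a machine
    uniformly at random; otherwise play what the dynamics prescribed."""
    if rng.random() < epsilon:
        return rng.choice(machines)
    return dynamics_choice


rng = random.Random(1)
never_noisy = perturbed_choice("L", ["L", "R"], 0.0, rng)   # epsilon = 0
always_noisy = perturbed_choice("L", ["L", "R"], 1.0, rng)  # epsilon = 1
```

With any ε > 0, every history can be followed by any pair of plays, so no state stays trapped forever.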

Stochastic Stability

• The perturbed process has only one big absorbing set (any state is reachable from any other state).
• Hence we have a unique stationary distribution $\mu^{\varepsilon}$ (where $\mu^{\varepsilon} P^{\varepsilon} = \mu^{\varepsilon}$).
• The probability distribution $\mu^{\varepsilon}$ is the time-average asymptotic frequency distribution of $P^{\varepsilon}$.
• A state z is stochastically stable if $\lim_{\varepsilon \to 0} \mu^{\varepsilon}(z) > 0$.

Finding Stochastically Stable States

• Theorem (Young, 1993): The stochastically stable states are those states contained in the absorbing sets of the unperturbed process that have minimum stochastic potential.
• The stochastic potential of an absorbing set is the cost of the minimum spanning tree rooted there.

[Figure: the four absorbing sets (RRRR/RRRR, RRRR/LLLL, LLLL/RRRR, LLLL/LLLL, player 1 over player 2) drawn as a graph with weighted edges, built up over several slides to show the rooted tree cost at each set; the individual weights were not reliably preserved. The set with player 1 at LLLL and player 2 at RRRR is marked “Stochastically Stable!”]
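
For a handful of absorbing sets, the minimum rooted tree cost can be found by brute force. The sketch below is an illustration only: the resistance values (number of mistakes needed to move from one absorbing set to another) are hypothetical, since the weights in the figure were not preserved, and the function names are assumptions.

```python
from itertools import product


def reaches_root(state, parent, root):
    """Follow parent pointers; fail if we loop before hitting the root."""
    seen = set()
    while state != root:
        if state in seen:
            return False
        seen.add(state)
        state = parent[state]
    return True


def stochastic_potential(resistance, root):
    """Minimum total resistance of a spanning tree whose edges all
    lead toward `root` (one outgoing edge per non-root state)."""
    nodes = list(resistance)
    others = [s for s in nodes if s != root]
    best = None
    for parents in product(nodes, repeat=len(others)):
        parent = dict(zip(others, parents))
        if any(child == p for child, p in parent.items()):
            continue
        if not all(reaches_root(s, parent, root) for s in others):
            continue
        cost = sum(resistance[c][p] for c, p in parent.items())
        if best is None or cost < best:
            best = cost
    return best


# Hypothetical resistances between four absorbing sets.
resistance = {
    "LL":   {"OPT": 1, "NASH": 3, "RR": 2},
    "RR":   {"OPT": 1, "NASH": 3, "LL": 2},
    "NASH": {"LL": 1, "RR": 1, "OPT": 2},
    "OPT":  {"LL": 3, "RR": 3, "NASH": 4},
}
potentials = {s: stochastic_potential(resistance, s) for s in resistance}
stable = min(potentials, key=potentials.get)
```

With these made-up weights the tree rooted at OPT is cheapest, so OPT would be the stochastically stable set. Brute force is only practical for very small graphs; it is used here to keep the definition visible.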

Recap: Adaptive Play Model

• Assume the game is played repeatedly by players with limited information and resources.
• Use a decision rule (aka “learning behavior” or “selection dynamics”) to model how each player picks her strategy for each round.
• This yields a Markov process where the states represent fixed-sized histories of game play.
• Add noise (players make “mistakes” with some small positive probability and don’t always behave according to the prescribed dynamics).

Stochastic Stability

• The states in the perturbed Markov process with positive probability in the long run are the stochastically stable states (SSS).
• In our paper, we define the Price of Stochastic Anarchy (PSA) to be

$\mathrm{PSA} = \max_{\mathrm{SSS}} \dfrac{\text{cost of SSS}}{\text{cost at OPT}}$

PSA for Load Balancing

• Recall bad instance: POA = 1/δ (unbounded).
• But the bad Nash in this case is not an SSS. In fact, OPT is the only SSS here. So PSA = 1 in this instance.
• Our main result: Ω(m) ≤ PSA ≤ m∙Fib(n)(mn+1)
  • For the game of load balancing on unrelated machines, while POA is unbounded, PSA is bounded.
  • Specifically, we show PSA ≤ m∙Fib(n)(mn+1), which is m times the (mn+1)th n-step Fibonacci number.
  • We also exhibit instances of the game where PSA > m.

(m is the number of machines, n is the number of jobs/players)
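
To get a feel for the size of the upper bound, the n-step Fibonacci numbers can be computed directly. The indexing used below is an assumption, since the slides do not state base cases: the sequence is taken to be 0 for non-positive indices, 1 at index 1, and afterwards each term is the sum of the previous n terms (so n = 2 gives the usual 1, 1, 2, 3, 5, ...).

```python
def n_step_fib(n, k):
    """k-th n-step Fibonacci number under the assumed convention:
    F(i) = 0 for i <= 0, F(1) = 1, F(i) = F(i-1) + ... + F(i-n)."""
    if k <= 0:
        return 0
    window = [0] * (n - 1) + [1]   # holds F(2-n), ..., F(1)
    for _ in range(k - 1):
        window = window[1:] + [sum(window)]
    return window[-1]


def psa_upper_bound(m, n):
    """m times the (mn+1)-th n-step Fibonacci number."""
    return m * n_step_fib(n, m * n + 1)


fib2 = [n_step_fib(2, k) for k in range(1, 7)]
fib3 = [n_step_fib(3, k) for k in range(1, 6)]
bound_2_2 = psa_upper_bound(2, 2)   # 2 * F(5) with n = 2
```

The bound grows exponentially in mn, far above the Ω(m) lower bound, but it is finite for every m and n.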

Closing Thoughts

• In the game of load balancing on unrelated machines, we found that while POA is unbounded, PSA is bounded.
• Indeed, in the bad POA instances for many games, the worst Nash are not stochastically stable.
• Finding the PSA in these games is an interesting open question that may yield illuminating results.
• PSA allows us to determine the relative stability of equilibria, distinguishing those that are brittle from those that are more robust, giving us a more informative measure of the cost of having no central authority.

Conjecture

• You might notice in this game that if players could coordinate or form a team, they would play OPT.
• [AFM2007] have shown that the strong price of anarchy, rather than being unbounded, is O(m).
• We conjecture that PSA is also O(m), i.e., that a linear price of anarchy can be achieved without player coordination.


