THE PRICE OF STOCHASTIC ANARCHY
Christine Chung University of Pittsburgh
Katrina Ligett Carnegie Mellon University
Kirk Pruhs University of Pittsburgh
Aaron Roth Carnegie Mellon University
Load Balancing on Unrelated Machines
Each of n players has a job to run and chooses one of m
machines to run it on.
[Figure: Machine 1 and Machine 2, with a table of the time needed by Job 1, Job 2, and Job 3 on each machine.]
Each player’s goal is to minimize her job’s finish time.
NOTE: finish time of a job is equal to load on the
machine where the job is run.
Unbounded Price of Anarchy in the Load
Balancing Game on Unrelated Machines
Price of Anarchy (POA) measures the cost of having no central
authority.
Let an optimal assignment under centralized authority be one
in which makespan is minimized.
POA = (makespan at worst Nash)/(makespan at OPT)
Bad POA instance: 2 players and 2 machines (L and R).
        L    R
job 1   δ    1
job 2   1    δ
OPT here costs δ.
Worst Nash costs 1.
Price of Anarchy: (cost of worst Nash)/(cost at OPT) = 1/δ
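As a sanity check on the bad instance above, here is a minimal sketch that enumerates every pure assignment, keeps the pure Nash equilibria, and divides the worst Nash makespan by the optimal makespan. The function names and the value δ = 0.01 are illustrative assumptions, not taken from the talk, and only pure strategies are considered.

```python
from itertools import product

def loads(cost, assignment):
    """Load on each machine: total cost of the jobs assigned to it."""
    load = [0.0] * len(cost[0])
    for job, machine in enumerate(assignment):
        load[machine] += cost[job][machine]
    return load

def is_nash(cost, assignment):
    """True if no job can lower its finish time by moving alone."""
    load = loads(cost, assignment)
    for job, cur in enumerate(assignment):
        for alt in range(len(cost[0])):
            if alt != cur and load[alt] + cost[job][alt] < load[cur]:
                return False
    return True

def price_of_anarchy(cost):
    """(makespan at worst pure Nash) / (makespan at OPT)."""
    n, m = len(cost), len(cost[0])
    assignments = list(product(range(m), repeat=n))
    makespan = lambda a: max(loads(cost, a))
    opt = min(makespan(a) for a in assignments)
    worst_nash = max(makespan(a) for a in assignments if is_nash(cost, a))
    return worst_nash / opt

delta = 0.01                            # assumed value of δ
cost = [[delta, 1.0], [1.0, delta]]     # cost[job][machine], machines (L, R)
poa = price_of_anarchy(cost)            # equals 1/δ for this instance
```

Shrinking `delta` makes the ratio grow without bound, which is the sense in which POA is unbounded here.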
Drawbacks of Price of Anarchy
A solution characterization with no road map.
If there is more than one Nash, don’t know which
one will be reached.
Strong assumptions must be made about the
players: e.g., fully informed and fully convinced of
one another’s “rationality.”
Nash are sometimes very brittle, making POA
results feel overly pessimistic.
Evolutionary Game Theory
Young (1993) specified a model of
adaptive play.
“I dispense with the notion that people fully understand the
structure of the games they play, that they have a
coherent model of others’ behavior, that they can make
rational calculations of infinite complexity, and that all
this is common knowledge. Instead I postulate a
world in which people base their decisions on limited
data, use simple predictive models, and sometimes do
unexplained or even foolish things.”
– P. Young, Individual Strategy and Social Structure, 1998
Adaptive play allows us to predict which
solutions will be chosen in the long run by
self-interested decision-making agents
with limited info and resources.
Adaptive Play Example
In each round of play, each player uses some simple,
reasonable dynamics to decide which strategy to play. E.g.,
imitation dynamics
Sample s of the last mem strategies I played
Play the strategy whose average payoff was highest
(breaking ties uniformly at random)
best response dynamics
Sample the other player’s realized strategy in s of the last mem
rounds.
Assume this sample represents the probability distribution of what
the other player will play the next round, and play a strategy
that is a best response (minimizes my expected cost).
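The two decision rules above can be sketched as follows for the two-job instance. This is an illustration under stated assumptions: the names `finish_time`, `imitation`, and `best_response`, the string encoding of histories, and δ = 0.01 are hypothetical, and "highest payoff" is read as "lowest finish time".

```python
import random

DELTA = 0.01  # assumed value of δ
# COST[player][machine]: player 0 is job 1, player 1 is job 2
COST = [{"L": DELTA, "R": 1.0}, {"L": 1.0, "R": DELTA}]

def finish_time(me, mine, theirs):
    """Load on my machine: my job, plus the other job if it shares the machine."""
    load = COST[me][mine]
    if mine == theirs:
        load += COST[1 - me][theirs]
    return load

def imitation(me, my_hist, their_hist, s, rng):
    """Sample s remembered rounds; replay my strategy with the lowest average cost."""
    rounds = rng.sample(range(len(my_hist)), s)
    costs = {}
    for t in rounds:
        costs.setdefault(my_hist[t], []).append(
            finish_time(me, my_hist[t], their_hist[t]))
    avg = {a: sum(c) / len(c) for a, c in costs.items()}
    best = min(avg.values())
    return rng.choice([a for a in avg if avg[a] == best])  # uniform tie-break

def best_response(me, their_hist, s, rng):
    """Treat s sampled plays of the other player as a forecast; minimize expected cost."""
    sample = rng.sample(list(their_hist), s)
    expected = {a: sum(finish_time(me, a, t) for t in sample) / s for a in "LR"}
    best = min(expected.values())
    return rng.choice([a for a in "LR" if expected[a] == best])

rng = random.Random(0)
reply = best_response(0, "LLLL", 3, rng)  # job 1 replies to an opponent who always played L
```

Note that imitation can only return a strategy the player has already tried, whereas best response can pick a machine the player has never used.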
Adaptive Play Example: a Markov process
Let mem = 4.
(Then there are 2^8 = 256 total states in the state space.)
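player 1   LLLL   LLLR   LLLL   ...   RRRR   RRRR
player 2   LLLL   LLLL   LLLR   ...   LRRR   RRRR
[Figure: transition diagram. (LLLL, LLLL) and (RRRR, RRRR) loop to themselves with probability 1; (LLLR, LLLL) moves to (LLRR, LLLL) with probability 3/4 and to (LLRL, LLLL) with probability 1/4.]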
If s = 3, each player randomly samples three past
plays from the memory, and picks the strategy among
them that worked best (yielded the highest payoff).
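To make the Markov process concrete, the sketch below enumerates the state space for mem = 4 and computes the exact one-step transition distribution under imitation dynamics with s = 3. It assumes each player samples a uniformly random set of s distinct remembered rounds and that ties are broken uniformly; the helper names and δ = 1/100 are illustrative, not from the talk.

```python
from fractions import Fraction
from itertools import combinations, product

DELTA = Fraction(1, 100)  # assumed value of δ
COST = [{"L": DELTA, "R": Fraction(1)}, {"L": Fraction(1), "R": DELTA}]

def finish_time(me, mine, theirs):
    """Load on my machine: my job, plus the other job if it shares the machine."""
    load = COST[me][mine]
    if mine == theirs:
        load += COST[1 - me][theirs]
    return load

def choice_dist(me, my_hist, their_hist, s):
    """Exact distribution of my next strategy under imitation dynamics."""
    dist = {"L": Fraction(0), "R": Fraction(0)}
    samples = list(combinations(range(len(my_hist)), s))
    for rounds in samples:
        costs = {}
        for t in rounds:
            costs.setdefault(my_hist[t], []).append(
                finish_time(me, my_hist[t], their_hist[t]))
        avg = {a: sum(c) / len(c) for a, c in costs.items()}
        best = min(avg.values())
        winners = [a for a in avg if avg[a] == best]
        for a in winners:  # split this sample's mass evenly over tied strategies
            dist[a] += Fraction(1, len(samples) * len(winners))
    return dist

def step(state, s=3):
    """Map each successor state to its probability (unperturbed process)."""
    h1, h2 = state
    d1, d2 = choice_dist(0, h1, h2, s), choice_dist(1, h2, h1, s)
    out = {}
    for a1, a2 in product("LR", repeat=2):
        p = d1[a1] * d2[a2]
        if p:
            out[(h1[1:] + a1, h2[1:] + a2)] = p  # drop oldest round, append newest
    return out

histories = ["".join(h) for h in product("LR", repeat=4)]
states = [(h1, h2) for h1 in histories for h2 in histories]  # 16 * 16 states
successors = step(("LLLR", "LLLL"))
```

Under these assumptions, `successors` assigns 3/4 to (LLRR, LLLL) and 1/4 to (LLRL, LLLL): player 1 switches to R whenever the single round in which she played R is in her sample.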
Absorbing Sets of the Markov Process
An absorbing set is a set of states that are all
reachable from one another, but cannot reach any
states outside of the set.
In our example, we have 4 absorbing sets:
player 1   RRRR   RRRR   LLLL   LLLL
player 2   RRRR   LLLL   RRRR   LLLL
                  NASH   OPT
(each of these states returns to itself with probability 1)
But which state we end up in depends on our initial
state. Hence we perturb our Markov process as
follows:
During each round, each player, with probability ε, does
not use imitation dynamics, but instead chooses a machine
at random.
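A rough simulation of the perturbed process is sketched below: with probability ε a player picks a machine uniformly at random, otherwise she follows imitation dynamics. The function names, the starting state, the seed, and δ = 0.01 are assumptions made for illustration, and the visit counts from a finite run only approximate long-run frequencies.

```python
import random
from collections import Counter

DELTA = 0.01  # assumed value of δ
COST = [{"L": DELTA, "R": 1.0}, {"L": 1.0, "R": DELTA}]

def finish_time(me, mine, theirs):
    """Load on my machine: my job, plus the other job if it shares the machine."""
    load = COST[me][mine]
    if mine == theirs:
        load += COST[1 - me][theirs]
    return load

def imitate(me, my_hist, their_hist, s, rng):
    """Imitation dynamics: replay my sampled strategy with the lowest average cost."""
    rounds = rng.sample(range(len(my_hist)), s)
    costs = {}
    for t in rounds:
        costs.setdefault(my_hist[t], []).append(
            finish_time(me, my_hist[t], their_hist[t]))
    avg = {a: sum(c) / len(c) for a, c in costs.items()}
    best = min(avg.values())
    return rng.choice([a for a in avg if avg[a] == best])

def simulate(rounds, eps, start=("RRRR", "LLLL"), s=3, seed=0):
    """Run the perturbed process and count how often each state is visited."""
    rng = random.Random(seed)
    h1, h2 = start
    visits = Counter()
    for _ in range(rounds):
        # With probability eps a player "makes a mistake" and picks a machine at random.
        a1 = rng.choice("LR") if rng.random() < eps else imitate(0, h1, h2, s, rng)
        a2 = rng.choice("LR") if rng.random() < eps else imitate(1, h2, h1, s, rng)
        h1, h2 = h1[1:] + a1, h2[1:] + a2
        visits[(h1, h2)] += 1
    return visits

stuck = simulate(50, eps=0.0)    # no noise: the process never leaves the bad Nash
noisy = simulate(2000, eps=0.1)  # with noise: other states become reachable
```

With `eps=0.0` the run started at the bad Nash stays there forever, which is the dependence on the initial state that the perturbation is meant to remove.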
Stochastic Stability
The perturbed process has only one big absorbing set
(any state is reachable from any other state).
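Hence we have a unique stationary distribution $\mu^\varepsilon$
(where $\mu^\varepsilon P^\varepsilon = \mu^\varepsilon$).
The probability distribution $\mu^\varepsilon$ is the time-average
asymptotic frequency distribution of $P^\varepsilon$.
A state z is stochastically stable if $\lim_{\varepsilon \to 0} \mu^\varepsilon(z) > 0$.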
Finding Stochastically Stable States
Theorem (Young, 1993): The stochastically stable states are
those states contained in the absorbing sets of the unperturbed
process that have minimum stochastic potential.
The stochastic potential of a state is the cost of the min spanning tree rooted there.
[Figure: graph on the four absorbing states with weighted edges. The state with player 1 at LLLL and player 2 at RRRR (OPT) is labeled "Stochastically Stable!".]
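The tree computation behind the theorem can be sketched by brute force on a small graph. In the code below, `resist[u][v]` is the cost of the edge from absorbing state u to absorbing state v, and the stochastic potential of a root is the cheapest set of edges that leads every other state to it. The three states `a`, `b`, `c` and their edge costs are made-up numbers chosen only to exercise the code; they are not the values from the figure.

```python
from itertools import product

def reaches_root(v, parent, root):
    """Follow parent pointers from v; True if we arrive at root without cycling."""
    seen = set()
    while v != root:
        if v in seen:
            return False
        seen.add(v)
        v = parent[v]
    return True

def stochastic_potential(resist, root):
    """Minimum total cost of a spanning tree whose edges all lead toward root."""
    others = [v for v in resist if v != root]
    options = [[u for u in resist if u != v] for v in others]
    best = None
    for parents in product(*options):  # each non-root state picks one outgoing edge
        parent = dict(zip(others, parents))
        if all(reaches_root(v, parent, root) for v in others):
            cost = sum(resist[v][parent[v]] for v in others)
            best = cost if best is None else min(best, cost)
    return best

# Hypothetical edge costs between three absorbing states (illustration only).
resist = {
    "a": {"b": 1, "c": 4},
    "b": {"a": 3, "c": 1},
    "c": {"a": 2, "b": 5},
}
potentials = {v: stochastic_potential(resist, v) for v in resist}
stable = [v for v in resist if potentials[v] == min(potentials.values())]
```

For these made-up costs the potentials are 3, 3, and 2, so `c` is the only state with minimum stochastic potential. Exhaustive search is fine for a handful of absorbing states; larger graphs would call for a minimum-arborescence algorithm instead.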
Recap: Adaptive Play Model
Assume the game is played repeatedly by players with
limited information and resources.
Use a decision rule (aka “learning behavior” or “selection
dynamics”) to model how each player picks her strategy
for each round.
This yields a Markov Process where the states represent
fixed-sized histories of game play.
Add noise (players make “mistakes” with some small
positive probability and don’t always behave according
to the prescribed dynamics)
Stochastic Stability
The states of the perturbed Markov process that keep
positive probability in the long run, as the noise ε goes to 0,
are the stochastically stable states (SSS).
In our paper, we define the Price of Stochastic
Anarchy (PSA) to be
$\mathrm{PSA} = \max_{\text{SSS}} \frac{\text{cost of SSS}}{\text{cost at OPT}}$
PSA for Load Balancing
Recall bad instance: POA = 1/δ (unbounded)
But the bad Nash in this case is not a SSS. In fact,
OPT is the only SSS here. So PSA = 1 in this instance.
Our main result: $\Omega(m) \le \mathrm{PSA} \le m \cdot \mathrm{Fib}^{(n)}(mn+1)$
For the game of load balancing on unrelated machines,
while POA is unbounded, PSA is bounded.
Specifically, we show $\mathrm{PSA} \le m \cdot \mathrm{Fib}^{(n)}(mn+1)$, which is m
times the (mn+1)th n-step Fibonacci number.
We also exhibit instances of the game where PSA > m.
(m is the number of machines, n is the number of jobs/players)
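To get a feel for the size of the upper bound, the sketch below evaluates m times the (mn+1)th n-step Fibonacci number. It assumes the common indexing in which the sequence is 0 for non-positive indices, the first term is 1, and every later term is the sum of the previous n terms; the talk does not state its convention, so the exact values are illustrative.

```python
def n_step_fib(n, k):
    """k-th n-step Fibonacci number: F(k) = 0 for k <= 0, F(1) = 1,
    and each later term is the sum of the previous n terms (assumed indexing)."""
    if k <= 0:
        return 0
    window = [0] * (n - 1) + [1]  # the n terms ending at F(1)
    for _ in range(k - 1):
        window = window[1:] + [sum(window)]
    return window[-1]

def psa_upper_bound(m, n):
    """m times the (mn+1)th n-step Fibonacci number."""
    return m * n_step_fib(n, m * n + 1)

bound_2x2 = psa_upper_bound(2, 2)  # 2 machines, 2 jobs: 2 * F(5) for ordinary Fibonacci
```

Under this convention the bound for 2 machines and 2 jobs is 2 · 5 = 10, a finite constant where POA is unbounded.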
Closing Thoughts
In the game of load balancing on unrelated machines, we
found that while POA is unbounded, PSA is bounded.
Indeed, in the bad POA instances for many games, the worst
Nash are not stochastically stable.
Finding the PSA in these games is an interesting open question that
may yield very illuminating results.
PSA allows us to determine relative stability of equilibria,
distinguishing those that are brittle from those that are more
robust, giving us a more informative measure of the cost of
having no central authority.
Conjecture
You might notice in this game that if players
could coordinate or form a team, they would
play OPT.
[AFM2007] have shown that the strong price of anarchy,
rather than being unbounded, is O(m).
We conjecture that PSA is also O(m), i.e., that
a linear price of anarchy can be achieved
without player coordination.