# The Price of Stochastic Anarchy

Christine Chung, University of Pittsburgh
Katrina Ligett, Carnegie Mellon University
Kirk Pruhs, University of Pittsburgh
Aaron Roth, Carnegie Mellon University

## The Load Balancing Game on Unrelated Machines

- There are n players, each with a job to run; each chooses one of m machines to run it on.
- [Figure: an example instance listing the time each of Job 1, Job 2, and Job 3 needs on Machine 1 and Machine 2, with the jobs being placed on the machines as they choose.]
- Each player's goal is to minimize her job's finish time.
- NOTE: the finish time of a job equals the load on the machine where the job is run (a small cost-model sketch follows below).
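
To make the cost model concrete, here is a minimal Python sketch of machine loads, finish times, and makespan. The 3-job, 2-machine times in it are illustrative only, since the slide's figure does not give numeric values.

```python
# Minimal sketch of the game's cost model: each job's finish time equals
# the total load of the machine it chose. (Illustrative values only.)

def machine_loads(times, assignment, m):
    """times[j][i] = time job j needs on machine i;
    assignment[j] = machine chosen by job j."""
    loads = [0.0] * m
    for j, i in enumerate(assignment):
        loads[i] += times[j][i]
    return loads

def finish_times(times, assignment, m):
    loads = machine_loads(times, assignment, m)
    return [loads[i] for i in assignment]   # finish time = load of own machine

def makespan(times, assignment, m):
    return max(machine_loads(times, assignment, m))

times = [[2.0, 4.0],    # Job 1 on Machine 1 / Machine 2
         [3.0, 1.0],    # Job 2
         [5.0, 5.0]]    # Job 3
assignment = [0, 1, 0]  # Jobs 1 and 3 choose Machine 1, Job 2 chooses Machine 2
print(finish_times(times, assignment, 2), makespan(times, assignment, 2))
# -> [7.0, 1.0, 7.0] 7.0
```
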
## Unbounded Price of Anarchy in the Load Balancing Game on Unrelated Machines

- The Price of Anarchy (POA) measures the cost of having no central authority.
- Let an optimal assignment under centralized authority be one in which the makespan is minimized.
- POA = (makespan at worst Nash) / (makespan at OPT).
- Bad POA instance: 2 players and 2 machines (L and R), with the following times:

  |       | L | R |
  |-------|---|---|
  | job 1 | δ | 1 |
  | job 2 | 1 | δ |

- OPT here costs δ (job 1 on L, job 2 on R).
- The worst Nash costs 1 (job 1 on R, job 2 on L; neither job can lower its finish time by switching machines).
- Price of Anarchy = (cost of worst Nash) / (cost at OPT) = 1/δ, which grows without bound as δ → 0. A sketch that verifies this by enumeration follows below.
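
As a sanity check on this instance, here is a small Python sketch that enumerates all pure assignments, identifies the Nash equilibria, and computes the POA; δ = 0.01 is just an illustrative value.

```python
# Enumerate pure-strategy profiles of the 2-job / 2-machine bad instance,
# check which are Nash equilibria, and compute the POA.
from itertools import product

delta = 0.01
times = [[delta, 1.0],   # job 1: delta on L (machine 0), 1 on R (machine 1)
         [1.0, delta]]   # job 2: 1 on L, delta on R

def loads(assignment):
    l = [0.0, 0.0]
    for j, i in enumerate(assignment):
        l[i] += times[j][i]
    return l

def cost(j, assignment):
    return loads(assignment)[assignment[j]]   # finish time of job j

def is_nash(assignment):
    for j in range(2):
        for dev in range(2):
            alt = list(assignment)
            alt[j] = dev
            if cost(j, alt) < cost(j, assignment):
                return False
    return True

profiles = list(product(range(2), repeat=2))
nash = [p for p in profiles if is_nash(p)]
opt_makespan = max_nash = None
opt_makespan = min(max(loads(p)) for p in profiles)
worst_nash = max(max(loads(p)) for p in nash)
print(nash)                        # both (0, 1) = OPT and (1, 0) are Nash
print(worst_nash / opt_makespan)   # = 1 / delta
```
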
## Drawbacks of Price of Anarchy

- A solution characterization with no road map: if there is more than one Nash, we don't know which one will be reached.
- Nash equilibrium rests on strong assumptions about the players: e.g., fully informed and fully convinced of one another's "rationality."
- Nash equilibria are sometimes very brittle, making POA results feel overly pessimistic.

## Evolutionary Game Theory

- Young (1993) specified a model of adaptive play.
- Adaptive play allows us to predict which solutions will be chosen in the long run by self-interested decision-making agents with limited info and resources.

> "I dispense with the notion that people fully understand the structure of the games that they play, that they have a coherent model of others' behavior, that they can make rational calculations of infinite complexity, and that all this is common knowledge. Instead I postulate a world in which people base their decisions on limited data, use simple predictive models, and sometimes do unexplained or even foolish things."
>
> – P. Young, Individual Strategy and Social Structure, 1998

## Adaptive Play Example

(Running example: the 2-job, 2-machine δ-instance above, with machines L and R.)

- In each round of play, each player uses some simple, reasonable dynamics to decide which strategy to play. E.g. (both rules are sketched in code after this list):
- Imitation dynamics:
  - Sample s of the last mem strategies I played.
  - Play the strategy among them whose average payoff was highest (breaking ties uniformly at random).
- Best response dynamics:
  - Sample the other player's realized strategy in s of the last mem rounds.
  - Assume this sample represents the probability distribution of what the other player will play in the next round, and play a strategy that is a best response (one that minimizes my expected cost).
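
The two decision rules might look roughly like the following sketch for one player of a two-player game; the history format (my strategy, my cost, opponent's strategy), the function names, and the example values are illustrative, not from the paper.

```python
# Sketch of the two decision rules described above, for one player.
# History entries are (my_strategy, my_cost, their_strategy).
import random
from collections import defaultdict

def imitation_choice(history, s):
    """Imitation dynamics: sample s of my last plays and replay the one
    with the best (lowest) average cost, breaking ties uniformly at random."""
    sample = random.sample(history, s)
    totals, counts = defaultdict(float), defaultdict(int)
    for my_strat, my_cost, _ in sample:
        totals[my_strat] += my_cost
        counts[my_strat] += 1
    best = min(totals[a] / counts[a] for a in counts)
    return random.choice([a for a in counts if totals[a] / counts[a] == best])

def best_response_choice(history, s, strategies, my_cost_fn):
    """Best response dynamics: sample s of the opponent's last realized
    plays, treat the sample as a forecast of her next move, and pick a
    strategy minimizing my expected cost against that forecast."""
    sample = random.sample(history, s)
    forecast = [their_strat for _, _, their_strat in sample]
    return min(strategies, key=lambda a: sum(my_cost_fn(a, b) for b in forecast))

# Example call on the delta-instance (delta = 0.1), from job 1's point of view:
# history = [("L", 0.1, "R"), ("R", 1.1, "R"), ("L", 1.1, "L"), ("R", 1.0, "L")]
# imitation_choice(history, s=3)
# best_response_choice(history, s=3, strategies=["L", "R"],
#                      my_cost_fn=lambda a, b: {("L", "L"): 1.1, ("L", "R"): 0.1,
#                                               ("R", "L"): 1.0, ("R", "R"): 1.1}[(a, b)])
```
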
## Adaptive Play Example: a Markov Process

- Let mem = 4. (Then there are 2^8 = 256 total states in the state space: one per pair of length-4 histories over {L, R}.)
- [Figure: a few states of the chain, such as (player 1: LLLL, player 2: LLLL), (player 1: LLLR, player 2: LLLL), ..., (player 1: RRRR, player 2: RRRR), with transition probabilities like 3/4 and 1/4 marked on the arrows between them.]
- If s = 3, each player randomly samples three past plays from her memory and picks the strategy among them that worked best (yielded the highest payoff). A simulation sketch of this chain follows below.
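
Here is a minimal simulation sketch of this chain on the δ-instance with mem = 4 and s = 3 under imitation dynamics; the data structures and parameter values are illustrative, not the paper's implementation.

```python
# Unperturbed adaptive-play chain: each player remembers her last mem = 4
# (strategy, cost) pairs, samples s = 3 of them, and imitates the sampled
# strategy with the best average cost.
import random

delta = 0.1
times = [[delta, 1.0],   # job 1: delta on machine 0 (L), 1 on machine 1 (R)
         [1.0, delta]]   # job 2: 1 on machine 0 (L), delta on machine 1 (R)

def costs(a1, a2):
    """Finish times of the two jobs given their machine choices."""
    loads = [0.0, 0.0]
    loads[a1] += times[0][a1]
    loads[a2] += times[1][a2]
    return loads[a1], loads[a2]

def imitate(memory, s):
    sample = random.sample(memory, s)
    avg = {}
    for strat, cost in sample:
        avg.setdefault(strat, []).append(cost)
    best = min(sum(v) / len(v) for v in avg.values())
    return random.choice([k for k, v in avg.items() if sum(v) / len(v) == best])

def run(rounds=50, mem=4, s=3):
    m1, m2 = [], []
    for _ in range(mem):                     # seed the memories with random play
        a1, a2 = random.randrange(2), random.randrange(2)
        c1, c2 = costs(a1, a2)
        m1.append((a1, c1)); m2.append((a2, c2))
    for _ in range(rounds):
        a1, a2 = imitate(m1, s), imitate(m2, s)
        c1, c2 = costs(a1, a2)
        m1 = m1[1:] + [(a1, c1)]
        m2 = m2[1:] + [(a2, c2)]
    return [a for a, _ in m1], [a for a, _ in m2]

print(run())  # the histories typically become monomorphic (an absorbing set)
```
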
## Absorbing Sets of the Markov Process

- An absorbing set is a set of states that are all reachable from one another, but cannot reach any states outside of the set.
- In our example, we have 4 absorbing sets, one for each monomorphic pair of histories: (RRRR, RRRR), (RRRR, LLLL), (LLLL, RRRR), and (LLLL, LLLL). The state (RRRR, LLLL), with job 1 on R and job 2 on L, is the worst Nash (cost 1), and (LLLL, RRRR) is OPT (cost δ).
- But which absorbing set we end up in depends on our initial state. Hence we perturb our Markov process as follows: during each round, each player, with probability ε, does not use imitation dynamics, but instead chooses a machine at random.
- A small sketch of how to find absorbing sets from a chain's transition structure follows below.
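
Operationally, the absorbing sets of a finite chain can be read off from the support of its transition matrix: they are the communicating classes with no outgoing edges. Here is a sketch on a toy chain (not the 256-state example).

```python
# Find the absorbing sets (closed communicating classes) of a finite Markov
# chain given only which transitions have positive probability.

def reachable(succ, start):
    seen, stack = {start}, [start]
    while stack:
        for y in succ[stack.pop()]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen

def absorbing_sets(succ):
    reach = {x: reachable(succ, x) for x in succ}
    sets = []
    for x in succ:
        # x lies in an absorbing set iff everything it can reach can reach it back
        if all(x in reach[y] for y in reach[x]):
            cls = frozenset(reach[x])
            if cls not in sets:
                sets.append(cls)
    return sets

# Toy chain: A and B can reach each other but fall into one of two sinks.
succ = {"A": {"B", "C"}, "B": {"A", "D"}, "C": {"C"}, "D": {"D"}}
print(absorbing_sets(succ))   # [frozenset({'C'}), frozenset({'D'})]
```
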
## Stochastic Stability

- The perturbed process has only one big absorbing set (any state is reachable from any other state).
- Hence we have a unique stationary distribution $\mu_\varepsilon$ (where $\mu_\varepsilon P_\varepsilon = \mu_\varepsilon$).
- The probability distribution $\mu_\varepsilon$ is the time-average asymptotic frequency distribution of $P_\varepsilon$.
- A state $z$ is stochastically stable if $\lim_{\varepsilon \to 0} \mu_\varepsilon(z) > 0$. A numerical sketch of this limit follows below.
## Finding Stochastically Stable States

- Theorem (Young, 1993): The stochastically stable states are those states contained in the absorbing sets of the unperturbed process that have minimum stochastic potential.
- The stochastic potential of an absorbing state is the cost of a minimum spanning tree rooted at that state, in the graph whose vertices are the absorbing sets and whose edge weights are the resistances (how much perturbation is needed to move from one absorbing set to another).
- [Figure: the four monomorphic states with resistances on the directed edges between them; building the minimum-cost spanning tree rooted at each state shows that the OPT state (player 1's history all L, player 2's history all R) has minimum stochastic potential, so it is the stochastically stable state. A brute-force sketch of this computation follows below.]
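
To make Young's characterization concrete, here is a brute-force sketch that computes, for each absorbing state, the cost of a minimum-resistance spanning tree rooted there (edges directed toward the root). The four state labels and the resistance values are illustrative, not the ones in the slide's figure.

```python
# Stochastic potential of each absorbing state = cost of a minimum-resistance
# in-tree rooted at that state. Brute force over in-trees for a tiny example.
from itertools import product

states = ["RR", "RL", "LR", "LL"]   # four monomorphic states (illustrative)
# resistance[x][y] = assumed number of mistakes needed to move from x to y
resistance = {
    "RR": {"RL": 2, "LR": 1, "LL": 2},
    "RL": {"RR": 1, "LR": 1, "LL": 1},
    "LR": {"RR": 3, "RL": 2, "LL": 2},
    "LL": {"RR": 2, "RL": 1, "LR": 1},
}

def reaches_root(s, parent, root):
    seen = set()
    while s != root:
        if s in seen:          # cycle: not a tree rooted at root
            return False
        seen.add(s)
        s = parent[s]
    return True

def stochastic_potential(root):
    others = [s for s in states if s != root]
    best = float("inf")
    # each non-root state picks one outgoing edge; keep only choices in which
    # every state's parent chain leads to the root (i.e., a rooted in-tree)
    for parents in product(*[[t for t in states if t != s] for s in others]):
        parent = dict(zip(others, parents))
        if all(reaches_root(s, parent, root) for s in others):
            best = min(best, sum(resistance[s][parent[s]] for s in others))
    return best

for s in states:
    print(s, stochastic_potential(s))
# the stochastically stable states are those minimizing this potential
```
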
## Stochastic Stability, in Summary

- Assume the game is played repeatedly by players with limited information and resources.
- Use a decision rule (aka "learning behavior" or "selection dynamics") to model how each player picks her strategy for each round.
- This yields a Markov process whose states represent fixed-size histories of game play.
- Add noise: players make "mistakes" with some small positive probability and don't always behave according to the prescribed dynamics.
- The states with positive probability in the long run in the perturbed Markov process are the stochastically stable states (SSS).
- In our paper, we define the Price of Stochastic Anarchy (PSA) to be

  $$\mathrm{PSA} = \max_{z \in \mathrm{SSS}} \frac{\text{cost of } z}{\text{cost at OPT}}.$$

## PSA for Load Balancing

- Recall the bad instance: POA = 1/δ (unbounded).
- But the bad Nash in this case is not a SSS. In fact, OPT is the only SSS here, so PSA = 1 on this instance.
- Our main result: $\Omega(m) \le \mathrm{PSA} \le m \cdot \mathrm{Fib}^{(n)}(mn+1)$.
  - For the game of load balancing on unrelated machines, while POA is unbounded, PSA is bounded.
  - Specifically, we show $\mathrm{PSA} \le m \cdot \mathrm{Fib}^{(n)}(mn+1)$, which is m times the (mn+1)th n-step Fibonacci number; a sketch of how this quantity grows follows below.
  - We also exhibit instances of the game where PSA > m.
- (Here m is the number of machines and n is the number of jobs/players.)
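
For a rough sense of how the upper bound grows, here is a small sketch computing the n-step Fibonacci number and the bound $m \cdot \mathrm{Fib}^{(n)}(mn+1)$. The indexing convention of the sequence below is an assumption and may differ from the paper's by an offset.

```python
# n-step Fibonacci numbers (each term is the sum of the previous n terms,
# seeded with n-1 zeros and a one) and the resulting PSA upper bound.

def n_step_fib(n, k):
    """k-th term (1-indexed) of the n-step Fibonacci sequence."""
    seq = [0] * (n - 1) + [1]
    while len(seq) < k:
        seq.append(sum(seq[-n:]))
    return seq[k - 1]

def psa_upper_bound(m, n):
    return m * n_step_fib(n, m * n + 1)

print(psa_upper_bound(2, 2))   # even tiny instances give a sizeable bound
```
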
## Closing Thoughts

- In the game of load balancing on unrelated machines, we found that while POA is unbounded, PSA is bounded.
- Indeed, in the bad POA instances for many games, the worst Nash equilibria are not stochastically stable.
- Finding the PSA of these games poses interesting open questions that may yield very illuminating results.
- PSA allows us to determine the relative stability of equilibria, distinguishing those that are brittle from those that are more robust.

## Conjecture

- You might notice that in this game, if the players could coordinate or form a team, they would play OPT.
- Indeed, [AFM2007] showed that the strong price of anarchy is O(m) rather than unbounded.
- We conjecture that PSA is also O(m), i.e., that a linear price of anarchy can be achieved without player coordination.
