
THE PRICE OF STOCHASTIC ANARCHY

Christine Chung     University of Pittsburgh
Katrina Ligett      Carnegie Mellon University
Kirk Pruhs          University of Pittsburgh
Aaron Roth          Carnegie Mellon University
Load Balancing on Unrelated Machines

- n players, each with a job to run, chooses one of m machines to run it on.
  [Figure: two machines, and a table of the time each job (Job 1, Job 2, Job 3) needs on Machine 1 and Machine 2.]
- Each player's goal is to minimize her job's finish time.
- NOTE: the finish time of a job is equal to the load on the machine where the job is run.
  (A small code sketch of this setup follows below.)
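To make this concrete, here is a minimal Python sketch of the setup; the cost matrix values are made up for illustration and are not the numbers from the talk's figure.

```python
# Minimal sketch of load balancing on unrelated machines.
# cost[i][j] = time job i needs on machine j (hypothetical values).
cost = [
    [2.0, 5.0],   # job 1
    [4.0, 1.0],   # job 2
    [3.0, 3.0],   # job 3
]

def machine_loads(assignment, cost, m):
    """Total load on each machine under a given assignment (job -> machine)."""
    loads = [0.0] * m
    for job, machine in enumerate(assignment):
        loads[machine] += cost[job][machine]
    return loads

def finish_times(assignment, cost, m):
    """A job finishes when its machine finishes, i.e. at that machine's load."""
    loads = machine_loads(assignment, cost, m)
    return [loads[machine] for machine in assignment]

assignment = [0, 1, 0]  # job 1 -> machine 1, job 2 -> machine 2, job 3 -> machine 1
print(machine_loads(assignment, cost, 2))       # [5.0, 1.0]
print(finish_times(assignment, cost, 2))        # [5.0, 1.0, 5.0]
print(max(machine_loads(assignment, cost, 2)))  # makespan = 5.0
```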
Unbounded Price of Anarchy in the Load Balancing Game on Unrelated Machines

- Price of Anarchy (POA) measures the cost of having no central authority.
- Let an optimal assignment under centralized authority be one in which makespan is minimized.
- POA = (makespan at worst Nash) / (makespan at OPT)
- Bad POA instance: 2 players and 2 machines (L and R), with costs

              L   R
      job 1   δ   1
      job 2   1   δ

- OPT here costs δ.
- The worst Nash costs 1.
- Price of Anarchy = (cost of worst Nash) / (cost at OPT) = 1/δ, which is unbounded as δ → 0.
  (A worked check of this instance follows below.)
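A short Python check of this bad instance: it enumerates all assignments, finds OPT and the Nash equilibria, and reports the ratio 1/δ. The particular value of δ is just an example.

```python
from itertools import product

delta = 0.1   # a small example value for the bad instance
# cost[job][machine]: job 1 costs delta on L, 1 on R; job 2 costs 1 on L, delta on R
cost = [[delta, 1.0], [1.0, delta]]

def loads(assignment):
    ld = [0.0, 0.0]
    for job, mach in enumerate(assignment):
        ld[mach] += cost[job][mach]
    return ld

def is_nash(assignment):
    # No job can lower its own finish time by unilaterally switching machines.
    for job, mach in enumerate(assignment):
        other = 1 - mach
        current = loads(assignment)[mach]
        deviated = list(assignment)
        deviated[job] = other
        if loads(deviated)[other] < current:
            return False
    return True

all_assignments = list(product([0, 1], repeat=2))
opt = min(max(loads(a)) for a in all_assignments)
worst_nash = max(max(loads(a)) for a in all_assignments if is_nash(a))
print(opt, worst_nash, worst_nash / opt)   # delta, 1.0, 1/delta -> unbounded as delta -> 0
```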
Drawbacks of Price of Anarchy

- A solution characterization with no road map.
- If there is more than one Nash equilibrium, we don't know which one will be reached.
- Strong assumptions must be made about the players: e.g., fully informed and fully convinced of one another's "rationality."
- Nash equilibria are sometimes very brittle, making POA results feel overly pessimistic.
Evolutionary Game Theory

- Young (1993) specified a model of adaptive play.
Evolutionary Game Theory

"I dispense with the notion that people fully understand the structure of the games that they play, that they have a coherent model of others' behavior, that they can make rational calculations of infinite complexity, and that all of this is common knowledge. Instead I postulate a world in which people base their decisions on limited data, use simple predictive models, and sometimes do unexplained or even foolish things."
    – P. Young, Individual Strategy and Social Structure, 1998
Evolutionary Game Theory

- Young (1993) specified a model of adaptive play.
- Adaptive play allows us to predict which solutions will be chosen in the long run by self-interested decision-making agents with limited info and resources.
Adaptive Play Example
(Recall: job 1 costs δ on L and 1 on R; job 2 costs 1 on L and δ on R.)

- In each round of play, each player uses some simple, reasonable dynamics to decide which strategy to play (code sketches of both rules follow below), e.g.:
  - imitation dynamics:
    - Sample s of the last mem strategies I played.
    - Play the strategy whose average payoff was highest (breaking ties uniformly at random).
  - best response dynamics:
    - Sample the other player's realized strategy in s of the last mem rounds.
    - Assume this sample represents the probability distribution of what the other player will play next round, and play a strategy that is a best response (minimizes my expected cost).
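The two decision rules can be sketched roughly as follows. The function names, the payoff-as-negative-cost convention, and the `my_costs` lookup table are illustrative assumptions, not the paper's exact formulation.

```python
import random

def imitation_choice(my_history, my_payoffs, s):
    """Imitation dynamics: sample s of my own past (strategy, payoff) pairs and
    replay the strategy whose average payoff in the sample was best
    (here payoff = -cost, so 'best' means lowest finish time)."""
    sample = random.sample(list(zip(my_history, my_payoffs)), s)
    totals, counts = {}, {}
    for strat, pay in sample:
        totals[strat] = totals.get(strat, 0.0) + pay
        counts[strat] = counts.get(strat, 0) + 1
    best_avg = max(totals[st] / counts[st] for st in totals)
    tied = [st for st in totals if abs(totals[st] / counts[st] - best_avg) < 1e-12]
    return random.choice(tied)   # ties broken uniformly at random

def best_response_choice(opponent_history, my_costs, s, machines=(0, 1)):
    """Best-response dynamics: sample s of the opponent's past plays, treat the
    sample as a forecast of her next move, and pick my expected-cost-minimizing
    machine. my_costs[machine][opponent_machine] = my finish time in that outcome."""
    sample = random.sample(opponent_history, s)
    def expected_cost(machine):
        return sum(my_costs[machine][opp] for opp in sample) / len(sample)
    return min(machines, key=expected_cost)
```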
Adaptive Play Example: a Markov Process
(Recall: job 1 costs δ on L and 1 on R; job 2 costs 1 on L and δ on R.)

- Let mem = 4. (Then there are 2^8 = 256 total states in the state space.)
  [Diagram: a walk through the state space, where each state is the pair of the players' last four plays, e.g. (LLLL, LLLL), (LLLR, LLLL), (LLLL, LLLR), ..., (RRRR, LRRR), (RRRR, RRRR), with transition probabilities such as 3/4 and 1/4 marked on the arrows.]
- If s = 3, each player randomly samples three past plays from her memory, and picks the strategy among them that worked best (yielded the highest payoff).
Absorbing Sets of the Markov Process
(Recall: job 1 costs δ on L and 1 on R; job 2 costs 1 on L and δ on R.)

- An absorbing set is a set of states that are all reachable from one another, but cannot reach any states outside of the set.
- In our example, we have 4 absorbing sets, one for each state in which both players keep playing a single machine:
  (RRRR, RRRR); (RRRR, LLLL), the bad Nash with makespan 1; (LLLL, RRRR), which is OPT with makespan δ; and (LLLL, LLLL).
- But which state we end up in depends on our initial state. Hence we perturb our Markov process as follows:
  - During each round, each player, with probability ε, does not use imitation dynamics, but instead chooses a machine at random.
  (A small simulation sketch of this perturbed process follows below.)
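A rough simulation of this perturbed process on the 2x2 instance, using imitation dynamics with mem = 4, s = 3, and a small mistake probability eps. The parameter values and the visit counting are illustrative choices for a sanity check, not the paper's analysis.

```python
import random
from collections import Counter

delta, mem, s, eps = 0.1, 4, 3, 0.05
cost = [[delta, 1.0], [1.0, delta]]          # cost[job][machine]

def finish(job, my_mach, other_mach):
    """My finish time = load on my machine (includes the other job if it shares it)."""
    load = cost[job][my_mach]
    if other_mach == my_mach:
        load += cost[1 - job][my_mach]
    return load

def imitation(history):
    """Sample s of my last mem (machine, cost) plays and replay the machine with
    the lowest average cost, breaking ties uniformly at random."""
    sample = random.sample(history[-mem:], s)
    by_strat = {}
    for mach, c in sample:
        by_strat.setdefault(mach, []).append(c)
    best = min(sum(v) / len(v) for v in by_strat.values())
    tied = [mach for mach, v in by_strat.items() if abs(sum(v) / len(v) - best) < 1e-12]
    return random.choice(tied)

random.seed(0)
hist = [[(random.randint(0, 1), 1.0)] * mem for _ in range(2)]  # arbitrary start
visits = Counter()
for _ in range(200_000):
    moves = []
    for job in range(2):
        if random.random() < eps:                 # mistake: pick a machine at random
            moves.append(random.randint(0, 1))
        else:
            moves.append(imitation(hist[job]))
    for job in range(2):
        c = finish(job, moves[job], moves[1 - job])
        hist[job] = hist[job][1:] + [(moves[job], c)]
    visits[(moves[0], moves[1])] += 1

# Per the talk's result, the joint play (0, 1), i.e. OPT (job 1 on L, job 2 on R),
# should come to dominate the counts as eps shrinks.
print(visits.most_common())
```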
Stochastic Stability
(Recall: job 1 costs δ on L and 1 on R; job 2 costs 1 on L and δ on R.)

- The perturbed process has only one big absorbing set (any state is reachable from any other state).
- Hence we have a unique stationary distribution με (where μεPε = με).
  - The probability distribution με is the time-average asymptotic frequency distribution of Pε.
- A state z is stochastically stable if lim_{ε→0} με(z) > 0.
Finding Stochastically Stable States
(Recall: job 1 costs δ on L and 1 on R; job 2 costs 1 on L and δ on R.)

- Theorem (Young, 1993): The stochastically stable states are those states contained in the absorbing sets of the unperturbed process that have minimum stochastic potential.
- The stochastic potential of an absorbing state = the cost of the minimum spanning tree rooted there, in the graph whose nodes are the absorbing states and whose edge costs are the resistances of moving between them.
- [Diagram, built up over several slides: the four absorbing states (RRRR, RRRR), (RRRR, LLLL), (LLLL, RRRR), and (LLLL, LLLL), the resistances between them, and the minimum-cost spanning tree rooted at each state.]
- Carrying out this computation for our example, (LLLL, RRRR), i.e. OPT, has minimum stochastic potential, so it is the stochastically stable state. (A brute-force sketch of the computation follows below.)
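A brute-force sketch of the rooted-spanning-tree computation behind the stochastic potential. The resistance values below are made up for illustration; they are not the numbers in the talk's figure.

```python
from itertools import product

# resistance[a][b] = resistance of moving from absorbing state a to absorbing state b
# (illustrative numbers only).
states = ["RR", "RL (Nash)", "LR (OPT)", "LL"]
resistance = [
    # to: RR  RL  LR  LL
    [     0,  2,  1,  2],   # from RR
    [     3,  0,  1,  3],   # from RL (the bad Nash)
    [     3,  3,  0,  3],   # from LR (OPT)
    [     2,  2,  1,  0],   # from LL
]

def stochastic_potential(root):
    """Minimum total resistance over all trees in which every other state has a
    directed path to `root` (Young's rooted trees), found by brute force."""
    others = [i for i in range(len(states)) if i != root]
    best = float("inf")
    # Each non-root state picks one outgoing edge; keep choices where all paths reach the root.
    for parents in product(range(len(states)), repeat=len(others)):
        choice = dict(zip(others, parents))
        if any(p == node for node, p in choice.items()):
            continue   # skip self-loops
        total = sum(resistance[node][p] for node, p in choice.items())
        ok = True
        for node in others:
            seen, cur = set(), node
            while cur != root:
                if cur in seen:
                    ok = False
                    break
                seen.add(cur)
                cur = choice[cur]
            if not ok:
                break
        if ok:
            best = min(best, total)
    return best

for i, name in enumerate(states):
    print(name, stochastic_potential(i))
# The stochastically stable states are those with the minimum potential.
```

With these made-up resistances, the LR state (OPT) comes out with the smallest potential, matching the talk's conclusion for this instance; with the talk's actual figure values the computation proceeds the same way.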
Recap: Adaptive Play Model

- Assume the game is played repeatedly by players with limited information and resources.
- Use a decision rule (aka "learning behavior" or "selection dynamics") to model how each player picks her strategy for each round.
- This yields a Markov process where the states represent fixed-sized histories of game play.
- Add noise: with some small positive probability, players make "mistakes" and don't behave according to the prescribed dynamics.
Stochastic Stability

- The states in the perturbed Markov process that have positive probability in the long run are the stochastically stable states (SSS).
- In our paper, we define the Price of Stochastic Anarchy (PSA) to be

      PSA = max over SSS of (cost of SSS) / (cost at OPT)

  (A tiny illustration of this definition follows below.)
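A trivially small illustration of the definition; the costs below are hypothetical.

```python
def price_of_stochastic_anarchy(sss_costs, opt_cost):
    """PSA = max over stochastically stable states of cost(SSS) / cost(OPT)."""
    return max(sss_costs) / opt_cost

# Hypothetical instance whose SSS have makespans 3 and 5 and whose OPT has makespan 2:
print(price_of_stochastic_anarchy([3.0, 5.0], 2.0))   # 2.5
```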
PSA for Load Balancing
(Recall: job 1 costs δ on L and 1 on R; job 2 costs 1 on L and δ on R.)

- Recall the bad instance: POA = 1/δ (unbounded).
- But the bad Nash in this case is not an SSS. In fact, OPT is the only SSS here. So PSA = 1 in this instance.
- Our main result: Ω(m) ≤ PSA ≤ m·Fib^(n)(mn+1)
  - For the game of load balancing on unrelated machines, while POA is unbounded, PSA is bounded.
  - Specifically, we show PSA ≤ m·Fib^(n)(mn+1), which is m times the (mn+1)th n-step Fibonacci number. (A sketch of evaluating this bound follows below.)
  - We also exhibit instances of the game where PSA > m.
- (m is the number of machines, n is the number of jobs/players)
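A sketch of evaluating the upper bound. Base-case and indexing conventions for n-step Fibonacci numbers vary between sources, so this only shows how the bound is assembled and how quickly it grows; the paper's exact convention may differ.

```python
def nstep_fib(n, k):
    """k-th term of the n-step Fibonacci sequence (each term is the sum of the
    previous n terms). This uses one common base-case convention."""
    seq = [0] * (n - 1) + [1]
    while len(seq) < k:
        seq.append(sum(seq[-n:]))
    return seq[k - 1]

def psa_upper_bound(m, n):
    """The talk's upper bound m * Fib^(n)(mn + 1) under the convention above."""
    return m * nstep_fib(n, m * n + 1)

print(psa_upper_bound(2, 2))   # already nontrivial for 2 machines, 2 jobs
print(psa_upper_bound(3, 3))   # grows quickly with m and n
```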
Closing Thoughts

- In the game of load balancing on unrelated machines, we found that while POA is unbounded, PSA is bounded.
- Indeed, in the bad POA instances for many games, the worst Nash equilibria are not stochastically stable.
- Finding the PSA of these games is an interesting open question that may yield very illuminating results.
- PSA allows us to determine the relative stability of equilibria, distinguishing those that are brittle from those that are more robust, giving us a more informative measure of the cost of having no central authority.
Conjecture
(Recall: job 1 costs δ on L and 1 on R; job 2 costs 1 on L and δ on R.)

- You might notice in this game that if players could coordinate or form a team, they would play OPT.
- Instead of being unbounded, [AFM2007] have shown the strong price of anarchy is O(m).
- We conjecture that PSA is also O(m), i.e., that a linear price of anarchy can be achieved without player coordination.

								