Repeated Games Prisoner's Dilemma

							                    Lecture 9

Repeated Games: Prisoner’s Dilemma



                   BESS/TSM
EC4010 - Economic Theory - Module 2 - Game Theory

                 Pedro C. Vicente
              Trinity College Dublin
                           Introduction
                          Prisoner’s Dilemma

 We now consider a setting in which players repeatedly engage in the same
  strategic form game
 We focus on the Prisoner’s Dilemma as follows (C for cooperate; D for
  defect):
                                    Player 2
                                     C       D
                    Player 1   C    2, 2    0, 3
                               D    3, 0    1, 1

 Consider the following strategy (the grim-trigger strategy):
     choose C in the first period, and keep choosing C as long as the other
      player has chosen C in every previous period
     if the other player chooses D in any period, then choose D in every
      subsequent period
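
The grim-trigger rule can be sketched as a function of the other player's past actions. This is only a minimal illustration: the function name `grim_trigger` and the encoding of actions as the strings "C" and "D" are assumptions made here, not part of the lecture.

```python
def grim_trigger(opponent_history):
    """Action prescribed by grim trigger given the opponent's past actions.

    opponent_history: list of the other player's actions so far, each "C" or "D"
    (this encoding is an illustrative assumption).
    """
    # Cooperate in the first period (empty history) and for as long as the
    # opponent has never defected.
    if all(action == "C" for action in opponent_history):
        return "C"
    # A single past defection triggers D in every subsequent period.
    return "D"


print(grim_trigger([]))               # C
print(grim_trigger(["C", "C"]))       # C
print(grim_trigger(["C", "D", "C"]))  # D
```

Note that the last call returns D even though the opponent has gone back to C: under grim trigger the punishment never ends.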

                             Introduction
                                     NE

 How should a player respond if her opponent uses this strategy?
     If she chooses C in every period, then the outcome is (C,C) and her
      payoff is 2 in every period
     If she switches to D, she obtains a payoff of 3 in that period (a
      short-term gain) and a payoff of at most 1 in every subsequent period
      (a long-term loss)
     As long as she values the future sufficiently, the stream of payoffs
      (3,1,1,...) is worse than (2,2,2,...), so that she is better off choosing C
      in every period (always C is a best response)
     The grim-trigger strategy itself is also a best response, since it
      generates the same outcome as always C; thus both players using
      grim-trigger is a NE when players are sufficiently patient
 Another NE is both players choosing D in every period: if the opponent
  always chooses D, then always choosing D is a best response



                                   Repeated Games
                                          Preferences

 The outcome of a repeated game is a sequence of outcomes of a strategic
  form game
 How does a player evaluate such sequences? We assume that each player
  evaluates a sequence of outcomes in the repeated game by the discounted
  sum of the associated sequence of payoffs
 Each player i has a payoff function $u_i$ for the strategic form game and a
  discount factor $\delta_i$ between 0 and 1 such that she evaluates the sequence
  $(a^1, a^2, \ldots, a^T)$ of outcomes ($a^t$ is the action profile at time t) of
  the strategic form game by the sum

  $$u_i(a^1) + \delta_i u_i(a^2) + \delta_i^2 u_i(a^3) + \cdots + \delta_i^{T-1} u_i(a^T) = \sum_{t=1}^{T} \delta_i^{t-1} u_i(a^t)$$

 If $\delta_i$ is close to 0, then player i cares very little about the future
  (she is very impatient); if $\delta_i$ is close to 1, she values future payoffs
  almost as much as current ones; we assume all players have the same
  discount factor: $\delta_i = \delta$ for all i
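
The effect of patience on the comparison between cooperating and defecting can be checked numerically. The sketch below computes the discounted sum for short, finite payoff streams; the function name `discounted_sum` and the two sample values of the discount factor are illustrative assumptions.

```python
def discounted_sum(payoffs, delta):
    """Sum of delta**(t-1) * payoff_t over periods t = 1, ..., T."""
    # enumerate starts at 0, so the first-period payoff is not discounted.
    return sum(delta ** t * w for t, w in enumerate(payoffs))


cooperate = [2, 2, 2, 2]     # payoff stream from (C,C) in every period
defect_first = [3, 1, 1, 1]  # short-term gain of 3, then 1 in every later period

for delta in (0.1, 0.9):
    print(delta, discounted_sum(cooperate, delta), discounted_sum(defect_first, delta))
```

With the impatient value 0.1 the defecting stream has the larger discounted sum, while with 0.9 the cooperative stream does.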


                         Repeated Games
                             Preferences (cont’ed)



 Take an infinite stream of payoffs $(w^1, w^2, \ldots)$; a constant payoff c
  received in every period has discounted sum $c/(1-\delta)$, so the constant
  payoff that makes a player indifferent to the given stream of payoffs is

  $$c = (1-\delta) \sum_{t=1}^{\infty} \delta^{t-1} w^t$$

 We call $(1-\delta) \sum_{t=1}^{\infty} \delta^{t-1} w^t$ the discounted average of the
  stream $(w^1, w^2, \ldots)$
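
As a rough check on the definition, the discounted average can be computed exactly for any stream that becomes constant after finitely many periods. The helper below is a sketch under that assumption; the name `discounted_average` and its `prefix`/`tail` arguments are hypothetical.

```python
def discounted_average(prefix, tail, delta):
    """Discounted average of the stream prefix[0], prefix[1], ..., then `tail` forever.

    Requires 0 <= delta < 1 so that the infinite sum converges.
    """
    head = sum(delta ** t * w for t, w in enumerate(prefix))
    # The constant tail starts after len(prefix) periods, so its discounted
    # sum is delta**len(prefix) * tail / (1 - delta).
    tail_sum = delta ** len(prefix) * tail / (1 - delta)
    return (1 - delta) * (head + tail_sum)


# A constant stream (2, 2, ...) has discounted average 2 (up to rounding).
print(discounted_average([], 2, 0.7))
# The stream (3, 1, 1, ...) evaluated with delta = 0.5.
print(discounted_average([3], 1, 0.5))  # 2.0
```

The first call illustrates why the normalization by (1 - delta) is convenient: a constant stream is worth exactly its per-period payoff.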




                            Repeated Games
                                Repeated Games

Definition (Repeated Game): Let G be a strategic game; denote the set of
players by N and the set of actions and payoff function of each player i by $A_i$
and $u_i$ respectively; the T-period (encompassing the case of $T = \infty$) repeated
game of G for the discount factor $\delta$ is the extensive game with perfect
information and simultaneous moves in which:
   The set of players is N
   The set of terminal histories is the set of sequences $(a^1, a^2, \ldots, a^T)$ of
     action profiles in G
   The player function assigns the set of all players to every history
     $(a^1, \ldots, a^t)$, for every value of t
   The set of actions available to any player i after any history is $A_i$
   Each player i evaluates each terminal history $(a^1, a^2, \ldots, a^T)$ according to
     its discounted average $(1-\delta) \sum_{t=1}^{T} \delta^{t-1} u_i(a^t)$
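
The definition can be illustrated with a small simulation of a T-period repeated Prisoner's Dilemma. This is a sketch, not a general implementation: strategies are modelled as functions of the history of action profiles, players are indexed 0 and 1, and the names `PAYOFFS`, `play`, `discounted_average`, `always_c` and `always_d` are assumptions made for the example.

```python
# Stage-game payoffs from the lecture's table: (payoff to player 1, payoff to player 2).
PAYOFFS = {
    ("C", "C"): (2, 2), ("C", "D"): (0, 3),
    ("D", "C"): (3, 0), ("D", "D"): (1, 1),
}


def play(strategy1, strategy2, periods):
    """Return the terminal history: a list of action profiles (a1, a2)."""
    history = []
    for _ in range(periods):
        # Simultaneous moves: both strategies see only the history so far.
        a1 = strategy1(history)
        a2 = strategy2(history)
        history.append((a1, a2))
    return history


def discounted_average(history, player, delta):
    """(1 - delta) * sum of delta**(t-1) * u_i(a^t); player is index 0 or 1."""
    return (1 - delta) * sum(
        delta ** t * PAYOFFS[profile][player] for t, profile in enumerate(history)
    )


def always_c(history):
    return "C"


def always_d(history):
    return "D"


history = play(always_c, always_d, periods=3)
print(history)  # [('C', 'D'), ('C', 'D'), ('C', 'D')]
print(discounted_average(history, player=1, delta=0.5))  # 2.625
```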




                         Repeated Games
                  Finitely Repeated Prisoner’s Dilemma



 Nash Equilibrium: Every NE of a finitely repeated PD generates the
  outcome (D,D) in every period
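 Choosing C in any period cannot be part of a NE when the game is finitely
  repeated: consider the last period in which some player chooses C; by
  switching to D in that period she raises her payoff in that period, and her
  opponent cannot punish her afterwards, because by choosing D she
  guarantees at least the payoff of 1 that she would receive in each
  remaining period anyway
 SPE: every SPE is a NE, so every SPE also generates (D,D) in every period;
  in the unique SPE each player chooses D after every history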
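
The last-period argument can be made concrete with a small simulation. The sketch below assumes a 5-period game in which player 1 uses grim trigger, and compares player 2's payoffs from also using grim trigger with those from cooperating until the final period and then defecting. The helper names (`payoff_stream`, `make_defect_in_last`) and the choice of 5 periods are illustrative assumptions.

```python
# Stage-game payoffs: (payoff to player 1, payoff to player 2).
PAYOFFS = {
    ("C", "C"): (2, 2), ("C", "D"): (0, 3),
    ("D", "C"): (3, 0), ("D", "D"): (1, 1),
}


def grim_trigger(history, me):
    """Choose C as long as the other player has never chosen D."""
    other = 1 - me
    return "C" if all(profile[other] == "C" for profile in history) else "D"


def make_defect_in_last(periods):
    """Strategy that follows grim trigger but always chooses D in the final period."""
    def strategy(history, me):
        if len(history) == periods - 1:
            return "D"
        return grim_trigger(history, me)
    return strategy


def payoff_stream(strategy1, strategy2, periods, player):
    """Per-period payoffs of `player` (index 0 or 1) when the strategies meet."""
    history = []
    for _ in range(periods):
        history.append((strategy1(history, 0), strategy2(history, 1)))
    return [PAYOFFS[profile][player] for profile in history]


T = 5
print(payoff_stream(grim_trigger, grim_trigger, T, player=1))            # [2, 2, 2, 2, 2]
print(payoff_stream(grim_trigger, make_defect_in_last(T), T, player=1))  # [2, 2, 2, 2, 3]
```

The deviation gains in the last period and loses nothing earlier, so it is profitable for every discount factor greater than 0; this is why mutual grim trigger is not a NE of the finitely repeated game.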




                              Repeated Games
              Strategies in Infinitely Repeated Prisoner’s Dilemma
 A strategy of player i in an infinitely repeated game of the strategic form
  game G specifies an action of player i (a member of $A_i$) for every sequence
  $(a^1, \ldots, a^t)$ of outcomes of G
 The grim-trigger strategy for an infinitely repeated PD is defined as
  $s_i(\varnothing) = C$ and

  $$s_i(a^1, \ldots, a^t) = \begin{cases} C & \text{if } (a_j^1, \ldots, a_j^t) = (C, \ldots, C) \\ D & \text{otherwise} \end{cases}$$

  for every history $(a^1, \ldots, a^t)$, where j is the other player
      we can think of this strategy as having two states: one, call it C, in
          which C is chosen; another, call it D, in which D is chosen; initially
          the state is C; if, when the state is C, the other player chooses D, then
          the state changes to D and remains D from then on
 Tit-for-tat strategy: choose C in the first period and thereafter choose
  whatever the other player chose in the previous period; the length of
  punishment thus depends on the behavior of the player being punished: if
  she continues to choose D, then tit-for-tat continues to choose D; if she
  reverts to C, then tit-for-tat reverts to C also
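
The two-state description carries over directly to code. In the sketch below each strategy is a transition rule between the states C and D, and in each state the strategy plays the action of the same name; the function names and the sample sequence of opponent actions are assumptions made for illustration.

```python
def grim_trigger_next_state(state, opponent_action):
    # State "D" is absorbing: once entered, it is never left.
    if state == "C" and opponent_action == "C":
        return "C"
    return "D"


def tit_for_tat_next_state(state, opponent_action):
    # The next state simply mirrors the opponent's latest action.
    return opponent_action


def run(next_state, opponent_actions):
    """Actions chosen by the automaton against a fixed sequence of opponent actions."""
    state = "C"  # both strategies start in state C
    chosen = []
    for opponent_action in opponent_actions:
        chosen.append(state)  # in state X the automaton plays X
        state = next_state(state, opponent_action)
    return chosen


opponent = ["C", "D", "D", "C", "C"]
print(run(grim_trigger_next_state, opponent))  # ['C', 'C', 'D', 'D', 'D']
print(run(tit_for_tat_next_state, opponent))   # ['C', 'C', 'D', 'D', 'C']
```

The two outputs differ only in the last period: tit-for-tat forgives once the opponent has reverted to C, while grim trigger keeps punishing.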


                         Repeated Games
              NE in Infinitely Repeated Prisoner’s Dilemma

 Both players choosing D in every period is still a NE of the infinitely
  repeated Prisoner’s Dilemma
 Grim-trigger strategies:
     Suppose that player 1 uses the grim-trigger strategy; if player 2 uses
      the same strategy, then the outcome is (C,C) in every period, with
      payoff stream (2,2,...), whose discounted average is 2
     If player 2 adopts a strategy that generates a different sequence of
      outcomes, then in at least one period her action is D; in all subsequent
      periods, player 1 chooses D, so player 2’s best response is to choose D
      subsequently as well; this yields the stream (3,1,1,...) from the first
      period in which player 2 chooses D, whose discounted average is

      $$(1-\delta)(3 + \delta + \delta^2 + \delta^3 + \cdots) = (1-\delta)\left(3 + \frac{\delta}{1-\delta}\right) = 3(1-\delta) + \delta$$

      thus player 2 cannot increase her payoff by deviating if
      $3(1-\delta) + \delta \le 2 \iff \delta \ge \frac{1}{2}$
     This is the condition for the grim-trigger strategies to be a NE
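
The threshold of 1/2 can be verified numerically. The sketch below evaluates the discounted average of the deviation stream (3,1,1,...) at a few discount factors and compares it with the cooperative payoff of 2; the function names and the sample values are illustrative assumptions.

```python
def deviation_average(delta):
    """Discounted average of (3, 1, 1, ...): (1 - delta) * (3 + delta / (1 - delta))."""
    return 3 * (1 - delta) + delta


def grim_trigger_is_ne(delta):
    # Deviating must not pay more than the cooperative discounted average of 2.
    return deviation_average(delta) <= 2


for delta in (0.25, 0.5, 0.75):
    print(delta, deviation_average(delta), grim_trigger_is_ne(delta))
# 0.25 2.5 False
# 0.5 2.0 True
# 0.75 1.5 True
```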

                         Repeated Games
          NE in Infinitely Repeated Prisoner’s Dilemma (cont’ed)

 Tit-for-tat strategies:
     Suppose that player 1 adheres to this strategy; denote by t the first
       period in which player 2 chooses D (then player 1 chooses D in
       period t + 1, and continues to choose D until player 2 reverts to C);
       then player 2 has two options from t + 1: she can revert to C, in which
       case in period t + 2 she faces the same situation she faced at the start
       of the game, or she can continue to choose D, in which case player 1
       will continue to do so too; if player 2’s best response to tit-for-tat
       implies choosing D at some period, then she either alternates between
       D and C, or chooses D in every period
     Alternating: stream (3,0,3,0,...), with discounted average
       $(1-\delta)\frac{3}{1-\delta^2} = \frac{3}{1+\delta}$
     Always D: stream (3,1,1,...), with discounted average
       $3(1-\delta) + \delta = 3 - 2\delta$
     Tit-for-tat is an equilibrium if $2 \ge \frac{3}{1+\delta}$ and $2 \ge 3 - 2\delta$;
       both conditions are equivalent to $\delta \ge \frac{1}{2}$
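
Both tit-for-tat conditions can be checked in the same way. The sketch below uses the closed forms for the two deviation streams and cross-checks the alternating one against a long truncated sum; all function names and the truncation length of 2000 periods are assumptions made for the example.

```python
def alternating_average(delta):
    """Discounted average of (3, 0, 3, 0, ...): (1 - delta) * 3 / (1 - delta**2)."""
    return 3 / (1 + delta)


def always_d_average(delta):
    """Discounted average of (3, 1, 1, ...): 3 * (1 - delta) + delta."""
    return 3 - 2 * delta


def tit_for_tat_is_ne(delta):
    # Neither deviation may beat the cooperative discounted average of 2.
    return 2 >= alternating_average(delta) and 2 >= always_d_average(delta)


def truncated_average(stream, delta):
    """Numerical approximation of the discounted average from a long finite prefix."""
    return (1 - delta) * sum(delta ** t * w for t, w in enumerate(stream))


delta = 0.6
alternating = [3, 0] * 1000  # first 2000 periods of (3, 0, 3, 0, ...)
print(alternating_average(delta), truncated_average(alternating, delta))

for d in (0.4, 0.5, 0.6):
    print(d, tit_for_tat_is_ne(d))
```

The loop prints False for 0.4 and True for 0.5 and 0.6, matching the threshold of 1/2.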