

									       The University of Adelaide
         School of Economics

      Research Paper No. 2007-01

               The Dirty Faces Game Revisited
                      Ralph-C. Bayer∗ and Mickey Chan
                                 February 15, 2007

          Weber (2001) uses the Dirty Faces Game to examine the depth of it-
      erated rationality. Weber does not consider equilibria that contain weakly
      dominated actions. So he implicitly assumes that it is common knowledge
      that no one ever uses weakly dominated actions. We show that allowing
      for equilibria in weakly dominated strategies greatly extends the set of
      potentially rational actions. The original game therefore lacks discriminatory
      power, as many actions categorised as irrational by Weber can actually be
      part of an equilibrium strategy. We slightly modify the payoff structure
      and establish strict dominance, which leads to a unique equilibrium. The
      resulting dominance-solvable game is implemented in an experiment. We
      find that subjects are either able to iterate right to the equilibrium or fail to
      do so when two or more steps of iteration are necessary. Virtually all sub-
      jects were able to do one step of iteration. Further, we find evidence that
      the lack of confidence in other players’ iterative abilities induces deviations
      from equilibrium play.
      Keywords: Game Theory, Iterative Reasoning, Experimental Economics.
      JEL Classification Numbers: C91, C92, C72.

    ∗ Corresponding author: Department of Economics, Room 126, Napier Building,
Adelaide 5005, Australia.     Phone: +61(8) 8303 4666, Fax: +61(8) 8223 1460.     email:

1       Introduction
In the Dirty Faces Game – originally developed by Littlewood (1953) – com-
mon knowledge of rationality is necessary for players to behave according to the
Bayesian Nash equilibrium. In different configurations of the game, different levels
of iterated rationality and common knowledge are necessary. Therefore the game
can be used to measure the level of iterated rationality in humans with the help
of laboratory experiments. A particularly interesting feature of the Dirty Faces
Game is the fact that a higher level of iterative ability is beneficial to all players. In other games
commonly used to measure the level of iteration,¹ such as beauty-contest and centipede
games (e.g. McKelvey and Palfrey, 1992; Fey et al., 1996), it is best for
an individual player to iterate one level more deeply than the opponent(s). In
contrast, the Dirty Faces Game is ideal for testing how many levels of iteration a
player can do, as in a situation where n steps of iteration are necessary, a player,
who believes that the other players can do at least n − 1 steps of iteration, has
no incentive to deviate from equilibrium play.
    The basic idea of the game is the following. Players interact in a group. All
players observe the type of all other players (a dirty face or clean face) but not
their own type. Then, given the common knowledge that at least one player has
a ‘dirty face’, the objective is to find out as quickly as possible whether one’s own face is
dirty or not. Clearly, one level of iteration is necessary if a player observes no
dirty faces. Then, given that it is common knowledge that there is at least one
dirty face, she should know that it is her face that is dirty. In a second step, a
player who sees that one other player has a dirty face can infer the state of her own
face from the reaction of this player. If this player announces that he has a dirty
face, then she knows that her own face cannot be dirty, as this player must have
seen only clean faces in order to reach this conclusion. Seeing only one person
with a dirty face, who does not announce ‘my face is dirty’, should lead to the
conclusion that this other person must have seen at least one other person with
a dirty face. Since the only dirty face she sees is that of this particular person,
she can conclude that she must have a dirty face, too. Two steps of iterated
rationality are necessary in these two cases, where a player observes one dirty
face. The same logic applies for situations where more than one dirty face is
observed. The depth of iteration necessary is always the number of dirty faces
observed plus one.
    For a player to be able to deduce her type correctly, the following conditions
have to be satisfied. First, she has to be able to perform the iterated reasoning
described above. Secondly, all the other players also have to be able to perform
the necessary steps of iteration. Thirdly, the ability of the players to perform the
necessary steps of iteration has to be common knowledge.
    ¹ See for example the beauty contest in Nagel (1995), Duffy and Nagel (1997), Ho et al.
(1998), and Bosch-Domenech et al. (2002) or the email game developed by Rubinstein
(1989) and experimentally examined in chapter 5 of Camerer (2003).

    Weber (2001) adapted the Dirty Faces Game for the laboratory with the aim
of testing the depth of commonly known rational iteration in humans. In what
follows we show that Weber’s implementation requires an extreme assumption:
players do not play weakly dominated strategies and this is common knowl-
edge among all players. By implicitly concentrating on the equilibrium in un-
dominated strategies, Weber excludes a continuum of Perfect Bayesian Nash
Equilibria that include weakly dominated actions. In doing so, he categorises
some equilibrium play as non-rational. More specifically, Weber imposes the re-
striction that people have to believe that others immediately announce their type
once they know it, even if in equilibrium the other players are indifferent between
doing so or waiting.
    We modify Weber’s setup such that all the equilibria in weakly dominated
strategies disappear and only a unique equilibrium remains. Using this setup,
we can use a laboratory experiment to investigate the level of commonly known
iteration without having to rely on the implicit assumption that weak dominance
is commonly known to be obeyed by all players. We find that the frequency of
equilibrium play is roughly compatible with Weber’s findings. However, a direct
comparison is not the aim of this paper as we cannot control for subject-pool
effects. We concentrate on the question of when and why people deviate
from equilibrium play. For this purpose, we use an econometric panel model. Our
main finding is that almost all people are able to do one step of iteration, while in
many groups the iterative chain breaks down when two or more levels of iteration
are necessary. However, if a subject is able to do two levels of iteration he or she
can usually do three levels as well. Furthermore, we find evidence that doubts
about other players’ ability to iterate correctly explain some off-equilibrium play.
In larger groups individuals play less often according to equilibrium even if the
same level of iteration is necessary for doing so, since the perceived likelihood
that at least one other player may not be able to perform the necessary iteration
is higher.
    The remainder of the paper is organised as follows. In the next section we
explain our version of the dirty faces game. In Section 3 we show that our mod-
ification removes the unwanted equilibria from Weber’s model and leads to a
unique equilibrium, which can be reached by iterated deletion of strictly dominated
strategies. Our experimental design is described in Section 4. The results are
reported in Section 5. Section 6 concludes.

2      The Dirty Faces Game with discounting
In this section we develop our specification of the Dirty Faces Game. Our version
has the same structure as Weber’s game. However, our game will allow for
discounting. The n-player Dirty Faces Game proceeds as follows:
  1. There are n players, i = 1, . . . , n. Player −i denotes the partner(s) of player
     i. We call the collection of all players i ∈ {1, . . . , n} in a game a cohort.

  2. Nature draws a type θi ∈ {O, X} for all players from a distribution that is
     identical for and independent between all players. The commonly known
     probabilities are 1 − p for type O and p for type X, respectively. Being of
     type X can be seen as having a dirty face.

  3. An announcement takes place, which makes it common knowledge among
     the players whether there is at least one type X player in the cohort. The
     announcement is denoted by the boolean indicator variable ρ ∈ {true, false},
     and declares whether:

      (a) no one has drawn type X, ρ = false, or
      (b) there is at least one type X player, ρ = true.

  4. The players observe the types of their partners θ−i , but cannot observe their
     own type θi .

  5. The players make decisions and these are evaluated in the following sequence:

       Stage counter : Starting with t = 1.
       Decision stage: Each player chooses an action, either up or down, where
          the action down corresponds to saying ‘I have a dirty face’, while
          up corresponds to ‘I don’t know’.
       Evaluation stage: The game ends if either
           (a) any player has chosen down, or
           (b) n stages have passed, i.e. when t = n.
           Otherwise, the game continues with players returning to the Decision
           stage. The stage counter t is advanced by one and the players learn
           the actions taken by all players in the previous stage.

  6. Payoffs are realised.

   The payoffs ui (ai , θi ) are dependent on the action and the type of the player.
Furthermore payoffs are discounted with discount factor δ per period. When
down has been chosen, a type X player receives α, and a type O player receives
−β. Including discounting we have the following payoffs:

                               $u_i(\text{down}, X) = \delta^{t-1}\alpha$

                               $u_i(\text{down}, O) = -\delta^{t-1}\beta$

   On the other hand, the player receives zero payoff whenever choosing up,
regardless of his type and the period in question:

                                   ui (up, θi ) = 0

    So players who correctly infer that they have a dirty face will be rewarded, while
players who wrongly claim to have a dirty face are penalised. For our purposes, it
is important to make sure that players will not find it attractive to gamble if they
do not have information in addition to the prior. The following condition makes
sure that it is sequentially rational to choose up given that the prior beliefs are
held:

                                pα − (1 − p)β < 0                                (1)
    This condition ensures that the expected payoff of choosing down is negative
if subjects have no further information than the prior probability for type X. It
follows that the action down – claiming to have a dirty face – is strictly dominated
by playing up – admitting not to know – when the prior beliefs are held. However,
this is not sufficient to render the game dominance solvable, as will be shown
below. We need a further assumption on δ. It will turn out that in Weber’s
formulation without discounting, where δ is equal to one, dominance solvability
breaks down, while for δ ∈ (0, 1) iterated deletion and correct updating lead to
a unique Perfect Bayesian Equilibrium. The payoffs for different actions given a
particular type are summarised in Table 1.

              Table 1: Payoffs in the discounted Dirty Faces Game
                                         Own type
                    Own action     X                O
                    up             0                0
                    down           δ^(t−1) α        −δ^(t−1) β
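
The payoffs in Table 1 and the non-gambling condition (1) can be checked numerically. The following is a minimal sketch, not the authors' code: the function names `payoff` and `expected_down_payoff` are hypothetical, and the parameter values are the ones reported later in Section 4 (p = 2/3, α = 100, β = 400, δ = 0.8).

```python
from fractions import Fraction


def payoff(action, own_type, t, alpha, beta, delta):
    """Payoff u_i(action, type) when the action is taken at stage t = 1, 2, ..."""
    if action == "up":
        return 0  # up always yields zero, regardless of type and stage
    if action != "down":
        raise ValueError("action must be 'up' or 'down'")
    discount = delta ** (t - 1)
    return discount * alpha if own_type == "X" else -discount * beta


def expected_down_payoff(belief, t, alpha, beta, delta):
    """Expected payoff of down at stage t, given belief = Pr(own type is X)."""
    return (belief * payoff("down", "X", t, alpha, beta, delta)
            + (1 - belief) * payoff("down", "O", t, alpha, beta, delta))


# Parameter values taken from Section 4 of the paper (exact fractions).
p, alpha, beta, delta = Fraction(2, 3), 100, 400, Fraction(4, 5)

# Condition (1): gambling on the prior has a negative expected payoff.
prior_gamble = expected_down_payoff(p, 1, alpha, beta, delta)
print(round(float(prior_gamble), 2))  # -66.67
```

With a belief of one (the player knows she is of type X), the same function returns α at stage 1 and δα at stage 2, which is the comparison that drives the equilibrium analysis in Section 3.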

      In the decision stage, the basis on which players choose a particular action
is their beliefs. The belief of player i, µi , is his believed probability of being
of type X, i.e. having a dirty face, which has all the properties of a standard
probability measure. The action and the belief of player i at stage t, given
the observation of the partners’ types and the announcement, are denoted by
$a_i^{(t)}(\theta_{-i}, \rho)$ and $\mu_i^{(t)}(\theta_{-i}, \rho)$, respectively. We omit the history of the game as it is
perfectly determined by t and the fact that the game is still going. For example,
$a_i^{(2)}(\theta_{-i}, \rho)$ contains the fact that all players $i \in \{1, \dots, n\}$ must have played
$a_i^{(1)}(\theta_{-i}, \rho) = \text{up}$.
      A pure strategy is a profile of actions - one for every information set reached.
Thus a pure strategy can be expressed by an action vector a that contains an

action for each possible information set. It contains an action for all situations
(θ−i , ρ) and periods t. In an equilibrium we also need to specify a belief vector µ
for each player. This vector, which is sometimes called an assessment, assigns a
believed probability of being of type X for every information set. As the Dirty
Faces Game is a Bayesian game, strategies and belief vectors for all players are
necessary to describe an equilibrium.

3    Equilibrium
In what follows we will informally characterise the Perfect Bayesian Equilibria of
the game for two players depending on the discount factor δ. The logic easily
extends to any number of players. In a Perfect Bayesian Equilibrium, all actions
have to be sequentially rational given own beliefs and the strategies of all other
players. Additionally, the beliefs have to be consistent with equilibrium actions
and prior beliefs. Where possible Bayes’ rule is used for updating.
    Suppose that both players have drawn type O. Then the announcement will
be that no player is of type X, i.e. ρ = false. Without any further iteration
necessary both players should see that their only sequentially rational actions are
choosing up twice: $a_i^{(1)*}(O, \text{false}) = \text{up}$ and $a_i^{(2)*}(O, \text{false}) = \text{up}$ for all i. This
is part of any equilibrium. The more interesting case is the one where at least
one player is of type X. Given that this is the case the announcement will be
ρ = true.
Proposition 1. For n = 2 and δ ∈ (0, 1] there exists an equilibrium, which ∀i
contains the actions

                              $a_i^{(1)*}(O, \text{true}) = \text{down}$
                              $a_i^{(2)*}(O, \text{true}) = \text{down}$
                              $a_i^{(1)*}(X, \text{true}) = \text{up}$
                              $a_i^{(2)*}(X, \text{true}) = \text{down}$
Proof. Correct updating requires that $\theta_{-i} = O$ and $\rho = \text{true}$ imply $\mu_i^{(1)}(O, \text{true}) = 1$.
So playing down yields an expected payoff of α in t = 1 and δα in t = 2. The
maximum deviation payoff from delaying playing down is δα. As δ ≤ 1, delaying
is never a profitable deviation in this case.
    Correct belief formation requires that $\mu_i^{(1)}(X, \text{true}) = p$, as no updating is
possible. So $a_i^{(1)}(X, \text{true}) = \text{up}$ follows from the non-gambling condition in (1).
For the given equilibrium strategy $\mu_i^{(2)}(X, \text{true}) = 1$ is implied because the other
player’s action up in t = 1 is only consistent with $\theta_i = X$.
    The equilibrium above, which we will refer to as “separating” in what follows,
since the different types play different strategies, is the only equilibrium Weber

considered. This equilibrium makes sense in that a player who sees only an
opponent of type O, but knows that there is at least one type X, concludes that
it must be her, which leads to an immediate cashing in with the action down.
A player who knows that all players are able to make this one step of iteration
will be able to make a second step of inference: If I see a type X playing up
in the first period this player must have seen me being of type X. Otherwise
she would have played down. So I must be of type X. We also see that
this equilibrium hinges crucially on a player choosing down as soon as he or
she has inferred being of type X. Unfortunately, this is only weakly dominant
if δ = 1. So in this case the dominance solvability, which is highly desirable
for measuring common rationality, breaks down. Even worse, there are many
other equilibria for Weber’s case of δ = 1. Therefore Weber characterises some
behaviour as irrational, which can actually be equilibrium play. The following
proposition shows this. Let us denote a mixed behavioural strategy that gives the
probability of player i choosing down in period t given $\theta_{-i}$ and $\rho$ as $\sigma_i^{(t)}(\theta_{-i}, \rho)$.
Proposition 2. For n = 2 and δ = 1 there exists a continuum of equilibria with
                        $\sigma_i^{(1)*}(O, \text{true}) \in [0, 1]$ for all i.
Proof. Mixing in period t = 1 for given θ−i = O and ρ = true is sequentially
rational if the expected payoffs from playing up and down are identical. Knowing
that θi = X yields α for playing down. Playing up yields 0 if the other player
plays down in period t = 1. However, if the other player plays up in period t = 1
then playing down in period t = 2 yields δα = α for δ = 1. So playing up in
period t = 1 gives the same payoff if the other player plays up with probability 1
in the first period. Condition (1) ensures that playing down is strictly dominated
for the other player in period one.
    In Weber’s formulation delaying playing down if the type is known is only
weakly dominated. Since a player who delays ending the game by choosing up
does not have to fear the other player jumping in and ending the game, delaying
is part of an equilibrium strategy. This is the case because there are no costs of
delaying. Consequently, depending on the equilibrium probability $\sigma_i^{(1)*}(O, \text{true})$,
the other player −i may choose up or down in period t = 2 after observing
θ−i = X and ρ = true.² Since we cannot infer mixing probabilities from a
few individual choices, laboratory experiments with Weber’s payoff settings have
the problem that they only have discriminatory power if the assumption holds
that weakly dominated equilibrium-strategies are never played and that this is
common knowledge.
    ² The condition for a risk-neutral player −i to choose down in such a situation is
$\sigma_i^{(1)*}(O, \text{true}) \le \frac{p\alpha}{(1-p)\beta}$.

    We introduce a waiting cost in the form of a discount factor δ ∈ (0, 1). This
forces a rational player to end the game immediately when she can infer that
her type is X. Therefore, the continuum of equilibria in mixed strategies and
all its resulting asymmetric equilibria vanish. With such a setup we are able to
properly discriminate between commonly rational behaviour and iteration failure
in a laboratory experiment.
Proposition 3. For n = 2 and δ ∈ (0, 1) the “separating equilibrium” from
Proposition 1 is unique.
Proof. The situation θ−i = O and ρ = true now requires playing down with
probability one for sequential rationality, as playing up is strictly dominated.
The payoff from playing down is α, which is greater than the maximum payoff
that can be achieved by playing up, which is δα. The beliefs for all other situations
are pinned down as described in the proof of Proposition 1, and they allow only
the described actions to be sequentially rational.
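
The payoff comparison behind Propositions 2 and 3 can be written out explicitly. This is a sketch under the assumptions used in the proofs: player i has observed $\theta_{-i} = O$ and $\rho = \text{true}$ (so she knows $\theta_i = X$), and the other player, who holds only the prior, plays up in period 1 by condition (1).

```latex
\begin{align*}
u_i(\text{down at } t = 1) &= \alpha, \\
u_i(\text{up at } t = 1,\ \text{down at } t = 2) &= \delta\alpha, \\
\alpha - \delta\alpha = (1 - \delta)\,\alpha &
\begin{cases}
= 0 & \text{if } \delta = 1 \text{ (indifference, Proposition 2)}, \\
> 0 & \text{if } \delta \in (0, 1) \text{ (delay is strictly worse, Proposition 3)}.
\end{cases}
\end{align*}
```

With the experimental values α = 100 and δ = 0.8 from Section 4, delaying by one stage costs (1 − 0.8) · 100 = 20 points.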
    The logic easily extends to games with more than two players. The unique
equilibrium strategy for a particular player with δ ∈ (0, 1) can be characterised
as follows:
    1. Always play up if ρ = false.

    2. If ρ = true play up if t−1 is smaller than the number of players you observe
       being of type X, play down otherwise.
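
The two rules above can be sketched as a small simulation. This is an illustrative sketch only: the names `equilibrium_action` and `play_game` are hypothetical, and the code assumes that every player follows the equilibrium strategy.

```python
from typing import List, Tuple


def equilibrium_action(num_x_observed: int, rho: bool, t: int) -> str:
    """Equilibrium action at stage t for a player who observes num_x_observed type-X partners."""
    if not rho:
        return "up"  # rule 1: no type X in the cohort, so never claim to be X
    # rule 2: wait while t - 1 is smaller than the number of X partners observed
    return "up" if t - 1 < num_x_observed else "down"


def play_game(types: List[str]) -> Tuple[int, List[List[str]]]:
    """Play one game in which all players follow the equilibrium strategy.

    Returns the stage at which the game ends and the actions chosen in each stage.
    """
    n = len(types)
    rho = "X" in types  # public announcement
    history = []
    for t in range(1, n + 1):
        actions = []
        for i in range(n):
            seen = sum(1 for j, th in enumerate(types) if j != i and th == "X")
            actions.append(equilibrium_action(seen, rho, t))
        history.append(actions)
        if "down" in actions:  # the game ends as soon as anyone plays down
            return t, history
    return n, history  # n stages have passed


print(play_game(["X", "O"]))       # (1, [['down', 'up']])
print(play_game(["X", "X", "X"]))  # ends in stage 3 with everyone playing down
```

In this sketch a cohort with k type-X players ends the game in stage k, with exactly the type-X players choosing down, which matches the rule that the required depth of iteration is the number of dirty faces observed plus one.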
    In a setting without discounting, δ = 1, the class of equilibria is quite large.
The fact that players are indifferent in equilibrium between playing down immedi-
ately once they know their type or delaying gives rise to many different equilibria
in pure and mixed strategies, such that it is not possible to properly discriminate
between equilibrium and erroneous play.³ Also note that in the discounting
environment iterated deletion of strictly dominated actions from a strategy leads
to the unique equilibrium, which makes it possible to test at which iteration step
commonly known rationality breaks down. The number of steps of iteration necessary for an
individual player to choose an equilibrium strategy is the number of dirty faces
observed plus one. The player must also believe that the other players can do at
least one step of iteration less.

     ³ In footnote 18, p. 239, Weber attributes some individual behaviour that deviates from the
“separating equilibrium” to spite instead of irrationality. Closer inspection shows that the
observed behaviour is actually consistent with equilibrium play. One player who understood
the identical equilibrium payoffs for delaying and immediate play of down used this knowledge
to mislead the other group member.

4     Experimental implementation
Our experiments only differ from Weber’s second series of sessions, which he used
to investigate learning effects, by introducing discounting on delayed payoffs. Our
aim is to see how removing the multiple equilibria may affect the experimental
outcomes. This purpose leaves little room for design choices, since we want
to qualitatively compare the results with those from Weber’s experiments.⁴ In
particular, we stick with the use of the neutral language labeling types O and X,
instead of using framed language like “clean” and “dirty faces”.
    There are - as in Weber’s paper - two treatments with cohort sizes of n = 2
and n = 3. So the necessary level of iterated reasoning for equilibrium play
is relatively low. The prior probabilities for drawing type O and type X are
1/3 and 2/3, respectively. These priors sufficiently reduce the occurrence of the
trivial case ρ = false. The payoff parameters are α = 100 and β = 400
points, which together with the prior probabilities should prevent gambling. The
expected return is -67 points when subjects choose down if they hold the prior
beliefs. Meanwhile, choosing up at the end of the game always yields zero. The
prior probabilities and the relative size of the payoffs are identical to Weber’s
learning setup.
    The discount factor is set to δ = 0.8. The reduction in the payoff is noticeable
as each stage passes, but not so significant that gambling behaviour at later
stages might be induced. The incentive condition against gambling is maintained
throughout the game, since the discounting is applied to all payoffs.
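
To see why discounting leaves the incentive against gambling intact, the expected payoff of choosing down at stage t while still holding the prior belief can be written out with the parameter values given above. This is a worked sketch under that assumption about beliefs, not a formula taken from the paper:

```latex
\[
\delta^{t-1}\bigl[p\alpha - (1-p)\beta\bigr]
  = 0.8^{\,t-1}\left[\tfrac{2}{3}\cdot 100 - \tfrac{1}{3}\cdot 400\right]
  = -\tfrac{200}{3}\cdot 0.8^{\,t-1}
  \approx -66.7,\ -53.3,\ -42.7 \quad \text{for } t = 1, 2, 3 .
\]
```

Since δ scales gains and losses by the same positive factor, the sign of the expression is the sign of condition (1) at every stage.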

4.1     Procedure and summary statistics
The sum of points earned during the experiment was converted to cash at the
end of the game.⁵ The conversion ratio was one Australian Dollar (AUD) for
every 100 points. An endowment of 900 points, or AUD 9, was provided to every
subject at the beginning of the session. The main purpose of the endowment was
to prevent early bankruptcy in the session. An injection of points (treated like a
loan) was provided to subjects who went bankrupt.⁶
    Each treatment consisted of fourteen consecutive Dirty Faces Games. All
games were independent from each other, i.e. the type draws were independent.
The profits were shown at the end of each period, and payments were made in cash
at the end of the session. Subjects were anonymously matched with the same other
subject(s) for all games in a session; we use a partner treatment. All these facts
are common knowledge to all participants.
     ⁴ A direct comparison is not possible due to potential subject pool effects.
     ⁵ In the comparable treatments Weber paid two random periods.
     ⁶ The average/median profits after fourteen periods were AUD 11.74/12.4 in the 2-player
treatment, and AUD 7.98/9 in the 3-player treatment. The number of people who ended up
with less than the endowment was 1 (out of 42 subjects) in the 2-player game and 8 (out of 48)
in the 3-player game.

    Each game starts with the computer randomly and independently assigning
types to subjects according to the priors. An announcement is made on the
screen whether there is at least one type X player in the cohort. It is followed
by revealing the type(s) of the partner(s); in the 3-player case, the two partners
are identified as Left and Right. Subjects enter the first stage and are asked to
simultaneously choose their actions - either up or down. After everyone in the
cohort has chosen an action, the actions chosen by their partners are revealed. If
the game advances to the next stage, subjects are asked to choose actions again.
The game continues until someone has chosen down or n stages have passed. The
period payoff is displayed at the end of the period, but subjects are never told
their own type at the end of the game.
    The sessions were conducted in a computer lab at the University of Adelaide.
The subjects were recruited from a pool of students. The experimental sessions
were programmed and run using the software z-Tree by Fischbacher (2007).
Communication among subjects was controlled for, as they were seated in self-
contained booths. The seating order was randomly assigned. No communication
was allowed during the session. Subjects received all the relevant information in
written instructions and interactively on screen. The instructions for the 2-player
game can be found in Appendix A.
    The experimental sessions were conducted between October 2004 and May
2005. There were two sessions for each treatment, 2-player and 3-player games.
The sessions provided a total of 1260 observations from 90 subjects who played
in 14 periods - 42 subjects formed 21 groups in the 2-player game, and the other
48 subjects formed 16 groups in the 3-player game. Subjects were primarily
undergraduate students from the University of Adelaide. The distribution of
students from various disciplines is shown in Table 2.

         Table 2: Distribution of courses the subjects were enrolled in
                         Course         Number   Percent
                         Arts                4      4.44
                         Commerce           17     18.89
                         Economics          30     33.33
                         Engineering        24     26.67
                         Finance             4      4.44
                         Sciences            9     10.00
                         Others              2      2.22
                         Total              90    100.00

5    Results
In what follows we will present our main results. We begin with some descriptive
statistics on the frequencies of agreement by treatment and type realisation. In
later sections we use statistical inference in order to analyse cohort size effects

and the influence of the ‘difficulty’ of a situation on the likelihood of individual
play following the equilibrium strategy.

5.1    Descriptive statistics
Table 3 reports individual level data over the 14 playing periods. For a given
event, each entry states the number of subjects choosing an action in agreement
with the predicted best response. The percentage of these subjects out of all
subjects who were in the same situation is stated in parentheses. For example, in
the tenth period of the 2-player game, 9 of the subjects who observed a type
O partner and ρ = true, or 90% (0.90) of them, chose the predicted best response.
    There is clearly a high frequency of consistency with equilibrium play in the
events where the required number of iterations is low. These are cases where
one level of iteration is required when subjects have observed ρ = true and all
partners are type O - θ−i = O for n = 2 or θ−i = OO for n = 3. Beyond that,
the frequency of agreement with the predicted response gets much lower as the
required level of iteration increases.
    Holding the required number of iterations constant, it seems that more players
are able to perform two levels of iteration in the 2-player game (observed X, 0.62)
than in the 3-player game (observed OX, 0.52). This might suggest that a higher
number of players in a cohort complicates the thinking process of subjects and
hinders the iterative process.
    In the 3-player game, there does not seem to be any significant difference in
the frequency of agreement with the predicted responses between the two highest
levels of iteration. The frequencies are 0.52 for two levels of iteration (observed
OX) and 0.55 for three levels of iteration (observed XX). It is reasonable to
assume that if subjects are unable to perform two steps of iteration then they
should be unable to iterate correctly in situations that require more than two steps.
The closeness of the two frequencies may also suggest that once subjects are able
to perform two levels of iteration, they may have understood the structure and
are therefore also able to solve tasks where higher-level iteration is necessary.
They might slide down the slippery slope of iteration and go all the way.
                         Table 3: Individual rationality across periods
                                   Agreements: n (freq.)

                          2 players                                            3 players
 Obs.       ρ = false  O          X          Total        ρ = false  OO         XO          XX          Total
 Pred.(a)   UU         D          UD                      UUU        D          UD          UUD
 Period
   1        2(1.00)    11(0.92)   17(0.61)   30(0.71)                2(0.67)    8(0.44)     12(0.44)    22(0.46)
   2        5(0.83)    7(0.88)    15(0.54)   27(0.64)                2(1.00)    13(0.59)    15(0.63)    30(0.63)
   3        5(0.83)    8(0.89)    15(0.56)   28(0.67)     5(0.83)    5(0.83)    11(0.55)    10(0.63)    31(0.65)
   4        8(1.00)    10(1.00)   16(0.67)   34(0.81)                3(0.75)    9(0.50)     12(0.46)    24(0.50)
   5        6(1.00)    8(1.00)    14(0.50)   28(0.67)     5(0.83)    1(1.00)    8(0.57)     15(0.56)    29(0.60)
   6        2(1.00)    10(0.83)   19(0.68)   31(0.74)                4(1.00)    10(0.42)    11(0.55)    25(0.52)
   7        2(1.00)    9(1.00)    20(0.65)   31(0.74)                7(1.00)    15(0.58)    8(0.53)     30(0.63)
   8        6(1.00)    10(1.00)   17(0.65)   33(0.79)                2(1.00)    7(0.35)     15(0.58)    24(0.50)
   9        9(0.90)    6(0.86)    14(0.56)   29(0.69)                1(0.50)    9(0.50)     19(0.68)    29(0.60)
  10        7(0.88)    9(0.90)    16(0.67)   32(0.76)     3(1.00)    3(1.00)    9(0.56)     16(0.62)    31(0.65)
  11        8(1.00)    9(1.00)    17(0.68)   34(0.81)                2(1.00)    12(0.55)    13(0.54)    27(0.56)
  12        4(1.00)    7(1.00)    19(0.61)   30(0.71)     2(0.67)    2(1.00)    9(0.45)     11(0.48)    24(0.50)
  13        2(1.00)    7(1.00)    23(0.70)   32(0.76)     5(0.83)    6(1.00)    15(0.63)    7(0.58)     33(0.69)
  14        4(1.00)    5(1.00)    22(0.67)   31(0.74)                4(1.00)    10(0.56)    12(0.46)    26(0.54)
 Agg.       70(0.95)   116(0.94)  244(0.62)  430(0.73)    20(0.83)   44(0.92)   145(0.52)   176(0.55)   385(0.57)
 (a) U means up and D means down.

    The following should be kept in mind when looking at these summary statistics:

  1. We consider players in the following situations as rational, even though their
     chosen actions have led to losses. To see this, suppose the cohort has drawn
     OX in the 2-player game. The player with type O has observed a player
     of type X as partner. The best responses are up for the player of type O
     and down for the player of type X in the first stage. The game should have
     ended there. However, if player X has chosen the strictly dominated action
     up in the first stage, the O type should choose down in the second, as his
     partner should only have chosen up if he has observed a type X player. The
     O type who logically is playing down is wrong and suffers losses. His mistake
     is due to the irrational signaling from his partner. Similar situations may
     occur in the 3-player game. In these situations, we consider player O as
     rational, even though his play led to losses due to relying on wrong signals.

   2. Players play the dominant strategy when they believe their partners play
      the dominant strategy, i.e. if they believe that their partners are rational.
      However, if they believe that their partners are not rational, i.e. not playing
      dominant strategies, they might themselves deviate from the equilibrium
      strategy and play a “rationalizable” strategy. A rationalizable strategy is a
      strategy that is a best response to some strategy of the opponent, but not
      necessarily to the equilibrium strategy. So a player who does not believe
      that his partner is behaving rationally may consider the signaling from this
      partner as useless, since then following the signal will lead to losses, as in the
      situations mentioned above. Nevertheless, we consider their rationalizable
      actions as irrational, even though the deviation from equilibrium can
      be rational if they believe that the other player(s) in the cohort are not
      rational.

   3. Consider a cohort that has drawn XX in the 2-player game. A risk averse
      player in this game will play the strategy (up, up) if he is not able to make
      two steps of iteration7, whereas a rational player will play the best response
      (up, down). However, one player may choose down in period one, which
      ends the game after the first stage. Then we will not observe the actions in
      the second stage. Moreover, we are not able to distinguish a totally rational
      player and a risk-averse player with limited iterative ability. So we don’t
      know if the player is able to do the second step of iteration in this situation.
      We will count a player, who played according to the equilibrium prediction
      until the game wrongfully ended, as an agreement.
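The expected return of -67 cents attached to choosing down without knowing one's own type can be checked from the prior (two X balls, one O ball) and the round-one payoffs listed in Appendix A. The following is a minimal sketch of that arithmetic; the function name and the break-even calculation are illustrative additions, not part of the original analysis.

```python
from fractions import Fraction

P_X = Fraction(2, 3)        # prior from the urn: two "X" balls, one "O" ball
WIN_X, LOSS_O = 100, -400   # round-one payoffs for "down" in cents (Appendix A)

def expected_down_payoff(prob_x, discount=Fraction(1)):
    """Expected payoff in cents of playing down when P(own type is X) = prob_x."""
    return discount * (prob_x * WIN_X + (1 - prob_x) * LOSS_O)

# Belief at which down and up (payoff 0) are equally attractive.
BREAK_EVEN = Fraction(-LOSS_O, WIN_X - LOSS_O)

round_one = expected_down_payoff(P_X)
round_two = expected_down_payoff(P_X, Fraction(4, 5))  # payoffs scaled by 0.8

print(float(round_one))   # about -66.67 cents
print(float(round_two))   # about -53.33 cents
print(BREAK_EVEN)         # 4/5
```

Under these payoffs, down only pays in expectation once a player believes he is of type X with probability above 4/5, which the prior of 2/3 alone does not deliver.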

    The relative frequencies with which our subjects play the unique separating
equilibrium are remarkably close to the relative frequencies with which Weber’s
subjects chose actions that are compatible with the separating equilibrium in his
learning setup.8 This seems to suggest that removing the equilibria in weakly
dominated strategies does not make a large difference for play. In the two-player
setting Weber observed 0.94 (ρ = 1, θ−i = O) and 0.70 (ρ = 1, θ−i = X) agreement,
while we observed 0.94 and 0.62. In the three-player setting Weber found relative
agreement fractions of 1.00 (ρ = 1, θ−i = OO), 0.65 (ρ = 1, θ−i = OX) and 0.57
(ρ = 1, θ−i = XX). Our relative frequencies were 0.92, 0.52, and 0.55.9 However,
due to the different subject pools and the multiplicity of equilibria in Weber’s
setup a meaningful direct comparison is not possible.

      7 This risk averse player is rational by playing this strategy, in the sense that choosing up
yields zero return compared with an expected return of -67 from choosing down. However, he
is not entirely rational, since he cannot iterate correctly and therefore does not play down in
period two.
      8 Note that given that we assume equilibrium play we cannot conclude that all subjects
Weber categorises as playing the separating equilibrium are actually doing so. The reason is
that they might play a mixed strategy equilibrium which resulted in actions that look like the
separating equilibrium.

    In what follows we analyse our data with respect to two questions:

   1. Does the number of players, for a given difficulty of the problem, have an
      influence on the frequency of equilibrium play?

   2. How does the level of difficulty impact on the frequency of collectively
      rational equilibrium play?

5.2     Influence of cohort sizes
Differences in group sizes may have an influence on whether subjects actually
choose the best response, regardless of the level of iteration needed. This is
because the theoretical best responses require to a certain extent the rationality
of all players and the common knowledge of this rationality. This really means
that all players understand how to play the game and know the best responses,
and they also know and believe that the other players also understand how to play
the game. The common knowledge argument breaks down when there is even a
slight hint of doubt among any one of the players about the ability of other group
members. Under this circumstance, players may believe with a certain probability
that the signaling from partners does not provide any new information, as the
signal might come from a player who is not able to iterate deeply enough. This
leads players to play rationalizable strategies instead of the equilibrium strategies.
So the more players are in the cohort the more likely it becomes that one of them
is not able to do the iteration and that the common knowledge assumption breaks
down.
    A two-sample Wilcoxon rank-sum (Mann-Whitney) test is used to examine
the cases where one and two steps of iteration are required.10 The null hypothesis
is that the frequency of best responses for a given number of steps of iteration is
the same for different cohort sizes n.
     9 Subjects in Weber’s study were either UCLA or Caltech graduate or undergraduate stu-
dents. We expected them ceteris paribus to do better than our students, as admission to these
universities should be more selective. So our prior that removing the multiple equilibria should
lead to a higher frequency of the separating equilibrium might still be true.
     10 Given ρ = true, one step of iteration is needed for subjects who have observed only type O
partners, i.e. θ−i = O for n = 2 and θ−i = OO for n = 3. Two steps of iteration are required
after having observed exactly one type X partner, i.e. θ−i = X for n = 2 and θ−i = OX for
n = 3.

           Table 4: Influence of group sizes on individual rationality
                            Situations by Levels of Iteration
                  1: No type X observed         2: One type X observed
          n      obs   ranksum   expected      obs    ranksum   expected
          2      123     10656      10578      391   137188.5     131376
          3       48      4050       4128      280    88267.5      94080
        total    171     14706      14706      671     225456     225456
       p-value            0.528                          0.006
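As a check on the "expected" columns of Table 4: under the null hypothesis, a group contributing n_a of the N pooled observations has an expected rank sum of n_a(N + 1)/2, a standard property of the rank-sum test. The sketch below recomputes those entries from the "obs" columns; the function name is illustrative.

```python
def expected_rank_sums(n_a, n_b):
    """Expected rank sums of two groups under the null of identical distributions."""
    pooled = n_a + n_b
    return n_a * (pooled + 1) / 2, n_b * (pooled + 1) / 2

# Observation counts taken from Table 4 (n = 2 group first, n = 3 group second).
one_step = expected_rank_sums(123, 48)
two_steps = expected_rank_sums(391, 280)

print(one_step)    # expected rank sums when no type X is observed
print(two_steps)   # expected rank sums when one type X is observed
```

The observed rank sums in the table are then compared with these benchmarks; the p-values themselves also depend on the tie structure of the raw data and are not recomputed here.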

    The results are tabulated in Table 4. They show that cohort sizes have an
influence on subjects’ ability to choose best responses. In cases when one step
of iteration is required (only type O’s are observed), subjects are able to deduce
their type without requiring any signals from others. There, consistent with
our hypothesis, the number of players makes no difference. On the other hand,
subjects become dependent on the signals from partners in order to deduce their
own type if they observe at least one other player being of type X. Then signaling
complicates the matter because players must also decide how to interpret the
signals. Players choose the predicted responses on the equilibrium path in later
periods if they believe others have obeyed strict dominance and have chosen
the predicted responses on the equilibrium path in earlier periods. Moreover,
in the three-player situations with more than one player of type X, players will
also have to believe that others know that they obey dominance themselves. A
second step of common knowledge of rationality is necessary. Only then can
the signals be interpreted as predicted in a Perfect Bayesian Equilibrium. This
chain of reasoning about whether one can believe that others play the predicted
responses and obey dominance does not allow for any doubts. However, a
subject may be cautious and may not believe that others obey dominance, or
may not believe that others believe that he himself obeys dominance, and so
on. Even believing others to be cautious may be a sufficient cause for deviating
from the equilibrium path. Worse still, only one player is required for causing this
deviation. Under these circumstances, rational players may play rationalizable
strategies instead of equilibrium strategies. So if the suspicion that others will
get it wrong is a strong driving force for equilibrium deviation, then we should
expect that deviation happens more often when more players are involved. This
is because players should then assign a higher subjective probability to at least
one of the other players not behaving according to dominance. Our statistical
test confirms this intuition. The number of correct deductions is greater in the
treatment with two players (p = 0.006) if reliance on the rationality of other
players is necessary but the difficulty stays the same.
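The group-size argument can be illustrated with a small calculation. Assume, purely for illustration, that a player believes each partner independently fails to follow dominance with the same probability q; neither the independence assumption nor any particular value of q comes from the data. The believed probability that at least one partner fails is then 1 - (1 - q)^(n - 1), which rises with the cohort size n.

```python
def prob_some_partner_fails(q, n_players):
    """Believed probability that at least one of the n - 1 partners fails.

    Assumes each partner fails independently with the same probability q.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be a probability")
    if n_players < 2:
        raise ValueError("a cohort needs at least two players")
    return 1.0 - (1.0 - q) ** (n_players - 1)

Q_HYPOTHETICAL = 0.2   # illustration only, not an estimate from the experiment

for n in (2, 3):
    print(n, round(prob_some_partner_fails(Q_HYPOTHETICAL, n), 3))
```

For any q strictly between 0 and 1 the three-player value exceeds the two-player value, which is the direction of the effect found in the test.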
    In summary, it seems reasonable to suggest that subjects choose the best
responses consistently regardless of different cohort sizes when signaling is not
required from partners for them to deduce their own type. However, once
signaling is involved, increasing the number of players in the cohort increases the
likelihood of subjects deviating from the equilibrium path.

                   Figure 1: Nested factors in the experiment

5.3    Panel-regression analysis
In this section we analyse the influence of the steps of iteration necessary for equi-
librium play on the fraction of observed equilibrium play and revisit the question
of the influence of the group size. For this purpose we use multilevel panel-data
analysis. Standard regression modeling usually assumes that the errors have zero
means and are mutually independent among observations. However, we would
expect residuals of the data collected from the experiment for individuals and
cohorts to be correlated. This is inherent in our experimental settings as the same
individuals generated data over 14 playing periods, while they also interacted
with the same other players within their cohorts 14 times. We use a multilevel
linear random intercept model as described in Rabe-Hesketh and Skrondal
(2005) to model the serial correlation structure inherent in our data.
    Note that each data point in the experiment is generated by individual j from
cohort k in the period i. The data fit into a nested hierarchical structure as shown
in figure 1: starting with occasions (or playing periods) at level 1, individuals at
level 2, and cohorts at level 3. In the random intercept model, we assume that
each level has its own noise that is independent from other levels. These noise
components, which exist in addition to the conventional error term, are identical
for all observations belonging to the same cluster on the level in question. The
model has three level-specific random intercepts: the within-subject noises for
each observation $\epsilon_{ijk}$, the between-subject noises $\zeta^{(2)}_{jk}$, and the
between-cohort noises $\zeta^{(3)}_{k}$.
    We are interested in the likelihood of subjects making decisions that match
our theoretical rational responses. Our proposed linear predictor uses the logistic
distribution as the link function. We estimate the following multilevel random
effect logit model with m covariates:

\[
\operatorname{logit}\{\Pr(y = 1 \mid x_{ijk})\} = \beta_0 + \beta_1 x_{1ijk} + \cdots + \beta_m x_{mijk} + \zeta^{(2)}_{jk} + \zeta^{(3)}_{k} + \epsilon_{ijk}
\]
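To make the nested structure concrete, the following sketch simulates binary outcomes for one cohort with a cohort-level and a subject-level random intercept. It is a simplified illustration, not the estimation procedure: it assumes normally distributed random intercepts, drops the covariates, and lets the occasion-level noise enter through the Bernoulli draw. All parameter values and function names are hypothetical.

```python
import math
import random

def inv_logit(z):
    """Map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def simulate_cohort(rng, beta0, n_subjects, n_periods, sd_subject, sd_cohort):
    """Draw 0/1 outcomes for one cohort under a random-intercept logit."""
    zeta_cohort = rng.gauss(0.0, sd_cohort)        # shared by the whole cohort
    outcomes = []
    for _ in range(n_subjects):
        zeta_subject = rng.gauss(0.0, sd_subject)  # shared by one subject's periods
        p = inv_logit(beta0 + zeta_subject + zeta_cohort)
        # Occasion-level randomness: one Bernoulli draw per playing period.
        outcomes.append([1 if rng.random() < p else 0 for _ in range(n_periods)])
    return outcomes

rng = random.Random(2007)  # fixed seed so the illustration is repeatable
cohort = simulate_cohort(rng, beta0=1.0, n_subjects=2, n_periods=14,
                         sd_subject=1.3, sd_cohort=0.1)
for subject, choices in enumerate(cohort, start=1):
    print(f"subject {subject}: {sum(choices)} of {len(choices)} equilibrium responses")
```

Because each subject keeps the same intercept across all 14 periods, simulated choices are correlated within a subject, which is the dependence the multilevel model is meant to capture.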

   We have obtained some individual characteristics of the participating subjects
through a questionnaire at the end of the experimental sessions. The
questionnaire asked for gender, age group, and courses attended at university. We
incorporate this information as covariates in our analysis. The regression includes
the following dependent and independent variables.

IR is the dichotomous dependent variable that indicates whether a subject played
     equilibrium responses within a game. It should be noted that an equilibrium
     response does not have to be on the equilibrium path, as other players
     in the group may have deviated. Note that an equilibrium response is a
     collection of sequentially rational actions with beliefs that are consistent with
     equilibrium play of the group members. So this variable measures the joint
     construct of individual rationality and the belief in the rationality of the
     other players. Only for the decision in the first stage of a game does the
     belief in the rationality of the group members not play a role.

 steps# is a set of dummy variables that characterise the particular situation
     the individual is in. A situation is defined by two things: the number of
     steps of iteration necessary; and the number of people in the group. In
     the regression, the variables are step1, step2, nstep0, nstep1, nstep2 and
     nstep3, where the number denotes the steps of iteration necessary, while a
     leading n indicates that it was a group with three subjects. The omitted
     baseline dummy is the 0-step (i.e. ρ = false) situation in a group with
     two subjects. Combining the number of group members and the iteration
     level in one battery of dummy variables avoids the problems of interpreting
     interactions in non-linear models.

courses are a set of dummy variables for different degrees subjects are enrolled
     in. The categories are given by: Arts, Commerce, Economics, Engineer-
     ing&Science, Finance, and the omitted baseline Other.

gender is a dummy variable that represents the sex of the subject. The baseline
    is female.

maturity is a dummy variable indicating whether the subject is over 25. The
    baseline is under 25.
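The coding of the situation dummies can be sketched as a small function. It assumes that, when ρ = true, the number of iteration steps equals the number of type X partners observed plus one; this matches the one-step and two-step cases described for Table 4 and is an inference for the three-step case. The function name and the label "step0" for the omitted baseline are illustrative.

```python
def situation_dummy(n_players, announcement, observed_types):
    """Return the situation label, e.g. 'step2' or 'nstep3'.

    n_players      -- cohort size, 2 or 3
    announcement   -- True if at least one player is of type X (rho = true)
    observed_types -- types of the other group members, e.g. ["O", "X"]
    """
    if n_players not in (2, 3):
        raise ValueError("the experiment used cohorts of two or three players")
    if len(observed_types) != n_players - 1:
        raise ValueError("a player observes every other group member")
    x_seen = sum(1 for t in observed_types if t == "X")
    if not announcement and x_seen > 0:
        raise ValueError("rho = false is inconsistent with observing a type X")
    # Assumption: with rho = true, steps needed = observed X partners + 1.
    steps = x_seen + 1 if announcement else 0
    prefix = "n" if n_players == 3 else ""
    return f"{prefix}step{steps}"

print(situation_dummy(2, False, ["O"]))        # omitted baseline
print(situation_dummy(2, True, ["X"]))
print(situation_dummy(3, True, ["X", "X"]))
```

Folding group size and iteration depth into one label mirrors the single battery of dummies used in the regression.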

    The results of the logit regression are shown in Table 5. We report the odds
ratio parameter estimates, where a coefficient of 1 signals no impact, and also
the marginal effects. The marginal effects are calculated with respect to the two
sets of dummy variables. So the marginal effects for the set of dummy variables
that describe the situation are with respect to a change from the baseline, which
is the two-player situation with ρ = false, for an average subject.11 It is obvious
that one step of iteration does not pose a more difficult problem than the trivial
situation of ρ = false. The situations nstep1 and step1 do not have a statistically
different impact than the baseline or nstep0. This is confirmed by Wald tests on
the odds ratios and the marginal effects.

           Table 5: Regression results

Dependent variable: IR (rational choice)
                            e^β         marg. eff.
Situation dummies (base is step0)
  nstep0                   0.306          -0.058
                          (0.18)          (0.27)
  nstep1                   0.596          -0.018
                          (0.54)          (0.56)
  nstep2                   0.032***       -0.445***
                          (0.00)          (0.00)
  nstep3                   0.042***       -0.377***
                          (0.00)          (0.00)
  step1                    0.870          -0.004
                          (0.84)          (0.84)
  step2                    0.062***       -0.288***
                          (0.00)          (0.00)
Degree dummies (base is other)
  Economics                4.864           0.372
                          (0.15)          (0.14)
  Arts                     5.131           0.382
                          (0.21)          (0.18)
  Finance                  1.907           0.158
                          (0.62)          (0.61)
  Commerce                 7.538*          0.45
                          (0.07)          (0.07)
  EngSci                   3.700           0.315
                          (0.24)          (0.22)
male                       1.255           0.045
                          (0.56)          (0.58)
maturity                   0.981          -0.004
                          (0.98)          (0.98)
Constant                   5.793
ρsubject                   0.356***
ρgroup                     0.001
Observations               1260
*** p<0.01, ** p<0.05, * p<0.1; p values in parentheses

    A large drop in agreement with the game theoretical prediction happens when
two steps of iteration are necessary. For an average subject in the two-player game
the probability drops by 0.287 compared to the baseline case. In the three player
situation the drop is even greater (0.445). The second step of iteration seems to
be the main hurdle. Subjects who can do two steps of iteration are able to do the
third step, too. Between the second and the third step of iteration no additional
drop in the probability of agreement is found. The impact of the additional
step (from two to three) is not significant (p > 0.15, Wald test on the marginal
effect). So it seems to be the rule that subjects either can do no more than one
step of iteration or are able to go all the way and perform the maximum steps of
iteration, which is three.
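As a reading aid for the odds ratios in Table 5, the sketch below shows how an odds ratio translates into a change in probability for a given starting probability. The baseline probability used here is a hypothetical illustration value, not an estimate from the regression, so the output does not reproduce the reported marginal effects, which are evaluated for the average subject. The function name is illustrative.

```python
def apply_odds_ratio(baseline_prob, odds_ratio):
    """Probability obtained after multiplying the baseline odds by odds_ratio."""
    if not 0.0 < baseline_prob < 1.0:
        raise ValueError("baseline_prob must lie strictly between 0 and 1")
    if odds_ratio < 0.0:
        raise ValueError("an odds ratio cannot be negative")
    odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)

HYPOTHETICAL_BASELINE = 0.90   # illustration only, not taken from Table 5
STEP2_ODDS_RATIO = 0.062       # reported odds ratio for step2

for label, ratio in [("odds ratio of 1", 1.0), ("step2", STEP2_ODDS_RATIO)]:
    p = apply_odds_ratio(HYPOTHETICAL_BASELINE, ratio)
    print(f"{label}: {p:.3f} (change {p - HYPOTHETICAL_BASELINE:+.3f})")
```

An odds ratio of 1 leaves the probability unchanged, while an odds ratio far below 1, as estimated for the two-step and three-step situations, lowers it substantially.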
    Returning to the discussion in the previous section, we can confirm that the
number of subjects in a group reduces the probability of a predicted equilibrium
choice if the reliance on the rationality of the group members is necessary. Com-
paring the marginal effect of the two and three player situations with two steps of
iteration shows that the subjects are less likely to behave as predicted by Perfect
Bayesian Equilibrium if they are in a three-player group (p < 0.03, one-sided
Wald test). In this case, the belief in the rationality of the other player(s) is
necessary, and being grouped with two players makes it more likely that at least
one of them does not follow the equilibrium logic. So a player who has to rely
on the rationality of two other players might be less inclined to play according
to the common rationality assumption than a player who only has to rely on one
other group member. This interpretation is further supported by the fact that
we do not find differences for the number of group members in the situation with
zero or one step of iteration, where the best response does not depend on the
behaviour of the group member(s) (p > 0.27 and p > 0.64, Wald tests on the
marginal effects).
    Our additional control variables – the degrees the students are studying for,
gender, and age – don’t have clear effects. The only weakly significant influence
within these variables is a better performance of commerce students. This is
not really surprising, as commerce is among the degrees with the highest high
school marks necessary for admittance. A closer look at the estimates shows
that the coefficients and marginal effects for different courses are relatively high,
without being statistically significant. The reason is twofold. Firstly, the variance
of agreement within the subjects in one category is considerable, which hints at
some heterogeneity within these groups. Secondly, this heterogeneity is accounted
for by our individual random effect, which is highly significant. The agreement
is highly correlated within subject across the 14 separate Dirty Faces Games
(ρsubject = 0.356), which illustrates that some subjects are smarter than others.

     11 The average subject is a hypothetical composite subject, where all the dummies not de-
scribing the situation are set to their averages.

5.4    Bounded rationality or lack of common knowledge?
The results hint at two potential reasons why in configurations with two and three
steps of iteration much less equilibrium play is observed than in configurations
where zero or one step is necessary. The significant drop of equilibrium play
between one and two steps coincides with the point where reliance on the rationality
of others becomes necessary. Additionally, the rate of equilibrium play is
significantly lower in the three-player game than in the two-player game when two steps
of iteration are necessary. This confirms that the lack of common knowledge of
rationality is an important factor for reducing the rate of equilibrium play. Not
to rely on the group member(s) getting it right can be highly rational. What
about bounded rationality itself? Can we conclude from our data that people are
actually not able to perform the necessary depth of iteration? Or is it possible
that all the deviation stems from the lack of belief in the ability of the group
members? We believe that there is some evidence for bounded rationality as
well. A player who is able to perform two iterations and who anticipates that the
others can do at least one, should play according to the equilibrium prediction in
a situation where two steps are required. Given that almost all individuals were
able to do one step of iteration, the sharp drop of individual equilibrium play
between one and two iteration steps hints at an influence of limited iterative
ability. Moreover, there are two different patterns of deviations from the equilibrium
path: a) not playing down even if the type should be known; b) playing down
without being able to know the type. Pattern a) is rationalizable, while b) is not.
So playing according to pattern a) can be either due to bounded rationality or
due to doubts about the ability of the other player, while playing according to
pattern b) can only be a consequence of the lack of iterative ability. In the two
treatments 6.1 percent (2-player) and 6.5 percent (3-player) of all play followed
this pattern, which can only be caused by limited iterative ability.

6     Conclusion
In this study we revisited the Dirty Faces Game in order to investigate the
ability to iterate and the common knowledge thereof in humans. We modified the
setup of Weber (2001) slightly by adding a penalty for delaying the announcement
of types, once they are known. This modification ensured that the game
has a unique Perfect Bayesian Equilibrium, which enabled us to properly isolate
behaviour derived from correct iteration and common knowledge of rationality
from off-equilibrium play.
     We found that there is a threshold between one and two steps of iteration,
where individual behaviour in many cases ceases to follow common rationality.
This failure may stem either from a limited ability to perform the necessary
iterations or from the breakdown of common knowledge of rationality. A person
is either not able to perform the computational steps necessary or does not believe
that his fellow group members are able to perform the iterations. For equilibrium
play to occur, a player has to believe that the other group members are at least
able to perform n−1 steps if he himself requires n steps to get to the equilibrium.
We found evidence that doubts about the ability of the group members have some
influence. More players in a group led to less individual behaviour consistent
with equilibrium, in cases where profitability of equilibrium behaviour depended
on the other players’ individual rationality. The number of players in a group
had no influence in situations where players did not have to rely on their group
members being rational. On the other hand we also found some evidence for
limited iteration ability, as we observed a significant amount of non-rationalizable
play.
     We conclude that the sharp drop of play following the common rationality
assumption between one and two steps of iteration is jointly caused by a) the
bounded rationality of individuals and b) the breakdown of the common knowledge
of rationality. This result suggests that some further research should be
devoted to learning more about the relative importance of these two factors for
deviations from equilibrium play.

References

Bosch-Domenech, A., J. G. Montalvo, R. Nagel, and A. Satorra (2002). One, two,
  (three), infinity, ... : Newspaper and lab beauty-contest experiments. The
  American Economic Review 92 (5), 1687–1701.

Camerer, C. F. (2003). Behavioral Game Theory: Experiments in Strategic In-
  teraction. Princeton University Press.

Duffy, J. and R. Nagel (1997). On the robustness of behaviour in experimental
  ‘beauty contest’ games. Economic Journal 107 (445), 1684–1700.

Fey, M., R. D. McKelvey, and T. D. Palfrey (1996). An experimental study
  of constant-sum centipede games. International Journal of Game Theory 25,

Fischbacher, U. (2007). Z-tree - Zurich toolbox for readymade economic experi-
  ments. Experimental Economics forthcoming.

Ho, T.-H., C. Camerer, and K. Weigelt (1998). Iterated dominance and iterated
  best response in experimental “p-beauty contests”. The American Economic
  Review 88 (4), 947–969.

Littlewood, J. E. (1953). A Mathematician’s Miscellany. London: Methuen &
  Co. Ltd.

McKelvey, R. D. and T. D. Palfrey (1992). An experimental study of the centipede
 game. Econometrica 60, 803–836.

Nagel, R. (1995). Unraveling in guessing games: An experimental study. The
  American Economic Review 85 (5), 1313–1326.

Rabe-Hesketh, S. and A. Skrondal (2005). Multilevel and Longitudinal Modeling
  Using Stata. STATA Corporation.

Rubinstein, A. (1989). The electronic mail game: Strategic behavior under
  “almost common knowledge”. The American Economic Review 79 (3), 385–391.

Weber, R. A. (2001). Behavior and learning in the dirty faces game. Experimental
 Economics 4 (3), 229–242.

A     Instructions for n = 2

Welcome to our experiment. Please read these instructions carefully. Under-
standing the instructions is crucial for earning money.
    This is an experiment in decision-making. You will be paid for your par-
ticipation. The exact amount you will receive will be determined during the
experiment and will depend on your decisions. This amount will be paid to you
in cash after the conclusion of the experiment. If you have any questions during
the experiment, raise your hand and the experimenter will assist you. It is strictly
forbidden to talk, exclaim or to communicate with other participants during the
experiment. It is very important for us that you obey these rules. Otherwise the
data generated in this session are useless.
    In this experiment, you will play a series of 14 identical games in which you
can earn or lose money based on your choices. You start with an endowment of
AUD 9. Wins and losses during the 14 games will be added to or deducted from
this endowment. At the end of the experiment you will be paid the resulting
amount in cash.
    You are paired with one other participant (called partner) throughout all the
14 games. You will not know the identity of this other person, either during or
after the experiment, just as the other person does not know your identity.


At the start of each of the 14 games, the computer will randomly draw a type
for you and a type for the person you are paired with. The possible types are
“X” and “O”. The computer always draws from an urn with two balls of type
“X” and one ball of type “O”. So the probability that you are of type “X” is 2/3
while the probability that you are of type “O” is 1/3. Note that the draw for
each person will be from a different urn. This means that the likelihood of you
being of a certain type does not depend on what type the other person is.


Each participant will only be told the type of the partner, but not his/her own
type. So you will know the type of the person you are paired with, but not your
own type. Your partner will know your type, but not his/her own. Additionally,
you will be told if at least one person (you and/or your partner) is of type “X”.
Below you have an example of what the information you will get may look like.
In this case you see that at least one person is of type “X” and that your partner
is of type “O”.

             Decisions (maximum of two rounds per game)

    Round 1
After you have seen the type of the other player in your group and the information
whether at least one player (you and/or your partner) is of type “X” you are asked
to choose one of two actions: “Up” or “Down”. The combination of your type
and your decision will determine how much money you earn. Note that your
payoff depends only on your type and not on the type of your partner. The
money you earn or lose is determined in the following table:

                                            Your Type
                                       “X”              “O”
                 Your   “up”             0                0
                choice “down”      win 100 cents   lose 400 cents

  1. If you choose “Up” your current earnings will not change.
  2. If you choose “Down” and your type is “X” one dollar is added to your
     account.
  3. If you choose “Down” and your type is “O” then four dollars will be
     deducted from your account.

Note again that the type that determines the payoffs is your type and not the
type of your partner. An example of a decision screen is shown below. The
payoffs in the table below are given in cents.

    After you have made your decision the following happens: If either you or your
partner has chosen “Down” the current game ends. Your payoff will be calculated
and shown on the screen. Then a new game begins with a new draw of the types.
However, if both you and your partner have decided to play “Up” the game enters
a second round.
    Round 2 (only if both players chose “Up” in round 1)
Round two of the game practically works the same way as round one does. Note
that you and your partner keep the types that were drawn before decision round
one.
    The only difference in round two is that the payoffs for choosing down are
multiplied by a factor of 0.8. The payoffs in round two are the following:
                                          Your Type
                                      “X”           “O”
                 Your   “up”           0              0
                choice “down”     win 80 cents lose 320 cents
So now you win 80 cents (instead of 1 Dollar in round 1) if you are of type “X”
and you choose “Down”. If you choose “Down” and it turns out that you are of
type “O” you lose 320 cents (instead of 4 Dollars in round 1). Choosing “Up”
once again does not cause any gains or losses regardless of your type.
   An example of the decision screen is given below:
   After round 2 the game ends regardless of the actions previously taken. Your
payoff will be calculated and shown on the screen. Then a new game starts with
a new draw of types (as explained above).

   In total you will play 14 of these games. At the end of the experiment (after
14 games) you will be given a little questionnaire where you have to fill in your
details. The questionnaire is only used to make sure that you get the money you
have earned.
   Thank you very much for your participation.

