The University of Adelaide School of Economics
Research Paper No. 2007-01

The Dirty Faces Game Revisited

Ralph-C. Bayer* and Mickey Chan

February 15, 2007

Abstract

Weber (2001) uses the Dirty Faces Game to examine the depth of iterated rationality. Weber does not consider equilibria that contain weakly dominated actions, so he implicitly assumes that it is common knowledge that no one ever uses weakly dominated actions. We show that allowing for equilibria in weakly dominated strategies greatly extends the set of potentially rational actions. The original game therefore lacks discriminatory power, as many actions categorised as irrational by Weber can actually be part of an equilibrium strategy. We slightly modify the payoff structure and establish strict dominance, which leads to a unique equilibrium. The resulting dominance-solvable game is implemented in an experiment. We find that subjects are either able to iterate right to the equilibrium or fail to do so when two or more steps of iteration are necessary. Virtually all subjects were able to do one step of iteration. Further, we find evidence that the lack of confidence in other players’ iterative abilities induces deviations from equilibrium play.

Keywords: Game Theory, Iterative Reasoning, Experimental Economics.
JEL Classification Numbers: C91, C92, C72.

* Corresponding author: Department of Economics, Room 126, Napier Building, Adelaide 5005, Australia. Phone: +61(8) 8303 4666, Fax: +61(8) 8223 1460. email: ralph.bayer@adelaide.edu.au

1 Introduction

In the Dirty Faces Game, originally developed by Littlewood (1953), common knowledge of rationality is necessary for players to behave according to the Bayesian Nash equilibrium. In different configurations of the game, different levels of iterated rationality and common knowledge are necessary.
Therefore the game can be used to measure the level of iterated rationality in humans with the help of laboratory experiments.

A particularly interesting feature of the Dirty Faces Game is that a higher level of iterative reasoning is beneficial to all players. In other games commonly used to measure the level of iteration,[1] such as beauty-contest and centipede games (e.g. McKelvey and Palfrey, 1992; Fey et al., 1996), it is best for an individual player to iterate one level more deeply than the opponent(s). In contrast, the Dirty Faces Game is ideal for testing how many levels of iteration a player can do: in a situation where n steps of iteration are necessary, a player who believes that the other players can do at least n − 1 steps of iteration has no incentive to deviate from equilibrium play.

The basic idea of the game is the following. Players interact in a group. All players observe the type of all other players (a dirty face or a clean face) but not their own type. Then, given the common knowledge that at least one player has a ‘dirty face’, the objective is to find out as quickly as possible whether one's own face is dirty. Clearly, one level of iteration is necessary if a player observes no dirty faces: given that it is common knowledge that there is at least one dirty face, she should know that it is her face that is dirty. In a second step, a player who sees that one other player has a dirty face can infer the state of her own face from the reaction of this player. If this player announces that he has a dirty face, then she knows that her own face cannot be dirty, as this player must have seen only clean faces in order to reach this conclusion. Seeing only one person with a dirty face, who does not announce ‘my face is dirty’, should lead to the conclusion that this other person must have seen at least one other person with a dirty face. Since the only dirty face she sees is that of this particular person, she can conclude that she must have a dirty face, too.
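The chain of inferences described above can be reproduced with a small possible-worlds sketch in Python. This is an illustration and not part of the paper's model: the encoding (1 = dirty, 0 = clean) and the function names `knows_dirty` and `first_announcement` are assumptions made here, and every player is assumed to reason perfectly and to announce as soon as she is certain.

```python
from itertools import product


def knows_dirty(world, i, worlds):
    """True if player i, in `world`, can be sure that her own face is dirty."""
    # Worlds i cannot rule out: same faces as `world` for everyone except i.
    candidates = [v for v in worlds
                  if all(v[j] == world[j] for j in range(len(world)) if j != i)]
    return all(v[i] == 1 for v in candidates)


def first_announcement(actual):
    """Return (stage, players) for the first stage at which someone is certain."""
    if not any(actual):
        raise ValueError("the sketch assumes at least one dirty face")
    n = len(actual)
    # Common knowledge: at least one face is dirty.
    worlds = {v for v in product((0, 1), repeat=n) if any(v)}
    for stage in range(1, n + 1):
        announcers = [i for i in range(n) if knows_dirty(actual, i, worlds)]
        if announcers:
            return stage, announcers
        # Silence is informative: drop every world in which somebody
        # would already have been certain and would have announced.
        worlds = {v for v in worlds
                  if not any(knows_dirty(v, j, worlds) for j in range(n))}
    return None, []


print(first_announcement((1, 0)))     # one dirty face
print(first_announcement((1, 1)))     # two dirty faces
print(first_announcement((1, 1, 1)))  # three dirty faces
```

Under these assumptions a dirty-faced player becomes certain at the stage equal to the number of dirty faces she observes plus one, which is the pattern the text describes.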
Two steps of iterated rationality are necessary in these two cases, where a player observes one dirty face. The same logic applies to situations where more than one dirty face is observed. The depth of iteration necessary is always the number of dirty faces observed plus one. For a player to be able to deduce her type properly, the following conditions have to be satisfied. First, she has to be able to perform the iterated reasoning described above. Secondly, all the other players also have to be able to perform the necessary steps of iteration. Thirdly, the ability of the players to perform the necessary steps of iteration has to be common knowledge.

[1] See for example the beauty contest in Nagel (1995), Duffy and Nagel (1997), Ho et al. (1998), and Bosch-Domenech et al. (2002), or the email game developed by Rubinstein (1989) and experimentally examined in chapter 5 of Camerer (2003).

Weber (2001) adapted the Dirty Faces Game for the laboratory with the aim of testing the depth of commonly known rational iteration in humans. In what follows we show that Weber’s implementation requires an extreme assumption: players do not play weakly dominated strategies and this is common knowledge among all players. By implicitly concentrating on the equilibrium in undominated strategies, Weber excludes a continuum of Perfect Bayesian Nash Equilibria that include weakly dominated actions. In doing so, he categorises some equilibrium play as non-rational. More specifically, Weber imposes the restriction that people have to believe that others immediately announce their type once they know it, even if in equilibrium the other players are indifferent between doing so and waiting. We modify Weber’s setup such that all the equilibria in weakly dominated strategies disappear and only a unique equilibrium remains.
Using this setup, we can use a laboratory experiment to investigate the level of commonly known iteration without having to rely on the implicit assumption that weak dominance is commonly known to be obeyed by all players. We find that the frequency of equilibrium play is roughly compatible with Weber’s findings. However, a direct comparison is not the aim of this paper, as we cannot control for subject-pool effects. We concentrate on the question of when and why people deviate from equilibrium play. For this purpose, we use an econometric panel model. Our main finding is that almost all people are able to do one step of iteration, while in many groups the iterative chain breaks down when two or more levels of iteration are necessary. However, if a subject is able to do two levels of iteration he or she can usually do three levels as well. Furthermore, we find evidence that doubts about whether other players are able to iterate correctly explain some off-equilibrium play. In larger groups individuals play according to equilibrium less often, even if the same level of iteration is necessary for doing so, since the perceived likelihood that at least one other player may not be able to perform the necessary iteration rises.

The remainder of the paper is organised as follows. In the next section we explain our version of the Dirty Faces Game. In Section 3 we show that our modification removes the unwanted equilibria from Weber’s model and leads to a unique equilibrium, which can be reached by iterated deletion of strictly dominated strategies. Our experimental design is described in Section 4. The results are reported in Section 5. Section 6 concludes.

2 The Dirty Faces Game with discounting

In this section we develop our specification of the Dirty Faces Game. Our version has the same structure as Weber’s game. However, our game will allow for discounting. The n-player Dirty Faces Game proceeds as follows:

1. There are n players, i = 1, ..., n.
   Player −i denotes the partner(s) of player i. We call the collection of all players i ∈ {1, ..., n} in a game a cohort.

2. Nature draws a type θ_i ∈ {O, X} for each player; the draws are identically distributed and independent across players. The commonly known probabilities are 1 − p for type O and p for type X, respectively. Being of type X can be seen as having a dirty face.

3. An announcement takes place, which makes it common knowledge among the players whether there is at least one type X player in the cohort. The announcement is denoted by the boolean indicator variable ρ ∈ {true, false}, and declares whether:
   (a) no one has drawn type X, ρ = false, or
   (b) there is at least one type X player, ρ = true.

4. The players observe the types of their partners θ_−i, but cannot observe their own type θ_i.

5. The players make decisions and these are evaluated in the following sequence:
   Stage counter: Starting with t = 1.
   Decision stage: Each player chooses an action, either up or down, where down corresponds to saying ‘I have a dirty face’, while up corresponds to ‘I don’t know’.
   Evaluation stage: The game ends if either
   (a) any player has chosen down, or
   (b) n stages have passed, i.e. when t = n.
   Otherwise, the game continues with players returning to the Decision stage. The stage counter t is advanced by one and the players learn the actions taken by all players in the previous stage.

6. Payoffs are realised. The payoffs u_i(a_i, θ_i) depend on the action and the type of the player. Furthermore, payoffs are discounted with discount factor δ per period. When down has been chosen, a type X player receives α, and a type O player receives −β.
   Including discounting we have the following payoffs:

   u_i(down, X) = δ^(t−1) α
   u_i(down, O) = −δ^(t−1) β

   On the other hand, the player receives zero payoff whenever choosing up, regardless of his type and the period in question:

   u_i(up, θ_i) = 0

So players who correctly infer that they have a dirty face will be rewarded, while players who wrongly claim to have a dirty face are penalised. For our purposes, it is important to make sure that players will not find it attractive to gamble if they do not have information in addition to the prior. The following condition makes sure that it is sequentially rational to choose up given that the prior beliefs are held:

   pα − (1 − p)β < 0    (1)

This condition ensures that the expected payoff of choosing down is negative if subjects have no further information than the prior probability for type X. It follows that the action down (claiming to have a dirty face) is strictly dominated by playing up (admitting not to know) when the prior beliefs are held. However, this is not sufficient to render the game dominance solvable, as will be shown below. We need a further assumption on δ. It will turn out that in Weber’s formulation without discounting, where δ is equal to one, dominance solvability breaks down, while for δ ∈ (0, 1) iterated deletion and correct updating lead to a unique Perfect Bayesian Equilibrium. The payoffs for different actions given a particular type are summarised in Table 1.

Table 1: Payoffs in the discounted Dirty Faces Game

| Own action | Own type X | Own type O |
|---|---|---|
| up | 0 | 0 |
| down | δ^(t−1) α | −δ^(t−1) β |

In the decision stage, the basis on which players choose a particular action is their beliefs. The belief of player i, µ_i, is his believed probability of being of type X, i.e. having a dirty face, which has all the properties of a standard probability measure. The action and the belief of player i at stage t, given the observation of the partners’ types and the announcement, are denoted by a_i^(t)(θ_−i, ρ) and µ_i^(t)(θ_−i, ρ), respectively.
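The payoffs in Table 1 and the role of the belief µ_i can be made concrete with a short Python sketch. The function names are illustrative assumptions; the numerical values (α = 100, β = 400, p = 2/3, δ = 0.8) are the ones the paper reports later for its experiment. The threshold β/(α + β) is not stated in the paper; it simply follows from setting the expected payoff of down to zero.

```python
def payoff(action, own_type, t, alpha, beta, delta):
    """Payoff u_i(a_i, theta_i) when the action is taken at stage t (Table 1)."""
    if action == "up":
        return 0.0
    discount = delta ** (t - 1)
    return discount * alpha if own_type == "X" else -discount * beta


def expected_down(mu, t, alpha, beta, delta):
    """Expected payoff of down for a player who believes P(own type = X) = mu."""
    return (mu * payoff("down", "X", t, alpha, beta, delta)
            + (1 - mu) * payoff("down", "O", t, alpha, beta, delta))


def gamble_threshold(alpha, beta):
    """Belief above which down has a positive expected payoff."""
    return beta / (alpha + beta)


# At the prior p = 2/3 the expected payoff of down is negative (condition (1)).
print(expected_down(2 / 3, 1, 100, 400, 0.8))
# A player who is certain of being type X gains from down, even one stage late.
print(expected_down(1.0, 2, 100, 400, 0.8))
print(gamble_threshold(100, 400))
```

With these values a player needs a belief above 0.8 before down pays in expectation, so the prior of 2/3 alone never justifies it.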
We omit the history of the game as it is perfectly determined by t and the fact that the game is still going. For example, a_i^(2)(θ_−i, ρ) contains the fact that all players i ∈ {1, ..., n} must have played a_i^(1)(θ_−i, ρ) = up.

A pure strategy is a profile of actions, one for every information set reached. Thus a pure strategy can be expressed by an action vector a that contains an action for each possible information set, i.e. for all situations (θ_−i, ρ) and periods t. In an equilibrium we also need to specify a belief vector µ for each player. This vector, which is sometimes called an assessment, assigns a believed probability of being of type X to every information set. As the Dirty Faces Game is a Bayesian game, strategies and belief vectors for all players are necessary to describe an equilibrium.

3 Equilibrium

In what follows we will informally characterise the Perfect Bayesian Equilibria of the game for two players depending on the discount factor δ. The logic easily extends to any number of players. In a Perfect Bayesian Equilibrium, all actions have to be sequentially rational given own beliefs and the strategies of all other players. Additionally, the beliefs have to be consistent with equilibrium actions and prior beliefs. Where possible, Bayes’ rule is used for updating.

Suppose that both players have drawn type O. Then the announcement will be that no player is of type X, i.e. ρ = false. Without any further iteration necessary both players should see that their only sequentially rational actions are choosing up twice: a_i^(1)*(O, false) = up and a_i^(2)*(O, false) = up ∀i. This is part of any equilibrium. The more interesting case is the one where at least one player is of type X. Given that this is the case the announcement will be ρ = true.

Proposition 1.
For n = 2 and δ ∈ (0, 1] there exists an equilibrium, which ∀i contains the actions

   a_i^(1)*(O, true) = down
   a_i^(2)*(O, true) = down
   a_i^(1)*(X, true) = up
   a_i^(2)*(X, true) = down

Proof. Correct updating requires that θ_−i = O and ρ = true implies µ_i(O, true) = 1. So playing down yields an expected payoff of α in t = 1 and δα in t = 2. The maximum deviation payoff from delaying playing down is δα. As δ ≤ 1, delaying is never a profitable deviation in this case. Correct belief formation requires that µ_i^(1)(X, true) = p, as no updating is possible. So a_i^(1)*(X, true) = up follows from the non-gambling condition in (1). For the given equilibrium strategy µ_i^(2)(X, true) = 1 is implied, because the other player’s action up in t = 1 is only consistent with θ_i = X.

The equilibrium above, which we will refer to as “separating” in what follows, since the different types play different strategies, is the only equilibrium Weber considered. This equilibrium makes sense in that a player who sees only an opponent of type O, but knows that there is at least one type X, concludes that it must be her, which leads to an immediate cashing in with the action down. A player who knows that all players are able to make this one step of iteration will be able to make a second step of inference: If I see a type X playing up in the first period, this player must have seen me being of type X. Otherwise she would have played down. So I must be of type X.

We also see that this equilibrium hinges crucially on a player choosing down as soon as he or she has inferred being of type X. Unfortunately, this is only weakly dominant if δ = 1. So in this case the dominance solvability, which is highly desirable for measuring common rationality, breaks down. Even worse, there are many other equilibria for Weber’s case of δ = 1. Therefore Weber characterises some behaviour as irrational which can actually be equilibrium play. The following proposition shows this.
Let us denote a mixed behavioural strategy that gives the probability of player i choosing down in period t given θ_−i and ρ as σ_i^(t)(θ_−i, ρ).

Proposition 2. For n = 2 and δ = 1 there exists a continuum of equilibria, where σ_i^(1)*(O, true) ∈ [0, 1] ∀i.

Proof. Mixing in period t = 1 for given θ_−i = O and ρ = true is sequentially rational if the expected payoffs for playing up and down are identical. Knowing that θ_i = X yields α for playing down. Playing up yields 0 if the other player plays down in period t = 1. However, if the other player plays up in period t = 1 then playing down in period t = 2 yields δα = α for δ = 1. So playing up in period t = 1 gives the same payoff if the other player plays up with probability 1 in the first period. Condition (1) ensures that playing down is strictly dominated for the other player in period one.

In Weber’s formulation, delaying playing down if the type is known is only weakly dominated. Since a player who delays ending the game by choosing up does not have to fear the other player jumping in and ending the game, delaying is part of an equilibrium strategy. This is the case because there are no costs of delaying. Consequently, depending on the equilibrium probability σ_i^(1)*(O, true), the other player −i may choose up or down in period t = 2 after observing θ_−i = X and ρ = true.[2] Since we cannot infer mixing probabilities from a few individual choices, laboratory experiments with Weber’s payoff settings have the problem that they only have discriminatory power if the assumption holds that weakly dominated equilibrium strategies are never played and that this is common knowledge.

[2] The condition for a risk-neutral player −i to choose down in such a situation is σ_i^(1)*(O, true) ≤ pα / ((1 − p)β).

We introduce a waiting cost in the form of a discount factor δ ∈ (0, 1). This forces a rational player to end the game immediately when she can infer that her type is X. Therefore, the continuum of equilibria in mixed strategies and all its resulting asymmetric equilibria vanish. With such a setup we are able to properly discriminate between commonly rational behaviour and iteration failure in a laboratory experiment.

Proposition 3. For n = 2 and δ ∈ (0, 1) the “separating equilibrium” from Proposition 1 is unique.

Proof. The situation θ_−i = O and ρ = true now requires playing down with probability one for sequential rationality, as playing up is strictly dominated. The payoff from playing down is α, which is greater than the maximum payoff that can be achieved by playing up, which is δα. The beliefs for all other situations are pinned down as described in the proof of Proposition 1, which only allow for the described actions to be sequentially rational.

The logic easily extends to games with more than two players. The unique equilibrium strategy for a particular player with δ ∈ (0, 1) can be characterised as:

1. Always play up if ρ = false.
2. If ρ = true, play up if t − 1 is smaller than the number of players you observe being of type X; play down otherwise.

In a setting without discounting, δ = 1, the class of equilibria is quite large. The fact that players are indifferent in equilibrium between playing down immediately once they know their type and delaying gives rise to many different equilibria in pure and mixed strategies, such that it is not possible to properly discriminate between equilibrium and erroneous play.[3] Also note that in the discounting environment iterated deletion of strictly dominated actions from a strategy leads to the unique equilibrium, which makes it possible to test at which iteration step commonly known rationality breaks down. The number of steps of iteration necessary for an individual player to choose an equilibrium strategy is the number of dirty faces observed plus one. The player must also believe that the other players can do at least one step of iteration less.
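The two-rule characterisation of the unique equilibrium strategy can be written down directly. The following Python sketch is an illustration only; the names `equilibrium_action` and `play` and the string encoding of types are assumptions made here, and all players are assumed to follow the equilibrium strategy.

```python
def equilibrium_action(t, observed_x, rho):
    """Equilibrium action at stage t for a player who observes `observed_x` type X partners."""
    if not rho:
        return "up"                      # rule 1: always up if rho = false
    return "up" if t - 1 < observed_x else "down"   # rule 2


def play(types):
    """Simulate equilibrium play for a cohort, e.g. types = "XXO".

    Returns the list of action profiles, one per stage, until the game ends.
    """
    n = len(types)
    rho = "X" in types                   # announcement: at least one type X
    history = []
    for t in range(1, n + 1):
        actions = []
        for i in range(n):
            # Number of type X partners that player i observes.
            observed_x = sum(1 for j in range(n) if j != i and types[j] == "X")
            actions.append(equilibrium_action(t, observed_x, rho))
        history.append(actions)
        if "down" in actions:            # the game ends once anyone plays down
            break
    return history


for cohort in ("OX", "XX", "XXO", "XXX"):
    print(cohort, play(cohort))
```

In this sketch a player who observes k partners of type X plays up for k stages and down at stage k + 1, provided the game is still running, so the stage at which she switches equals the number of dirty faces observed plus one.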
[3] In footnote 18, p. 239, Weber attributes some individual behaviour that deviates from the “separating equilibrium” to spite instead of irrationality. Closer inspection shows that the observed behaviour is actually consistent with equilibrium play. One player who understood the identical equilibrium payoffs for delaying and immediate play of down used this knowledge to mislead the other group member.

4 Experimental implementation

Our experiments only differ from Weber’s second series of sessions, which he used to investigate learning effects, by introducing discounting on delayed payoffs. Our aim is to see how removing the multiple equilibria may affect the experimental outcomes. This purpose leaves little room for design choices, since we want to qualitatively compare the results with those from Weber’s experiments.[4] In particular, we stick with the use of the neutral language labeling types O and X, instead of using framed language like “clean” and “dirty faces”.

As in Weber’s paper, there are two treatments with cohort sizes of n = 2 and n = 3. So the necessary level of iterated reasoning for equilibrium play is relatively low. The prior probabilities for drawing type O and type X are 1/3 and 2/3, respectively. These priors sufficiently reduce the occurrence of the trivial case ρ = false. The payoff parameters are α = 100 and β = 400 points, which together with the prior probabilities should prevent gambling. The expected return from choosing down is −67 points for subjects who hold the prior beliefs. Meanwhile, choosing up at the end of the game always yields zero. The prior probabilities and the relative size of the payoffs are identical to Weber’s learning setup. The discount factor is set to δ = 0.8. The reduction in the payoff is noticeable as each stage passes, but not so significant that gambling behaviour at later stages might be induced.
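The expected return of −67 points, and its behaviour under discounting, can be checked with a few lines of Python. This is a sketch using the parameter values stated above; the constant and function names are illustrative assumptions, and exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction

P_X = Fraction(2, 3)      # prior probability of type X
ALPHA, BETA = 100, 400    # reward and penalty in points
DELTA = Fraction(4, 5)    # discount factor 0.8
N_STAGES = 3              # largest cohort size, hence the maximum number of stages


def prior_return_of_down(stage):
    """Expected points from playing down at `stage` for a player holding the prior."""
    return DELTA ** (stage - 1) * (P_X * ALPHA - (1 - P_X) * BETA)


returns = [prior_return_of_down(t) for t in range(1, N_STAGES + 1)]
for t, r in enumerate(returns, start=1):
    print(f"stage {t}: {float(r):.2f} points")
# stage 1: -66.67 points
# stage 2: -53.33 points
# stage 3: -42.67 points
```

Discounting scales the reward and the penalty by the same factor, so the expected return of gambling at the prior shrinks in magnitude but stays negative at every stage.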
The incentive condition against gambling is maintained throughout the game, since the discounting is applied to all payoffs.

4.1 Procedure and summary statistics

The sum of points earned during the experiment was converted to cash at the end of the game.[5] The conversion ratio was one Australian Dollar (AUD) for every 100 points. An endowment of 900 points, or AUD 9, was provided to every subject at the beginning of the session. The main purpose of the endowment was to prevent early bankruptcy in the session. An injection of points (treated like a loan) was provided to subjects who went bankrupt.[6]

[4] A direct comparison is not possible due to potential subject pool effects.
[5] In the comparable treatments Weber paid two random periods.
[6] The average/median profits after fourteen periods were AUD 11.74/12.4 in the 2-player treatment, and AUD 7.98/9 in the 3-player treatment. The number of people who ended up with less than the endowment was 1 (out of 42 subjects) in the 2-player game and 8 (out of 48) in the 3-player game.

Each treatment consisted of fourteen consecutive Dirty Faces Games. All games were independent of each other, i.e. the type draws were independent. The profits were shown at the end of each period, and payments were made in cash at the end of the session. Subjects were anonymously matched with the same other subject(s) for all games in a session; that is, we used a partner treatment. All these facts were common knowledge to all participants.

Each game starts with the computer randomly and independently assigning types to subjects according to the priors. An announcement is made on the screen whether there is at least one type X player in the cohort. It is followed by revealing the type(s) of the partner(s); in the 3-player case, the two partners are identified as Left and Right. Subjects enter the first stage and are asked to simultaneously choose their actions, either up or down.
After everyone in the cohort has chosen an action, the actions chosen by their partners are revealed. If the game advances to the next stage, subjects are asked to choose actions again. The game continues until someone has chosen down or n stages have passed. The period payoff is displayed at the end of the period, but subjects are never told their own type at the end of the game.

The sessions were conducted in a computer lab at the University of Adelaide. The subjects were recruited from a pool of students. The experimental sessions were programmed and run using the software z-Tree by Fischbacher (2007). Communication among subjects was controlled for, as they were seated in self-contained booths. The seating order was randomly assigned. No communication was allowed during the session. Subjects received all the relevant information in written instructions and interactively on screen. The instructions for the 2-player game can be found in Appendix A.

The experimental sessions were conducted between October 2004 and May 2005. There were two sessions for each treatment (2-player and 3-player games). The sessions provided a total of 1260 observations from 90 subjects who played in 14 periods: 42 subjects formed 21 groups in the 2-player game, and the other 48 subjects formed 16 groups in the 3-player game. Subjects were primarily undergraduate students from the University of Adelaide. The distribution of students from various disciplines is shown in Table 2.

Table 2: Distribution of courses the subjects were enrolled in

| Course | Number | Percent |
|---|---|---|
| Arts | 4 | 4.44 |
| Commerce | 17 | 18.89 |
| Economics | 30 | 33.33 |
| Engineering | 24 | 26.67 |
| Finance | 4 | 4.44 |
| Sciences | 9 | 10.00 |
| Others | 2 | 2.22 |
| Total | 90 | 100.00 |

5 Results

In what follows we will present our main results. We begin with some descriptive statistics on the frequencies of agreement by treatment and type realisation.
In later sections we use statistical inference in order to analyse cohort size effects and the influence of the ‘difficulty’ of a situation on the likelihood of individual play following the equilibrium strategy.

5.1 Descriptive statistics

Table 3 reports individual-level data over the 14 playing periods. For a given event, each entry states the number of subjects choosing an action in agreement with the predicted best response. The share of these subjects among all subjects who were in the same situation is stated in parentheses. For example, in the tenth period of the 2-player game, 9 of the subjects who observed a type O partner chose the predicted best response down; these were 90% (0.90) of all subjects in that situation.

There is clearly a high frequency of consistency with equilibrium play in the events where the required number of iterations is low. These are cases where one level of iteration is required, i.e. when subjects have observed ρ = true and all partners are type O (θ_−i = O for n = 2 or θ_−i = OO for n = 3). Beyond that, the frequency of agreement with the predicted response gets much lower as the required level of iteration increases. Holding the required number of iterations constant, it seems that more players are able to perform two levels of iteration in the 2-player game (observed X, 0.62) than in the 3-player game (observed OX, 0.52). This might suggest that a higher number of players in a cohort complicates the thinking process of subjects and hinders the iterative process.

In the 3-player game, there does not seem to be any significant difference in the frequency of agreement with the predicted responses between the two highest levels of iteration. The frequencies are 0.52 for two levels of iteration (observed OX) and 0.55 for three levels of iteration (observed XX). It is reasonable to assume that if subjects are unable to perform two steps of iteration then they should be unable to iterate correctly in situations that require more than two steps.
The closeness of the two frequencies may also suggest that once subjects are able to perform two levels of iteration, they may have understood the structure and are therefore also able to solve tasks where higher-level iteration is necessary. They might slide down the slippery slope of iteration and go all the way.

Table 3: Individual rationality across periods. Entries are agreements with the predicted actions, n (freq.); the first four data columns refer to the 2-player game, the last five to the 3-player game. A dash marks a situation for which no entry is reported in that period.

| Period | n = 2: ρ = false | n = 2: O | n = 2: X | n = 2: Total | n = 3: ρ = false | n = 3: OO | n = 3: XO | n = 3: XX | n = 3: Total |
|---|---|---|---|---|---|---|---|---|---|
| Predicted [a] | UU | D | UD | | UUU | D | UD | UUD | |
| 1 | 2(1.00) | 11(0.92) | 17(0.61) | 30(0.71) | - | 2(0.67) | 8(0.44) | 12(0.44) | 22(0.46) |
| 2 | 5(0.83) | 7(0.88) | 15(0.54) | 27(0.64) | - | 2(1.00) | 13(0.59) | 15(0.63) | 30(0.63) |
| 3 | 5(0.83) | 8(0.89) | 15(0.56) | 28(0.67) | 5(0.83) | 5(0.83) | 11(0.55) | 10(0.63) | 31(0.65) |
| 4 | 8(1.00) | 10(1.00) | 16(0.67) | 34(0.81) | - | 3(0.75) | 9(0.50) | 12(0.46) | 24(0.50) |
| 5 | 6(1.00) | 8(1.00) | 14(0.50) | 28(0.67) | 5(0.83) | 1(1.00) | 8(0.57) | 15(0.56) | 29(0.60) |
| 6 | 2(1.00) | 10(0.83) | 19(0.68) | 31(0.74) | - | 4(1.00) | 10(0.42) | 11(0.55) | 25(0.52) |
| 7 | 2(1.00) | 9(1.00) | 20(0.65) | 31(0.74) | - | 7(1.00) | 15(0.58) | 8(0.53) | 30(0.63) |
| 8 | 6(1.00) | 10(1.00) | 17(0.65) | 33(0.79) | - | 2(1.00) | 7(0.35) | 15(0.58) | 24(0.50) |
| 9 | 9(0.90) | 6(0.86) | 14(0.56) | 29(0.69) | - | 1(0.50) | 9(0.50) | 19(0.68) | 29(0.60) |
| 10 | 7(0.88) | 9(0.90) | 16(0.67) | 32(0.76) | 3(1.00) | 3(1.00) | 9(0.56) | 16(0.62) | 31(0.65) |
| 11 | 8(1.00) | 9(1.00) | 17(0.68) | 34(0.81) | - | 2(1.00) | 12(0.55) | 13(0.54) | 27(0.56) |
| 12 | 4(1.00) | 7(1.00) | 19(0.61) | 30(0.71) | 2(0.67) | 2(1.00) | 9(0.45) | 11(0.48) | 24(0.50) |
| 13 | 2(1.00) | 7(1.00) | 23(0.70) | 32(0.76) | 5(0.83) | 6(1.00) | 15(0.63) | 7(0.58) | 33(0.69) |
| 14 | 4(1.00) | 5(1.00) | 22(0.67) | 31(0.74) | - | 4(1.00) | 10(0.56) | 12(0.46) | 26(0.54) |
| Agg. | 70(0.95) | 116(0.94) | 244(0.62) | 430(0.73) | 20(0.83) | 44(0.92) | 145(0.52) | 176(0.55) | 385(0.57) |

[a] U means up and D means down.

The following should be kept in mind when looking at these summary statistics:

1. We consider players in the following situations as rational, even though their chosen actions have led to losses. To see this, suppose the cohort has drawn OX in the 2-player game. The player with type O has observed a player of type X as partner. The best responses are up for the player of type O and down for the player of type X in the first stage. The game should have ended there. However, if player X has chosen the strictly dominated action up in the first stage, the O type should choose down in the second, as his partner should only have chosen up if he has observed a type X player. The O type who logically plays down is wrong and suffers losses. His mistake is due to the irrational signaling from his partner. Similar situations may occur in the 3-player game. In these situations, we consider player O as rational, even though his play led to losses due to relying on wrong signals.

2. Players play the dominant strategy when they believe their partners play the dominant strategy, i.e. if they believe that their partners are rational. However, if they believe that their partners are not rational, i.e. not playing dominant strategies, they might themselves deviate from the equilibrium strategy and play a “rationalizable” strategy.
   A rationalizable strategy is a strategy that is a best response to some strategy of the opponent, but not necessarily to the equilibrium strategy. So a player who does not believe that his partner is behaving rationally may consider the signaling from this partner as useless, since following the signal will then lead to losses, as in the situations mentioned above. Nevertheless, we count such rationalizable actions as irrational, even though the deviation from equilibrium can be rational if the player believes that the other player(s) in the cohort are not rational.

3. Consider a cohort that has drawn XX in the 2-player game. A risk-averse player in this game will play the strategy (up, up) if he is not able to make two steps of iteration,[7] whereas a rational player will play the best response (up, down). However, one player may choose down in period one, which ends the game after the first stage. Then we will not observe the actions in the second stage. Moreover, we are not able to distinguish a totally rational player from a risk-averse player with limited iterative ability. So we do not know whether the player is able to do the second step of iteration in this situation. We count a player who played according to the equilibrium prediction until the game wrongfully ended as an agreement.

[7] This risk-averse player is rational in playing this strategy, in the sense that choosing up yields zero return compared to an expected return of −67 from choosing down. However, he is not entirely rational, since he cannot iterate correctly and therefore does not play down in period two.

The relative frequencies with which our subjects play the unique separating equilibrium are remarkably close to the relative frequencies with which Weber’s subjects chose actions that are compatible with the separating equilibrium in his learning setup.[8] This seems to suggest that removing the equilibria in weakly dominated strategies does not make a large difference for play. In the two-player setting Weber observed 0.94 (ρ = true, θ_−i = O) and 0.70 (ρ = true, θ_−i = X) agreement, while we observed 0.94 and 0.62. In the three-player setting Weber found relative agreement fractions of 1.00 (ρ = true, θ_−i = OO), 0.65 (ρ = true, θ_−i = OX) and 0.57 (ρ = true, θ_−i = XX). Our relative frequencies were 0.92, 0.52, and 0.55.[9] However, due to the different subject pools and the multiplicity of equilibria in Weber’s setup a meaningful direct comparison is not possible.

[8] Note that given that we assume equilibrium play we cannot conclude that all subjects Weber categorises as playing the separating equilibrium are actually doing so. The reason is that they might play a mixed strategy equilibrium which resulted in actions that look like the separating equilibrium.

In what follows we analyse our data with respect to two questions:

1. Does the number of players, for a given difficulty of the problem, have an influence on the frequency of equilibrium play?
2. How does the level of difficulty affect the frequency of collectively rational equilibrium play?

5.2 Influence of cohort sizes

Differences in group sizes may have an influence on whether subjects actually choose the best response, regardless of the level of iteration needed. This is because the theoretical best responses require, to a certain extent, the rationality of all players and the common knowledge of this rationality. This means that all players understand how to play the game and know the best responses, and that they also know and believe that the other players understand how to play the game. The common knowledge argument breaks down when there is even a slight hint of doubt among any one of the players about the ability of other group members.
Under these circumstances, players may believe with a certain probability that the signaling from partners does not provide any new information, as the signal might come from a player who is not able to iterate deeply enough. This leads players to play rationalizable strategies instead of the equilibrium strategies. So the more players are in the cohort, the more likely it becomes that one of them is not able to do the iteration and that the common knowledge assumption breaks down.

A two-sample Wilcoxon rank-sum (Mann-Whitney) test is used to examine the cases where one and two steps of iteration are required.[10] The null hypothesis is that the frequency of best responses for a given number of steps of iteration is the same for different cohort sizes n.

[9] Subjects in Weber's study were either UCLA or Caltech graduate or undergraduate students. We expected them ceteris paribus to do better than our students, as admission to these universities should be more selective. So our prior that removing the multiple equilibria should lead to a higher frequency of the separating equilibrium might still be true.

[10] Given ρ = true, one step of iteration is needed for subjects who have observed only type O partners, i.e. θ−i = O for n = 2 and θ−i = OO for n = 3. Two steps of iteration are required after having observed exactly one type X partner, i.e. θ−i = X for n = 2 and θ−i = OX for n = 3.

Table 4: Influence of group sizes on individual rationality

                   Situations by Levels of Iteration
           1: No type X observed         2: One type X observed
n          obs    ranksum   expected     obs    ranksum     expected
2          123    10656     10578        391    137188.5    131376
3           48     4050      4128        280     88267.5     94080
total      171    14706     14706        671    225456      225456
p-value           0.528                         0.006

The results are tabulated in Table 4. They show that cohort sizes have an influence on subjects' ability to choose best responses. In cases when one step of iteration is required (only type O's are observed), subjects are able to deduce their type without requiring any signals from others.
There, consistent with our hypothesis, the number of players makes no difference. On the other hand, subjects become dependent on the signals from partners in order to deduce their own type if they observe at least one other player being of type X. Then signaling complicates the matter because players must also decide how to interpret the signals. Players choose the predicted responses on the equilibrium path in later periods if they believe others have obeyed strict dominance and have chosen the predicted responses on the equilibrium path in earlier periods. Moreover, in the three-player situations with more than one player of type X, players will also have to believe that others know that they obey dominance themselves. A second step of common knowledge of rationality is necessary. Only then can the signals be interpreted as predicted in a Perfect Bayesian Equilibrium. This chain of reasoning about whether one can believe that others play the predicted responses and obey dominance does not allow for any doubts. However, a subject may be cautious and may not believe that others obey dominance, or may not believe that others believe that he himself obeys dominance, and so on. Even believing others to be cautious may be a sufficient cause for deviating from the equilibrium path. Worse still, a single such player is enough to cause this deviation. Under these circumstances, rational players may play rationalizable strategies instead of equilibrium strategies. So if the suspicion that others will get it wrong is a strong driving force for equilibrium deviation, then we should expect that deviation happens more often when more players are involved. This is because players should then assign a higher probability to at least one of the other players not behaving according to dominance. Our statistical test confirms this intuition.
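As a reading aid for Table 4: under the null hypothesis of the rank-sum test, the expected rank sum of a group with n_g observations out of N pooled observations is n_g(N + 1)/2. The following minimal Python sketch recomputes the "expected" column from the reported observation counts. The function and variable names are illustrative and not part of the original analysis, and the sketch does not attempt to reproduce the p-values, which would require the underlying choice data and a correction for ties.

```python
# Observation counts and reported rank sums copied from Table 4.
# Keys of the inner dicts are the group sizes n (2 or 3 players).
TABLE4 = {
    "one step (no type X observed)": {2: (123, 10656.0), 3: (48, 4050.0)},
    "two steps (one type X observed)": {2: (391, 137188.5), 3: (280, 88267.5)},
}


def expected_rank_sum(n_group: int, n_total: int) -> float:
    """Expected rank sum of one group if both groups share a distribution."""
    return n_group * (n_total + 1) / 2


def check_table(table):
    """Return (situation, n, obs, ranksum, expected, deviation) per cell."""
    rows = []
    for situation, groups in table.items():
        n_total = sum(obs for obs, _ in groups.values())
        for players, (obs, ranksum) in groups.items():
            expected = expected_rank_sum(obs, n_total)
            rows.append(
                (situation, players, obs, ranksum, expected, ranksum - expected)
            )
    return rows


if __name__ == "__main__":
    for situation, players, obs, ranksum, expected, dev in check_table(TABLE4):
        print(f"{situation:32s} n={players} obs={obs:4d} "
              f"ranksum={ranksum:9.1f} expected={expected:9.1f} dev={dev:+8.1f}")
```

A positive deviation for the two-player groups means that their observations rank higher (more best responses) than expected under the null; the deviation is small when one step is required and large when two steps are required.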
The number of correct deductions is greater in the treatment with two players (p = 0.006) if reliance on the rationality of other players is necessary but the difficulty stays the same.

In summary, it seems reasonable to suggest that subjects choose the best responses consistently regardless of different cohort sizes when signaling is not required from partners for them to deduce their own type. However, once signaling is involved, increasing the number of players in the cohort increases the likelihood that subjects deviate from the equilibrium path.

5.3 Panel-regression analysis

In this section we analyse the influence of the steps of iteration necessary for equilibrium play on the fraction of observed equilibrium play and revisit the question of the influence of the group size. For this purpose we use multilevel panel-data analysis.

Standard regression modeling usually assumes that the errors have zero means and are mutually independent among observations. However, we would expect residuals of the data collected from the experiment for individuals and cohorts to be correlated. This is inherent in our experimental setting, as the same individuals generated data over 14 playing periods, while they also interacted with the same other players within their cohorts 14 times. We use a multilevel linear random intercept model as described in Rabe-Hesketh and Skrondal (2005) to model the serial correlation structure inherent in our data. Note that each data point in the experiment is generated by individual j from cohort k in period i. The data fit into a nested hierarchical structure as shown in Figure 1: starting with occasions (or playing periods) at level 1, individuals at level 2, and cohorts at level 3.

[Figure 1: Nested factors in the experiment]

In the random intercept model, we assume that each level has its own noise that is independent from other levels.
These noise components, which exist in addition to the conventional error term, are identical for all observations belonging to the same cluster on the level in question. The model has three level-specific random intercepts: the within-subject noises for each observation $\epsilon_{ijk}$, the between-subject noises $\zeta^{(2)}_{jk}$, and the between-cohort noises $\zeta^{(3)}_{k}$.

We are interested in the likelihood of subjects making decisions that match our theoretical rational responses. Our proposed linear predictor uses the logistic distribution as the link function. We estimate the following multilevel random effect logit model with m covariates:

$$\operatorname{logit}\{\Pr(y = 1 \mid x_{ijk})\} = \beta_0 + \beta_1 x_{1ijk} + \cdots + \beta_m x_{mijk} + \zeta^{(2)}_{jk} + \zeta^{(3)}_{k} + \epsilon_{ijk}$$

We have obtained some individual characteristics of the participating subjects through a questionnaire at the end of the experimental sessions. The questionnaire asked for gender, age group, and courses attended at university. We incorporate this information as covariates in our analysis. The regression includes the following dependent and independent variables.

IR is the dichotomous dependent variable that indicates whether a subject played equilibrium responses within a game. It should be noted that an equilibrium response does not have to be on the equilibrium path, as other players in the group may have deviated. Note that an equilibrium response is a collection of sequentially rational actions with beliefs that are consistent with equilibrium play of the group members. So this variable measures the joint construct of individual rationality and the belief in the rationality of the other players. Only for the decision in the first stage of a game does the belief in the rationality of the group members not play a role.

steps# is a set of dummy variables that characterise the particular situation the individual is in. A situation is defined by two things: the number of steps of iteration necessary; and the number of people in the group.
In the regression, the variables are step1, step2, nstep0, nstep1, nstep2 and nstep3, where the number denotes the steps of iteration necessary, while a leading n indicates that it was a group with three subjects. The omitted baseline dummy is the 0-step (i.e. ρ = false) situation in a group with two subjects. Combining the number of group members and the iteration level in one battery of dummy variables avoids the problems of interpreting interactions in non-linear models.

courses is a set of dummy variables for the different degrees subjects are enrolled in. The categories are given by: Arts, Commerce, Economics, Engineering&Science, Finance, and the omitted baseline Other.

gender is a dummy variable that represents the sex of the subject. The baseline is female.

maturity is a dummy variable indicating whether the subject is over 25. The baseline is under 25.

The results of the logit regression are shown in Table 5. We report the odds ratio parameter estimates, where a coefficient of 1 signals no impact, and also the marginal effects. The marginal effects are calculated with respect to the two sets of dummy variables.

Table 5: Regression results
Dependent variable: IR (rational choice)

                                  e^β                  marg. eff.
Situation dummies (base is step0)
  nstep0                          0.306     (0.18)     -0.058     (0.27)
  nstep1                          0.596     (0.54)     -0.018     (0.56)
  nstep2                          0.032***  (0.00)     -0.445***  (0.00)
  nstep3                          0.042***  (0.00)     -0.377***  (0.00)
  step1                           0.870     (0.84)     -0.004     (0.84)
  step2                           0.062***  (0.00)     -0.288***  (0.00)
Degree dummies (base is other)
  Economics                       4.864     (0.15)      0.372     (0.14)
  Arts                            5.131     (0.21)      0.382     (0.18)
  Finance                         1.907     (0.62)      0.158     (0.61)
  Commerce                        7.538*    (0.07)      0.45      (0.07)
  EngSci                          3.700     (0.24)      0.315     (0.22)
male                              1.255     (0.56)      0.045     (0.58)
maturity                          0.981     (0.98)     -0.004     (0.98)
Constant                          5.793
ρsubject                          0.356***
ρgroup                            0.001
Observations                      1260

*** p<0.01, ** p<0.05, * p<0.; p-values in parentheses
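To illustrate how the odds ratios in Table 5 can be read, the following minimal Python sketch converts odds into probabilities for a hypothetical reference subject. It rests on assumptions that the paper does not state explicitly: that the reported Constant is the baseline odds (in e^β form) for a subject with all dummies at zero, and that both random intercepts are set to zero. The resulting probabilities are therefore not the marginal effects reported for the average subject, and all names in the code are illustrative.

```python
# Odds-ratio estimates copied from the e^beta column of Table 5.
CONSTANT_ODDS = 5.793  # assumed: baseline odds for the reference subject
SITUATION_ODDS_RATIOS = {
    "step1": 0.870,
    "step2": 0.062,
    "nstep0": 0.306,
    "nstep1": 0.596,
    "nstep2": 0.032,
    "nstep3": 0.042,
}


def odds_to_prob(odds: float) -> float:
    """Convert odds p / (1 - p) back into the probability p."""
    return odds / (1.0 + odds)


def reference_probability(odds_ratio: float = 1.0) -> float:
    """Probability of an equilibrium response for the reference subject.

    In a logit model the odds ratio multiplies the baseline odds, so an
    odds ratio of 1 leaves the baseline probability unchanged.
    """
    return odds_to_prob(CONSTANT_ODDS * odds_ratio)


if __name__ == "__main__":
    print(f"baseline (step0): {reference_probability():.3f}")
    for name, ratio in SITUATION_ODDS_RATIOS.items():
        print(f"{name:8s}: {reference_probability(ratio):.3f}")
```

The sketch only shows the direction and rough size of the effects: odds ratios near 1 (step1) leave the probability almost unchanged, while odds ratios far below 1 (step2, nstep2, nstep3) reduce it sharply.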
So the marginal effects for the set of dummy variables that describe the situation are with respect to a change from the baseline, which is the two-player situation with ρ = false, for an average subject.[11]

One step of iteration evidently does not pose a more difficult problem than the trivial situation of ρ = false. The situations nstep1 and step1 do not have a statistically different impact than the baseline or nstep0. This is confirmed by Wald tests on the odds ratios and the marginal effects. A large drop in agreement with the game theoretical prediction happens when two steps of iteration are necessary. For an average subject in the two-player game the probability drops by 0.287 compared to the baseline case. In the three-player situation the drop is even greater (0.445).

The second step of iteration seems to be the main hurdle. Subjects who can do two steps of iteration are able to do the third step, too. Between the second and the third step of iteration no additional drop in the probability of agreement is found. The impact of the additional step (from two to three) is not significant (p > 0.15, Wald test on the marginal effect). So it seems to be the rule that subjects either can do no more than one step of iteration or are able to go all the way and perform the maximum number of steps of iteration, which is three.

Returning to the discussion in the previous section, we can confirm that the number of subjects in a group reduces the probability of a predicted equilibrium choice if reliance on the rationality of the group members is necessary. Comparing the marginal effects of the two- and three-player situations with two steps of iteration shows that subjects are less likely to behave as predicted by Perfect Bayesian Equilibrium if they are in a three-player group (p < 0.03, one-sided Wald test).
In this case, the belief in the rationality of the other player(s) is necessary, and being grouped with two other players makes it more likely that at least one of them does not follow the equilibrium logic. So a player who has to rely on the rationality of two other players might be less inclined to play according to the common rationality assumption than a player who only has to rely on one other group member. This interpretation is further supported by the fact that we do not find differences for the number of group members in the situations with zero or one step of iteration, where the best response does not depend on the behaviour of the group member(s) (p > 0.27 and p > 0.64, Wald tests on the marginal effects).

Our additional control variables (the degrees the students are studying for, gender, and age) do not have clear effects. The only weakly significant influence within these variables is a better performance of commerce students. This is not really surprising, as commerce is among the degrees with the highest high school marks necessary for admittance. A closer look at the estimates shows that the coefficients and marginal effects for different courses are relatively high, without being statistically significant. The reason is twofold. Firstly, the variance of agreement within the subjects in one category is considerable, which hints at some heterogeneity within these groups. Secondly, this heterogeneity is accounted for by our individual random effect, which is highly significant. The agreement is highly correlated within subject across the 14 separate Dirty Faces Games (ρsubject = 0.356), which illustrates that some subjects are smarter than others.

[11] The average subject is a hypothetical composite subject, where all the dummies not describing the situation are set to their averages.

5.4 Bounded rationality or lack of common knowledge?
The results hint at two potential reasons why in configurations with two and three steps of iteration much less equilibrium play is observed than in configurations where zero or one step is necessary. The significant drop of equilibrium play between one and two steps coincides with the point where reliance on the rationality of others becomes necessary. Additionally, the rate of equilibrium play is significantly lower in the three-player game than in the two-player game when two steps of iteration are necessary. This confirms that the lack of common knowledge of rationality is an important factor for reducing the rate of equilibrium play. Not relying on the group member(s) getting it right can be highly rational.

What about bounded rationality itself? Can we conclude from our data that people are actually not able to perform the necessary depth of iteration? Or is it possible that all the deviation stems from the lack of belief in the ability of the group members? We believe that there is some evidence for bounded rationality as well. A player who is able to perform two iterations and who anticipates that the others can do at least one should play according to the equilibrium prediction in a situation where two steps are required. Given that almost all individuals were able to do one step of iteration, the sharp drop of individual equilibrium play between one and two iteration steps hints at an influence of limited iterative ability. Moreover, there are two different patterns of deviations from the equilibrium path: a) not playing down even if the type should be known; b) playing down without being able to know the type. Pattern a) is rationalizable, while b) is not. So playing according to pattern a) can be either due to bounded rationality or due to doubts about the ability of the other player, while playing according to pattern b) can only be a consequence of the lack of iterative ability.
In the two treatments 6.1 percent (2-player) and 6.5 percent (3-player) of all play followed pattern b), which can only be caused by limited iterative ability.

6 Conclusion

In this study we revisited the Dirty Faces Game in order to investigate the ability of iteration and the common knowledge thereof in humans. We modified the setup of Weber (2001) slightly by adding a penalty for delaying the announcement of types, once they are known. This modification ensured that the game has a unique Perfect Bayesian Equilibrium, which enabled us to properly isolate behaviour derived from correct iteration and common knowledge of rationality from off-equilibrium play.

We found that there is a threshold between one and two steps of iteration, where individual behaviour in many cases ceases to follow common rationality. This failure may stem either from limited ability to perform the necessary iterations or from the breakdown of common knowledge of rationality. A person is either not able to perform the computational steps necessary or does not believe that his fellow group members are able to perform the iterations. For equilibrium play to occur, a player has to believe that the other group members are able to perform at least n − 1 steps if he himself requires n steps to get to the equilibrium.

We found evidence that doubts about the ability of the group members have some influence. More players in a group led to less individual behaviour consistent with equilibrium in cases where the profitability of equilibrium behaviour depended on the other players' individual rationality. The number of players in a group had no influence in situations where players did not have to rely on their group members being rational. On the other hand, we also found some evidence for limited iteration ability, as we observed a significant amount of non-rationalizable choices.
We conclude that the sharp drop of play following the common rationality assumption between one and two steps of iteration is jointly caused by a) the bounded rationality of individuals and b) the breakdown of the common knowledge of rationality. This result suggests that some further research should be devoted to learning more about the relative importance of these two factors for deviations from equilibrium play.

References

Bosch-Domenech, A., J. G. Montalvo, R. Nagel, and A. Satorra (2002). One, two, (three), infinity, ... : Newspaper and lab beauty-contest experiments. The American Economic Review 92 (5), 1687–1701.

Camerer, C. F. (2003). Behavioral Game Theory: Experiments in Strategic Interaction. Princeton University Press.

Duffy, J. and R. Nagel (1997). On the robustness of behaviour in experimental ‘beauty contest’ games. Economic Journal 107 (445), 1684–1700.

Fey, M., R. D. McKelvey, and T. D. Palfrey (1996). An experimental study of constant-sum centipede games. International Journal of Game Theory 25, 269–287.

Fischbacher, U. (2007). Z-tree - Zurich toolbox for readymade economic experiments. Experimental Economics forthcoming.

Ho, T.-H., C. Camerer, and K. Weigelt (1998). Iterated dominance and iterated best response in experimental “p-beauty contests”. The American Economic Review 88 (4), 947–969.

Littlewood, J. E. (1953). A Mathematician's Miscellany. London: Methuen & Co. Ltd.

McKelvey, R. D. and T. D. Palfrey (1992). An experimental study of the centipede game. Econometrica 60, 803–836.

Nagel, R. (1995). Unraveling in guessing games: An experimental study. The American Economic Review 85 (5), 1313–1326.

Rabe-Hesketh, S. and A. Skrondal (2005). Multilevel and Longitudinal Modeling Using Stata. STATA Corporation.

Rubinstein, A. (1989). The electronic mail game: Strategic behavior under “almost common knowledge”. The American Economic Review 79 (3), 385–391.

Weber, R. A. (2001).
Behavior and learning in the dirty faces game. Experimental Economics 4 (3), 229–242.

A Instructions for n = 2

INSTRUCTIONS

Welcome to our experiment. Please read these instructions carefully. Understanding the instructions is crucial for earning money.

This is an experiment in decision-making. You will be paid for your participation. The exact amount you will receive will be determined during the experiment and will depend on your decisions. This amount will be paid to you in cash after the conclusion of the experiment. If you have any questions during the experiment, raise your hand and the experimenter will assist you. It is strictly forbidden to talk, exclaim or to communicate with other participants during the experiment. It is very important for us that you obey these rules. Otherwise the data generated in this session are useless.

In this experiment, you will play a series of 14 identical games in which you can earn or lose money based on your choices. You start with an endowment of AUD 9. Wins and losses during the 14 games will be added to or deducted from this endowment. At the end of the experiment you will be paid the resulting amount in cash.

You are paired with one other participant (called partner) throughout all the 14 games. You will not know the identity of this other person, either during or after the experiment, just as the other person does not know your identity.

Types

At the start of each of the 14 games, the computer will randomly draw a type for you and a type for the person you are paired with. The possible types are “X” and “O”. The computer always draws from an urn with two balls of type “X” and one ball of type “O”. So the probability that you are of type “X” is 2/3 while the probability that you are of type “O” is 1/3. Note that the draw for each person will be from a different urn. This means that the likelihood of you being of a certain type does not depend on what type the other person is.
Information

Each participant will only be told the type of the partner, but not his/her own type. So you will know the type of the person you are paired with, but not your own type. Your partner will know your type, but not his/her own. Additionally, you will be told if at least one person (you and/or your partner) is of type “X”. Below is an example of what the information you receive may look like. In this case you see that at least one person is of type “X” and that your partner is of type “O”.

Decisions (maximum of two rounds per game)

Round 1

After you have seen the type of the other player in your group and the information whether at least one player (you and/or your partner) is of type “X”, you are asked to choose one of two actions: “Up” or “Down”. The combination of your type and your decision will determine how much money you earn. Note that your payoff depends only on your type and not on the type of your partner. The money you earn or lose is determined in the following table:

                          Your Type
                          “X”               “O”
Your choice   “up”        0                 0
              “down”      win 100 cents     lose 400 cents

1. If you choose “Up” your current earnings will not change.

2. If you choose “Down” and your type is “X” one dollar is added to your earnings.

3. If you choose “Down” and your type is “O” then four dollars will be deducted from your account.

Note again that the type that determines the payoffs is your type and not the type of your partner. An example of a decision screen is shown below. The payoffs in the table below are given in cents.

After you have made your decision the following happens: If either you or your partner has chosen “Down” the current game ends. Your payoff will be calculated and shown on the screen. Then a new game begins with a new draw of the types. However, if both you and your partner have decided to play “Up” the game enters a second round.

Round 2 (only if both players chose “Up” in round 1)

Round two of the game works practically the same way as round one.
Note that you and your partner keep the types that were drawn before decision round 1. The only difference in round two is that the payoffs for choosing down are multiplied by a factor of 0.8. The payoffs in round two are the following:

                          Your Type
                          “X”               “O”
Your choice   “up”        0                 0
              “down”      win 80 cents      lose 320 cents

So now you win 80 cents (instead of 1 Dollar in round 1) if you are of type “X” and you choose “Down”. If you choose “Down” and it turns out that you are of type “O” you lose 320 cents (instead of 4 Dollars in round 1). Choosing “Up” once again does not cause any gains or losses regardless of your type. An example of the decision screen is given below:

After round 2 the game ends regardless of the actions previously taken. Your payoff will be calculated and shown on the screen. Then a new game starts with a new draw of types (as explained above).

In total you will play 14 of these games. At the end of the experiment (after 14 games) you will be given a short questionnaire where you have to fill in your details. The questionnaire is only used to make sure that you get the money you have earned.

Thank you very much for your participation.
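For readers who want to check the incentives implied by the payoff tables in these instructions, the following minimal Python sketch computes the expected payoff of choosing “Down” for a given belief about one's own type. It is an illustration added for exposition, not part of the material given to subjects; the function names are hypothetical, and the only inputs are the prior of 2/3 for type “X”, the payoffs of +100 and -400 cents, and the round-two factor of 0.8 stated above.

```python
from fractions import Fraction

P_X_PRIOR = Fraction(2, 3)      # prior probability of being type "X"
WIN_X = 100                     # cents won for "Down" when type is "X"
LOSE_O = -400                   # cents lost for "Down" when type is "O"
ROUND2_FACTOR = Fraction(4, 5)  # payoffs are scaled by 0.8 in round 2


def expected_down(p_x: Fraction, round_no: int = 1) -> Fraction:
    """Expected payoff (in cents) of "Down" given belief p_x of being "X"."""
    factor = Fraction(1) if round_no == 1 else ROUND2_FACTOR
    return factor * (p_x * WIN_X + (1 - p_x) * LOSE_O)


def break_even_belief() -> Fraction:
    """Belief p_x at which "Down" and "Up" (payoff 0) are equally good."""
    # Solve p * WIN_X + (1 - p) * LOSE_O = 0 for p.
    return Fraction(-LOSE_O, WIN_X - LOSE_O)


if __name__ == "__main__":
    print("Down under the prior, round 1:", float(expected_down(P_X_PRIOR, 1)))
    print("Down under the prior, round 2:", float(expected_down(P_X_PRIOR, 2)))
    print("Down when certain of X, round 2:", float(expected_down(Fraction(1), 2)))
    print("Break-even belief:", float(break_even_belief()))
```

Under the prior alone, “Down” has a negative expected payoff (about -67 cents in round 1, the figure quoted in footnote 7), so a player who cannot infer his type does better with “Up”. “Down” only pays once the belief of being type “X” exceeds the break-even value, which is why correct inference from the partner's behaviour matters.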