Microsoft PowerPoint - Game Theory

Game Theory

What is Game Theory?
• Created by Von Neumann and Morgenstern in 1944 (book:
  Theory of Games and Economic Behavior).
• Has found application in economics, politics, biology,
  computer science, psychology, sociology, ...
• Whenever people or other agents interact with each other they
  are playing a game.
• Game Theory is the study of rational interactions in games, i.e.
  it is the study of the logic of interaction in games.
• It is the study of how agents can do as well as possible in
• This sounds like classical AI!!
• Will first consider examples of games, and then look at
  relevance to “Nouvelle AI”.

Some Examples of Games
• Traditional games such as chess, draughts,
  noughts-and-crosses.
• These are not usually considered interesting
  from a game theory point of view - this is
  because these games are games of perfect
  information, are finite, and have no chance
  (random) moves.
• In these circumstances there are theoretical
  results showing that the results of such games
  are essentially predetermined, assuming both
  players play perfectly.

Smugglers (adapted from Morton Davis).
• A drug smuggling ring operates by using two
  “carriers” to smuggle drugs via either the
  (sole) airport or the (sole) seaport of a country.
• The police have 2 officers to try to stop them.
• If one officer guards the exit used by both
  smugglers 70 kilos will get through, and for
  each smuggler at an exit with no officers
  present 50 kilos will get through.
• If there are at least as many officers as
  smugglers at an exit, no drugs will get through
  that exit.
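
The smuggling rules above can be turned into a payoff table mechanically. The following is a minimal sketch, assuming the payoff is the total number of kilos getting through both exits; the function names are illustrative and not part of the lecture.

```python
def kilos_through(smugglers: int, officers: int) -> int:
    """Kilos passing a single exit, given who is present at that exit."""
    if officers >= smugglers:
        return 0                  # at least as many officers as smugglers
    if officers == 0:
        return 50 * smugglers     # 50 kilos per smuggler at an unguarded exit
    return 70                     # one officer facing both smugglers


def payoff(smugglers_at_airport: int, police_at_airport: int) -> int:
    """Total kilos through both exits; everyone not at the airport is at the seaport."""
    airport = kilos_through(smugglers_at_airport, police_at_airport)
    seaport = kilos_through(2 - smugglers_at_airport, 2 - police_at_airport)
    return airport + seaport


# Rows: smugglers at airport (0, 1, 2). Columns: police at airport (0, 1, 2).
PAYOFF = [[payoff(s, p) for p in range(3)] for s in range(3)]

for row in PAYOFF:
    print(row)
```

Each row of the printed table corresponds to one choice by the smugglers, each column to one choice by the police.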

                          Police at Airport
                          0       1       2
                     0    0      70     100
Smugglers at Airport 1   50       0      50
                     2  100      70       0
 (entries are kilos of drugs getting through)

 What strategies should the police and smugglers adopt
 to achieve their goals?
 (Bear in mind this game will be played repeatedly!)

Enforcing speed limits (adapted from Morton Davis)
• A town council is trying to decide how strictly
  to enforce speed limits.
• By quantifying various costs and benefits to
  the community and to drivers - the time saved
  by speeding, the danger to the driver, the
  danger to the general public, the penalties to
  the speeding driver, the cost of enforcement - a
  game theorist arrives at the payoff matrix:

                     Council Enforces Law
                     Yes            No
Driver Speeds  Yes   (-190, -25)    (10, -5)
               No    (0, -20)       (0, 0)
 (payoffs are listed as (driver, council))

• Is it “cheaper” for the council to enforce, or ignore the
  law?
• Is there any other alternative?

Game Theory, Biology, and Nouvelle AI

Bluegill Sunfish (adapted from Ken Binmore).
• Bluegill sunfish males come in 2 varieties:
   – regular, which take 7 years to reach maturity, and which
     build a nest which attracts egg-laying females. When eggs
     are laid, he fertilizes them, and then defends the resulting
     family, while the female gets on with her life elsewhere.
   – rogue, which reach maturity in 2 years, and are not capable
     of rearing a family. Instead he lurks in hiding until a female
     has laid her eggs in response to a regular male, and then
     rushes out to fertilize the eggs before the regular male has
     a chance to do so. If successful, the regular male defends
     the resulting family, which are not related to him.

• If we regard the behaviour of bluegill sunfish males
  as a game in which each male is competing with the
  others to get the maximum number of offspring, then
  this can be treated as an N-player game (where N is
  the number of males in the population).
• If there are enough regular males, then the payoff is
  high for being a rogue, since there are plenty of
  opportunities to apply the rogue strategy.
• If there are too many rogues in the population, then it
  is better to be a regular male since there is at least
  some chance that the male will get to propagate its
  genes, versus almost no chance for any individual
  rogue.
• Therefore the fitness of an individual is determined
  by the relative proportions of the 2 types of male in
  the population.

• Selecting the fittest to reproduce most often
  will therefore lead to a situation in which the
  relative proportions of rogues and regulars are
  such that there is no particular advantage in
  being either.
• Applying game theory to biological modelling
  can therefore help us understand why
  evolution leads to populations with particular
  proportions of alternative strategies.
• It can also help us understand similar
  phenomena in evolutionary computational
  processes.
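
The argument that selection drives the population towards a mix in which neither type has an advantage can be illustrated with a small simulation. This is only a sketch: the two fitness functions and their coefficients are HYPOTHETICAL, chosen to match the qualitative story (being a rogue pays well when rogues are rare and badly when they are common), and are not taken from Binmore or from field data.

```python
def rogue_fitness(p: float) -> float:
    """HYPOTHETICAL payoff to a rogue when a fraction p of males are rogues."""
    return 1.2 * (1.0 - p)


def regular_fitness(p: float) -> float:
    """HYPOTHETICAL payoff to a regular male when a fraction p are rogues."""
    return 1.0 - 0.5 * p


def next_generation(p: float) -> float:
    """Each type reproduces in proportion to its fitness (discrete replicator step)."""
    fr = rogue_fitness(p)
    fg = regular_fitness(p)
    mean = p * fr + (1.0 - p) * fg
    return p * fr / mean


def evolve(p: float, generations: int = 200) -> float:
    """Rogue share after the given number of generations."""
    for _ in range(generations):
        p = next_generation(p)
    return p


# With these made-up coefficients the two fitnesses are equal at p = 2/7.
for start in (0.01, 0.5, 0.9):
    print(start, "->", round(evolve(start), 4))
```

Whatever the starting proportion (other than 0 or 1), the rogue share settles where the two fitness functions cross, which is the "no particular advantage in being either" situation described above.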

Variants
There are many variants of games and game theory. Games can
  be studied:
• With, or without, communication allowed between the players
   – promises
   – threats
   – lies
• With simultaneous play, or sequential (in which one player
  goes first, and then the other(s) get to choose their response).
• With deterministic, or stochastic, payoffs.
• With 2 or more players.
• With, or without, “side-deals” allowed.
• Repeated, or “one-shot”.
• etc.

Further Reading
• Davis, Morton (1970) Game Theory: A
  Nontechnical Introduction
  Many of the examples in this lecture are adapted
  from this book.
• Binmore, Ken (1992) Fun and Games: A Text on
  Game Theory
• Maynard Smith, John (1982) Evolution and the
  Theory of Games

Zero-Sum 2 Player Games
   • These are 2 player games where the interests of the
     players are completely and utterly opposed to each
     other, and one player's gain is the other player's loss
     (and vice-versa).
   • Can therefore be represented by a payoff matrix
     which just gives the payoffs to one of the players, and
     the payoffs for the other player can be computed from
     these.
   • Sometimes, the payoffs for the two players will add
     to a non-zero constant value. In this case the games
     are still considered zero-sum as they can easily be
     converted to an equivalent game in which the payoffs
     do add to zero.

A Political Example
• Two political parties (the Acranial, and the
  Brainless) are in the process of deciding their
  manifestos before an election.
• Abortion is a major issue in this election, and
  each party must decide whether they will adopt
  a “pro” legal abortion position, an anti-abortion
  position, or if they will simply avoid
  the issue.
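
The conversion from a constant-sum game to an equivalent zero-sum game amounts to subtracting half the constant from every payoff. A minimal sketch, assuming payoffs are stored as (player 1, player 2) pairs; the function name and the sample numbers are illustrative only (vote shares are a natural case, since they always add to 100).

```python
def to_zero_sum(payoffs, constant):
    """Shift every payoff by constant / 2 so that each pair sums to zero.

    Subtracting the same amount from all of a player's payoffs does not
    change which strategies that player prefers.
    """
    shift = constant / 2
    return [[(a - shift, b - shift) for a, b in row] for row in payoffs]


# Illustrative vote shares (per cent); each pair adds to 100.
shares = [
    [(45, 55), (50, 50)],
    [(60, 40), (55, 45)],
]

for row in to_zero_sum(shares, 100):
    print(row)
```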

   • Party researchers carry out opinion polls and
     discover that the following matrix represents the
     percentage of the vote that the Acranials will receive
     if each party adopts the positions shown:

                   Party B
                   Pro    Anti   Avoid
          Pro      45%    50%    40%
Party A   Anti     60%    55%    50%
          Avoid    45%    55%    40%

   What should the parties do?

In this case this is very simple.
• Party B should avoid the issue, since it always does
  best with this option no matter what A does
  (remember: smaller numbers are better for B!)
• Party A should adopt an “anti” position since it
  always does at least as well with this as with any
  other strategy no matter what B does.
• Therefore the predicted best outcome for each party is
  with these strategies and there will be a 50-50 split in
  the vote.
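
The outcome reached above is a saddle point of the matrix: an entry that is the smallest in its row (B cannot do better against that row) and the largest in its column (A cannot do better against that column). A minimal sketch of that check, with illustrative names:

```python
LABELS = ["Pro", "Anti", "Avoid"]

# Percentage of the vote going to Party A (rows: A's position, columns: B's).
A_SHARE = [
    [45, 50, 40],
    [60, 55, 50],
    [45, 55, 40],
]


def saddle_points(matrix):
    """Cells that are the minimum of their row and the maximum of their column."""
    points = []
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            column = [r[j] for r in matrix]
            if value == min(row) and value == max(column):
                points.append((i, j))
    return points


for i, j in saddle_points(A_SHARE):
    print(f"A: {LABELS[i]}, B: {LABELS[j]}, A's share: {A_SHARE[i][j]}%")
```

For this matrix the only saddle point is A choosing “Anti” and B choosing “Avoid”, with a 50% share for A.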

   What if the payoff matrix had been different?

                   Party B
                   Pro    Anti   Avoid
          Pro      35%    10%    60%
Party A   Anti     45%    55%    50%
          Avoid    40%    10%    65%

It is less clear now what each party should do.

They can arrive at a solution by reasoning as follows:
• B should NOT avoid the issue, since it will always do at least as well by
  choosing “Pro”.
• Once A realises that “Avoid” is not an option for B, it becomes clear that A
  should adopt an “Anti” stance, since it does better with this than any other
  option no matter what B does.
• Once B realises that A will adopt an “Anti” position, B should adopt a “Pro”
  position since this maximizes B's vote (and minimizes A's).

Points to Note
• In the last example the joint position A chooses “Anti” and B
  chooses “Pro” is known as a (Nash) equilibrium of the game,
  and these two choices are known as equilibrium strategies.
• A (Nash) equilibrium of a game represents a choice of
  strategy by each of the players such that no player is
  tempted to change strategy unilaterally, since they
  cannot do better (and may do worse) if they do.
• Some games have more than one equilibrium. In zero-sum
  games all such equilibria are equivalent in the sense that the
  payoffs for all players are the same in each of the equilibria.
• Some games have no equilibria in pure strategies. Is it possible
  to have a sensible strategy in such games?
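
The step-by-step reasoning for the second matrix is an instance of repeatedly discarding dominated strategies. A minimal sketch using strict dominance (a strategy is dropped when some other strategy is strictly better against everything the opponent still has left); the function names are illustrative.

```python
LABELS = ["Pro", "Anti", "Avoid"]

# Percentage of the vote going to Party A (rows: A's position, columns: B's).
A_SHARE = [
    [35, 10, 60],
    [45, 55, 50],
    [40, 10, 65],
]


def row_is_dominated(matrix, r, rows, cols):
    """A wants large entries: row r is dominated if another row is larger everywhere."""
    return any(all(matrix[o][c] > matrix[r][c] for c in cols)
               for o in rows if o != r)


def col_is_dominated(matrix, c, rows, cols):
    """B wants small entries: column c is dominated if another column is smaller everywhere."""
    return any(all(matrix[r][o] < matrix[r][c] for r in rows)
               for o in cols if o != c)


def iterated_elimination(matrix):
    """Remove strictly dominated rows and columns until none remain."""
    rows = list(range(len(matrix)))
    cols = list(range(len(matrix[0])))
    while True:
        bad_rows = [r for r in rows if row_is_dominated(matrix, r, rows, cols)]
        bad_cols = [c for c in cols if col_is_dominated(matrix, c, rows, cols)]
        if not bad_rows and not bad_cols:
            return rows, cols
        rows = [r for r in rows if r not in bad_rows]
        cols = [c for c in cols if c not in bad_cols]


rows, cols = iterated_elimination(A_SHARE)
print("A:", [LABELS[r] for r in rows], "B:", [LABELS[c] for c in cols])
```

On this matrix the first pass removes B's “Avoid”, the second leaves A with “Anti” only, and the third leaves B with “Pro”, which follows the same order as the reasoning above.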

Pure and Mixed Strategies
  • In zero-sum games with pure-strategy Nash equilibria one can do
    no better than to always play (one of) the equilibrium
    strategies. Since one always plays the same strategy,
    this is known as a pure strategy.
  • In games with no pure-strategy Nash equilibria, one can only
    guarantee the value of the game by playing a so-called
    mixed strategy, which involves randomly
    choosing (with appropriate probabilities) which pure
    strategy to play.

An Example (Smugglers)

                          Police at Airport
                          0       1       2
                     0    0      70     100
Smugglers at Airport 1   50       0      50
                     2  100      70       0

• If this game is played repeatedly it turns out
  that there are equilibrium strategies, but they
  are mixed.

• Police: Send both police to the airport 1/2 of the time,
  and both to the seaport the other 1/2 of the time.
• Smugglers:
   – 4/14 of the time send 1 smuggler to airport and 1 to seaport
   – 5/14 of the time send both to the airport
   – 5/14 of the time send both to the seaport
• These strategies guarantee that on average the
  smugglers will get 50 kilos through, and the police
  will stop 50 kilos from getting through. If the
  smugglers (always) used a non-optimal strategy then
  there is a strategy for the police which will mean the
  smugglers get less through, and if the police use a
  non-optimal strategy then the smugglers can choose a
  strategy which will get more through.

The Minimax Theorem
• Von Neumann (yes, him again!) proved a theorem about finite
  2 player zero-sum games, which roughly says (again taken
  from Morton Davis):
  “Every finite 2 player zero-sum game has a value V which is
  the amount player 1 will win on average if both players play
  sensibly”
• The following should be noted about this theorem:
• There is a (mixed) strategy for player 1 that will guarantee
  this. Nothing player 2 can do will prevent player 1 from
  getting V on average.
• There is a (mixed) strategy for player 2 that will guarantee that
  player 1 gets no more than V.
• Since the game is zero-sum, what player 1 wins, player 2
  loses. Since player 2 wants to minimise her losses, player 2 is
  motivated to limit player 1's average win to V.
• Note this last point does not hold for non-zero-sum games.
• The mixed equilibrium strategies for the Smuggling game are
  an example of the theorem, where V = 50.
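
The claim that V = 50 for the smuggling game can be checked by computing the expected payoff of each mixed strategy against every pure strategy of the opponent. A minimal sketch using exact fractions; the helper names are illustrative.

```python
from fractions import Fraction as F

# Kilos getting through. Rows: smugglers at airport (0, 1, 2).
# Columns: police at airport (0, 1, 2).
PAYOFF = [
    [0, 70, 100],
    [50, 0, 50],
    [100, 70, 0],
]

# Probabilities of sending 0, 1 or 2 smugglers (or officers) to the airport.
SMUGGLERS = [F(5, 14), F(4, 14), F(5, 14)]
POLICE = [F(1, 2), F(0), F(1, 2)]


def against_each_column(matrix, row_mix):
    """Expected payoff of a row mix against each pure column strategy."""
    return [sum(p * matrix[i][j] for i, p in enumerate(row_mix))
            for j in range(len(matrix[0]))]


def against_each_row(matrix, col_mix):
    """Expected payoff of a column mix against each pure row strategy."""
    return [sum(q * matrix[i][j] for j, q in enumerate(col_mix))
            for i in range(len(matrix))]


# The smugglers are guaranteed at least the minimum of the first list;
# the police hold them to at most the maximum of the second.
print(against_each_column(PAYOFF, SMUGGLERS))
print(against_each_row(PAYOFF, POLICE))
```

Both lists come out as 50 in every position, so the smugglers can guarantee an average of at least 50 kilos and the police can hold them to at most 50.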

Non-Zero-Sum 2 Player Games
• Zero-sum games are games of “pure opposition” i.e.
  anything one player gains is balanced by equivalent
  losses by the other players, so the players simply
  compete with each other.
• In contrast, non-zero-sum games involve elements of
  both competition and cooperation, and it is possible
  for both players to gain, or both to lose, or for there to
  be a net gain or loss by the players.
• The payoff matrix for such games must specify the
  payoffs for both players, since those for player 2
  cannot be computed from those for player 1.
• Note that in such games each player is trying to
  maximize their own payoff. This is NOT necessarily
  the same as minimizing the payoff to one's opponent.

The Prisoner's Dilemma
Two people are arrested and charged with a serious
crime. They are kept in separate cells, with no
opportunity to confer. If both people remain silent,
and tell the police nothing, they will both be set free,
as there is insufficient evidence to convict them. If
one “turns state's evidence” (and says the other
prisoner did it), that prisoner will be rewarded with
£20000, and the other prisoner will get a 6 year
sentence. If they both “turn state's evidence” (and
inform on each other), they both get a 1 year
sentence.

                          P2
                  C                    D
  P1      C     (0, 0)           (-60000, 20000)
          D  (20000, -60000)    (-10000, -10000)

        C = cooperate (= remain silent)
        D = defect (= inform)

The equilibrium of this game is where both players defect, and the inexorable
logic of the fact that both players are trying to maximize their payoff drives them to
this point.
If they could agree to both cooperate, and stick to the agreement, they
would both do much better. However, even if they had agreed there is nothing to
stop a player reneging on their promise (talk is cheap!), and for the player reneging,
it is to their advantage to do so.
For a one-shot game the Nash equilibrium is really the inevitable outcome of this
game if both players play rationally.

Iterated Prisoner's Dilemma
Suppose instead that the players are given the
opportunity to repeatedly play the game against each
other. Now the possibility exists to learn from past
play, and to give the other player the opportunity to
cooperate, and if they don't, to “punish” them by
future play.

Possible strategies include:
• random play
• always defect
• always cooperate
• something “intelligent” depending on opponent's play
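
Both the one-shot equilibrium and the repeated game can be explored with a few lines of code. The sketch below uses the payoff matrix given for the Prisoner's Dilemma; the function names are illustrative, and tit-for-tat (cooperate first, then copy the opponent's previous move) is used here as one example of a strategy that depends on the opponent's play.

```python
import random

# (P1 move, P2 move) -> (P1 payoff, P2 payoff)
PAYOFF = {
    ("C", "C"): (0, 0),
    ("C", "D"): (-60000, 20000),
    ("D", "C"): (20000, -60000),
    ("D", "D"): (-10000, -10000),
}
OTHER = {"C": "D", "D": "C"}


def is_nash(m1, m2):
    """True if neither player gains by changing their own move alone."""
    p1, p2 = PAYOFF[(m1, m2)]
    return PAYOFF[(OTHER[m1], m2)][0] <= p1 and PAYOFF[(m1, OTHER[m2])][1] <= p2


# Strategies take (own history, opponent's history) and return "C" or "D".
def random_play(mine, theirs):
    return random.choice("CD")


def always_defect(mine, theirs):
    return "D"


def always_cooperate(mine, theirs):
    return "C"


def tit_for_tat(mine, theirs):
    return theirs[-1] if theirs else "C"


def play_match(s1, s2, rounds=10):
    """Total payoffs to each player over a fixed number of rounds."""
    h1, h2 = [], []
    total1 = total2 = 0
    for _ in range(rounds):
        m1, m2 = s1(h1, h2), s2(h2, h1)
        p1, p2 = PAYOFF[(m1, m2)]
        total1 += p1
        total2 += p2
        h1.append(m1)
        h2.append(m2)
    return total1, total2


print([cell for cell in PAYOFF if is_nash(*cell)])
print(play_match(tit_for_tat, always_defect))
print(play_match(tit_for_tat, tit_for_tat))
```

Mutual defection is the only cell that passes the equilibrium check. Over ten rounds, tit-for-tat loses only the first round to a constant defector, while two tit-for-tat players cooperate throughout.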

Evolution of Cooperation
  • Axelrod used this to study the evolution of
    cooperative behaviour. He set up a computer
    tournament where programs (strategies) all
    play each other, and get to reproduce (have
    copies of themselves made) in proportion to
    how well they do.
  • Do cooperative strategies win out?
  • More on this in exercise classes!!

Something slightly paradoxical!
• Note that there is something paradoxical about playing
  cooperatively if the players know how many iterations there
  are going to be (e.g. 10).
• Even if (or maybe because) one has been playing
  cooperatively with one's opponent, on the last iteration there is
  a temptation to defect to try and “sneak in” a final extra gain,
  since there is no possibility of being “punished” for it.
• Therefore the last iteration will be just like a “one-shot”
  game.
• Therefore the game before the last is in some sense really the
  last.
• Therefore one should play this like a one-shot game as well.
• And so on, for the previous game, and ultimately, all 10.
• What might be wrong with this argument?

   Evolutionarily Stable Strategies
• Consider the bluegill sunfish example mentioned earlier.
• Suppose that the proportions of rogue and regular males are
  p1 and p2 (where p1 + p2 = 1).
• From a game theoretic point of view it makes no difference if
  we have a population divided in these proportions, or if each
  player (a male sunfish) plays the mixed strategy rogue with
  probability p1 and regular with probability p2.
• This mixed strategy is called an evolutionarily stable strategy
  if a male playing a different “mutant” strategy could not take
  over the population.
• Much evolutionary biological modelling is devoted to finding
  evolutionarily stable strategies, or to showing that the
  evolutionarily stable strategies of the model correspond to
  what happens in reality.
• Think about:
       Are strategies like tit-for-tat evolutionarily stable?

