Using Bayes’ Theorem for Path Prediction
R.J.S. Baker, P.I. Cowling, T.W.G. Randall, and P. Jiang

Abstract: Understanding the intentions of another living creature is an inherent ability, one which humans use with great regularity in daily life. In video games, however, static interactions are generally used instead of ones that dynamically adapt to their opponents. The work presented uses a grid-based interception game to form probabilistic beliefs relating actions to goals by observing path transitions found through the A* algorithm. We then apply Bayesian inference to determine the opponent’s destination given only real-time enemy path information.
Keywords: Bayesian Analysis, Opponent Model, Poker

This work was funded by the Engineering and Physical Sciences Research Council. We also gratefully acknowledge the contribution of Introversion Software Ltd., who donated the DEFCON source code for use in our research.

I. INTRODUCTION

Laird and van Lent [8] stated seven years ago that video games could be the ‘killer application’ for human-level AI research, and much has been done by the academic research community to investigate AI within video game environments, and to create believable, human behaviours. However, many commercial video games continue to rely on a relatively small set of core techniques, such as finite state machines and environmental triggers. An experienced player who has already observed these static behaviours is able to exploit the weaknesses involved with a static decision-making process. Game developers are now more willing to commit to some of the advances of the academic research community as the requirement for believably intelligent behaviour is becoming much more apparent in the pursuit of a truly immersive experience. We believe that the modeling and prediction of human intentions and interactions is an integral part of achieving this goal.

II. OPPONENT MODELING

Opponent modeling is arguably one of the most important, yet most complex, of human abilities. An ‘average’ human is able to distinguish, through observation, the nature and intention of numerous actions, as well as assess the capabilities of those who perform them.

Humans have an inherent ability to create a mental model of interactive behaviours, where the model is an approximated simulation of how another creature will behave [6]. These models are generally created through social observation and interaction, which yield much information from which to learn. Through the generalization and interpretation of the modeled set of actions over many varied circumstances, humans are able to display a socially acceptable interactive behaviour pattern [3]. Le Doux [10] explains that humans possess a fast mental process that warns of a potential danger, and readies the body for reaction. In addition to this, a slower reasoning process that determines an appropriate reaction is initiated simultaneously. This reasoning process is a system that models the nature of all given stimuli which initially triggered the fast process. For researchers and game developers alike, a similar understanding of interactions, and subsequent reactions, is essential to achieving intelligent

A. Opponent Models and Games

At the very inception of Game Theory, von Neumann developed the idea of a minimax search upon the game tree of a zero-sum game. This considers all possible performable actions, and applies a heuristic score to game situations that may arise. The opponent modeling in this is such that we assume that the opponent would choose the same action the player would if in the opponent’s position. This idea has been applied in Laird’s research into creating a QUAKE II deathmatch AI [9]. Laird’s approach uses the AI’s own state analysis to model a prediction of the opponent’s next position. The predictive accuracy and behaviour of this approach supports the underlying principle behind opponent modeling: understanding (or acknowledging the existence of) an opponent’s strategy is integral to successful and believable behaviour. The use of opponent modeling in video games is best applied to enemy characters, which should provide a challenging interaction with the player, but avoid situations where a weaker player would be overwhelmed [5].

An attempt to remedy the difficulty issue presented by enemy characters was implemented in Dynamic Scripting, a technique that adapts over several iterations in order to produce an effective player by selecting and scoring a set of scripts [12]. Spronck later developed top-culling, a technique which prevents overly successful techniques from being selected, dynamically adjusting the enemy’s strategy in order to accommodate the shortcomings of a human player [13]. The limitations of Spronck’s approach exist only in the requirement of iterated encounters; the player must die or lose a battle before the game adapts its difficulty. Games such as first-person shooters (FPS) and real-time strategy (RTS) games, however, should not necessarily force the failure of a human player before adaptation, but rather react to human behaviours in real time. Obviously, this adaptation should not be so forgiving as to fail to punish a player’s foolish actions.

In most commercial video games, the aim of an opponent modeling system should be to identify user strategies through observation and interaction, and then identify expected
behaviours.

B. Agent Path Prediction

Predicting the path of an agent is a necessary technique in many facets of interaction. Much research has attempted to model the path-making behaviours of humans in real-world situations [7]. The applications of such a predictive technique across virtual environments are manifold. In this vein, military prediction of enemy units has been deeply researched [2], particularly into the potential destinations of naval units [14]. In the context of video games, path modeling has primarily been concerned with online environments, where Dead Reckoning is used to model prospective agent position in an attempt to reduce network traffic caused by the transmission of positional data [11]. Our aim in this paper is to demonstrate a probabilistic technique for determining an agent’s short-term goal in real time, for use in determining an appropriate and intelligent interceptive reaction.

III. BAYES’ RULE

Bayes’ rule relates conditional and marginal probability distributions of random variables, and shows that however different the probability of event A conditional upon event B is to that of B conditional upon A, there is still a relationship between the two.

$$\Pr(A \mid B) = \frac{\Pr(B \mid A)\,\Pr(A)}{\Pr(B)} \tag{1}$$

Equation (1) gives Bayes’ rule where, in a game scenario, our opponent has an unknown strategy, A. We need to determine the strategy A, given the observed set of opponent actions, B. In our previous work based on a simplified form of Poker, we considered that A can represent one of four strategies, and each of these strategies has probability Pr(B | A) of performing action B (where B is one of three actions: bet, check and fold) [1]. This information is used to create a probability distribution that determines if our opponent is performing strategy A through the analysis of the past set of player actions. The action-strategy probabilities are determined by observing numerous games played by each strategy-using opponent and recording the frequency of each action performed. For a player who does not have this experience, subjective approximations must be created for the frequency of actions performed by each style. We can suppose that all players have strategies by which to achieve certain goals regardless of game.

This section introduces GridWorld, a teaching/research tool for AI techniques, and a harness for a competitive AI environment, created by the University of Bradford and Black Marble Ltd. with sponsorship from Microsoft [4]. An example of Halmoids, a GridWorld game, can be seen in Fig. 1.

The aim of Halmoids is to control one or more pirate ships in order to navigate to a treasure chest of the same colour as the ship(s); the score is determined by the number of ships that have reached their goal. In order to win, a player must maintain an advantage over their opponent, and therefore should attempt to stop one or more of their opponent’s ships from reaching their goal. This game is somewhat analogous to a ‘Capture-the-Flag’ mode of play from many multiplayer FPS games, where attacking or defensive strategies must be chosen in response to the opponent’s actions. The interception of an opponent requires early inference of their destination, given that the opponent has multiple goals to choose from. In this example, the lower ship has been given only one of the treasure chests as its target destination, and the upper ship needs to intercept the lower ship before it reaches its goal. A useful analogy can be drawn from sports simulations, such as player ‘marking’ tactics within a soccer game. A ‘marked’ player must attempt to evade the opposition player assigned to follow and restrict his movement, especially in such scenarios as those of a free-kick or corner kick. Halmoids as a game represents the task of both the marked player and the marker themselves. In this work we concentrate on the role of the marker. As with many game scenarios, predicting a player’s tactical intention through action analysis is integral to successful play; to fail to understand the opponent’s target can potentially mean the loss of a goal in soccer, or the entire game in the case of Halmoids.

Fig. 1. A simple ship-blocking terrain.

A. A* Analysis and the Issue of Subjectivity

An agent in a video game should be able to generalise its ability to infer destination over many scenarios or, in the case of Halmoids, terrains. We attempt to create a generalised
opponent modeling approach through partial terrain analysis. The application of Bayes’ rule requires an initial probability distribution to represent the link between opponent action and strategy/goal. This probability distribution can either be constructed from data of past interactions, or from a designer’s subjective belief of how an opponent should behave to achieve its goal. In a situation where the nature of a terrain/opponent is unavailable, we cannot take a subjective viewpoint. Thus, we need to be able to generalise for unforeseen terrains, and actively procure environmental data before analysis can be performed. We use the A-Star (A*) algorithm, with a Euclidean distance heuristic, to create paths from our opponent ship to each of its potential destinations. We cannot explicitly compare our opponent’s position each turn with the paths we create, as our (human) opponent could use a different distance heuristic or a non-optimal form of pathfinding, or a single path could be required by more than one goal, which can potentially apply in the case of Fig. 1. In light of this, our approach is to analyse each position along the path in relation to the previous position. The relation is recorded in one of eight directions, and a set S of directional probabilities per path is recorded, represented by

$$S = \{U, D, L, R, UL, UR, DL, DR\}$$

where U represents the Up direction, D represents the Down direction, L represents the Left direction, and R represents the Right direction; the two-letter symbols denote the corresponding diagonal directions. Obviously, this can be transformed to any directional or angular scale of choice, but these eight directions are chosen for simplicity. This directional choice can also apply to the spatial representation of a character in both 2- and 3-dimensional environments.

                              TABLE I
                         OPPONENT TARGETS

       Direction        Target 1        Target 2      Target 3
       UP               0.63777         0.68333       0.63777
       DOWN             0.00008         0.00008       0.00008
       LEFT             0.17666         0.07666       0.00008
       RIGHT            0.00008         0.07666       0.17666
       UP-LEFT          0.17666         0.07666       0.00008
       UP-RIGHT         0.00008         0.07666       0.17666
       DOWN-LEFT        0.00008         0.00008       0.00008
       DOWN-RIGHT       0.00008         0.00008       0.00008

We now have a set of per-directional probability distributions for each possible destination. Using the terrain from Fig. 1, we generate a distribution as shown in Table I.

Given our initial probability distribution, we can now perform an iterative calculation using Equation (1), which takes each opponent action observed by the player and yields a posterior probability distribution over the opponent’s possible targets, one of which the opponent chooses at random. As our observation yields further information as to our opponent’s target, we can then intercept using the appropriate path to block our opponent’s progression. Fig. 2 shows the convergence of probabilities given the movement data of an opponent moving to the right-hand treasure chest, Target 3. Fig. 3, however, shows the performance of a player moving to the central treasure chest, Target 2. As observation of the terrain in Fig. 1 shows, moving to the central chest requires movement along one of the paths to an adjacent chest, hence the initial convergence of target belief upon Target 3. The subsequent increase in the belief for Target 2 can be explained through Fig. 4, which shows a map of the reaction of the Bayes-controlled ship (b) to the opponent action set. As can be observed, b successfully intercepts the opponent before it reaches the target. Observing the path taken by b, we can see that, due to the frequency of moves towards the right-hand treasure chest, the belief that the opponent will move to the right-hand treasure chest remains strong up to move number 9, so that the path of b visibly changes on the next move, when it realises that its initial belief is incorrect. This performance yields a behaviour that appears human in its folly, while showing enough intelligence to correct its error.

Fig. 2. Bayesian analysis of an opponent ship heading towards Target 3.

Fig. 3. Bayesian analysis of an opponent ship heading towards Target 2.

B. Coping with Random Behaviour

In order to test the robustness of the A*-Bayes hybrid, varying levels of random behaviour are added to the lower ship’s pathfinding, so that with probability p the move made is at random rather than following its original path. Fig. 5 shows the accuracy of our analysis against an opponent where 0.01 ≤ p ≤ 1.0, applied in increments of 0.01. As we can observe, the predictive accuracy falls with an increase in the amount of
random behaviour displayed. This can be accounted for by the presence of moves which are linked with a very small probability in relation to a target, as displayed in Table I. This is further compounded by the amount of random movement that occasionally forces a ‘bluff’ move towards a different goal at the last minute; this sidestep can occasionally be enough to fool the analysis into believing in a change in destination. We can also see that as the amount of random behaviour further increases, the success of the analysis increases somewhat; this is due to the lack of focus or direction for the randomised player, causing an erratic approach, stopping or slowing the player from reaching its goal.

Fig. 4. Performance of the Bayes-controlled (upper) ship.

Fig. 5. Interception success of a Bayes-controlled ship against an opponent applying varying levels of randomness to its actions.

V. CONCLUSION

This paper describes how a probabilistic A* path analysis can be used as a generalizable means to develop probability distributions representing the link between assumed opponent actions and a short-term goal. We have shown that Bayesian analysis upon this distribution in relation to opponent actions observed in real time can determine the goal of a previously unseen opponent, and hence determine the point of interception. The potential for applying Bayesian analysis and, by relation, opponent modeling to video game agents is considerable; the data that can be gleaned from human interaction is expansive in scope, yet few have attempted to analyse it and apply any inference to a game environment.

FURTHER WORK

We are currently working with Introversion Software and their commercial video game, DEFCON, a self-described ‘Genocide-‘em-up’ which simulates a global thermonuclear war. Each player places units along their continent with the aim of inflicting as much damage to the opponent as possible. Our future work lies in using the analysis of the imperfect information of opponent actions and partial unit placement data in order to determine and counteract the enemy’s goals.

REFERENCES

[1]    Baker, R.J.S., and Cowling, P.I. 2007. Bayesian Opponent Modeling in a Simple Poker Environment. IEEE Symposium on Computational Intelligence and Games (CIG 2007), 125-131, Honolulu, USA.
[2]    Brown, D.E., and Gordon, G. 2005. Terrain Based Prediction to Reduce the Search Area in Response to Insurgent Attacks. The 10th International Command and Control Research and Technology Symposium, McLean, VA 13-16.
[3]    Byrnes, J.P. 2001. Cognitive development and learning in instructional contexts (2nd ed.). Boston, MA: Allyn & Bacon.
[4]    Cowling, P.I. 2006. Writing AI as Sport. AI Game Programming Wisdom 3 (ed. Steve Rabin). 89-96. Charles River Media, Hingham,
[5]    van den Herik, H.J.; Donkers, H.H.L.M., and Spronck, P.H.M. 2005. Opponent Modelling and Commercial Games. IEEE 2005 Symposium on Computational Intelligence and Games (CIG 2005) (eds. Graham Kendall and Simon Lucas). 15-25.
[6]    Johnson-Laird, P.N. 1983. Mental models: towards a cognitive science of language, inference and consciousness. Cambridge, UK: Cambridge University Press.
[7]    Krumm, J. 2006. Real Time Destination Prediction Based on Efficient Routes. Society of Automotive Engineers (SAE) 2006 World Congress, Paper 2006-01-0811.
[8]    Laird, J.E., and van Lent, M. 2000. Human-level AI's Killer Application: Interactive Computer Games. AAAI Fall Symposium Technical Report, North Falmouth, Massachusetts, 80-97.
[9]    Laird, J.E. 2000. It Knows What You're Going to Do: Adding Anticipation to a Quakebot. AAAI 2000 Spring Symposium Series: Artificial Intelligence and Interactive Entertainment, AAAI Technical Report SS-00-02.
[10]   Le Doux, J. 1997. The Emotional Brain: The mysterious underpinnings of emotional life. New York: Simon & Schuster.
[11]   Li, S.; Chen, C., and Li, L. 2008. A new method for path prediction in network games. Computers in Entertainment. Vol. 5, No. 4, 1-12.
[12]   Spronck, P.H.M.; Sprinkhuizen-Kuyper, I., and Postma, E. 2003. Online Adaptation of Game Opponent AI in Simulation and in Practice. Proceedings of the 4th International Conference on Intelligent Games and Simulation (GAME-ON 2003) (eds. Quasim Mehdi and Norman Gough). 93-100.
[13]   Spronck, P.H.M.; Sprinkhuizen-Kuyper, I., and Postma, E. 2004. Difficulty Scaling of Game AI. GAME-ON 2004: 5th International Conference on Intelligent Games and Simulation (eds. Abdennour El Rhalibi and Danny Van Welden). 33-37.
[14]   Zhao, X.; Xu, R., and Kwan, C. 2004. Ship-motion prediction: algorithms and simulation results. IEEE International Conference on Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). Vol. 5, No. 8, 17-21.
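
The directional analysis described under "A* Analysis and the Issue of Subjectivity" (each position on a path is related to the previous one in one of eight directions, and the relative frequencies form the set S) can be illustrated with a minimal sketch. It is not the implementation used in the experiments. The function name `directional_distribution`, the grid convention (y decreases when moving up) and the `floor` value given to directions that never occur are assumptions introduced here; Table I shows small non-zero entries for unused directions, but the paper does not state how they were obtained.

```python
from collections import Counter

# Assumed grid convention: x grows to the right, y grows downward,
# so a step "UP" decreases y by one.
DIRECTIONS = {
    (0, -1): "UP", (0, 1): "DOWN", (-1, 0): "LEFT", (1, 0): "RIGHT",
    (-1, -1): "UP-LEFT", (1, -1): "UP-RIGHT",
    (-1, 1): "DOWN-LEFT", (1, 1): "DOWN-RIGHT",
}


def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def directional_distribution(path, floor=1e-4):
    """Return {direction: probability} for the steps along `path`.

    `path` is a list of (x, y) grid cells, for example one produced by A*.
    `floor` is a hypothetical small probability for directions that never
    occur, so that one unexpected move cannot rule a target out entirely.
    """
    counts = Counter()
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        step = (sign(x1 - x0), sign(y1 - y0))
        if step != (0, 0):  # ignore repeated cells
            counts[DIRECTIONS[step]] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("path must contain at least one move")
    raw = {d: max(counts[d] / total, floor) for d in DIRECTIONS.values()}
    norm = sum(raw.values())  # renormalise after applying the floor
    return {d: p / norm for d, p in raw.items()}


# Example: a short path that mostly heads up, with one up-right step.
example = directional_distribution([(0, 5), (0, 4), (0, 3), (1, 2), (1, 1)])
```

Running the function once per candidate destination, on the path found to that destination, would give one column of a table shaped like Table I.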

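
The iterative use of Equation (1) with the distributions of Table I can be sketched in the same spirit. The likelihood values below are copied from Table I; the function names `bayes_update` and `infer_target` are hypothetical, and the uniform prior is an assumption motivated by the statement that the opponent chooses its target at random.

```python
# Pr(direction | target), copied from Table I.
TABLE_I = {
    "Target 1": {"UP": 0.63777, "DOWN": 0.00008, "LEFT": 0.17666,
                 "RIGHT": 0.00008, "UP-LEFT": 0.17666, "UP-RIGHT": 0.00008,
                 "DOWN-LEFT": 0.00008, "DOWN-RIGHT": 0.00008},
    "Target 2": {"UP": 0.68333, "DOWN": 0.00008, "LEFT": 0.07666,
                 "RIGHT": 0.07666, "UP-LEFT": 0.07666, "UP-RIGHT": 0.07666,
                 "DOWN-LEFT": 0.00008, "DOWN-RIGHT": 0.00008},
    "Target 3": {"UP": 0.63777, "DOWN": 0.00008, "LEFT": 0.00008,
                 "RIGHT": 0.17666, "UP-LEFT": 0.00008, "UP-RIGHT": 0.17666,
                 "DOWN-LEFT": 0.00008, "DOWN-RIGHT": 0.00008},
}


def bayes_update(prior, move, likelihoods=TABLE_I):
    """Apply Equation (1) once: A is the target, B is the observed move."""
    unnormalised = {t: likelihoods[t][move] * prior[t] for t in prior}
    evidence = sum(unnormalised.values())  # Pr(B), summed over all targets
    return {t: v / evidence for t, v in unnormalised.items()}


def infer_target(moves, likelihoods=TABLE_I):
    """Return the belief over targets after each observed move."""
    # Assumed uniform prior: every target is equally likely at the start.
    belief = {t: 1.0 / len(likelihoods) for t in likelihoods}
    history = [belief]
    for move in moves:
        belief = bayes_update(belief, move, likelihoods)
        history.append(belief)
    return history


# Example: an opponent that moves up, then drifts to the right.
beliefs = infer_target(["UP", "UP-RIGHT", "RIGHT"])
most_likely = max(beliefs[-1], key=beliefs[-1].get)
```

With these values a single UP move leaves the three targets almost indistinguishable, with Target 2 marginally ahead because its UP entry is the largest; the rightward moves then shift the belief towards Target 3.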
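
The randomised opponent used to test robustness (with probability p the move made is random rather than the planned one) can be approximated by the sketch below. It is a simplification under stated assumptions: it perturbs a fixed list of planned moves and does not re-plan a path after a random step, which a full game would need to do. The names `MOVES` and `noisy_moves` are hypothetical.

```python
import random

# The eight grid moves as (dx, dy) offsets; y grows downward (assumption).
MOVES = [(0, -1), (0, 1), (-1, 0), (1, 0),
         (-1, -1), (1, -1), (-1, 1), (1, 1)]


def noisy_moves(planned_moves, p, rng=None):
    """Return a copy of `planned_moves` in which each move is replaced,
    with probability p, by a move drawn uniformly from MOVES."""
    rng = rng or random.Random()
    result = []
    for move in planned_moves:
        if rng.random() < p:
            result.append(rng.choice(MOVES))  # random behaviour
        else:
            result.append(move)               # follow the original path
    return result


# Example: a planned run of five upward moves, disturbed with p = 0.3.
plan = [(0, -1)] * 5
observed = noisy_moves(plan, 0.3, random.Random(42))
```

Sweeping p from 0.01 to 1.0 in steps of 0.01 and recording how often the interceptor succeeds would mirror the experiment summarised in Fig. 5.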