Games
CPS 170
Ron Parr

Why Study Games?
• Many human activities can be modeled as games
  – Negotiations
  – Bidding
  – TCP/IP
  – Military confrontations
  – Pursuit/Evasion
• Games are used to train the mind
  – Human game-playing, animal play-fighting

Why Are Games Good for AI?
• Games typically have concise rules
• Well-defined starting and end points
• Sensing and effecting are simplified
  – Not true for sports games
  – See RoboCup
• Games are fun!
• Downside: getting taken seriously (not)
  – See robo search and rescue

History of Games in AI
• Computer games have been around almost as long as computers (perhaps longer)
  – Chess: Turing (and others) in the 1950s
  – Checkers: Samuel's learning program in the 1950s
• The field usually starts with naïve optimism
• Followed by naïve pessimism
• Simon predicted a computer chess champion by 1967
• Many, e.g., Kasparov, predicted that a computer would never be champion

Game Setup
• Most commonly, we study games that are:
  – 2-player
  – Alternating
  – Zero-sum
  – Perfect information
• Examples: checkers, chess, backgammon
• These assumptions can be relaxed at some expense
• Economics studies the case where the number of agents is very large
  – Individual actions don't change the dynamics

Games Today
• Computers perform at champion level
  – Backgammon, checkers, chess, Othello
• Computers perform well
  – Bridge
• Computers still do badly
  – Go, Hex

Zero Sum Games
• Assign values to different outcomes
• Win = 1, Loss = -1
• In zero-sum games, every gain comes at the other player's expense
• The sum of both players' scores must be 0
• Are any games truly zero sum?

Characterizing Games
• Two-player games are very much like search
  – Initial state
  – Successor function
  – Terminal test
  – Objective function (heuristic function)
• Unlike search
  – The set of terminal states is often very large
  – Full search to terminal states is usually impossible

Game Trees
[Figure: a tic-tac-toe game tree. Player 1 (max nodes) chooses among moves A1, A2, A3; Player 2 (min nodes) replies with A11, A12, A21, A22, A31, A32; the leaves are terminal boards.]

Minimax
• The max player tries to maximize his return
• The min player tries to minimize his return
• This is optimal for both (zero sum)
• A Python sketch of the recurrence follows the evaluation-function slides below

  minimax(n_max) = max_{s ∈ successors(n)} minimax(s)
  minimax(n_min) = min_{s ∈ successors(n)} minimax(s)

Minimax Values
[Figure: an example tree with a max root over min nodes; the terminal values are 3, 12, 2, 4, 15, 2, the min nodes take values 2, 3, 2, and the root's minimax value is 3.]

Minimax Properties
• Minimax can be run depth first
  – Time O(b^m)
  – Space O(bm)
• Assumes that the opponent plays optimally
• Based on a worst-case analysis
• What if this is incorrect?

Minimax in the Real World
• Search trees are too big
• Alternating turns double the depth of the search
  – 2 ply = 1 full turn
• Branching factors are too high
  – Chess: 35
  – Go: 361
• Search from the starting position never terminates in non-trivial games

Evaluation Functions
• Like heuristic functions
• Try to estimate the value of a node without expanding all the way to termination
• Using evaluation functions:
  – Do a depth-limited search
  – Treat the evaluation function as if it were terminal
• What's wrong with this?
• How do you pick the depth?
• How do you manage your time?
  – Iterative deepening, quiescence

Desiderata for Evaluation Functions
• Would like one that puts the same ordering on nodes (even if the values aren't totally right)
• Is this a reasonable thing to ask for?
• What if you have a perfect evaluation function?
• How are evaluation functions made in practice?
  – Buckets
  – Linear combinations (see the sketch below)
    • Chess pieces (material)
    • Board control (positional, strategic)
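To make the minimax recurrence concrete, here is a minimal Python sketch. The helpers is_terminal, utility, and successors are hypothetical stand-ins that a concrete game would have to supply; they are not part of the slides.

    def minimax(state, is_max_node):
        # Terminal nodes return their true value, e.g. win = 1, loss = -1.
        if is_terminal(state):
            return utility(state)
        # Max nodes take the best child value; min nodes take the worst.
        child_values = [minimax(s, not is_max_node) for s in successors(state)]
        return max(child_values) if is_max_node else min(child_values)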
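A depth-limited variant treats the evaluation function as if it were terminal, as described above. The linear-combination evaluation shown here is only illustrative: the feature names (queen_diff, pawn_diff, mobility_diff, and so on, each read as "my count minus the opponent's") are assumptions for the example, not a standard interface.

    def evaluate(state):
        # Hypothetical chess-style evaluation: weighted material plus a
        # small positional (board-control) term.
        material = (9 * state.queen_diff + 5 * state.rook_diff +
                    3 * state.bishop_diff + 3 * state.knight_diff +
                    1 * state.pawn_diff)
        return material + 0.1 * state.mobility_diff

    def minimax_cutoff(state, depth, is_max_node):
        if is_terminal(state):
            return utility(state)
        if depth == 0:
            return evaluate(state)   # pretend the cutoff node is terminal
        values = [minimax_cutoff(s, depth - 1, not is_max_node)
                  for s in successors(state)]
        return max(values) if is_max_node else min(values)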
Search Control Issues
• Horizon effects
  – Sometimes something interesting is just beyond the horizon
  – How do you know?
• When do you generate more nodes?
• If you selectively extend your frontier, how do you decide where?
• If you have a fixed amount of total game time, how do you allocate it?

Pruning
• The most important search control method is figuring out which nodes you don't need to expand
• Use the fact that we are doing a worst-case analysis to our advantage
  – The max player cuts off search when he knows min can force a provably bad outcome
  – The min player cuts off search when he knows max can force a provably good (for max) outcome

Alpha-Beta Pruning
• We still do (bounded) DFS
• Expand at least one path to the "bottom"
• If the current node is a max node and min can force a lower value, then prune its remaining siblings
• If the current node is a min node and max can force a higher value, then prune its remaining siblings

How to Prune
[Figure: the minimax example tree again (terminal values 3, 12, 2, 4, 15, 2; root value 3), with the branches alpha-beta never needs to expand marked as pruned.]

Implementing Alpha-Beta Pruning

    def max_value(state, alpha, beta):
        # cutoff, eval, and successors are the game-specific helpers
        # assumed on the earlier slides.
        if cutoff(state):
            return eval(state)
        for s in successors(state):
            alpha = max(alpha, min_value(s, alpha, beta))
            if alpha >= beta:
                return beta    # min can already force <= beta elsewhere: prune
        return alpha

    def min_value(state, alpha, beta):
        if cutoff(state):
            return eval(state)
        for s in successors(state):
            beta = min(beta, max_value(s, alpha, beta))
            if beta <= alpha:
                return alpha   # max can already force >= alpha elsewhere: prune
        return beta

    # Search from the root with an unbounded window:
    # value = max_value(initial_state, float('-inf'), float('inf'))

Amazing Facts About Alpha-Beta
• Empirically, alpha-beta has the effect of reducing the branching factor by half for many problems
• This effectively doubles the horizon that can be searched
• Alpha-beta makes the difference between novice and expert computer players

What About Probabilities?
[Figure: a game tree with max nodes at the top, chance nodes beneath them labeled with outcome probabilities (P=0.9, P=0.1, P=0.5, P=0.5, P=0.6, P=0.4), and min nodes below.]

Expectiminimax
• n random outcomes per chance node
• O(b^m n^m) time
• A Python sketch of the recurrence appears at the end of these notes

  eminimax(n_max)    = max_{s ∈ successors(n)} eminimax(s)
  eminimax(n_min)    = min_{s ∈ successors(n)} eminimax(s)
  eminimax(n_chance) = Σ_{s ∈ successors(n)} p(s) · eminimax(s)

Expectiminimax Is Nasty
• High branching factor
• Randomness makes evaluation functions difficult
  – Hard to predict many steps into the future
  – Values tend to smear together
  – Preserving order is not sufficient
• Pruning is problematic
  – Need to prune based upon a bound on an expectation
  – Need a priori bounds on the evaluation function

Multiplayer Games
• Things sort of generalize (see the sketch at the end of these notes)
• We can maintain a vector of values, one per player, at each node
• Assume that each player acts greedily
• What's wrong with this?

Conclusions
• Game tree search is a special kind of search
• It relies heavily on heuristic evaluation functions
• Alpha-beta is a big win
• Most successful players use alpha-beta
• Final thought: there is a tradeoff between search effort and evaluation-function effort
• When is it better to invest in your evaluation function?
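As promised above, a minimal Python sketch of the expectiminimax recurrence. The node interface (kind, value, children, and outcomes as (probability, child) pairs) is a hypothetical assumption made for the example.

    def expectiminimax(node):
        if node.kind == 'terminal':
            return node.value
        if node.kind == 'max':
            return max(expectiminimax(c) for c in node.children)
        if node.kind == 'min':
            return min(expectiminimax(c) for c in node.children)
        # Chance node: probability-weighted average over random outcomes.
        return sum(p * expectiminimax(c) for (p, c) in node.outcomes)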
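And a sketch of the vector-of-values idea for multiplayer games, often called max^n in the literature. Each node returns one payoff per player, and the player to move greedily maximizes her own component; the node interface (is_terminal, payoffs, children, player) is again hypothetical.

    def maxn(node):
        if node.is_terminal:
            return node.payoffs   # a tuple with one entry per player
        # The player to move picks the child whose payoff vector is best
        # for her, ignoring what this does to the other players.
        return max((maxn(c) for c in node.children),
                   key=lambda payoffs: payoffs[node.player])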