
Games
CPS 170
Ron Parr

Why Study Games?
• Many human activities can be modeled as games
    – Negotiations
    – Bidding
    – TCP/IP
    – Military confrontations
    – Pursuit/Evasion
• Games are used to train the mind
    – Human game-playing, animal play-fighting

Why Are Games Good for AI?
• Games typically have concise rules
• Well-defined starting and end points
• Sensing and effecting are simplified
    – Not true for sports games
    – See RoboCup
• Games are fun!
• Downside: Getting taken seriously (not)
    – See robo search and rescue

History of Games in AI
• Computer games have been around almost as long as computers (perhaps longer)
    – Chess: Turing (and others) in the 1950s
    – Checkers: Samuel, 1950s learning program
• Usually start with naïve optimism
• Follow with naïve pessimism
• Simon: Computer chess champ by 1967
• Many, e.g., Kasparov, predicted that a computer would never be champion

Game Setup
• Most commonly, we study games that are:
    – 2 player
    – Alternating
    – Zero-sum
    – Perfect information
• Examples: Checkers, chess, backgammon
• Assumptions can be relaxed at some expense
• Economics studies the case where the number of agents is very large
    – Individual actions don't change the dynamics

Games Today
• Computers perform at champion level
    – Backgammon, Checkers, Chess, Othello
• Computers perform well
    – Bridge
• Computers still do badly
    – Go, Hex

Zero Sum Games
• Assign values to different outcomes
• Win = 1, Loss = -1
• With zero sum games every gain comes at the other player's expense
• Sum of both players' scores must be 0
• Are any games truly zero sum?

Characterizing Games
• Two-player games are very much like search
    – Initial state
    – Successor function
    – Terminal test
    – Objective function (heuristic function)
• Unlike search
    – Terminal states are often a large set
    – Full search to terminal states usually impossible

Game Trees
[Figure: tic-tac-toe game tree, showing successive board positions for x and o.]

Game Trees
[Figure: abstract game tree. Player 1 (max node) chooses among actions A1, A2, A3; Player 2 (min nodes) replies with A11, A12, A21, A22, A31, A32; the resulting Player 1 nodes are terminal nodes.]

Minimax Values
[Figure: two-ply tree with a max node at the root (value 3), a row of min nodes (values 2, 3, 2), and leaves 3, 12, 2, 4, 15, 2.]

Minimax
• Max player tries to maximize his return
• Min player tries to minimize his return
• This is optimal for both (zero sum)

    $\mathrm{minimax}(n_{\max}) = \max_{s \in \mathrm{successors}(n)} \mathrm{minimax}(s)$
    $\mathrm{minimax}(n_{\min}) = \min_{s \in \mathrm{successors}(n)} \mathrm{minimax}(s)$

Minimax Properties
• Minimax can be run depth first (b = branching factor, m = maximum depth)
    – Time $O(b^m)$
    – Space $O(bm)$
• Assumes that opponent plays optimally
• Based on a worst-case analysis
• What if this is incorrect?

Minimax in the Real World
• Search trees are too big
• Alternating turns double depth of the search
    – 2 ply = 1 full turn
• Branching factors are too high
    – Chess: 35
    – Go: 361
• Search from start never terminates in non-trivial games

Evaluation Functions
• Like heuristic functions
• Try to estimate value of a node without expanding all the way to termination
• Using evaluation functions
    – Do a depth-limited search
    – Treat the evaluation function's value as if it were a terminal value
• What's wrong with this?
• How do you pick the depth?
• How do you manage your time?
• Iterative deepening, quiescence

Desiderata for Evaluation Functions
• Would like to put the same ordering on nodes (even if values aren't totally right)
• Is this a reasonable thing to ask for?
• What if you have a perfect evaluation function?
• How are evaluation functions made in practice?
    – Buckets
    – Linear combinations
        • Chess pieces (material)
        • Board control (positional, strategic)

Search Control Issues
• Horizon effects
    – Sometimes something interesting is just beyond the horizon
    – How do you know?
• When to generate more nodes?
• If you selectively extend your frontier, how do you decide where?
• If you have a fixed amount of total game time, how do you allocate this?

Pruning
• The most important search control method is figuring out which nodes you don't need to expand
• Use the fact that we are doing a worst-case analysis to our advantage
    – Max player cuts off search when he knows min player can force a provably bad outcome
    – Min player cuts off search when he knows max can force a provably good (for max) outcome

Alpha-beta pruning
[Figure: the same two-ply tree as on the Minimax Values slide: max node with value 3, min nodes with values 2, 3, 2, leaves 3, 12, 2, 4, 15, 2.]

How to prune
• We still do (bounded) DFS
• Expand at least one path to the "bottom"
• If current node is max node, and min can force a lower value, then prune siblings
• If current node is min node, and max can force a higher value, then prune siblings

Implementing alpha-beta

    max_value(state, alpha, beta)
        if cutoff(state) then return eval(state)
        for each s in successors(state) do
            alpha = max(alpha, min_value(s, alpha, beta))
            if alpha >= beta then return beta
        end
        return alpha

    min_value(state, alpha, beta)
        if cutoff(state) then return eval(state)
        for each s in successors(state) do
            beta = min(beta, max_value(s, alpha, beta))
            if beta <= alpha then return alpha
        end
        return beta

Max node pruning
[Figure: example of pruning below max nodes; the node values that survive extraction are 2, 2, 4, 4.]

Amazing facts about alpha-beta
• Empirically, alpha-beta has the effect of reducing the branching factor by half for many problems
• This effectively doubles the horizon that can be searched
• Alpha-beta makes the difference between novice and expert computer players

What About Probabilities?
[Figure: game tree with max nodes, chance nodes, and min nodes; chance branches are labeled P=0.9, P=0.5, P=0.5, P=0.6, P=0.4, P=0.1.]

Expectiminimax
• n random outcomes per chance node
• $O(b^m n^m)$ time

    $\mathrm{eminimax}(n_{\max}) = \max_{s \in \mathrm{successors}(n)} \mathrm{eminimax}(s)$
    $\mathrm{eminimax}(n_{\min}) = \min_{s \in \mathrm{successors}(n)} \mathrm{eminimax}(s)$
    $\mathrm{eminimax}(n_{\mathrm{chance}}) = \sum_{s \in \mathrm{successors}(n)} \mathrm{eminimax}(s)\, p(s)$

Expectiminimax is nasty
• High branching factor
• Randomness makes evaluation functions difficult
    – Hard to predict many steps into future
    – Values tend to smear together
    – Preserving order is not sufficient
• Pruning is problematic
    – Need to prune based upon bound on an expectation
    – Need a priori bounds on the evaluation function

Multiplayer Games
• Things sort of generalize
• We can maintain a vector of possible values for each player at each node
• Assume that each player acts greedily
• What's wrong with this?

Conclusions
• Game tree search is a special kind of search
• Rely heavily on heuristic evaluation functions
• Alpha-beta is a big win
• Most successful players use alpha-beta
• Final thought: Tradeoff between search effort and evaluation function effort
• When is it better to invest in your evaluation function?

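The expectiminimax recurrences (max, min, and probability-weighted sum at chance nodes) can be sketched in the same way. This is a minimal illustration under stated assumptions: the tagged-tuple node encoding, the tree, and its probabilities and leaf values are made up for the example and are not taken from the slides' figure.

```python
def expectiminimax(node):
    """Evaluate a game tree that contains chance nodes.

    A node is either a number (terminal value) or a tuple (kind, children),
    where kind is "max", "min", or "chance". Children of a chance node are
    (probability, subtree) pairs; children of max/min nodes are plain subtrees.
    """
    if isinstance(node, (int, float)):      # terminal node
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(c) for c in children)
    if kind == "min":
        return min(expectiminimax(c) for c in children)
    if kind == "chance":                    # expected value over random outcomes
        return sum(p * expectiminimax(c) for p, c in children)
    raise ValueError(f"unknown node kind: {kind}")

# Hypothetical tree: max picks one of two gambles; after the random outcome,
# min chooses the reply.
left = ("chance", [(0.9, ("min", [3, 12])), (0.1, ("min", [2, 4]))])
right = ("chance", [(0.5, ("min", [15, 2])), (0.5, ("min", [6, 8]))])
tree = ("max", [left, right])

value = expectiminimax(tree)
print(value)
```

Here the left gamble is worth about 0.9 * 3 + 0.1 * 2 = 2.9 and the right gamble is worth 0.5 * 2 + 0.5 * 6 = 4, so max prefers the right one. Because chance nodes average their children, the actual magnitudes of the leaf values matter, not only their order.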