The Implementation of Machine Learning in the Game of Checkers


Billy Melicher

Computer Systems lab 08

10/29/08

Abstract

• Machine learning uses past information to predict future states

• Can be used in any situation where the past will predict the future

• Will adapt to situations

Introduction

• Checkers is used to explore machine learning

• Checkers has many tactical aspects that make it a good game to study

Background

• Minimax

• Heuristics

• Learning

Minimax

• Method of adversarial search

• Every pattern (board) can be given a fitness value (heuristic)

• Each player chooses the outcome that is best for them from the choices they have

Minimax

[Figure: minimax illustration, taken from a wiki]

Minimax

• The number of boards to search grows exponentially with depth

• Can only evaluate a certain number of actions (plies) into the future
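
The depth limit described above can be sketched as follows. This is a minimal illustration, not the author's program: the toy tree, the `children` helper, and the `evaluate` helper are all hypothetical stand-ins for a real checkers move generator and heuristic.

```python
from typing import Callable


def minimax(state, depth: int, maximizing: bool,
            children: Callable, evaluate: Callable) -> float:
    """Depth-limited minimax: search `depth` plies, then use the heuristic."""
    moves = children(state)
    if depth == 0 or not moves:
        # Ply limit reached (or no moves left): fall back on the heuristic.
        return evaluate(state)
    values = [minimax(child, depth - 1, not maximizing, children, evaluate)
              for child in moves]
    # The maximizing player picks the largest value, the opponent the smallest.
    return max(values) if maximizing else min(values)


# Hypothetical toy game tree: inner nodes are lists, leaves are heuristic values.
tree = [[3, 5], [2, 9]]


def children(state):
    return state if isinstance(state, list) else []


def evaluate(state):
    # Inner nodes get a neutral value; a real heuristic would score the board.
    return state if not isinstance(state, list) else 0.0


best = minimax(tree, 2, True, children, evaluate)
```

With a branching factor b, a search to depth d visits on the order of b^d boards, which is why the ply limit is needed.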

Heuristic

• Heuristics predict the outcome of a board

• Fitness value of a board: the higher the value, the better the outcome

• Not perfect

• Requires expertise in the situation to create

Heuristics

• H(s) = c_0 F_0(s) + c_1 F_1(s) + … + c_n F_n(s)

• H(s) = heuristic value of board s; each F_i(s) is a term (feature) and c_i is its weight

• Has many different terms

• In checkers, terms could be:

• Number of checkers

• Number of kings

• Number of checkers on an edge

• How far checkers have advanced up the board
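
A weighted sum of this kind might look like the sketch below. The piece codes, the square ordering (index 0 to 3 taken as the player's own back row), the chosen features, and the weights are all assumptions made for illustration; the slides do not specify them.

```python
# Assumed piece codes for the 32 playable squares (not given in the slides).
EMPTY, MY_MAN, MY_KING, OPP_MAN, OPP_KING = range(5)


def men_diff(board):
    """Own checkers minus opponent checkers."""
    return board.count(MY_MAN) - board.count(OPP_MAN)


def king_diff(board):
    """Own kings minus opponent kings."""
    return board.count(MY_KING) - board.count(OPP_KING)


def advancement(board):
    """Total rows advanced by own checkers, assuming 4 playable squares per row."""
    return sum(i // 4 for i, square in enumerate(board) if square == MY_MAN)


FEATURES = [men_diff, king_diff, advancement]


def heuristic(board, weights):
    """H(s) = c_0 F_0(s) + c_1 F_1(s) + ... + c_n F_n(s)."""
    return sum(c * f(board) for c, f in zip(weights, FEATURES))


# Example position: two own checkers and one opposing king.
board = [EMPTY] * 32
board[0] = MY_MAN
board[5] = MY_MAN
board[30] = OPP_KING
value = heuristic(board, [1.0, 2.0, 0.5])
```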

Learning by Rote

• Stores every game played

• Links each board to the moves made from it

• Relates the moves made from a particular board to the outcome of the game

• More likely to make moves that result in a win, less likely to make moves resulting in a loss

• Good in end game, not as good in mid game
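
One way to picture rote learning is a table of win and loss counts per (board, move) pair. The class below is a hypothetical sketch: the name `RoteMemory`, the string keys, and the greedy choice in `best_move` are assumptions, and the slides describe a preference ("more likely") rather than a strictly greedy pick.

```python
from collections import defaultdict


class RoteMemory:
    """Remembers how often each (board, move) pair led to a win or a loss."""

    def __init__(self):
        # (board_key, move) -> [wins, losses]
        self.stats = defaultdict(lambda: [0, 0])

    def record_game(self, history, won):
        """Store one finished game: `history` is a list of (board_key, move)."""
        for board_key, move in history:
            self.stats[(board_key, move)][0 if won else 1] += 1

    def score(self, board_key, move):
        """Fraction of recorded games won after this move; 0.5 if never seen."""
        wins, losses = self.stats.get((board_key, move), (0, 0))
        total = wins + losses
        return 0.5 if total == 0 else wins / total

    def best_move(self, board_key, legal_moves):
        """Pick the move with the best recorded outcome (greedy simplification)."""
        return max(legal_moves, key=lambda m: self.score(board_key, m))


memory = RoteMemory()
memory.record_game([("b1", "a")], won=True)
memory.record_game([("b1", "b")], won=False)
choice = memory.best_move("b1", ["a", "b", "c"])
```

Boards that were never seen get a neutral score, which is one reason this approach is weaker in the mid game, where most positions are new.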

How I store data

I convert each checkerboard into a 32-digit base-5 number, where each digit corresponds to a playable square and the digit's value indicates what occupies that square.
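
The encoding can be sketched as below. The meaning of each digit (0 for empty, 1 to 4 for the four piece types) and the square ordering are assumptions; the slides only state that the number has 32 base-5 digits.

```python
def encode(board):
    """Pack 32 squares (each a digit 0..4) into one base-5 integer."""
    key = 0
    for square in board:
        if not 0 <= square <= 4:
            raise ValueError("each square must be a base-5 digit (0..4)")
        key = key * 5 + square
    return key


def decode(key, squares=32):
    """Unpack a base-5 integer back into a list of `squares` digits."""
    board = []
    for _ in range(squares):
        key, digit = divmod(key, 5)
        board.append(digit)
    # Digits come out least significant first, so reverse them.
    return board[::-1]


# Example: an otherwise empty board with one piece on the first and last square.
board = [0] * 32
board[0] = 1
board[31] = 3
key = encode(board)
```

Since 5^32 is larger than 2^64, the key does not fit in a 64-bit integer; Python integers have arbitrary precision, so this is not a problem here, but it would matter in a language with fixed-width integers.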

Learning by Generalization

• Uses a heuristic function to guide moves

• Changes the heuristic function after games based on the outcome

• Good in mid game but not as good in early and end games

• Requires identifying the features that affect the game

Development

• Use of the minimax algorithm with alpha-beta pruning

• Use of both learning by rote and learning by generalization

• Temporal difference learning
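
Alpha-beta pruning returns the same value as plain minimax while skipping branches that cannot change the result. The sketch below uses a hypothetical toy tree in place of real checkers positions; `children`, `evaluate`, and the `visited` list are illustration helpers, not part of the original program.

```python
import math


def alphabeta(state, depth, alpha, beta, maximizing, children, evaluate):
    """Depth-limited minimax with alpha-beta pruning."""
    moves = children(state)
    if depth == 0 or not moves:
        return evaluate(state)
    if maximizing:
        value = -math.inf
        for child in moves:
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, children, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # The opponent will never allow this line of play.
        return value
    value = math.inf
    for child in moves:
        value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                     True, children, evaluate))
        beta = min(beta, value)
        if alpha >= beta:
            break  # The maximizer already has a better option elsewhere.
    return value


# Hypothetical toy tree: inner nodes are lists, leaves are heuristic values.
tree = [[3, 5], [2, 9]]
visited = []


def children(state):
    return state if isinstance(state, list) else []


def evaluate(state):
    visited.append(state)  # Record which boards were actually evaluated.
    return state


best = alphabeta(tree, 2, -math.inf, math.inf, True, children, evaluate)
```

In this example the leaf 9 is never evaluated: once the second branch yields 2, it cannot beat the 3 already guaranteed by the first branch.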

Temporal Difference Learning

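• In temporal difference learning, you adjust the heuristic based on the difference between the heuristic at one time and at another

• At equilibrium, the heuristic moves toward the ideal function

• U(s) ← U(s) + α( R(s) + γU(s') − U(s) ), where α is the learning rate, γ is the discount factor, R(s) is the reward in state s, and s' is the state that follows s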
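
The update rule can be tried out on a small table of values. This is a minimal sketch under assumptions: the dictionary `U`, the state names, and the default parameter values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def td_update(U, s, s_next, reward, alpha=0.1, gamma=1.0):
    """Apply U(s) <- U(s) + alpha * (R(s) + gamma * U(s') - U(s)) in place."""
    u_s = U.get(s, 0.0)
    target = reward + gamma * U.get(s_next, 0.0)
    U[s] = u_s + alpha * (target - u_s)
    return U[s]


# Hypothetical example: a mid-game board followed by a winning board.
U = {"mid": 0.0, "win": 1.0}
new_value = td_update(U, "mid", "win", reward=0.0, alpha=0.5, gamma=1.0)
```

Here the estimate for the earlier board moves halfway from 0.0 toward the value of the board that followed it, giving 0.5.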

Temporal Difference Learning

• There is no proof that predictions made closer to the end of the game are better, but common sense says they are

• Changes the heuristic so that it better predicts the value of all boards

• Adjusts the weights of the heuristic
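
For a heuristic that is a weighted sum of features, one common way to adjust the weights is to nudge each weight in proportion to its feature value and to the temporal difference error. The slides do not say exactly how the weights were adjusted, so the function below is an assumed, standard variant rather than the author's method.

```python
def td_weight_update(weights, feats, feats_next, reward, alpha=0.01, gamma=1.0):
    """Return new weights after one temporal difference step.

    `feats` and `feats_next` hold the feature values F_i(s) and F_i(s').
    """
    h = sum(c * f for c, f in zip(weights, feats))
    h_next = sum(c * f for c, f in zip(weights, feats_next))
    # Temporal difference error: how much the later estimate disagrees.
    delta = reward + gamma * h_next - h
    # Each weight moves in proportion to its own feature value.
    return [c + alpha * delta * f for c, f in zip(weights, feats)]


# Hypothetical numbers: H(s) = 2, H(s') = 5, so the error is 3.
new_weights = td_weight_update([1.0, 2.0], [2.0, 0.0], [3.0, 1.0],
                               reward=0.0, alpha=0.5, gamma=1.0)
```

A weight whose feature is zero on the current board is left unchanged, since that feature did not contribute to the prediction.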

Alpha Value

• The alpha value scales down each change to the heuristic based on how much data you already have

• Diminishing returns

• Necessary for ensuring rare occurrences do not change the heuristic too much
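
A simple schedule with this behaviour is alpha = 1 / (1 + n), where n is the number of times a board has already been seen. This particular formula is an assumption chosen for illustration; the slides do not give the schedule that was actually used.

```python
def alpha_for(visits, base=1.0):
    """Learning rate that shrinks as more data is collected (assumed schedule)."""
    return base / (1 + visits)


# Feed three observed outcomes for the same board into a running estimate.
estimate = 0.0
outcomes = [1.0, 0.0, 1.0]
for visits, outcome in enumerate(outcomes):
    estimate += alpha_for(visits) * (outcome - estimate)
```

With this schedule the estimate equals the running average of the outcomes seen so far, so a single unusual game shifts it less and less as the data set grows.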

Results

• Value of weight reaches equilibrium

• Changes to reflect the learning of the program

• Occasionally requires programmer intervention when it reaches a false equilibrium

Results

[Figure: plot of Value of Weight (vertical axis, 0 to 16) against a horizontal axis running from 0 to 25]

During the course of a game the value of this particular weight centers around 10.

Results

• Learning by rote requires a large data set

• Requires large amounts of memory

• Necessary for determining the alpha value in temporal difference learning

Results

[Figure: plot of Number of Boards in Data Base (vertical axis, 0 to 120) against a horizontal axis running from 0.5 to 3.5]

• Learning by rote does increase with the number of games, but with diminishing returns and at the cost of large amounts of memory


