Learning as search

How does the Pac-Man baddie find its way to you so effectively? How do you decide on your next move in chess? How do you solve a Rubik's cube? How do you quickly find the shortest route between two places? How do you navigate a maze? How do you solve the logic puzzles we gave you? How do you solve a sudoku puzzle? How do you write code to animate a character's actions in a game or film?

For a human the answer is probably a mixture of rules of thumb, logic, and all sorts of other things. We're amazing, and we do all sorts of things without really thinking about how we do them. However, more and more we want to be able to automate this kind of problem-solving, so that people don't have to do things that are Dull (repetitive, the same kind of thing day after day, or even second by second), Dirty or Dangerous. But before we can get a machine to do it, we have to be able to tell it what to do, and we don't want to write custom-made programs from scratch every time. That's where the AI bit comes in.

This raises a whole bunch of questions about how humans solve problems, and whether some of our methods are simply the result of our physical limitations (we only have one body, can only be in one place at a time, and can't hold many things in memory at once). This line of thinking suggests that there might be a range of alternative strategies we could apply, but the important thing, if we want to be able to apply AI quickly and easily to a range of problems, is that we have a general framework.

Luckily, such a framework exists, and it is called "state space search". Crudely put, the idea is that what you know about a given problem at any point in time can be thought of as the "state" of your system. We can place some conditions on what we want to achieve, and when we have met those conditions we are in our "goal" state. We can usually place some restrictions on the things we can do to move from one state to another.
How we navigate between states, from the starting position to the goal state, is by applying what is called a search algorithm. Here is how some familiar problems fit the framework:

Problem: chess
  Starting position: the initial board
  Example states: the state of the board after a move
  Valid moves: dictated by the pieces
  Goal state: checkmate

Problem: sudoku
  Starting position: a few squares pre-filled
  Example states: the current state of the square values
  Valid moves: apply the rules to fix the value of a blank square
  Goal state: no empty squares; no rules (constraints) violated

Problem: creating assembly code to do task X
  Starting position: a blank program
  Example states: a sequence of machine instructions
  Valid moves: add an instruction at the end; add an instruction in the middle
  Goal state: the code passes a series of user-supplied tests

Problem: route finding
  Starting position: current position
  Example states: positions, possibly with history
  Valid moves: add the path from the current position to the next junction
  Goal state: reach the target; there could be extra constraints, e.g. under 3 hours, "fastest", shortest

Let's go back to our maze problem. Imagine that you are put at the entrance to the maze below and have to find your way to the middle. How are you going to do this?

You could of course do it by making a series of random guesses at each junction. If you had some way of marking which paths you had been down, you could avoid repeating paths you had just taken, but you could still easily get stuck in dead ends. (No animation here: too much randomness...)

So to avoid this you decide to be a bit more systematic. Let's assume that:
o At every junction you put an arrow on the path you came from, and then take the leftmost path leading out.
o If you get to the middle, you stop.
o Otherwise, if you get to a dead end, you just go back to the last junction and take the next untravelled exit to the left.
o If you end up back at a junction and find you have taken all of the paths leading out of it, then you go "backwards" up the path you arrived on (remember that you marked it with an arrow).

In most cases this strategy will find the middle. It might take you a long time, and a lot of exploring and backtracking, but you will get there.
The only problem you have, apart from time, is that you might get stuck if your maze has loops in it, for example a wall which is not attached at either end: you might just keep following it round and round.

Animation of maze-following. Red lines indicate the current state of the seeker. Green lines show where back-tracking has happened. Depending on context (and available storage) we may remember these paths or not.

So maybe you decide on a different approach. Perhaps someone has told you there is a quick way in?
o You start out by taking the left-hand path until you get to the next junction, where you leave a mark with the value "1".
o You then go back to the entrance and take the next path, again marking the first junction with a "1".
o When you've done all the first set of paths, you take the first path again, and this time repeat the process, leaving "2"s at the next level of junctions.
o Of course, in an ideal world (X-Men 3, anyone?) you create clones of yourself at each junction to avoid all that running to and fro.

The main point is that you explore all the first-level junctions, then all the second, and so on.

Problems with this approach? Well, the point about cloning, really: especially if you have lots of ways out of each junction, you could rapidly need lots of clones. This could be a big problem if the middle is a long way in (lots of junctions). On the plus side, you are guaranteed to find the quickest way to the middle, even if it takes you ages.

Animation of the second maze-following strategy, assuming "cloning". Items in red indicate the current state of the seeker and must be held in memory. Items in green show information discovered and where back-tracking has happened. Depending on context (and available storage) these may be kept or not.

As an aside, what if you are doing the left-first approach and someone tells you that the middle is only four junctions in?
Easy: you just exploit that knowledge to tell you when to give up on a path, even if you haven't reached the middle or a dead end.

All sounds a bit tricky, so what if you have a beacon that tells you how many metres away the middle is? You could use that to guide your choice. But then the maze designer may have put in some blind alleys that get you near to, but not quite at, the middle. Or maybe the maze is built on a hill, and you are told that the middle is somewhere near the top, so you have an inexact measure of how close you are? Both of these would help, provided you learned from your "blind" experiences and used some way of back-tracking.

Why have I chosen a maze? Well, it's easy to visualise, we all know them, and lots of games are built on them, even if they are dressed up in other ways. Best of all, mazes have the nice property that you can unfold them into a tree, and trees are really important in computing.

Maze discovered by the second strategy. This time the walls have been removed and the nodes are renumbered in order of discovery/exploration. For completeness the final two "dead-ends" are added in blue.

The image above shows what happens if you take the set of things discovered by the second strategy and simply remove the walls. For completeness I have added the last two dead ends, shown in blue. As you can see, it is a simple matter of "unkinking" the odd line, and putting in some extra bends, to get a nice simple tree shape, a bit like a "family tree". Because I wanted to keep the example fairly simple, I only had two choices at each junction, so this is what is called a "binary" tree.

As you saw, some of the points where the algorithm halts are junctions, and some are dead ends. When we unfold the maze into a tree, these correspond to what we would call "interior nodes" and "terminal nodes". The latter are also often called "leaves", for obvious reasons, and the starting point is usually known as the "root" of the tree.
Trees are what's known as a data structure, and it's easy to store a tree as a collection of nodes: each one is like a little container which holds the id of the parent node above it in the tree, the ids of the "children" below it, and maybe some other information. Computer scientists like trees because they are really simple to implement, manipulate, and extend. The more mathematically minded like them because they are a form of what is known as an "acyclic graph", which is handy for reasoning about ("acyclic" means without cycles or loops). For example, we might represent two nodes like this:

Node(id = 3, parent_id = 1, child_ids = 6, 7, data = "")
Node(id = 19, parent_id = 16, child_ids = none, data = "X marks the spot")

In Steve's knowledge representation lectures you saw one way in which facts and relationships could be stored, so hopefully the analogies are obvious: if classes of things are nodes, then "isa" is a line from one node to another...

Most of you will either be familiar with, or about to learn, an object-oriented programming language. There are lots of links between the reasoning behind OO languages and the ideas of trees and graphs, especially once people start talking about classes and inheritance.

Hopefully by now you are starting to see that many of the examples I gave above fit easily into this format. More generally, we recognised above that mazes could have "free-standing" walls in them, which could create loops. In the practicals there are several examples of this: in the missionaries and cannibals problem, for instance, you can get back to an earlier state.

So where does this get us? Well, we have seen how many problems can be represented in a fairly simple way as a search in what is called state space.
We have also seen that we need not "know" everything about our state space in advance: as long as we can say what the possible ways out of a given state are, we can apply one of the strategies above, storing the information we need in the form of a tree or graph, via a collection of nodes. All we need to do now is formalise the fairly folksy definitions above in a form that a programmer can reproduce. In the lectures we cover the formal definitions of this kind of search in a lot more detail.

The first method I described is called depth-first search, and the second is breadth-first search. These are known as "blind search" methods, since they have no other data to guide them to the middle of the maze. They are implemented via two classic computing structures: the stack and the queue.

A stack is just like a stack of cards. You can add cards to the top, or take the top one off, but you can never access the interior cards without first removing the ones above them. Lots of computer programming uses this metaphor to manage memory and variables, as you'll find out when you start debugging programs, although thankfully nowadays compilers mostly hide the details from the user.

Depth-first search is implemented using a stack. When you reach a junction (interior node), you push that node (let's call it "A") onto the top of the stack and go down its first child. If that child is a dead end, you pop node "A" off the top of the stack, remove the id of the first child, retrieve the second child's id as your next destination, and push "A" back onto the stack. If the child node "B" is itself a junction, you push node "B" onto the stack and move on to its first child. If you eventually back-track to "A" and it has no children left, you discard it, pop the next node off the top of the stack (which must be "A"'s parent), and go to the next unexplored node, i.e. one of "A"'s siblings.
Or, if there are no siblings left, you discard that node too and take the next one off the stack. Hopefully you can see that in this way the size of the stack is restricted to the current depth of the search.

A queue is like a bus queue, sometimes called "first in, first out". Breadth-first search is implemented using this type of queue. In the example above:
Start with a queue containing just node 1.
Remove node 1 from the front of the queue and put its child nodes (2 and 3) on the back.
Then take the next node (2) off the front and add its children (4 and 5) to the back.
Repeat until the goal is found.

Each step removes one node from the queue, then adds zero or more:
Terminal nodes have no children, so they shorten the queue.
Interior nodes lengthen the queue by adding two or more children at the back.

Especially if nodes may have several children, this means that the queue may well hold far more nodes in memory than the stack; in other words, breadth-first search generally requires more memory. If this isn't clear, look at the two animations again, where the nodes in red have to be held in memory, and compare the two approaches. If there are more than two children per node, the situation gets even worse.

The lectures will cover the specifics of depth-first and breadth-first search, and informed methods such as best-first, hill-climbing and A*. Hopefully you can see that the methods known as "heuristic" or "informed" search, which use information about the quality of a state (e.g. how far it is from the goal) to guide the search process, are really just variants of these two methods in which each node also carries some sort of "value" or "cost".

So what's all this other stuff about optimisation and modelling? Time for some more examples. In the lectures we draw a distinction between optimisation and modelling. Where do these examples fit in? Well, in a maze we have a goal state: "X marks the spot". We also have a model of the world: our maze.
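The stack-based and queue-based procedures described above can be sketched in a few lines. This is a minimal illustration, not the lecture implementation: the toy tree (node n has children 2n and 2n+1) and the function names are invented for the example, and, as in the text, no loop detection is done because the space is assumed to be a tree.

```python
from collections import deque

def children(n):
    """Toy binary tree standing in for the unfolded maze:
    node n has children 2n and 2n+1, down to a fixed depth."""
    return [2 * n, 2 * n + 1] if n < 8 else []

def depth_first(start, is_goal):
    """Blind depth-first search using an explicit stack. Each entry holds
    a state plus the path to it, so the route to the goal is available
    the moment the goal is found."""
    stack = [(start, [start])]
    while stack:
        state, path = stack.pop()            # take the most recently added state
        if is_goal(state):
            return path
        for c in reversed(children(state)):  # reversed so the first child is tried first
            stack.append((c, path + [c]))
    return None

def breadth_first(start, is_goal):
    """Blind breadth-first search using a first-in-first-out queue: all
    level-1 nodes are examined before any level-2 node, so the first
    route found is also a shortest one."""
    queue = deque([(start, [start])])
    while queue:
        state, path = queue.popleft()        # remove from the front...
        if is_goal(state):
            return path
        for c in children(state):
            queue.append((c, path + [c]))    # ...add children at the back
    return None

print(depth_first(1, lambda n: n == 13))     # [1, 3, 6, 13]
print(breadth_first(1, lambda n: n == 13))   # [1, 3, 6, 13]
```

Notice that the two functions differ only in which end of the collection the next state is taken from: pop from the back and you get depth-first, take from the front and you get breadth-first.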
We may not have a full model to start off with, but we build one up, in the form of a tree, as we examine sequences of moves and see what nodes (dead ends or junctions) they take us to. What we are looking for is the right sequence of moves to get us to our goal. This corresponds to the list of nodes on the way from the root of our tree to the goal. In fact, for depth-first search this is even held neatly for us, since it is exactly the contents of the stack! Breadth-first search holds rather more detail, and we will need to retain the discarded nodes if we want to be able to print the exact route to the goal, but the principle is the same. In order to solve the problem we build up a partial representation of the bit of the world we are interested in, but what we provide at the end is a sequence of moves (inputs) that takes us from the start to the goal state. Pac-Man, chess (with allowance for the fact that it is too hard to look more than a few moves ahead), and various flavours of route/path finding are all essentially the same.

So what about modelling problems like classification: how do they fit in? Well, let's imagine we are trying to build up a rule base that can later be used in a help desk to diagnose printer problems. The sorts of information that an expert might use are: is the power reaching both computer and printer? Is there any paper? Is the printer driver installed and working? Is there a paper jam? Each of these is something that could be measured, or found out by asking the user. There is also a set of actions that could be performed, e.g. turn on, reboot, reconnect leads, clear the paper jam, not all of which will help in every situation. Let's imagine that we are given lots of examples, with answers to the questions above, recording which actions solved which problems.
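To make that concrete, such examples might be stored as simple records of question answers plus the fix that worked. The field names and values here are invented purely for illustration; a real help-desk system would have many more questions and actions.

```python
# Hypothetical help-desk examples: answers to the diagnostic questions,
# plus the action that solved each case (all names invented for illustration).
examples = [
    {"power_on": True,  "has_paper": False, "driver_ok": True,  "fix": "add paper"},
    {"power_on": False, "has_paper": True,  "driver_ok": True,  "fix": "turn on"},
    {"power_on": True,  "has_paper": True,  "driver_ok": False, "fix": "reinstall driver"},
]

# A candidate rule is then just a condition-action pair:
rule = {"if": ("has_paper", False), "then": "add paper"}

# Testing a rule against the examples means counting the cases it solves:
key, value = rule["if"]
solved = [e for e in examples if e[key] == value and e["fix"] == rule["then"]]
print(len(solved))  # 1: this rule accounts for one of the three examples
```

Counting how many examples a candidate rule solves is exactly the kind of measure that will let us compare states in the search described next.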
There are many ways of tackling this problem; here is one simple one:
Let a state correspond to a list of rules, together with the number of our examples solved/not solved by applying those rules.
When we move to a child state, we add a rule to the set inherited from the parent, and re-test the unsolved examples.
We can create rules from any combination of situations and outcomes. For example, "IF paper_out = TRUE THEN reboot" is a valid, but fairly daft, rule.
We continue until:
o we have found a path (i.e. a sequence of rules) that solves all our examples,
o or the number of unsolved examples has not decreased for some number of (e.g. 5) moves.
We then take the rule set from the path leading to the "winning" node to be the "model" for our help desk.

Now, this is a rather simplistic idea I've chosen just to illustrate how we can view model-building as a search problem. Real-world methods are generally more sophisticated, but nevertheless many "rule-induction" and "decision-tree" methods essentially do something similar to this. By the way, note that since we have a "number of unsolved cases" that we want to reduce to zero, we automatically have a way to apply informed search.

Partial tree resulting from one possible approach to creating a model of the printer-diagnosis problem. Each node in the tree corresponds to a possible model to be used in the help desk.

Exact Search Methods and the Need for "Heuristics"

Sadly, most interesting problems in real life suffer from the fact that so-called exact methods like breadth-first search, A* etc. just can't run fast enough (see the box on NP below). This is basically because the number of possible states grows too quickly. For example, even a simple maze like the one above, with only two-way junctions, has 2^n paths leading out of level n, i.e. 3 nodes at or below level 2, 7 at or below level 3, 15 at or below level 4...
which rapidly means that, since the count doubles with every extra level, even the fastest machines can't examine all the possibilities in any reasonable time once n grows beyond a few dozen levels (I'll leave it as an exercise for you to work out roughly where the limit lies). For other types of problems, like finding optimal permutations of events, cities etc., the number of possible solutions grows even faster. It is a matter of conjecture, but almost universally accepted within computer science, that there is no algorithm that can definitely solve these problems in a time which is polynomial in n (e.g. n^2, n^3, or even just n^y for some fixed value y).

The net result of all this is that we are forced to look for methods that usually come up with high-quality solutions, even if those solutions cannot be guaranteed to be the best. One approach is to apply rules of thumb to build up solutions, like "keep adding the nearest unvisited drop" if we are planning a delivery schedule. However, these may not yield results without adding lots of tricks, such as sophisticated back-tracking and loop avoidance in our maze example. The other approach is to find a way of specifying our problem so that we can work with whole "candidate" solutions, and apply rules of thumb that tell us how to generate the next candidate solution, based on what we have seen so far and some kind of quality measure. These methods tend to be called "metaheuristics"; we'll be covering one family of them in some detail. To give you an idea, though, below are some different approaches to formulating the problems we have discussed above.

Different ways of asking the same question

One useful way to think of applying these methods is just as we did for our maze: in other words, to construct a solution, extending it as necessary until we reach the goal state. But it is worth bearing in mind how many nodes we examined just to do our simple maze.
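To get a feel for how many nodes that can mean, the binary-tree counts quoted earlier (3, 7, 15, ...) follow the formula 2^n - 1 for the nodes at or below level n, so a two-line check shows the doubling at work:

```python
# Nodes at or below level n of a binary tree: 2**n - 1,
# matching the 3, 7, 15, ... figures quoted earlier.
# The count doubles (plus one) with every extra level.
for n in (2, 3, 4, 10, 25, 50):
    print(n, 2**n - 1)
```

By level 50 the count is over a thousand million million, which is why exhaustive exploration stops being an option long before the tree "runs out".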
As I suggested, if the maze has loops in it then depth-first search can get trapped going round them endlessly; breadth-first search avoids that fate but, on the other hand, needs lots of memory. However, as with so much else in life, there are different ways that we could pose this problem. For example, how about this as an approach to solving the maze:
Assume that our maze can be solved in 10 or fewer turns.
Let our "state" be the sequence of turns we take.
Let the starting state be all "left", i.e. [L,L,L,L,L,L,L,L,L,L].
Let the children of a state be all those that can be reached by changing one turn, i.e. from the first state we have: [R,L,L,L,L,L,L,L,L,L], [L,R,L,L,L,L,L,L,L,L], [L,L,R,L,L,L,L,L,L,L], [L,L,L,R,L,L,L,L,L,L], [L,L,L,L,R,L,L,L,L,L], [L,L,L,L,L,R,L,L,L,L], [L,L,L,L,L,L,R,L,L,L], [L,L,L,L,L,L,L,R,L,L], [L,L,L,L,L,L,L,L,R,L], [L,L,L,L,L,L,L,L,L,R].
When we "examine" a state, we start at the beginning of the maze, follow its turns, and stop when we reach a dead end.
If that dead end is our goal state, the search is over.

This time our state space obviously has loops in it, but we could still apply depth-first or breadth-first search, or for larger mazes we might use a genetic algorithm to evolve the solution. Clearly, in this case we are not holding any specific knowledge about the maze.

Here's another example: sorting out a delivery schedule for a lorry. We are given a list of drops that must be made, we are told not to revisit any, and we know that for safety reasons the driver must not be driving for more than 8 hours. In this case the obvious "constructive" approach is as follows:
Start state: depot, time = 0.
Possible transitions from any state: move to an unvisited drop, setting time = time + time_to_new_drop.
Goal state: no unvisited drops, and time + time_from_drop_to_depot < 8 hours.

Clearly we could apply depth-first or breadth-first search to this problem (if we had time). Where we had dead ends to stop lines of travel in the maze, here we can use our time constraint to stop exploring lengthy routes.
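The child-generation step of the turn-sequence maze formulation above is easy to sketch. This is only the "changing one turn" move; an actual maze evaluator (the part that follows the turns and reports where you end up) is omitted:

```python
def neighbours(state):
    """All children of a turn-sequence state: every sequence reachable
    by flipping exactly one turn from L to R or R to L."""
    flip = {"L": "R", "R": "L"}
    return [state[:i] + [flip[t]] + state[i + 1:]
            for i, t in enumerate(state)]

start = ["L"] * 10            # the all-left starting state
kids = neighbours(start)
print(len(kids))              # 10 children, one per flipped position
print(kids[0])                # the first child: R, then nine Ls
```

Note that every state here has exactly as many children as it has turns, regardless of the maze, which is part of what makes this formulation so easy to plug into a generic search (or a genetic algorithm).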
We could also make use of some knowledge to give us "informed search". For example, we could apply a rule of thumb saying that we always take the route to the nearest unvisited drop first. However, as you can probably imagine, even for pretty small problems the queue needed to hold all the nodes at a given level for breadth-first search becomes huge.

In this case the alternative way of asking the question might be to start with a random sequence of drops, so that a state consists of:
a valid (i.e. complete) sequence of drops,
the time taken to complete the run for that sequence,
the ids of all the states reached by swapping the positions of two drops in the sequence.

This has some benefits: for example, it gives us a natural source of information about sequences (their total times) that we can use to guide our search. Again, genetic algorithms and other metaheuristics are a natural way to tackle large instances.
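The whole-candidate formulation for the delivery problem can be sketched as follows. The travel-time matrix, depot numbering, and function names are all invented for the example; a real scheduler would also check the 8-hour constraint:

```python
from itertools import combinations

def route_time(order, minutes):
    """Total driving time for visiting the drops in `order`, starting and
    ending at the depot (index 0). `minutes[a][b]` is the travel time
    between locations a and b."""
    stops = [0] + list(order) + [0]
    return sum(minutes[a][b] for a, b in zip(stops, stops[1:]))

def swap_neighbours(order):
    """All candidate schedules reached by swapping the positions of two drops."""
    result = []
    for i, j in combinations(range(len(order)), 2):
        nxt = list(order)
        nxt[i], nxt[j] = nxt[j], nxt[i]
        result.append(nxt)
    return result

# Toy symmetric travel-time matrix: depot = 0, drops = 1..3.
minutes = [[0, 10, 20, 30],
           [10, 0, 15, 25],
           [20, 15, 0, 12],
           [30, 25, 12, 0]]

current = [1, 2, 3]
print(route_time(current, minutes))      # 67 minutes for the current schedule
for n in swap_neighbours(current):
    print(n, route_time(n, minutes))     # each neighbour with its quality measure
```

Because every candidate comes with a quality measure (its total time), hill-climbing, best-first search, or a genetic algorithm can all be bolted straight on: just keep moving to a neighbour with a shorter run.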