# Chapter 3: Heuristic Search

October 2011
Suzaimah Ramli (suzaimah@upnm.edu.my)
## Contents

1. An Introduction to Heuristic Search
2. Examples and Terminology
3. Heuristic Search Techniques
4. Example of an Evaluation Heuristic
5. Intelligent Agents
## 1. An Introduction to Heuristic Search

In solving problems, we sometimes have to search through many possible ways of doing something. For example:

- We may know all the possible actions our robot can perform, but we have to consider various sequences to find a sequence of actions that achieves a goal.
- We may know all the possible moves in a chess game, but we must consider many possibilities to find a good move.

Many problems can be formalised in a general way as search problems.
A search problem is described in terms of:

- An initial state (e.g., the initial chessboard, the current positions of objects in the world, the current location).
- A target state (e.g., a winning chess position, the target location).
- Some possible actions that get you from one state to another (e.g., a chess move, a robot action, a simple change in location).

Search techniques systematically consider possible action sequences to find a path from the initial state to the target state.
### Definitions (1)

Heuristics (Greek heuriskein = to find, to discover): "the study of the methods and rules of discovery and invention".

Examples:

- In chess: consider one (apparently best) move, maybe a few, but not all possible legal moves.
- In the travelling salesman problem: always select the nearest city, giving up complete search (the greedy technique). This gives us, in polynomial time, an approximate solution to an inherently exponential problem; it can be proven that the approximation error is bounded.
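The greedy nearest-city rule can be sketched in a few lines of Python. This is a minimal illustration, not from the slides; the city names and coordinates are invented:

```python
import math

def nearest_neighbour_tour(cities, start):
    """Build a tour by always visiting the nearest unvisited city (greedy)."""
    unvisited = set(cities) - {start}
    tour = [start]
    while unvisited:
        here = tour[-1]
        # The greedy step: pick the closest remaining city, ignore the rest.
        nxt = min(unvisited, key=lambda c: math.dist(cities[here], cities[c]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

cities = {"A": (0, 0), "B": (1, 0), "C": (2, 0), "D": (2, 1)}
print(nearest_neighbour_tour(cities, "A"))  # ['A', 'B', 'C', 'D']
```

The tour is built in linear passes over the remaining cities (polynomial time), but nothing guarantees it is the cheapest tour overall.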
### Definitions (2)

For heuristic search to work, we must be able to rank the children of a node. A heuristic function takes a state and returns a numeric value, a composite assessment of this state. We then choose the child with the best score (this could be a maximum or a minimum).

Examples:

- The 8-puzzle: how many tiles are misplaced? How many slots away from the correct place is each tile? And so on.
- Water jugs: ???
- Chess: no simple counting of pieces is adequate.
## 2. Examples and Terminology

- Chess: search through the set of possible moves, looking for one which will best improve the position.
- Route planning: search through the set of paths, looking for one which will minimise distance.
- Theorem proving: search through sets of reasoning steps, looking for a reasoning progression which proves the theorem.
- Machine learning: search through a set of concepts, looking for a concept which achieves the target categorisation.
### Search Terminology

- States: the "places" the search can visit.
- Search space: the set of possible states.
- Search path: the states which the search agent actually visits.
- Solution: a state with a particular property, one which solves the problem (achieves the task) at hand. There may be more than one solution to a problem.
- Strategy: how to choose the next step in the path at any given stage.
### Specifying a Search Problem

Three important considerations:

1. Initial state: so the agent can keep track of the state it is visiting.
2. Operators (possible actions): functions taking one state to another; they specify how the agent can move around the search space. A strategy therefore boils down to choosing states and operators.
3. Goal test (target state): how the agent knows whether the search has succeeded.
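The three considerations map naturally onto a small interface. This is a sketch under assumptions: the class name and the tiny road map below are illustrative, not taken from the slides.

```python
class SearchProblem:
    """A search problem: initial state, operators, and a goal test."""

    def __init__(self, initial, goal, graph):
        self.initial = initial          # 1. initial state
        self.goal = goal
        self.graph = graph              # adjacency map: state -> successors

    def successors(self, state):        # 2. operators / possible actions
        return self.graph.get(state, [])

    def goal_test(self, state):         # 3. goal test
        return state == self.goal

# Illustrative instantiation (hypothetical map fragment):
roads = {"library": ["school", "hospital"], "hospital": ["park", "newsagent"]}
p = SearchProblem("library", "park", roads)
print(p.successors("library"))  # ['school', 'hospital']
print(p.goal_test("park"))      # True
```

Any search strategy (breadth-first, depth-first, best-first) can then be written once against this interface.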
### Example 1: Chess

1. Initial state: the starting board position (as in the picture).
2. Operators: moving pieces.
3. Goal test: checkmate (the king cannot move without being taken).
### Example 2: Route Planning

[Map of UK cities: Liverpool, Leeds, Manchester, Nottingham, Birmingham, London]

1. Initial state: the city the journey starts in.
2. Operators: driving from city to city.
3. Goal test: whether the current location is the destination city.
### Example 3

Scenario: searching for a route on a map.

[Map: School, Factory, Library, Hospital, Newsagent, Church, Park, University]

Question: how do we systematically and exhaustively search possible routes, in order to find, say, a route from the library to the university?
The set of all possible states reachable from the initial state defines the search space. We can represent the search space as a tree:

```
library
├── school
│   └── factory
└── hospital
    ├── park
    └── newsagent
        ├── university
        └── church
```

We refer to nodes connected to and "under" a node in the tree as "successor nodes".
## 3. Heuristic Search Techniques

How do we search this tree to find a possible route from the library to the university? We may use simple systematic search techniques, which try every possibility in a systematic way:

- Breadth-first search: try shortest paths first. Guaranteed to find the shortest path to a solution.
- Depth-first search: follow a path as far as it goes, and on reaching a dead end, back up and try the last encountered alternative.
- Hill climbing (a greedy algorithm): go as high up as possible as fast as possible, without looking around too much.
### Heuristics and Search

In general, a heuristic is a "rule of thumb" based on domain-dependent knowledge.

In search, one uses a heuristic function of a state, where h(node) = estimated cost of the cheapest path from that node's state to a goal state G, with:

- h(G) = 0
- h(other nodes) ≥ 0
- (note: we will assume all individual node-to-node costs are > 0)

How does this help in search? We can use knowledge (in the form of h(node)) to reduce search time: generally, we will explore more promising nodes before less promising ones.
### Combining Heuristic and Path Costs

We can also use both costs together. Let n be some node in the search tree; then

    f(n) = g(n) + h(n)

is the estimated cost of a path from S to G via n, where:

- g(n) = d(S to n), the true path cost from S to n (exact);
- h(n) = the heuristic estimate of the path cost from n to G.
### Example: Route Finding

[Map of cities: Liverpool, Leeds, Nottingham, Birmingham, Peterborough, London, with road distances (solid lines) and straight-line distances (dotted lines)]

First states to try from London: Birmingham and Peterborough.

f(n) = distance travelled from London + "as the crow flies" straight-line distance from the state (i.e., solid-line plus dotted-line distances):

- f(Peterborough) = 120 + 155 = 275
- f(Birmingham) = 130 + 150 = 280

Hence we expand Peterborough first. The search returns later to Birmingham, which then becomes the best state: from Nottingham one must go through Leeds.
### 3.1 Breadth-First Search

Try shortest paths first. Guaranteed to find the shortest path to a solution.

Nodes are explored in tree order: library, school, hospital, factory, park, newsagent, university, church (conventionally exploring left to right at each level).

```
library
├── school
│   └── factory
└── hospital
    ├── park
    └── newsagent
        ├── university
        └── church
```

Exercise:

[Tree diagram: root 1; children 2, 3, 4; next level 5, 6, 7, 8, 9]

- What is in OPEN when node 6 is expanded?
- How many nodes should have been expanded to find the goal?
### Algorithm for Breadth-First Search

1. Put the start node s in OPEN.
2. If OPEN is empty, fail.
3. Remove the first node of OPEN and put it in CLOSED (call it n).
4. Expand n; put its successors at the end of OPEN, with pointers back to n.
5. If any successor is the goal, succeed; otherwise go to step 2.
The general open/closed-list version of this loop:

```
select a heuristic function (e.g., distance to the goal);
put the initial node(s) on the open list;
repeat
    select N, the best node on the open list;
    succeed if N is a goal node; otherwise put N on the
    closed list and add N's children to the open list;
until we succeed or the open list becomes empty (we fail);
```

A closed node reached on a different path is made open again.
NOTE: "the best" only means "currently appearing the best"...
### Algorithm for Breadth-First Search (queue version)

1. Put the start node on the queue and set found = FALSE.
2. While the queue is not empty and not found, do:
   - Remove the first node N from the queue.
   - If N is a goal state, then set found = TRUE.
   - Find all the successor nodes of N, and put them on the end of the queue.
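The queue-based procedure above can be sketched in Python on the library-to-university tree used earlier (tree structure as reconstructed from the figures):

```python
from collections import deque

tree = {
    "library": ["school", "hospital"],
    "school": ["factory"],
    "hospital": ["park", "newsagent"],
    "newsagent": ["university", "church"],
}

def bfs(successors, start, is_goal):
    queue = deque([start])                 # the OPEN queue
    visited = []                           # order of expansion
    while queue:
        n = queue.popleft()                # remove the first node
        visited.append(n)
        if is_goal(n):
            return visited
        queue.extend(successors.get(n, []))  # successors go on the end
    return None

order = bfs(tree, "library", lambda n: n == "university")
print(order)
# ['library', 'school', 'hospital', 'factory', 'park', 'newsagent', 'university']
```

The search stops as soon as the goal is removed from the queue, so church (the last node in full level order) is never expanded.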
### 3.2 Depth-First Search

Nodes are explored in the order: library, school, factory, hospital, park, newsagent, university.

```
library
├── school
│   └── factory
└── hospital
    ├── park
    └── newsagent
        ├── university
        └── church
```
Exercise (depth-first search):

[Tree diagram: root 1; children 2, 3, 4; next level 5, 6, 7, 8, 9]

- What is in OPEN when node 6 is expanded?
- How many nodes should have been expanded to find the goal?
### Algorithm for Depth-First Search

1. Put the start node s in OPEN.
2. If OPEN is empty, fail.
3. Remove the first node of OPEN and put it in CLOSED (call it n). If Depth(n) equals the depth bound, go to step 2.
4. Expand n; put its successors at the beginning of OPEN, with pointers back to n.
5. If any successor is the goal, succeed; otherwise go to step 2.
Stack version:

1. Put the start node on the stack and set found = FALSE.
2. While the stack is not empty and not found, do:
   - Remove the first node N from the stack.
   - If N is a goal state, then set found = TRUE.
   - Find all the successor nodes of N, and put them on the top of the stack.
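The stack-based procedure above, sketched in Python on the same library tree (structure as reconstructed from the figures):

```python
tree = {
    "library": ["school", "hospital"],
    "school": ["factory"],
    "hospital": ["park", "newsagent"],
    "newsagent": ["university", "church"],
}

def dfs(successors, start, is_goal):
    stack = [start]                        # OPEN, used as a stack
    visited = []                           # order of expansion
    while stack:
        n = stack.pop()                    # take the most recently added node
        visited.append(n)
        if is_goal(n):
            return visited
        # Push children in reverse so the leftmost child is expanded first.
        stack.extend(reversed(successors.get(n, [])))
    return None

print(dfs(tree, "library", lambda n: n == "university"))
# ['library', 'school', 'factory', 'hospital', 'park', 'newsagent', 'university']
```

Note how factory is explored before hospital: the search follows the school branch to its dead end before backing up.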
### 3.3 Hill Climbing and Best-First Search

- The best-first search algorithm is almost the same as depth-first/breadth-first search, but we use a priority queue, where nodes with the best scores are taken off the queue first.
- Hill climbing is a greedy algorithm: go as high up as possible as fast as possible, without looking around too much. It is a type of best-first search; "greedy" means always take the biggest bite.
### Best-First Search

Nodes are searched in the order: library, hospital, park, newsagent, university.

```
library (6)
├── school (5)
│   └── factory (4)
└── hospital (3)
    ├── park (1)
    └── newsagent (2)
        └── university (0)
```
### Algorithm for Best-First Search

1. Remove the BEST node N from the queue.
2. If N is a goal state, then set found = TRUE.
3. Find all the successor nodes of N, assign each a score, and put them on the queue.
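The procedure above, sketched with Python's heapq as the priority queue and the heuristic scores from the figure (church is not scored in the figure, so the value 7 for it is an assumption; any positive value gives the same run):

```python
import heapq

tree = {
    "library": ["school", "hospital"],
    "school": ["factory"],
    "hospital": ["park", "newsagent"],
    "newsagent": ["university", "church"],
}
h = {"library": 6, "school": 5, "hospital": 3, "park": 1,
     "newsagent": 2, "factory": 4, "university": 0, "church": 7}  # 7 assumed

def best_first(successors, h, start, is_goal):
    heap = [(h[start], start)]             # priority queue ordered by score
    visited = []
    while heap:
        score, n = heapq.heappop(heap)     # best (lowest) score comes off first
        visited.append(n)
        if is_goal(n):
            return visited
        for child in successors.get(n, []):
            heapq.heappush(heap, (h[child], child))
    return None

print(best_first(tree, h, "library", lambda n: n == "university"))
# ['library', 'hospital', 'park', 'newsagent', 'university']
```

The school branch (score 5) is never expanded, matching the order shown on the slide.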
### Algorithm for Hill Climbing

```
select a heuristic function;
set C, the current node, to the highest-valued initial node;
loop
    select N, the highest-valued child of C;
    return C if its value is better than the value of N;
    otherwise set C to N;
```
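The loop above can be sketched directly. The node values and children below are an invented toy landscape, only there to exercise the loop:

```python
def hill_climb(value, children, start):
    """Follow the best child upward until no child improves on the current node."""
    c = start                                  # C, the current node
    while True:
        kids = children.get(c, [])
        if not kids:
            return c                           # nowhere to go: local optimum
        n = max(kids, key=value)               # N, the highest-valued child
        if value(c) >= value(n):               # C already at least as good
            return c
        c = n                                  # otherwise climb to N

values = {"s": 1, "a": 3, "b": 2, "c": 4, "d": 0}
children = {"s": ["a", "b"], "a": ["c", "d"]}
print(hill_climb(values.get, children, "s"))  # c
```

Note the algorithm keeps no open list at all: it commits to one child per step, which is exactly why it can get stuck on local optima.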
### Problems with Hill Climbing

- Blind-alley effect: early estimates can be very misleading. One solution: delay the use of greedy search.
- Not guaranteed to find the optimal solution: remember we are only estimating the path cost to a solution.
### 3.4 Other Heuristic Search Methods

A*: the score is based on the predicted total path "cost", i.e., the sum of:

- the actual cost/distance from the initial node to the current node, and
- the predicted cost/distance from the current node to the target node.
### A* Search

- We want to combine uniform path-cost and greedy searches, to get complete, optimal, fast search strategies.
- Suppose we have a given (found) state n, with path cost g(n) and heuristic function h(n).
- Use f(n) = g(n) + h(n) to measure state n, and choose the n which scores best (the lowest f, since f estimates cost). Basically, we are just summing path cost and heuristic.
- One can prove that A* is complete and optimal, but only if h(n) is admissible, i.e., it underestimates the true path cost to a solution from n (see Russell and Norvig for the proof).
### Algorithm for A* Search

```
Initialize: let Q = {S}
While Q is not empty:
    pull Q1, the first element in Q
    if Q1 is a goal, report(success) and quit
    else:
        child_nodes = expand(Q1)
        <eliminate child_nodes which represent loops>
        put remaining child_nodes in Q
        sort Q according to ucost = pathcost(S to node) + h*(node)
Continue
```
- The estimate of the distance is called a heuristic; typically it comes from domain knowledge, e.g., the straight-line distance between two points.
- If the heuristic never overestimates, then the search procedure using it is "admissible", i.e., h*(N) ≤ realcost(N to G).
- A* search with an admissible heuristic is optimal: if one uses an admissible heuristic to order the search, one is guaranteed to find the optimal solution.
- The closer the heuristic is to the real (unknown) path cost, the more effective it will be: if h1(n) and h2(n) are two admissible heuristics and h1(n) ≤ h2(n) for every node n, then A* search with h2(n) will in general expand fewer nodes than A* search with h1(n).
## 4. Example of an Evaluation Heuristic Function

### 4.1 The 8-Puzzle

Common heuristics for the 8-puzzle problem:

- the number of tiles in the wrong position;
- the sum of the distances of the tiles from their goal positions, where distance is counted as the sum of vertical and horizontal tile displacements ("Manhattan distance").
A function h(N) estimates the cost of the cheapest path from node N to the goal node:

```
N:          goal:
5 . 8       1 2 3
4 2 1       4 5 6
7 3 6       7 8 .
```

- h(N) = number of misplaced tiles = 6
- h(N) = sum of the distances of every tile to its goal position
       = 2 + 3 + 0 + 1 + 3 + 0 + 3 + 1 = 13
For f(N) = h(N) = number of misplaced tiles:

[Search-tree figure: each generated state is labelled with its number of misplaced tiles (values 0 to 5), and the state with the smallest value is expanded at each step]
For f(N) = g(N) + h(N), with h(N) = number of misplaced tiles:

[Search-tree figure: each state is labelled g(N)+h(N), from 0+4 at the initial state down to 5+0 at the goal]
For f(N) = h(N) = sum of the distances of the tiles to the goal:

[Search-tree figure: each state is labelled with its Manhattan-distance value (0 to 6)]
For the state N and goal shown above:

```
N:          goal:
5 . 8       1 2 3
4 2 1       4 5 6
7 3 6       7 8 .
```

- h1(N) = number of misplaced tiles = 6, which is admissible
- h2(N) = sum of distances of each tile to its goal position = 13
- h3(N) = (sum of distances of each tile to goal) + 3 × (sum of score functions for each tile) = 49
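The two admissible heuristics h1 and h2 can be computed for the board N shown above as follows (0 marks the blank, which is not counted):

```python
GOAL = ((1, 2, 3), (4, 5, 6), (7, 8, 0))
N    = ((5, 0, 8), (4, 2, 1), (7, 3, 6))

def goal_position(tile):
    """Row and column of a tile in the goal configuration."""
    for r, row in enumerate(GOAL):
        if tile in row:
            return r, row.index(tile)

def h1(state):
    """Number of misplaced tiles (the blank is not a tile)."""
    return sum(1 for r in range(3) for c in range(3)
               if state[r][c] != 0 and state[r][c] != GOAL[r][c])

def h2(state):
    """Sum of Manhattan distances of each tile from its goal position."""
    total = 0
    for r in range(3):
        for c in range(3):
            tile = state[r][c]
            if tile != 0:
                gr, gc = goal_position(tile)
                total += abs(r - gr) + abs(c - gc)
    return total

print(h1(N), h2(N))  # 6 13
```

Both values match the slide's worked example; since h1(n) ≤ h2(n) for every state, A* with h2 will in general expand fewer nodes.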
### Graph and Tree Notations

Search-tree notation:

- Branching degree b: the number of children of a node.
- Depth of a node, d: the number of branches from the root to the node.
- Arc weight; cost of a path.
- n → n', where n is the parent node of n', a child node.
- Node generation and node expansion.
- Search policy: determines the order of node expansion.
- Two kinds of nodes in the tree:
  - Open nodes (the fringe): nodes which have not been expanded (i.e., the leaves of the tree).
  - Closed nodes: nodes which have already been expanded (internal nodes).

[Diagram: a binary tree (b = 2) from start node S at depth 0 to goal G at depth 2]
### 4.2 Example of an Evaluation Heuristic Function

[Graph diagram: S = start, G = goal; nodes A, B, C along the top, D, E, F along the bottom; other nodes = intermediate states, links = legal transitions]
### Example of a Search Tree

```
S
├── A
│   ├── B
│   │   ├── C
│   │   └── E
│   └── D
└── D
    └── E
        ├── B
        └── F
            └── G
```

Note: this is the search tree at some particular point in the search.
### Search Method 1: Breadth-First Search

[Search tree diagram: level-by-level expansion of the example graph: S; then A, D; then B, D, A, E; then C, E, E, S, B, S, B, F]
### Search Method 2: Depth-First Search (DFS)

```
S
├── A
│   ├── B
│   │   ├── C
│   │   └── E
│   │       ├── D
│   │       └── F
│   │           └── G
│   └── D
└── D
```

Here, to avoid repeated states, assume we don't expand any child node which already appears on the path from the root S to the parent. (Again, one could use other strategies.)
### Search Method 3: Best-First (Greedy) Search

```
S
├── A  (h = 10.4)
└── D  (h = 8.9)
    ├── A  (h = 10.4)
    └── E  (h = 6.9)
        ├── B  (h = 6.7)
        └── F  (h = 3.0)
            └── G  (h = 0)
```

- Uses the estimated path cost from the node to the goal (the heuristic, hcost) and ignores the actual path cost.
- After each expansion, sorts the nodes according to the estimated path cost from the node to the goal.
- It always expands the most promising node on the fringe (according to the heuristic).
- Note: this is not the optimal path for this problem.
### Path Costs

[Graph diagram; edge costs as recoverable from the figure and the worked A* values:]

- S-A = 2, S-D = 5
- A-B = 1, A-D = 2
- B-C = 4, B-E = 5
- D-E = 2, E-F = 4, F-G = 3

The optimal (minimum-cost) path is S-A-D-E-F-G, with cost 2 + 2 + 2 + 4 + 3 = 13.
### Heuristic Functions

Heuristic estimates h(n) of the cost from each node to the goal G:

| Node | S    | A    | B   | C   | D   | E   | F   | G |
|------|------|------|-----|-----|-----|-----|-----|---|
| h(n) | 11.0 | 10.4 | 6.7 | 4.0 | 8.9 | 6.9 | 3.0 | 0 |
### The A* Algorithm in Action

```
S
├── A  (2 + 10.4 = 12.4)
│   ├── B  (3 + 6.7 = 9.7)
│   │   ├── C  (7 + 4.0 = 11.0)
│   │   └── E  (8 + 6.9 = 14.9)
│   └── D  (4 + 8.9 = 12.9)
│       └── E  (6 + 6.9 = 12.9)
│           ├── B  (11 + 6.7 = 17.7)
│           └── F  (10 + 3.0 = 13.0)
│               └── G  (13 + 0 = 13.0)
└── D  (5 + 8.9 = 13.9)
```

Each node is labelled f(n) = g(n) + h(n); A* expands the open node with the smallest f-value at each step and reaches G via the optimal path at cost 13.
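The full run above can be reproduced with a short A* sketch using Python's heapq. The edge list and h-values are read off the Path Costs and Heuristic Functions figures (the edge costs were reconstructed from the worked f-values, so treat them as a best-effort transcription):

```python
import heapq

graph = {  # undirected edges with costs
    ("S", "A"): 2, ("S", "D"): 5, ("A", "B"): 1, ("A", "D"): 2,
    ("B", "C"): 4, ("B", "E"): 5, ("D", "E"): 2, ("E", "F"): 4,
    ("F", "G"): 3,
}
h = {"S": 11.0, "A": 10.4, "B": 6.7, "C": 4.0,
     "D": 8.9, "E": 6.9, "F": 3.0, "G": 0.0}

def neighbours(n):
    for (a, b), cost in graph.items():
        if a == n:
            yield b, cost
        elif b == n:
            yield a, cost

def astar(start, goal):
    # Heap entries: (f = g + h, g, node, path so far).
    heap = [(h[start], 0, start, [start])]
    best_g = {}                            # cheapest g found per node
    while heap:
        f, g, n, path = heapq.heappop(heap)
        if n == goal:
            return path, g
        if g >= best_g.get(n, float("inf")):
            continue                       # already expanded more cheaply
        best_g[n] = g
        for m, cost in neighbours(n):
            heapq.heappush(heap, (g + cost + h[m], g + cost, m, path + [m]))
    return None

path, cost = astar("S", "G")
print(path, cost)  # ['S', 'A', 'D', 'E', 'F', 'G'] 13
```

Because the heuristic is admissible, the first time G is popped from the heap its path is guaranteed to be optimal.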
## 5. Intelligent Agents

An agent is anything that can:

- perceive its environment through sensors, and
- act upon that environment through actuators (or effectors).

Goal: design rational agents that do a "good job" of acting in their environments, where success is determined by some objective performance measure.
### 5.1 Example: Vacuum-Cleaner Agent

- Percepts: location and contents, e.g., [A, Dirty]
- Actions: Left, Right, Suck, NoOp
### 5.2 Rational Agents

- An agent should strive to "do the right thing", based on what it can perceive and the actions it can perform. The right action is the one that will cause the agent to be most successful.
- Performance measure: an objective criterion for the success of an agent's behaviour. For example, the performance measure of a vacuum-cleaner agent could be the amount of dirt cleaned up, the amount of time taken, the amount of electricity consumed, the amount of noise generated, etc.
- Definition of "rational agent": for each possible percept sequence, a rational agent should select an action that is expected to maximize its performance measure, given the evidence provided by the percept sequence and whatever built-in knowledge the agent has.
- Rationality ≠ omniscience: an omniscient agent knows the actual outcome of its actions.
- Rationality ≠ perfection: rationality maximizes expected performance, while perfection maximizes actual performance.

The proposed definition requires:

- Information gathering/exploration: to maximize future rewards.
- Learning from percepts: extending prior knowledge.
- Agent autonomy: to compensate for incorrect prior knowledge.
Rationality depends on:

- the performance measure that defines the degree of success;
- the percept sequence: everything the agent has perceived so far;
- what the agent knows about its environment;
- the actions that the agent can perform.

Agent function (percepts ==> actions):

- maps from percept histories to actions: f: P* → A;
- the agent program runs on the physical architecture to produce the function f;
- agent = architecture + program.

```
Action := Function(Percept Sequence)
If (Percept Sequence) then do Action
```

Example: a simple agent function for the vacuum world:

```
If (current square is dirty) then suck
```
### 5.3 PEAS Analysis

To design a rational agent, we must specify the task environment.

PEAS analysis: specify the Performance measure, Environment, Actuators, and Sensors.

Example: consider the task of designing an automated taxi driver.

- Performance measure: safe, fast, legal, comfortable trip, maximize profits
- Environment: roads, other traffic, pedestrians, customers
- Actuators: steering wheel, accelerator, brake, signal, horn
- Sensors: cameras, sonar, speedometer, GPS, odometer, engine sensors, keyboard
Agent: medical diagnosis system

- Performance measure: healthy patient, minimize costs and lawsuits
- Environment: patient, hospital, staff
- Actuators: screen display (questions, tests, diagnoses, treatments, referrals)
- Sensors: keyboard (entry of symptoms, findings, patient's answers)

Agent: part-picking robot

- Performance measure: percentage of parts in correct bins
- Environment: conveyor belt with parts, bins
- Actuators: jointed arm and hand
- Sensors: camera, joint-angle sensors
### 5.4 Environment Types

- Fully observable (vs. partially observable): the agent's sensors give it access to the complete state of the environment at each point in time.
- Deterministic (vs. stochastic): the next state of the environment is completely determined by the current state and the action executed by the agent. (If the environment is deterministic except for the actions of other agents, the environment is strategic.)
- Episodic (vs. sequential): the agent's experience is divided into atomic "episodes" (each episode consists of the agent perceiving and then performing a single action), and the choice of action in each episode depends only on the episode itself.
- Static (vs. dynamic): the environment is unchanged while the agent is deliberating. (The environment is semi-dynamic if the environment itself does not change with the passage of time but the agent's performance score does.)
- Discrete (vs. continuous): a limited number of distinct, clearly defined percepts and actions.
- Single-agent (vs. multi-agent): an agent operating by itself in an environment.

The environment type largely determines the agent design. The real world is (of course) partially observable, stochastic, sequential, dynamic, continuous, and multi-agent.
### Structure of an Intelligent Agent

All agents have the same basic structure:

- accept percepts from the environment;
- generate actions.

A skeleton agent:

```
function Skeleton-Agent(percept) returns action
    static: memory, the agent's memory of the world

    memory ← Update-Memory(memory, percept)
    action ← Choose-Best-Action(memory)
    memory ← Update-Memory(memory, action)
    return action
```

Observations:

- the agent may or may not build the percept sequence in memory (this depends on the domain);
- the performance measure is not part of the agent; it is applied externally to judge the success of the agent.
### 5.5 Agent Types

1. Simple reflex agents are based on condition-action rules and implemented with an appropriate production system. They are stateless devices which have no memory of past world states.
2. Reflex agents with memory (model-based) have internal state which is used to keep track of past states of the world.
3. Agents with goals have, in addition to state information, goal information which describes desirable situations. Agents of this kind take future events into consideration.
4. Utility-based agents base their decisions on classical axiomatic utility theory in order to act rationally.

Note: all of these can be turned into "learning" agents.
#### 1. Simple Reflex Agent

We can summarize part of the percept-action table by formulating commonly occurring patterns as condition-action rules. Example:

```
if car-in-front-brakes then initiate-braking
```

The agent works by finding a rule whose condition matches the current situation (a rule-based system):

```
function Simple-Reflex-Agent(percept) returns action
    static: rules, a set of condition-action rules

    state  ← Interpret-Input(percept)
    rule   ← Rule-Match(state, rules)
    action ← Rule-Action[rule]
    return action
```

But this only works if the current percept is sufficient for making the correct decision.
#### Simple Reflex Vacuum Agent

[Figure: the condition-action table for the two-square vacuum world]
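The vacuum agent's rules can be sketched as a tiny reflex function. The percepts and actions are those given in section 5.1; the movement rules (go Right from A, Left from B) are the standard textbook completion and an assumption here:

```python
def reflex_vacuum_agent(percept):
    """Condition-action rules for the two-square vacuum world."""
    location, status = percept          # e.g. ("A", "Dirty")
    if status == "Dirty":
        return "Suck"                   # rule: dirty square -> suck
    elif location == "A":
        return "Right"                  # clean in A -> move to B
    else:
        return "Left"                   # clean in B -> move to A

print(reflex_vacuum_agent(("A", "Dirty")))  # Suck
print(reflex_vacuum_agent(("A", "Clean")))  # Right
```

Note the agent consults only the current percept, never a history: that is what makes it a simple reflex agent.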
#### 2. Agents that Keep Track of the World

Updating internal state requires two kinds of encoded knowledge:

- how the world changes (independently of the agent's actions);
- how the agent's actions affect the world.

```
function Reflex-Agent-With-State(percept) returns action
    static: rules, a set of condition-action rules
            state, a description of the current world

    state  ← Update-State(state, percept)
    rule   ← Rule-Match(state, rules)
    action ← Rule-Action[rule]
    state  ← Update-State(state, action)
    return action
```

But knowledge of the internal state is not always enough. How should the agent choose among alternative decision paths (e.g., where should the car go at an intersection)? That choice requires knowledge of the goal to be achieved.
#### 3. Agents with Explicit Goals

- Reflex agents only act based on pre-computed knowledge (rules).
- Goal-based (planning) agents act by reasoning about which actions achieve the goal.
- They are less efficient, but more adaptive and flexible.
General architecture for goal-based agents:

```
input percept
state        ← Update-State(state, percept)
goal         ← Formulate-Goal(state, perf-measure)
search-space ← Formulate-Problem(state, goal)
plan         ← Search(search-space, goal)
while (plan not empty) do
    action ← Recommendation(plan, state)
    plan   ← Remainder(plan, state)
    output action
end
```

- Simple agents do not have access to their own performance measure. In that case the designer will "hard-wire" a goal for the agent, i.e., the designer chooses the goal and builds it into the agent.
- Similarly, unintelligent agents cannot formulate their own problem; this formulation must be built in as well.
- The while loop above is the "execution phase" of this agent's behaviour. Note that this architecture assumes the execution phase does not require monitoring of the environment.
- Knowing the current state is not always enough. State allows an agent to keep track of unseen parts of the world, but the agent must update state based on knowledge of changes in the world and of the effects of its own actions. A goal is a description of a desired situation.
- Examples: the decision to change lanes depends on a goal to go somewhere (and other factors); the decision to put an item in a shopping basket depends on a shopping list and a map of the store.
- Notes: a reflexive agent is concerned with one action at a time. Classical planning means finding a sequence of actions that achieves a goal. Contrast this with condition-action rules: planning involves consideration of the future, "what will happen if I do ..." (a fundamental difference).
#### 4. A Complete Utility-Based Agent

A utility function:

- is a mapping of states onto real numbers;
- allows rational decisions in two kinds of situations:
  - evaluation of the tradeoffs among conflicting goals;
  - evaluation of competing goals.
- A preferred world state has higher utility for the agent; utility = the quality of being useful.
- Examples: quicker, safer, more reliable ways to get where you are going; price-comparison shopping; bidding on items in an auction; evaluating bids in an auction.
- Utility function: state ==> U(state) = a measure of happiness.
- Search (goal-based) vs. games (utilities).
### Shopping Agent

- Navigating: move around the store; avoid obstacles.
  - Reflex agent: store map precompiled.
  - Goal-based agent: create an internal map, reason explicitly about it, use signs, and adapt to changes (e.g., specials at the ends of aisles).
- Gathering: find and put into the cart the groceries it wants; needs to induce objects from percepts.
  - Reflex agent: wander and grab items that look good.
  - Goal-based agent: use a shopping list.
- Menu planning: generate the shopping list; modify the list if the store is out of some item.
  - Goal-based agent required: what happens when a needed item is not there? Achieve the goal some other way, e.g., no milk cartons: get canned milk or powdered milk.
- Choosing among alternative brands:
  - Utility-based agent: trade off quality against price.
### Learning Agents

Four main components:

- Performance element: the agent function.
- Learning element: responsible for making improvements by observing performance.
- Critic: gives feedback to the learning element by measuring the agent's performance.
- Problem generator: suggests other possible courses of action (exploration).
### 5.6 Summary

- An agent perceives and acts in an environment. It has an architecture and is implemented by a program.
- An ideal agent always chooses the action which maximizes its expected performance, given the percept sequence received so far.
- An autonomous agent uses its own experience rather than the designer's built-in knowledge of the environment.
- An agent program maps from a percept to an action and updates its internal state.
- Reflex agents respond immediately to percepts.
- Goal-based agents act in order to achieve their goal(s).
- Utility-based agents maximize their own utility function.
