Ch3 Heuristic Search

					Heuristic Search

  October 2011
 Suzaimah Ramli

1.   An Introduction to Heuristic Search
2.   Examples and Terminology
3.   Heuristic Search Techniques
4.   Example of Evaluation Heuristic Function
5.   Intelligent Agents

 1- An Introduction to Heuristic Search
 In solving problems, sometimes we have to search
  through many possible ways of doing something.
 Example :
   We may know all the possible actions our robot can do,
    but we have to consider various sequences to find a
    sequence of actions to achieve a goal.
   We may know all the possible moves in a chess game, but
    we must consider many possibilities to find a good move.
 Many problems can be formalised in a general way
  as search problems.

Search problems are described in terms of:
   An initial state (e.g., initial chessboard, current
    positions of objects in the world, current location).
   A target state (e.g., a winning chess position, the
    target location).
   Some possible actions that get you from one
    state to another (e.g., a chess move, a robot action, a
    simple change in location).
Search techniques systematically consider
 all possible action sequences to find a path
 from the initial to target state.


Heuristics (Greek heuriskein = find, discover)
"the study of the methods and rules of discovery and invention"

In chess: consider one (apparently best) move, maybe a few --
but not all possible legal moves.
In the travelling salesman problem: select one nearest city,
give up complete search (the greedy technique). This gives us,
in polynomial time, an approximate solution of the inherently
exponential problem; it can be proven that the approximation
error is bounded.


For heuristic search to work, we must be able to rank
the children of a node. A heuristic function takes a
state and returns a numeric value -- a composite
assessment of this state. We then choose a child with
the best score (this could be a maximum or minimum).

The 8-puzzle: how many misplaced tiles? how many slots
away from the correct place? and so on.
Water jugs: ???
Chess: no simple counting of pieces is adequate.

   2- Examples and Terminology
 Chess: search through set of possible moves
    Looking for one which will best improve position

 Route planning: search through set of paths
    Looking for one which will minimize distance

 Theorem proving: Search through sets of reasoning steps
    Looking for a reasoning progression which proves the theorem

 Machine learning: Search through a set of concepts
    Looking for a concept which achieves the target
     categorisation
           Search Terminology
• States
  – “Places” where the search can visit
• Search space
  – The set of possible states
• Search path
  – The states which the search agent actually visits
• Solution
  – A state with a particular property
     • Which solves the problem (achieves the task) at hand
  – May be more than one solution to a problem
• Strategy
  – How to choose the next step in the path at any given time
         Specifying a Search Problem

Three important considerations

1. Initial state
   – So the agent can keep track of the state it is visiting
2. Operators/ Some possible actions
   – Function taking one state to another
   – Specify how the agent can move around search space
   – So, strategy boils down to choosing states & operators
3. Goal test/target state
   – How the agent knows if the search has succeeded

             Example 1 - Chess

1. Initial state
  – As in picture
2. Operators
  – Moving pieces
3. Goal test
  – Checkmate
     • king cannot move
       without being taken

   Example 2 – Route Planning

1. Initial state
  – City the journey starts in
2. Operators
  – Driving from city to city
3. Goal test
  – If current location is the destination city

[Map of cities: Liverpool, Leeds, Manchester, Birmingham, London]
                   Example 3
 Scenario : searching for route on a map.
               School              Factory

   Library      Hospital          Newsagent

                           Park       University

 Question :How do we systematically and
  exhaustively search possible routes, in order to
  find, say, route from library to university?
 The set of all possible states reachable from the initial
  state defines the search space.
 We can represent the search space as a tree.


               library

        school                hospital

                           park          newsagent

                                  university    church

 We refer to nodes connected to and “under” a node in
  the tree as “successor nodes”.
   3- Heuristic Search Techniques
• How do we search this tree to find a possible route
  from library to University?
• May use simple systematic search techniques,
  which try every possibility in systematic way.
• Breadth First Search : Try shortest paths first.
  Guaranteed to find the shortest path to a solution.
• Depth first search: Follow a path as far as it goes,
  and when reach dead end, backup and try last
  encountered alternative.
• Hill Climbing : (This is a greedy algorithm) go as
  high up as possible as fast as possible, without
  looking around too much.
               Heuristics and Search
 In general
    a heuristic is a “rule-of-thumb” based on domain-dependent
     knowledge to help you solve a problem

 In search
    one uses a heuristic function of a state where
     h(node) = estimated cost of cheapest path from the state for
       that node to a goal state G
    where
    h(G) = 0
    h(other nodes) ≥ 0
    (note: we will assume all individual node-to-node costs are > 0)

 How does this help in search
    we can use knowledge (in the form of h(node)) to reduce search time
    generally, will explore more promising nodes before less promising ones
 Combining Heuristic And Path Costs

 We can also use both costs together:
   let n be some node in the search tree, then

     f(n) = g(n) + h(n) is the estimated cost from S to G via n
    g(n) = d(S to n) - true path cost from S to n (exact)
    h(n) = h(n to G) - heuristic estimate of the path cost from n to G

         Example: Route Finding

 First states to try:
    - Birmingham, Peterborough
 f(n) = distance from London +
   crow-flies distance from state
    i.e., solid + dotted line distances
    f(Peterborough) = 120 + 155 = 275
    f(Birmingham) = 130 + 150 = 280
 Hence expand Peterborough
    Returns later to Birmingham
       It becomes best state: Must go through
         Leeds from Notts

[Map with road distances (solid lines) and crow-flies distances (dotted lines)]

          3.1 Breadth First Search
Try shortest paths first. Guaranteed to find the shortest
path to a solution.
Explore nodes in tree order: library, school, hospital, factory,
park, newsagent, university, church. (Conventionally explore
left to right at each level.)

               library

        school                hospital

        factory            park          newsagent

                                  university    church
         Breadth First Search


[Tree rooted at node 1 with children 2, 3, 4; nodes 5 and 6 under 2, nodes 8 and 9 under 4]

What is in OPEN when node 6 expanded ?
How many nodes should have been expanded to find the goal ?
Algorithm For Breadth First

          1. Put s in OPEN.
          2. If OPEN is empty, fail.
          3. Remove the first node of OPEN
             and put it in CLOSED (call it n).
          4. Expand n.
             Put its successors at the end of OPEN,
             with pointers back to n.
          5. If any successor = goal, succeed;
             otherwise go to step 2.
     Algorithm For Best First Search
select a heuristic function (e.g., distance to the goal);
put the initial node(s) on the open list;
repeat
  select N, the best node on the open list;
  succeed if N is a goal node; otherwise put N on the
  closed list and add N's children to the open list;
until we succeed or the open list becomes empty (we fail);

A closed node reached on a different path is made open again.
NOTE: "the best" only means "currently appearing the best"...

      Algorithm For Breadth First
1. Start with queue = [initial-state]
   and found=FALSE.
2. While queue not empty and not
   found do:
  –    Remove the first node N from queue.
  –    If N is a goal state, then found = TRUE.
  –    Find all the successor nodes of N, and
       put them on the end of the queue.
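The queue-based procedure above can be sketched in Python. This is an illustrative sketch, not from the slides; the `successors` function and the example graph (the library-to-university map of Example 3) are assumptions about how the states are encoded.

```python
from collections import deque

def bfs(initial, goal, successors):
    """Breadth-first search: try shortest paths first.

    `successors` maps a state to a list of neighbouring states.
    Returns the shortest path from `initial` to `goal`, or None."""
    queue = deque([[initial]])          # FIFO queue of paths, not bare states
    visited = {initial}                 # plays the role of the closed list
    while queue:
        path = queue.popleft()          # remove the first node N from queue
        node = path[-1]
        if node == goal:
            return path                 # guaranteed shortest: found = TRUE
        for child in successors(node):
            if child not in visited:
                visited.add(child)
                queue.append(path + [child])   # successors go on the end
    return None                         # queue empty: fail

# The map from Example 3, encoded as an adjacency list (an assumption):
graph = {
    'library': ['school', 'hospital'],
    'school': ['factory'], 'hospital': ['park', 'newsagent'],
    'factory': [], 'park': [], 'newsagent': ['university', 'church'],
    'university': [], 'church': [],
}
print(bfs('library', 'university', lambda s: graph[s]))
# → ['library', 'hospital', 'newsagent', 'university']
```

Keeping whole paths on the queue (rather than bare states with back-pointers) is a simplification; the pointer-based version in the slides is equivalent.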
     3.2 Depth First Search
Nodes explored in order: library, school,
factory, hospital, park, newsagent,
university.

     library

     school                hospital

     factory             park         newsagent

                                university
             Depth First Search


[Tree rooted at node 1 with children 2, 3, 4; nodes 5 and 6 under 2, nodes 8 and 9 under 4]

What is in OPEN when node 6 expanded ?
How many nodes should have been expanded to find the goal ?
     Algorithm For Depth First

          1. Put s in OPEN.
          2. If OPEN is empty, fail.
          3. Remove the first node of OPEN
             and put it in CLOSED (call it n).
             If Depth(n) = Depth Bound, go to step 2.
          4. Expand n.
             Put its successors at the beginning of OPEN,
             with pointers back to n.
          5. If any successor = goal, succeed;
             otherwise go to step 2.
      Algorithm For Depth First
1. Start with stack = [initial-state] and found = FALSE.
2. While stack not empty and not found do:
  –   Remove the first node N from stack.
  –   If N is a goal state, then found = TRUE.
  –   Find all the successor nodes of N, and put
      them on the top of the stack.
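The stack-based version differs from BFS only in where successors are placed. A Python sketch under the same assumptions as before (the example graph is illustrative); on the Example 3 map it visits nodes in the order the slides list: library, school, factory, hospital, park, newsagent, university.

```python
def dfs(initial, goal, successors, depth_bound=20):
    """Depth-first search: follow a path as far as it goes and, on a
    dead end, back up to the last encountered alternative."""
    stack = [[initial]]                  # LIFO stack of paths
    while stack:
        path = stack.pop()               # remove the most recent node N
        node = path[-1]
        if node == goal:
            return path
        if len(path) > depth_bound:      # Depth(n) = depth bound: prune
            continue
        # reversed() so the leftmost child ends up on top of the stack
        for child in reversed(successors(node)):
            if child not in path:        # avoid loops along this path
                stack.append(path + [child])
    return None                          # stack empty: fail

# Same illustrative map as in the BFS sketch:
graph = {
    'library': ['school', 'hospital'],
    'school': ['factory'], 'hospital': ['park', 'newsagent'],
    'factory': [], 'park': [], 'newsagent': ['university', 'church'],
    'university': [], 'church': [],
}
print(dfs('library', 'university', lambda s: graph[s]))
# → ['library', 'hospital', 'newsagent', 'university']
```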

         3.3 Hill Climbing Or
      Best First Search Algorithm
• The best first search algorithm is almost the same
  as depth/breadth first search, but we use a priority
  queue, where nodes with the best scores are taken
  off the queue first.
• This is a greedy algorithm: go as high up as
  possible as fast as possible, without looking
  around too much. Hill climbing is a type of best
  first search - “Greedy”: always take the biggest bite.

         Best First Search
Order nodes searched: Library, hospital, park,
newsagent, university.

                   Library (6)

      School (5)                 Hospital (3)

                             Park (1)       Newsagent (2)
      Factory (4)

                                    University (0)

Algorithm For Best First Search
While queue not empty and not found do:

     1. Remove the BEST node N from the queue.
     2. If N is a goal state, then found = TRUE.
     3. Find all the successor nodes of N,
        assign them a score, and put
        them on the queue.
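With a heap as the priority queue, the loop above is a few lines of Python. This is a sketch, not from the slides; the graph and the h-scores are the ones shown on the map example (Library 6, School 5, Hospital 3, Park 1, Newsagent 2, University 0), and lower score is taken to mean better.

```python
import heapq

def best_first(initial, goal, successors, h):
    """Greedy best-first search: expand the node with the best
    (lowest) heuristic score h(state) first."""
    queue = [(h(initial), [initial])]        # priority queue of (score, path)
    visited = {initial}
    while queue:
        score, path = heapq.heappop(queue)   # remove the BEST node N
        node = path[-1]
        if node == goal:
            return path                      # found = TRUE
        for child in successors(node):
            if child not in visited:
                visited.add(child)
                heapq.heappush(queue, (h(child), path + [child]))
    return None

# Illustrative encoding of the map with the slide's h-values:
graph = {'library': ['school', 'hospital'], 'school': ['factory'],
         'hospital': ['park', 'newsagent'], 'park': [],
         'newsagent': ['university'], 'factory': [], 'university': []}
h = {'library': 6, 'school': 5, 'hospital': 3, 'park': 1,
     'newsagent': 2, 'factory': 4, 'university': 0}

print(best_first('library', 'university', lambda s: graph[s], lambda s: h[s]))
# → ['library', 'hospital', 'newsagent', 'university']
```

Tracing it reproduces the slide's expansion order: library, hospital, park, newsagent, university.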

Algorithm For Hill Climbing

 select a heuristic function;
 set C, the current node, to the highest-valued
 initial node;
 repeat:
       select N, the highest-valued child of C;
       return C if its value is better than the value
       of N; otherwise set C to N;
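The loop above in Python, as a sketch (the toy value function used in the demo, a peak at x = 3, is an assumption for illustration):

```python
def hill_climb(initial, successors, value):
    """Hill climbing: repeatedly move to the highest-valued child N;
    return the current node C once no child improves on it."""
    current = initial
    while True:
        children = successors(current)
        if not children:
            return current
        best = max(children, key=value)      # N, the highest-valued child
        if value(best) <= value(current):    # C better than N: return C
            return current
        current = best                       # otherwise set C to N

# Toy 1-D landscape: value(x) = -(x - 3)^2 has a single peak at x = 3.
print(hill_climb(0, lambda x: [x - 1, x + 1], lambda x: -(x - 3) ** 2))
# → 3
```

Note that on a landscape with several peaks this stops at whichever local maximum it climbs into, which is exactly the "blind alley" problem discussed next.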

     Problems With Hill Climbing

 Blind alley effect: early estimates can be very misleading
   One solution: delay the use of greedy search
 Not guaranteed to find an optimal solution
   Remember we are only estimating the path cost to the goal

3.4 Other Heuristic Search Methods

A*: Score based on predicted total
 path “cost”, so sum of
  – actual cost/distance from initial to
    current node,
  – predicted cost/distance to target node.

                       A* Search
 Want to combine uniform path cost and greedy search
   To get complete, optimal, fast search strategies
 Suppose we have a given (found) state n
   Path cost is g(n) and heuristic function is h(n)
   Use f(n) = g(n) + h(n) to measure state n
   Choose the n which scores the best (lowest f)
 Basically, just summing path cost and heuristic
 Can prove that A* is complete and optimal
   But only if h(n) is admissible,
      i.e. it underestimates the true path cost to a solution from n
   See Russell and Norvig for proof
  Algorithm for A* Search

Initialize: Let Q = {S}
While Q is not empty
   pull Q1, the first element in Q
   if Q1 is a goal, report(success) and quit
   else
          child_nodes = expand(Q1)
          <eliminate child_nodes which represent loops>
          put remaining child_nodes in Q
          sort Q according to f = pathcost(S to node) + h(node)
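The same algorithm in Python, using a heap instead of re-sorting Q. The sketch is run on the S-to-G route-finding map used later in this chapter; the edge costs and straight-line estimates are reconstructed from the g- and h-values shown in the "A* Algorithm in action" slide, so treat them as illustrative.

```python
import heapq

def a_star(start, goal, successors, h):
    """A* search: order the open list by f(n) = g(n) + h(n), where
    g(n) is the true path cost so far and h(n) the heuristic estimate.

    `successors(state)` returns (child, step_cost) pairs."""
    open_list = [(h(start), 0, [start])]       # entries are (f, g, path)
    best_g = {start: 0}                        # cheapest g found per state
    while open_list:
        f, g, path = heapq.heappop(open_list)
        node = path[-1]
        if node == goal:
            return path, g
        for child, cost in successors(node):
            g2 = g + cost
            # a node reached on a cheaper path is (re-)opened
            if g2 < best_g.get(child, float('inf')):
                best_g[child] = g2
                heapq.heappush(open_list, (g2 + h(child), g2, path + [child]))
    return None

# Edge costs and straight-line estimates inferred from the slides:
graph = {'S': [('A', 2), ('D', 5)], 'A': [('B', 1), ('D', 2)],
         'B': [('C', 4), ('E', 5)], 'C': [], 'D': [('E', 2)],
         'E': [('B', 5), ('F', 4)], 'F': [('G', 3)], 'G': []}
h = {'S': 11.0, 'A': 10.4, 'B': 6.7, 'C': 4.0,
     'D': 8.9, 'E': 6.9, 'F': 3.0, 'G': 0.0}

print(a_star('S', 'G', lambda s: graph[s], lambda s: h[s]))
# → (['S', 'A', 'D', 'E', 'F', 'G'], 13)
```

The returned path S-A-D-E-F-G with cost 13 matches the optimal path quoted on the path-costs slide.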

Comments On Heuristic Estimation
 The estimate of the distance is called a heuristic
     typically it comes from domain knowledge
     e.g., the straight-line distance between 2 points

 If the heuristic never overestimates, then the search procedure
  using this heuristic is “admissible”, i.e.,
       h(N) is less than or equal to realcost(N to G)

 A* search with an admissible heuristic is optimal
     i.e., if one uses an admissible heuristic to order the search, one is
      guaranteed to find the optimal solution

 The closer the heuristic is to the real (unknown) path cost, the more
  effective it will be, i.e. if h1(n) and h2(n) are two admissible heuristics
  and h1(n) ≤ h2(n) for any node n, then A* search with h2(n) will in
  general expand fewer nodes than A* search with h1(n)
4.1 Example of Evaluation Heuristic Function
 The 8-Puzzle Problem
    the number of tiles in the wrong position
      (is this admissible?)
    the sum of distances of the tiles from
      their goal positions, where distance is
      counted as the sum of vertical and
      horizontal tile displacements (“Manhattan distance”)
     (is this admissible?)

Function h(N) that estimates the cost of the
cheapest path from node N to goal node.

 5   8     1 2 3 h(N) = number of misplaced tiles
 4 2 1     4 5 6      =6
 7 3 6     7 8
   N        goal

Function h(N) that estimates the cost of the
cheapest path from node N to goal node.

 5   8     1 2 3 h(N) = sum of the distances of
 4 2 1     4 5 6        every tile to its goal position
 7 3 6     7 8        =2+3+0+1+3+0+3+1
   N        goal      = 13
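Both heuristics can be computed mechanically. A Python sketch, not from the slides, which reproduces the values above for board N; encoding the board as a flat 9-tuple with 0 for the blank is an assumption.

```python
def misplaced(state, goal):
    """h1: number of tiles out of place (the blank, 0, is not counted)."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

def manhattan(state, goal):
    """h2: sum of vertical + horizontal displacements of every tile
    from its goal position (Manhattan distance)."""
    where = {tile: (i // 3, i % 3) for i, tile in enumerate(goal)}
    total = 0
    for i, tile in enumerate(state):
        if tile != 0:
            row, col = i // 3, i % 3
            goal_row, goal_col = where[tile]
            total += abs(row - goal_row) + abs(col - goal_col)
    return total

# Board N from the slides, read row by row (0 marks the blank):
N    = (5, 0, 8, 4, 2, 1, 7, 3, 6)
goal = (1, 2, 3, 4, 5, 6, 7, 8, 0)
print(misplaced(N, goal), manhattan(N, goal))   # → 6 13
```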

    f(N) = h(N) = number of misplaced tiles

[8-puzzle search tree annotated with h at each node: root h = 4;
 the path to the goal passes through nodes with h = 3, 2, 1]

f(N) = g(N) + h(N)
with h(N) = number of misplaced tiles

[The same tree with f = g + h: root 0+4; children 1+5, 1+3, 1+5;
 then 2+3, 2+3, 2+4; goal path 3+2, 4+1]
    f(N) = h(N) = sum of distances of tiles to goal

[Search tree with Manhattan-distance h-values: root 5;
 children 6, 4, 6; then 5, 3, 5; goal path 2, 1]
         5         8               1    2     3

         4    2    1               4    5     6

         7   3     6               7    8
             N                         goal
• h1(N) = number of misplaced tiles = 6 is admissible
• h2(N) = sum of distances of each tile to goal = 13
        is admissible
• h3(N) = (sum of distances of each tile to goal)
           + 3 x (sum of score functions for each tile) = 49
        is not admissible
Graph and Tree Notations
                  Search Tree Notation
•   Branching degree, b
     – b is the number of children of a node
•   Depth of a node, d
     – number of branches from root to a node
•   Arc weight, cost of a path
•   n → n’, n parent node of n’, n’ a child
•   Node generation
•   Node expansion
•   Search policy:
     – determines the order of node expansion
•   Two kinds of nodes in the tree
     – Open Nodes (Fringe)
         • nodes which are not yet expanded
            (i.e., the leaves of the tree)
     – Closed Nodes
         • nodes which have already been
            expanded (internal nodes)

[Diagram: a tree rooted at S with goal G at depth 2; levels labelled d = 0, 1, 2]
4.2 Example of Evaluation Heuristic Function
             Map Navigation

                  A                   B                   C

    S                                                                 G

                 D                     E                       F

  S = start, G = goal, other nodes = intermediate states, links = legal transitions
 Example of a Search Tree

                         A                              D

           B                     D                           E

 C              E                                      B         F


Note: this is the search tree at some particular point in the search.
        Search Method 1:
    Breadth First Search (BFS)

                          S

            A                           D

      B           D               A           E

   C     E     E     S         B     S     B     F
   Search Method 2:
Depth First Search (DFS)
[DFS tree: S expands A; A expands B and D; B expands C and E;
 C is a dead end; E expands D and F; F expands G]

                        Here, to avoid repeated
                        states, assume we don’t
                        expand any child node which
                        appears already in the path
                        from the root S to the parent.
                        (Again, one could use other
                        strategies.)
                        Search Method 3:
                        Best-First Search
[Best-first search tree: S expands A (h = 10.4) and D (h = 8.9);
 later B (6.7) and F (3.0) appear on the fringe]

o uses estimated path cost from node to goal (heuristic, hcost)
o ignores actual path cost
o after each expansion, sort nodes according to estimated path cost
from node to goal
o it always expands the most promising node on the fringe (according
to the heuristic)
o Note: this is not the optimal path for this problem
                     Path Costs
[Map with edge costs: S–A 2, S–D 5, A–B 1, A–D 2, B–C 4,
 B–E 5, D–E 2, E–F 4, F–G 3]
        Optimal (minimum cost) path is S-A-D-E-F-G
      Heuristic Functions

[Straight-line heuristic estimates to G:
 S 11.0, A 10.4, B 6.7, C 4.0, D 8.9, E 6.9, F 3.0]
                    Search Method 4:
                  A* Algorithm in action

[A* search tree rooted at S:
   f(A) = 2 + 10.4 = 12.4        f(D) = 5 + 8.9 = 13.9
   f(B) = 3 + 6.7 = 9.7          f(D via A) = 4 + 8.9 = 12.9
   f(C) = 7 + 4 = 11             f(E via B) = 8 + 6.9 = 14.9
   (C is a dead end)             f(E via D) = 6 + 6.9 = 12.9
   f(B via E) = 11 + 6.7 = 17.7
   f(F) = 10 + 3.0 = 13          f(G) = 13 + 0 = 13]
                 5- Intelligent Agents
 An agent is anything that can
     perceive its environment through sensors, and
     act upon that environment through actuators (or effectors)


 Goal: Design rational agents that do a “good job” of acting in their
  environment
     success is determined based on some objective performance measure

  5.1 Example: Vacuum Cleaner Agent

Percepts: location and contents, e.g., [A, Dirty]
Actions: Left, Right, Suck, NoOp

                    5.2 Rational Agents
• Rational Agents
   – An agent should strive to "do the right thing", based on what it can perceive and
     the actions it can perform. The right action is the one that will cause the agent to be
     most successful.
   – Performance measure: An objective criterion for success of an agent's behavior
   – E.g., performance measure of a vacuum-cleaner agent could be amount of dirt
     cleaned up, amount of time taken, amount of electricity consumed, amount of noise
     generated, etc.

   – Definition of “Rational Agent”:
        • For each possible percept sequence, a rational
          agent should select an action that is expected to
          maximize its performance measure, given the
          evidence provided by the percept sequence and
          whatever built-in knowledge the agent has.
 Rationality ≠ omniscience
    An omniscient agent knows the actual outcome of its actions.
 Rationality ≠ perfection
     Rationality maximizes expected performance, while perfection
     maximizes actual performance.

The proposed definition requires:
  – Information gathering/exploration
      • To maximize future rewards
  – Learn from percepts
      • Extending prior knowledge
  – Agent autonomy
      • Compensate for incorrect prior knowledge

• Rationality depends on
   –   the performance measure that defines degree of success
   –   the percept sequence - everything the agent has perceived so far
   –   what the agent knows about its environment
   –   the actions that the agent can perform

• Agent Function (percepts ==> actions)
   – Maps from percept histories to actions   f: P* → A
   – The agent program runs on the physical architecture to produce the function f
   – agent = architecture + program

              Action := Function(Percept Sequence)
                        If (Percept Sequence) then do Action

• Example: A Simple Agent Function for Vacuum World
              If (current square is dirty) then suck
              Else move to adjacent square
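The two-line agent function above can be written directly in Python. A sketch under the usual assumption of a two-square world with locations 'A' and 'B'; the percept encoding as a (location, status) pair follows the [A, Dirty] example in section 5.1.

```python
def vacuum_agent(percept):
    """Simple agent function for the two-square vacuum world.
    A percept is a (location, status) pair, e.g. ('A', 'Dirty')."""
    location, status = percept
    if status == 'Dirty':
        return 'Suck'            # if current square is dirty, then suck
    # else move to the adjacent square
    return 'Right' if location == 'A' else 'Left'

print(vacuum_agent(('A', 'Dirty')))   # → Suck
print(vacuum_agent(('A', 'Clean')))   # → Right
```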

                   5.3 PEAS Analysis
 To design a rational agent, we must specify the task environment

 PEAS Analysis:
    Specify Performance Measure, Environment, Actuators, Sensors

 Example: Consider the task of designing an automated
  taxi driver
    Performance measure: Safe, fast, legal, comfortable trip, maximize profits
    Environment: Roads, other traffic, pedestrians, customers
    Actuators: Steering wheel, accelerator, brake, signal, horn
    Sensors: Cameras, sonar, speedometer, GPS, odometer, engine
     sensors, keyboard

• Agent: Medical diagnosis system
  – Performance measure: Healthy patient, minimize costs, lawsuits
  – Environment: Patient, hospital, staff
  – Actuators: Screen display (questions, tests, diagnoses, treatments, referrals)
  – Sensors: Keyboard (entry of symptoms, findings, patient's answers)

• Agent: Part-picking robot
  –   Performance measure: Percentage of parts in correct bins
  –   Environment: Conveyor belt with parts, bins
  –   Actuators: Jointed arm and hand
  –   Sensors: Camera, joint angle sensors

           5.4 Environment Types
• Fully observable (vs. partially observable):
  – An agent's sensors give it access to the complete state of the
    environment at each point in time.

• Deterministic (vs. stochastic):
  – The next state of the environment is completely determined by the
    current state and the action executed by the agent. (If the environment
    is deterministic except for the actions of other agents, then the
    environment is strategic).

• Episodic (vs. sequential):
  – The agent's experience is divided into atomic "episodes" (each episode
    consists of the agent perceiving and then performing a single action),
    and the choice of action in each episode depends only on the episode itself.

• Static (vs. dynamic):
  – The environment is unchanged while an agent is deliberating (the
    environment is semi-dynamic if the environment itself does not change
    with the passage of time but the agent's performance score does).

• Discrete (vs. continuous):
  – A limited number of distinct, clearly defined percepts and actions.

• Single agent (vs. multi-agent):
  – An agent operating by itself in an environment.

The environment type largely determines the agent design.

The real world is (of course) partially observable, stochastic,
sequential, dynamic, continuous, multi-agent

  5.4 Structure of an Intelligent Agent
• All agents have the same basic structure:
  – accept percepts from environment
  – generate actions

• A Skeleton Agent:
                         function Skeleton-Agent(percept) returns action
                          static: memory, the agent's memory of the world

                           memory ← Update-Memory(memory, percept)
                           action ← Choose-Best-Action(memory)
                           memory ← Update-Memory(memory, action)
                           return action

• Observations:
  – agent may or may not build the percept sequence in memory (depends on
    the application)
  – performance measure is not part of the agent; it is applied externally to
    judge the success of the agent
                    5.5 Agent Types
1. Simple Reflex Agents
  – are based on condition-action rules and implemented with an appropriate
    production system. They are stateless devices which do not have memory
    of past world states.

2. Reflex Agents With Memory (Model-based)
  – have internal state which is used to keep track of past states of the world.

3. Agents With Goals
  – are agents which in addition to state information have a kind of goal
    information which describes desirable situations. Agents of this kind take
    future events into consideration.

4. Utility-based agents
  – base their decisions on classic axiomatic utility theory in order to act
    rationally
     Note: All of these can be turned into “learning” agents
                 1 Simple Reflex Agent
•   We can summarize part of
    the table by formulating
    commonly occurring
    patterns as condition-action rules
•   Example:
           if car-in-front-brakes
           then initiate braking
•   Agent works by finding a
    rule whose condition
    matches the current
    situation
     – rule-based systems
•   But, this only works if the
    current percept is sufficient
    for making the correct decision

    function Simple-Reflex-Agent(percept) returns action
     static: rules, a set of condition-action rules

     state ← Interpret-Input(percept)
     rule ← Rule-Match(state, rules)
     action ← Rule-Action[rule]
     return action
Simple Reflex - Vacuum Agent

      2 Agents that Keep Track of the World
•   Updating internal state
    requires two kinds of
    encoded knowledge
     – knowledge about how the
       world changes (independent
       of the agent’s actions)
     – knowledge about how the
       agent’s actions affect the world
•   But, knowledge of the
    internal state is not always
    enough
     – how to choose among
       alternative decision paths
       (e.g., where should the car
       go at an intersection)?
     – Requires knowledge of the
       goal to be achieved

    function Reflex-Agent-With-State(percept) returns action
     static: rules, a set of condition-action rules
             state, a description of the current world

     state ← Update-State(state, percept)
     rule ← Rule-Match(state, rules)
     action ← Rule-Action[rule]
     state ← Update-State(state, action)
     return action
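A model-based reflex agent for the vacuum world can be sketched in Python to make the internal state concrete. This sketch is illustrative, not from the slides: the two-square world and the percept encoding as (location, status) pairs are assumptions carried over from section 5.1, and the "model" here is simply a record of the last known status of each square.

```python
class ModelBasedVacuumAgent:
    """Reflex agent with memory: keeps an internal model of which
    squares it has seen clean, so it can stop (NoOp) when done."""

    def __init__(self):
        self.state = {'A': 'Unknown', 'B': 'Unknown'}   # internal world model

    def __call__(self, percept):
        location, status = percept
        self.state[location] = status                   # Update-State
        if status == 'Dirty':
            return 'Suck'
        if all(s == 'Clean' for s in self.state.values()):
            return 'NoOp'                               # model says: all clean
        return 'Right' if location == 'A' else 'Left'

agent = ModelBasedVacuumAgent()
print(agent(('A', 'Dirty')))   # → Suck
print(agent(('A', 'Clean')))   # → Right (B's status still unknown)
print(agent(('B', 'Clean')))   # → NoOp (model now shows both squares clean)
```

Unlike the stateless reflex agent, this one can issue NoOp, which a purely percept-driven agent cannot do because the decision depends on a square it is not currently perceiving.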
   3 Agents with Explicit Goals

 Reasoning about actions
    reflex agents only act based on pre-computed knowledge (rules)
    goal-based (planning) agents act by reasoning about which actions
     achieve the goal
    less efficient, but more adaptive and flexible

      General Architecture for Goal-Based Agents
                    Input percept
                    state ← Update-State(state, percept)
                    goal ← Formulate-Goal(state, perf-measure)
                    search-space ← Formulate-Problem(state, goal)
                    plan ← Search(search-space, goal)
                    while (plan not empty) do
                        action ← Recommendation(plan, state)
                        plan ← Remainder(plan, state)
                        output action

• Simple agents do not have access to their own performance measure
    – In this case the designer will "hard wire" a goal for the agent, i.e. the designer will
      choose the goal and build it into the agent
• Similarly, unintelligent agents cannot formulate their own problem
    – this formulation must be built-in also

• The while loop above is the "execution phase" of this agent's behavior
    – Note that this architecture assumes that the execution phase does not
      require monitoring of the environment
• Knowing current state is not always enough.
  – State allows an agent to keep track of unseen parts of the world, but the agent
    must update state based on knowledge of changes in the world and of effects of
    own actions.
  – Goal = description of desired situation

• Examples:
  – Decision to change lanes depends on a goal to go somewhere (and other factors);
  – Decision to put an item in shopping basket depends on a shopping list, map of
    store, knowledge of menu

• Notes:
  – Reflexive agent concerned with one action at a time.
  – Classical Planning: finding a sequence of actions that achieves a goal.
  – Contrast with condition-action rules: involves consideration of future "what will
    happen if I do ..." (fundamental difference).

4 A Complete Utility-Based Agent

  Utility Function
     a mapping of states onto real numbers
     allows rational decisions in two kinds of situations
         evaluation of the tradeoffs among conflicting goals
         evaluation of competing goals
• Preferred world state has higher utility for agent = quality
  of being useful
• Examples
   –   quicker, safer, more reliable ways to get where going;
   –   price comparison shopping
   –   bidding on items in an auction
   –   evaluating bids in an auction

• Utility function: state ==> U(state) = measure of the desirability of the state
• Search (goal-based) vs. games (utilities).

                      Shopping Agent
• Navigating: Move around store; avoid obstacles
  – Reflex agent: store map precompiled.
  – Goal-based agent: create an internal map, reason explicitly about it, use signs
    and adapt to changes (e.g., specials at the ends of aisles).

• Gathering: Find and put into cart groceries it
  wants, need to induce objects from percepts.
  – Reflex agent: wander and grab items that look good.
  – Goal-based agent: shopping list.

• Menu-planning: Generate shopping list, modify
  list if store is out of some item.
  – Goal-based agent: required; what happens when a needed item is not there?
    Achieve the goal some other way. e.g., no milk cartons: get canned milk or
    powdered milk.

• Choosing among alternative brands
  – utility-based agent: trade off quality for price.
                    Learning Agents

 Four main components:
    Performance element: the agent function
    Learning element: responsible for making improvements by observing performance
    Critic: gives feedback to the learning element by measuring the agent’s performance
    Problem generator: suggests other possible courses of action (exploration)
                   5.6 Summary
• An agent perceives and acts in an environment. It has an
  architecture and is implemented by a program.
• An ideal agent always chooses the action which maximizes its
  expected performance, given the percept sequence received so far.
• An autonomous agent uses its own experience rather than built-
  in knowledge of the environment by the designer.
• An agent program maps from a percept to an action and updates
  its internal state.
• Reflex agents respond immediately to percepts.
• Goal-based agents act in order to achieve their goal(s).
• Utility-based agents maximize their own utility function.

