Production Utility

Shared by: alicejenny
posted:
9/29/2012
							           Production Utility
• When multiple productions match, how to
  choose best one to fire
  – Conflict set: the productions that match
  – Conflict resolution: the scheme for choosing the
    rule to fire
     • For ACT-R - production with highest Utility
                      Utility

• Each production has a utility u associated
  with it
  – Explicitly set or
  – Learned from experience
• Utility = u + noise
  – Noise controlled by :egs (expected gain noise)
     • logistic distribution, mean 0 and variance:

                  $\sigma^2 = \frac{\pi^2}{3} s^2$   (s is the :egs value)
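A minimal sketch of the noise term, assuming the noise is drawn from a zero-mean logistic distribution whose scale s is the :egs value. The helper names (`logistic_noise`, `noisy_utility`) are illustrative and not part of ACT-R; the sampling uses the inverse CDF of the logistic distribution and compares the sample variance with pi^2 * s^2 / 3.

```python
import math
import random

def logistic_noise(s, rng):
    """Draw one sample from a zero-mean logistic distribution with scale s."""
    p = rng.random()
    while p == 0.0:              # random() may return exactly 0.0; avoid log(0)
        p = rng.random()
    return s * math.log(p / (1.0 - p))   # inverse CDF of the logistic

def noisy_utility(u, s, rng):
    """Utility = u + noise, as on the slide (hypothetical helper)."""
    return u + logistic_noise(s, rng)

rng = random.Random(0)           # fixed seed so the run is repeatable
s = 3.0                          # assumed :egs value
samples = [logistic_noise(s, rng) for _ in range(100_000)]
mean = sum(samples) / len(samples)
variance = sum((x - mean) ** 2 for x in samples) / len(samples)
expected_variance = math.pi ** 2 * s ** 2 / 3

print(f"sample mean       {mean:.3f}")
print(f"sample variance   {variance:.3f}")
print(f"pi^2 * s^2 / 3    {expected_variance:.3f}")
```

With a large sample the two variance figures should be close, which is a quick way to check that s has been plugged into the right place.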
      Probability of firing
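Given the productions j in the conflict set, each with utility $U_j$,
the probability of choosing production i is:

   $P(i) = \frac{e^{U_i / (\sqrt{2}\, s)}}{\sum_j e^{U_j / (\sqrt{2}\, s)}}$

    s - expected gain noise (:egs)

    Summation is over all productions in the conflict set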
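The choice rule can be sketched in a few lines, assuming the standard ACT-R form in which each utility is divided by sqrt(2) * s before exponentiating. The function name `choice_probabilities` and the production names `rule-a` and `rule-b` are made up for illustration.

```python
import math

def choice_probabilities(utilities, s):
    """Map {production name: utility} to {production name: probability}.

    Assumes P(i) = exp(U_i / (sqrt(2) * s)) / sum_j exp(U_j / (sqrt(2) * s)).
    """
    scale = math.sqrt(2.0) * s
    top = max(utilities.values())        # subtract the max for numerical stability
    weights = {name: math.exp((u - top) / scale)
               for name, u in utilities.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Two hypothetical productions with equal utility split the choice evenly.
probs = choice_probabilities({"rule-a": 12.0, "rule-b": 12.0}, s=1.0)
print(probs)   # {'rule-a': 0.5, 'rule-b': 0.5}
```

Larger s flattens the distribution toward an even split; smaller s makes the highest-utility production win almost every time.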
        Building-sticks problem
•   Unlimited supply of building sticks
•   Three lengths
•   Build a target stick of a particular length
•   2 basic strategies
    – Start small
    – Start big and saw off
• Tendency to hill climb
Building Sticks
               Bst-nolearn
• A = 15 B = 200 C = 41 Goal = 103
  – Diff B and Goal = 97, diff C and Goal = 62 (the two differ by 35)
  – Only solution B-2C-A
• A = 10 B = 200 C = 29 Goal = 132
  – Diff B and Goal = 68, diff C and Goal = 103
  – Only solution B-2C-A
                   Model
• Run model
• Look at productions and trace
• Encode-under
  – Encodes difference between goal and stick-c
• Encode-over
  – Encodes difference between goal and stick-b
• (spp decide-over decide-under force-over force-under)
          Production parameters
> (spp force-over force-under decide-over decide-under)
Parameters for production FORCE-OVER:
:utility 8.720
:u 10.000
:at 0.050
Parameters for production FORCE-UNDER:
:utility 8.328
:u 10.000
:at 0.050
Parameters for production DECIDE-OVER:
:utility 16.871
:u 13.000
:at 0.050
Parameters for production DECIDE-UNDER:
:utility 6.597
:u 13.000
:at 0.050
           Setting Utility
(spp decide-over :u 13)
(spp decide-under :u 13)
(spp force-over :u 10)
(spp force-under :u 10)
              Calculate probabilities


Difference of 35 -> 1 decide and 2 force productions match.
:ut = -100 means 0 chance of failure, :egs = 3.

Probability(force) = .248 for each force production. Thus there is a
.248 chance that the model will try to solve in the wrong direction.
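The .248 figure can be checked by hand. The working below is a sketch that assumes the usual ACT-R choice rule, with each utility divided by $\sqrt{2}\,s$, and uses the values set earlier: u = 13 for the one matching decide production, u = 10 for each of the two force productions, and s = 3.

```latex
P(\text{force}) =
  \frac{e^{10/(3\sqrt{2})}}{e^{13/(3\sqrt{2})} + 2\,e^{10/(3\sqrt{2})}}
  \approx \frac{10.56}{21.42 + 2(10.56)}
  \approx 0.248
\qquad
P(\text{decide}) \approx \frac{21.42}{42.53} \approx 0.50
```

The three probabilities sum to 1: roughly .50 for the decide production and .248 for each force production.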
             Utility Learning
 • Difference Learning Equation

$U_i(n) = U_i(n-1) + \alpha [R_i(n) - U_i(n-1)]$

          $\alpha$ = learning rate (0.2)
• Ri(n) = reward rule receives for its nth
  application
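The difference learning equation is a one-line update. The sketch below assumes the learning rate of 0.2 given above; the function name `update_utility` and the reward sequence are made up for illustration.

```python
def update_utility(u_prev, reward, alpha=0.2):
    """One step of U(n) = U(n-1) + alpha * (R(n) - U(n-1))."""
    return u_prev + alpha * (reward - u_prev)

u = 10.0                                  # starting utility
rewards = [20.0, 20.0, 0.0]               # hypothetical rewards for three firings
history = []
for n, reward in enumerate(rewards, start=1):
    u = update_utility(u, reward)
    history.append(u)
    print(f"after application {n}: U = {u:.3f}")
```

Each application moves U a fifth of the way toward the reward just received, so a run of high rewards raises the utility gradually and a single low reward pulls it back down.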
                       Reward
• Asynchronous
• Ri(n) = reward value - time from rule selection to
  the reward
• Applies to all productions between current reward
  and previous reward
• (trigger-reward val), where val is a number or nil
• spp
      (spp read-done :reward 20)
      (spp pick-another-strategy :reward 0)
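A sketch of how one reward could be shared out, assuming every production selected since the previous reward receives the reward value minus the time elapsed since its own selection. The function name, the production names, and the times (taken to be in seconds) are hypothetical.

```python
def effective_rewards(reward_value, reward_time, selections):
    """Return {production: R} for productions selected since the last reward.

    selections is a list of (production name, selection time) pairs.
    Assumes R = reward value - (time of reward - time of selection).
    """
    return {name: reward_value - (reward_time - selected_at)
            for name, selected_at in selections}

# Hypothetical trace: two productions selected before a reward of 20 at t = 3.0
selections = [("encode", 0.5), ("decide", 1.0)]
rewards = effective_rewards(20.0, 3.0, selections)
print(rewards)   # {'encode': 17.5, 'decide': 18.0}
```

Productions selected longer before the reward receive less of it, so under this scheme rules that lead to a reward quickly are favored.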
                 Learning in BST
             a    b    c    Goal   %OVERSHOOT
            15   250   55    125      20
            10   155   22    101      67
            14   200   37    112      20
            22   200   32    114      47
            10   243   37    159      87
            22   175   40     73      20
            15   250   49    137      80
            10   179   32    105      93
            20   213   42    104      83
            14   237   51    116      13
            12   149   30     72      29
            14   237   51    121      27
            22   200   32    114      80
            14   200   37    112      73
            15   250   55    125      53


• Majority look like undershoot; however, the majority
  can only be solved by overshoot
• 1st and last
   – Solved by either
   – 20% overshoot for 1, 53% overshoot for last
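The claim that most problems look like undershoot can be checked against the table. The sketch below assumes "looks like" means the strategy whose starting stick is closer to the goal (the hill-climbing tendency): stick b for overshoot, stick c for undershoot. The function and variable names are illustrative.

```python
# (a, b, c, goal) for the 15 problems in the table above
PROBLEMS = [
    (15, 250, 55, 125), (10, 155, 22, 101), (14, 200, 37, 112),
    (22, 200, 32, 114), (10, 243, 37, 159), (22, 175, 40, 73),
    (15, 250, 49, 137), (10, 179, 32, 105), (20, 213, 42, 104),
    (14, 237, 51, 116), (12, 149, 30, 72), (14, 237, 51, 121),
    (22, 200, 32, 114), (14, 200, 37, 112), (15, 250, 55, 125),
]

def looks_like(b, c, goal):
    """Pick the strategy whose starting stick is closer to the goal."""
    over = b - goal        # distance when starting with the long stick
    under = goal - c       # distance when starting with the short stick
    return "undershoot" if under < over else "overshoot"

labels = [looks_like(b, c, goal) for _a, b, c, goal in PROBLEMS]
n_under = labels.count("undershoot")
print(f"{n_under} of {len(PROBLEMS)} problems look like undershoot")
```

This only measures first appearances; it says nothing about which strategy actually reaches the goal.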
                           BST-Learn
• (sgp :ul t)
• Initial:
    – force-over :u 10.0
    – force-under :u 10.0
    – decide-over :u 13.0
    – decide-under :u 13.0

> (collect-data 100)
CORRELATION: 0.775
MEAN DEVIATION: 18.506

Trial   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
       29  51  61  67  89  42  79  79  57  44  28  35  62  75  51

DECIDE-OVER : 13.1984
DECIDE-UNDER: 11.3041
FORCE-OVER : 12.0823
FORCE-UNDER : 6.5455
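The two fit statistics can be reproduced from numbers already shown, assuming the comparison data is the %OVERSHOOT column of the Learning in BST table and that MEAN DEVIATION is a root-mean-square deviation. Both assumptions are inferred from the figures, not stated on the slide; the function names are illustrative.

```python
import math

observed = [20, 67, 20, 47, 87, 20, 80, 93, 83, 13, 29, 27, 80, 73, 53]  # %OVERSHOOT column
model    = [29, 51, 61, 67, 89, 42, 79, 79, 57, 44, 28, 35, 62, 75, 51]  # per-trial model output

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def rms_deviation(xs, ys):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))

r = pearson(observed, model)
dev = rms_deviation(observed, model)
print(f"correlation    {r:.3f}")
print(f"rms deviation  {dev:.3f}")
```

Under these assumptions the results agree with the reported CORRELATION and MEAN DEVIATION to three decimal places.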
 Set/show production parameter-SPP
• Name
• :at – action time of production
   – defaults to :dat
• :u
   – Current U value
   – Defaults to :iu
• :utility
   – Utility last computed during conflict
      resolution
   – Can’t be set
• :reward
   – Reward triggered by production firing

						