# Production Utility


```
Production Utility
• When multiple productions match, how do we
choose the best one to fire?
– Conflict set: the productions that match
– Conflict resolution: the scheme for choosing the
rule to fire
• In ACT-R: the production with the highest utility fires
Utility

• Each production has a utility u associated
with it
– Explicitly set or
– Learned from experience
• Utility = u + noise
– Noise controlled by :egs (expected gain noise), written s
• logistic distribution, mean 0 and variance:

\sigma^2 = \frac{\pi^2}{3} s^2
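Probability of firing
Given the productions j in the conflict set, each with utility U_j:

The probability of choosing production i:

P(i) = \frac{e^{U_i / (\sqrt{2} s)}}{\sum_j e^{U_j / (\sqrt{2} s)}}

s - expected gain noise (:egs)

Summation is over all productions in the conflict set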
Building-sticks problem
•   Unlimited supply of building sticks
•   Three lengths
•   Build a target stick of a particular length
•   2 basic strategies
– Start small
– Start big and saw off
• Tendency to hill climb
Building Sticks
Bst-nolearn
• A = 15 B = 200 C = 41 Goal = 103
– Diff B and Goal = 97, diff C and Goal = 62
– Only solution B-2C-A
• A = 10 B = 200 C = 29 Goal = 132
– Diff B and Goal = 68, diff C and Goal = 103
– Only solution B-2C-A
Model
• Run model
• Look at productions and trace
• Encode-under
– Encodes difference between goal and stick-c
• Encode-over
– Encodes difference between goal and stick-b
• (spp decide-over decide-under force-over force-under)
Production parameters
> (spp force-over force-under decide-over decide-under)
Parameters for production FORCE-OVER:
:utility 8.720
:u 10.000
:at 0.050
Parameters for production FORCE-UNDER:
:utility 8.328
:u 10.000
:at 0.050
Parameters for production DECIDE-OVER:
:utility 16.871
:u 13.000
:at 0.050
Parameters for production DECIDE-UNDER:
:utility 6.597
:u 13.000
:at 0.050
Setting Utility
(spp decide-over :u 13)
(spp decide-under :u 13)
(spp force-over :u 10)
(spp force-under :u 10)
Calculate probabilities

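Difference of 35 -> 1 decide and 2 force productions in the conflict set;
:ut = -100 means 0 chance of failure, :egs = 3

Probability(force) = e^{10 / (3\sqrt{2})} / (e^{13 / (3\sqrt{2})} + 2 e^{10 / (3\sqrt{2})}) = .248
for each force production. Thus there is a .248 chance that the model
will try to solve in the wrong direction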
Utility Learning
• Difference Learning Equation

U_i(n) = U_i(n-1) + \alpha [R_i(n) - U_i(n-1)]

• \alpha = learning rate (0.2)
• R_i(n) = reward the rule receives for its nth
application
Reward
• Asynchronous
• R_i(n) = reward value - time from rule selection to
the reward
• Applies to all productions fired between the previous
reward and the current reward
• (trigger-reward val), where val = number or nil
• spp
(spp pick-another-strategy :reward 0)
Learning in BST
a    b    c    Goal   %OVERSHOOT
15   250   55    125      20
10   155   22    101      67
14   200   37    112      20
22   200   32    114      47
10   243   37    159      87
22   175   40     73      20
15   250   49    137      80
10   179   32    105      93
20   213   42    104      83
14   237   51    116      13
12   149   30     72      29
14   237   51    121      27
22   200   32    114      80
14   200   37    112      73
15   250   55    125      53

• The majority look like undershoot problems; however, the majority
can only be solved by overshoot
• 1st and last problems
– Solved by either strategy
– 20% overshoot for the 1st, 53% overshoot for the last
BST-Learn
• (sgp :ul t)
• Initial:
– force-over :u 10.0
– force-under :u 10.0
– decide-over :u 13.0
– decide-under :u 13.0

> (collect-data 100)
CORRELATION: 0.775
MEAN DEVIATION: 18.506

Trial  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
      29 51 61 67 89 42 79 79 57 44 28 35 62 75 51

DECIDE-OVER : 13.1984
DECIDE-UNDER: 11.3041
FORCE-OVER : 12.0823
FORCE-UNDER : 6.5455
Set/show production parameters: spp
• Name
• :at – action time of production
– defaults to :dat
• :u
– Current U value
– Defaults to :iu
• :utility
– Utility last computed during conflict
resolution
– Can’t be set
• :reward
– Reward triggered by production firing

```
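
The "Calculate probabilities" slide reports a .248 chance for a force production. The following is a minimal sketch of that arithmetic, assuming the conflict-resolution rule is a softmax over utilities with temperature √2·s, where s is the :egs value. The function name `firing_probabilities` and the dictionary layout are illustrative choices, not part of ACT-R.

```python
import math

def firing_probabilities(utilities, egs):
    """Return the chance of each production being chosen.

    Assumes P(i) = exp(U_i / (sqrt(2) * s)) / sum_j exp(U_j / (sqrt(2) * s)),
    with s = egs. `utilities` maps production name -> utility.
    """
    temperature = math.sqrt(2) * egs
    # Subtract the maximum before exponentiating for numerical stability;
    # this does not change the resulting ratios.
    top = max(utilities.values())
    weights = {name: math.exp((u - top) / temperature)
               for name, u in utilities.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Conflict set from the slide: one decide production (u = 13) and the
# two force productions (u = 10 each), with :egs = 3.
conflict_set = {"decide": 13.0, "force-over": 10.0, "force-under": 10.0}
probs = firing_probabilities(conflict_set, egs=3.0)

for name, p in probs.items():
    print(f"{name}: {p:.3f}")
```

With these inputs each force production comes out at about .248 and the decide production at about .503, which matches the figure on the slide.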
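
The difference learning equation and the reward rule from the "Utility Learning" and "Reward" slides can be traced by hand in the same way. This sketch is not ACT-R code: the helper names and the sample reward value and delay are hypothetical, chosen only to show one update step with the learning rate of 0.2 given on the slide.

```python
def effective_reward(reward_value, seconds_since_selection):
    """Reward a production receives: the reward value minus the time
    elapsed between that production's selection and the reward."""
    return reward_value - seconds_since_selection

def update_utility(previous_u, reward, alpha=0.2):
    """One step of U(n) = U(n-1) + alpha * (R(n) - U(n-1))."""
    return previous_u + alpha * (reward - previous_u)

# Hypothetical example: a production with u = 10 was selected 2.5 s
# before a reward of 20 was triggered.
u_before = 10.0
r = effective_reward(20.0, 2.5)      # 17.5
u_after = update_utility(u_before, r)
print(u_after)                        # 11.5
```

Because the update moves the utility only a fraction alpha of the way toward the reward, a production that keeps receiving the same reward approaches that value gradually rather than jumping to it.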