					Instrumental Conditioning II
        Delay of Reinforcement

[Diagram: the animal makes a choice (correct or incorrect); a delay follows; then the goal: reward after a correct choice, no reward after an incorrect one.]
                        Grice (1948)
Grice (1948) Results: discrimination learning in rats fell off sharply as the
delay between the correct choice and the food reward increased.
 Overcoming the effects of delay
• Secondary reinforcers

• “Marking” procedure
Lieberman, McIntosh & Thomas (1979)
                 Reinforcement      Punishment

Positive         Chocolate bar      Electric shock
contingency

Negative         Excused from       No TV
contingency      chores             privileges

Effect on Rate
Professor Drew
Anticipatory Contrast - Crespi (1942)
Rats run down a maze to find food pellets in the goal arm.
       What is a reinforcer?
Thorndike: A stimulus that produces a “satisfying state of affairs.”

Operational Definition (behaviorists): That which
increases the probability of the response that preceded it.
         Drive Reduction Theory

              Compare with Set Point

[Diagram: when the amount of H2O in the body falls below the set point, a drive to seek water arises; at or above the set point, the animal does not seek water.]
 Drive Reduction Considered: Are
reinforcers necessary for survival?

– Eating to excess

– Drugs of Abuse

– “Pleasure centers” of the brain
    Behavioral Regulation View: The
           Premack Principle
• Behaviors are reinforcing, not stimuli
• To predict what will be reinforcing, observe
  the baseline frequencies of different behaviors
• Highly probable behaviors will reinforce
  less probable behaviors
      Premack Revised: The Response
          Deprivation Hypothesis
            Timberlake & Allison (1974)
• Low frequency behaviors can reinforce high
  frequency behaviors (and vice versa)

• All behaviors have a preferred frequency = the
  behavioral bliss point

• Deprivation below that frequency is aversive, and
  organisms will work to remedy this
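The response deprivation condition is often given a simple formal statement (a sketch, not from the slides): a schedule requiring I units of the instrumental behavior per C units of the contingent behavior creates deprivation, and hence makes the contingent behavior a reinforcer, when I/C exceeds the baseline ratio. The numbers below are hypothetical.

```python
def is_reinforcer(sched_instrumental, sched_contingent,
                  base_instrumental, base_contingent):
    """Response deprivation condition (after Timberlake & Allison):
    the contingent behavior becomes a reinforcer when the schedule
    forces it below its baseline proportion, i.e. when
    I/C > O_I/O_C for schedule terms I, C and baselines O_I, O_C."""
    return (sched_instrumental / sched_contingent
            > base_instrumental / base_contingent)

# Hypothetical baseline (free access): 250 s of running, 50 s of drinking.
# Schedule A: 10 s of running buys 1 s of drinking. 10/1 > 250/50, so
# drinking is deprived and will reinforce running.
print(is_reinforcer(10, 1, 250, 50))   # True
# Schedule B: 2 s of running buys 1 s of drinking. 2/1 < 250/50, so
# drinking is not deprived.
print(is_reinforcer(2, 1, 250, 50))    # False
```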
      Response deprivation hypothesis

[Scale: ice cream consumption in pints per night, from 0.25 to 2.5. The bliss point is 1.0 pint/night; below it the person will work to obtain ice cream, above it will work to avoid ice cream.]
Contiguity versus Contingency in
      operant conditioning
               Degraded Contingency Effect
[Diagram: timelines of bar presses and food deliveries. Perfect contingency (food only follows presses): strong responding. Degraded contingency (food also delivered independently of presses): weak responding.]
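One standard way to quantify degree of contingency (a sketch; the slides do not give a formula) is ΔP, the probability of the outcome given a response minus its probability given no response:

```python
def delta_p(p_outcome_given_response, p_outcome_given_no_response):
    """Delta-P contingency measure: positive when responding raises the
    probability of the outcome, zero when the outcome arrives at the
    same rate whether or not the animal responds."""
    return p_outcome_given_response - p_outcome_given_no_response

print(delta_p(1.0, 0.0))  # perfect contingency: 1.0
print(delta_p(1.0, 1.0))  # fully degraded (free food either way): 0.0
```

A degraded contingency keeps the press-food pairing intact but raises the probability of food in the absence of pressing, driving ΔP toward zero.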
         G.V. Thomas (1983)
          Contiguity pitted against contingency

“Free” reinforcers given every 20s
Lever press advances delivery of pellet, but
cancels pellet for next 20-s interval

             [Timeline: free pellets scheduled at 20 s, 40 s, and 60 s]
So if you press at second 2, you get a pellet immediately,
but you get no pellet during seconds 3-20 and 21-40.
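This schedule can be sketched in a few lines (the function and interval bookkeeping are my own illustration of the description above): pressing yields an immediate pellet (good contiguity) but fewer pellets overall (negative contingency).

```python
def pellets(press_times, session=60, interval=20):
    """Sketch of the Thomas (1983) schedule as described above:
    a free pellet is scheduled at the end of every 20-s interval,
    but a lever press delivers a pellet immediately and cancels the
    free pellets for the current and the following interval."""
    delivered = list(press_times)             # immediate pellets
    cancelled = set()
    for t in press_times:
        k = t // interval                     # interval containing the press
        cancelled.update({k, k + 1})
    for k in range(session // interval):
        if k not in cancelled:
            delivered.append((k + 1) * interval)  # surviving free pellet
    return sorted(delivered)

print(pellets([]))    # [20, 40, 60]  three free pellets
print(pellets([2]))   # [2, 60]  immediate pellet, but fewer in total
```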
       “Superstitious Behavior”
• Suggested that temporal contiguity more
  important than contingency
• 15-s fixed-time (FT) food delivery, no response requirement
• “adventitious reinforcement”
“In 6 out of 8 cases the resulting responses were so
clearly defined that two observers could agree
perfectly in counting instances. One bird was
conditioned to turn counter-clockwise about the cage,
making 2 or 3 turns between reinforcements. Another
repeatedly thrust its head into one of the upper corners
of the cage….”
[Figure: recorded response categories: toward feeder, near feeder, along wall, ¼ turn]
“Misbehavior” and the limits of
    operant conditioning
 Limits of Operant Conditioning
• Some behaviors can’t be conditioned
  – Yawning
  – Scratching

• Belongingness
  – Presentation of a female won’t reinforce biting

• “Misbehavior”
Marian Breland Bailey – How to train a chicken
The famous dancing chicken
What is learned in operant conditioning?
          What is learned?
Edwin Guthrie: mere contiguity of a
stimulus and a behavior stamps in
that S-R association; reinforcement is not
necessary.

      S → R
      What is learned?

Thorndike: reinforcement “stamps in” the S-R bond

      S → R
    What is learned?

S → R

And the outcome O: where does it fit in?
       2-Process Theory

  S → R    (instrumental S-R association)
  S → O    (Pavlovian S-O association)

According to 2-process theory, the Pavlovian S-O association motivates the instrumental response.
      Evidence for 2-process theory
               Pavlovian-Instrumental Transfer

Phase 1                Phase 2               Test
LeverFood            LightFood            Light: #Presses?
                                            No Light: #Presses?

[Graph: more presses during the Light than with no CS; the presence of the CS elevates responding]
           What is learned?

    S → R → O

Does the Pavlovian S-O association activate a
vague emotional state or a specific mental
representation of the outcome?
 Specific Outcome Representations

 Phase 1           Phase 2                 Test
 (operant)         (classical)
R LeverPellet    TonePellet     Tone:Left? Right?
L LeverSucrose   LightSucrose   Light:Left? Right?

[Graph: right-lever presses during each CS; the CS paired with the right lever’s outcome selectively elevates right-lever pressing]