First Complete Draft: May 15, 2008
David M. Buss
A. What is Learning?
a. Habituation and Sensitization
B. Classical Conditioning
1. Pavlov’s Discovery of Classical Conditioning
2. Major Principles of Classical Conditioning
a. acquisition
b. extinction
c. reacquisition
d. spontaneous recovery
e. stimulus generalization and the case of Little Albert
f. stimulus discrimination
3. A Cognitive Reformulation of Classical Conditioning
4. An Evolutionary Perspective on Conditioning
Ongoing Debate: If Human Behavior is Controlled by Contingencies of
Reinforcement, Are People Responsible for Their Own Actions?
C. Operant Conditioning
1. Thorndike and the Law of Effect
2. Reinforcement and Punishment
3. Major Principles of Operant Conditioning
a. Primary and secondary reinforcers
b. Operant generalization and operant discrimination
c. Extinction
d. Schedules of reinforcement
4. Operant Conditioning Inside the Head—Cognition and the Brain
a. Cognition and learning
b. Brain mechanisms and learning
5. An Evolutionary Perspective on Operant Conditioning
D. Observational Learning
1. An Evolutionary Perspective on Social Learning
Adapting to Your World: Using Principles of Learning to Improve Your Life
F. Chapter Summary
G. Further Exploration
Psychologists became excited about the influence of food rewards on puzzle-
solving ability in chimpanzees. So they set up an experiment. Every time the chimp
solved a puzzle successfully, the experimenter rewarded the chimp by giving it a banana slice. As you probably know already, chimps, like most people, really like the taste of bananas.
One of the chimps turned out to be exceptionally good at solving puzzles. As the
experiment continued, this smart chimp solved the puzzles at a rapid rate. And each time,
the experimenter rewarded the chimp with another banana slice. Before long, the chimp
began to stockpile his earned banana slices. And then something dramatic happened.
The chimp solved another puzzle, and waited for his reward. But the researcher had run
out of banana slices. The chimp looked at the researcher. The researcher looked at the
chimp. Then the chimp picked up one of his banana slices and handed it back to the researcher.
The chimp continued to solve more puzzles, each time handing the experimenter
back another banana slice. Finally, it was the experimenter who had the entire pile of
banana slices! This true story brings us to the central question of this chapter: How do
humans and other species learn?
WHAT IS LEARNING?
1. Define learning
2. Describe the concept of habituation
3. Contrast habituation with sensitization
An infant has no vocabulary, can’t walk, and is years away from acquiring the
skill to drive a car. Adults, in contrast, know an average of roughly 50,000 words, can
walk without thinking about it, and most have mastered driving. Learning is the collection of processes by which experiences cause relatively enduring changes in an individual's psychology or behavior. The "relatively enduring" component distinguishes learning from changes that are transient, such as those caused by fatigue or by changes in the temperature of the air. The inclusion of both psychology and behavior is also critical to the definition of learning. You can learn a new vocabulary word such as "dearth" (which means a shortage or lack) and then use it in conversation (e.g., "There is a dearth of good coffee shops near my house"). Or you can learn the same vocabulary word and keep it stored in your memory even if you never end up using it. The learning exists in
your head, even if it is not expressed in behavior.
The three examples above—learning to walk, acquiring a 50,000 word vocabulary,
and driving a car—illustrate that learning does not occur through a single process but
rather a collection of processes. Most infants eventually learn to walk, but they do so without formal instruction or a lot of help. Once children begin to learn words, they do
so at an astonishingly rapid rate of 10 words a day so that they possess a vocabulary of
10,000 words by the age of six. As we will see in Chapter 9, there seems to be a critical
period in which language learning comes naturally and easily (notice how much more
difficult it is to learn a second language as an adult). Learning words, though, requires
environmental exposure to those words—a different sort of experience than that required
to learn to walk. Now consider learning to drive. Unlike walking or word learning,
humans do not have an evolved propensity to learn how to drive a car. That type of
learning requires watching how others do it and imitating their actions, often combined
with formal instruction. The key point is that not all forms of learning are the same. They occur through different processes, some involving things humans seem almost wired to learn, and others involving things for which humans have no natural learning inclination.
Habituation and Sensitization
One of the most primitive forms of learning is habituation, the progressive
reduction in intensity or frequency of a response to a stimulus as a consequence of
repeated exposures to the stimulus. A classic example of habituation occurs in people
who live near train tracks. At first, the sounds from the trains are jarring and annoying.
Over time, however, people gradually habituate to the noise and even get to the point
where they don’t hear it at all. I once visited a friend who lived near train tracks, and
when a train roared by I asked him: "How can you concentrate with all that noise?" He replied: "What noise?" He had totally habituated to the sounds. Other examples of
habituation include gradually getting used to hot weather over time after moving to a city
closer to the equator, and gradually experiencing a lowered response to alcohol after
repeated episodes of drinking beer.
The adaptive value of habituation is clear. Organisms have a limited amount of
attention, and so "tune out" stimuli that do not require an adaptive response. Loud noises
often signal an impending threat. Habituation to loud noises that do not signal threats
allows an individual to allocate attention to more pressing adaptive problems. In the case
of habituating to living in a hot climate, the body acclimatizes adaptively. As a general
rule, habituation occurs more quickly if the repeated events occur in rapid succession or
close in time (Miller & Grace, 2000).
Sensitization, another primitive type of learning, is the progressive amplification
of a response following repeated exposures to a stimulus. Have you ever worn a shirt
that had an inside label that chafed your skin? Typically, when you first put on the shirt, the chafing is minor. But after the label repeatedly rubs on the same spot on your skin, it
feels increasingly irritating. You have become sensitized to the repeated rubbing of the
label. Sensitization is the opposite of habituation. Whereas habituation is the progressive
reduction of a response to repeated exposures to a stimulus, sensitization is the
progressive amplification of responses to repeated exposures. Sensitization has been
documented in an astonishing array of organisms, including humans, cats, and goldfish
(Kimble, 1961). Even single-celled amoebas show sensitization, suggesting that it is an evolutionarily ancient learning mechanism (Mast & Pusch, 1924).
Sensitization often produces adaptive responses, signaling to the organism an
adaptive problem that should be dealt with. In the case of the increasing pain caused by the shirt label repeatedly chafing the skin, sensitization increases the likelihood that you will
remove the label before it breaks the skin membrane. Sensitization, however, has also
been implicated in maladaptive responses such as panic anxiety and post-traumatic stress
disorder (Rosen & Schulkin, 1998). Soldiers who have been exposed to traumatic
experiences such as explosions and loud gunfire sometimes develop a maladaptive
hyper-sensitivity to any loud noises, even when they are safely home and away from
battle (Yehuda, 2002).
The role of sensitization in creating maladaptive responses such as those of post-traumatic stress disorder is likely due to a mismatch between ancestral and modern environments (Cantor, 2005). Ancestral environments did not contain the sorts of loud explosions caused by modern bombs and guns. So soldiers exposed to evolutionarily
novel traumatic experiences become overly sensitized to loud noises. Sensitization, a
learning mechanism that functions adaptively most of the time, can create disorders when
there are severe mismatches between ancestral and modern environments.
Habituation and sensitization are the simplest and likely the most evolutionarily primitive forms of learning in that they are both types of non-associative learning: learning that occurs without being paired or associated with a reward or punishment. The types of learning to which we now turn are forms of associative learning—learning that occurs as a consequence of a response being paired or associated with a reward or punishment.
Learning is defined as the processes by which experiences cause relatively
enduring changes in an organism’s psychology or behavior. Two of the most
evolutionarily primitive forms of learning are habituation and sensitization. Habituation
is the progressive reduction of responding to a stimulus after repeated exposures to it.
Sensitization is the progressive intensification of responses after repeated exposures to a stimulus.
Can you think of an example from your own life in which you habituated to
something after repeated exposure to it?
Now can you think of an example in which you became sensitized to something
after repeated exposures to it?
In what ways do these examples from your life differ from each other?
CLASSICAL CONDITIONING
1. Define classical conditioning.
2. Distinguish between unconditioned and conditioned stimuli.
3. Distinguish between unconditioned and conditioned responses.
4. Correctly identify the six major principles of classical conditioning:
Acquisition, extinction, reacquisition, spontaneous recovery, stimulus
generalization, and stimulus discrimination
5. Describe the cognitive reformulation of classical conditioning
6. Define the evolutionary concept of preparedness in learning.
7. Define the evolutionary concept of adaptive specializations in learning.
When I was a college student, I used to walk past a pizza parlor called "Blondies"
on my way home from classes. It smelled so good that often I could not resist stopping in
for a slice. Even on those days when I did resist, the fresh steaming pizza aroma
immediately caused my mouth to water. By the end of the term, my mouth would water
as soon as I saw the drug store right before the pizza place, even before I could smell
anything. I had learned to salivate at the sight of the drug store through classical
conditioning, a process of learning by which a neutral stimulus (drug store) evokes a
response (salivation) after being paired with something (eating pizza) that previously
evoked the response. The Russian physiologist Ivan Pavlov (1849-1936) discovered
classical conditioning through his experiments on the salivary response. But his subjects
were not people; they were dogs. And he taught them to salivate not to pizza smells or
the sight of a drug store, but rather to the tone of a bell.
Pavlov’s Discovery of Classical Conditioning
In Pavlov's experiments, he inserted a plastic tube into the dog's salivary gland in order to collect saliva (see Figure 6.1). Then he placed the dog in a harness. In
the first phase of the experiment, he gave the dog appetizing food. To no one’s surprise,
the dog salivated. In the terminology of classical conditioning, the food is referred to as
an unconditioned stimulus (US), defined as something that evokes a natural response.
Other examples of unconditioned stimuli include puffs of air that naturally evoke an eye-
blink reflex; the doctor’s tap on your knee that naturally evokes the knee-jerk reflex; or
exposure to freezing cold that naturally evokes the shivering response. The dog’s
salivation is called an unconditioned response (UR), defined as the reaction naturally
elicited by the unconditioned stimulus. The salivation is ―unconditioned‖ in the sense
that it requires no prior learning, or conditioning, to occur.
Insert Figure 6.1 About Here
Pavlov discovered that he could condition dogs to salivate in response to stimuli
that initially have nothing to do with eating, such as the ring of a bell, the sound of a
buzzer, or even a flash of light. His procedure consisted of repeatedly pairing the
unconditioned stimulus (food) with a conditioned stimulus (CS), defined as a stimulus that is initially neutral and does not evoke the response. Bells, buzzers, and flashes of light are all examples of conditioned stimuli because they do not initially elicit the salivary response from the dog. But after repeatedly pairing the sound of a bell (CS) with the food (US), eventually the dog will begin to salivate to the sound of the bell. Salivation in response to a bell or
tuning fork after these repeated pairings is called a conditioned response (CR), a
reaction produced by the conditioned stimulus that mimics the unconditioned response.
This process, involving four key elements of classical conditioning, is shown in Figure 6.2.
Insert Figure 6.2 About Here
These four elements of classical conditioning (US, UR, CS, and CR) explain learning in a variety of settings. They explain why my taste buds got activated by the sight of the drug store. I have a natural salivary response (UR) to pizza (US). After repeated pairings of the sight of the drug store (CS) with the pizza aroma (US), my mouth began to water at the mere sight of the drug store (CR).
Heroin addicts notoriously get classically conditioned to the sight and feel of the
needles they use to inject the illegal drugs. By repeatedly pairing an initially neutral
stimulus (the needle) with the natural pleasure produced by the heroin (US), they
eventually experience pleasure simply by the sight and feel of the needle.
Examples of classical conditioning abound in television commercials.
Advertisers are notorious for using physically attractive people to sell products. As we
will see in Chapter 14 (Social), people find looking at attractive individuals (US) to be
intrinsically rewarding, causing us to feel pleasure when we look at them (UR). By
repeatedly pairing attractive individuals (US) with initially neutral stimuli such as
particular brands of soft drinks, laundry soap, or cars (CSs), people eventually find it
rewarding to look at and use these products even when they are not paired with attractive
individuals. Consumers get classically conditioned to find certain products rewarding.
Major Principles of Classical Conditioning
When Pavlov’s (1923, 1927) experiments on classical conditioning were
published, psychologists greeted them with great enthusiasm. The process of classical
conditioning could be used to explain changes in behavior without resorting to
mentalistic concepts such as wants, needs, desires, or even hunger. It promised a
rigorous scientific psychology in which overt behavior, and changes in behavior, became
the primary focus. And importantly, the behaviorists, as they came to be called, believed
that the laws of learning they discovered were highly general; they were assumed to
apply across a wide variety of species and a wide variety of circumstances (Domjan, in
press). The assumption of generality, as we will see, later came to be challenged. In the
meantime, however, behaviorists made important advances in discovering some basic
principles of classical conditioning.
The period of classical conditioning in which the CS and US are repeatedly paired
in order to create the CR is called the acquisition phase. The dog does not begin to
salivate buckets of drool after the first pairing. Rather, the salivary response to the
neutral stimulus (e.g., bell or tone) is relatively mild after a few pairings, rapidly
increases, and then reaches a plateau (see left panel of Figure 6.3). I discovered this with
my dog Dexter. When I first started to feed him dinner, I would say the word "dinner." Initially, he did not respond to the word. After several pairings of the word "dinner" followed by his food, the simple word "dinner" would cause Dexter to lick his lips. He had acquired a conditioned response of lip licking to the conditioned stimulus of "dinner."
Insert Figure 6.3 About Here
What happens if you stop pairing a CS such as a tone with a US such as food? You probably guessed correctly that the CR, in this case salivation to the tone, will eventually decrease. Extinction is the process by which a conditioned response gradually diminishes in magnitude when the US is no longer paired with the CS (see middle panel of Figure 6.3). Just for fun, I tried this with Dexter. I said the word "dinner," but did not
follow it with food. Gradually, his lip-licking response grew less and less frequent until
eventually Dexter stopped doing it entirely.
How rapidly does the conditioned response return when the CS and US are paired
again? It turns out that the answer is "very rapidly," as seen in the third panel of Figure 6.3. This rapid recovery of the conditioned response is called reacquisition. I tried this
out on Dexter. After a couple of weeks of extinction, I once again followed the word
"dinner" with Dexter's food. It took him only a single instance for his conditioned
response of lip licking to return. He had reacquired the conditioned response.
Sometimes it doesn’t even take another conditioning trial for the conditioned
response to return. Pavlov eliminated the salivary conditioning to the tone through
extinction. Then he gave the poor dogs a break from his experiments for a brief period of
time. Then he put the dogs back in their harnesses, and sounded the tone. Interestingly,
the dogs began to salivate again, even though the tone was not followed by food. This is
called spontaneous recovery, the return of an extinguished conditioned response after a
rest period. This fascinating phenomenon tells us that extinction does not totally
eliminate conditioning. There is residual learning called "savings." Somewhere in the
dog’s brain, there remain faint traces of the conditioning.
Let’s revisit the dog who was conditioned to salivate to a particular tone, such as
one of 5000 Hz. Do you think that the dog will salivate only to that tone, or will it
salivate to higher or lower pitched tones such as 5100 Hz or 4900 Hz? It turns out that
dogs do respond to tones different from those on which they were conditioned. This
phenomenon is called stimulus generalization, the spread of the conditioned response to
stimuli that are similar to the conditioned stimulus.
Stimulus generalization does not just occur with dogs and salivation. It occurs
with humans too. In a famous example, the behaviorist John Watson showed a white rat
to a small boy named Little Albert. Little Albert liked the white rat and wanted to play
with it. Then Watson did something that most people would now consider cruel. As
Little Albert reached for the white rat, Watson came up behind him and unexpectedly
made a loud clanging noise. The noise startled Little Albert, and he began to cry. After a
few trials of this, Little Albert grew afraid of the white rat. He started to cry when shown
the white rat, even if the loud noise did not follow. The fascinating finding, though, was
that Little Albert’s conditioned fear generalized. He also showed fear of stimuli that bore
some resemblance to the white rat, including a white rabbit, a fur coat, some white cotton
balls, and even the white beard of Santa Claus. Watson thus showed that classical conditioning occurs not just in dogs and other non-human organisms, but in humans as well.
Stimulus generalization makes good adaptive sense. In everyday life, stimuli are
rarely exactly identical to each other. Suppose you initially liked to play with a white
furry rat, but it bit you hard on your finger, drawing blood. It would be wise to
generalize, and avoid not just that one white rat, but other rats as well. Or if you burned
your finger on a stove, you would be wise to be careful not merely around that one stove,
but around all stoves. Stimulus generalization allows an organism to learn about classes
of things that share similar elements.
As a general rule, the closer the new stimulus is to the conditioned stimulus, the
greater the response to it. And the more distant it is, the less the response to it.
Conditioned salivation to a 5000 Hz tone will generalize to a 4900 Hz or 5100 Hz tone. As the tone becomes more and more dissimilar from the original tone, however, the generalization falls off. This is called a generalization gradient (see Figure 6.4).
Insert Figure 6.4 About Here
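One way to picture a generalization gradient is as a simple function that maps the distance between a test tone and the training tone onto response strength. The brief sketch below is purely illustrative: the bell-shaped (Gaussian) falloff and the 200 Hz width are assumptions made for the example, not findings from the chapter, but the pattern it prints captures the rule just described: strongest responding at 5000 Hz, and weaker responding as the test tone moves away.

```python
import math

# Illustrative sketch of a generalization gradient: response strength falls off
# as a test tone moves away from the 5000 Hz tone used in conditioning.
# The Gaussian shape and the 200 Hz width are assumptions for illustration only.

def response_strength(test_hz, trained_hz=5000.0, width_hz=200.0):
    """Relative conditioned response (1.0 = full response to the trained tone)."""
    distance = test_hz - trained_hz
    return math.exp(-(distance ** 2) / (2 * width_hz ** 2))

for hz in (4600, 4800, 4900, 5000, 5100, 5200, 5400):
    print(f"{hz} Hz -> {response_strength(hz):.2f}")
```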
The flip side of stimulus generalization is stimulus discrimination, when an
organism responds only to the conditioned stimulus and does not respond to other stimuli.
Look again at Figure 6.4, the stimulus generalization gradient. While it reveals that
organisms generalize to similar stimuli, it also reveals that they fail to generalize to
stimuli distant from the original stimulus. With even greater distance, they stop
responding at all. This reveals that organisms are discriminating among stimuli.
You can probably recall instances in which you have experienced stimulus
discrimination. One that occurs fairly often is when you are walking across campus and someone calls out your name. Let's say your name is Misha. If someone called out "Hey, Misha," you would probably turn around. But you would be less likely to turn around, perhaps hesitating, if you heard "Hey, Lisha" or "Hey, Michele." Although you have been
conditioned to respond when your name is called, you discriminate among stimulus
names as they sound less and less like yours.
A Cognitive Reformulation of Classical Conditioning
Early behaviorists such as Pavlov and Watson believed that the scientific analysis
of behavior required strict avoidance of mentalistic terms that referred to things that
might go on inside the head. Behaviorists banished words and concepts such as beliefs,
desires, expectancies, and goals. Recall the conditioned fear of Little Albert? Watson
wanted to demonstrate that complex emotional responses such as fear could be produced
solely by the application of classical conditioning. Mentalistic concepts, such as those
favored by Freud, were seen as entirely unnecessary.
Furthermore, Watson and other behaviorists advocated an extreme version of
environmentalism. Everything important to know occurred external to the organism.
Watson famously summarized his position in this quote:
"Give me a dozen healthy infants, well-formed, and my own specified world to
bring them up in and I’ll guarantee to take any one at random and train him to become
any type of specialist I might select—doctor, lawyer, artist, merchant-chief and, yes, even
beggar-man and thief, regardless of his talents, penchants, abilities, vocations, and race of
his ancestors" (Watson, 1930, p. 104).
The key point from Watson’s perspective was not to deny that mental processes
occurred inside the heads of organisms. He acknowledged that they did. Rather, he
believed that they were too fuzzy conceptually and hence could not be studied
scientifically. Behavior, he believed, could be understood without any reference to
processes inside the head. This set an important framework that affected the entire field
of psychology for half a century, resting on these assumptions:
Most behavior is learned.
Conditioning is the process by which behavior is learned.
Mental concepts are scientifically fuzzy and do not add to our understanding of behavior.
Behavior at any one point in time can be explained by past conditioning.
Conditioning, therefore, is the most important explanatory concept for psychology.
The exclusion of cognitive concepts in understanding conditioning was challenged by
Robert Rescorla and Allan Wagner in what has become known as the Rescorla-Wagner
model (Rescorla & Wagner, 1972). They argued that dogs did not learn to salivate as a
result of a direct learned connection between the tone and the food. Rather, the dogs
learn an expectation, a mental concept, that the tone reliably predicts the arrival of food.
You might be thinking: How does adding the concept of "expectation," which we can't
observe directly, contribute to a deeper understanding of conditioning? To understand,
keep in mind that forces that cannot be seen directly affect things all the time. No one
has ever seen a black hole in space, for example, but the existence of black holes can be
inferred from their effects on changes in the orbits of objects in their vicinity. Similarly,
you cannot literally see gravity, but you can infer gravity from the effects it has on
objects when they are dropped.
Consider these facts. Whereas Pavlov’s dogs got conditioned to the sound of the
tone, they did not get conditioned to many other things that the food was paired with—
Pavlov’s harness, Pavlov’s experimental room, or even Pavlov himself. Moreover, when
conditioned to the tone, the dog also does more than salivate. It starts wagging its tail,
begging for food, and looking in the direction in which food is most likely to emerge.
According to the Rescorla-Wagner model, the conditioning causes an expectation that
food will predictably follow, and it is that expectation that causes the array of responses
from salivating to tail wagging (see Figure 6.5). It’s not merely the association of two
events such as a tone and food, but rather the predictive value of the tone in the arrival of food.
Insert Figure 6.5 About Here
Here are some findings explained by the Rescorla-Wagner model that cannot be
explained without invoking the mental concept of expectation:
Unfamiliar events such as bells or tones will be easier to condition than familiar
ones, such as the presence of Pavlov, because familiar events already have other
expectations associated with them and do not offer good predictive value.
The conditioned stimulus usually must precede the unconditioned stimulus, not
merely be associated with it. If the food and bell occur at the same time, or the
bell occurs after the food is presented, the bell offers no predictive value about the
arrival of food (Terry, 2006).
Conditioning is nearly impossible, or is blocked, if the animal already has a good
predictor of the arrival of food (Kamin, 1969). If a tone already predicts the
arrival of food, and the experimenter then pairs both a light and the tone with the
arrival of food, the dog will not learn to respond to the light alone. It is as if the
light adds no new predictive information about the arrival of food, so the dog ignores it.
In short, the Rescorla-Wagner model suggests that organisms use information to form
expectations and internal representations that allow them to best predict what will happen
in the world (in the case of dogs, the arrival of food). Mere stimulus-response association
is not enough to explain conditioning phenomena.
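The Rescorla-Wagner model can be stated as a simple updating rule: on each trial, the associative strength (the expectation of the US) of every conditioned stimulus present changes in proportion to the difference between what actually happens and what the animal already expects. The sketch below is one informal way to express that rule in code. The learning-rate parameters and trial counts are illustrative choices rather than values from the chapter, but the qualitative results match the findings just listed: a rapidly rising acquisition curve that levels off, blocking of a redundant cue, and gradual extinction.

```python
# A minimal sketch of the Rescorla-Wagner learning rule (Rescorla & Wagner, 1972).
# Parameter values (alpha, beta, lambda) and trial counts are illustrative only.

def rw_update(V, cues_present, lam, alpha=0.3, beta=1.0):
    """One conditioning trial: update associative strength V for each cue present.

    V            : dict mapping cue name -> current associative strength (the 'expectation')
    cues_present : list of conditioned stimuli presented on this trial
    lam          : 1.0 if the US (food) follows on this trial, 0.0 if it does not
    """
    v_total = sum(V[c] for c in cues_present)      # combined expectation of the US
    for c in cues_present:
        V[c] += alpha * beta * (lam - v_total)     # learn from the prediction error
    return V

V = {"tone": 0.0, "light": 0.0}

# Acquisition: tone repeatedly paired with food -> rapid rise, then a plateau.
for _ in range(30):
    rw_update(V, ["tone"], lam=1.0)
print("tone after acquisition:", round(V["tone"], 2))          # close to 1.0

# Blocking (Kamin, 1969): tone already predicts food, so adding a light teaches little.
for _ in range(30):
    rw_update(V, ["tone", "light"], lam=1.0)
print("light after blocking trials:", round(V["light"], 2))    # near 0.0

# Extinction: tone presented without food -> the expectation gradually declines.
for _ in range(30):
    rw_update(V, ["tone"], lam=0.0)
print("tone after extinction:", round(V["tone"], 2))           # near 0.0
```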
An Evolutionary Perspective on Classical Conditioning
A fundamental assumption of classical conditioning theory is that the
principles of learning are general in two important ways. First, they are assumed to be
general across species. This assumption is what allowed Pavlov to use dogs rather than
humans. Behaviorists assumed that the choice of species on which to conduct the experiments did not matter. This assumption allowed researchers to choose species to study based primarily on convenience. Second, the laws of learning are assumed to be
general across stimuli and responses. The dog could be conditioned to salivate in
response to tones, bells, lights, smells, or anything else. In short, learning theorists
assumed that conditioned stimuli were entirely arbitrary.
An evolutionary perspective leads to the expectation of specificity rather than
generality. To understand why, consider two adaptive problems that most organisms
face—avoiding predators and avoiding eating poisonous substances. An animal that
required many repeated pairings with a predator before learning to avoid it would not survive long.
Nor would an animal that required a long time to learn to avoid eating poisonous
substances. Some adaptive problems are too critical and immediate to be left to extended
bouts of classical conditioning. Survival requires that some learning must occur rapidly.
Garcia’s challenge to the generality assumption. A research team led by John
Garcia was the first to seriously challenge the generality assumption (Garcia & Koelling,
1966). In one set of experiments, Garcia had rats eat something with a novel taste. Many
hours later, he exposed the rats to a dose of radiation which made them sick.
Subsequently, the rats showed a strong aversion to the food containing the novel taste;
they wouldn’t go near the stuff. It only took a single trial of conditioning for the rats to
learn to avoid the novel food. When Garcia tried to condition rats using lights or sounds,
however, the conditioning failed. In another experiment, taste and visual cues were both
paired with subsequent illness (Miller & Domjan, 1981). Nonetheless, the rats only
developed an aversion to the taste cue, not to the visual cue.
Adaptive learning specializations. Rats seem to come into the world with
adaptive specializations to learn some things easily, such as avoiding foods linked with
subsequent nausea. But rats find it extraordinarily difficult to learn other things, such as
avoiding buzzers and lights that are linked with subsequent nausea. Since the
experiments were conducted with adult rats, perhaps they had already learned the food-
illness link. To rule out this hypothesis, another group of psychologists conducted a
parallel conditioning experiment on infant rats one day after their birth (Gemberling &
Domjan, 1982). Despite their young age, the same effect was found—single trial
learning caused the infant rats to avoid tastes paired with subsequent nausea.
Although these findings may seem obvious in retrospect, at the time they were
bitterly disputed. In fact, editors of the major psychology journals consistently rejected
Garcia’s papers because they went against the dominant assumptions of behaviorism,
which were widely believed to be "laws." Only after Garcia replicated his findings many
times and similar findings began to emerge from other laboratories did the journals
grudgingly accept his papers for publication.
These and other findings contradicted the basic assumptions of classical
conditioning in several ways. First, they challenged the second generality assumption,
since not all stimuli are equally capable of being conditioned. Second, the fact that the
food aversion learning occurred after such a long delay violated the assumption that the
pairings had to occur in close temporal proximity. Learning in the real world depends on
pre-existing associations between events, such as between eating and nausea. Being able to predict events of critical adaptive significance enables organisms to cope with those events more successfully (Domjan, 2005).
A specialized food-aversion learning mechanism makes excellent evolutionary
sense for both rats and humans. Rats and people are both omnivores, surviving by eating
a widely varied diet. It’s critical for them to avoid eating foods that make them sick.
Illness usually occurs after a long delay following the taste. You've probably experienced this yourself. In my case, I used to have a great fondness for Japanese sushi. Several hours after feasting on sushi one day, however, I became ill
and vomited. This single event created a strong aversion to sushi, and I haven’t eaten it
since. As omnivores, modern humans owe their existence to ancestors who evolved
adaptive specializations to learn to avoid foods that could be hazardous to their health.
Preparedness. The proposition that organisms might come into this world
"prepared" by evolution to learn some things and not others was picked up by Martin
Seligman (Seligman, 1970). Preparedness is an organism's ability to learn some kinds of associations more rapidly and easily than others. Seligman and his colleagues showed that it was quite easy to
"condition" people to develop certain types of fears--fears of snakes, heights, and spiders,
for example--but extremely difficult to condition people to develop other sorts of fears
such as fears of flowers, electrical outlets, or even guns (Cook et al., 1986; Seligman &
Hager, 1972). The key point is that some cues historically have been natural precursors of
hazards to survival. The sight of a spider or the sound of a rattlesnake are natural
precursors of a biting attack. Organisms that learned rapidly to avoid these dangers had a
great survival advantage.
Sexual conditioning. You may recall from Chapter 2 (Evolutionary Foundations)
that survival is not enough in the game of evolution. Organisms can survive for decades,
but if they fail to mate successfully, their genes will not be passed on. Consequently, it is
reasonable to expect that organisms have evolved adaptive specializations not just around
events hazardous to their survival. They should also have specialized learning mechanisms for events that signal sexual success.
To test this idea, Michael Domjan and his colleagues studied sexual conditioning
in Japanese quail (Domjan et al., 2004). Japanese quail are ground-dwelling birds that
live in grassy areas. A male quail typically detects a female by seeing part of her body, usually her head and neck, sticking up through the grass. If a male approaches her after
seeing these cues, he has a chance to copulate with her. To study sexual conditioning of
the quail, Domjan constructed an artificial head that resembled a female quail (see Photo
When males are merely exposed to the artificial female head, they show modest
approach behavior—a modest unconditioned response (approach) to a naturalistic-
looking conditioned stimulus (female head) (see Figure 6.6 of experimental apparatus).
Then Domjan allowed some male quail to copulate with an actual female quail after
seeing the artificial female head. The conditioning proved to be intense. On
subsequent trials, the males vigorously approached the artificial female head and often
attempted to grab it and copulate (Cusato & Domjan, 1998). Even if there was a long
time interval between seeing the female head and obtaining an actual copulation,
conditioning proved extremely strong. In contrast, when parallel conditioning trials were
performed with an arbitrary CS, an object of the same size and shape, but lacking the
female head cues, conditioning proved to be far weaker (see Figure 6.7).
Insert Figures 6.6 and 6.7 of Domjan’s Experimental Apparatus and Findings
Recall the concept of extinction discussed earlier—the process by which a conditioned response gradually diminishes in magnitude when the US is no longer paired
with the CS. It turns out that when the Japanese quail are conditioned with the
naturalistic stimulus of the female head, and then experience many trials in which they do
not copulate with a real female quail, they fail to show extinction (Krause et al., 2003).
Male quail, in short, have an adaptive specialization that is designed to condition rapidly
and strongly to naturalistic cues that are typically linked in the wild with sexual access to
females. And conditioning to naturalistic cues shows strong resistance to extinction.
Interestingly, all of these effects were obtained with male quail, and not found in female
quail, suggesting that adaptive specializations differ for males and females.
Science has strange twists and turns on its way to discovery. In this case, Michael
Domjan was a recipient of the "Golden Fleece Award." This was an award created by Senator William Proxmire in 1975 and given to a researcher who had obtained federal grant funding for research he deemed to be the greatest waste of taxpayers' money. Upon giving Domjan this award, Senator Proxmire said: "Let the Japanese study their own quail!" Despite receiving this dubious award, the history of science has vindicated
Domjan’s work, which has proved pivotal in reformulating our understanding of one of
the most important and basic processes of learning.
An adaptationist perspective on classical conditioning. All of these discoveries
by Garcia, Seligman, Domjan, and others proved extremely important in revising our
understanding of how organisms learn through classical conditioning. Psychologists have
implicitly assumed that learning mechanisms such as classical conditioning evolved and
are maintained because they provided organisms with an adaptive advantage. The full
implications of this view, however, have only recently become appreciated:
Organisms have adaptive specializations that cause them to condition to some
naturalistic stimuli rapidly and strongly—conditioning that is highly resistant to
extinction. As Domjan concludes, "learning with ecologically relevant stimuli and responses will be more robust than learning with arbitrary cues" (Domjan, in
press, p. 31).
Adaptive specializations exist for stimuli that have important consequences for
survival (e.g., conditioning to snakes) and reproduction (e.g., sexual conditioning).
Different species have different adaptive specializations; Japanese quail can be
easily conditioned to an artificial head of a female quail, whereas rat, dog, and
even human males presumably cannot.
Males and females within the same species may have somewhat different adaptive specializations.
The principles of learning are not as general as early learning theorists believed;
they depend on the particular species (dog, rat, quail, human) and on the particular
stimuli used (artificial versus naturalistic).
Classical conditioning, discovered by Ivan Pavlov, is a basic learning process by
which an initially neutral stimulus (such as a tone) evokes a response (such as salivation)
after being paired with a stimulus that already evoked the response (such as food). The key terms of classical conditioning include: unconditioned stimulus (US), which
naturally evokes the response; unconditioned response (UR), the response naturally
elicited by the US; conditioned stimulus (CS), the initially neutral stimulus that does not
evoke a response until it is paired with the unconditioned stimulus; and conditioned
response (CR), a reaction produced by the CS that mimics the UR.
There are six major principles of classical conditioning—acquisition, extinction,
reacquisition, spontaneous recovery, stimulus generalization, and stimulus discrimination.
Two major perspectives have led to a reformulation of classical conditioning.
The first is the cognitive reformulation. Whereas early behaviorists avoided using
mentalistic concepts, cognitive psychologists such as Rescorla and Wagner showed that
they are necessary. Rescorla and Wagner showed that dogs learn expectation (a mental
concept) about what reliably predicts food. They showed that classical conditioning
occurs not merely because of an association of two events such as tones and food, but
rather because of the predictive value of the tone in the arrival of food. Cognitive
psychologists showed that mental concepts such as expectations are necessary to
understand the process of classical conditioning.
The evolutionary perspective led to the second reformulation of classical
conditioning. It showed that the laws of learning are not as general across species or
across stimuli and responses as early behaviorists assumed. John Garcia’s work showed
that you can condition rats quickly to learn taste aversions, but not light aversions. This
paved the way for the evolutionary concept of preparedness—organisms come into the
world pre-programmed to learn some things easily and rapidly, and other things only with
many trials and with great difficulty. Stated differently, organisms have adaptive
learning specializations, particularly in domains that strongly affect survival (e.g., eating)
and reproduction (e.g., sex). These adaptive specializations differ across domains (e.g.,
eating, sex), across species (e.g., rats differ from Japanese quail), and across the sexes (males
differ from females). The overall conclusion is that the principles of learning are not as
general as early learning theorists believed.
Can you think of a time in your life when you became nauseous after
eating something, and learned to avoid it thereafter?
Can you recall a time when you, a friend, or a family member trained a
dog using the principles of classical conditioning?
Why do you think adaptive learning specializations are especially
concentrated in the domains that strongly affect survival and reproduction?
OPERANT CONDITIONING
1. Define operant conditioning.
2. Define the law of effect.
3. Distinguish between reinforcement and punishment.
4. Distinguish between primary and secondary reinforcers.
5. Define operant generalization and operant discrimination.
6. Identify the four major schedules of reinforcement.
7. Describe how research on cognition and the brain has deepened our
understanding of operant conditioning.
8. How does an evolutionary perspective deepen understanding of operant conditioning?
Learning through classical conditioning typically occurs beyond the voluntary
control of the organism. A dog conditioned to salivate to a bell learns that a bell signals
the onset of food. A male quail conditioned to initiate sexual contact following a visual
cue of a female quail learns the cues that predict sex. Classical conditioning involves
learning about the predictive cues in the environment.
Another type of learning is entirely different, and hinges on the consequences that
follow an organism’s actions. Operant conditioning is learning that occurs when the
consequences of an organism’s action influence the probability that the organism will
repeat that action in the future.
Let’s consider a few examples of operant conditioning. If a boy cleans up his
room and this action is followed by praise from his parents, it increases the likelihood
that the boy will clean his room in the future. When a woman who is romantically
interested in a man smiles at him when he looks at her, the consequence (her smile) increases the likelihood that the man will look at her again. When a gatherer in a hunter-
gatherer culture goes to a certain area of the woods in search of berries, and this is
followed by successfully finding berries, the consequence increases the likelihood that
the gatherer will go to that same spot in the future. These are all examples of operant
conditioning (also called instrumental conditioning), when the consequences of a
behavior determine the likelihood of the behavior being repeated in the future.
Operant conditioning has been called selection by consequences, since the effects
of behaviors determine or ―select‖ which behaviors will be repeated in the future
(Skinner, 1981). Selection by consequences is a fascinating type of causal influence.
Usually when we think of causal influences, we think of mechanical causality, as when
one billiard ball strikes another and causes the struck ball to roll. Selection by
consequences explains the existence of a pattern of behavior by the history of
consequences of that behavior.
You may notice a fascinating parallel to natural selection, as described in Chapter
2 (Evolutionary Foundations). Natural selection explains the existence of an adaptation
by understanding that genes that produce behavior with the consequence of increasing an
organism’s reproductive success will increase in frequency over time. Natural selection
is selection by consequences over evolutionary time. Operant conditioning is selection
by consequences that occurs within an organism’s lifetime. Both natural selection and
operant conditioning are forms of selection by consequences.
Thorndike and the Law of Effect
Around the time that Pavlov was conducting his studies of classical conditioning
in Russia, another important development in learning occurred in the United States. A
graduate student of William James named Edward Thorndike (1874-1949) began a series
of studies in the basement of the psychology building at Harvard University. Thorndike
collected cats from the neighborhood, and placed them in a puzzle box that he had
devised. The puzzle box contained strings, a tilting pole, and a lever (see Photo 6.4).
The cats could escape the puzzle box only through one response, such as tilting the pole.
If the cat successfully escaped, it was given a small amount of food, and then put back in
the box. Through repeated trials, Thorndike was able to keep track of how the cats
learned to escape and obtain the morsel of food over time.
Insert Photo 6.4 and Figure 6.8 About Here
The cats typically tried many things, such as pulling on the string and pressing the
lever. Usually these responses failed; sometimes they succeeded—hence, the name
given to this phenomenon, trial and error learning. Gradually, the cats learned which
response led to successful escape and a morsel of food. The learning was strictly defined
as the amount of time required to escape from the box. Initially, it took the cats a long
time. Over time, however, they escaped faster and faster (see Figure 6.8). Thorndike
coined a term for this type of learning—the law of effect: Responses to stimuli that are
followed by a satisfying state of affairs to the organism are more likely to occur in the
future; responses to stimuli followed by unpleasant states of affairs are less likely to be
repeated in the future.
You may observe this type of learning in your own life. If you lose your keys, for
example, you may try looking in a variety of places. But if you find your keys in one
location, such as on a particular chair or in a particular drawer (satisfying state of affairs),
you are more likely to look for your keys in those spots in the future. Your key searching
behavior has been governed by the law of effect.
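For readers who like to see the idea in concrete form, here is a minimal, purely illustrative sketch of the law of effect as trial-and-error learning. The responses, weights, and numbers are hypothetical stand-ins for a cat in Thorndike's puzzle box, not anything reported in the chapter: the one response that opens the box is strengthened each time it is followed by food, so over many trials it comes to dominate, which is why escape times fall.

```python
import random

# Illustrative sketch of the law of effect: responses followed by a
# "satisfying state of affairs" become more likely; others do not.
# The response names and weights are hypothetical, for illustration only.

responses = {"pull string": 1.0, "press lever": 1.0, "tilt pole": 1.0}

def choose(weights):
    """Pick a response with probability proportional to its current weight."""
    total = sum(weights.values())
    r = random.uniform(0, total)
    for response, w in weights.items():
        r -= w
        if r <= 0:
            return response
    return response

for trial in range(200):
    action = choose(responses)
    if action == "tilt pole":        # only this response opens the box, so food follows
        responses[action] += 0.5     # the satisfying consequence strengthens the response

# Over many trials "tilt pole" comes to dominate, so escape gets faster.
print(responses)
```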
Reinforcement and Punishment
In the 1930s, another graduate student at Harvard, B.F. Skinner (1904-1990),
began studies of learning that dramatically changed the landscape of the field of
psychology (insert photo of B.F. Skinner). Rather than cats, Skinner studied rats, and
eventually shifted to pigeons for reasons of convenience (recall the behaviorist
assumption that the laws of learning were general across species, so it did not really
matter which species one studied). Skinner was a behaviorist in the strictest sense. He
avoided all mentalistic concepts. And he defined his terms strictly in terms of behavior
and the effects of environmental reinforcers on behavior (Skinner, 1938).
Insert Photo 6.5 of B.F. Skinner about here
Like Thorndike, Skinner noticed that organisms do not simply respond to events
in the world, like Pavlov’s dogs. They often perform actions that have effects on the
world around them. These actions are operant responses, behaviors that produce an
effect and are capable of being modified by the consequences they produce. Operant
responses are sometimes called instrumental responses, because they act like instruments
or tools that cause effects on the world.
The key invention by B.F. Skinner to study operant responses is a device now
known as a "Skinner box." This is a stripped-down cage with practically nothing in it but
a lever which, when pressed, delivers a pellet or drop of water (see Photo 6.6 and 6.7 of
Skinner Box). The Skinner box provided a considerable improvement in efficiency over
Thorndike's puzzle box. Rather than the experimenter having to keep putting the animal back in the box, the animal remains in the box and the experiment continues uninterrupted. Another advantage of the Skinner box was that the lever provided a precise dependent variable. The experimenter could simply count how often the animal pressed the lever.
Skinner kept the pellets of food and drops of water small intentionally in order to keep
the animals hungry and thirsty. This enabled him to study the animals over many trials.
Insert Photo of Skinner Box 6.6 and Cartoon Photo 6.7 of Skinner Box
Skinner sought to explore the laws of operant conditioning, or selection by
consequences. He started with two major types of consequences. The first type was
reinforcement, defined as a consequence following an operant response that increases
the likelihood of the operant response occurring in the future. Notice that reinforcement
is defined in strictly behavioral terms. Food is a reinforcer for most organisms, so
Skinner typically used food. The second type of consequence was punishment, defined
as a consequence following an operant response that decreases the likelihood that it will
occur in the future. If a rat presses a lever, and the consequence is a painful electric
shock to its feet, this punishment decreases the rat’s lever-pressing in the future. Like
reinforcement, punishment is defined in strictly behavioral terms—the effects something
has on the likelihood of subsequent behavior.
It may occur to you that what is reinforcing or punishing might differ from person
to person or from species to species. Hamburgers, for example, are reinforcing to meat
eaters, but punishing to vegetarians. The smell of cow dung is reinforcing to dung
beetles, who use it to create their nests, but it is punishing to most humans, who would
rather escape its acrid fumes. Skinner avoided these complexities by defining
reinforcement and punishment solely in terms of their effects on subsequent behavior. As
we will see later, however, these complexities turn out to play an important role in
understanding the process of learning.
Skinner added another important distinction—whether a stimulus is presented (added) following a behavior or whether it is removed (subtracted). Reinforcement by presenting a stimulus, such as giving a pellet of food, is called positive reinforcement because something is being added. Reinforcement by removing a stimulus, such as turning off a painful electric shock when the animal makes the desired response, is called negative reinforcement because something is being subtracted. Similarly, punishment by presenting a stimulus, such as delivering a painful electric shock, is called positive punishment because something is being added. Punishment by removing a stimulus, such as taking away a pellet, is called negative punishment because something the animal wants is being taken away.
It is easy to get confused with this terminology because it goes against everyday
intuition. Negative reinforcement and punishment sound like they should mean the same
thing, but they don’t. One way to keep these terms straight is by keeping in mind these
two key definitions: (1) Reinforcement always increases the likelihood of behavior,
whereas punishment always decreases the likelihood of behavior; (2) when stimuli are
presented they are called "positive," and when they are removed they are called "negative" (see Table 6.1). Positive and negative do not mean "good" or "bad," but
rather adding or subtracting.
Insert Table 6.1 About Here
Can you think of an example of each of these four categories in your everyday life?
I’ve used all four categories with my dog Dexter. When he’s acting well-behaved,
I give him a special dog treat, a positive reinforcer that increases his good behavior.
When he does something I don’t like, such as jump up on people who come to the door, I
say "bad dog," a positive punishment that decreases his bad behavior. When Dexter
shreds his toy bone, making a mess all over my living room, I take it away from him, a
negative punishment designed to decrease his mess-making behavior. And when Dexter jumps up on a guest and I scold him, stopping the scolding the moment he sits calmly is a negative reinforcer, because removing something unpleasant increases the likelihood that he will sit calmly in the future.
Major Principles of Operant Conditioning
Skinner and his disciples conducted many experiments in order to identify the
underlying principles of operant conditioning. This section describes these major principles.
Primary and Secondary Reinforcers
Primary reinforcers are events that innately increase the likelihood of a response.
Examples of primary reinforcers are food for the hungry animal, water for a thirsty
animal, or a sexually receptive mate for a sexually aroused animal. These are called
primary reinforcers because they are basic evolved adaptations that solved a fundamental
problem of survival or reproduction. Similarly, primary punishers are events that
innately decrease the likelihood of a response. Examples of primary punishers include
the varieties of physical pain we experience when exposed to extreme heat (e.g., putting
your hand on a hot stove) or when our body envelope is pierced (e.g., puncture wound
from stepping on a sharp stone while walking barefoot). As we will see later, one of the
key debates in learning theory is how many primary reinforcers and punishers exist.
Secondary reinforcers are stimuli that are originally neutral, but acquire the power
to reinforce behavior by being paired with a primary reinforcer. Money is a prime
example. I recall a three-year-old child from my neighborhood who found a silver quarter on the floor. He started out being indifferent to the quarter. But then someone showed him that when he put the quarter into a gum ball machine, a large, sweet-tasting gum ball came out, which he then started to savor. From then on, he started
to value money, which had become a secondary reinforcer as a consequence of being
paired with the primary reinforcer (calories). As the boy grew up, money became such a
powerful reinforcer that he began to acquire it and hoard it with great enthusiasm.
Some learning theorists assume that most of our behavior is controlled by
secondary reinforcers (Skinner, 1981). That is, they assume that the list of primary
reinforcers is small in number, and that much learning consists of acquiring secondary
reinforcers. Other learning theorists believe that the list of primary reinforcers is longer: "The unspoken behaviorist premise of a short list has, until recently, sheltered behaviorists from the profound motivational complexity of animal behavior" (Herrnstein,
1977, p. 598). As we will learn in subsequent chapters (Chapter 12: Motivation and
Emotion; Chapter 14: Social), the field of psychology has discovered a much longer list
of what behaviorists call primary reinforcers, such as belonging to a social group, gazing
at attractive individuals, nurturing children, and even showing altruism toward kin.
Operant Generalization and Operant Discrimination
Recall from the discussion of classical conditioning the two related phenomena of
generalization and discrimination. A dog classically conditioned to salivate to a tone of
5000 Hz will also salivate to slightly different tones, such as 4900 Hz or 5100 Hz—a
generalization phenomenon. On the flip side of the coin, as the tone becomes more and
more dissimilar from the original tone on which the dog was conditioned, it will
increasingly fail to salivate—discrimination.
Similar processes occur in operant conditioning. Operant generalization refers
to the performance of a learned response to a new stimulus that is similar to the original stimulus. This
happened to me when I was a child. Once I put money into a candy machine to get a
pack of gum. The gum came out, as it usually does. For some reason, though, I pushed
the button again, even though I had put no new coins in the machine. Astonishingly,
another pack of gum came out—a highly reinforcing event to a kid. For the next several
months, every time I got candy from any candy machine, I would push the button two or
three times instead of just once. My operant behavior (pushing the button) had
generalized to other candy machines.
Conversely, operant discrimination refers to differential responding to two
different stimuli. In my case, when getting cokes or other beverages from dispensing
machines, I did not push the button twice. I discriminated between candy machines and
coke machines. It is easy to train rats in operant discrimination. All you have to do is reinforce the rat with a pellet when it presses a lever in response to one tone, and not reinforce its lever pressing in response to a different tone. The tone to which the rat responds is called a discriminative stimulus, defined as a stimulus that is associated
with reinforcement and controls a behavior (such as bar pressing) such that the organism
does not respond to closely related stimuli. In my case, the candy machine became a
discriminative stimulus, controlling my double button-pushing behavior, but in a manner
such that I did not respond to a similar stimulus (the coke machine) [insert Photo 6.8 of coke and candy machines].
Insert Photos 6.8 About Here: Candy and Coke Machines
Research has discovered that an effective way to create a discriminative stimulus
is to increase the rate of differential reinforcement (Andrzejewski et al., 2006). If you
reinforce a rat for pressing a bar in response to one tone, and then wait 10 minutes to
present a different tone for which the rat does not get reinforcement, the rat takes a long
time to learn a discriminative stimulus. If you increase the reinforcement rate by
shortening the time interval between trials to one minute, the rats learn the discriminative
stimulus a lot faster. In my case, once I got the second pack of gum from double pressing
the candy machine button, I immediately tried the same thing on the coke machine, and it
didn’t work. If I had waited several days to try to get two cans of coke for the price of
one, it would have taken much longer for the candy machine to become a discriminative
The key point is that much operant learning in real life requires both processes of
operant generalization and operant discrimination. These learning mechanisms help to
guide us toward the class of stimuli for which we will obtain reinforcement for our
actions (generalization), as well as the specific stimuli from which we will be most
heavily reinforced for our actions (discrimination).
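For readers who want to see the mechanism spelled out, the following is a minimal sketch in Python of discrimination training, not the actual procedure used in the studies cited above; the tones, starting values, learning rate, and update rule are invented purely for illustration.

```python
# A toy model of discrimination training: presses during the S+ tone are
# reinforced, presses during the S- tone are not, so response strength to
# the two tones gradually diverges. All numbers are illustrative.
import random

def train_discrimination(n_trials, learning_rate=0.1):
    strength = {"S+": 0.5, "S-": 0.5}       # initial tendency to press to each tone
    for _ in range(n_trials):
        tone = random.choice(["S+", "S-"])
        reinforced = (tone == "S+")          # pellet only for presses during the S+ tone
        target = 1.0 if reinforced else 0.0
        strength[tone] += learning_rate * (target - strength[tone])
    return strength

# Packing more trials into the same session (a shorter interval between trials)
# raises the rate of differential reinforcement, so the two strengths separate sooner.
print(train_discrimination(n_trials=20))     # early in training
print(train_discrimination(n_trials=200))    # after many closely spaced trials
```

In the sketch, responding to the reinforced tone climbs toward 1 while responding to the unreinforced tone sinks toward 0; that divergence is what makes the first tone a discriminative stimulus.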
What happens to an operantly conditioned response that is no longer reinforced?
Over time, the response undergoes extinction, a decline in response rate in the absence of
reinforcement. To take my double-button pushing behavior as an example, I gradually
stopped performing it since I never again received the reinforcement of getting two for
the price of one. Extinction of operantly conditioned responses happens all the time.
Rats who no longer receive pellets for pressing a bar after a tone gradually stop pressing
the bar. After making the mistake of letting my dog Dexter into the house through the
bedroom door once, I stopped doing it. Over time, he stopped barking and clawing at the
bedroom door. If you’ve ever experienced a romantic relationship that starts to sour, and
your partner no longer smiles back when you smile, your smiling behavior will gradually
get extinguished to this particular person. Although you can consider extinction as the
unlearning of learned operant behavior, there is evidence that some behaviors are never
completely unlearned. Often, the passage of time following extinction will lead to
spontaneous recovery. As an adult, I still occasionally double-push the candy machine
buttons. Moreover, after a response has experienced extinction, a single episode of
reinforcement is often enough to bring the behavior back full blown, under the control of
the original discriminative stimulus.
Schedules of Reinforcement
In everyday life, organisms are not reinforced every time they perform a
particular behavior. The cheetah that preys on gazelle for its dinner sometimes succeeds,
but sometimes goes hungry. A man or woman using an on-line dating service may
respond to many ads before getting reinforced by a positive response. Studying hard for
an exam may secure the grade of "A" sometimes, but not always. In everyday life,
reinforcements are not continuous. Most are intermittent reinforcements, in which
reinforcement follows only some of the responses emitted. Schedules of reinforcement
are well-defined rules for delivering reinforcers to an organism (Staddon & Cerutti, 2003).
They come in two major classes—interval schedules of reinforcement and ratio schedules
of reinforcement.
Interval Schedules of Reinforcement. When the reinforcement schedule is
determined by the time between each reinforcement, it is called an interval schedule of
reinforcement. There are two types of interval schedules of reinforcement. The first is a
fixed interval schedule (FI), when a reinforcement is received only after a fixed period
of time has elapsed (assuming that the appropriate response has been produced). If a rat
receives a pellet for bar pressing only after two minutes have elapsed, the rat is
experiencing a fixed interval schedule of reinforcement. The fixed interval, of course,
could be every two, four, twenty, or a hundred minutes. The key to FI schedules is that
a pre-determined (fixed) length of time must elapse before a behavior is reinforced. I
knew a student who put herself on fixed interval schedules of reinforcement to help her
study for exams. She would give herself a reward such as a cookie or a cup of tea only
after she had studied for 90 minutes.
The second type of interval schedule is called a variable interval schedule,
which occurs when the time of reinforcement varies from one episode to the next, but
averages out to a particular value. A woman might reward herself with a cookie after
studying for varying lengths of time, from one hour to two hours, but if those intervals
average out to every 90 minutes, she has put herself on a variable interval schedule of reinforcement.
Variable interval schedules tend to be more powerful than fixed interval schedules
because an organism never knows exactly when it will receive the reinforcement, so it
must keep responding. In contrast, with a fixed interval schedule, performance tends to
drop off immediately after receiving a reinforcement (see Figure 6.9).
Insert Figure 6.9 About Here
Ratio Schedules of Reinforcement. Whereas interval schedules of reinforcement
are based on the amount of time elapsed, ratio schedules of reinforcement are based on
number of responses emitted. A fixed ratio schedule of reinforcement is constant, with
a pellet (or other reinforcer) delivered after a specific number of responses have been
emitted. If a rat gets a pellet only after it presses the bar 30 times, then it is on a fixed
ratio schedule of reinforcement. Note that time is irrelevant. The critical ingredient is
the number of responses emitted. It is called a ―ratio‖ schedule of reinforcement because
the key variable is the ratio of the number of responses relative to the number of
reinforcers. In the above example, the ratio is 30:1, or 30 bar presses for every reinforcer.
A variable ratio schedule of reinforcement exists when reinforcement is
delivered after a particular average number of responses has been emitted. The rat might
receive a pellet on average after 30 bar presses, but the actual reinforcement might come
after 15 or after 45. The key to understanding variable ratio schedules is that reinforcers
are tied to an average number of responses, with the actual number varying around that
average. Time interval, which is the key to interval schedules, is irrelevant when
considering ratio schedules.
As you may notice from Figure 6.9, ratio schedules of reinforcement tend to be
far more powerful than interval schedules of reinforcement. Indeed, the most powerful
schedule yet discovered is the variable ratio schedule of reinforcement. This is the
schedule exploited by many gambling casinos. Casinos often set a slot machine to pay
out, on average, every 80 or 120 pulls of the "one-armed bandit." [insert photo 6.9 of one-
armed bandit in casino] One player may get lucky, though, and hit the jackpot on the first
pull. Another player may be unlucky, and spend hours pulling the lever without hitting
the jackpot. The variable ratio schedule is so powerful because the person knows that it
will pay off at some point, but never knows when. The variable ratio schedule of
reinforcement is likely to be the main cause of gambling addiction.
Insert Photo 6.9 of One-Armed Bandit About Here
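Because schedules of reinforcement are simply rules for when a reinforcer is delivered, they can be stated precisely. The sketch below, a rough Python illustration rather than any standard implementation, expresses each of the four schedules as a rule consulted on every bar press; the particular numbers (two minutes, 30 presses) echo the examples in the text, and everything else is an assumption made for illustration.

```python
# Each function returns a rule answering "deliver a reinforcer for this press?"
# Interval rules care about elapsed time; ratio rules care only about response counts.
import random

def fixed_interval(seconds=120):
    state = {"last": 0.0}
    def rule(now, presses):
        if now - state["last"] >= seconds:          # first press after the interval pays off
            state["last"] = now
            return True
        return False
    return rule

def variable_interval(mean_seconds=120):
    state = {"last": 0.0, "wait": random.uniform(0.5 * mean_seconds, 1.5 * mean_seconds)}
    def rule(now, presses):
        if now - state["last"] >= state["wait"]:
            state["last"] = now
            state["wait"] = random.uniform(0.5 * mean_seconds, 1.5 * mean_seconds)
            return True                              # next wait is unpredictable
        return False
    return rule

def fixed_ratio(presses_required=30):
    def rule(now, presses):
        return presses % presses_required == 0       # every 30th press pays off
    return rule

def variable_ratio(mean_presses=30):
    def rule(now, presses):
        return random.random() < 1 / mean_presses    # on average one payoff per 30 presses
    return rule
```

Written this way, the power of the variable ratio schedule is visible in the code itself: every single press has some chance of paying off, so there is never a safe moment to stop responding, which is exactly the contingency a slot machine implements.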
In everyday life, people do not perform full-blown behavior patterns
spontaneously. Instead, they start out with halting steps, and through a process of
successive approximations, eventually come to perform the behavior sequence. Consider
our lone rat in a Skinner box. An experimenter might starve before the rat presses the bar
spontaneously. If the experimenter gives the rat a pellet simply for moving in the
direction of the bar, though, this begins the process of shaping. Then the next pellet does
not come until the rat approaches the bar; and then touches the bar; and then finally
presses the bar. More formally, shaping is learning produced by reinforcing successive
approximations to the final desired behavior.
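The logic of shaping can be captured in a few lines of Python. The stages and probabilities below are invented for illustration and do not model any particular experiment; the point is only that the criterion for earning a pellet starts loose and is tightened each time the animal meets it.

```python
# Shaping by successive approximations: reinforce the current approximation,
# then raise the criterion until only the full bar press earns a pellet.
import random

STAGES = ["moves toward bar", "approaches bar", "touches bar", "presses bar"]

def shape(chance_of_meeting_criterion=0.3, max_trials=500):
    stage = 0                                     # index of the behavior currently reinforced
    for trial in range(1, max_trials + 1):
        if random.random() < chance_of_meeting_criterion:
            print(f"Trial {trial}: pellet delivered for '{STAGES[stage]}'")
            if stage == len(STAGES) - 1:
                return trial                      # the full bar press has been acquired
            stage += 1                            # tighten the criterion
    return None

shape()
```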
Shaping was once used by a psychology class to manipulate the behavior of a
professor. The students conspired to smile and nod enthusiastically whenever the
professor approached one corner of the stage, but to frown whenever the professor moved
out of that corner of the stage. Gradually, through successive approximations, the
students shaped the professor so that he was pinned to one corner of the stage. It was
good for a laugh, but demonstrated the power of shaping by reinforcing successive
approximations of a final desired behavioral outcome (insert photo 6.10 about here).
Insert Photo 6.10 of Professor pinned to one corner of stage
Shaping through successive approximations is the learning process by which
animal trainers get dolphins to dance at Sea World and bears to ride bicycles at circuses.
Dancing and bicycle riding are complex behaviors that cannot be learned all at once.
Through the gradual process of shaping, though, trainers can get animals to accomplish
feats of astonishing complexity [insert photo 6.11 of bear riding a bicycle].
Photo 6.11: Bear Riding a Bicycle
The above discussion of shaping and schedules of reinforcement carries the
common theme that there is a more or less predictable pattern to reinforcement. In
interval schedules, reinforcement predictability is pegged to time. In ratio schedules,
reinforcement predictability is linked to the number of responses. In shaping, reinforcement
is linked to successive approximations toward a final behavioral performance.
Sometimes, however, reinforcement occurs by chance alone. And when it occurs by
chance alone after a behavior has been emitted, it can result in superstition.
A few concrete examples will illustrate the origins of superstitious behavior
through chance reinforcement. Tennis players are notorious for being superstitious. The
former tennis great Bjorn Borg refused to shave during major tennis tournaments,
because one time early in his career, he did not shave and happened to win the
tournament—a superstition reinforced by chance. Tennis great Roger Federer has a
predictable shake of his head before serving, presumably to get the hair out of his eyes;
yet his hair is never in his eyes. This superstitious behavior presumably arose because
one time he served a particularly important ace right after shaking his head. When I was
younger, and asked to pick a number between one and 10, I picked the number 7 and it
turned out to be correct. Ever since, 7 has been my "lucky number." Most of us carry around a
handful of superstitions, which likely arose because we were accidentally reinforced when
performing a particular action.
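A tiny simulation, offered only as a toy illustration with invented numbers, makes the point that accidental reinforcement is enough: the reward below arrives completely at random, yet whichever response happens to precede it is strengthened, and one arbitrary ritual tends to pull ahead.

```python
# Superstition from chance reinforcement: the reward is independent of behavior,
# but it still strengthens whatever response happened to come just before it.
import random

strength = {"shake head": 1.0, "bounce ball": 1.0, "adjust strings": 1.0}
for _ in range(1000):
    # The more a response has been "reinforced," the more often it is emitted.
    action = random.choices(list(strength), weights=list(strength.values()))[0]
    if random.random() < 0.05:         # reward delivered by chance, regardless of the action
        strength[action] += 1.0        # yet it reinforces the action that preceded it
print(strength)                        # one arbitrary ritual usually ends up strongest
```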
Operant Conditioning Inside the Head—Cognition and the Brain
Early learning theorists such as B.F. Skinner avoided postulating mentalistic
concepts, or what are now called cognitions—information processing procedures that
occur inside the heads of organisms. They believed that learning through operant
conditioning occurred mechanistically through the process of pairing a behavioral response
with a reinforcing consequence. While recognizing the importance of Skinner’s
pioneering experiments on operant conditioning, psychologists since Skinner have
increasingly focused on processes that occur "inside the heads" of organisms. Two major
directions capture this shift. One is the emphasis on cognition (Kirsch et al., 2004). The
second is the exploration of the brain mechanisms that underlie learning.
Cognition and Learning
The first to advocate a cognitive approach to operant conditioning was the
University of California at Berkeley psychologist Edward Chace Tolman (1886-1959),
after whom the current psychology building at Berkeley is named (Tolman Hall).
Tolman believed that it was necessary to invoke cognitive concepts in order to
understand operant conditioning. He proposed that animals formed "beliefs" that specific
actions would lead to specific end states—the calculation of a means-ends relationship.
According to this view, a stimulus does not directly produce a response. Rather,
there is a cognitive procedure that occurs in between a stimulus and a response. One of
the most powerful demonstrations of this cognitive view involved the postulation that rats
(and people), when learning how to run through mazes, develop cognitive maps—mental
representations of the physical environment (Tolman & Honzik, 1930; Tolman et al.,
In one classic experiment, Tolman trained rats using an apparatus shown in Figure
6.10a. In order to get to the goal (a piece of food), the rat had to go straight down the
maze; then turn left; then right; and then right again. The rats in the maze, of course,
cannot see the overall design like we can, viewing it from the top. Rats learn to navigate
this maze quickly, typically taking only four successive nights of training.
Insert Figures 6.10a and 6.10b About Here
After rats learned this maze, Tolman put them in a different maze, shown in
Figure 6.10b. In this maze, the rat’s movement straight down the first passage was
blocked. They could not get to the food in the same way in which they had learned so
well. What would the rats do? A traditional behaviorist such as Skinner would have to
predict that the rats would show stimulus generalization, and pick the path closest to the
one that formerly led to success.
The rats had a different idea. Rather than choosing the closest path, they typically
went down the path that led directly to the goal, despite having no prior experience with
this new maze. This led Tolman to conclude that the rats had formed a cognitive map. And
when their physical environment (the maze) changed, they used their cognitive map to
reach the goal successfully and efficiently. The rats acted as though their cognitive maps
allowed them to take a "short cut" to the goal, rather than mechanically following the
path closest to the one they had previously learned.
In short, at least some forms of operant conditioning cannot be understood
without invoking higher order cognitive processes such as cognitive maps.
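One way to see what a cognitive map adds is to treat it as a stored graph of places rather than a memorized chain of turns. The sketch below is a simplified Python illustration under an invented layout, far simpler than Tolman's actual apparatus; it shows only the flexibility idea, namely that an animal carrying such a map can still compute a route to the goal when its practiced corridor is blocked.

```python
# A cognitive map as a graph of places: blocking the practiced alley does not
# strand the animal, because a new route can be read off the stored map.
from collections import deque

maze = {
    "start":    ["alley", "side arm"],
    "alley":    ["start", "goal"],        # the practiced route: start -> alley -> goal
    "side arm": ["start", "detour"],      # a longer route the rat never practiced
    "detour":   ["side arm", "goal"],
    "goal":     [],
}

def route(graph, start, goal, blocked=frozenset()):
    """Breadth-first search over the stored map, skipping blocked places."""
    frontier, seen = deque([[start]]), {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph[path[-1]]:
            if nxt not in seen and nxt not in blocked:
                seen.add(nxt)
                frontier.append(path + [nxt])
    return None

print(route(maze, "start", "goal"))                      # the practiced route
print(route(maze, "start", "goal", blocked={"alley"}))   # alley blocked: a new route from the map
```

A pure stimulus-response account would have to store the route as a fixed chain of responses; a map-like representation of where places are is what makes this kind of flexible rerouting possible.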
Brain Mechanisms and Learning
Any form of learning, a change in behavior, must have an underlying brain basis.
A change in behavior cannot occur without a change in the brain. For many years,
psychologists struggled with trying to identify which brain centers were centrally
involved in learning.
The most obvious starting place is to examine brain structures involved in
pleasure. Primary reinforcers, after all, are consequences of behavior that increase
the probability that the behavior will occur again, so many of them are likely to be inherently
pleasurable. James Olds was an early pioneer in identifying what became known as the
pleasure centers of the brain (Olds, 1956; Olds & Fobes, 1981). He focused on the limbic
system. When he placed electrodes in the limbic system of rats’ brains, and allowed
them the chance to press a bar to receive a mild electrical current that stimulated the
limbic system, he made an astonishing discovery. Rats found stimulating the limbic area
so utterly pleasurable that they would spend hours pressing the bar to stimulate it. Indeed,
rats became so focused on stimulating this pleasure center that they would forego food
and water to do it.
Based on this research, psychologists began to implant electrodes in the brains of
people suffering from extreme pain from cancer or from incurable epileptic seizures.
Many patients reported, much like the rats, a feeling of intense pleasure when
delivering mild electrical stimulation to the limbic areas of their brains. Some reported
sexually pleasurable feelings, describing the sensation as similar to that achieved just
before reaching a sexual orgasm. In some cases, this relieved the intense physical
suffering of these patients. Over time, though, these procedures have been abandoned,
mainly over concerns about the ethics involved in drilling holes in people’s skulls, as
well as questions about their therapeutic effectiveness (Valenstein, 1986).
Since the discoveries by James Olds, psychologists have discovered several other
brain areas responsible for intensely rewarding sensations (Wise, 2005). The medial
forebrain bundle of neurons, which provides one set of connections between the
hypothalamus and the nucleus accumbens, provides pleasurable sensations when
stimulated. These areas are stimulated when rats or people engage in intrinsically
pleasurable behaviors, such as eating when hungry, drinking when thirsty, or having sex
when sexually aroused (Damsma et al., 1992). Studies using fMRI technology find that
the nucleus accumbens in men "lights up" when they watch pictures of physically
attractive women (Aharon et al., 2001) (see Figure 6.11).
Insert Figure 6.11 About Here
Psychologists have traced a chemical basis of these intrinsically reinforcing
activities to the secretion of the neurotransmitter dopamine. High levels of dopamine
create positive feelings; low levels of dopamine are sometimes linked to depression or
low mood. Neurons in the nucleus accumbens, by secreting dopamine, deliver an
intrinsically pleasurable and hence reinforcing sensation.
In sum, psychologists are beginning to uncover the underlying brain structures
involved in learning—brain structures that lead organisms to engage in behaviors that
contribute to their survival (e.g., eating, drinking) and successful reproduction (e.g.,
sexual activity)—a topic to which we now turn.
An Evolutionary Perspective on Operant Conditioning
The radical behaviorism of B.F. Skinner assumed that a researcher could teach
any organism any response, given enough time and a proper schedule of reinforcement.
Two students of Skinner, Keller and Marion Breland, discovered problems with this basic
assumption (Breland & Breland, 1961). Whereas Skinner had somewhat grandiosely
entitled his seminal book on behaviorism The Behavior of Organisms, the Brelands
playfully (but aptly) titled their key article The Misbehavior of Organisms.
This title came from their experiments, using Skinnerian principles of operant
conditioning, in which they attempted to train pigs and raccoons to put wooden coins into
a slot in a piggy bank. Much to their surprise, the pigs and raccoons did not cooperate.
After much reinforcement, the pigs would take the coin and approach the piggy bank, but
then drop it on the ground and engage in "rooting" behavior. Wild pigs are naturally
omnivores, consuming seeds, roots, tubers, leaves, and other foods found on and under
the ground. The trained pigs, contrary to their schedules of reinforcement, quickly
reverted to their evolved food-getting proclivities. The raccoons were similarly defiant.
Rather than place the wooden coins into the piggy bank, the raccoons insisted on rubbing
them between their paws—something raccoons naturally do to food objects before
consuming them. The evolved adaptations of pigs and raccoons, in short, overrode the
Brelands’ attempts to train them to perform "unnatural" tasks, despite rigorous
application of the principles of operant conditioning.
These and other findings lead to three key conclusions. First, there are
"biological constraints" on learning. Evolved adaptations of organisms sometimes
override principles of reinforcement. Not all behaviors can be learned through principles
of operant conditioning. Second, evolved adaptations cause some things to be
intrinsically reinforcing—rooting in pigs, washing in raccoons, looking at pictures of
attractive women in men.
A third conclusion is that some actions are intrinsically non-reinforcing. Consider
sex. Sex is generally considered to be a powerful reinforcer, and there are good
evolutionary reasons why it should be. Those in our evolutionary past who did not find
sex to be reinforcing did not engage in it. Hence, they failed to reproduce. We are all
descendants of ancestors who found sex to be reinforcing. But not all sex is reinforcing.
People do not generally find sex with genetic relatives reinforcing, and for good reason.
Sex with genetic relatives leads to "inbreeding depression," in which the offspring have
an above-average number of genetic defects. So humans have evolved incest-avoidance
adaptations: not only do most people find the thought of sex with relatives non-reinforcing; many find
it outright disgusting (Lieberman et al., 2003).
In sum, an evolutionary perspective sheds much light on learning through operant
conditioning. It provides constraints on what can and cannot be learned. It causes some
things to be intrinsically reinforcing—those actions that historically contributed to
survival and reproduction. And it causes some things to be intrinsically non-reinforcing
or even punishing—those actions that interfered with survival and successful reproduction.
Ongoing Debate: If Human Behavior is Controlled by Contingencies of
Reinforcement, Are People Responsible for Their Own Actions?
When Skinner published his book Beyond Freedom and Dignity in 1971,
advocating that people relinquish the mistaken belief that they are responsible for their
own behavior, he created a debate that remains ongoing today. On April 1, 2005,
ethologist David Barash argued that people’s reluctance to accept that their own behavior
is causally determined lies behind many of society’s problems. He wrote that "[A]
scientific conception of behavior abolish[es] the unsupportable conceit that people are
responsible for their own actions" (p. B10).
Renowned behaviorist John Staddon, of Duke University, disagrees. He argues
that dictionary definitions of responsibility include phrases such as "liable to be called to
account," "answerable," and "able to pay." Staddon goes on to argue that "In short,
responsibility simply means accepting the consequences . . . for one’s own actions.
These consequences are punishment, for bad acts, and reward, for good. Most humans
are so constructed that they will behave in predictable, generally deterministic ways if
they are rewarded or punished. Moreover, other deterministic humans, seeing the
aversive consequences of bad acts, will in turn be deterred from engaging in such acts.
None of this works perfectly—we can’t yet predict human behavior with precision. But
far from calling determinism into question, the concept of responsibility demands
determinism! If human behavior were undetermined and capricious . . . there would
indeed be no point to the idea of personal responsibility."
The crux of the debate is this. On one end, some people argue that if human
behavior is determined by contingencies of reinforcement (or any other causes), then
society cannot hold people responsible for their own actions. The other side of the debate
holds that the very fact that humans are, by definition, responsible for their own actions—
they have some sense of which of their actions will lead to reward and which will lead to
punishment—is perfectly consistent with human behavior being determined. One might
add that society, by enacting laws and policies that punish some behaviors
and encourage others, sets up contingencies of reinforcement that hold people
responsible for their own actions. It’s up to you to decide whether these views add to, or
detract from, your own sense of freedom and dignity.
End of Ongoing Debate
Operant conditioning is learning that occurs when the consequences of an
organism’s actions influence the likelihood of future performance of that action. That is
why operant conditioning is sometimes called "selection by consequences." Perhaps the
most primitive form of operant conditioning is trial and error learning. An early formulation
of operant conditioning is the law of effect, which states that responses followed by
satisfying states of affairs tend to be repeated in the future.
B.F. Skinner, a pioneer in the discovery of operant conditioning, distinguished
between reinforcements and punishments. Reinforcements increase the likelihood of
responses in the future. Punishments decrease the likelihood of future responses.
Skinner and his disciples discovered several major principles of operant
conditioning. One is the distinction between primary reinforcement (events that innately
increase the likelihood of a response) and secondary reinforcement (events that are
initially neutral, but acquire power to reinforce by being paired with primary reinforcers).
Another pair of principles of operant conditioning is operant generalization and operant
discrimination. In operant generalization, responses occur to new stimuli that are similar
to the original stimulus. In operant discrimination, organisms learn to differentially
respond to two or more different stimuli. Another principle of operant conditioning is
extinction, which is the gradual decline in response rate after a reinforcer is withdrawn.
Another set of principles of operant conditioning is schedules of reinforcement,
the precise rules by which reinforcers follow behaviors. Interval schedules of
reinforcement are defined as reinforcement given after a particular amount of time has
elapsed, such as every 10 minutes or half hour. Interval schedules come in two types—
fixed and variable, depending on whether the reinforcement is delivered after an exact
amount of time has elapsed (fixed) or after an average amount of time has elapsed
(variable). Ratio schedules of reinforcement are based not on time elapsed, but rather on
the number of responses emitted. Ratio schedules also come in two types, fixed (after a
particular number of responses have occurred, such as 20) and variable (after an average
number of responses have occurred). Variable ratio schedules are generally the most
powerful schedules of reinforcement, and the likely basis of the acquisition of gambling
addictions. A final principle of operant conditioning is shaping, which is learning that
takes place by reinforcing successive approximations of a correct response. Operant
conditioning is the learning process most likely to explain superstitions, which can occur
when people are accidentally reinforced after performing a particular behavior.
As with classical conditioning, cognitive advances have deepened the
understanding of operant conditioning. Tolman, for example, demonstrated that rats
develop cognitive maps, or mental representations of their physical environments, that
aid with learning navigation toward food reinforcers. Psychologists have also discovered
some of the brain regions involved in learning, such as the pleasure centers in the brain
that produce the primary reinforcers—the limbic system discovered as a pleasure center
by James Olds; and more recently the medial forebrain bundle of neurons that connect the
hypothalamus and nucleus accumbens. The neurotransmitter dopamine has also been
found to create pleasurable sensations, and its secretion is therefore linked with reinforcement.
An evolutionary perspective sheds additional light on operant learning.
Researchers using an evolutionary perspective have discovered biological constraints on
learning; discovered that intrinsically reinforcing actions are those that historically
contributed to survival and reproduction; and intrinsically punishing actions are those that
historically interfered with survival and reproduction. Although early learning theorists
such as Skinner assumed that the number of primary reinforcers was small,
evolutionarily guided researchers such as Skinner’s student Richard Herrnstein propose
that the list of primary reinforcers is considerably longer.
Have you ever used reinforcers or punishers to shape the behavior of a friend or
If you were to choose a schedule of reinforcement to train a dog, would you be
more effective if you used continuous reinforcement or intermittent reinforcement?
Why do you think food and sex are such powerful primary reinforcers?
OBSERVATIONAL LEARNING
1. Define observational learning.
2. Distinguish observational learning from operant learning.
3. Describe how an evolutionary perspective deepened our understanding of observational learning.
A young boy of four years old sat quietly watching his mother play chess with
another woman. He said nothing. They said nothing to him. After an hour, the young
boy reached over to the board, picked up one of the pawns, and moved it forward on the
board in the only direction pawns can move according to the rules of chess. Both
women looked at the boy in amazement. They had not taught him to play chess. The boy
had not been reinforced. He had not been shaped by successive approximations. No
pellets or M & Ms came his way. Yet he had learned one of the rules of chess
nonetheless. He had learned through a third major type of learning—observational
learning, defined as learning that occurs through watching, retaining, and sometimes
imitating the behavior of others. Observational learning is sometimes referred to as
social learning theory, because the mode of learning requires a social object, or model,
who performs the behavior that is subsequently observed and learned. As we will see,
the model can be an adult, a peer, a television character, or even a video game character.
Recall the opening story from this chapter—the chimp who actually gave the
experimenter back some of his banana slices after the experimenter had run out of them.
The chimp had acquired the banana-giving behavior through observational learning. He
had witnessed the experimenter dispense banana slices, so he did the same. One of the
striking features about observational learning is that no extrinsic reinforcements—no
pellets, M & M’s, banana slices, or sex—are needed to acquire a new behavior pattern.
All that is needed is witnessing someone else perform a behavior.
One of the early pioneers of observational learning was Stanford psychologist
Albert Bandura (Bandura, 1965; Bandura, Ross, & Ross, 1961). Bandura and his
colleagues brought 4-year-old children into a play area that contained an array of toys.
Among those toys was a "Bobo doll," a plastic inflatable doll with sand in the bottom,
which caused the doll to bounce back up when struck. After a while, an adult would start
to hit the Bobo doll with a fist or mallet; kick it; yell at it; and scream "kick it." The
young children simply watched the adult perform these aggressive actions. A control
group did not witness the adult aggress against the Bobo doll.
Subsequently, the children were allowed to play with an array of toys that
included a smaller version of the Bobo doll. Those who had witnessed the adult beat up
the Bobo doll were twice as likely as those in the control condition to act aggressively
themselves, mimicking the adults by punching, kicking, and beating the Bobo doll with a
mallet (see Photo 6.12). Through observational learning, the children had acquired aggressive behavior.
Insert Photo 6.12 of Bobo Doll Studies About Here
In subsequent experiments, Bandura and colleagues discovered that how the adult
models who had beaten up the Bobo dolls were treated had an impact on whether the children
imitated their behavior. When the adult was punished, children aggressed less. When
the adult was praised, the children aggressed more (Bandura, Ross, & Ross, 1963).
Showing that children imitate adults’ aggression toward a toy doll in laboratory
settings is one thing. Documenting similar effects of witnessing aggression in real life,
on television, or in video games is another. The findings do, in fact, support the effects
of observational learning in these other venues. Children exposed to violent videos not
only become more aggressive immediately after exposure; they also tend to become more
aggressive when they reach adulthood (Anderson et al., 2003). Indeed, exposure to
violent television between ages 6 and 11 has a larger impact on later violent
behavior than other contributing causes, such as low IQ, abusive parents, violent
peers, or even coming from a broken home (Carnagey et al., 2007).
The most recent concern is the proliferation of violent video games, which are
played and watched by millions of children and adults throughout the world. In violent
video games, participants witness frequent acts of aggression, since roughly 85% of video
games contain some violence (Carnagey et al., 2007). Players also directly
participate in the violence, so learning takes place immediately in the course
of playing the game. Research shows that playing violent video games increases
subsequent aggressive behavior, violent thoughts, angry feelings, and decreases prosocial
behavior (Anderson et al., 2004). One mechanism through which this occurs is
desensitization, a process by which players respond physiologically less and less after
repeated exposures to violence.
One study had participants play a violent video game for 20 minutes (Carnagey
et al., 2007). Subsequently, they watched a 10-minute video of real-life violence while
researchers measured their heart rate (HR) and galvanic skin response (GSR). Those who
had played a violent video game became physiologically desensitized to real-life violence,
showing lower HR and GSR. These findings suggest that the observational learning
effects of witnessing violence may occur partly through becoming physiologically
desensitized to acts of aggression.
Observational learning of aggression can occur through a variety of media—first-
hand observation of adults engaging in aggression, TV or movie models engaging in
aggression, or video-game players engaging in aggression. One other origin of
observational learning comes from an unexpected source—religious texts. In a
modern world of terrorists, many violent people claim that God approves of their actions.
In two experiments to test this proposition, participants read violent passages that they
were told either came from the Bible and were sanctioned by God, or came from an ancient
scroll and were not sanctioned by God (Bushman et al., 2007). Subsequently, participants were given the
chance to aggress against someone else by blasting them with a loud noise through
headphones. When the violence in the passage was from the Bible or sanctioned by God,
actual aggression increased, especially among those who already believed in the Bible
and in God (see Photos 6.13). In short, observational learning of violence need not come
from actual visual models of aggression. It can occur through violent images created by
the written word—findings that may help to explain the prevalence of violent terrorism.
Insert Photos 6.13 About Here – Potential Terrorist
An Evolutionary Perspective on Observational Learning
An evolutionary perspective on social or observational learning suggests that
people should not imitate just any model. After all, we are exposed to dozens or
hundreds of models each day, from siblings to classmates to parents to TV stars. People
could not possibly learn equally from all of them; there’s not enough time in a day. An
evolutionary perspective suggests that humans should be highly selective in the models to
which they attend and the particulars of the model’s behavior that are selected for
imitation.
The first clue to the evolutionary basis of observational learning came from
studies of lab-reared monkeys. These monkeys had no prior exposure to snakes, flowers,
or toy rabbits. The experimenters constructed video tapes of other monkeys reacting with
fear when witnessing a snake (Cook & Mineka, 1990). When they showed the novice
monkeys videos of other monkeys reacting with fear to snakes, the novice monkeys subsequently learned
to fear the snakes themselves—a prime case of observational learning. But here’s the
interesting twist. When the experimenters spliced a flower, a mushroom, or a toy rabbit
into the video, and showed participant monkeys videos of other monkeys reacting with
fear to these objects, the participant monkeys showed absolutely no fear of flowers or
mushrooms. So observational learning is critical to the development of fears in monkeys,
but so are the particular stimuli used.
Insert Photos 6.14 Here: Fearful Monkey
Monkeys only acquire fear through observational learning when the object of fear
happens to coincide with an ancestral danger. The key concept is preparedness,
indicating that the monkeys seem to come into the world pre-wired to learn to fear some
things and not others. Psychologists have documented similar findings with humans,
concluding that humans have an evolved module of "fear and fear learning" (Ohman &
Mineka, 2001). This example illustrates a key point—that "learning" and "evolved" are
not competing explanations. Humans (and other species) have evolved learning
adaptations, in this case a specialized adaptation to observationally learn to fear things
that historically have been dangerous to survival.
Insert Photos 6.15 About Here: High and Low Status Models
Another key evolutionary factor affecting observational learning is the social
status of the model. Status is critical, from an evolutionary perspective, because people
higher in status are more heavily endowed with resources needed for successful survival
and reproduction. Consequently, it would be surprising if people used low-status skid-
row bums as models as much as higher status individuals who command our attention
and respect. And indeed, models higher in social status are imitated more than models
lower in social status (McCullaugh, 1986). Advertisers and marketers are well-aware of
the power of high status models in observational learning. When Michael Jordan wears a
particular brand of basketball shoe or Jennifer Lopez wears a particular brand of dress,
sales fly through the roof. Marketers make money by exploiting the human bias to learn
observationally through high status models.
Adapting to Your World: Using Principles of Learning to Improve Your Life
Thoughtful reflection of the principles of learning—classical conditioning,
operant conditioning, and observational learning—reveals ways in which you can
improve your own life. Consider observational learning. An undergraduate student whom
I’ll call JJ had a clearly defined goal in his life. He wanted to get an education, become a
professor, and eventually return to his native country (an Asian country) and become the
founder of the field of evolutionary psychology. One of his strategies was to learn by
picking key models for each stage of his career.
First, he paid close attention to the behavior of more advanced undergraduates
("models," in the language of observational learning) who had attained success in getting
into the sorts of graduate schools that he aspired to. He mimicked their behavior. When
he entered graduate school, he paid special attention to more advanced graduate students
who had achieved success in getting their Ph.D.’s and securing good academic positions.
Finally, he watched carefully the strategies of the most successful professor in his native
country.
Through observational learning at each stage in his career, selecting the right
models on which to base his own behavior, he achieved great success. He succeeded in
entering a prestigious graduate school. He succeeded in securing his Ph.D. And he
succeeded in obtaining a professorship in his native country. Although it is too early to
tell whether he will succeed in his ultimate goal of founding the field of evolutionary
psychology in his native country, he is well on his way. His research is getting published
in prestigious journals, and academics in his country are beginning to take note of this
new trend. The lesson in learning is simple—pick your models carefully!
Observational learning is a third type of learning, which occurs through watching,
retaining, and imitating the behavior of others. It is sometimes called social learning,
because the mode of learning requires a social object, or model, who performs the
behavior subsequently learned. Models can be other people, television or movie actors,
video game characters, or individuals depicted in written texts such as novels or the Bible.
One hypothesis for why aggression seems to be observationally learned is that repeatedly
witnessing aggression causes a decrease in physiological responding, suggesting that
people become desensitized to violence.
An evolutionary perspective has deepened our understanding of observational
learning in both monkeys and humans. Monkeys who witness other monkeys react with
fear when seeing a snake observationally learn to fear snakes. Watching other monkeys
react with fear to flowers or rabbits, however, does not cause observational learning.
Monkeys seem prepared to learn some fears very easily, namely those historically
hazardous to survival. Among humans, social status has always been strongly linked
with access to key resources needed for survival and reproduction. Thus, it is not
surprising that observational learning occurs much more strongly in response to
high status than low status models. Humans seem prepared to observationally learn
preferentially from those who have the resources they want.
After school shootings, such as those that occurred in Columbine, Colorado,
which learning process do you think was responsible for subsequent “copy-cat”
shootings at other schools?
Why do you think the models of children shift from parents to peers with age?
What advantages does observational learning have over classical conditioning or
operant conditioning in terms of speed of learning?
CONNECTIONS: TOWARD A UNIFIED PSYCHOLOGY
Learning is intrinsically linked to all other branches of psychology. Many forms
of learning have a deep evolutionary history (Chapter 3), such as classical and operant
conditioning. Evolution has designed humans to learn some things rapidly, particularly
things that are critical to life or death (e.g., food consumption) or reproduction (e.g., sex).
Learning has a clear biological basis (Chapter 4), such as discovering the basic brain
centers responsible for pleasure and hence reinforcement. We learn through our senses
and perceptual mechanisms (Chapter 5), from habituating to repeated sounds such as
train noise to a visual system that selectively homes in on high status models for
observational learning.
Many things we learn are stored in memory (Chapter 7), such as paths to
successful hunting grounds or the tricks that allow us to unlock the hearts and minds of
other people. Although we are not always conscious of what we have learned, some
forms of learning are indeed available to conscious awareness and can be communicated
to other people (Chapter 8). A reasonable argument can be made that humans have a
greater ability to learn than any other organism, which may be a key feature of human
intelligence (Chapter 10). Learning clearly develops over the lifespan (Chapter 11),
leading to one of the few abilities that peaks late in life—wisdom. Motivation and
emotion (Chapter 12) play powerful roles in learning. We are motivated to learn to get
ahead in life; that’s probably one of the reasons you are reading this book. And
emotional experiences heighten the speed of learning—the jolt of adrenaline we get from
being scared by a stranger causes us to learn to be wary of dark places.
Much human learning is inherently social (Chapter 13). Observational or social
learning is the most obvious example. But we also shape the behaviors of others in our
social environment, and in turn are shaped by them. Personality (Chapter 14) plays a
critical role in learning. Those high on the personality trait of "openness to experience,"
for example, learn from a wider range of experiences than those low on openness to
experience.
Psychological disorders (Chapter 15) can disrupt learning, as in the case of
sociopaths who are notoriously resistant to learning from punishing outcomes. Many
forms of psychological therapy (Chapter 16) are based on principles of learning, such as
using the principle of habituation as a core strategy in the treatment of snake phobias
through a procedure known as systematic desensitization. Finally, learning is essential to
stress, coping, and health (Chapter 17). We become stressed when our previously learned
coping strategies no longer work in a novel environment. We observationally learn good
coping strategies from successful models. And we achieve good health by watching
people whose life habits lead to longevity. In all of these ways, psychology is a unified
science of the mind.
I. What is Learning?
A. Learning Objectives and Summary
Define Learning (pp. xx-xx)
Learning is the collection of processes by which experiences cause relatively
enduring changes in an individual’s psychology or behavior.
Describe the concept of habituation (pp. xx-xx)
Habituation is the progressive reduction in intensity or frequency of a response
to a stimulus as a consequence of repeated exposures to the stimulus.
Contrast habituation with sensitization (pp. xx-xx)
Rather than reduction in response as occurs with habituation, sensitization is the
progressive amplification of a response following repeated exposures to a stimulus.
B. Key Terms and People
Key Terms: Learning, habituation, sensitization
C. Study Questions
1. When Laura gets increasingly irritated in a movie theater when someone continues to
talk on their cell phone during the movie, she has experienced the phenomenon of
___________________. [sensitization]
2. When a pattern of personal experiences causes a more or less permanent change in
behavior, we say that the person has ___________________.
3. When Joshua experiences a temporary change in behavior due to getting a flu virus, we
call this change learning. (true or false)
II. Classical Conditioning
A. Learning Objectives and Summary
Define classical conditioning (pp. xx-xx)
Classical conditioning is a process of learning by which a neutral stimulus evokes
a response after being paired with something that previously evoked the response.
Distinguish between unconditioned and conditioned stimuli (pp. xx-xx)
An unconditioned stimulus (US) is something that innately evokes a response
with no prior learning; in contrast, a conditioned stimulus (CS) is a
stimulus that is initially neutral and does not evoke a response until it is paired
(usually repeatedly) with an unconditioned stimulus.
Distinguish between unconditioned and conditioned responses (pp. xx-xx)
An unconditioned response (UR) is the reaction naturally elicited by the
unconditioned stimulus, such as a dog salivating to the stimulus of food; a
conditioned response (CR) is a reaction produced by the conditioned stimulus
that mimics the unconditioned response.
Correctly identify the six major principles of classical conditioning (pp. xx-xx)
Acquisition: The period of classical conditioning in which the CS and US are
repeatedly paired in order to create the CR is called the acquisition phase.
Extinction is the process by which a conditioned response gradually diminishes in
magnitude when the US is no longer paired with the CS.
Reacquisition is the rapid recovery of the conditioned response after the
reintroduction of a pairing of an unconditioned and conditioned stimulus.
Spontaneous recovery is the return of an extinguished conditioned response after
a rest period.
Stimulus generalization is the spread of the conditioned response to stimuli that
are similar to the conditioned stimulus.
Stimulus discrimination occurs when an organism responds only to the
conditioned stimulus and does not respond to other stimuli.
Describe the cognitive reformulation of classical conditioning (pp. xx-xx)
The cognitive reformulation of classical conditioning involved bringing back
mentalistic concepts, such as expectancies about the predictive value of events
and cognitive maps, in order to fully explain the process of classical conditioning.
Define the evolutionary concept of preparedness in learning (pp. xx-xx)
Preparedness is an organism’s ability to learn some kinds of associations rapidly.
Define the evolutionary concept of adaptive specializations in learning (pp. xx-xx)
Adaptive specializations are evolved predispositions to learn some things easily,
particularly those of special relevance to survival and reproduction during the
history of a species.
B. Key Terms & People
Key Terms: unconditioned stimulus (US), unconditioned response (UR), conditioned
stimulus (CS), conditioned response (CR), acquisition phase, extinction, reacquisition,
spontaneous recovery, stimulus generalization, stimulus discrimination, preparedness.
Key People: Ivan Pavlov, John Watson, Robert Rescorla, Allan Wagner, John Garcia,
C. Study Questions
1. If a dog has been conditioned to respond to the color red, but does not respond to
related colors such as purple, we would say that the dog has undergone ______________.
2. If Shannon gets sick to her stomach after eating a raw oyster, and learns based on that
single experience to avoid raw oysters thereafter, we would say that her rapid
conditioning is an example of:
b. spontaneous recovery
3. Laboratory demonstrations that rats appear to develop expectancies about which events
are good predictors of rewards suggests that learning theorists need cognitive concepts as
part of learning theory. [true or false]
D. Key Illustrations
Name the learning phenomenon associated with each panel [ed. delete correct answers on
top of diagram].
III. Operant Conditioning
A. Learning Objectives & Summary
Define operant conditioning (pp. xx-xx)
Learning that occurs when the consequences of an organism’s action influence the
probability that the organism will repeat that action again in the future.
Define the law of effect (pp. xx-xx)
Responses to stimuli that are followed by a satisfying state of affairs to the
organism are more likely to occur in the future; responses to stimuli followed by
unpleasant states of affairs are less likely to be repeated in the future.
Distinguish between reinforcement and punishment (pp. xx-xx)
Reinforcement is a consequence following an operant response that increases the
likelihood of the operant response occurring in the future; in contrast, punishment
is a consequence following an operant response that decreases the likelihood of its
occurrence in the future.
Distinguish between primary and secondary reinforcers (pp. xx-xx)
Primary reinforcers are events that innately increase the likelihood of a response;
in contrast, secondary reinforcers are stimuli that are originally neutral, but
acquire the power to reinforce behavior by being paired with a primary reinforcer.
Define operant generalization and operant discrimination (pp. xx-xx)
Operant generalization refers to the responses that occur to a new stimulus that is
similar to the old stimulus; in contrast, operant discrimination refers to differential
responding to two different stimuli.
Identify the four major schedules of reinforcement (pp. xx-xx)
Fixed interval schedule
Variable interval schedule
Fixed ratio schedule
Variable ratio schedule
Describe how research on cognition and the brain has deepened our
understanding of operant conditioning (pp. xx-xx)
At least some forms of operant conditioning cannot be understood without
invoking higher order cognitive processes such as cognitive maps. Brain research
has identified pleasure centers (such as the limbic system and the nucleus
accumbens) whose stimulation is intrinsically reinforcing.
How does an evolutionary perspective deepen understanding of operant
conditioning? (pp. xx-xx)
It provides constraints on what can and cannot be learned.
It leads to the understanding that some things are intrinsically reinforcing—
those actions that historically contributed to survival and reproduction.
It leads to the understanding that some things are intrinsically non-reinforcing or
even punishing—those actions that interfered with survival and successful
reproduction.
B. Key Terms and People
Key Terms: Operant conditioning, law of effect, operant responses, reinforcement,
punishment, operant generalization, operant discrimination, discriminative stimulus,
extinction, intermittent reinforcement, schedules of reinforcement, interval schedule of
reinforcement, fixed interval schedule of reinforcement, variable interval schedule of
reinforcement, fixed ratio schedule of reinforcement, variable ratio schedule of
reinforcement, shaping, cognitive maps.
Key People: Edward Thorndike, B.F. Skinner, Edward Chace Tolman, James Olds,
Keller and Marion Breland
C. Study Questions
1. Food and sex would be described as __________________ reinforcers. [primary]
2. The most powerful schedule of reinforcement is:
a. fixed interval
b. variable interval
c. fixed ratio
d. variable ratio
3. Negative reinforcement is the same thing as punishment. [true or false]
D. Key Illustrations
Positive and Negative Reinforcement and Punishment
Increases Likelihood Decreases Likelihood
of Behavior of Behavior
Stimulus Added __________________ ___________________
Stimulus Removed __________________ ___________________
Fill in the blanks above.
IV. Observational Learning
A. Learning Objectives & Summary
Define observational learning (pp. xx-xx)
Learning that occurs through watching, retaining, and sometimes imitating the
behavior of others.
Distinguish observational learning from operant learning (pp. xx-xx)
Observational learning, unlike operant learning, requires seeing the actions of
others and can occur without being reinforced externally.
Describe how an evolutionary perspective deepened our understanding of
observational learning (pp. xx-xx)
It has added the concept of preparedness, the idea that organisms are innately
designed to learn some behaviors from others (e.g., fears of snakes), but not
others (e.g., fear of flowers).
It has provided a deeper understanding that humans are evolved to observationally
learn from some types of models (e.g., those high in status) more than from other
potential models (e.g., those low in status).
B. Key Terms and People
Key Terms: Observational learning; model; preparedness.
Key People: Albert Bandura, Susan Mineka, Arne Ohman
C. Study Questions
1. A person who performs a behavior that is observed and subsequently imitated by
others is called a _________________. [model]
2. Observational learning of aggression can occur through:
a. witnessing live models who act aggressively
b. witnessing television or movie actors who act aggressively
c. reading written descriptions of individuals who act aggressively
d. all of the above
3. Observational learning occurs through models equally, regardless of their social status.
[true or false]
Association for Behavior Analysis
Journal of Applied Behavior Analysis
European Journal of Behavior Analysis
Bandura, A. (1977). Social Learning Theory. New York: General Learning Press.
Domjan, M. (2006). Principles of learning and behavior. Belmont, CA: Wadsworth.
Skinner, B.F. (1938). The behavior of organisms. New York: Appleton-Century-Crofts.
Aharon, I., Etcoff, N., Ariely, D., Chabris, C.F., O’Conner, E., & Breiter, H.C. (2001).
Beautiful faces have variable reward value: fMRI and behavioral evidence.
Neuron, 32, 537-551.
Anderson, C.A., Berkowitz, L., Donnerstein, E., Huesmann, L.R., Johnson, J. et al.
(2003). The influence of media violence on youth. Psychological Science in the
Public Interest, 4, 81-110.
Anderson, C.A., Carnagey, N.L., Flanagan, M., Benjamin, A.J., Eubanks, J., & Valentine,
J.C. (2004). Violent video games: Specific effects of violent content on aggressive
thoughts and behavior. Advances in Experimental Social Psychology, 36, 199-
Andrzejewski, M.A., Ryals, C.D., Higgins, S., Sulkowski, J., Doney, J., Kelley, A.E., &
Bersh, P.J. (2006). Is extinction the hallmark of operant discrimination?:
Reinforcement and SΔ effects. Behavioural Processes, 74, 49-63.
Bandura, A. (1965). Influence of models’ reinforcement contingencies on the acquisition
of imitative responses. Journal of Personality and Social Psychology, 1, 589-595.
Bandura, A., Ross, D., & Ross, S. (1961). Transmission of aggression through imitation
of adult models. Journal of Abnormal and Social Psychology, 63, 575-582.
Bandura, A., Ross, D., & Ross, S. (1963). Vicarious reinforcement and imitative learning.
Journal of Abnormal and Social Psychology, 67, 601-607.
Breland, K., & Breland, M. (1961). The misbehavior of organisms. American
Psychologist, 16, 681-684.
Bushman, B.J., Ridge, R.D., Das, E., Key, C.W., & Busath, G.L. (2007). When God
sanctions killing: Effect of scriptural violence on aggression. Psychological
Science, 18, 204-207.
Cantor, C. (2005). Evolution and posttraumatic stress: Disorders of vigilance and
defense. East Sussex, UK: Routledge.
Carnagey, N.L., Anderson, C.A., & Bushman, B.J. (2007). The effect of video game
violence on physiological desensitization to real-life violence. Journal of
Experimental Social Psychology, 43, 489-496.
Cook, E.W., Hodes, R.L., & Lang, P.J. (1986). Preparedness and phobia: Effects of
stimulus content on human visceral conditioning. Journal of Abnormal
Psychology, 95, 195-207.
Cook, M., & Mineka, S. (1990). Selective associations in the observational conditioning
of fear in monkeys. Journal of Experimental Psychology: Animal Behavior
Processes, 16, 372-389.
Cusato, B., & Domjan, M. (1998). Special efficiency of sexual conditioned stimuli that
include species typical cues: Tests with a CS preexposure design. Learning and
Motivation, 29, 152-167.
Damsma, G., Pfaus, J.G., Wenkstern, D., Phillips, A.G., & Fibiger, H.C. (1992). Sexual
behavior increases dopamine transmission in the nucleus accumbens and striatum
of male rats: Comparison with novelty and locomotion. Behavioral
Neuroscience, 106, 181-191.
Domjan, M. (2005). Pavlovian conditioning: A functional perspective. Annual Review of
Psychology, 56, 179-206.
Domjan, M. (in press). Adaptive specializations and generality of the laws of classical
and instrumental conditioning.
Domjan, M., Cusato, B., & Krause, M. (2004). Learning with arbitrary vs. ecologically
conditioned stimuli: Evidence from sexual conditioning. Psychonomic Bulletin &
Review, 11, 232-246.
Garcia, J., & Koelling, R.A. (1966). Relation of cue to consequence in aversion learning.
Psychonomic Science, 4, 123-124.
Gemberling, G.A., & Domjan, M. (1982). Selective association in one-day-old rats:
Taste-toxicosis and texture-shock aversion learning. Journal of Comparative and
Physiological Psychology, 96, 105-113.
Kamin, L.J. (1969). Predictability, surprise, attention, and conditioning. In B.A.
Campbell & R.M. Church (Eds.), Punishment and aversive behavior. New York:
Appleton-Century-Crofts.
Kimble, G.A. (1961). Hilgard and Marquis’ conditioning and learning (2nd ed.). New
York: Appleton-Century-Crofts.
Kirsch, I., Lynn, S.J., Vigorito, M., & Miller, R.R. (2004). The role of cognition in
classical and operant conditioning. Journal of Clinical Psychology, 60, 369-392.
Krause, M.A., Cusato, B., & Domjan, M. (2003). Extinction of conditioned sexual
responses in male Japanese quail (Coturnix japonica): Role of species typical
cues. Journal of Comparative Psychology, 117, 76-86.
Lieberman, D., Tooby, J., & Cosmides, L. (2003). Does morality have a biological
basis? An empirical test of the factors governing moral sentiments relating to
incest. Proceedings of the Royal Society of London, B, 270, 819-826.
Mast, S.O., & Pusch, L.C. (1924). Modification of response in amoeba. Biological
Bulletin, 46, 55-59.
McCullagh, P. (1986). Model status as a determinant of observational learning and
performance. Journal of Experimental Social Psychology, 8, xx-xx.
Miller, V., & Domjan, M. (1981). Specificity of cue to consequence in aversion learning
in the rat: Control for US-induced differential orientations. Animal Learning and
Behavior, 9, 339-345.
Miller, R.R., & Grace, R.C. (2003). Conditioning and learning. In A. Healy & R.
Proctor (Eds.), Handbook of psychology. Vol. 4: Experimental psychology (pp.
357-397). New York: Wiley.
Ohman, A., & Mineka, S. (2001). Fears, phobias, and preparedness: Toward an evolved
module of fear and fear learning. Psychological Review, 108, 483-522.
Olds, J. (1956, October). Pleasure centers in the brain. Scientific American, 195, 105-116.
Olds, J., & Fobes, J.I. (1981). The central basis of motivation: Intracranial self-
stimulation studies. Annual Review of Psychology, 32, 523-574.
Pavlov, I.P. (1923). New researches on conditioned reflexes. Science, 58, 359-361.
Pavlov, I.P. (1927). Conditioned reflexes. Oxford: Oxford University Press.
Rescorla, R.A., & Wagner, A.R. (1972). A theory of Pavlovian conditioning: Variations
in effectiveness of reinforcement and nonreinforcement. In A. Black & W.F.
Prokasy, Jr. (Eds.), Classical conditioning II. New York: Appleton-Century-
Crofts.
Rosen, J.B., & Schulkin, J. (1998). From normal fear to pathological anxiety.
Psychological Review, 105, 325-350.
Seligman, M.E.P. (1970). On the generality of the laws of learning. Psychological
Review, 77, 406-418.
Seligman, M.E.P., & Hager, J.L. (Eds.). (1972). Biological boundaries of learning. New
York: Appleton-Century-Crofts.
Skinner, B.F. (1938). The behavior of organisms. New York: Appleton-Century-Crofts.
Skinner, B.F. (1981). Selection by consequences. Science, 213, 501-504.
Staddon, J.E.R., & Cerutti, D.T. (2003). Operant conditioning. Annual Review of
Psychology, 54, 115-144.
Terry, W.S. (2006). Learning and memory (3rd ed.). Boston: Allyn & Bacon.
Thorndike, E.L. (1898). Animal intelligence: An experimental study of the associative
processes in animals. New York: Macmillan.
Tolman, E.C., & Honzik, C.H. (1930). "Insight" in rats. University of California
Publications in Psychology, 4, 215-232.
Tolman, E.C., Ritchie, B.F., & Kalish, D. (1946). Studies in spatial learning:
Orientation and short cut. Journal of Experimental Psychology, 36, 13-24.
Valenstein, E.S. (1986). Great and desperate cures: The rise and decline of
psychosurgery and other radical treatments of mental illness. New York: Basic
Books.
Watson, J.B. (1930). Behaviorism (Rev. ed.). Chicago: University of Chicago Press.
Wise, R.A. (2005). Forebrain substrates of reward and motivation. Journal of
Comparative Neurology, 493, 115-121.
Yehuda, R. (2002). The status of cortisol findings in post-traumatic stress disorder.
Psychiatric Clinics of North America, 25, 341-368.