The University of Edinburgh
School of Philosophy,
Psychology & Language Sciences
Undergraduate Psychology Honours
Final Year Dissertation
Processing pronouns: Effects of content-based
properties of potential antecedents and
Supervisor: Patrick Sturt
ABSTRACT
Constraints on antecedent resolution
The initial-filter model
The interactive-parallel constraint model
The memory load hypothesis
The present study
METHODS
Participants
Design
Materials
Procedure
RESULTS
Pre-critical region
Pronoun region
Critical region
Post-critical region
Probe recognition
DISCUSSION
REFERENCES
APPENDICES
Appendix A: Experimental materials
Appendix B: Statistics for word reading times and probe responses
ABSTRACT
This report examines the process of antecedent identification for pronouns, which is an
essential part of language comprehension in general. The interactive-parallel constraint
model proposes that antecedent resolution is the result of many structural and non-
structural constraints competing simultaneously, suppressing and enhancing the
activation levels of potential antecedents within the discourse representation
(Gernsbacher, 1989; MacDonald & MacWhinney, 1990). This model is supported by
previous evidence that the gender of grammatically-inaccessible antecedents, according
to Principle B of binding theory (Chomsky, 1981), affects antecedent resolution
(Badecker & Straub, 2002). Using a word-by-word self-paced reading methodology, this
experiment investigated whether this ‘multiple-candidate effect’ can instead be attributed
to the basic difficulty of encoding a sentence containing two less distinguishable referents
of the same gender, as compared to two referents of different genders, a proposal termed the
‘memory load hypothesis’. The results did not replicate the multiple-candidate effect, and
provided no support for the memory load hypothesis in early processing. Results in later
processing revealed a conflicting effect: reading times were slower for words located towards
the end of a sentence that contained two referents of different genders than for a sentence
containing two referents of the same gender.
This effect could be attributable to a process in which readers attempt to bind the pronoun
to the most-recently processed grammatically-inaccessible antecedent within the
sentence, or it might be evidence of gender-based priming in referent encoding. Finally,
evidence for the memory load hypothesis was found in response times in the probe
recognition task; response times were longer if the probe followed a sentence containing two
proper names of congruent gender, irrespective of whether the sentence required
antecedent resolution. A type of similarity-based interference was apparently affecting
probe responses; it seems the same processes that are accessed in initial sentence
processing are re-accessed in later probe sentence-checking.
Understanding how anaphoric dependencies are processed is an essential part of
comprehending how language is processed in general. Anaphoric dependencies constitute
the relationship between a referentially dependent expression, such as a pronoun, and its
antecedent noun phrase. Identifying an antecedent for a pronoun is necessary so that a
reader can understand what a pronoun refers to and thereby build a coherent
representation of the discourse being processed. Researchers have shown considerable
interest in the process by which such dependencies are resolved. A number of
constraints have been identified as contributing towards antecedent resolution, primarily
grammatical constraints such as binding theory (e.g. Chomsky, 1981; Reinhart, 1976);
morphological constraints of the pronoun, such as person, gender, and number (e.g.
Arnold, Eisenband, Brown-Schmidt & Trueswell, 2000; Badecker & Straub, 1994;
Badecker & Straub, 2002; Cacciari et al, 1997; Ehrlich, 1980; Sturt, 2003; Van Gompel
& Liversedge, 2003); and the accessibility of referents within the discourse representation
(e.g. McKoon, Ward, Ratcliff, & Sproat, 1993; Van Gompel & Majid, 2004). This
experiment investigates the role of both grammatical and morphological gender
constraints on antecedent resolution imposed by the pronoun by manipulating the gender
of referents within the discourse representation built during comprehension, examining
whether gender information facilitates or hinders the process of antecedent resolution of
pronouns, or whether the manipulation of gender information is simply related to general
effects of similarity-based interference in language comprehension.
Constraints on antecedent resolution
In sentence comprehension, it is generally assumed that a discourse model is
formed in temporary working memory (Baddeley, 2007) and incrementally updated as
the sentence is read (Carreiras, Garnham, Oakhill & Cain, 1996). The discourse model is
a coherent representation of the set of entities evoked by the discourse, commonly referred
to as ‘discourse referents’, and the relationships that exist between those entities.
Language comprehension necessitates such a representation so that earlier words in a
sentence can be retrieved and related to later words in the sentence (Just & Carpenter,
1992). Interpreting referentially dependent expressions requires searching the discourse
representation for already-encoded referents to which those expressions can refer back (Lucas,
Tanenhaus & Carlson, 1990). A number of factors have been identified which constrain
the resolution of such referential dependencies, some of which are now discussed.
Grammatical constraints on the allowability of coreferential relationships are
imposed by binding theory (Chomsky, 1981). Binding theory proposes that structural
relations within a sentence limit the possible set of antecedents for a particular
referentially dependent expression, which are classified into three categories: anaphors
(reflexives and reciprocals, e.g. himself, each other), pronominals (non-reflexive
pronouns, e.g. he, her), and referential-expressions (henceforth R-expressions, defined as
uniquely identifiable entities such as full NPs, e.g. John, the duchess). The use of the
term ‘anaphor’ here is distinct from the more general sense used earlier to describe any
referentially-dependent expression. The concept of ‘binding’ refers to coreference within
a discourse, and is based on the principle of c-command (Reinhart, 1976) and governing
categories. The term ‘governing category’ is used here to refer to the minimal
clause containing both the referentially-dependent expression and its grammatical
Chomsky (1981) proposed three principles constraining the grammaticality of
coreferential relationships within a sentence based on structural and locality restrictions
imposed on the antecedent, and evidence suggests that native English speakers’
judgements of the acceptability of different kinds of coreferential relationships are in
agreement with these principles (Gordon & Hendrick, 1977). Principle A of binding
theory states that an anaphor must be bound locally within its governing category, thus
explaining the indexing in sentence (1a). Principle B states that a pronominal must be
free (i.e. not bound) within its governing category, although it may be bound from outside
it, which explains sentence (1b). Principle C states that an R-expression must be free
everywhere, which explains the ungrammaticality of sentence (1c) unless the two
R-expressions refer to separate referents.
(1a) Bill_i thought that John_j saw himself_*i/j.
(1b) Bill_i thought that John_j saw him_i/*j.
(1c) John_i saw John_*i.
Principle A specifies the appropriate antecedent of an anaphor, whereas Principles
B and C only identify those antecedents that a pronominal or R-expression cannot refer
to; other discourse processing procedures may therefore be required to select a
suitable antecedent from the grammatically-appropriate set. This report is mainly
concerned with the processing of pronouns constrained by Principle B, although some
comparisons will be made with reflexives governed by Principle A. Throughout this
report, those referents located in binding theory compatible positions will be referred to
as ‘grammatically-accessible’, and those referents not in such positions will be
referred to as ‘grammatically-inaccessible’.
Antecedent resolution has also been shown to be affected by the morphological
content-based constraints specified by a particular referentially-dependent expression.
Agreement in gender, number, and person between a pronoun and its potential
antecedents has been shown to facilitate antecedent selection (e.g. Arnold et al, 2000;
Badecker and Straub, 1994; Badecker and Straub, 2002; Cacciari et al, 1997; Sturt, 2003;
Van Gompel & Liversedge, 2003). These morphological features identify those entities
within a discourse that could be potential antecedents for pronouns, thereby assisting
resolution (Rigalleau & Caplan, 2000). Arnold et al (2000) have shown that gender
information about the pronoun is rapidly accessed in resolution, finding an influence of
gender within 200ms following the offset of the pronoun in cross-modal eye-tracking
Salience within the discourse representation is similarly thought to constrain
selection for the initial antecedent set. Pronouns are resolved faster when the antecedent
is presented in a focused position within the discourse, suggesting that ease of
accessibility within the discourse representation affects the difficulty of antecedent
resolution (McKoon et al, 1993). Badecker and Straub (2002) argue that only highly
focused referents are initially considered as antecedents of referentially-dependent
pronouns. Van Gompel and Majid (2004) found shorter reading times for pronouns with
infrequent antecedents than for pronouns with frequent antecedents, a finding they attribute
to infrequent antecedents being more salient within the discourse representation.
Research has recently focused on examining the time-course of antecedent
resolution, investigating at what point in processing binding theory constrains
coreference, and moreover the impact of other non-structural constraints. A number of
different models have been proposed to attempt to explain the interplay between these
constraints in the process of antecedent resolution, some of which are discussed below.
The initial-filter model
The ‘initial-filter model’ proposes that grammatical constraints immediately
restrict the potential antecedent set to include only grammatically-accessible referents,
and only if this is insufficient for resolution are other discourse processing procedures
used later in processing (Nicol & Swinney, 1989). Cross-modal priming studies using
sentences such as those in (2a) and (2b) support this proposal (Nicol, 1988), finding that
only the grammatically-accessible referent the doctor is primed immediately after the
reflexive in (2a), whereas in (2b) the doctor is grammatically-inaccessible and
accordingly not primed following the pronoun. The two equally possible grammatically-
accessible referents the boxer and the skier are both primed after the pronoun in (2b),
consistent with the idea that Principle B functions to indicate only those referents
inaccessible to the pronoun.
(2a) The boxer told the skier that the doctor_i for the team would blame himself_i # for the recent injury.
(2b) The boxer_(i) told the skier_(i) that the doctor for the team would blame him_(i) # for the recent injury.
Participants’ responses were faster only for probe words corresponding to
grammatically-accessible referents of the different referentially dependent expressions,
which seems indicative of a process in which there is immediate triggering of the
potential set of antecedents for referentially dependent expressions, composed only of
those that are grammatically-accessible according to the structural constraints imposed by
binding theory (Nicol & Swinney, 1989). Indeed, grammatically-inaccessible referents
appear not to have been considered as potential antecedents at any point in processing.
Further support for the ‘initial-filter model’ comes from a study by Clifton,
Kennison and Albrecht (1997), who manipulated grammatical number to investigate how
potential antecedents for pronouns are selected, using self-paced reading of sentences
such as (3a) to (3d).
(3a) The supervisors paid him yesterday to finish typing the manuscript.
(3b) The supervisor paid him yesterday to finish typing the manuscript.
(3c) The supervisors paid his assistant yesterday to finish typing the manuscript.
(3d) The supervisor paid his assistant yesterday to finish typing the manuscript.
In (3a) and (3b), the grammatical subject the supervisor(s) is not in a
grammatically-accessible antecedent location for the accusative pronoun him according to
Principle B, whereas in (3c) and (3d) the subject is in a grammatically-accessible position
for the possessive pronoun his, although only in (3d) does the subject match the pronoun
in number. Reading times were shorter in the number-congruent sentence (3d) than
in the number-incongruent sentence (3c), in which the mismatch required further
processing. Number congruency had no effect on reading times in either
(3a) or (3b), which was taken as evidence that binding theory is imposed as an initial-
filter on the potential antecedent set, with number information only being utilised later in
processing if binding theory was unsuccessful in antecedent resolution.
The interactive-parallel constraint model
An alternative model of antecedent resolution is offered by what Badecker and
Straub (2002) refer to as the ‘interactive-parallel constraint model’. This model proposes
that antecedent resolution is based on multiple constraints competing in parallel,
including binding theory, morphological agreement (i.e. gender, person and number),
discourse focus, and order of mention, all of which result in the suppression and
enhancement of activation levels of referents within the discourse representation
(Gernsbacher, 1989). The final activation level of a referent is regarded as the sum of
these competing facilitatory and inhibitory activation levels. Accordingly, probe response
time studies reveal evidence of both facilitation effects for antecedents and inhibition
effects for non-antecedents (Gernsbacher, 1989). MacDonald and MacWhinney (1990)
report similar findings and conclude that as a discourse is processed, the level of
activation of particular elements within the discourse representation constantly changes
depending on the contributions of such parallel-acting constraints. The fundamental idea
of this model is that antecedent resolution is affected by many different kinds of lexical
information and grammatical constraints, in a similar manner to how lexical and syntactic
ambiguities are processed (e.g. MacDonald, Pearlmutter & Seidenberg, 1994; Trueswell,
Previous research has reported evidence of different constraints acting on
antecedent resolution, whereby content-based properties of grammatically-inaccessible
referents have affected the processing of pronouns (e.g. Badecker & Straub, 1994;
Badecker & Straub, 2002; Sturt, 2003). Badecker and Straub (2002) designed an
experiment to test the predictions of both the initial-filter model and the interactive-
parallel constraint model, using sentences such as (4a) to (4d) (in their Experiment 1),
which all include more than one highly salient referent.
(4a) John thought that Bill owed him another chance to solve the problem.
(4b) John thought that Beth owed him another chance to solve the problem.
(4c) Jane thought that Bill owed him another chance to solve the problem.
(4d) Jane thought that Beth owed him another chance to solve the problem.
In a self-paced word-by-word reading task, sentences in which grammatically-accessible
referents mismatched the morphological gender of the pronoun, as in the
‘no-antecedent’ conditions in (4c) and (4d), displayed longer reading times in the
post-pronoun region (corresponding to the two words immediately following the pronoun)
than in sentences in which there was a corresponding gender match, as in (4a) and (4b),
which was regarded as evidence that gender information is automatically used in
antecedent resolution. The proper names used in this study were not morphologically
marked for gender but were assumed to have conventional gender (Carreiras et al, 1996).
Other studies have revealed that definitional gender (e.g. king) and even stereotypical
gender (e.g. minister) influence antecedent resolution of reflexives (Kreiner, Sturt &
Garrod, 2008). Likewise, Cacciari et al (1997) highlight the importance of morphological
gender information in anaphor resolution in Italian, in which the gender of the antecedent
is explicitly marked and thereby aids referential resolution.
The key finding in Badecker and Straub’s (2002) Experiment 1 was that reading times in the
post-pronoun region were longer in multiple-match conditions, in which both the
grammatically-accessible referent and the grammatically-inaccessible referent matched
the morphological gender of the pronoun, as in (4a), than in single-match conditions, in
which only the grammatically-accessible referent matched the pronoun’s gender, as in (4b).
This difference was termed the ‘multiple-candidate effect’. Similar reading time differences
were found for reflexives governed by Principle A, and also for manipulations of
morphological number. The initial-filter model proposes that grammatically-inaccessible
referents are immediately discarded from the candidate antecedent set; however, Badecker
and Straub’s (2002)
results indicate that content-based properties of grammatically-inaccessible antecedents
affect antecedent resolution at least initially. The constraints of binding theory were
clearly not being imposed as an initial-filter on the candidate antecedent set. Instead, all
referents were receiving activation according to the congruence of their content-based
properties, in accordance with the predictions of the interactive-parallel constraint model.
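The contrast between the two models can be made concrete with a small sketch. The Python below is purely illustrative: the constraint weights, the function names, and the simple additive scoring are assumptions introduced here, not parameters reported by Badecker and Straub (2002) or Gernsbacher (1989). It shows only that, for (4a), a parallel-constraint scheme leaves the grammatically-inaccessible referent with some activation, whereas an initial filter removes it from the candidate set before gender is consulted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Referent:
    name: str
    gender: str        # "m" or "f"
    accessible: bool   # True if binding theory permits coreference


# Hypothetical weights, chosen only for illustration.
W_BINDING = 1.0   # facilitation for a grammatically-accessible position
W_GENDER = 0.5    # facilitation for matching the pronoun's gender


def initial_filter(referents, pronoun_gender):
    """Binding theory restricts the candidate set first; gender is checked within it."""
    candidates = [r for r in referents if r.accessible]
    return [r.name for r in candidates if r.gender == pronoun_gender]


def parallel_activation(referents, pronoun_gender):
    """Every constraint contributes to each referent's activation simultaneously."""
    return {
        r.name: W_BINDING * r.accessible + W_GENDER * (r.gender == pronoun_gender)
        for r in referents
    }


# (4a) John thought that Bill owed him ... (multiple match)
sentence_4a = [Referent("John", "m", True), Referent("Bill", "m", False)]
# (4b) John thought that Beth owed him ... (single match)
sentence_4b = [Referent("John", "m", True), Referent("Beth", "f", False)]

for label, refs in (("4a", sentence_4a), ("4b", sentence_4b)):
    print(label, initial_filter(refs, "m"), parallel_activation(refs, "m"))
```

Under these assumed weights, the filter returns the same single candidate for (4a) and (4b), while the parallel scheme assigns the inaccessible referent a non-zero activation only in (4a), which is the kind of competition the multiple-candidate effect is taken to reflect.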
Sturt (2003) examined reflexives governed by Principle A using eye-tracking
methodology and similarly reported evidence of content-based properties affecting
antecedent resolution. Participants experienced processing difficulty when
grammatically-accessible antecedents mismatched the morphological gender of the
reflexive. Crucially, the gender of the grammatically-inaccessible discourse-focused
antecedent also affected antecedent resolution, although this effect appeared substantially
later in processing than the effects of the grammatically-accessible antecedent, arising
only in second-pass and regression-path times, which was taken as evidence that this
effect was not part of the initial process of binding. Sturt (2003) proposed that binding
theory acts as an initial-filter on antecedent resolution, but may be overridden at a later
stage in favour of other non-structural constraints, thereby acting as a ‘defeasible filter’.
The memory load hypothesis
Badecker and Straub (2002) posited that the multiple-candidate effect might not
be due to content-based properties of the grammatically-inaccessible referent affecting
antecedent resolution, but instead may be due to an earlier effect imposed by
comprehending material prior to the pronoun. The effect might have arisen from the basic
difficulty of encoding two referents of the same gender as compared to encoding two
more easily distinguishable referents of different genders. Within the discourse
representation, different gender referents are supposedly more distinctly represented
compared to referents of the same gender (Garnham & Oakhill, 1985), a finding based on
the idea that it is easier to build and maintain a discourse representation in working
memory when the discourse referents within it are more distinguishable (Just &
Carpenter, 1992). Memory demands are indeed directly related to sentence
comprehension difficulty (Van Dyke & McElree, 2006). Encoding referents of the same
gender might result in a type of similarity-based interference within the discourse
Danks (1986) and Magliano, Graesser, Eymard, Haberlandt and Gholson (1993)
claim that self-paced word-by-word reading tasks induce an artificial buffering strategy in
which participants demonstrate a tendency to traverse text at a rate that outpaces their
comprehension of complex material, which is instead subsequently processed towards the
end of the clause. In self-paced reading tasks, reading is constrained by the requirement
of response key pressing; the response keys cannot be pressed as fast as the eyes can
move and material can be comprehended. Consequently, readers invoke an artificial strategy of
rapidly pressing the response key and delaying processing of the material until an
appropriate moment arises (Danks, 1986). Badecker and Straub (2002) proposed that the
basic memory problems arising from the added complexity of encoding two referents of
the same gender might be delayed by the constraints of the self-paced reading task,
thereby resulting in the reading time differences observed in their original experiment.
Accordingly, similar differences ought to occur if the pronoun is replaced by another
proper noun and the number of gender congruent proper nouns within each sentence is
manipulated, as in (6a) and (6b). For ease of reference, this
prediction will henceforth be referred to as the ‘memory load hypothesis’.
(6a) John thought that Beth owed Jim another opportunity to solve the problem.
(6b) John thought that Bill owed Jim another opportunity to solve the problem.
If the multiple-candidate effect can instead be explained by the memory load
hypothesis, then this finding will no longer offer support for the interactive-parallel
constraint model. In fact, the findings in Badecker and Straub’s (2002) Experiment 2
revealed a reading time difference only in the sentences containing the pronoun, as in (4a)
and (4b), thereby replicating the multiple-candidate effect. Manipulating the gender, and
thereby the distinguishability, of the proper nouns in (6a) and (6b) failed to produce the
processing-load effects predicted by the memory load hypothesis, which led to
the conclusion that the reading time differences found do reflect antecedent resolution
processes. Content-based properties of both grammatically-accessible and grammatically-
inaccessible referents can affect antecedent resolution, thus supporting the interactive-
parallel constraint model. Binding theory is apparently not acting as an initial-filter;
congruent properties of grammatically-inaccessible referents are allowing for such
referents to be included in the initial candidate antecedent set.
The present study
There is a methodological concern with Badecker and Straub’s (2002) Experiment
2 that must be considered before disregarding the memory load hypothesis entirely. If
participants are particularly sensitive to sentence complexity, it is feasible that the
materials used in their experiment were simply unsuitable for revealing differences in
encoding difficulty. If adding referents to the discourse representation increases memory
load, then adding a third referent in place of the pronoun can only add to that memory
load and increase any effects of similarity-based interference. The sentences exemplified
in (6a) and (6b) contrast three referents of congruent gender with three referents of
mixed gender. However, since gender is dichotomous, the sentences containing
referents of different genders are necessarily biased towards one gender (e.g. two males
and one female). There may be an inherent difference between these conditions and the
pronoun conditions, which contain only two referents whose genders can therefore
be equally distributed. It is possible that the added referent has breached some memory
load threshold in the materials in (6a) and (6b). Indeed, an examination of the data in
Badecker and Straub’s (2002) Experiment 2 reveals that reading times in the critical
region (the two words following the third proper name) in the proper name sentences in
(6a) and (6b) are numerically similar to reading times in the corresponding
region in the multiple match pronoun sentences in (4a), all of which are longer than those
in the single match sentences in the pronoun condition in (4b).
An alternative idea that might not breach the suggested threshold would be to
replace the definite pronoun with a non-referential pronoun that does not introduce a new referent
into the discourse, such as the indefinite pronoun everyone, which does not require an
antecedent and furthermore is gender-neutral. This would then allow a gender-
manipulation of the prior two proper names in the sentence comparable to that in the
referential pronoun conditions in (4a) and (4b); the indefinite pronoun cannot contribute
to the distinguishability of the referents encoded within the discourse representation since
it bears no gender. If the interactive-parallel constraint model is accurate, and the
differences in reading times found in Badecker and Straub’s (2002) experiments are
attributable to the multiple-candidate effect arising from antecedent resolution processes,
then reading time differences should only occur between the definite pronoun sentences.
However, if the memory load hypothesis is correct, and if the additional complexity of
encoding two referents of the same gender in the discourse representation is contributing
to the difference in reading times found in Badecker and Straub’s (2002) experiments,
then reading times should be longer in sentences containing two referents of the same
gender in both the definite pronoun conditions and the indefinite pronoun conditions.
METHODS
Participants
Forty undergraduate students at the University of Edinburgh participated in the
experiment, with no reported history of language or reading disorders. All were native
speakers of English and had normal or corrected-to-normal vision.
Design
The experiment consisted of a word-by-word self-paced reading task, followed by
a post-sentential probe-recognition task and intermittent yes-no comprehension questions.
These secondary tasks were included in order to test reading comprehension, and to
encourage participants’ full attention to the content of the sentences. The experiment
followed the same structure as Badecker and Straub’s (2002) Experiment 2.
The main experimental task had four conditions, organised into a 2 x 2 design
with the within-subjects factors pronoun type (definite vs. indefinite pronoun) and match
type (single vs. multiple match). Example sentences from one complete set are shown in
(7a) to (7d). The full set of experimental items adheres to this basic structure, which is
based on the structure used in Badecker and Straub’s (2002) Experiment 2.
(7a) definite pronoun condition, single match
Barbara_i assumed that Michael saw her_i disappear quietly from the party.
(7b) definite pronoun condition, multiple match
Barbara_i assumed that Melissa saw her_i disappear quietly from the party.
(7c) indefinite pronoun condition, single match
Barbara assumed that Michael saw everyone disappear quietly from the party.
(7d) indefinite pronoun condition, multiple match
Barbara assumed that Melissa saw everyone disappear quietly from the party.
Sentences (7a) and (7b) constitute the definite pronoun conditions, containing the
referentially-dependent definite pronoun her/him bound to an antecedent located within
the sentence. Principle B of binding theory states that a pronoun must not be bound
locally within its governing category (the constraint corresponding approximately here to
the embedded clause it is located within). Accordingly, the antecedent of the definite
pronoun her is the grammatically-accessible (non-local) matrix subject Barbara, as
opposed to the grammatically-inaccessible (local) embedded subject Michael or Melissa.
Match conditions were varied by manipulating the gender congruency of the
grammatically-inaccessible proper name with the morphological gender of the definite
pronoun. In the single match condition, only the gender of the grammatically-accessible
proper name matched the morphological gender of the definite pronoun. In the multiple
match condition, both the gender of the grammatically-accessible proper name and the
grammatically-inaccessible proper name matched the morphological gender of the
definite pronoun. Both the interactive-parallel constraint model and the memory load
hypothesis predicted slower reading times in the multiple match condition, thus the
multiple-candidate effect found by Badecker and Straub (2002) was expected to arise.
The interactive-parallel constraint model assumes that both binding theory and gender
congruency affect antecedent resolution in parallel, and thereby the gender of the
grammatically-inaccessible proper name was predicted to affect antecedent resolution,
resulting in slower reading times. The memory load hypothesis assumes that two
discourse referents of the same gender are less distinguishable than referents of different
gender, and thereby predicted difficulty in encoding and maintaining the discourse
representation, which would be demonstrated by slower reading times.
Sentences (7c) and (7d) comprised the indefinite pronoun conditions, containing
the non-referential indefinite pronoun everyone/everybody. Indefinite pronouns do not
refer to any specific entity and thereby require no antecedent to be bound to;
consequently, antecedent resolution does not occur in the indefinite pronoun conditions.
Match conditions were varied by manipulating the gender of the two proper names in the
sentence, thereby varying the distinguishability of the referents evoked by the discourse.
The indefinite pronouns selected were gender-unspecified, and thereby indefinite
pronoun match conditions were equivalent to the definite pronoun match conditions in
terms of the number of gender congruent referents. The multiple match condition
contained two proper names of the same gender, and the single match condition
contained two proper names of different genders. The interactive-parallel constraint
model predicted no difference in reading time between the single and multiple match
conditions, since no antecedent resolution is required. The memory load hypothesis
predicted that the multiple match condition would induce processing difficulty, resulting
in slower reading times, since the discourse referents within the discourse representation
are less distinguishable when they match in gender, and thus encoding the referents and
maintaining the discourse representation ought to be more difficult.
The main experiment consisted of 24 experimental sentences (see appendix A),
four variants of each corresponding to the four conditions described in (7a) to (7d),
combined with 77 filler sentences of varying syntactic structures and lengths. Within the
experimental sentences, the morphological gender of the pronoun was balanced across
the definite pronoun condition items; half of the sets contained the feminine version of
the third person singular objective pronoun her, and half contained the masculine version
him. The sentences were constructed so that the definite pronoun her could not be
interpreted as a possessive pronoun referring to the embedded subject proper name. The
realisation of the indefinite pronoun was balanced across the indefinite pronoun condition
experimental items; half of the sets featured everyone, and half featured everybody. The
morphological gender of the definite pronoun controlled the order of the gender of the
preceding proper names within the sentence in single match conditions; grammatically-
accessible proper names were gender congruent and grammatically-inaccessible proper
names were gender-incongruent. In multiple match conditions both the proper names
matched the definite pronoun’s morphological gender. The order and selection of the
proper names in the corresponding indefinite pronoun conditions was identical.
Similar to Badecker and Straub’s (2002) materials, proper names were selected as
opposed to full noun phrases, since proper names denote referents that are more
prominently represented within the discourse representation (Sanford, Moar, & Garrod,
1988). Selection of male and female gendered proper names was largely based on
intuition, excluding uncommon or gender-ambiguous names. Proper names in English are
not morphologically marked for gender; however, whether they are male or female is
generally considered to be an inherent property of the proper name itself (Carreiras et al.,
1996). English proper names are thereby assumed to be conventionally gendered. For
each experimental set, all proper names were matched by letter length, and the gender-
manipulated male and female proper names in the embedded subject position were
matched by letter onset in the single and multiple match conditions.
The secondary tasks consisted of a post-sentential probe recognition task and
intermittent comprehension questions. Both tasks required a “yes” or “no” response,
and correct responses were balanced equally between the two. A single probe word was
created for each experimental set (see appendix A) and, where appropriate, a single
comprehension question was created for each of the experimental sets. Pronouns and
proper names were never selected as probe words or as the focus of the comprehension
questions. “Yes” probes were selected equally from content words located in sentence-
initial, sentence-medial, and sentence-final positions to avoid any potential cuing. “No”
probes were either semantic associates of content words in the sentence, morphological
neighbours of content words in the sentence, or were both semantically and
morphologically unrelated to words in the sentence. Comprehension questions were
developed for one quarter of the experimental and filler sentences. Correct
responses never required antecedent resolution within the sentence, so neither the proper
names nor the pronouns were highlighted as being of interest in the experiment.
The 24 experimental sentences were divided into four lists using a Latin-square
counterbalanced design; each list contained only one condition of each item, and all
conditions were equally represented. Each list was combined with the 77-filler item set
and pseudo-randomized so that no two experimental sentences occurred consecutively.
Participants were divided equally between the four lists.
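
The list construction described above can be sketched in code. This is a minimal illustration only: the rotation rule, the condition labels, and the gap-based way of keeping experimental sentences apart are assumptions, since the text does not state how the lists were actually generated.

```python
import random
from collections import Counter

# Hypothetical labels for the four conditions (7a) to (7d).
CONDITIONS = ["definite-single", "definite-multiple",
              "indefinite-single", "indefinite-multiple"]

def latin_square_lists(n_items=24, conditions=CONDITIONS):
    """Build one list per condition; list k shows item i in condition (i + k) mod n."""
    n = len(conditions)
    return [
        [(item, conditions[(item + k) % n]) for item in range(n_items)]
        for k in range(n)
    ]

def interleave(experimental, n_fillers=77, rng=None):
    """Put experimental trials into distinct gaps between fillers so none are adjacent."""
    rng = rng or random.Random()
    trials = list(experimental)
    rng.shuffle(trials)
    fillers = list(range(n_fillers))
    rng.shuffle(fillers)
    # There are n_fillers + 1 gaps: before, between, and after the fillers.
    gaps = set(rng.sample(range(n_fillers + 1), len(trials)))
    sequence = []
    for gap in range(n_fillers + 1):
        if gap in gaps:
            sequence.append(("experimental",) + trials.pop())
        if gap < n_fillers:
            sequence.append(("filler", fillers[gap]))
    return sequence

lists = latin_square_lists()
order = interleave(lists[0], rng=random.Random(1))
```

Each of the four lists contains every item once and six items per condition, and each item appears in all four conditions across lists. Because at most one experimental trial is placed in any gap, at least one filler always separates two experimental sentences.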
Participants were tested individually in a small and quiet room, each testing
session lasting approximately 20 minutes. The experiment was run on an Ergo Preceptor
3 laptop model N-30N3 using DMDX software (Forster & Forster, 2003). Participants
were presented with a set of written instructions informing them that they would be
partaking in a study investigating how people read different types of sentences.
Participants read each sentence one word at a time in a self-paced manner and responded
to probe words and comprehension questions. Each sentence was initially presented as a
series of white-lines on a black screen. Clicking the mouse-button initiated each trial, and
participants’ repeated clicks of the mouse button resulted in successive word-by-word
presentation of the sentence. Words appeared in a white size 12 Courier font on a black
background. Participants used their dominant hand for mouse clicks, and the response
time for each click was recorded as the reading time of the corresponding word.
Immediately following the presentation of each sentence, probe words appeared in
isolation in the centre of the screen, and participants pressed the left- or right-hand mouse
button for “yes” or “no” responses respectively (corresponding to the locations of a Y and
N on screen), to indicate whether they thought the probe was present in the just-read
sentence. On one quarter of the trials comprehension questions followed the probe words,
and a similar “yes” or “no” response was required. On trials without a comprehension
question, the experiment immediately continued to the initial presentation screen of the
following sentence. Feedback was not given in response to any of the tasks. Response
times in the probe recognition task and comprehension task were recorded.
Two breaks were included in the experiment which participants could utilise if
required. Importantly, participants were instructed to read the sentences at a quick,
comfortable pace, but with care so as to be able to respond accurately on the question
tasks following each sentence.
Several different types of data were collected in this experiment. Reading time
data and probe recognition and comprehension question data were compiled. Probe words
and comprehension questions were included in the experiment to encourage participants’
understanding of and attention to the content of the experimental materials. Using Badecker
and Straub’s (2002) criteria, only data from participants scoring above 80% accuracy on
comprehension and probe questions were included in the analyses. In this case, all
participants attained greater than 80% accuracy, and consequently all collected data were
included. Comprehension questions were only present in one quarter of the experimental
trials, and thus were not included in further analyses. Probe word accuracy and response
times were included in the main analyses.
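
The inclusion criterion can be expressed as a short filter. The sketch below assumes that "above 80%" is a strict inequality and uses made-up responses for three hypothetical participants; it is not the scoring procedure actually used.

```python
def accuracy(responses):
    """Proportion of correct responses; `responses` is a list of booleans."""
    return sum(responses) / len(responses)

def included_participants(responses_by_participant, threshold=0.80):
    """Keep participants whose accuracy is strictly above the threshold."""
    return [p for p, r in responses_by_participant.items() if accuracy(r) > threshold]

# Made-up responses for illustration only (True = correct).
example = {
    "P01": [True] * 9 + [False],      # 90% correct
    "P02": [True] * 8 + [False] * 2,  # exactly 80%, excluded under a strict ">" rule
    "P03": [True] * 10,               # 100% correct
}
kept = included_participants(example)
```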
Condition means and standard errors of reading times for all words are reported.
Repeated measures analyses of variance with the factors pronoun type and match type
were computed for single words and for planned-text groupings, specifically the two
words following the pronoun on which past research has typically shown a spill-over
effect of antecedent resolution (e.g. Badecker & Straub, 1994; Badecker & Straub, 2002).
The ANOVAs were based on means computed for each participant (F1) and each item
(F2) in each condition. All statistical tests are reported at the p < 0.05 level, unless
otherwise indicated. Planned comparisons were carried out on any significant
interactions, comparing single and multiple match conditions within the definite pronoun
conditions and the indefinite pronoun conditions.
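
As an illustration of the F1/F2 logic, the sketch below averages trial-level reading times per condition for each participant (or item) and computes the F ratio for a one-degree-of-freedom effect. It relies on the fact that, in a fully within-unit 2 x 2 design, each effect's F(1, n - 1) equals the squared one-sample t of the corresponding contrast scores. The field names, condition labels, and data are hypothetical; this is not the analysis script used in the study.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev

def condition_means(trials, unit):
    """Average RTs per (pronoun, match) cell for each participant (F1) or item (F2)."""
    cells = defaultdict(lambda: defaultdict(list))
    for t in trials:
        cells[t[unit]][(t["pronoun"], t["match"])].append(t["rt"])
    return {u: {c: mean(rts) for c, rts in by_cell.items()} for u, by_cell in cells.items()}

# Contrast weights over the four cells of the 2 x 2 design.
MATCH_EFFECT = {("def", "single"): -1, ("def", "multiple"): 1,
                ("indef", "single"): -1, ("indef", "multiple"): 1}
INTERACTION = {("def", "single"): -1, ("def", "multiple"): 1,
               ("indef", "single"): 1, ("indef", "multiple"): -1}

def contrast_f(means_by_unit, weights):
    """F(1, n - 1) for a one-df within-unit effect, computed as a squared one-sample t."""
    scores = [sum(w * cell[c] for c, w in weights.items()) for cell in means_by_unit.values()]
    n = len(scores)
    t = mean(scores) / (stdev(scores) / sqrt(n))
    return t * t, (1, n - 1)

# Made-up reading times (ms) for three hypothetical participants, one trial per cell.
cell_order = [("def", "single"), ("def", "multiple"), ("indef", "single"), ("indef", "multiple")]
rts = {"A": (400, 410, 400, 420), "B": (400, 420, 400, 430), "C": (400, 400, 400, 410)}
trials = [{"participant": p, "item": i, "pronoun": pr, "match": m, "rt": rt}
          for p, values in rts.items()
          for i, ((pr, m), rt) in enumerate(zip(cell_order, values))]
f1, df1 = contrast_f(condition_means(trials, "participant"), MATCH_EFFECT)
```

Passing `"item"` instead of `"participant"` to `condition_means` gives the by-items (F2) aggregation, provided every item contributes data to all four cells, as it does across the four lists of a Latin-square design.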
For the purpose of discussion and clarification only, the sentence was divided into
the following regions:
Word 1. Barbara (initial region)
Word 2. assumed (initial region)
Word 3. that (pre-critical region)
Word 4. Melissa/Michael (pre-critical region)
Word 5. saw (pre-critical region)
Word 6. her/everyone (pronoun region)
Word 7. disappear (critical region)
Word 8. quietly (critical region)
Word 9. from (post-critical region)
Word 10. the (post-critical region)
Word 11. party (post-critical region)
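
The region scheme listed above amounts to a simple mapping from word position to region label. The sketch below is illustrative only; the function name is hypothetical, and it assumes that every word after word 8 belongs to the post-critical region.

```python
def region(word_position):
    """Map a 1-based word position to the region labels used in the list above."""
    if word_position < 1:
        raise ValueError("word positions start at 1")
    if word_position <= 2:
        return "initial"
    if word_position <= 5:
        return "pre-critical"
    if word_position == 6:
        return "pronoun"
    if word_position <= 8:
        return "critical"
    return "post-critical"

sentence = "Barbara assumed that Melissa saw her disappear quietly from the party".split()
labelled = [(word, region(i)) for i, word in enumerate(sentence, start=1)]
```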
The initial region contained the matrix subject (the grammatically-accessible
antecedent in the definite pronoun conditions) and the matrix verb, and was not included
in the analyses. It was assumed that reading times would not differ in this region since the
lexical material was identical in all conditions. Analyses were carried out on all regions
from the pre-critical region onwards. The pre-critical region comprised the words from the
onset of the complement clause up to the word preceding the pronoun. The lexical material did not
differ in this region, with the exception of the gender manipulation of the embedded
subject proper name. This region was thus included in the analyses in case any effects of
encoding multiple referents predicted by the memory load hypothesis arose immediately.
The pronoun region consisted of either the definite pronoun her/him or the indefinite
pronoun everyone/everybody. It was predicted that reading time differences would not
occur in this region, unless the processing difficulty of the multiple-candidate effect
appeared immediately, although previous findings suggest that the effects of processing
referentially-dependent pronouns are somewhat delayed (e.g. Badecker & Straub, 2002;
Ehrlich & Rayner, 1983). The critical region contained the two words immediately
following the pronoun. It was expected that reading time differences would arise in this
region, consistent with either the multiple-candidate effect or the memory load hypothesis.
The post-critical region consisted of all remaining words in the sentence. This region was
included in case there was any effect spill-over from the prior critical region. Results
from within these established regions will now be discussed.
As expected, there were no reading-time differences between conditions in the pre-critical
region, which differed between conditions only in the gender of the embedded subject proper
name. For word 3 there was no
effect of pronoun type or match type (all F’s < 1), and no interaction between the two
(F1(1,39) = 1.027, p > 0.05; F2(1,23) = 1.200, p > 0.05). For word 4 (the embedded
subject) there was no effect of pronoun type (F1(1,39) = 3.992, p > 0.05; F2(1,23) =
2.105, p > 0.05), no effect of match type (both F’s < 1), and no interaction between the
two (both F’s < 1). For word 5 there was no effect of pronoun type or match type, and no
interaction between the two (all F’s < 1).
From the pronoun region onwards one might expect some effects to arise since the
sentences differ in their content depending on condition. Word 6 was either the definite
pronoun her/him or the indefinite pronoun everyone/everybody. For word 6 there was a
significant main effect of pronoun type both by-participants (F1(1,39) = 6.806, p < 0.05)
and by-items (F2(1,23) = 7.562, p < 0.05). Response times were faster in the definite
pronoun conditions (single match = 417.50 ms; multiple match = 420.63 ms) than in the
indefinite pronoun conditions (single match = 460.63 ms; multiple match = 463.67 ms).
Match type was not significant, and there was no interaction between pronoun type and
match type (all F’s < 1).
Reading-time differences were expected to arise in the critical region. Planned-text
groupings of the critical region, combining word 7 and word 8, revealed no effect of
pronoun type or match type, and no interaction between the two (all F’s < 1). Analyses
were also carried out on the two words separately. For word 7 there was no effect of
pronoun type (F1(1,39) = 1.050, p > 0.05; F2 < 1). The effect of match type was not
significant (both F’s < 1), and there was no interaction between pronoun type and match
type (F1(1,39) = 1.233, p > 0.05; F2 < 1). For word 8 the effects of pronoun type and
match type were not significant, and there was no interaction between the two (all F’s <
1). Standard errors in this region were very large, indicative of much variance in the
reading times (see Figures 1 and 2, and appendix B).
Figure 1. Word-by-word mean reading times in milliseconds for the single match and
multiple match sentences within the definite pronoun condition. Error bars indicate
standard errors of participant means.
[Figure 1 plot: Definite Pronoun Conditions; series: Single Match Condition, Multiple Match Condition; y-axis: Reading Time (ms)]
Figure 2. Word-by-word mean reading times in milliseconds for the single match and
multiple match sentences within the indefinite pronoun condition. Error bars indicate
standard errors of participant means.
[Figure 2 plot: Indefinite Pronoun Conditions; series: Single Match Condition, Multiple Match Condition; y-axis: Reading Time (ms)]
In the post-critical region, for word 9 there was no effect of pronoun type or match type
(all F’s < 1), and no
interaction between the two (F1 < 1; F2(1,23) = 1.222, p > 0.05). For word 10 there was
no effect of pronoun type (both F’s < 1), and the effect of match type was not significant
(F1(1,39) = 1.538, p > 0.05; F2(1,23) = 1.073, p > 0.05). The interaction between the two
was not significant by-participants (F1(1,39) = 1.945, p > 0.05), but was marginally
significant in the by-items analysis (F2(1,23) = 3.562, p = 0.072).
Two planned comparisons were carried out on the reading times for word 10 to
examine the predictions of the multiple-candidate effect and the memory load hypothesis
in the two pronoun conditions. The first comparison compared the reading times for the
single and multiple match conditions within the definite pronoun condition, exemplified
in (7a) and (7b). Reading times in the definite pronoun conditions were 43.17 ms longer
in the single match condition than in the multiple match condition, as shown in Figure 1,
a difference that was significant by-participants and marginally significant by-items (single
match = 424.88 ms; multiple match = 381.71 ms; t1(39) = 2.157, p < 0.05; t2(23) = 2.035,
p = 0.054). This is opposite to the predictions of both the multiple-candidate effect
and the memory load hypothesis. The second comparison compared the reading times for
the single and multiple match conditions within the indefinite pronoun sentences,
exemplified in (7c) and (7d). There was no difference between the reading times in this
condition (single match = 387.29 ms; multiple match = 396.75 ms; t1(39) = -0.365, p >
0.05; t2(23) = -0.437, p > 0.05).
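
The planned comparisons are paired-samples t-tests on condition means. The sketch below first checks the reported 43.17 ms difference from the two reported means, then computes a paired t statistic on made-up per-participant values. The helper name and the example data are hypothetical, and no p-value is computed.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(condition_a, condition_b):
    """Paired-samples t statistic and degrees of freedom for two equal-length lists."""
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1

# Reported word 10 means in the definite pronoun conditions (ms).
single_match, multiple_match = 424.88, 381.71
difference = round(single_match - multiple_match, 2)

# Made-up per-participant means (ms), for illustration only.
single = [430, 450, 410, 440]
multiple = [400, 410, 400, 410]
t, df = paired_t(single, multiple)
```

With real data, `single` and `multiple` would each hold one mean per participant (for t1) or per item (for t2).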
Only 23 items were included in the analysis for word 11, since one of the
sentences contained only 10 words. For word 11 there was no effect of pronoun type
(both F’s < 1). The effect of match type was significant by-participants (F1(1,39) = 7.383,
p < 0.01), and was marginally significant by-items (F2(1,22) = 2.955, p = 0.100). Reading
times were consistently faster in the multiple match conditions (definite pronoun =
451.30 ms; indefinite pronoun = 454.04 ms) than in the single match conditions (definite
pronoun = 507.09 ms; indefinite pronoun = 506.22 ms). This result is contrary to the
predictions of the memory load hypothesis. There was no interaction between pronoun
type and match type (both F’s < 1).
Figure 3. Mean response times in milliseconds for the probe recognition task for each
condition. Error bars indicate standard errors of participant means.
[Figure 3 plot: Probe Word Response Times; series: Single Match, Multiple Match; x-axis: Definite Pronoun, Indefinite Pronoun; y-axis: Response Time (ms)]
All participants scored greater than 80% accuracy on the probe words. For probe
word accuracy there was no effect of pronoun type or match type, and no interaction
between the two (all F’s < 1). For the probe word response times there was no effect of
pronoun type (both F’s < 1). The effect of match type was significant both by-participants
(F1(1,39) = 4.382, p < 0.05) and by-items (F2(1,23) = 5.518, p < 0.05). Response times
were consistently faster in the single match conditions (definite pronoun = 1306.63 ms;
indefinite pronoun = 1359.42 ms) than in the multiple match conditions (definite pronoun
= 1443.75 ms; indefinite pronoun = 1417.71 ms), as shown in Figure 3, which is the
direction predicted by the memory load hypothesis. There was no interaction between
pronoun type and match type (both F’s < 1).
The results of this experiment are somewhat complex, and therefore each finding
will be examined in turn. Results are discussed in terms of effects found in early and late
processing of the sentence (in terms of reading times on early and late words in the
sentence). Effects found late within the sentence are assumed to be related to processing
earlier material, given that the content following the pronoun did not differ between
conditions. Since a word-by-word analysis was conducted, the results of particular words
will be examined according to their sentence-location. Examining effects in early
processing first, this experiment did not explicitly replicate the multiple-candidate effect
found by Badecker and Straub (1994, 2002). There were no immediate differences
between the processing of the definite pronoun in sentences in which both the
grammatically-accessible and grammatically-inaccessible referents match the
morphological gender of the definite pronoun (multiple match), and sentences in which
only the grammatically-accessible antecedent matches the morphological gender of the
definite pronoun (single match), as demonstrated by the lack of effects found either in the
pronoun region or the critical region. Likewise, no effects were found on these regions
within the indefinite pronoun conditions, thereby offering no support for the memory load
hypothesis. There were no differences in processing the gender congruent or gender
incongruent second proper name in the pre-critical region either, indicating that effects of
the memory load hypothesis did not arise at least in early processing. Reading times on
the pronoun region within the definite pronoun conditions were faster than those within
the indefinite pronoun conditions, an effect attributable to the fundamental difference in
word length between the definite pronouns her/him and the indefinite pronouns
everyone/everybody (Just, Carpenter, & Woolley, 1982).
Two somewhat conflicting effects in processing arose late in the sentence. Firstly,
reading times of words located within the post-critical region in a sentence containing
two proper names of different gender (single match) were longer than in a sentence
containing two proper names of the same gender (multiple match). This effect was initially
restricted to the definite pronoun condition on word 10, but extended to the
indefinite pronoun condition on word 11, which indicates that the effect is not
necessarily related to antecedent resolution. Secondly, the opposite effect was found on
response times in the probe recognition task, whereby probe responses were longer when
the sentence to be checked for correspondence contained two proper names of the same
gender (multiple match) than two proper names of different gender (single match). This
effect was not related to response accuracy, and thereby seems indicative of a delay in
processing. The first effect seems to oppose the memory load hypothesis, whereas the
second effect is in accordance with its predictions.
The results of this experiment appear to suggest that early processing of the
referentially-dependent definite pronoun is not affected by content-based properties of the
grammatically-inaccessible antecedent. In either the pronoun region or the subsequent
critical region there was no difference in reading times between sentences in which the
grammatically-inaccessible referent matched or mismatched the morphological gender of
the definite pronoun. Participants did not seem to have more difficulty processing a
sentence in which the two proper names matched the gender of the definite pronoun;
hence the multiple-candidate effect was not replicated in this experiment. However, it
does not follow that content-based properties of the grammatically-inaccessible referent
did not affect processing at all, since there are later effects indicating that the gender of
the grammatically-inaccessible referent was in fact affecting resolution. It would therefore be
inappropriate to reject the interactive-parallel constraint theory based on this finding
alone. Nevertheless, the grammatically-inaccessible referent did not appear to influence
antecedent resolution in early processing at least. Furthermore, the lack of effects found
in early processing in both the definite and indefinite pronoun conditions provides little
support for the memory-load hypothesis.
Before forming any assumptions regarding whether binding theory is constraining
initial antecedent resolution, it is worth highlighting that, although not significant, there is
a numerical difference in the reading times from the definite pronoun condition in the
critical region; reading times were faster in single match than multiple match conditions.
This difference is in the direction that would be expected by the multiple-candidate
effect; however, there is a large amount of variability in the means which may account
for the lack of significant effects. Larger participant sample sizes might reveal the effect
more strongly. Comparing the data from this experiment with Badecker and Straub’s
(2002) Experiment 2 data reveals that word-by-word reading times in this experiment
were consistently higher than those in Badecker and Straub’s (2002). Higher reading
times are generally indicative of greater variability; indeed numerically, the standard
errors are much larger in this experiment than Badecker and Straub’s (2002), which
might explain the failure to replicate the multiple-candidate effect.
On the other hand, it has been proposed that readers do not resolve anaphoric
dependencies on-line unless task demands require such a resolution (Greene, McKoon, &
Ratcliff, 1992). It is possible that readers were simply not resolving the definite pronoun
since this was a more complex process than was required for the immediate task. The
probe words and comprehension questions were constructed so as to never require
antecedent resolution. However, later effects found within the sentence that can
potentially be explained in terms of antecedent resolution processes seem to cast doubt
over this explanation.
Turning to effects in late processing, there were significant effects on several
words within the post-critical region, which consisted of all the words from the end of the
critical region to the end of the sentence. It must be pointed out that the length of this
region was not controlled for since no effects were expected to arise within it other than
potential spill-over effects from the critical region preceding it, thus it is difficult to make
any generalisations regarding the effects found on words within this region. There are
two potential explanations offered for the effects found within this region.
The initial, albeit rather weak, effect within this region was restricted to the
definite pronoun conditions. Longer reading times were found on word 10 for sentences
in which only the grammatically-accessible referent matched the morphological gender of
the definite pronoun (single match) than for sentences in which both the grammatically-
accessible and grammatically-inaccessible referents matched the definite pronoun’s
morphological gender (multiple match), an effect opposite to the predictions of the
memory load hypothesis. Similar effects have been found in other research. Using a
naming latency methodology, Rigalleau and Caplan (2000) found longer naming
latencies for pronouns in which the prior clause contained two proper names of different
genders, only one of which matched the morphological gender of the pronoun (as in the
single match condition), than when the clause contained two proper names of the same
gender, both of which matched the pronoun’s morphological gender (as in the multiple
match condition). Sturt (2003), examining reflexives using eye-tracking methodology,
found a comparable difference in reading times on the pre-final region in the sentence,
which corresponds approximately to the post-critical region in this experiment. This was
assumed to be related to sentence wrap-up effects, given that the effect occurred only in
regressions as opposed to in first-pass reading. It was proposed that discourse preferences
only affected antecedent resolution in later processing, whereby participants attempted to
bind the reflexive to the discourse-focused but grammatically-inaccessible referent, and
processing difficulty ensued only in the case of a gender mismatch. By contrast, in
the case of a gender-match, readers may have actually successfully bound the reflexive to
the grammatically-inaccessible referent, a notion supported by the greater number of
ungrammatical interpretations of such gender-match sentences in a follow-up experiment.
The first explanation thus offered for the results found in the post-critical region is
that participants may have bound the definite pronoun to a grammatically-inaccessible
antecedent in sentences in which the grammatically-inaccessible referent matched the
morphological gender of the definite pronoun (multiple match). However, in the case of a
gender mismatch in the single match condition, in which the gender of the
grammatically-inaccessible referent did not match the morphological gender of the
pronoun, further processing was required in order to bind the pronoun to the
grammatically-accessible gender-congruent referent. The grammatically-inaccessible
referent may have been more focused in the discourse representation, and thereby the
preferred antecedent, although the sentences were constructed with the aim of giving
both referents within the sentence equal salience, since both were grammatical subjects,
which are generally regarded as more prominent (Järvikivi, Van Gompel, Hyönä, &
Bertram, 2005). There is a common assumption that there is an advantage of first-
mention, whereby the first mentioned referent is easier to access from the discourse
representation than later-mentioned referents (e.g. Gernsbacher & Hargreaves, 1988;
Gernsbacher, Hargreaves, & Beeman, 1989), which seems to contradict this explanation
of salience. On the other hand, it is equally possible that the embedded subject referent
was being regarded as more salient due to the simple advantage of recency (Gernsbacher
et al., 1989). Research has found that responses to a probe word located in the second of
two clauses are faster than if the probe word is located in the first clause (Chang, 1980;
Von Eckardt & Potter, 1985). The grammatically-inaccessible referent, located in the
embedded clause, was most recently processed and may thereby be receiving the most
activation resulting in higher salience and easier access within the discourse
representation; readers may therefore initially attempt to bind the definite pronoun to it.
If this explanation is appropriate, antecedent resolution appears to have been
delayed somewhat towards the end of the sentence, arising later in processing than was
found by Badecker and Straub (2002). The delay in antecedent resolution may of course
be due to the artificial buffering strategy, as suggested earlier (Danks, 1986; Magliano et
al., 1993), whereby readers traverse text in a self-paced reading experiment at a rate faster
than they comprehend, and so delay antecedent resolution until later in the sentence. It
must be noted that the explanation just offered is particularly tentative, since the effect
arising on this word was only marginally significant. Further research, perhaps
investigating which of the referents is receiving most activation, would be required in
order to form any further conclusions. Conversely, it might be that antecedent resolution
itself was not delayed, but that it occurred in two stages. It is possible that binding theory
was acting immediately in antecedent resolution, but that the constraint was violated later
in processing, as suggested by the defeasible-filter view (Sturt, 2003). The self-paced
methodology used does not sufficiently separate effects of early and late measures; thus,
eye-tracking methodology might be more suitable for investigating this suggestion.
On the following word in the sentences (word 11), the effect just explained
expanded to include the indefinite pronoun conditions. Sentences including two proper
names of the same gender had faster reading times for word 11 than sentences including
two proper names of different genders, irrespective of whether the sentence contained a
referentially dependent pronoun. As pointed out earlier, sentence length in this region
was not controlled for, since no effects were expected other than possible spill-over
effects from the critical region. However, with the exception of a small number of
sentences, this was either the final word in the sentence or the penultimate word. Longer
reading times are generally found on end-of-sentence words, indicative of integrative
processes (Just et al., 1982). Participants appear to have experienced more difficulty
integrating material from sentences containing referents of different gender than of the
same gender, an effect contrary to the predictions of the memory load hypothesis.
However, there is an alternative, equally speculative, explanation for the effects found
within the post-critical region, that of gender-based priming. This explanation is based on
findings that the response latency to a target word can be facilitated following a gender-
congruent word as compared to a gender-incongruent word (Banaji & Hardin, 1996). It
seems reasonable to suggest that when comprehending sentences containing two proper
names of the same gender, participants’ encoding of the second gender-congruent proper
name was subject to priming, since its content-based properties matched those of the first
proper name, thereby potentially aiding encoding. Given that no baseline condition was
present in this experiment, no conclusions can be inferred as to whether the priming was
facilitative or inhibitory. Nevertheless, building and maintaining a discourse
representation containing referents that are all the same gender may well be comparably
easier than if the discourse contains referents that are both male and female, since the
gender feature primes further encoding of gender-congruent referents.
This effect of gender-based priming may either be separate to the effect found on
the previous word, or it may be that the same effect is occurring on both words, but that it
simply arose earlier within the definite pronoun conditions. Indeed, the pronoun is also
gendered and thereby matches the gender of at least one of the proper names, conceivably
contributing to the priming effect and potentially facilitating encoding. It is slightly
unclear, of course, as to why this potential priming effect arose only in later processing
rather than exhibiting priming on the actual proper names or on the gender-marked
definite pronoun. However, if an artificial buffering strategy is being used by participants
as suggested earlier, then it is plausible that processing might not take place until towards
the end of the sentence (Danks, 1986; Magliano et al., 1993).
The final effect found in this experiment occurred in the probe recognition task
that immediately followed the presentation of each sentence. Probe recognition
responses were slower when participants had to decide whether the probe was located in
a sentence containing two referents of the same gender (multiple match) than a sentence
containing two referents of different genders (single match). This effect occurred in both
the definite and indefinite pronoun conditions, so it cannot be related to antecedent
resolution processes. A type of similarity-based interference seems to have taken place.
Probe recognition accuracy was not affected; participants performed equally well in all
conditions. However, the decision of whether the probe was present in the just-read
sentence appears to have been delayed in the multiple match conditions. This finding
supports the memory-load hypothesis, although the effect arose much later than expected,
and indeed not even within the actual sentence being processed. Nevertheless,
participants did appear to be experiencing processing difficulty when the referents within
the sentence were of the same gender, and thereby less distinguishable.
One explanation is that processing whether the probe word was present in the just-
read sentence involves the same factors as initially processing the discourse and building
and maintaining the corresponding discourse representation (Garnham & Oakhill, 1985).
The self-paced reading methodology may not have been sensitive enough to reveal such
effects when participants were initially reading the sentence, potentially due to the use of
an artificial buffering strategy (Danks, 1986; Magliano et al., 1993); however, when
participants had to make a firm decision in order to respond accurately to the probe
recognition task, and they had time to do so, the effect arose. The probe words never
corresponded to either the pronoun or the proper names; however, “yes” probes were
selected equally from initial, medial, and final locations within the sentence in order to
avoid cuing, and so it is plausible that participants may have been processing the entire
sentence to establish whether the probe word was present. Half of the probes were “no”
probes and thus not located within the sentence, which further supports the idea that
sentence checking must have occurred since the effect clearly arose on these probes also.
The difficulty of such sentence checking may thus depend on “the ease with which
information can be read out of a (referentially determinate) representation that has been
constructed as the sentence was read” (Garnham & Oakhill, 1985, p. 395). It is easier to
maintain such a discourse representation when the events within it are easier to
distinguish (Just & Carpenter, 1992). Two referents of the same gender are less
distinguishable than two referents of different gender, since they cannot be identified in
terms of gender information alone, and thus re-accessing the discourse representation
may be harder and cause more processing difficulty, resulting in the longer response
times.
That this effect arose on the probe recognition task even though the task was not
directly examining the discourse referents suggests that it is perhaps a particularly
dominant effect. Warren and Gibson (2005) report a related effect examining different
types of noun phrases, finding that comprehension accuracy decreased when post-
sentential comprehension questions were about sentences containing two noun phrases
that matched in type (e.g. name and name) as opposed to differing in type (e.g.
description and name). The similarity of the noun phrases appears to have heightened
interference in processing, in a manner comparable to the way in which the similar
content-based properties of proper names in this experiment decreased the
distinguishability of the corresponding referents within the discourse representation,
thereby increasing interference, resulting in processing difficulty.
Although this is mere conjecture, if gender-based priming is the appropriate
explanation of the effects found within the post-critical region (as opposed to
antecedent resolution processes), and since support was found for the memory-load
hypothesis on probe response times, one might conclude that the multiple-candidate
effect found in Badecker and Straub's (2002) Experiment 2 was in fact
a memory-load effect, as Badecker and Straub (2002) initially
proposed. Probe response times were not given by Badecker and Straub (2002), so there
is no way of knowing whether a similar effect arose on the probe words. The change in
materials might have been sufficient to bring about the effect, supporting the suggestion
that some distinguishability threshold was breached in Badecker and Straub's
(2002) materials. If the multiple-candidate effect was in fact a memory-load effect,
it no longer offers the support it was thought to offer for the interactive-parallel
constraint model. If the word-by-word self-paced reading methodology is indeed
inducing an artificial buffering strategy that delays sentence processing, the
effects found within this study might arise earlier if a more sensitive methodology that
does not induce the strategy, such as eye-tracking, were used to investigate the same
effects. As suggested earlier, this technique would also allow much greater accuracy in
determining which effects arise in early or late processing, as could be determined by
differences in first- or second-pass reading. Alternatively, if the effects found within the
post-critical region were in fact related to antecedent resolution processes, then this offers
support for the proposal that content-based properties of grammatically-inaccessible
antecedents affect resolution, as suggested by the interactive-parallel constraint model.
Furthermore, the slower reading times in this experiment resulted in much greater
variability, which might have concealed any credible effects that could have offered
support for the multiple-candidate effect and thereby the interactive-parallel constraint
model.
In summary, this experiment failed to replicate the multiple-candidate effect. In
early processing, the content-based properties of grammatically-inaccessible antecedents
did not affect antecedent resolution. Similarly, there was no evidence supporting the
memory-load hypothesis in early processing. Effects arose in later processing
towards the end of the sentence which potentially indicate that readers were attempting to
bind the referentially-dependent pronoun to the most recently processed but
grammatically-inaccessible referent, except in the case of a gender mismatch, at which
point further processing was required. An alternative explanation offered was that
gender-based priming was occurring, whereby sentences containing referents of
congruent gender were primed and thereby easier to encode than those with referents of
incongruent gender. It is beyond the scope of this report to establish which of these
explanations is appropriate. Finally, support for the memory-load hypothesis was found
on the probe recognition task, in which responses were slower if the probe followed a
sentence containing two referents of the same gender. This is explicable if one assumes
that the same processes that occur when initially processing the sentence similarly occur
in later sentence-checking.
REFERENCES
Arnold, J.E., Eisenband, J.G., Brown-Schmidt, S., & Trueswell, J.C. (2000). The rapid use
of gender information: evidence of the time course of pronoun resolution from
eyetracking. Cognition, 76, B13-B26.
Baddeley, A. (2007). Working Memory, Thought, and Action. Oxford: Oxford University
Badecker, W., & Straub, K. (1994). Evidence that binding principles participate in a
constraint satisfaction process. Poster session presented to the Seventh Annual
CUNY Sentence Processing Conference, New York.
Badecker, W., & Straub, K. (2002). The processing role of structural constraints on the
interpretation of pronouns and anaphors. Journal of Experimental Psychology, 28,
Banaji, M.R., & Hardin, C.D. (1996). Automatic stereotyping. Psychological Science, 7,
Cacciari, C., Carreiras, M., & Cionini, C.B. (1997). When words have two genders:
Anaphor resolution for Italian functionally ambiguous words. Journal of Memory
and Language, 37, 517-532.
Carreiras, M., Garnham, A., Oakhill, J., & Cain, K. (1996). The use of stereotypical
gender information in constructing a mental model: Evidence from English and
Spanish. The Quarterly Journal of Experimental Psychology, 49A, 639-663.
Chang, F.R. (1980). Active memory processes in visual sentence comprehension: Clause
effects and pronominal reference. Memory and Cognition, 8, 58-64.
Chomsky, N. (1981). Lectures on Government and Binding. Dordrecht, Netherlands:
Clifton, C., Kennison, S.M., & Albrecht, J.E. (1997). Reading the words her, his, him:
Implications for parsing principles based on frequency and on structure. Journal
of Memory and Language, 36, 276-292.
Danks, J.H. (1986). Identifying component processes in text comprehension: Comment
on Haberlandt and Graesser. Journal of Experimental Psychology: General, 115,
Ehrlich, K. (1980). Comprehension of pronouns. Quarterly Journal of Experimental
Psychology, 32, 247-255.
Ehrlich, K., & Rayner, K. (1983). Pronoun assignment and semantic integration during
reading: Eye movements and immediacy of processing. Journal of Verbal
Learning and Verbal Behavior, 22, 75-87.
Forster, K.I., & Forster, J.C. (2003). DMDX: A Windows display program with
millisecond accuracy. Behavior Research Methods, Instruments, & Computers,
Garnham, A., & Oakhill, J. (1985). On-line resolution of anaphoric pronouns: Effects of
inference making and verb semantics. British Journal of Psychology, 76, 385-393.
Gernsbacher, M. (1989). Mechanisms that improve referential access. Cognition, 32, 99-
Gernsbacher, M., & Hargreaves, D.J. (1988). Accessing sentence participants: The
advantage of first mention. Journal of Memory and Language, 27, 699-717.
Gernsbacher, M., Hargreaves, D.J., & Beeman, M. (1989). Building and accessing
clausal representations: the advantage of first mention versus the advantage of
clause recency. Journal of Memory and Language, 28, 735-755.
Gordon, P.C., & Hendrick, R. (1997). Intuitive knowledge of linguistic co-reference.
Cognition, 62, 325-370.
Greene, S., McKoon, G., & Ratcliff, R. (1992). Pronoun resolution and discourse models.
Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 266-
Järvikivi, J., Van Gompel, R.P.G., Hyönä, J., & Bertram, R. (2005). Ambiguous pronoun
resolution: contrasting the first-mention and subject-preference accounts.
Psychological Science, 16, 260-264.
Just, M.A., & Carpenter, P.A. (1992). A capacity theory of comprehension: Individual
differences in working memory. Psychological Review, 99, 122-149.
Just, M.A., Carpenter, P.A., & Wooley, J.D. (1982). Paradigms and processes in reading
comprehension. Journal of Experimental Psychology: General, 111, 228-238.
Kreiner, H., Sturt, P., & Garrod, S. (2008). Processing definitional and stereotypical
gender in reference resolution: Evidence from eye-movements. Journal of
Memory and Language, 58, 239-261.
Lucas, M.M., Tanenhaus, M.K., & Carlson, G.N. (1990). Levels of representation in the
interpretation of anaphoric reference and instrument inference. Memory and
Cognition, 18, 611-631.
MacDonald, M.C., & MacWhinney, B. (1990). Measuring inhibition and facilitation from
pronouns. Journal of Memory and Language, 29, 469-492.
MacDonald, M.C., Pearlmutter, N.J., & Seidenberg, M.S. (1994). Lexical nature of
syntactic ambiguity resolution. Psychological Review, 101, 676-703.
Magliano, J.P., Graesser, A.C., Eymard, L.A., Haberlandt, K., & Gholson, B. (1993).
Locus of interpretive and inference processes during text comprehension: A
comparison of gaze durations and word reading times. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 19, 704-709.
McKoon, G., Ward, G., Ratcliff, R., & Sproat, R. (1993). Morphosyntactic and pragmatic
factors affecting the accessibility of discourse entities. Journal of Memory and
Language, 32, 56-75.
Nicol, J. (1988). Coreference processing during sentence comprehension. Unpublished
doctoral dissertation, Massachusetts Institute of Technology.
Nicol, J., & Swinney, D. (1989). The role of structure in coreference assignment during
sentence comprehension. Journal of Psycholinguistic Research, 18, 5-19.
Reinhart, T. (1976). The Syntactic Domain of Anaphora. Ph.D. thesis, MIT.
Rigalleau, F., & Caplan, D. (2000). Effects of gender marking in pronominal
coindexation. The Quarterly Journal of Experimental Psychology, 53A, 23-52.
Sanford, A., Moar, K., & Garrod, S. (1988). Proper names as controllers of discourse
focus. Language and Speech, 31, 43-56.
Sturt, P. (2003). The time-course of the application of binding constraints in reference
resolution. Journal of Memory and Language, 48, 542-562.
Trueswell, J.C. (1996). The role of lexical frequency in syntactic ambiguity resolution.
Journal of Memory and Language, 35, 566-585.
Van Dyke, J.A., & McElree, B. (2006). Retrieval interference in sentence
comprehension. Journal of Memory and Language, 55, 157-166.
Van Gompel, R.P.G., & Liversedge, S.P. (2003). The influence of morphological
information on cataphoric pronoun assignment. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 29, 128-139.
Van Gompel, R.P.G., & Majid, A. (2004). Antecedent frequency effects during the
processing of pronouns. Cognition, 90, 255-264.
Von Eckardt, B., & Potter, M.C. (1985). Clauses and the semantic representation of
words. Memory and Cognition, 13, 371-376.
APPENDIX A: Experimental materials
The 24 experimental sentences used in the experiment are listed below. Sentences listed
are from the single match/definite pronoun condition. Alternative proper names for the
multiple match variants and the alternative pronoun for the indefinite pronoun variants
are given in parentheses. The probe words from the probe recognition task that were
presented following each sentence are given in parentheses at the end of each complete
sentence.
1. John mentioned that Beth (Bill) cooked him (everyone) spaghetti bolognaise for dinner
that night. (Mentioned)
2. Henry explained that Jenny (Jacob) brought him (everyone) beautiful daffodils last
3. Brian complained that Helen (Harry) packed him (everyone) coleslaw sandwiches for
lunch that day. (Coleslaw)
4. Bruce insisted that Julia (Jason) sold him (everybody) inexpensive furnishings for the
5. Gordon demanded that Rachel (Robert) pay him (everyone) substantial compensation
for the accident. (Accident)
6. Timothy proposed that Miranda (Malcolm) lend him (everyone) champagne glasses for
the party. (Umbrella)
7. Donald remembered that Louise (Lionel) challenged him (everyone) somewhat
stupidly to a fight. (Remembered)
8. Roger said that Alice (Aaron) gave him (everybody) wonderful presents for Christmas
this year. (Gifts)
9. James revealed that Laura (Lewis) passed him (everybody) answers discretely in the
exam hall. (Passed)
10. Greg believed that Jane (Jack) asked him (everybody) particularly difficult questions
in the test. (Difficulty)
11. Richard explained that Jessica (Jeffrey) offered him (everyone) substantial amounts
of money as a bribe. (Money)
12. Frank stated that Sarah (Simon) sent him (everybody) important documents in the
post that day. (Cheese)
13. Samantha announced that Jonathon (Jennifer) invited her particularly enthusiastically
to the party tonight. (Announced)
14. Nancy thought that Peter (Penny) owed her (everybody) another opportunity to solve
the problem. (Trouble)
15. Anne suggested that Tony (Tina) get her (everyone) another quotation for the
building works. (Quotation)
16. Sarah admitted that Frank (Fiona) offered her (everybody) additional lessons before
the exam. (Offer)
17. Polly appreciated that Jason (Janet) called her (everyone) extremely remorsefully last
night to apologise. (Night)
18. Alison explained that Trevor (Trisha) appreciated her (everyone) asking questions
about the incident. (Prepare)
19. Angela imagined that George (Gloria) made her (everyone) delicious chocolate tarts
for dessert. (Imagined)
20. Barbara assumed that Michael (Melissa) saw her (everyone) disappear quietly from
the party. (Vanish)
21. Maggie divulged that Thomas (Teresa) loaned her thirteen thousand pounds at the
22. Sheila implied that Andrew (Amanda) told her (everyone) confidential information
about the case. (Cases)
23. Sally guessed that Tommy (Tanya) overheard her (everybody) talking loudly about
the surprise party. (Surprise)
24. Lauren requested that Steven (Sophie) advance her (everybody) several hundred
pounds of this month's pay. (Concert)
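As an illustration of how the parenthesised alternatives above expand into the four conditions, the following is a minimal sketch only. The function name, the placeholder syntax, and the condition labels are assumptions introduced for this example; it is not the procedure used to build the experiment's item files.

```python
def expand_item(template, names, pronouns):
    """Expand one item into its four condition variants.

    names    = (single-match name, multiple-match name)
    pronouns = (definite pronoun, indefinite pronoun)
    The {name}/{pronoun} placeholders are a hypothetical notation.
    """
    variants = {}
    for match, name in zip(("single", "multiple"), names):
        for ptype, pronoun in zip(("definite", "indefinite"), pronouns):
            variants[(match, ptype)] = template.format(name=name, pronoun=pronoun)
    return variants


# Item 1 from the list above, with the probe word left out.
item1 = expand_item(
    "John mentioned that {name} cooked {pronoun} spaghetti bolognaise "
    "for dinner that night.",
    names=("Beth", "Bill"),
    pronouns=("him", "everyone"),
)

for condition, sentence in item1.items():
    print(condition, sentence)
```

The listed form of each item corresponds to the ("single", "definite") key; the multiple match variants replace the second proper name with one that matches the gender of the first.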
APPENDIX B: Statistics for word reading times and probe responses
The participant means of word-by-word reading times and probe word response times in
milliseconds are given below for each condition. Standard errors of participant means are
given in parentheses.
                    Definite Pronoun Conditions         Indefinite Pronoun Conditions
                    (Figure 1)                          (Figure 2)
Word                Single Match     Multiple Match     Single Match     Multiple Match
That                416 (19)         420 (22)           423 (19)         404 (20)
Michael/Melissa     456 (35)         473 (39)           440 (25)         436 (40)
Saw                 448 (24)         432 (24)           447 (28)         454 (25)
Her/Everyone        417 (20)         420 (21)           460 (25)         463 (34)
Disappear           527 (54)         484 (36)           523 (34)         527 (36)
Quietly             530 (38)         536 (55)           524 (24)         518 (28)
From                437 (22)         452 (22)           441 (17)         441 (19)
The                 424 (27)         381 (15)           387 (17)         396 (28)
Party               507 (35)         451 (23)           506 (39)         454 (22)
PROBE               1307 (45)        1444 (75)          1359 (47)        1418 (64)
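The size of the probe effect can be read off the PROBE row by subtracting the single match mean from the multiple match mean within each pronoun type. The sketch below does only that arithmetic on the condition means reported above; the variable and function names are illustrative assumptions, and the result is descriptive rather than a substitute for the inferential tests reported in the Results.

```python
# Probe response means (ms) copied from the PROBE row of the table above.
probe_means = {
    ("definite", "single"): 1307,
    ("definite", "multiple"): 1444,
    ("indefinite", "single"): 1359,
    ("indefinite", "multiple"): 1418,
}


def match_cost(means, pronoun_type):
    """Multiple match minus single match mean (ms) for one pronoun type."""
    return means[(pronoun_type, "multiple")] - means[(pronoun_type, "single")]


costs = {p: match_cost(probe_means, p) for p in ("definite", "indefinite")}
mean_cost = sum(costs.values()) / len(costs)

print(costs)      # slowdown per pronoun type, in ms
print(mean_cost)  # unweighted average of the two slowdowns, in ms
```

Both differences are positive, which is the pattern described in the discussion: probe responses were slower after sentences containing two referents of the same gender, in the definite and the indefinite pronoun conditions alike.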