

       Prediction and embodiment in dialogue

       Martin Pickering
       University of Edinburgh
• Many researchers assume that cognition is
    “embodied” (or “grounded”) rather than
    “abstract” (e.g., Barsalou, 2008)
    – Activates representations associated with the body
      and actions
• Much of this work argues that language is
    embodied (e.g., Barsalou, 2008; Glenberg,
    2008; Zwaan & Taylor, 2006).
• Similar claims come from experimental
    psychology and from neuroscience (e.g.,
    Gallese, 2008; Pulvermüller, 2005)
Embodiment of form vs. meaning

• Embodiment of meaning
  – cf. simulation at the content level (Gallese, 2008)
  – Processing (producing or comprehending) walk involves the use
    of representations involved in the act of walking
  – Provides a component of meaning (no need to make “strong”
    claim that all meaning is embodied)
• Embodiment of form
  – cf. simulation at the vehicle level (Gallese, 2008)
  – Comprehending language involves the use of representations
    involved in the act of producing language
     • Definitionally true for producing language (of course)
Embodiment of meaning
• People’s representations of scene descriptions incorporate spatial
   perspective (e.g., Bransford & Johnson, 1973)

• Sense-judgements for a sentence involving movement (close the drawer) are
   faster if the response involves movement in the same direction (away from
   the body) than in the opposite direction (towards the body; Glenberg & Kaschak, 2002)

• Participants turned a knob to present sentences a word at a time. They
   were faster to read words (turned down) when they turned the knob in the
   direction implied by the words (anticlockwise; Zwaan & Taylor, 2006)

• MEG study showed activation of appropriate motor areas within 170ms
   (“foot area” for kick, “mouth area” for eat; Pulvermüller et al., 2005)
    – Compatible with speed of word recognition, hence seems to occur “on line”
Embodiment of form
• Listeners activate appropriate tongue/lip muscles while
  listening to speech but not non-speech (Fadiga et al.,
  1995; Watkins et al., 2003)

• Large overlap in cortical areas activated during speech
  and passive listening (Pulvermüller et al., 2006; Wilson
  et al., 2004)

• Activation of brain areas associated with production
  during aspects of comprehension from phonology (Heim
  et al., 2003) to narrative structure (Mar, 2004)
Effects of embodiment

• Consider effects on overt behaviour

• Effects of form and meaning

• Two kinds of effects
  – Overt imitation
  – Complementary responses
• Meaning:
  – Overt imitation: produce act of slapping
  – Complementary response: act of flinching
• Form:
  – Overt imitation: utter “slap” as well
  – Complementary response: utter “his face”

  – Note that imitation corresponds to the action, and the
    complementary response corresponds to the
    immediate response to that action
Imitative and complementary
activation of embodied meaning
• Already discussed standard “embodiment”
  evidence (Glenberg & Kaschak, 2002, etc.)

• But also complementary activation
  – Participants presented with words referring to small
    or large objects, and this affected hand aperture in a
    subsequent grasping task (Glover et al., 2004)
  – Also evidence of complementary activation from
     • e.g., words for graspable objects activating regions
       associated with grasping (see Martin & Chao, 2001)
Imitative and complementary
activation of form
• Imitation: evidence from “alignment”
 effects in dialogue
  – Tendency to repeat words (Brennan & Clark,
    1996), syntax (Branigan et al., 2000) – see
  – Effects are extremely rapid (Fowler et al.,
Alignment of syntax (Branigan et al., 2000)

[Figure: experimental set-up showing the participant and the confederate, a box of
cards to be described, and a box of selected cards. The confederate says:
“The chef giving the jug to the swimmer”.]
(Branigan et al., 2000, Cognition; Cleland & Pickering, 2003, JML)
 • Confederate describes card:
       The chef giving the jug to the swimmer
 • Participant selects the card that matches this description
 • Participant picks up top card from her box:


 • Participant describes card:
     “The cowboy handing …”
• Confederate says
  • Either The chef handing the cake to the swimmer
  • Or The chef handing the swimmer the cake

• Participant describes card
  • Either The cowboy handing the banana to the burglar
  • Or The cowboy handing the burglar the banana
Same vs. different verb
• 4 prime conditions (PO = prepositional object, DO = double object;
  same or different verb relative to the target verb handing):
 PO-same: The chef handing the cake to the swimmer
 DO-same: The chef handing the swimmer the cake
 PO-different: The chef giving the cake to the swimmer
 DO-different: The chef giving the swimmer the cake
The cowboy handing the banana to the burglar

[Figure: percentage of trials on which the participant says “banana to burglar”,
plotted by what the confederate says (“jug to swimmer” vs. “swimmer jug”) and by
verb (same vs. different).]
Alignment between languages
• Confederate says El taxi persigue el camión (“The taxi chases the truck”)
    - Participant tends to say The bullet hits the bottle
• Confederate says El camión es perseguido por el taxi (“The truck is chased by the taxi”)
    - Participant tends to say The bottle is hit by the bullet

• Interlocutors align on language-independent representations
    - facilitating rapid shifts between languages
• Relatedness between languages can enhance priming
    - when verbs have same meanings, when word orders are the same

(Hartsuiker et al., 2004, Psychological Science; Schoonbaert et al., 2007,
Journal of Memory and Language; Bernolet et al., 2007, JEP:LMC)
Imitative and complementary
activation of form
• Complementary activation
  – Addressees complete speakers’ contributions
     • A: and number 12 is, uh, … B: chair. (Clark & Wilkes-Gibbs,
  – People faster at naming word or picture after a
    syntactically compatible context than otherwise
    (Griffin & Bock, 1998; Tyler & Marslen-Wilson, 1977;
    Wright & Garrett, 1984)
     • As they glide gracefully over the city, flying kites ARE vs. IS
     • Are is complementary to kites (plural verb not noun)

Why do we appear to get imitative and complementary responses?
• For both form and meaning embodiment, why do we
  sometimes get imitation and sometimes
  complementary responses?
   – Functional explanation: presumably sometimes useful to imitate,
     sometimes useful to behave in a complementary fashion
   – Mechanistic explanation: suppression of one’s own responses
     after they occur (e.g., Dell, 1986; see Hartsuiker et al., 2005).
     Similarly, people may activate then if necessary suppress
     imitative responses

• But why do we get any of it at all?
   – One important purpose appears to be to aid prediction
Covert simulation
• Much evidence for motor involvement during

• Massive literature on mirror neuron system
  (e.g., activation of same neurons during
  behaviour and observation) that appears to be
  goal-directed (e.g., Rizzolatti, Gallese, Iacoboni)

• Interference between moving arm and watching
  other person’s arm movement but not robot arm
  movement (Kilner et al., 2003)
  – Encoding another’s movements using one’s own
    motor programs
Covert simulation in language
• Already suggested that people activate
 form and meaning representations during
 comprehension
  – Form: activation of tongue/lip muscles and
    speech-related areas during listening, etc.
  – Meaning: effects of motor tasks on
    comprehending action sentences, activation
    of motor areas during processing of action
    words
Simulation for prediction
• Why does such simulation occur?
    – For overt imitation?
         • But monkeys have mirror systems, yet don’t ape
         • Instead, overt imitation appears to be a consequence of covert simulation (and of
           course serves as evidence for covert simulation)
    – To aid action identification, understanding, and memory? (“postdictive”)
         • clearly may occur (e.g., in rehearsal)
         • but perhaps not only purpose

• Or to aid prediction?
         • Emerging view in cognition (e.g., Prinz, 2006; Wilson & Knoblich, 2005), development
           (e.g., Csibra, 2007), cognitive neuroscience (e.g., Frith, 2007), computational speech
           processing (Moore, 2008)

• When understanding language, such prediction could involve simulation of
   form or meaning
Prediction (Wilson & Knoblich, 2005)

• Prediction gets you “ahead of the game” but only if the
  target is sufficiently predictable. Two main types:

• Predictable physical movements
   – Including acceleration, rotation etc.
   – We can (fairly) reliably predict where objects will be ahead of
     time

• Predictable behaviour of other people
   – Again, we can (fairly) reliably predict some aspects of their
     behaviour
How do we predict other people?
• Experience observing others?
    – This is one possibility and clearly does occur

• Experience of our own behaviour?
    – Works if we are sufficiently like others (which we are in many respects)
    – Therefore use representations of own behaviour as proxy for others’ behaviour
    – Such simulation can be fast, because we have the relevant mechanisms in place
      (see below), and is arguably non-inferential

• Much evidence that people predict each other’s behaviour by working out
   “what would I do under these circumstances”
    – e.g., better at predicting outcomes of own behaviour (e.g., dart throwing) than
      others’ behaviour
    – Hard to explain all this evidence by prior perception of one’s own behaviours
Emulation (Grush, 2004)
• A forward model of an external system that runs
  simulations of that system in real time (Desmurget &
  Grafton, 2000; Wolpert, 2001)

   – Before moving your arm you model the path it should take
   – If it deviates, you correct accordingly
   – More rapid than feedback, and works in the absence of feedback
   – Motor system uses emulators extensively to determine if
     subsequent movements are correct
   – Presumably emulation is also used in monitoring language
     production (but our current interest is comprehension)
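
The arm example above can be made concrete with a small sketch. This is a toy illustration only, not a model taken from Grush (2004) or the motor-control literature: the one-dimensional "arm", the names `predict_next` and `reach`, the halfway-to-target command rule, and the one-off `bump` perturbation are all assumptions introduced here. It shows two of the points on the slide: the emulator can guide the movement when no feedback is available, and when feedback is available a deviation from the predicted path gets corrected.

```python
def predict_next(position: float, command: float) -> float:
    """Toy forward model (assumed): where the arm should be after this command."""
    return position + command


def reach(target: float, steps: int = 12, bump_step: int = -1,
          bump: float = 0.0, feedback: bool = True) -> float:
    """Move a 1-D 'arm' toward target and return its final true position."""
    actual = 0.0    # true arm position
    estimate = 0.0  # emulator's running estimate of the arm position
    for step in range(steps):
        # Plan the next command from the emulator's estimate, not from sensation.
        command = 0.5 * (target - estimate)
        # The predicted outcome is available before any sensory feedback.
        estimate = predict_next(estimate, command)
        # The real arm moves; an external perturbation may push it off course.
        actual += command
        if step == bump_step:
            actual += bump
        # With feedback, the deviation from the prediction corrects the estimate,
        # so later commands compensate for the perturbation.
        if feedback:
            estimate += actual - estimate
    return actual


print(reach(1.0, feedback=False))                           # unperturbed, no feedback
print(reach(1.0, bump_step=3, bump=0.2, feedback=True))     # perturbed, corrected
print(reach(1.0, bump_step=3, bump=0.2, feedback=False))    # perturbed, uncorrected
```

Without feedback the unperturbed reach still lands close to the target, because the emulator alone tracks the arm. The perturbed reach is only brought back on target when the deviation between predicted and actual position is fed back.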
How might prediction occur?

• Perception → covert motor simulation
• Simulation → drive emulators
• The perceptual system can use such emulators to make predictions when
   perceiving the behaviour of other people (Wilson & Knoblich, 2005)
    – Because their behaviours are largely the same as the perceiver’s

• Of course this is the case in language comprehension
    – So people can emulate using language production mechanisms
    – At different linguistic levels (words, grammar, meaning …)
    – Particularly strongly in predictable contexts (“high-cloze”)

• Some such emulation can relate to embodied meaning
    – Comprehenders predict the motor activation that would occur if they used those
Prediction of form in comprehension of monologue
• DeLong et al. (2005):
    – The day was breezy so the boy went outside to fly a kite (predictable)
    – The day was breezy so the boy went outside to fly an airplane (unpredictable)

    – People predict kite and that it begins with a consonant
    – Hence larger N400 on an than on a

• Van Berkum et al. (2005): prediction of gender (in Dutch)
    – The safe … was situated behind a big …
    – Disrupted (reading time and N400) when big has wrong gender for painting.

• Anticipatory eye movements in scene perception (Altmann & Kamide, 1999)
• Prediction of grammar (e.g., Lau et al., 2007; Staub & Clifton, 2006)
• Predicting when others’ utterances are likely to end, based on the meaning
   of the utterance (de Ruiter et al., 2006)
• Pickering and Garrod (2007) proposed that the production system acts as an
   emulator during language comprehension

    –   Emulator continually predicts the next element using the results of simulation at different
        levels (meaning, grammar, sound …)
    –   Predictions depend on how constraining the context is at each level
    –   Also emulation assists in dealing with noisy input (e.g., phoneme restoration effect)

• Will activate both the current word, grammar etc.
    –   Potentially leading to overt imitation
• And the predicted word, grammar, etc.
    –   Potentially leading to complementary responses

• Pickering and Garrod focused on prediction of form
    –   With “meaning” not referring to embodied action representations
    –   But similar emulation of motoric representations presumably occurs
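
As a rough illustration of how prediction of an upcoming word could also yield a prediction about form, the sketch below works through the kite/airplane example. Everything in it is an assumption made for illustration: the `CLOZE` table holds invented probabilities (not data from DeLong et al., 2005), the 0.5 threshold stands in for "how constraining the context is", and the article rule is a spelling-based simplification of English a/an.

```python
# Hypothetical cloze probabilities for one context; illustrative values only.
CLOZE = {
    "the boy went outside to fly": {"kite": 0.9, "airplane": 0.1},
}


def article_for(noun: str) -> str:
    """Simplified a/an rule based on the first letter (ignores cases like 'hour')."""
    return "an" if noun[0].lower() in "aeiou" else "a"


def predict(context: str, threshold: float = 0.5):
    """Return the predicted noun and its expected article, or None if the
    context is not constraining enough to commit to a prediction."""
    candidates = CLOZE.get(context, {})
    if not candidates:
        return None
    noun, probability = max(candidates.items(), key=lambda item: item[1])
    if probability < threshold:
        return None
    return {"noun": noun, "article": article_for(noun), "onset": noun[0]}


def mismatch(context: str, heard_article: str) -> bool:
    """True when the article actually heard conflicts with the predicted form."""
    prediction = predict(context)
    return prediction is not None and prediction["article"] != heard_article


context = "the boy went outside to fly"
print(predict(context))
print(mismatch(context, "an"))  # 'an' conflicts with the predicted 'a kite'
print(mismatch(context, "a"))
```

The point of the sketch is that the mismatch is detected at the article, before the noun itself arrives, which is the logic behind measuring the response to an versus a.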
Prediction of meaning in
comprehension of monologue?

• Claim is that comprehenders should
 predict embodied meaning
  – e.g., effects such as Zwaan and Taylor (2006)
    should occur predictively (given strong
  – Not tested yet
[Figure: the production system acting as a forward model during comprehension]

Speech input
   Step 1  Harry went out to fly his red …
   Step 2  Harry went out to fly his red f …
   Step 3  Harry went out to fly his red fl …
   Step 4  Harry went out to fly his red fla …
   Step 5  Harry went out to fly his red flag

Input analysis system (output as the input unfolds)
   Phonology     Syntax     Semantics
   Ø             Ø          Ø
   /f/           Ø          Ø
   /fl/
   /flæg/        Noun       flyable

Production system (forward model), predicted in advance
   Phonology     Syntax     Semantics
   /flæg/        Noun       flyable

The two systems are linked at each level (marked + and − in the original figure).
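
The step-by-step input in the Harry example can be read as an incremental comparison between what has been heard so far and what the forward model predicted. The sketch below is one possible reading of that figure, not a description of the authors' model: the function name `compare_incrementally` is hypothetical, and ordinary spelling is used as a stand-in for phonemes.

```python
def compare_incrementally(predicted: str, heard: str) -> list:
    """For each successively longer fragment of the heard word, report whether
    it is still consistent with the predicted word."""
    return [predicted.startswith(heard[:i]) for i in range(1, len(heard) + 1)]


# Input 'f', 'fl', 'fla', 'flag' stays consistent with the prediction 'flag'.
print(compare_incrementally("flag", "flag"))
# Input 'flute' diverges from the prediction at its third segment.
print(compare_incrementally("flag", "flute"))
```

On this reading, a consistent fragment lets the comprehender run ahead of the input, while the first inconsistent fragment signals that the prediction must be abandoned.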

• So far focused on monologue

• But emulation of form and meaning may
 be particularly useful for dialogue
  – And should lead to both imitative and
    complementary activation
Why is dialogue so easy?
• “Should” be harder than monologue
   –   Dealing with changes on the fly
   –   Can’t always plan ahead
   –   Working out precisely when to speak and who to speak to
   –   Produce and comprehend at same time (because of feedback)
   –   Constant task-switching

• Pickering and Garrod (2004)
   – interlocutors “align” their mental states
   – Conversation is successful when interlocutors come to see the
     world in the same way
   – but how?
Why emulation is useful for dialogue I

• Dialogue involves regular switches between
  production and comprehension
  – Interlocutors take turns to take the floor
  – Addressee isn’t passive listener but provides
    “backchannel” feedback (assertions, queries, etc.)
     • Such feedback enhances quality of narratives (e.g., Bavelas
       et al., 2000)

• Thus production system is constantly activated
  during comprehension in dialogue
Why emulation is useful for dialogue II

• Addressee must be constantly prepared to
 respond, to make a contribution when
 appropriate (e.g., Sacks et al., 1974)

  – Sometimes a contribution is normatively
    required (e.g., when asked a non-rhetorical
    question)
  – Other times it is optional (e.g., after speaker
    finishes some statements)
And why it is effective
• Interlocutors align at many linguistic levels during
  dialogue (Pickering & Garrod, 2004)
   – For example, similar activation of words and grammar
   – In particular, their representations are more similar than non-

• Hence, predictions are more likely to be accurate
   – If we are well-aligned, using my own representations as proxies
     for your representations is likely to be successful

• Dialogue is therefore a form of joint activity that is
  particularly likely to benefit from simulation and
Emulation of embodied meaning in dialogue
• Much dialogue involves interlocutors also interacting with environment
    –   e.g., task-oriented dialogue
• Here, predictions about environment are especially useful
• Clark and Krych (2004) had a director instruct a builder to construct a LEGO model
    –   When director could see workspace, she changed her language and timed her speech to fit
        with the builder’s actions
    –   Appeared that the builder’s actions were treated as continuous feedback by the director

• Prediction of embodied meaning facilitates rapid perception or performance of the
    –   And therefore also helps make conversation easy and facilitates alignment

• May be other benefits of aligning embodied meaning
    –   Essentially another level of alignment (beyond alignment of words, grammar etc.) that
        supports communicative success
    –   Clearly involves “common coding”, as implicated in both production and comprehension
Interactive-Alignment Model
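[Figure: interlocutors A and B each have a stack of linked representations,
aligned level by level]

   A                             B
   Situation model               Situation model
   Semantic representation       Semantic representation
   […] representation            […] representation
   Lexical representation        Lexical representation
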
Key references
• Pickering, M.J., & Garrod, S. (2004). Toward a mechanistic psychology of
    dialogue. Behavioral and Brain Sciences, 27, 169-225.
• Garrod, S., & Pickering, M.J. (2004). Why is conversation so easy? Trends
    in Cognitive Sciences, 8, 8-11.
• Pickering, M.J., & Garrod, S. (2007). Do people use language production to
    make predictions during comprehension? Trends in Cognitive Sciences, 11,
• Pickering, M.J., & Garrod, S. (in press). Prediction and embodiment in
    dialogue. European Journal of Social Psychology.
• Garrod, S., & Pickering, M.J. (in press). Joint action, interactive alignment,
    and dialogue. Topics in Cognitive Science.
• Pickering, M.J., & Garrod, S. (in press). The use of prediction to drive
    alignment in dialogue. In G. Semin & G. Echterhoff (Eds.), Grounding
    sociality: Neurons, minds, and culture. Hove: Psychology Press.
