Real World Constraints on the Mental Lexicon: Assimilation, the

The information profile of Spanish word systems:

     I. The case of speech addressed to a child.

     II. An analysis by feature.

Monica Tamariz (monica@ling.ed.ac.uk)
Richard Shillcock (rcs@cogsci.ed.ac.uk)
                 Overview

• Use information profiles of word systems
  (corpus, lexicons).
• More realistic representations of speech
  generate flatter profiles.
• Flatter profiles reflect a more efficient use
  of the representational space.
             Assumptions

• Phonology plays a part in the organization
  of the mental lexicon.

• For maximal efficiency, information should
  be spread as evenly as possible over the
  representational space.
Distribution of information over the representational space

[Figure: information (entropy), on a scale from 0 to 1, plotted against representational space segment (1 to 17).]
                Entropy

• Concept of entropy from information theory
  (Shannon, 1948).
• Measure of the uncertainty or information
          $H = -\sum_i p_i \log p_i$
• Redundancy
          $R = 1 - H$
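The two formulas on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the logarithm base (2) and the normalisation of H by its maximum, log2 of the alphabet size, are assumptions made here so that R = 1 - H falls between 0 and 1. The function names are hypothetical.

```python
import math

def shannon_entropy(counts):
    """Return H = -sum(p * log2(p)) in bits for a list of symbol counts."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)

def h_rel(counts, alphabet_size):
    """Assumed normalisation: H divided by its maximum, log2(alphabet_size)."""
    return shannon_entropy(counts) / math.log2(alphabet_size)

def redundancy(counts, alphabet_size):
    """R = 1 - H, with H taken as the normalised entropy above."""
    return 1.0 - h_rel(counts, alphabet_size)

# Made-up counts: an even distribution and a heavily skewed one.
uniform = [25, 25, 25, 25]
skewed = [97, 1, 1, 1]
print(h_rel(uniform, 4), redundancy(uniform, 4))
print(h_rel(skewed, 4), redundancy(skewed, 4))
```

An even distribution gives the highest entropy and no redundancy; the skewed distribution is more predictable and therefore more redundant.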
                Data sets

• Speech corpus: 707,000 word tokens.

• Speech lexicon: 42,000 word types.

• Dictionary lexicon: 28,000 headwords.
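The difference between word tokens and word types can be illustrated as follows. The sketch assumes, without confirmation from the slides, that a lexicon is simply the set of distinct word types found in a corpus; the sample sentence and the function name are made up for illustration.

```python
from collections import Counter

def corpus_and_lexicon(tokens):
    """Return (number of tokens, sorted list of distinct word types)."""
    freq = Counter(tokens)
    return sum(freq.values()), sorted(freq)

# Hypothetical toy corpus: every running word is a token,
# every distinct word form is a type.
sample = "la casa de la abuela es la casa grande".split()
n_tokens, lexicon = corpus_and_lexicon(sample)
print(n_tokens, len(lexicon))  # 9 6
```

In a corpus-based count, frequent words contribute once per occurrence; in a lexicon-based count, each type contributes once.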
      Transcriptions

•   Citation (30 phonemes)
•   Fast-speech (50 phonemes and allophones)
      Fast-speech transcription
• Glides
• Approximant B D G
• Consonant assimilation
     e.g. []
          [z]
          [dental n], [dental l]
           Transcriptions
ORTHO.       CITATION       FAST-SPEECH

admitir      admitIr        amitIr
carnets      karnEts        kanEs
colgado      kolgAdo        kolA o
quitarle     kitArle        kitAle
gracias      grAzias        Azia
gaitero      gaitEro        gatEro
           The Information Profile

Words, aligned by segment position:

      1   2   3   4   5   6   7
1     a   d   m   i   t   I   r
2     t   e   r   m   I   n   a
3     k   a   r   n   E   t   s
4     k   o   l   g   A   d   o
5     m   o   m   E   n   t   o
6     f   a   m   I   l   i   a
7     k   i   t   A   r   l   e
8     t   e   n   E   m   o   s
9     g   r   A   z   i   a   s
10    t   o   d   a   b   I   a
etc

Count of phonemes in each segment position:

      P1      P2      P3      P4      P5      P6      P7
a     4189    6917    1533    1741    258     3923    9200
b     1789    947     2020    2918    1547    358     5
d     3075    164     1496    980     2066    4350    232
e     4481    8918    1003    1792    284     3427    3866
f     1104    169     766     131     26      12      2
g     1110    169     1739    677     285     654     25
i     1553    4278    1850    2561    4430    2129    106
k     3856    1075    1911    567     745     1364    59
l     567     1450    2207    994     1562    1612    1268
m     3540    972     4688    2668    3294    668     8
etc
T.    45559   45559   45559   45559   45559   45559   45559

      Calculate entropy $H = -\sum_i p_i \log p_i$
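The procedure on this slide (align words of equal length, count the phonemes in each segment position, then compute the entropy of each position) can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: each character of a transcription is treated as one phoneme, capitalised vowels are counted as separate symbols, entropy uses log base 2, and it is normalised by log2 of an alphabet size of 30 (the citation phoneme count given earlier). Only the ten example words from the slide are used, so the numbers are not comparable with the full data set.

```python
import math
from collections import Counter

def position_counts(words, length):
    """Count the symbols found at each segment position of words of one length."""
    counts = [Counter() for _ in range(length)]
    for word in words:
        if len(word) == length:
            for pos, symbol in enumerate(word):
                counts[pos][symbol] += 1
    return counts

def information_profile(words, length, alphabet_size):
    """Normalised entropy at each segment position (assumed normalisation)."""
    profile = []
    for counter in position_counts(words, length):
        total = sum(counter.values())
        if total == 0:          # no words of this length
            profile.append(0.0)
            continue
        h = -sum((c / total) * math.log2(c / total) for c in counter.values())
        profile.append(h / math.log2(alphabet_size))
    return profile

# The ten 7-segment example words shown on the slide.
words = ["admitIr", "termIna", "karnEts", "kolgAdo", "momEnto",
         "famIlia", "kitArle", "tenEmos", "grAzias", "todabIa"]
profile = information_profile(words, 7, alphabet_size=30)
for pos, h in enumerate(profile, start=1):
    print(f"P{pos}: {h:.3f}")
```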
 The Information profile

[Figure: an information profile, summarised by its slope (-m) and its mean level of entropy (Hrel).]
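The two summary measures named here can be computed from a profile as sketched below. The sketch assumes the slope comes from an ordinary least-squares line fitted to entropy against segment position; the slides do not state the fitting method. The profile values are invented for illustration and the function name is hypothetical.

```python
def profile_summary(profile):
    """Fit a least-squares line to (position, entropy); return (-m, mean entropy)."""
    n = len(profile)
    if n < 2:
        raise ValueError("need at least two segment positions")
    xs = range(1, n + 1)
    mean_x = sum(xs) / n
    mean_y = sum(profile) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, profile))
    m = sxy / sxx
    return -m, mean_y

# Made-up profile that falls steadily from the first segment to the last.
example_profile = [0.80, 0.78, 0.76, 0.74, 0.72, 0.70]
slope, level = profile_summary(example_profile)
print(round(slope, 4), round(level, 4))
```

Reporting -m rather than m means that a profile falling towards the end of the word has a positive slope value, and a flatter profile has a value closer to zero.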
         The LERR principle
(Levelling effect of realistic representations)

             “Processes that make
         the representation of words
                more accurate
    will flatten the information profiles”
               The effect of the transcription:
                 Information profile slopes

• Fast speech has flatter profiles (as in other languages)
• Longer words have flatter profiles

[Figure: slope (-m), on a scale from 0 to 0.07, for word lengths 4 to 7; series: Citation, Fast-speech.]
     The effect of the transcription:
            Level of entropy

• More entropy in the citation transcription.
• Fast speech is more redundant and thus more predictable.

[Figure: Hrel, on a scale from 0.6 to 0.8, for word lengths 4 to 7; series: Citation, Fast-speech.]
                       The Speech Lexicon:
                    Information profile slopes

• Speech Lexicon: the active mental lexicon represented in the brain.
• The speech lexicon has flatter profiles.

[Figure: slope (-m), on a scale from 0 to 0.07, for word lengths 4 to 7; series: Dict. Lex., Speech Lex.]
           The Speech Lexicon:
             Level of entropy

• Speech lexicon: low redundancy levels.
• Level varies little across word lengths.
• Support for Butterworth's ‘Full Listing Hypothesis’.

[Figure: Hrel, on a scale from 0.6 to 0.8, for word lengths 4 to 7; series: Dict. Lex., Speech Lex.]
                       Corpus vs. Lexicon:
                    Information profile slopes

• Corpus: representation over time.
• Lexicon: representation over space.
• The lexicon yields flatter profiles.

[Figure: slope (-m), on a scale from 0 to 0.07, by word length (4 to 7); series: Corpus, Lexicon.]
             Corpus vs. Lexicon:
              Level of entropy

• The lexicon generates higher entropy levels.

[Figure: Hrel, on a scale from 0.6 to 0.8, by word length (4 to 7); series: Corpus, Lexicon.]
                Discussion

• Fast-speech rules and a ‘Full List’ mental
  lexicon flatten the information profile.
• In the speech lexicon, the main constraint is
  efficiency of storage.
• In the corpus, other constraints - such as
  lexical segmentation - interact with the
  optimization of communication.
               Conclusion


• This simple analysis of the information
  profile of word systems is a useful tool
  that can provide insights into the validity
  of psycholinguistic theories.
  I. Speech addressed to a child

• Corpus “Maria” of speech addressed to a
  child aged 1-4 years.
• Limited vocabulary.
      41,000 word tokens : 3,900 word types.
• High frequency of diminutives (-ito, -ita).
              Speech addressed to a child:
               Information profile slopes

• “Child” speech yields flatter slopes.
• “Child” lexicon slopes are steeper.

[Figure: slope (-m), on a scale from 0 to 0.04, for corpus and lexicon; series: adult, child.]
   Speech addressed to a child:
        Level of entropy

• “Child” speech shows more phonological redundancy.

[Figure: mean Hrel, on a scale from 0 to 0.8, for corpus and lexicon; series: adult, child.]
 The lexicon addressed to a child
• The steep slope of the lexicon suggests it
  contains cues to word boundaries.
• Infants are sensitive to the probabilistic
  phonotactics of the language.
• Storage is not a constraint but word
  segmentation is, to “teach” children about
  the probabilistic phonotactics of the
  language.
       II. Analysis by feature

• Features instead of phonemes
• Consonants only
• Two analyses
     (a) By manner of articulation
     (b) By place of articulation
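A feature-based analysis of this kind can be sketched by recoding each consonant as its manner or place class before counting. Everything specific in the sketch is an assumption for illustration: the MANNER and PLACE tables are a small, partial mapping written for this example and are not the feature set used by the authors, vowels are simply dropped to obtain "consonants only", and entropy uses log base 2.

```python
import math
from collections import Counter

# Hypothetical, partial feature tables for a few consonant symbols.
MANNER = {"p": "stop", "b": "stop", "t": "stop", "d": "stop",
          "k": "stop", "g": "stop", "f": "fricative", "s": "fricative",
          "m": "nasal", "n": "nasal", "l": "lateral", "r": "rhotic"}
PLACE = {"p": "labial", "b": "labial", "m": "labial", "f": "labial",
         "t": "dental", "d": "dental", "s": "alveolar", "n": "alveolar",
         "l": "alveolar", "r": "alveolar", "k": "velar", "g": "velar"}

def recode(word, mapping):
    """Keep only the consonants listed in mapping, replaced by their class."""
    return [mapping[symbol] for symbol in word if symbol in mapping]

def position_entropy(sequences, pos):
    """Entropy in bits of the classes found at one position."""
    counts = Counter(seq[pos] for seq in sequences if len(seq) > pos)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

words = ["kolgAdo", "kitArle", "tenEmos"]
by_manner = [recode(w, MANNER) for w in words]
by_place = [recode(w, PLACE) for w in words]
print(by_manner[0])  # ['stop', 'lateral', 'stop', 'stop']
print(by_place[0])   # ['velar', 'alveolar', 'velar', 'dental']
print(position_entropy(by_manner, 0), position_entropy(by_place, 0))
```

In this toy sample all three words begin with a stop, so the first position carries no information under the manner coding, while the place coding still distinguishes velar from dental onsets.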
                       Analysis by feature:
                       Information profiles

• Fast-speech transcriptions of the corpus

[Figure: three panels plotting entropy (H), on a scale from 0.5 to 0.85, against word segment (1 to 6), each with a linear trend line.]

Analysis                    Trend line
Phoneme                     y = -0.0168x + 0.7569
Manner of articulation      y = 0.001x + 0.7225
Place of articulation       y = -0.0418x + 0.8792
                       Analysis by feature:
                    Information profile slopes

[Figure: two panels, CITATION (Corp.Cit, Lex.Cit) and FAST-SPEECH (Corp.FS, Lex.FS), plotting slope (-m) on a scale from 0 to 0.12; series: cons. (phon), cons. (manner), cons. (place).]

• Manner vs. place analysis
• Citation vs. fast-speech
• Corpus vs. lexicon
                   Analysis by feature:
                    Level of entropy

[Figure: two panels, CITATION (Corp.Cit, Lex.Cit) and FAST-SPEECH (Corp.FS, Lex.FS), plotting Hrel on a scale from 0.5 to 0.9; series: cons. (phon), cons. (manner), cons. (place).]

• Phoneme vs. Features
• Manner vs. Place of articulation
  Analysis by feature: Discussion
• Fast-speech rules flatten feature analysis
  slopes.
• Manner of articulation is most efficient over
  time - for communication (in the corpus).
• Place of articulation is not so efficient, but
  could have a role in word boundary
  recognition.
• Phoneme representation is most efficient over
  space - for storage (in the lexicon).
III. Speech addressed to a child:
   Analysis by feature: slopes

[Figure: two panels, CITATION (Child-Corp.Cit, Child-Lex.Cit) and FAST-SPEECH (Corp.FS Child, Lex.FS Child), plotting slope (-m) on a scale from 0 to 0.12; series: cons. (phon), cons. (manner), cons. (place).]

• Manner vs. place analysis
• Citation vs. fast-speech
• Corpus vs. lexicon
[Figure: Adult and Child slope panels. Adult: CITATION (Corp.Cit, Lex.Cit) and FAST-SPEECH (Corp.FS, Lex.FS). Child: CITATION (Child-Corp.Cit, Child-Lex.Cit) and FAST-SPEECH (Corp.FS Child, Lex.FS Child). Axis: slope (-m), on a scale from 0 to 0.12; series: cons. (phon), cons. (manner), cons. (place).]

• Manner vs. place analysis
• Citation vs. fast-speech
• Corpus vs. lexicon
  Speech addressed to a child:
 Analysis by feature: level of H

[Figure: two panels, CITATION (Child-Corp.Cit, Child-Lex.Cit) and FAST-SPEECH (Corp.FS Child, Lex.FS Child), plotting Hrel on a scale from 0.5 to 0.9; series: cons. (phon), cons. (manner), cons. (place).]

• Adult vs. child
• Phoneme vs. Features
• Manner vs. Place of articulation
[Figure: Adult and Child level-of-entropy panels. Adult: CITATION (Corp.Cit, Lex.Cit) and FAST-SPEECH (Corp.FS, Lex.FS). Child: CITATION (Child-Corp.Cit, Child-Lex.Cit) and FAST-SPEECH (Corp.FS Child, Lex.FS Child). Axis: Hrel, on a scale from 0.5 to 0.9; series: cons. (phon), cons. (manner), cons. (place).]
               Discussion
• Fast-speech flattens feature analysis slopes.
• Manner of articulation is most efficient over
  time - for communication (in the corpus).
• Place of articulation is even less efficient
  than in adult speech (steeper slope and higher H) -
  role in word boundary recognition.
• Phoneme representation is not as efficient
  over space - for storage (in the lexicon) - as
  in adult speech.
                Discussion
• Speech addressed to a child makes use of
  features such that:

• A feature representation is relatively more
  efficient than a phoneme representation of
  the mental lexicon.
• Place of articulation has an enhanced role in
  encoding word segmentation cues.