					Carnegie
Mellon

                 If I Have a Hammer:
Computational Linguistics in a Reading Tutor that Listens
                     Jack Mostow
       Project LISTEN (www.cs.cmu.edu/~listen)
               Carnegie Mellon University

 “To a man with a hammer, everything looks like a nail.” – Mark Twain



                 Funding: National Science Foundation

      Keynote at 42nd Annual Meeting of the Association for
           Computational Linguistics, Barcelona, Spain
Project LISTEN                      1                          7/22/2004
                 If I had a hammer…
                                        [Hays & Seeger]

        If I had a hammer,
        I’d hammer in the morning
        I’d hammer in the evening,
        All over this land
        I’d hammer out danger,
        I’d hammer out a warning,
        I’d hammer out love between my brothers and my
        sisters,
        All over this land.



                              Outline
    1.     Project LISTEN’s Reading Tutor
    2.     Roles of computational linguistics in the tutor
    3.     So… Conclusions




                 Project LISTEN’s Reading Tutor (video)

 John Rubin (2002). The Sounds of Speech (Show 3).
   On Reading Rockets (Public Television series
   commissioned by U.S. Department of Education).
   Washington, DC: WETA.

 Available at www.cs.cmu.edu/~listen.




                 Thanks to fellow LISTENers
      Tutoring:
            Dr. Joseph Beck, mining tutorial data
            Prof. Albert Corbett, cognitive tutors
            Prof. Rollanda O’Connor, reading
            Prof. Kathy Ayres, stories for children
            Joe Valeri, activities and interventions
            Becky Kennedy, linguist
      Listening:
            Dr. Mosur Ravishankar, recognizer
            Dr. Evandro Gouvea, acoustic training
            John Helman, transcriber
      Programmers:
            Andrew Cuneo, application
            Karen Wong, Teacher Tool
      Field staff:
            Dr. Roy Taylor
            Kristin Bagwell
            Julie Sleasman
      Grad students:
            Hao Cen, HCI
            Cecily Heiner, MCALL
            Peter Kant, Education
            Shanna Tellerman, ETC
      Plus:
            Advisory board
            Research partners
                  DePaul
                  UBC
                  U. Toronto
            Schools
                 Computational linguistics models
                     in an intelligent tutor
    Language models predict word sequences for a task.
          E.g. expect ‘once upon a time…’
    Domain models describe skills to learn.
          E.g. pronounce ‘c’ as /k/.
    Production models describe student behavior.
          E.g. which mistakes do students make?
    Student models estimate a student’s skills.
          E.g. which words will a student need help on?
    Pedagogical models guide tutorial decisions.
          E.g. which types of help work best?
    Theme: use data to train models automatically.
                 Language model of oral reading
                 [Mostow, Roth, Hauptmann, & Kane AAAI94]
    Problem: which word sequences to expect?
    Language model specifies word transition probabilities
        Given sentence text (e.g. ‘Once upon a time…’)
        Expect correct reading
        But allow for deviations
        With heuristic probabilities
    Result:
        Accepted 96% of correctly read words.
        Detected about half the serious mistakes.
    [Diagram: transitions out of ‘once’: PrRepeat → ‘once’, PrCorrect → ‘upon’, PrTruncate → ‘up’, PrJump → ‘a’, …]
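
The transition scheme on this slide can be sketched in Python. This is only an illustration under stated assumptions, not Project LISTEN's implementation: the four probability values, the function name `transitions`, and the rule that a truncation is the first half of the next word are all hypothetical. The slide says only that the probabilities are heuristic.

```python
# Hypothetical heuristic weights; the slide does not give the actual values.
PR_CORRECT = 0.85   # read the next word of the sentence
PR_REPEAT = 0.05    # say the current word again
PR_TRUNCATE = 0.05  # start the next word but break off
PR_JUMP = 0.05      # total weight shared by all jumps elsewhere in the sentence


def transitions(words, i):
    """Return {(kind, token): probability} for what may follow words[i]."""
    if i + 1 >= len(words):
        return {}
    nxt = words[i + 1]
    weights = {("correct", nxt): PR_CORRECT, ("repeat", words[i]): PR_REPEAT}
    if len(nxt) > 1:
        # Assumed truncation rule: the first half of the next word.
        weights[("truncate", nxt[: len(nxt) // 2])] = PR_TRUNCATE
    targets = [j for j in range(len(words)) if j not in (i, i + 1)]
    for j in targets:
        key = ("jump", words[j])
        weights[key] = weights.get(key, 0.0) + PR_JUMP / len(targets)
    total = sum(weights.values())  # renormalize in case a case was skipped
    return {key: w / total for key, w in weights.items()}


after_once = transitions(["once", "upon", "a", "time"], 0)
```

For the sentence ‘once upon a time’, `after_once` contains the same four kinds of arcs as the diagram: repeat ‘once’, correct ‘upon’, truncate ‘up’, and jumps to ‘a’ and ‘time’.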




                 Using ASR errors to tune a language model
                     [Banerjee, Mostow, Beck, & Tam ICAAI03]

  Training data: 3,421 oral reading utterances
         Spoken by 50 children aged 6-10
         Recognized (imperfectly) by speech recognizer
         Transcribed by hand
  Method: learn to classify language model transitions
         Reward good transitions that match transcript
         Penalize bad transitions that cause recognizer errors
         Generalize from features (kid age, text length, word type, …)
  Result: reduced tracking error by 24% relative to baseline
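
One way to picture the reward/penalize step is to label each transition the recognizer followed by checking it against the hand transcript. The sketch below is an assumption about how such labels could be produced, not the method of the cited paper: the function name `label_transitions` is hypothetical, and the alignment uses Python's standard `difflib.SequenceMatcher` rather than whatever alignment the authors used. It does not cover the generalization step over features.

```python
from difflib import SequenceMatcher


def label_transitions(hypothesis, transcript):
    """Label each adjacent word pair in the recognizer hypothesis.

    A pair is 'good' if both words align to adjacent transcript words,
    and 'bad' otherwise.
    """
    matcher = SequenceMatcher(a=hypothesis, b=transcript, autojunk=False)
    aligned = {}  # hypothesis index -> transcript index
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            aligned[block.a + k] = block.b + k
    labels = []
    for i in range(len(hypothesis) - 1):
        a, b = aligned.get(i), aligned.get(i + 1)
        good = a is not None and b is not None and b == a + 1
        pair = (hypothesis[i], hypothesis[i + 1])
        labels.append((pair, "good" if good else "bad"))
    return labels


# Made-up example: the recognizer heard a spurious 'up'.
labels = label_transitions(
    ["once", "upon", "up", "a", "time"],
    ["once", "upon", "a", "time"],
)
```

In this made-up example the pairs into and out of the spurious ‘up’ are labeled bad, and the other two pairs are labeled good.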



                 Domain model of pronunciation
     Problem: what should students learn?
     Data: pronunciation dictionary for children’s text
            ‘teach’ → /T IY CH/
     Method: align spelling against pronunciation
            ‘t’ → /T/, ‘ea’ → /IY/, ‘ch’ → /CH/
     How frequent is each grapheme-phoneme mapping?
            ‘t’ → /T/ occurred 622 times in 9,776 mappings
            ‘z’ → /S/ occurred once (in ‘quartz’)
     How consistently is each grapheme pronounced?
            ‘v’ → /V/ always
            ‘e’ → /EH/ (‘bed’), /AH/ (‘the’), /IY/ (‘be’), /IH/ (‘destroy’)
            + ‘ea’, ‘eau’, ‘ed’, ‘ee’, ‘ei’, ‘eigh’, ‘eo’, ‘er’, ‘ere’, ‘eu’, …
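
The two questions on this slide, frequency and consistency, reduce to counting over aligned grapheme-phoneme pairs. The sketch below assumes the alignment has already been done and uses a four-word toy dictionary invented for illustration. Defining consistency as the share of a grapheme's most common pronunciation is also an assumption, since the slide does not define it.

```python
from collections import Counter, defaultdict

# Toy, already-aligned entries (hypothetical; not the project's dictionary).
ALIGNED = {
    "teach": [("t", "T"), ("ea", "IY"), ("ch", "CH")],
    "bed": [("b", "B"), ("e", "EH"), ("d", "D")],
    "be": [("b", "B"), ("e", "IY")],
    "vet": [("v", "V"), ("e", "EH"), ("t", "T")],
}


def mapping_stats(aligned):
    """Count each grapheme->phoneme mapping and score each grapheme's consistency."""
    counts = Counter(pair for pairs in aligned.values() for pair in pairs)
    by_grapheme = defaultdict(Counter)
    for (grapheme, phoneme), n in counts.items():
        by_grapheme[grapheme][phoneme] += n
    # Assumed definition: fraction of occurrences using the most common phoneme.
    consistency = {
        g: max(c.values()) / sum(c.values()) for g, c in by_grapheme.items()
    }
    return counts, consistency


counts, consistency = mapping_stats(ALIGNED)
```

On the toy data, ‘v’ is perfectly consistent, while ‘e’ is split between /EH/ and /IY/.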
                 Production model of pronunciation
                 [Fogarty, Dabbish, Steck, & Mostow AIED2001]

  Problem: Which mistakes to expect?
  Data: U. Colorado database of oral reading mistakes
         ‘bed’ → /B IY D/
  Method: train G → P → P’ malrules for decoding
         ‘e’ → /EH/ → /IY/
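
A G → P → P’ malrule can be read as a triple: the grapheme, its correct phoneme, and the phoneme the student actually produced. The sketch below shows one hypothetical way to extract such triples from a single misreading by diffing the expected and observed phoneme sequences with Python's `difflib`. The function name `malrules` and the use of an empty string for a missing grapheme or phoneme are assumptions, and the cited paper's training procedure may differ.

```python
from collections import Counter
from difflib import SequenceMatcher


def malrules(aligned, observed):
    """Count (G, P, P') triples for one misreading.

    aligned:  list of (grapheme, correct phoneme) pairs for the word
    observed: list of phonemes the student actually said
    """
    expected = [phoneme for _, phoneme in aligned]
    rules = Counter()
    matcher = SequenceMatcher(a=expected, b=observed, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace" and (i2 - i1) == (j2 - j1):
            for k in range(i2 - i1):
                g, p = aligned[i1 + k]
                rules[(g, p, observed[j1 + k])] += 1
        elif tag == "delete":  # phoneme dropped, e.g. a final 's'
            for g, p in aligned[i1:i2]:
                rules[(g, p, "")] += 1
        elif tag == "insert":  # phoneme added with no grapheme behind it
            for q in observed[j1:j2]:
                rules[("", "", q)] += 1
        # 'equal' spans and unequal-length replacements are ignored in this sketch
    return rules


bed = malrules([("b", "B"), ("e", "EH"), ("d", "D")], ["B", "IY", "D"])
```

Here `bed` holds the single rule ‘e’ → /EH/ → /IY/ from the slide's example.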




                  Top five G → P → P’ decoding errors

                 G        P        P’       Example     Effect
                 ‘s’      /S/      //       ‘plants’    Drop ‘s’.
                 ‘s’      /Z/      //       ‘arms’      Drop ‘s’.
                 ‘’       //       /N/      ‘ha_d’      Add ‘n’.
                 ‘’       //       /Z/      ‘car_’      Add ‘s’.
                 ‘n’      /N/      //       ‘land’      Drop ‘n’.
    Result: predicted mistakes in unseen test data
          Context-sensitive rules improved accuracy.
    Later work: predict real-word mistakes
          [Mostow, Beck, Winter, Wang, & Tobin ICSLP2002]
                 Student model of help requests
                       [Beck, Jia, Sison, & Mostow UM2003]

    Problem: when will a student request help on a word?
    Data: 7 months of Reading Tutor use by 87 students
          Average ~20 hours per student
          Transactions logged in detail
          Help request rate excluding common words: 0.5%–54%
    Method: train classifier using word, student, history
    Result: predict words that unseen students click on
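
Predicting for unseen students implies that training and test data are separated by student, so that no child contributes to both. A minimal sketch of such a split follows. The record format (a dict with a `"student"` key), the function name `split_by_student`, and the 20% test fraction are assumptions for illustration, and the classifier itself is not shown.

```python
import random


def split_by_student(records, test_fraction=0.2, seed=0):
    """Hold out whole students so that test students are unseen in training."""
    students = sorted({r["student"] for r in records})
    rng = random.Random(seed)
    rng.shuffle(students)
    n_test = max(1, round(len(students) * test_fraction))
    held_out = set(students[:n_test])
    train = [r for r in records if r["student"] not in held_out]
    test = [r for r in records if r["student"] in held_out]
    return train, test


# Made-up log: five students, two word encounters each.
records = [
    {"student": f"s{i}", "word": w, "help": False}
    for i in range(5)
    for w in ("cat", "dog")
]
train, test = split_by_student(records)
```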




                 Learning curves for students’ help requests
      [Figure: one curve per reading level (Grade 1 to Grade 4);
       x-axis: number of previous encounters (0 to 15); y-axis ticks 0.0 to .4]
      Try to predict subset
            Grade 1-2 level
            1-6 prior encounters
      Selected data
            53 students
            175,961 words
            29,278 help requests
      Train predictive model
            Count help requests 5x
            Predict other kids’ data
            71% accuracy
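
A learning curve of this kind can be computed from a chronological log by counting, for each encounter, how many times that student had already seen that word. The sketch below assumes a hypothetical log format of `(student, word, requested_help)` tuples; the function name is also an assumption.

```python
from collections import defaultdict


def help_rate_by_prior_encounters(events):
    """events: chronological (student, word, requested_help) tuples."""
    seen = defaultdict(int)               # (student, word) -> encounters so far
    totals = defaultdict(lambda: [0, 0])  # prior count -> [help requests, encounters]
    for student, word, requested_help in events:
        prior = seen[(student, word)]
        totals[prior][0] += int(requested_help)
        totals[prior][1] += 1
        seen[(student, word)] += 1
    return {prior: h / n for prior, (h, n) in sorted(totals.items())}


# Made-up log for two students reading the same word.
curve = help_rate_by_prior_encounters([
    ("s1", "cat", True),
    ("s1", "cat", False),
    ("s2", "cat", True),
    ("s2", "cat", True),
    ("s1", "cat", False),
])
```

Each key of `curve` is a number of previous encounters and each value is the fraction of those encounters on which help was requested.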

                                    Features used
           Information about the student
                  Help request rate, overall reading proficiency, etc.
           Information about the word
                  Word length, position in sentence, etc.
           Student’s history with reading word
                  Percent of times accepted by Reading Tutor, time to read, etc.
           Student’s prior help on this word
                  Was the word helped previously? Earlier today?


           How to get all this data??
                 Data collection and translation




                                         word features

                 Structure of Reading Tutor database

   Reading Tutor                   Database level         Student

   List readers                    Session                Login
   List stories                    Story Encounter        Pick stories
   Show one sentence at a time     Sentence Encounter     Read sentence
   Listens and helps               Word Encounter         Read each word
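
The four nested levels above map naturally onto a containment hierarchy. The sketch below models them as Python dataclasses purely for illustration. Only the four level names come from the slide; every field name (`student`, `title`, `text`, `word`, `accepted`) and the helper `word_encounters` are assumptions. The real database schema is not shown in the slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WordEncounter:
    word: str
    accepted: bool  # hypothetical: did the tutor accept the word as read?


@dataclass
class SentenceEncounter:
    text: str
    words: List[WordEncounter] = field(default_factory=list)


@dataclass
class StoryEncounter:
    title: str
    sentences: List[SentenceEncounter] = field(default_factory=list)


@dataclass
class Session:
    student: str
    stories: List[StoryEncounter] = field(default_factory=list)


def word_encounters(session):
    """Flatten a session into its word-level records."""
    return [
        w
        for story in session.stories
        for sentence in story.sentences
        for w in sentence.words
    ]


session = Session("s1", [StoryEncounter("A story", [SentenceEncounter(
    "Once upon", [WordEncounter("once", True), WordEncounter("upon", False)])])])
```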

                 Project LISTEN’s Reading Tutor:
                 A rich source of experimental data
                                               2003-2004 database:
                                                  9 schools
                                                  > 200 computers
                                                  > 50,000 sessions
                                                  > 1.5M tutor responses
                                                  > 10M words recognized
                                                  Embedded experiments
                                                     Randomized trials

         The Reading Tutor beats independent practice…
                  Effect sizes up to 1.3 [Mostow SSSR02, Poulsen 04]
         …but how? Use embedded experiments to investigate!
                 Pedagogical model of help on decoding
                          [Mostow, Beck, & Heiner SSSR2004]

    Problem: Which types of help work best?
    Data: 270 students’ assisted reading in the Reading Tutor
    Method: randomize choice of help and analyze its effects
    Result: detected significant differences in effectiveness




                    Within-subject experiment design:
                    270 students, 180,909 randomized trials

       Student is reading a story                ‘People sit down and …’
       Student needs help on a word              Student clicks ‘read.’
       Tutor chooses what help to give           Randomized choice among feasible types
       Student continues reading                 ‘… read a book.’

                                  Time passes…

       Student sees word in a later sentence     ‘I love to read stories.’
     Outcome: success = ASR accepts word as read fluently

(How) does the type of help affect the next encounter?
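
The trial structure above can be sketched as two steps: pick a help type at random from those feasible for the word, then record whether the student's next encounter with the word succeeded. The names `choose_help` and `TrialLog` are hypothetical, and the sketch assumes a uniform random choice, which the slide does not specify.

```python
import random
from collections import defaultdict


def choose_help(feasible_types, rng):
    """Randomized choice among the help types feasible for this word."""
    return rng.choice(feasible_types)


class TrialLog:
    """Tally outcomes of later encounters by the help type given earlier."""

    def __init__(self):
        self.counts = defaultdict(lambda: [0, 0])  # type -> [successes, trials]

    def record(self, help_type, success):
        self.counts[help_type][0] += int(success)
        self.counts[help_type][1] += 1

    def success_rates(self):
        return {h: s / n for h, (s, n) in self.counts.items()}


rng = random.Random(0)  # fixed seed only to make the sketch repeatable
chosen = choose_help(["Say Word", "Rhymes With", "Sound Out"], rng)

log = TrialLog()
log.record("Say Word", True)
log.record("Say Word", False)
```

Because the choice is randomized within each student, differences in later success can be attributed to the help type rather than to which students or words happened to receive it.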
                       180,909 word hints
                  (average success rate 66.1%)
        Example: ‘People sit down and read a book.’
Whole word:
   24,841 Say In Context
   56,791 Say Word
Decomposition:
   6,280 Syllabify
   14,223 Onset Rime
   19,677 Sound Out
   22,933 One Grapheme
Analogy:
   13,165 Rhymes With
   13,671 Starts Like
Semantic:
   14,685 Recue
   2,285 Show Picture
   488 Sound Effect
Which types stood out?
   Best: Rhymes With 69.2% ± 0.4%
   Worst: Recue 55.6% ± 0.4%
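
The slide does not say how the ± values were computed. As a rough check, and only as an assumption, treating each trial as an independent success/failure and taking the binomial standard error sqrt(p(1 - p)/n) gives values of the same size for both hint types. This ignores the within-subject structure of the data, so it is a plausibility check rather than a reconstruction of the authors' analysis.

```python
from math import sqrt


def binomial_standard_error(success_rate, n_trials):
    """Standard error of a proportion, assuming independent trials."""
    return sqrt(success_rate * (1.0 - success_rate) / n_trials)


# Success rates and trial counts as reported on the slide.
for name, rate, n in [("Rhymes With", 0.692, 13165), ("Recue", 0.556, 14685)]:
    se = binomial_standard_error(rate, n)
    print(f"{name}: {rate:.1%} ± {se:.1%}")
```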
                 What helped which words best?
           Compare within level to control for word difficulty.
                           Same day:                       Later day:

        Grade 1 words:     Say In Context, Onset Rime      Onset Rime
        Grade 2 words:     Say In Context, Rhymes With     Rhymes With
        Grade 3 words:     Say In Context                  Rhymes With, One Grapheme

           Supplying the word helped best in the short term…
           But rhyming hints had longer lasting benefits.
              So… what can your computational
           linguistics model in an intelligent tutor?
    What problem is important to solve?
            Language models predict word sequences for a task.
            Domain models describe skills to learn.
            Production models describe student behavior.
            Student models estimate a student’s skills.
            Pedagogical models guide tutorial decisions.
            …
    What data is available to train on?
    What method is suitable to apply?
    What result is appropriate to evaluate?


                 …Well I got a hammer
    Well I got a hammer,
    And I got a bell,
    And I got a song to sing,
    all over this land.
    It’s the hammer of Justice,
    It’s the bell of Freedom,
    It’s the song about Love
    between my brothers and
    my sisters,
    All over this land.



                 Conclusions…
    Muchas gracias                    Tak
    Molto grazie                      Todah rabah

    Obrigado                          Shukra

    Merci beaucoup                    Efcharisto

    Danke schön                       Xeh-xeh

    Dank u wel                        Arigato gozaymas

    Spaseeba                          Kop-kun krap

    Blagodaria                        Thank you! Questions?

                 See papers & videos at www.cs.cmu.edu/~listen.
                              Thanks