Spoken Language Technology (PowerPoint)

Spoken Dialogue Technology
Learning outcomes
 appreciate limitations of current
  alternatives: touch-tone, IVR (interactive
  voice response)
 understand the scope of spoken dialogue
  systems
 be aware of some current examples of
  spoken dialogue systems
 understand why spoken dialogue systems
  are useful
Touch-tone Dialogues (DTMF): Example
S: Thank you for calling the Spare Parts Corporation. If you
   wish to place an order, please key number ‘1’ on your
   telephone now, otherwise please hold on.
U: <Presses ‘1’ key>
S: Using the telephone keypad, please enter the stock code of
   the first item you wish to order.
U: <Presses 5-7-3-2-8-4>
S: Please enter the number of items you require.
U: <Presses 1-0-0>
S: One hundred items of stock number 5-7-3-2-8-4. Press
   ‘1’ to confirm or ‘2’ to cancel.
U: <Presses ‘1’>
S: Please enter the stock code of the next item you wish to
   order, or press ‘star’ if your order is complete.
Limitations of Touch Tone Systems

 restricted vocabulary
      vocabulary of 12 items (0-9,#,*)
      good for number entry
      other vocabularies have to be mapped on to the
       available keys
 problem of long lists of items
  what colour would you like? Press ‘1’ for red, ‘2’ for
  yellow, ‘3’ for green, ‘4’ for blue, ‘5’ for maroon, ‘6’ for
  pink, ‘7’ for brown, ‘8’ for black, or ‘9’ for white.
 dealing with embedded tasks
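
The key-mapping limitation can be made concrete with a minimal sketch. The function names (build_menu, menu_prompt) and the choice of keys 1-9 for options are assumptions for illustration, not part of any real IVR product. The sketch shows that every option must be tied to a key, that the prompt grows with the list, and that a tenth option no longer fits.

```python
def build_menu(options, keys="123456789"):
    """Map each option to one key; fail if there are more options than keys."""
    if len(options) > len(keys):
        raise ValueError(f"{len(options)} options but only {len(keys)} keys")
    return dict(zip(keys, options))


def menu_prompt(question, menu):
    """Build the spoken prompt that reads out every key/option pair."""
    parts = [f"'{key}' for {option}" for key, option in menu.items()]
    if len(parts) == 1:
        listing = parts[0]
    else:
        listing = ", ".join(parts[:-1]) + f", or {parts[-1]}"
    return f"{question} Press {listing}."


colours = ["red", "yellow", "green", "blue", "maroon",
           "pink", "brown", "black", "white"]
menu = build_menu(colours)
prompt = menu_prompt("What colour would you like?", menu)
print(prompt)      # the caller has to listen to all nine options
print(menu["5"])   # the system only ever receives a key, never the word
```

Adding one more colour to the list makes build_menu raise ValueError, which is the point of the slide: the vocabulary is fixed by the keypad, not by the task.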
An alternative: spoken dialogue

S: Thank you for calling ABC Office Supplies Order Line.
   How can I help you?
U: I’d like to order 10 high density IBM format 3.5 inch disks
   please
S: Certainly. Anything else?
U: No thanks
S: What’s your account number?
U: 25347
S: Okay, thanks. Your order will be dispatched today. Thank
   you for calling ABC Office Supplies.
Comparing touch-tone and spoken dialogue
 problems with touch-tone applications
    people don’t use numbers to communicate
    numeric menu selection can be tedious
    people forget what the options are and where they are
 advantages of speech
    more intuitive
    ‘hands free’ operation, e.g. voice dialling in a car
    users can say what they want to do, no need to
     remember how
What is a Dialogue System?

 interaction with a computer system using
  (spoken) natural language to
     obtain information (e.g. travel information)
     have some service performed (e.g. redirecting
      telephone calls)
     get help in solving a problem (e.g. equipment
      repair)
     support interactive learning
Characteristics of Dialogue

 extended interaction - query or problem cannot be
  resolved with a single Q-A exchange
 coherence of interaction - exchanges are related
  topically, e.g. as part of a task or plan
 subdialogues - to deal with sub-tasks,
  clarifications
 use of context to assist interpretation
 emergent structure - structure is not pre-defined,
  evolves dynamically
Requirements for Dialogue Systems

 natural - like human conversation
 mixed initiative - each participant can take
  the initiative
 co-operativity - system should try to satisfy
  the user’s goals, even where they are not
  expressed directly
 robustness - system should be able to
  process ill-formed input
Approaches to Dialogue I

 theoretical models
     discourse structure theory, ...
     dialogue and planning
     linguistics, psychology, sociology, AI, EE, ...
 dialogue modelling
     research prototypes (SUNDIAL, TRAINS, ...)
     issues: speech, (robust) parsing, dialogue
      management, standards, empirical methods,
      evaluation
Approaches to Dialogue II

 Dialogue engineering - systems that are
  designed to work in a real, commercial
  environment
     telebanking (NTT Japan)
       • uses 16 words, digits plus control words, used in
         over 70 cities in Japan
     voice dialling - for mobile telephone users
      (USA)
     directory assistance (Vocalis)
       • automates initial and final parts of the dialogue
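Components of a Dialogue System
[Figure: block diagram. Boxes, from top to bottom: Computer Information System; Dialogue Management; Language Processing (input side) and Message Generation (output side); Acoustic Phonetic Decoding (input side) and Text to Speech (output side); Signal Processing; AD/DA; PSTN.]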
Components of a Dialogue System: Speech Processing
 Speech recognition
     isolated word recognition v continuous speech
     vocabulary size (small e.g. 30 words v large
      e.g. 20,000 words)
     trained (single speaker) v speaker-independent
     output: N-best word sequences, word lattice
 Speech synthesis
     text normalisation
     prosodic contours
     concatenation (e.g. diphones)
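
As an illustration of the text normalisation step only, here is a small sketch that turns written numbers into the words a synthesiser would speak. The function names spell_digits and cardinal are hypothetical, and the British "hundred and" wording is an assumption. The sketch covers the two cases in the touch-tone example: a stock code read digit by digit and a quantity read as a cardinal number.

```python
DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]
TEENS = ["ten", "eleven", "twelve", "thirteen", "fourteen",
         "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty",
        "fifty", "sixty", "seventy", "eighty", "ninety"]


def spell_digits(code):
    """Read a code digit by digit, ignoring separators such as '-'."""
    return " ".join(DIGITS[int(ch)] for ch in code if ch in "0123456789")


def cardinal(n):
    """Read an integer from 0 to 999 as a cardinal number."""
    if not 0 <= n < 1000:
        raise ValueError("this sketch only covers 0-999")
    if n < 10:
        return DIGITS[n]
    if n < 20:
        return TEENS[n - 10]
    if n < 100:
        tens, units = divmod(n, 10)
        return TENS[tens] + ("" if units == 0 else "-" + DIGITS[units])
    hundreds, rest = divmod(n, 100)
    head = DIGITS[hundreds] + " hundred"
    return head if rest == 0 else head + " and " + cardinal(rest)


print(cardinal(100), "items of stock number", spell_digits("5-7-3-2-8-4"))
```

The same digit string needs different readings depending on what it denotes, which is why normalisation has to happen before prosody and concatenation.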
Components of a Dialogue System: Language Processing

 syntactic parsing
 semantic analysis
 robust parsing
 mixed strategies - stochastic parsing,
  keyword and keyphrase spotting
 output - semantic representation
  e.g. destination: London, time: mid morning
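
A rough sketch of keyword and keyphrase spotting that produces a semantic representation like the one above. The lexicons, the slot names and the function name spot are assumptions made up for this example; a deployed system would combine spotting with parsing rather than rely on it alone.

```python
import re

# Hypothetical domain lexicons (assumed for illustration).
DESTINATIONS = ["london", "paris", "madrid"]
TIMES = ["early morning", "mid morning", "afternoon", "evening"]

# "to <city>" is treated as the cue for the destination slot.
DEST_PATTERN = re.compile(r"\bto (" + "|".join(DESTINATIONS) + r")\b")


def spot(utterance):
    """Return a flat semantic frame; words outside the lexicons are ignored."""
    text = re.sub(r"[^a-z\s]", " ", utterance.lower())  # drop punctuation
    text = re.sub(r"\s+", " ", text).strip()            # collapse whitespace
    frame = {}
    match = DEST_PATTERN.search(text)
    if match:
        frame["destination"] = match.group(1).title()
    for phrase in TIMES:
        if phrase in text:
            frame["time"] = phrase
            break
    return frame


frame = spot("I'd like to go to London, um, around mid morning please")
print(frame)
```

Because everything outside the lexicons is skipped, hesitations and politeness markers do no harm, which is the robustness that spotting buys at the cost of a shallow analysis.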
Components of a Dialogue System: Dialogue Management

 Interpretation of the user’s utterances
     resolution of anaphora and ellipsis using
      dialogue context
     use of expectations to constrain range of
      interpretations
 Generation of system utterances in the light
  of the interpretation of the user’s utterances
  and the system’s goals
     e.g. confirm or clarify a parameter, look up
      database for information, ask a new question
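
The choice between confirming a parameter, asking a new question and looking up the database can be sketched as a simple slot-filling policy. This is one possible control strategy under stated assumptions, not the method of any particular system: the slot names, the confidence threshold of 0.6 and the function name next_action are all hypothetical.

```python
REQUIRED_SLOTS = ["destination", "time"]  # hypothetical task parameters


def next_action(state, threshold=0.6):
    """Choose the next system move.

    state maps a slot name to (value, recognition confidence, confirmed).
    """
    for slot in REQUIRED_SLOTS:
        if slot not in state:
            return ("ask", slot)                 # ask a new question
        value, confidence, confirmed = state[slot]
        if not confirmed and confidence < threshold:
            return ("confirm", slot, value)      # confirm a doubtful parameter
    # All parameters are filled and trusted: query the information system.
    return ("lookup", {slot: state[slot][0] for slot in REQUIRED_SLOTS})


print(next_action({}))
print(next_action({"destination": ("London", 0.4, False)}))
print(next_action({"destination": ("London", 0.9, False),
                   "time": ("mid morning", 0.8, False)}))
```

The three calls produce an "ask", a "confirm" and a "lookup" move respectively, one for each kind of system utterance listed above.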
Advanced dialogue management: conversational agents
 model of the conversational partner
      beliefs about the partner’s mental state and desires
 model of the discourse context
      keeping track of and updating the conversational state
 model of discourse obligations that govern the
  permissibility of actions
      S1 promise A => S1 achieve A
      S1 request A => S2 address request: accept,
                    reject or clarify A
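
The two obligation rules above can be written down as a minimal sketch. The tuple encoding and the function names obligations and discharges are assumptions for illustration. The sketch only records who owes what after an utterance and checks whether a later act discharges it.

```python
ADDRESS_OPTIONS = ("accept", "reject", "clarify")


def obligations(speaker, hearer, act, content):
    """Return the obligations created by one dialogue act."""
    if act == "promise":
        return [(speaker, "achieve", content)]   # S1 promise A => S1 achieve A
    if act == "request":
        return [(hearer, "address", content)]    # S1 request A => S2 address A
    return []


def discharges(obligation, agent, act, content):
    """True if this act by this agent satisfies the obligation."""
    holder, kind, what = obligation
    if agent != holder or content != what:
        return False
    if kind == "address":
        return act in ADDRESS_OPTIONS            # accept, reject or clarify
    return kind == "achieve" and act == "achieve"


pending = obligations("S1", "S2", "request", "arrival time")
print(pending)
print(discharges(pending[0], "S2", "clarify", "arrival time"))
print(discharges(pending[0], "S2", "inform", "weather"))
```

A request is addressed by rejecting or clarifying it as much as by accepting it; what the model rules out is ignoring it.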
Example 1: The SUNDIAL project
 U: Can you tell me the arrival time of today’s
  Madrid flight?
 S: What is the number of the flight?
 U: I don’t know
 S: Where does the flight leave from?
 U: Madrid
 S: Was that from Paris?
 U: No, Madrid
 S: Did you say from Copenhagen?
 U: Madrid
 S: Please answer ‘yes’ or ‘no’. Did you say from
  Madrid?
Example 2: The TRAINS project

U1: I want to go from Chicago to New York and Richmond
S1: What route would you like to get from Chicago to New
  York? <displays a route to Richmond> Well, is this okay?
U2: No, I want to go to New York first
S2: OK, what route would you like to get from Chicago to New
  York?
U3: Let’s go via Pittsburgh
S3: Well, yes. City Pittsburgh is congested. Trains will take an
  additional 5 hours to move through there.
U4: Okay. Let’s go via Cleveland and Syracuse instead.
S4: No problem. <displays route>

				