Preparing for KNOWLISTICS

LingTour



Groupe des Ecoles des
  Télécommunications    http://www.get-telecom.fr/
           Outline
                   Rationale for Lingtour
                   Objectives
                   Lingtour partners
                   Technical developments
                   Application architecture


Groupe des Ecoles des
  Télécommunications            The Lingtour Project
           Objectives: 3 scenarios
              Accessing information: the
               Virtual Guide
              Facilitating communication: the
               Communication Assistant
              Finding local information: the
               Orientation Assistant


            Rationale for Lingtour
           A more user-friendly assistant
           Multimedia (text, speech, image, video)
           Multimodal access (text, speech, pen,
            visual I/O)
           Initially targeted for tourist applications



           Accessing information: the
           Virtual Guide
              Convenient and rapid way to access useful
               information, locally or from a remote server
                    Hotel / restaurant (location/style/pricing),
                    Travel (possibilities/hours/fares),
                    City transportation (routes/time/fares/traffic),
                    Places to go / visit (location/hours/fees/route)
              Multimodal
                    Combining speech, text, map/image browsing
              Interactive (dialogues, question refinement)
                    Zoomable User Interfaces (ZUIs) + 2D Control menus
                    Tap and talk
                    Embodied Conversational Agents (ECAs)

           Facilitating communication:
           the Communication Assistant
              Visual display to mediate the dialogue
              Translation assistant
                    browsable sets of questions / answers
                           focused on useful situations: taxi, hotel, haggling…
                    browsable lexicon
                           to help communication
                           for speech training, thanks to the included ASR and TTS
                    Access to a remote server / operator for difficult tasks
              Multimodal
                    Speech + text + sketching
              Interactive
                    2D Control menus
                    Tap and talk
                    ECA + TTS for speech and gestural training

           The Communication Assistant:
           modes of operation
              Tourist-to-local communication, or
              Local-to-tourist communication
                    Speech / text / menu-selected input
                    Menus for refinement / correction of ASR
                    Translation
                    Display and speech synthesis of translation
              Pronunciation practice
                    From lexicon or virtual guide items
              Training modules
                    Downloaded from a server
                    situation-specific (hotel, restaurant, taxi…)


           Finding local information: the
           Orientation Assistant
              Collecting input around the device to
                    help localize the user
                    interpret the environment
              “Intelligent camera”:
                    ability to refine pictures
                    integrated (Chinese) character recognition
                    can also operate on characters sketched on the display
              ? localization facilities
                    based on triangulation and / or picture interpretation
                    feasibility depends on the characteristics of the local network(s)
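
The slide does not say how network-based localization would be computed. As one possible illustration, the sketch below assumes a distance-based variant (trilateration): the device knows the 2-D positions of three base stations and an estimated distance to each. The function name `trilaterate`, the coordinates and the distances are all hypothetical.

```python
from math import hypot


def trilaterate(anchors, distances):
    """Estimate (x, y) from three known anchor positions and measured distances.

    Subtracting the first circle equation from the other two leaves a
    2x2 linear system in x and y, solved here with Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = distances
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is ambiguous")
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - b1 * a21) / det)


# Hypothetical base stations and a simulated user position.
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
true_position = (3.0, 4.0)
distances = [hypot(true_position[0] - ax, true_position[1] - ay) for ax, ay in anchors]
estimate = trilaterate(anchors, distances)
```

With noise-free distances the estimate matches the simulated position up to floating-point rounding. Real distance estimates are noisy, which is one reason the slide makes this facility depend on the local network.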


           Lingtour partners
              TsingHua University
                    Prof. Mao Yuhang: translation from Chinese to French and English
                    Prof. Ding Xiaoqing: Chinese OCR, intelligent camera
                    Prof. Wang Zuo-yin: ASR
              CLIPS
                    Christian Boitet: translation
                    Mutsuko Tomokiyo: Multimedia-UNL
              Paris 8 University
                    Catherine Pélachaud: ECAs
              INT
                    Yang Ni: image refinement
                    Bernadette Dorizzi: HCI
              ENST-Paris
                    Gérard Chollet + Shiuan-Sung Lin: multilingual SR
                    Eric Lecolinet: ZUIs and 2-D control menus
                    Laurence Likforman: OCR
                    Jacques Prado + Alain Goyé: PDA-server communications
              ENST-Bretagne
                    Yannis Haralambous + Andre Thepaut: OCR

           Technical developments
              Chinese character recognition
              « Intelligent » Camera
              Text extraction
              Multilingual Speech Recognition
              Zoomable User Interfaces with 2-D
               control menus
              « Cultural » Embodied Conversational
               Agents

           Chinese character recognition

           Intelligent camera from
           TsingHua University

              (Figure labels: capture, recognition, translation)

           Extracting text from scene
           images
              Complex color images
              Uncontrolled illumination
              Variations: size, fonts, orientation,
                  texture
              Complex backgrounds, shadows


           Text extraction
              Searching for character regions (text has uniform
               color)
                    Multi-channel decomposition
                    Connected components analysis
                    Grouping of components
                    Alignment analysis (number of horizontally or vertically
                     aligned components)
                    Text identification (language-independent features: size,
                     alignment, …)

              Detection rate: 84 %
              False alarm rate: 5.6 %
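
Two of the steps listed above, connected components analysis and alignment analysis, can be sketched as follows. This is a minimal illustration and not the project's implementation. It assumes a binary mask has already been obtained from one colour channel (the multi-channel decomposition is not shown). The function names, the 4-connectivity, and the `min_count` and `tolerance` parameters are assumptions.

```python
from collections import deque


def connected_components(mask):
    """Return bounding boxes (top, left, bottom, right) of 4-connected regions."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def horizontally_aligned_groups(boxes, min_count=3, tolerance=1):
    """Group boxes with similar vertical centres; keep groups of min_count or more."""
    groups = []
    for box in sorted(boxes, key=lambda b: b[1]):  # scan left to right
        centre = (box[0] + box[2]) / 2
        for group in groups:
            reference = (group[0][0] + group[0][2]) / 2
            if abs(centre - reference) <= tolerance:
                group.append(box)
                break
        else:
            groups.append([box])
    return [g for g in groups if len(g) >= min_count]


# Toy mask: three aligned strokes (a candidate text line) and one stray pixel.
mask = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 1, 0, 0, 0],
    [0, 1, 0, 1, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 0],
]
boxes = connected_components(mask)
text_lines = horizontally_aligned_groups(boxes)
```

On the toy mask the three strokes form one candidate text line and the isolated pixel is discarded, which mirrors the idea of counting horizontally aligned components before deciding that a region is text.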
           Automatic Speech Recognition
           in Multiple Languages
              Sharing of acoustic models between languages to
               simplify extensibility to other languages.
              Combination of phone models and adaptation from
               small amounts of data in new languages.
              Model adaptation to user and environmental
               situations.

              (Figure labels: shared acoustic models; language-specific models for Chinese and French)
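
The split between shared and language-specific models can be pictured as a lookup structure. The sketch below is only an illustration of why sharing makes it cheaper to add a language; it does not model training or adaptation. The class `AcousticModelBank`, the language codes and the phone symbols are hypothetical and do not represent a real phone inventory.

```python
class AcousticModelBank:
    """Toy registry: phones common to several languages map to one shared model."""

    def __init__(self):
        self.shared = {}             # phone -> model id, reused across languages
        self.language_specific = {}  # (language, phone) -> model id

    def add_language(self, language, phones, shareable):
        """Register a language and return the phones that needed a new model."""
        new_models = []
        for phone in phones:
            if phone in shareable:
                if phone not in self.shared:
                    self.shared[phone] = f"shared:{phone}"
                    new_models.append(phone)
            else:
                self.language_specific[(language, phone)] = f"{language}:{phone}"
                new_models.append(phone)
        return new_models

    def lookup(self, language, phone):
        """Prefer a language-specific model, else fall back to the shared one."""
        key = (language, phone)
        if key in self.language_specific:
            return self.language_specific[key]
        return self.shared[phone]


bank = AcousticModelBank()
shareable = {"a", "m"}  # illustrative phones assumed common to both languages
first = bank.add_language("fr", ["a", "m", "y"], shareable)
second = bank.add_language("zh", ["a", "m", "zh"], shareable)
```

In this toy run the first language creates every model it needs, while the second language only adds the one phone that is not shared.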
           Zoomable user interfaces with
           2-D control menus
               2-D control menus:
                     combine the selection and the
                      control of an operation
                     integrate up to two scroll
                      bars or spin-boxes
                     keep the user's attention
                      focused on the contents
                     can have sub-menus
                     retain novice and expert
                      modes, as in marking menus

              http://www.infres.enst.fr/net/zomit/cdi.html

           Cultural Embodied
           Conversational Agents
              Behaviour adaptable to:
                    cultural and social context
                    user (tourist, journalist)
              various forms / complexity
               (2D, 3D, vector…) depending
               on device (PDA, Kiosk)
              driven by a representation
               language based on the XML-XSD
               standard (UNL type)
              embedding the influence of a
               given culture, for example
               on:
                    choice of communicative
                     gesture (smile vs head nod)
                    the duration of gaze…
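
As a rough illustration of an XML-driven behaviour description, the sketch below parses a small profile and returns the gesture and gaze duration for a given culture. The element and attribute names (`behaviourProfiles`, `profile`, `acknowledgeGesture`, `gazeSeconds`), the culture labels and all values are placeholders invented for this example. They are not the project's representation language and carry no claim about any real culture.

```python
import xml.etree.ElementTree as ET

# Hypothetical profile document; names and values are placeholders.
PROFILE_XML = """
<behaviourProfiles>
  <profile culture="culture_A" acknowledgeGesture="smile" gazeSeconds="1.5"/>
  <profile culture="culture_B" acknowledgeGesture="head_nod" gazeSeconds="0.8"/>
</behaviourProfiles>
"""


def load_profiles(xml_text):
    """Parse the profile document into a dict keyed by culture label."""
    root = ET.fromstring(xml_text)
    return {
        profile.get("culture"): {
            "gesture": profile.get("acknowledgeGesture"),
            "gaze_seconds": float(profile.get("gazeSeconds")),
        }
        for profile in root.findall("profile")
    }


profiles = load_profiles(PROFILE_XML)
behaviour = profiles["culture_B"]  # the agent would select this at run time
```

Keeping the cultural parameters in data and out of the agent's code is what lets the same agent be reconfigured per context or per device.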

           Application architecture
              (Figure labels: UMTS (?) server; translation; a word graph + a list of keywords; speech synthesis)
