Accessing Multimodal Meeting Data Idiap Research Institute


									Interfaces For Accessing Multimodal Meeting Data:
               Systems, Problems and Possibilities

                   Simon Tucker and Steve Whittaker
                              University of Sheffield

 The availability of multimodal meeting data
  increases the requirement for sophisticated
  mechanisms for browsing this data.
 In this talk we will examine a taxonomy of
  meeting browsers.
 We will highlight some of the problems
  associated with such systems.
 We will also describe potential directions for
  future research.
    Why Taxonomy?
 One aim of the AMI project is to generate new
  browser components and technology.
 To begin this task it is useful to get an
  understanding of the ‘browser space’.
 This allows us to identify where efforts are
  allocated and where future opportunities lie.
 There are many potential taxonomies.
 For example, a taxonomy could be produced
  which classifies meeting browsers by the types
  of data they make use of.
    Browser Taxonomy

 We segregate browsers into four classes
  according to their focus:
   Audio Browsers.
   Video Browsers.
   Artefact Browsers.
   Discourse Browsers.
 An example system from each of these classes
  will now be described in turn.
     Audio Browsers

      Example: SpeechSkimmer (Arons, 1997)

 Playback at two levels of
  skimming, e.g.
   Pause shortening.
   Segment selection.
 Also allows for playback
  rate to be changed.
 Listeners can skim
  forwards and backwards
  through speech.
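
A minimal sketch of pause shortening and playback-rate change, under stated assumptions. It is not SpeechSkimmer's actual implementation: the segment representation, the function names `shorten_pauses` and `playback_duration`, and the 0.5 second pause cap are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start, end) of detected speech, in seconds


def shorten_pauses(speech: List[Segment], max_pause: float = 0.5) -> List[Segment]:
    """Return a playback timeline in which every pause between consecutive
    speech segments is capped at max_pause seconds (assumed value).
    Silence before the first segment is dropped entirely."""
    timeline: List[Segment] = []
    cursor = 0.0      # current position on the shortened timeline
    prev_end = None   # end of the previous segment in the original recording
    for start, end in sorted(speech):
        if prev_end is not None:
            gap = max(0.0, start - prev_end)   # guard against overlapping segments
            cursor += min(gap, max_pause)
        length = end - start
        timeline.append((cursor, cursor + length))
        cursor += length
        prev_end = end
    return timeline


def playback_duration(timeline: List[Segment], rate: float = 1.0) -> float:
    """Listening time for the shortened timeline at a given playback rate."""
    if not timeline:
        return 0.0
    return timeline[-1][1] / rate


speech = [(0.0, 2.0), (5.0, 6.0)]
short = shorten_pauses(speech, max_pause=0.5)
```

With these inputs the three-second pause shrinks to half a second, and doubling the playback rate halves the remaining listening time.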
     Video Browsers

 Example: Manga-Style (Girgensohn et al., 2001)

 Displays a sequence of keyframes.
 Size of each frame
  determined by an
  importance score.
 Browsing by selection of
  keyframes or by accessing
  a time-line.
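
The idea of sizing frames by importance can be sketched as follows. This is an illustrative assumption, not the method of Girgensohn et al.: the function name `frame_sizes`, the three size levels, and the linear bucketing of scores are all hypothetical.

```python
from typing import List, Sequence


def frame_sizes(scores: Sequence[float], sizes: Sequence[int] = (1, 2, 3)) -> List[int]:
    """Map each keyframe's importance score to a display size.
    Scores are bucketed linearly between the lowest and highest score,
    so more important keyframes receive larger frames."""
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    span = hi - lo
    result = []
    for s in scores:
        if span == 0:
            idx = 0  # all scores equal: every frame gets the smallest size
        else:
            # Scale into [0, len(sizes)] and clamp the top score into the last bucket.
            idx = min(int((s - lo) / span * len(sizes)), len(sizes) - 1)
        result.append(sizes[idx])
    return result


layout = frame_sizes([0.0, 0.5, 1.0])
```

A layout engine would then pack frames of these relative sizes into rows, which is outside the scope of this sketch.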
      Artefact Browsers

Example: Distributed Meetings (Cutler et al., 2002)

 Focus of browsing is
  captured whiteboard
 Interface also displays
  video of current speaker.
 Also able to navigate via a
 Furthermore, option to
  increase playback speed.
      Discourse Browsers

           Example: Lalanne et al. (2003)

 Navigation using both the
  ASR transcript and modes
  of discourse.
 Also aligns the document
  segments with discussion.
 All views are time
 Novel kaleidoscope
  navigation device.
        Interim Summary

 Perceptual                     Semantic

 Audio                          Artefacts
   Speaker Turns.                 Presented Slides.
   Pause Detection.               Agenda Items.
   Emphasis.                      Whiteboard Annotations.
   User Determined Markings.      Notes – Both Personal and Private.
                                  Documents Discussed During The Meeting.

 Video                          Discourse
   Keyframes.                     ASR Transcript.
   Participant Behaviour.         Named Entities.
                                  Mode Of Discourse.
     Possibility (1) – From Indices To Search

 Systems tend to focus on browsing via indices.
 Meeting browsers could also benefit from
  search.
 Text-processing techniques offer further
  potential benefits.
 For example, navigation through a summary or
  by named entities.
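
One way search could complement index-based browsing is an inverted index over a time-stamped transcript, so that a query returns playback positions. The sketch below is an assumption for illustration only: the transcript format, the sample utterances, and the names `build_index` and `search` do not come from any system described in this talk.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

Utterance = Tuple[float, str]  # (start time in seconds, transcribed text)

WORD = re.compile(r"[a-z0-9']+")


def build_index(transcript: List[Utterance]) -> Dict[str, List[float]]:
    """Map each word to the start times of the utterances that contain it."""
    index: Dict[str, List[float]] = defaultdict(list)
    for start, text in transcript:
        for word in set(WORD.findall(text.lower())):
            index[word].append(start)
    return index


def search(index: Dict[str, List[float]], query: str) -> List[float]:
    """Return start times of utterances containing every query word."""
    terms = WORD.findall(query.lower())
    if not terms:
        return []
    hits = set(index.get(terms[0], []))
    for term in terms[1:]:
        hits &= set(index.get(term, []))
    return sorted(hits)


# Hypothetical transcript used only to exercise the sketch.
transcript = [
    (0.0, "Welcome to the project meeting"),
    (12.5, "The budget for the film project"),
    (30.0, "Any other business"),
]
index = build_index(transcript)
```

A browser could use the returned times to jump the audio or video playback to each matching utterance.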
     Possibility (2) - Filtered Presentation

 The current set of browsers largely assume that
  users will want to review the entire meeting.
 There is an argument for browsers which filter
  the meeting and present a limited set of data.
 This filtering could be topic-specific, e.g. “Present
  anything to do with films said at the meeting”.
 Or it could take a more abstract form, e.g.
  “Summarise this hour-long meeting in 5 minutes”.
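
The time-limited summary could be approximated by choosing the most important segments that fit a playback budget. The sketch below is a hypothetical greedy selection: it assumes each segment already carries an importance score from some upstream component, and the name `summarise` and the sample data are illustrative only.

```python
from typing import List, Tuple

Scored = Tuple[float, float, float]  # (start, end, importance), times in seconds


def summarise(segments: List[Scored], budget: float) -> List[Scored]:
    """Greedily pick the highest-scoring segments whose total length fits
    within budget seconds, then return them in chronological order."""
    chosen: List[Scored] = []
    remaining = budget
    for seg in sorted(segments, key=lambda s: s[2], reverse=True):
        length = seg[1] - seg[0]
        if length <= remaining:
            chosen.append(seg)
            remaining -= length
    return sorted(chosen)  # tuples sort by start time first


# Hypothetical scored segments and a five-minute budget.
segments = [
    (0, 60, 0.2),
    (60, 180, 0.9),
    (180, 240, 0.5),
    (240, 400, 0.8),
]
summary = summarise(segments, budget=300)
```

Greedy selection is simple but not optimal; a knapsack-style formulation could pack the budget more tightly at the cost of extra complexity.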
    Problem (1) - Assessment

 Most browser studies do not consider
  assessment, and those that do are largely not
  rigorous enough.
 Assessments are either too component-centric
  or are not comparative.
 The next generation of browsers should be
  assessed in relation to the field.
 See e.g. Browser Evaluation Test - Flynn and
  Wellner (2003).
    Problem (2) – Platform Assumption

 Current browsers assume that they will be used
  on a standard workstation.
 There are also advantages to providing access
  to meeting data from devices with limited
  resources. For example, PDAs may be used to
  access meeting data between meetings.
 Such browsers would be significantly different
  from the feature-rich browsers described in the
 For example, there may not be the screen space
  for a video component.

 We have described a taxonomy of meeting
  browsers and have highlighted an example of
  each class.
 We have identified two problems with the current
  generation of meeting browsers.
 Furthermore, we have examined two areas with
  potential for future research.
    References
 B. Arons (1997), "SpeechSkimmer: A System for Interactively Skimming
  Recorded Speech", ACM Transactions on Computer-Human Interaction, pp
 R. Cutler, Y. Rui, A. Gupta, J.J. Cadiz, I. Tashev, L. He, A. Colburn, Z.
  Zhang, Z. Liu, and S. Silverberg (2002), "Distributed Meetings: A Meeting
  Capture And Broadcasting System", Proceedings of 10th ACM International
  Conference on Multimedia, pp 503-512, Juan-les-Pins, France, 1-6
 M. Flynn and P. Wellner (2003), "In Search of a Good BET: A Proposal For
  a Browser Evaluation Test", IDIAP-COM 03-11.
 A. Girgensohn, J. Boreczky, and L. Wilcox (2001), "Keyframe-Based User
  Interfaces For Digital Video", IEEE Computer, pp 61-67.
 D. Lalanne, S. Sire, R. Ingold, A. Behera, D. Mekhaldi, and D. Rotz (2003),
  "A Research Agenda For Assessing The Utility Of Document Annotations In
  Multimedia Databases Of Meeting Recordings", Proceedings of 3rd
  International Workshop on Multimedia Data And Document Engineering,
  Berlin, Germany, September 8th.
