Interfaces For Accessing Multimodal Meeting Data:
Systems, Problems and Possibilities
Simon Tucker and Steve Whittaker
University of Sheffield
The growing availability of multimodal meeting data
increases the need for sophisticated
mechanisms for browsing this data.
In this talk we will examine a taxonomy of meeting browsers.
We will highlight some of the problems
associated with such systems.
We will also describe potential directions for future research.
One aim of the AMI project is to generate new
browser components and technology.
To begin this task it is useful to get an
understanding of the ‘browser space’.
This allows us to identify where effort is currently
concentrated and where future opportunities lie.
There are many potential taxonomies.
For example, a taxonomy could be produced
which classifies meeting browsers by the types
of data they make use of.
We segregate browsers into four classes
according to their focus:
An example system from each of these classes
will now be described in turn.
Example: SpeechSkimmer (Arons, 1997)
Playback at two levels of
Also allows for playback
rate to be changed.
Listeners can skim
forwards and backwards
Example: Manga-Style (Girgensohn et al., 2001)
Displays a sequence of
Size of each frame
determined by an
Browsing by selection of
keyframes or by accessing
Example: Distributed Meetings (Cutler et al., 2002)
Focus of browsing is
Interface also displays
video of current speaker.
Also able to navigate via a
Furthermore, option to
increase playback speed.
Example: Lalanne et al. (2003)
Navigation using both the
ASR transcript and modes
Also aligns the document
segments with discussion.
All views are time
Speaker Turns. Presented Slides.
Pause Detection. Agenda Items
Emphasis. Whiteboard Annotations
User Determined Markings. Notes – Both Personal and Private
Documents Discussed During The
Keyframes. ASR Transcript.
Participant Behaviour. Named Entities.
Mode Of Discourse.
Possibility (1) – From Indices To Search
Systems tend to focus on browsing via indices.
There could be potential in making use of search
in meeting browsers.
There are potential benefits available from the
use of text-processing techniques.
For example navigation through a summary or
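
As an illustration of moving from indices to search, the following is a minimal sketch of keyword search over a time-stamped transcript. The `Segment` structure, its field names and the `build_index` and `search` helpers are assumptions made for this example; they are not taken from any of the browsers described in this talk.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One transcript segment (hypothetical structure for illustration)."""
    start: float   # seconds from the start of the meeting
    speaker: str
    text: str


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(segments):
    """Map each word to the set of segment positions that contain it."""
    index = defaultdict(set)
    for position, segment in enumerate(segments):
        for token in tokenize(segment.text):
            index[token].add(position)
    return index


def search(index, segments, query):
    """Return segments containing every query word, in meeting order."""
    terms = tokenize(query)
    if not terms:
        return []
    hits = set.intersection(*(index.get(term, set()) for term in terms))
    return [segments[position] for position in sorted(hits)]
```

A browser could use the `start` time of each returned segment as a jump point for playback, so that a query takes the place of scanning an index by eye.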
Possibility (2) – Filtered Presentation
The current set of browsers largely assume that
users will want to review the entire meeting.
There is an argument for browsers which filter
the meeting and present a limited set of data.
This filtering could be topic-specific, e.g. “Present
everything said at the meeting about films”.
Or it could take a more abstract form, e.g.
“Summarise this hour-long meeting in 5 minutes”.
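
The two kinds of filtering can be sketched as follows. This is a rough illustration under stated assumptions: the `Segment` fields, and in particular the `importance` score, are hypothetical, and the greedy selection used for the time-limited summary is one simple choice among many rather than a method proposed in this talk.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One stretch of the meeting (hypothetical structure for illustration)."""
    start: float       # seconds from the start of the meeting
    end: float
    text: str          # transcript of what was said
    importance: float  # assumed score, e.g. derived from emphasis detection


def filter_by_topic(segments, keywords):
    """Keep only the segments that mention at least one keyword."""
    wanted = {keyword.lower() for keyword in keywords}
    return [
        segment for segment in segments
        if wanted & set(re.findall(r"[a-z0-9']+", segment.text.lower()))
    ]


def summarise(segments, budget_seconds):
    """Greedily pick the most important segments that fit the time budget."""
    chosen = []
    remaining = budget_seconds
    for segment in sorted(segments, key=lambda s: s.importance, reverse=True):
        length = segment.end - segment.start
        if length <= remaining:
            chosen.append(segment)
            remaining -= length
    # Present the selection in its original chronological order.
    return sorted(chosen, key=lambda s: s.start)
```

For the second example query, `summarise(segments, 5 * 60)` would return a selection of segments totalling at most five minutes of playback.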
Problem (1) – Assessment
Most browser studies do not consider
assessment, and those that do are largely not
Assessments are either too component-centric
or are not comparative.
The next generation of browsers should be
assessed in relation to the field.
See e.g. the Browser Evaluation Test (Flynn and Wellner, 2003).
Problem (2) – Platform Assumption
Current browsers assume that they will be used
on a standard workstation.
There are also advantages to providing access
to meeting data through browsers running on devices
with limited resources. For example, PDAs may be used to
access meeting data between meetings.
Such browsers would be significantly different
from the feature-rich browsers described in the
For example, there may not be the screen space
for a video component.
We have described a taxonomy of meeting
browsers and have highlighted an example of each class.
We have identified two problems with the current
generation of meeting browsers.
Furthermore, we have examined two areas with
potential for future research.
B. Arons (1997), "SpeechSkimmer: A System for Interactively Skimming
Recorded Speech", ACM Transactions on Computer-Human Interaction, pp
R. Cutler, Y. Rui, A. Gupta, J.J. Cadiz, I. Tashev, L. He, A. Colburn, Z.
Zhang, Z. Liu, and S. Silverberg (2002), "Distributed Meetings: A Meeting
Capture And Broadcasting System", Proceedings of 10th ACM International
Conference on Multimedia, pp 503-512, Juan-les-Pins, France, 1-6
M. Flynn and P. Wellner (2003), "In Search of a Good BET: A Proposal For
a Browser Evaluation Test", IDIAP-COM 03-11.
A. Girgensohn, J. Boreczky, and L. Wilcox (2001), "Keyframe-Based User
Interfaces For Digital Video", IEEE Computer, pp 61-67.
D. Lalanne, S. Sire, R. Ingold, A. Behera, D. Mekhaldi, and D. Rotz (2003),
"A Research Agenda For Assessing The Utility Of Document Annotations In
Multimedia Databases Of Meeting Recordings", Proceedings of 3rd
International Workshop on Multimedia Data And Document Engineering,
Berlin, Germany, September 8th.