     The Automatic Generation of Formal
    Annotations in a MultiMedia Indexing
         and Searching Environment

              Thierry Declerck
                    DFKI GmbH

     Annotation Workshop, DI, 15 February 2002
The MUMIS Consortium

 •   CTIT     University of Twente, Enschede, NL     NLP/IE
 •   TSI      University of Nijmegen, Nijmegen, NL   ASR
 •   DFKI     Saarbrücken, D                         NLP/IE
 •   MPI      Nijmegen, NL                           MM Archives
 •   DCS      University of Sheffield, UK            NLP/IE
 •   ESTEAM   Gothenburg, SE (location Athens, GR)   Translation
 •   VDA      Hilversum, NL                          Video
Objectives of MUMIS

• Technology development to automatically index (with
  formal annotations) lengthy multimedia recordings
  (off-line process)
      Find and annotate relevant events, together with the involved
        entities and relations. Also detect Metadata information.

• Technology development to exploit indexed
  multimedia archives (on-line process)
      Search for interesting scenes and play them via Internet

Test Domain: Soccer Games / UEFA Tournament 2000
Off-line Task

 Indexing by
 • Automatic Speech Recognition (Radio/TV Broadcasts)
       Automatically transforms the speech signals into texts (for 3
         languages — Dutch, English and German)
 • Natural Language Processing (Information Extraction)
       Analyse all available textual documents (newspapers, speech
         transcripts, tickers, formal texts ...), identify and extract
         interesting entities, relations and events. Also detect Metadata
 • Merging all the annotations produced so far
 • Create a database with formal annotations
 • Use video processing to adjust time marks
   Current Procedure            MUMIS Procedure
   Manual Video Annotation      Automatic Video Annotation and DB Integration
   Integration Central DB       Query via PC
   Query via PC                 Results on PC
   Results on PC                Select & Play
   Contact Video Archive
   Get Video Tapes
   Search on Tape on VCR
   Segment & Play

 • What gets lost? Is it necessary?
 • Potential: direct Internet Service, fewer dependencies
The Generation of
Formal Annotations

  • Metadata (type of game, teams, date, final score,
  players etc.), which can be used, among other things,
  for classifying and filtering videos in the MM digital archive
  • Events (particular actions with time codes,
  involved entities and related events), as they can be
  extracted from the video sequences
  • All Formal Annotations available in XML Standard
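
A minimal sketch of how one event annotation might be serialised as XML. The element and attribute names (`event`, `entity`, `role`, `time`) are assumptions made for illustration only; the slides do not show the actual MUMIS schema.

```python
import xml.etree.ElementTree as ET

def event_to_xml(event_type, minute, entities):
    """Serialise one event with its time code and involved entities."""
    # Hypothetical layout: one <event> element, one <entity> child per role.
    event = ET.Element("event", {"type": event_type, "time": str(minute)})
    for role, name in entities.items():
        entity = ET.SubElement(event, "entity", {"role": role})
        entity.text = name
    return ET.tostring(event, encoding="unicode")

xml_text = event_to_xml("goal", 18, {"player": "Basler", "team": "Germany"})
print(xml_text)
```

Parsing `xml_text` back with `ET.fromstring` recovers the event type, the time code and the entities, which is the property an interchange format needs.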
The Event Table
Related to domain ontology and multilingual
terminology. Guiding the generation of formal
annotations.

   Event           ID  Time       Subcat/Modification               Metadata
   Final whistle   #   90<t<120   Subj=referee, score etc…          Final score
   Shot on Goal    #   0<t<120    Subj=pl, loc=loc, cons=cons, …
   Dribbling       #   0<t<120    Subj=pl, loc=loc, …
   Substitution    #   0<t<120    Subj=pl, I.obj=pl, cause=c, …     Team (adding pl)
   Red Card        #   0<t<120    Subj=ref, I.obj=pl, cause=c, …    Team (red at t)
   Goal            #   0<t<pen.   Subj=pl, I.obj=team, score=s,     Order of goal
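
The Time column of the event table can be read as a plausibility constraint on extracted events. The sketch below is one way to encode it; the dictionary name, the function name and the choice of inclusive bounds are assumptions, and `None` stands in for the table's open-ended "pen." (penalty shoot-out).

```python
# Time windows in minutes, copied from the event table. None means
# "no fixed upper bound" and represents the table's "pen." entry.
TIME_WINDOWS = {
    "Final whistle": (90, 120),
    "Shot on Goal": (0, 120),
    "Dribbling": (0, 120),
    "Substitution": (0, 120),
    "Red Card": (0, 120),
    "Goal": (0, None),
}

def time_is_plausible(event, minute):
    """Return True if `minute` lies inside the window allowed for `event`."""
    low, high = TIME_WINDOWS[event]
    if minute < low:
        return False
    return high is None or minute <= high

print(time_is_plausible("Final whistle", 93))  # True
print(time_is_plausible("Final whistle", 45))  # False
```

A check like this could reject, for example, a "final whistle" extracted from a text passage that actually describes the half-time whistle.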

Off-line Task

 [Figure: texts from three kinds of sources (newspapers; audio commenting
 from TV and radio; tickers etc.), each in 3 languages, are processed by
 multilingual IE => event tables.]

 Merging of Annotations (the fourth table combines the slots of the other three)

   Slot          Table 1   Table 2    Table 3   Table 4
   Event         goal      goal       goal      goal
   Type                    Freekick             Freekick
   Player        Basler    Basler     Basler    Basler
   Team                               Germany   Germany
   Distance      25 m      25 m                 25 m
   Time          18        17         18        18
   Score         1:0       leading    1:0       1:0
   Final score                        1:0       1:0

 Events indexed in video recording

   Event      Foul      Goal       Pass       Defense
   Time       17 min    18 min     24 min     28 min
   Score                1:0
   Type                 Freekick              Dribbling
   Players    Neville   Basler     Matthäus   Campbell
              Basler                          Scholl
   Distance   25 m      25 m       60 m
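
The merging step on this slide can be sketched as a slot-by-slot combination of the event tables produced from the different sources. The slides do not say how conflicting values (Time = 17 versus Time = 18) are resolved; the majority vote used here is an assumption, as are the function names and the alias table for slot spellings.

```python
from collections import Counter

def normalise(table):
    """Map slot-name variants from different sources onto one spelling."""
    aliases = {"Dist.": "Distance", "Finalscore": "Final score"}
    return {aliases.get(slot, slot): value for slot, value in table.items()}

def merge_tables(tables):
    """Merge event tables slot by slot, taking a majority vote on conflicts."""
    votes = {}
    for table in tables:
        for slot, value in normalise(table).items():
            votes.setdefault(slot, Counter())[value] += 1
    return {slot: counter.most_common(1)[0][0] for slot, counter in votes.items()}

# The three partial event tables shown on the slide.
tables = [
    {"Event": "goal", "Player": "Basler", "Dist.": "25 m", "Time": 18, "Score": "1:0"},
    {"Event": "goal", "Type": "Freekick", "Player": "Basler", "Dist.": "25 m",
     "Time": 17, "Score": "leading"},
    {"Event": "goal", "Player": "Basler", "Team": "Germany", "Time": 18,
     "Score": "1:0", "Finalscore": "1:0"},
]
merged = merge_tables(tables)
print(merged)
```

With these inputs the merged table carries Type = Freekick, Team = Germany, Time = 18, Score = 1:0, Final score = 1:0 and Distance = 25 m, which matches the most complete table on the slide. A real merging component would also consult the domain ontology rather than rely on counting alone.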
The Role of IE in
 • Information Extraction (IE) is the task of identifying,
   collecting and normalizing relevant information for
   a specific application or user.
 • The relevant information is typically represented in
   the form of predefined “templates”, which are filled by
   means of Natural Language (NL) analysis (Template
   = Event Table in MUMIS)
 • IE combines pattern matching mechanisms,
   (shallow) NLP and domain knowledge (terminology
   and ontology).
Extension of our IE
system in MUMIS
 • Multilingual and multisource IE. Incremental
   information building
 • Cross-document co-reference resolution
 • Combine Metadata and event extraction => better
   organisation and dynamic updating of information
 • Multiple presentation of results: Template, Event
   table, integration in MPEG-7 XML and Hyperlinks
   (Named Entities, rel. to Knowledge Management)
• Based on XML output of SPPC (developed at DFKI)
• Mapping the XML into a feature structure
  (the CorpA/schug Program)
• Cascaded grammar descriptions for enriching
  (or correcting) the SPPC output
• Including agreement processing and
  detection of grammatical functions
• Adapting the “Paradime triangle” for
  template generation and filling
Information Extraction

 IE is generally subdivided into the following tasks:
   - Named Entity task (NE)
   - Template Element task (TE)
   - Template Relation task (TR)
   - Scenario Template task (ST)
   - Co-reference task (CO)
Subtasks of IE

 • Named Entity task (NE): Mark in the text each
   string that represents a person, organization, or
   location name, or a date or time, or a currency or
   percentage figure.
 • Template Element task (TE): Extract basic
   information related to organization, person, and
   artifact entities, drawing evidence from everywhere
   in the text.
Subtasks of IE (2)

• Template Relation task (TR): Extract relational
  information on employee_of, manufacture_of,
  location_of relations etc. (TR expresses domain-
  independent relationships).
• Scenario Template task (ST): Extract pre-specified
  event information and relate the event information
  to particular organization, person, or artifact
  entities (ST identifies domain and task specific
  entities and relations).
• Co-reference task (CO): Capture information on co-
  referring expressions, i.e. all mentions of a given
  entity, including those marked in NE and TE.
IE applied to soccer

 Terms as descriptors for the NE task
 Team: Titelverteidiger Brasilien, den respektlosen Außenseiter
 Player: Superstar Ronaldo, von Bewacher Calderwood noch von
    Abwehrchef Hendry, von Jackson als drittem Stürmer,
    Torschütze Cesar, von Roberto Carlos (16.),
 Referee: vom spanischen Schiedsrichter Garcia Aranda
 Trainer: Schottlands Trainer Brown, Kapitän Hendry seinen
    Keeper Leighton
 Location: im Stade de France von St. Denis (a more fine-grained
    location detection would be: Stadium: im Stade de France and
    City: von St. Denis)
 Attendance: Vor 80000 Zuschauern
IE applied to soccer (2)

 Terms for NE Task
 Time: in der 73. Minute, nach gerade einmal 3:50 Minuten, von
    Roberto Carlos (16.), nach einer knappen halben Stunde,
    scheiterte Rivaldo (49./52.) jeweils nur knapp, das vor der
    Pause Versäumte versuchten die Brasilianer nach
    Wiederbeginn, ...
 Date: am Mittwoch, der Turnierstart (?), im WM-Eröffnungsspiel
 Score/Result: Brasilien besiegt Schottland 2:1, einen 2:1 (1:1)-
    Sieg, der zwischenzeitliche Ausgleich, in der 4. Minute in
    Führung gebracht, köpfte zum 1:0 ein
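
As an illustration of the pattern-matching side of the NE task, the sketch below extracts match minutes from two of the surface forms listed above: "in der 73. Minute" and the bracketed shorthand "(16.)" or "(49./52.)". It is plain regular-expression matching written for this example, not the grammar cascade used in the system, and vaguer expressions such as "nach einer knappen halben Stunde" are deliberately not covered.

```python
import re

# "in der 73. Minute"
MINUTE_PHRASE = re.compile(r"in der (\d{1,3})\. Minute")
# "(16.)" or "(49./52.)"; a score such as "(1:1)" does not match
MINUTE_BRACKET = re.compile(r"\((\d{1,3}\.(?:/\d{1,3}\.)*)\)")

def extract_minutes(text):
    """Return all match minutes mentioned in `text`, in order of appearance."""
    found = []
    for match in MINUTE_PHRASE.finditer(text):
        found.append((match.start(), [int(match.group(1))]))
    for match in MINUTE_BRACKET.finditer(text):
        minutes = [int(part.rstrip(".")) for part in match.group(1).split("/")]
        found.append((match.start(), minutes))
    # Sort by position in the text, then flatten the groups.
    return [minute for _, group in sorted(found) for minute in group]

print(extract_minutes("scheiterte Rivaldo (49./52.) jeweils nur knapp"))  # [49, 52]
print(extract_minutes("in der 73. Minute"))                               # [73]
```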
IE applied to soccer (3)

 Relations for TR Task
 Opponents: Brasilien besiegt Schottland, feierte der Top-Favorit
     ... einen glücklichen 2:1 (1:1)-Sieg über den respektlosen
     Außenseiter Schottland,
 Player_of: hatte Cesar Sampaio den vierfachen Weltmeister ... in
     Führung gebracht, Collins gelang ... der zwischenzeitliche
     Ausgleich für die Schotten, der Keeper des FC Aberdeen,
     Brasiliens Keeper Taffarel
 Trainer_of: Schottlands Trainer Brown
IE applied to soccer (4)

 Events for ST task:
 Goal: in der 4. Minute in Führung gebracht, das schnellste Tor ...
    markiert, Cesar Sampaio köpfte zum 1:0 ein, Collins (38.)
    verwandelte den Strafstoß, hätte Kapitän Hendry seinen
    Keeper Leighton um ein Haar zum zweiten Mal bezwungen, von
    dem der Ball ins Tor prallte
 Foul: als er den durchlaufenden Gallacher im Strafraum allzu
    energisch am Trikot zog
 Substitution: und mußte in der 59. Minute für Crespo Platz
IE applied to soccer (5)

 Description of the Templates: Team
         TACTIC    []
         SCORE     []
         NAME      []
         PLAYER    []
         TRAINER   []

         goal-template                  team-template
         TIME      []                   TACTIC    []
         SCORE     [S]                  SCORE     [S]
         PLAYER    [P]                  NAME      []
         TEAM      [team-templ]         PLAYER    [P]
         TYPE      []                   TRAINER   []
         SUCCESS   []
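
The two templates can be sketched as linked records. The tags [S] and [P] are read here as shared values between the goal template and its embedded team template; that reading, the class names and the `fill` method are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class TeamTemplate:
    name: Optional[str] = None
    tactic: Optional[str] = None
    score: Optional[str] = None
    players: List[str] = field(default_factory=list)
    trainer: Optional[str] = None

@dataclass
class GoalTemplate:
    team: TeamTemplate                 # TEAM slot holds a team template
    time: Optional[int] = None
    goal_type: Optional[str] = None    # TYPE slot
    success: Optional[bool] = None
    player: Optional[str] = None
    score: Optional[str] = None

    def fill(self, player, score, time):
        """Fill the goal slots and propagate the shared [S] and [P] values."""
        self.player, self.score, self.time = player, score, time
        self.team.score = score              # [S]: same score in both templates
        if player not in self.team.players:  # [P]: the scorer plays for the team
            self.team.players.append(player)

germany = TeamTemplate(name="Germany")
goal = GoalTemplate(team=germany)
goal.fill(player="Basler", score="1:0", time=18)
print(germany)
```

Filling the goal template updates the team template as a side effect, which is the incremental information building that linked templates are meant to support.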
Merging Component

•   Acting on the generated formal annotations
    (Metadata and Events), but also interleaved
    with their generation process
•   Checking consistency and eliminating
    redundancy (Template Merging), in
    accordance with the domain ontology
•   Completing the information with domain
    knowledge (inference machine)
Use of Standards

 •   XML as the annotation language and data
     interchange format

 •   MPEG-7: standard for the description of
     features of multimedia content, XML
     compliant (for content description), with a
     slot for textual annotations
More about MPEG (Moving Picture Experts Group)

 •   MPEG-1: For the storage and retrieval of
     moving pictures and audio on storage media
 •   MPEG-2: For digital television
 •   MPEG-4: Codes content as objects and
     enables those objects to be manipulated
 •   MPEG-7: Where 1, 2 and 4 make content
     available, MPEG-7 makes it possible to find
     the content one needs
On-line Tasks

 Searching and Displaying
 • Search for interesting events with formal queries
       Give me all goals by Overmars scored with his head in the first half.
       Event=Goal; Player=Overmars; Time<=45; Previous-Event=Headball

 • Indicate hits by thumbnails & let user select scene

 • Play scene via the Internet & allow scrolling
       Of course: slow motion, fast play, start/stop, etc
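
The formal query shown above can be evaluated against indexed events with very little machinery. The sketch below parses the `Slot=Value; Slot<=Value` notation from the slide and filters a list of event records; the record layout, the function names and the sample events (including the placeholder player name) are invented for illustration.

```python
import re

CONDITION = re.compile(r"^\s*([\w-]+)\s*(<=|>=|=|<|>)\s*(.+?)\s*$")

OPERATORS = {
    "=": lambda a, b: a == b,
    "<=": lambda a, b: a <= b,
    ">=": lambda a, b: a >= b,
    "<": lambda a, b: a < b,
    ">": lambda a, b: a > b,
}

def parse_query(query):
    """Split 'Slot=Value; Slot<=Value' into (slot, operator, value) triples."""
    conditions = []
    for part in query.split(";"):
        if not part.strip():
            continue
        match = CONDITION.match(part)
        if match is None:
            raise ValueError(f"cannot parse condition: {part!r}")
        slot, op, value = match.groups()
        conditions.append((slot, op, int(value) if value.isdigit() else value))
    return conditions

def search(events, query):
    """Return the events that satisfy every condition of the query."""
    conditions = parse_query(query)
    return [
        event for event in events
        if all(slot in event and OPERATORS[op](event[slot], value)
               for slot, op, value in conditions)
    ]

# Invented sample data; Time is stored as an integer number of minutes.
events = [
    {"Event": "Goal", "Player": "Overmars", "Time": 23, "Previous-Event": "Headball"},
    {"Event": "Goal", "Player": "Overmars", "Time": 67, "Previous-Event": "Headball"},
    {"Event": "Goal", "Player": "OtherPlayer", "Time": 30, "Previous-Event": "Pass"},
]
hits = search(events, "Event=Goal; Player=Overmars; Time<=45; Previous-Event=Headball")
print(hits)
```

Only the first record is returned: the second fails `Time<=45` and the third fails the `Player` condition.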
On-line Tasks

 [Figure: Knowledge Guided User Interface and Search Engine, showing the
 indexed events of a selected game.]

   Event      Freekick   Goal       Pass       Defense
   Time       17 min     18 min     24 min     28 min
   Type       Foul       Freekick              Dribbling
   Players    Kohler     Basler     Matthäus   Wörns
              Basler                           Bierhoff
   Distance   25 m       25 m       60 m

 Games: München - Ajax (1998), München - Porto (1996), Germany - Brazil (1998)

                                                        of that Game
On-line SW Architecture

 Server structure:
    • fully distributed
    • JMF media presentation
    • RMI-based interaction

 Query interface:
    • pre-selection
    • guided by domain knowledge
    • interactive, visual feedback

 [Figure: architecture diagram. Client: Applet with JMF (RTP, RTSP).
 Servers: WWW Server, Java Server, several Media Servers (MPEG1) and a
 DB Server (rDBMS). Objects: Media Server Objects, Hit Rendering Objects,
 Query Engine Objects. Resources: Ontology, Lexica, MPEG Movies,
 Keyframes, Annotations, Metadata.]
On-line HW Architecture

 [Figure: hardware diagram with RAID, Library, FC Switch, two Media
 Servers and a Gb-Switch.]

           • efficient & reliable storage management
                     (near-line capacity, media change, second location)
           • high storage capacity (n TB, 1 h MPEG1 = 1 GB)
           • powerful media servers / powerful network
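
The storage figure above (1 h of MPEG1 = 1 GB) allows a quick capacity estimate. The sketch assumes decimal units (1 TB = 1000 GB, 1 GB = 8000 Mbit); the function names are illustrative.

```python
GB_PER_HOUR = 1.0   # from the slide: 1 h of MPEG1 video = 1 GB
GB_PER_TB = 1000    # assumption: decimal units

def hours_of_video(capacity_tb):
    """Hours of MPEG1 video that fit into `capacity_tb` terabytes."""
    return capacity_tb * GB_PER_TB / GB_PER_HOUR

def implied_bitrate_mbit_s():
    """Average bitrate implied by 1 GB per hour, in Mbit/s."""
    return GB_PER_HOUR * 8 * 1000 / 3600

print(hours_of_video(2))                   # 2000.0
print(round(implied_bitrate_mbit_s(), 2))  # 2.22
```

At this rate each terabyte holds roughly a thousand hours of video, so an archive of n TB stores on the order of n thousand hours.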



