



                                                         CANTATA


    Content Aware Networked systems Towards Advanced and Tailored
                              Assistance
                                          BMVA 2007

                                       December 12, 2007
                                 Francois Bremond – INRIA sophia




BMVA CANTATA – INRIA, December 12, 2007        page 1
                                                            JL-2




CANTATA Introduction

Problem statement

•   2 year ITEA Project, ending in December 2008
•   Large amounts of data for transfer and interpretation
•     3 MCA challenges
          Surveillance
          Consumer applications
          Medical
•   Solution
          High-level descriptions by
           means of content analysis
          Retrieval by Intelligent Indexing







CANTATA Introduction

Long term vision

•   Develop systems
          That are aware of the content and understand it
          That apply this knowledge to establish an action or autonomously
           control the environment
          That will be a “virtual specialist” as it will apply the knowledge to
           assist the decision-making security officer
•   Challenges
          Video content models for robust analysis and reasoning
          Self-learning, context awareness for faithful system performance
          Performance quantification of MCA for objective evaluation
           (standard dataset, well-defined metrics)







CANTATA the scope

The CANTATA platform








WP4 Validation and classification

Objective

• This work package aims at defining an overall objective
  validation framework that covers the various aspects of
  MCA systems




                         Validation chain





WP4 Validation and classification

Organisation

• Work Package leader: Barco
      Task 4.1 State-Of-The-Art : Inria
      Task 4.2 Requirements : VDG Security
      Task 4.3 Creation of Datasets : Kingston University
      Task 4.4 Annotation tool : Traficon
      Task 4.5 Ground truth : Multitel
      Task 4.6 Validation metrics : Philips medical
      Task 4.7 Publication of validation : Codasystem


• Other partners: Acic, UPF, IBBT, Philips research







                                               This work
               gives an overview of projects in performance evaluation
                                          and proposed datasets







Performance evaluation

Creation of WEB PAGE with existing VIDEO DATASETS
• Topics:
       Surveillance
       Consumer applications
       Medical


• Content:
         Website: Webpage link (if any)
         Description of Dataset: (Content, size, etc)
         Description of Ground Truth/Metadata: (if any)
         Contextual info: environment conditions (calibration, scene...)
         Results from metrics and ground truth:
         Comments:
         Information on Copyright: Licence, Cost, etc.
         Contact person from Cantata: contact person to get more info.
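
The per-dataset fields listed above could be held in a small record type. The following is only a sketch: the class name, the field names and the `missing_fields` helper are assumptions made for illustration and do not come from the CANTATA web page. The example values are taken from the ETISEO entry.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetEntry:
    """One catalogue entry. Field names are illustrative assumptions."""
    name: str
    topic: str  # "Surveillance", "Consumer applications" or "Medical"
    website: Optional[str] = None  # web page link, if any
    dataset_description: str = ""
    ground_truth_description: str = ""
    contextual_info: str = ""
    results: str = ""
    comments: str = ""
    copyright_info: str = ""
    contact: str = ""

    def missing_fields(self) -> List[str]:
        """Return the names of fields that were left empty."""
        return [key for key, value in vars(self).items() if value in (None, "")]


# Partially filled entry, using values from the ETISEO slide.
etiseo = DatasetEntry(
    name="ETISEO",
    topic="Surveillance",
    website="http://www-sop.inria.fr/orion/ETISEO/",
    contextual_info="zone of interest, calibration matrix",
    contact="francois.bremond@sophia.inria.fr",
)
print(etiseo.missing_fields())
```

A helper such as `missing_fields` makes it easy to spot entries where, for example, no ground truth description has been provided yet.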




    Surveillance

    ETISEO
•    Website: http://www-sop.inria.fr/orion/ETISEO/
•    Description of Dataset: 86 video clips. These sequences constitute a
     representative panel of different video surveillance areas.
     They merge indoor and outdoor scenes, corridors, streets, building entries,
     subway station...
     They also mix different types of sensors and complexity levels.
•    Description of Ground Truth/Metadata: 5 different levels: Object Detection,
     Object Localization, Object Tracking, Object Classification, Event Recognition.
•    Contextual info: zone of interest, calibration matrix
•    Results from metrics and ground truth: bounding box, object class, events
•    Comments:
•    Information on Copyright: Free download, but registration and a user agreement
     are required.
•    Contact person from Cantata: francois.bremond@sophia.inria.fr







    Surveillance

    PETS 2001

•    Website: http://www.cvg.cs.rdg.ac.uk/PETS2001/pets2001-dataset.html
•    Description of Dataset: Outdoor people and vehicle tracking (two synchronised
     views).
•    Description of Ground Truth/Metadata: Tracking information on image
     plane and ground plane can be found at:
     http://www.cvg.cs.rdg.ac.uk/PETS2001/ANNOTATION/
•    Contextual info: Camera Calibration provided
•    Results from metrics and ground truth: Centroid and bounding box coordinates
     on image plane, object class (person, vehicle, other), position on ground plane
     and object orientation.
•    Information on Copyright: Free download from website
•    Contact person from Cantata: Dimitrios Makris, d.makris@kingston.ac.uk








    Surveillance

    PETS 2002- VISOR BASE: Moving People
•    Website: http://www.cvg.cs.rdg.ac.uk/PETS2002/pets2002-db.html
•    Description of Dataset: Indoor people tracking (and counting). Two training and
     four testing sequences consist of people moving in front of a shop window.
     Sequences are provided as both MPEG movie format and as individual JPEG
     images.
•    Description of Ground Truth/Metadata: People tracking, counting and activity
     recognition.
•    Contextual info: No calibration
•    Results from metrics and ground truth: How many people are passing in front
     of the shop window, how many people stop and look into the window, how
     many people are looking into the window at each instant (frame) in time, the
     trajectories of people passing in front of the store, the time spent per frame
     (processing time): a histogram of the microseconds spent processing each
     frame.
•    Information on Copyright: Free download from website
•    Contact person from Cantata: Dimitrios Makris, d.makris@kingston.ac.uk
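
The last requested result, a histogram of the microseconds spent processing each frame, can be sketched as follows. The function name, the 1000-microsecond bin width and the sample timings are assumptions made for illustration; the source does not specify a bin width.

```python
from collections import Counter
from typing import Dict, Iterable


def processing_time_histogram(times_us: Iterable[float],
                              bin_width_us: int = 1000) -> Dict[int, int]:
    """Count frames per bin; each key is the lower edge of a bin in microseconds."""
    if bin_width_us <= 0:
        raise ValueError("bin_width_us must be positive")
    counts = Counter((int(t) // bin_width_us) * bin_width_us for t in times_us)
    return dict(sorted(counts.items()))


# Hypothetical per-frame processing times in microseconds.
sample_times_us = [850, 1200, 1900, 2100, 950, 1500]
histogram = processing_time_histogram(sample_times_us)
print(histogram)
```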





    Surveillance

    PETS-ICVS'2003 - FGnet
•    Website: http://www.cvg.cs.rdg.ac.uk/PETS-ICVS/pets-icvs-db.html
•    Description of Dataset: Smart meeting, that includes facial expressions, gaze
     and gesture/action. The environment consists of three cameras: one mounted
     on each of two opposing walls, and an omnidirectional camera positioned at
     the centre of the room. The dataset consists of four scenarios.
•    Description of Ground Truth/Metadata: a) Eye positions of people in Scenarios
     A, B and D. (every 10th frame is annotated). b) Facial expression and gaze
     estimation for Scenarios A and D, Cameras 1-2. c) Gesture/action annotations
     for Scenarios B and D, Cameras 1-2.
•    Contextual info: Camera Calibration provided.
•    Results from metrics and ground truth: For each frame, the requirement is to
     perform: face localisation (centre location of eyes), recognition of facial
     expression, recognition of face/hand gesture, estimation of face/head direction
     (gaze), recognition of actions.
•    Information on Copyright: Free download
•    Contact person from Cantata: Dimitrios Makris, d.makris@kingston.ac.uk




    Surveillance

    PETS-ECCV'2004 - CAVIAR
•    Website: http://groups.inf.ed.ac.uk/vision/CAVIAR/CAVIARDATA1/
     or http://www-prima.inrialpes.fr/PETS04/caviar_data.html
•    Description of Dataset: People walking alone, meeting with others, window
     shopping, fighting and passing out and leaving a package in a public place. All
     video clips were filmed with a wide angle camera lens. The resolution is half-
     resolution PAL standard (384 x 288 pixels, 25 frames per second) and
     compressed using MPEG2. The file sizes are about 10 MB.
•    Description of Ground Truth/Metadata: Person/Group Tracking, Person/Group
     Activity Recognition, Scenario/Situation Recognition
•    Contextual info: 3D coordinates of points for calibration purposes provided.
•    Results from metrics and ground truth: For each frame and object/group :
     bounding box and behaviour label. Also, for each frame, labels for
     situations/scenarios for the whole image.
•    Information on Copyright: Free download from website. If you publish results
     using the data, please acknowledge the data as coming from the EC Funded
     CAVIAR project/IST 2001 37540, found at URL:
     http://www.dai.ed.ac.uk/homes/rbf/CAVIAR/
•    Contact person from Cantata: Dimitrios Makris, d.makris@kingston.ac.uk




    Surveillance

    PETS'2006 - ISCAPS
•    Website: http://pets2006.net/
•    Description of Dataset: Surveillance of public spaces, detection of left luggage
     events. Scenarios of increasing complexity, captured using multiple sensors.
•    Description of Ground Truth/Metadata: XML files with calibration parameters
     (given in the sub-directory 'calibration') and with configuration and
     ground-truth information.
•    Contextual info: Calibration provided.
•    Results from metrics and ground truth: The radii distances, luggage location,
     warning / alarm triggers etc
•    Information on Copyright: Free download from website. The UK Information
     Commissioner has agreed that the PETS 2006 data-sets described here may
     be made publicly available for the purposes of academic research. The video
     sequences are copyright ISCAPS consortium and permission is hereby
     granted for free download for the purposes of the PETS 2006 workshop.
•    Contact person from Cantata: Dimitrios Makris, d.makris@kingston.ac.uk





    Surveillance

    PETS'2007 - REASON
•    Website: http://pets2007.net/
•    Description of Dataset: The datasets are multisensor sequences containing the
     following 3 scenarios, with increasing scene complexity: 1. loitering, 2.
     attended luggage removal (theft), 3. unattended luggage.
•    Description of Ground Truth/Metadata: Event Detection
•    Contextual info: Calibration provided
•    Results from metrics and ground truth: Event Details (type, location, time)
•    Information on Copyright: Free download from website. The UK Information
     Commissioner has agreed that the PETS 2007 datasets described here may be
     made publicly available for the purposes of academic research. The video
     sequences are copyright UK EPSRC REASON Project consortium and
     permission is hereby granted for free download for the purposes of the PETS
     2007 workshop.
•    Contact person from Cantata: Dimitrios Makris, d.makris@kingston.ac.uk







    Surveillance

    Level Crossing
•    Website: http://www.multitel.be/~va/selcat/
•    Description of Dataset: These datasets comprise 24 hours of real
     sequences showing a level crossing (LC) where some vehicles stop because of
     its particular configuration: on the right side of the LC there is an avenue
     parallel to the LC, so a traffic light is located just after the LC.
     Consequently, vehicles sometimes stop on the LC because of this traffic light.
     The total amount of data is about 7 gigabytes.
•    Description of Ground Truth/Metadata: For each video file, there is a
     corresponding ground-truth file in XML that gives the timestamps of
     "stopped vehicle" events.
•    Contextual info: environment conditions (calibration, scene...)
•    Contact person from Cantata: Caroline Machy, machy@multitel.be








    Surveillance

    SPEVI: Single face dataset
•    Website: www.spevi.org
•    Description of Dataset: This is a dataset for single person/face visual detection
     and tracking. The dataset is composed of five sequences with different
     illumination conditions and resolutions.
•    Description of Ground Truth/Metadata: The ground truth data is available in the
     .zip files for the sequences motinas_toni and motinas_emilio_webcam. In the
     ground truth files each line of text describes the objects' position and size in a
     frame. The syntax of a line is the following:
     frame number_of_objects obj_1_name x y half_width half_height angle
     obj_2_name x y half_width half_height angle ...
•    Information on Copyright: Requested citation acknowledgment E. Maggio, A.
     Cavallaro, "Hybrid particle filter and mean shift tracker with adaptive transition
     model", in Proc. of IEEE Int. Conference on Acoustics, Speech and Signal
     Processing (ICASSP 2005), Philadelphia, 19-23 March 2005, pp. 221 - 224.
•    Contact person from Cantata: Xavier Desurmont, desurmont@multitel.be
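
A line in the ground truth syntax quoted above could be parsed along the following lines. This is a minimal sketch that assumes whitespace-separated tokens, object names without spaces, and numeric values readable as floats; none of this is confirmed by the source, and the names `TrackedObject` and `parse_ground_truth_line` as well as the sample line are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

FIELDS_PER_OBJECT = 6  # name, x, y, half_width, half_height, angle


@dataclass
class TrackedObject:
    name: str
    x: float
    y: float
    half_width: float
    half_height: float
    angle: float


def parse_ground_truth_line(line: str) -> Tuple[int, List[TrackedObject]]:
    """Parse 'frame number_of_objects obj_1_name x y half_width half_height angle ...'."""
    tokens = line.split()
    if len(tokens) < 2:
        raise ValueError("expected at least a frame index and an object count")
    frame, count = int(tokens[0]), int(tokens[1])
    rest = tokens[2:]
    if len(rest) != count * FIELDS_PER_OBJECT:
        raise ValueError("token count does not match the declared number of objects")
    objects = []
    for i in range(count):
        name, *values = rest[i * FIELDS_PER_OBJECT:(i + 1) * FIELDS_PER_OBJECT]
        objects.append(TrackedObject(name, *(float(v) for v in values)))
    return frame, objects


# Hypothetical line with two objects; the values are made up.
frame, objects = parse_ground_truth_line(
    "12 2 face_a 100 80 20 25 0.0 face_b 300 90 18 22 -5.5"
)
print(frame, objects)
```

Checking the token count against `number_of_objects` catches truncated lines early instead of silently returning fewer objects.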





    Surveillance

    SPEVI: Multiple faces dataset
•    Website: www.spevi.org
•    Description of Dataset: This is a dataset for multiple people/faces visual detection and
     tracking. The dataset is composed of 3 sequences (same scenario); 4 targets repeatedly
     occlude each other while appearing and disappearing from the field of view of the
     camera. The sequence motinas_multi_face_frontal shows frontal faces only; in
     motinas_multi_face_turning the faces are frontal and rotated; in motinas_multi_face_fast
     the targets move faster than in the previous two sequences. Total number of images:
     2769, DivX 6 compression, 640 x 480 pixels, 25 Hz.
•    Description of Ground Truth/Metadata: No
•    Contextual info: No
•    Results from metrics and ground truth: No
•    Comments: No
•    Information on Copyright: Requested citation acknowledgment: E. Maggio, E. Piccardo,
     C. Regazzoni, A. Cavallaro. "Particle PHD filter for multi-target visual tracking", in Proc.
     of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP
     2007), Honolulu (USA), April 15-20, 2007
•    Contact person from Cantata: Xavier Desurmont, desurmont@multitel.be







    Surveillance

    OVVV
•    Website: http://development.objectvideo.com/
•    Description of Dataset: The ObjectVideo Virtual Video provides the ability to generate
     virtual video sequences. These sequences can then be used to test VCA algorithms.
•    Description of Ground Truth/Metadata: The ground truth is generated automatically
     in a proprietary binary format. The format is open, and a conversion program
     can be created to convert the metadata to any format.
•    Contextual info: Virtual environment, the user can make his own environment from the
     internet. Several camera settings can be changed to simulate real-world cameras.
•    Results from metrics and ground truth: results from metrics and ground truth are not
     applicable for OVVV.
•    Comments: This is not a dataset as such, but with these tools very powerful and
     tailored test videos can be created.
•    Information on Copyright: The ObjectVideo Virtual Video Tool is provided free for non-
     commercial use, for your own research and development purposes. If you publish or
     distribute images, videos or derivative results based on this software, you must
     acknowledge ObjectVideo by including "ObjectVideo Virtual Video Tool".
     To use the ObjectVideo Virtual Video tool a licence for the commercial game Half-Life 2
     is needed (www.steampowered.com).
•    Contact person from Cantata: Rick Koeleman, VDG-Security bv. rick@vdg-security.com





    Surveillance

    CANDELA
•    Website: http://www.multitel.be/~va/candela/
•    Description of Dataset: "Indoor abandoned object" and "road intersection"
         Scenario 1: The detection of abandoned objects
         Scenario 2: Street at zebra crossings.
•    Description of Ground Truth/Metadata: no
•    Contextual info: no
•    Results from metrics and ground truth: Criteria for verification: Is the alarm
     generated (yes/no)? How correct is the timing of the alarm (start delay, overall
     time overlap)? How correct is the position?
•    Information on Copyright: public domain
•    Contact person from Cantata: Xavier Desurmont, desurmont@multitel.be
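
The alarm criteria listed above (alarm generated, start delay, time overlap) could be computed as in the sketch below. It assumes that the ground truth event and the detected alarm are each given as a (start, end) interval in seconds, and that "time overlap" means the fraction of the ground truth interval covered by the alarm. Both are assumptions, since the source does not define them, and the function name and sample intervals are hypothetical. Position correctness is not covered.

```python
from typing import Dict, Optional, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds


def evaluate_alarm(ground_truth: Interval,
                   detected: Optional[Interval]) -> Dict[str, object]:
    """Compare a detected alarm interval with the ground truth interval."""
    gt_start, gt_end = ground_truth
    if gt_end <= gt_start:
        raise ValueError("ground truth interval must have positive duration")
    if detected is None:
        # No alarm was raised at all.
        return {"alarm_generated": False, "start_delay": None, "overlap_ratio": 0.0}
    det_start, det_end = detected
    intersection = max(0.0, min(gt_end, det_end) - max(gt_start, det_start))
    return {
        "alarm_generated": True,
        "start_delay": det_start - gt_start,  # negative if the alarm is early
        "overlap_ratio": intersection / (gt_end - gt_start),
    }


# Hypothetical example: event from 10 s to 30 s, alarm from 12.5 s to 35 s.
result = evaluate_alarm((10.0, 30.0), (12.5, 35.0))
print(result)
```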








    Surveillance

    Traffic datasets (Institut für Algorithmen
    und Kognitive Systeme)
•    Website: http://i21www.ira.uka.de/image_sequences/
•    Description of dataset: Traffic databases
•    Description of Ground Truth/Metadata: No
•    Contextual info: Different contexts (snow, fog, etc.)
•    Information on Copyright: No licence, free of charge
•    Contact person from Cantata: Sabri Boughorbel
     (sabri.boughorbel@philips.com)








    Surveillance

    VISOR

•    Website: http://imagelab.ing.unimore.it/visor/
•    Description of Dataset: 4 types of video clips. These sequences constitute a
     representative panel of different video surveillance areas.
     They merge indoor and outdoor scenes, such as Indoor Domotic Unimore D.I.I.
     setup.
•    Description of Ground Truth/Metadata: Object Detection and Tracking.
•    Results from metrics and ground truth: (Viper-GT) bounding box
•    Comments: mostly simple videos
•    Information on Copyright: Free download
•    Contact person: vezzani.roberto@unimore.it








    Surveillance

    BEHAVE
•    Website: http://groups.inf.ed.ac.uk/vision/BEHAVEDATA/
•    Description of Dataset: crowd, people acting out various interactions.
•    Description of Ground Truth/Metadata: Object Detection and Tracking.
•    Contextual info: calibration info
•    Results from metrics and ground truth: (Viper-GT) bounding box, object class
•    Comments: some complex videos
•    Information on Copyright: Free download
•    Contact person: Bob Fisher : rbf@inf.ed.ac.uk








    Surveillance

    BEHAVE 2
•    Website: http://groups.inf.ed.ac.uk/vision/BEHAVEDATA/INTERACTIONS/
•    Description of Dataset: The dataset comprises two views of various
     scenarios of people acting out various interactions. Ten basic scenarios were
     acted out: InGroup, Approach, WalkTogether, Split, Ignore, Following, Chase,
     Fight, RunTogether, and Meet. The data is captured at 25 frames per second.
     The resolution is 640x480. The videos are available either as AVI files or as a
     numbered set of JPEG single image files.
•    Description of Ground Truth/Metadata: Tracking, Event detection.
•    Contextual info: 3D coordinates of points for calibration purposes provided.
•    Results from metrics and ground truth: Bounding boxes (VIPER XML format).
     Event labels for persons and frame span
•    Comments: The site will be updated when more of the ground truth becomes
     available.
•    Information on Copyright: Free download from website.
•    Contact person from Cantata: Dimitrios Makris, d.makris@kingston.ac.uk




    Consumer applications

    VS-PETS'2003 - INMOVE
•    Website: http://www.cvg.cs.rdg.ac.uk/VSPETS/vspets-db.html
•    Description of Dataset: Outdoor people tracking - football data (three
     synchronised views). The dataset consists of football players moving around
     a pitch.
•    Description of Ground Truth/Metadata: Tracking information on image plane for
     camera 3 can be found at:
     http://www.cvg.cs.rdg.ac.uk/VSPETS/Camera3Xml.zip. An AVI file of the
     ground truth for camera view 3 is also available at
     http://www.cvg.cs.rdg.ac.uk/VSPETS/Cam3_Gt.avi
•    Results from metrics and ground truth: The location of each player on the
     pitch, for each frame of the sequence. For each player, the bounding box (with
     origin bottom left) in pixels should be determined. The position of the player is
     defined as the middle bottom of the bounding box (in pixels).
•    Information on Copyright: Free download from website
•    Contact person from Cantata: Dimitrios Makris, d.makris@kingston.ac.uk
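
The position rule in the results bullet above (middle bottom of the bounding box) can be written out as a short sketch. It assumes the box is given as the pixel coordinates of its bottom-left corner plus a width and height, with the y axis pointing upwards so that the bottom edge lies at the corner's y value. This reading of "origin bottom left" is an assumption, and the function name and sample box are hypothetical.

```python
from typing import Tuple


def player_position(x: float, y: float,
                    width: float, height: float) -> Tuple[float, float]:
    """Return the middle of the bottom edge of a box anchored at its bottom-left corner."""
    if width < 0 or height < 0:
        raise ValueError("width and height must be non-negative")
    # The height does not affect the result because the bottom edge lies at y.
    return (x + width / 2.0, float(y))


# Hypothetical box: bottom-left corner at (100, 40), 20 pixels wide, 50 pixels high.
position = player_position(100, 40, 20, 50)
print(position)
```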







    Consumer Applications

    TRICTRAC

•    Website: http://www.multitel.be/trictrac/
•    Description of dataset: Multi-camera HD progressive images in JPEG format from
     synthetic soccer video sequences.
•    Description of Ground Truth/Metadata: XML (2D and 3D positions of the objects
     and the cameras)
•    Contextual info: No
•    Results from metrics and ground truth : no
•    Comments: The dataset is fully described in "TRICTRAC Video Dataset:
     Public HDTV Synthetic Soccer Video Sequences With Ground Truth", X.
     Desurmont, J-B. Hayet, J-F. Delaigle, J. Piater, B. Macq, Workshop on
     Computer Vision Based Analysis in Sport Environments (CVBASE), 2006.
•    Information on Copyright: Access / licence: All data is publicly available and
     downloadable. If you publish results using the data, please acknowledge the
     data as coming from the TRICTRAC project, found at URL:
     http://www.multitel.be/trictrac. THE DATASET IS PROVIDED WITHOUT
     WARRANTY OF ANY KIND.
•    Contact person from Cantata: Xavier Desurmont, desurmont@multitel.be


Medical Dataset
Example of one dataset








Example with 2 signals:

            a mass and a micro calcification








 Conclusion

 WEB SITE
• Many application domains (d.makris@kingston.ac.uk)
       25 datasets for Surveillance
       6 datasets for Consumer applications
       3 datasets for Medical

      http://www.tudor.lu/cantata

http://www.tudor.lu/QuickPlace/cantata/PageLibraryC125725E002AB722.nsf/h_AABC75AA0B05E5DFC125725E002B5E46/ED93066DB0E340C7C12573A20056D789/?OpenDocument



User Name : Francois.Bremond@sophia.inria.fr
Password :