


        Video Understanding
       Performance Evaluation
                     Francois BREMOND,
          A.T. Nghiem, M. Thonnat, V. Valentin, R. Ma
                         Orion project-team,
                   INRIA Sophia Antipolis, FRANCE


•ETISEO Project
•Video Data
•ETISEO Results
•Metric Analysis
•ETISEO General Conclusion

•There are many evaluation initiatives with different objectives
    • Individual works
    • projects: CAVIAR, ILids, VACE, CLEAR, CANTATA,…
    • Workshops: PETS, VS, AVSS (CREDS),…
•No standard annotation (ground truth)
•Lack of analysis of Video Data
    • which specific video processing problems a sequence contains
    • how difficult these problems are
•Lack of analysis of metrics
    • Numbers, base-line algorithm

ETISEO Project
•2-year duration, from January 2005 to December 2006
          To evaluate vision techniques for video surveillance applications.

    • Unbiased and transparent evaluation protocol (no funding)
          •Large involvement (32 international teams)
    •   Meaningful evaluation
         • provide the strengths and weaknesses of metrics
         • to help developers to detect specific shortcomings depending on
             – scene type (apron, building entrance etc.)
             – video processing problem (shadows, illumination change etc.)
             – difficulty level (e.g. strong or weak shadows)

      ETISEO Project
• Approach: 3 critical evaluation concepts
    • Ground truth definition
       • Rich and up to the event level
       • Give clear and precise instructions to the annotator
           • E.g., annotate both visible and occluded part of objects

    • Selection of test video sequences
       • Follow a specified characterization of problems
       • Study one problem at a time, several levels of difficulty

    • Metric definition
       • various metrics for each video processing task
       • Performance indicators: sensitivity, precision and F-score.
       • A flexible and automatic evaluation tool, a visualization tool.
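As a minimal sketch (not the ETISEO tool itself) of how these performance indicators combine, assuming counts of true positives (matched detections), false positives and false negatives are available from the matching step:

```python
def precision(tp, fp):
    # Fraction of detections that match a reference object
    return tp / (tp + fp) if tp + fp else 0.0

def sensitivity(tp, fn):
    # Fraction of reference objects that were detected (recall)
    return tp / (tp + fn) if tp + fn else 0.0

def f_score(tp, fp, fn):
    # Harmonic mean of precision and sensitivity
    p, s = precision(tp, fp), sensitivity(tp, fn)
    return 2 * p * s / (p + s) if p + s else 0.0

# e.g. 8 matched detections, 2 false detections, 2 missed objects
print(round(f_score(8, 2, 2), 2))  # 0.8
```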
ETISEO Project

Large participation (16 active international teams)
   4 Companies:
       - Barco
       - Capvidia NV
       - VIGITEC SA/NV
       - Robert Bosch GmbH

   12 Academics:
       - Lab. LASL University ULCO Calais
       - Nizhny Novgorod State University
       - Queen Mary, University of London
       - Queensland University of Technology
       - INRIA-ORION
       - University of Southern California
       - Université Paris Dauphine
       - University of Central Florida
       - University of Illinois at Urbana-Champaign
       - University of Maryland
       - University of Reading
       - University of Udine

ETISEO : Video Data
•Large annotated data set
   • 85 video clips with GT,
   • organized into
       • scene types : apron, building entrance, corridor, road, metro station,
       • video processing problems : noise, shadow, crowd, …
       • sensor types : one\multi-views, visible\IR, compression…

Video Data : Airport
  Toulouse – France

Video Data : INRETS
  Building entrance, car park; light changes
  Villeneuve d’Ascq – France

Video Data : CEA
  Video type & quality

Video Data : RATP
  People density

ETISEO : Results

• Detection of physical objects
  [Results chart: 16 teams evaluated on 6 videos]

ETISEO : Results

• Tracking of physical objects



     ETISEO: Results
•   Good performance comparison per video: automatic, reliable, consistent metrics:
     • 16 participants:
           •   8 teams achieved high quality results
           •   9 teams performed event recognition
           •   10 teams produced results on all priority sequences
     •   Best algorithms: combine moving regions and local descriptors
•   A few limitations:
      • Algorithm results depend on processing time (real-time or not), manpower (parameter
          tuning), previous similar experience, whether a learning stage is required…: questionnaire
     •   Lack of understanding of the evaluation rules (output XML, time-stamp, ground truth, number of
         processed videos, frame rate, start frame…)
     •   Video subjectivity: background, masks, GT (static, occluded, far, portable, contextual object, event)
     •   Many metrics and evaluation parameters
           •  Just evaluation numbers, no base-line algorithm
•   Need of two other analyses:
     1. Metric Analysis: define for each task:
           • Main metrics: discriminate and meaningful
          •  Complementary metrics: provide additional information
     2. Video Data Analysis: impact of videos on evaluation
          •  define a flexible evaluation tool to adapt GT wrt videos

   Metric Analysis : Object detection task
•Main metric: Number of objects
    • Evaluates the number of detected objects
        matching reference objects using bounding boxes
    •   Unbiased towards large, homogeneous objects
    •   Difficult to evaluate object detection quality

•Complementary metric: Object area
    • Evaluates the number of pixels in reference
        data that have been detected
    •   Evaluates the object detection quality
    •   Biased towards large, homogeneous objects
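The contrast between the two metrics can be sketched on axis-aligned bounding boxes; the box representation, matching threshold, and function names below are illustrative assumptions, not the ETISEO definitions:

```python
def box_area(b):
    # b = (x1, y1, x2, y2)
    x1, y1, x2, y2 = b
    return max(0, x2 - x1) * max(0, y2 - y1)

def overlap(a, b):
    # Area of the intersection of two boxes
    return box_area((max(a[0], b[0]), max(a[1], b[1]),
                     min(a[2], b[2]), min(a[3], b[3])))

def number_of_objects(detected, reference, thr=0.5):
    # Count reference objects matched by at least one detection
    return sum(
        any(overlap(d, r) >= thr * box_area(r) for d in detected)
        for r in reference
    )

def object_area(detected, reference):
    # Fraction of reference pixels covered by the best detection
    # (union of detections ignored for brevity)
    covered = sum(max((overlap(d, r) for d in detected), default=0)
                  for r in reference)
    total = sum(box_area(r) for r in reference)
    return covered / total if total else 0.0

# One big reference box (car) and one small one (person); only the car is found
reference = [(0, 0, 10, 10), (20, 0, 22, 2)]
detected = [(0, 0, 10, 10)]
print(number_of_objects(detected, reference))      # 1 of 2 objects detected
print(round(object_area(detected, reference), 2))  # 0.96: area metric stays high
```

The example shows the bias discussed above: missing the small object halves the Number-of-objects score, while the Object-area score barely drops because the big object dominates the pixel count.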

Metric Analysis : Example (1)
•Sequence ETI-VS2-BE-19-C1 has one big object (car) and several small
and weakly contrasted objects (people)
•Algorithm 9 correctly detects more objects than algorithm 13 (metric:
Number of objects)

        Algorithm     9     1    14    28    12    13    32
        F-Score    0.49  0.49  0.42  0.40  0.39  0.37  0.37
        Algorithm     8    19    20    17    29     3    15
        F-Score    0.35  0.33  0.32  0.30  0.24  0.17  0.11

     Performance results using the metric “number of objects”

Metric Analysis : Example (2)
•Using the metric Object area, which is biased toward the big object (car):
    • algorithm 13 cannot detect some small objects (people),
    • algorithm 9 has detected the difficult objects, but at low precision.

•The metric Object area is still useful:
    • it differentiates algorithms 1 and 9: both are good at detecting objects, but
       algorithm 1 is more precise

           Algorithm     1    13     9    32    14    12    20
           F-Score    0.83  0.71  0.69  0.68  0.65  0.65  0.64
           Algorithm    19    28    17     3    29     8    15
           F-Score    0.64  0.59  0.55  0.54  0.51  0.50  0.30

           Performance results using the metric “object area”

Metric Analysis : Advantages & Limitations
• Advantages :
    • various metrics for every video processing task.
    • analysis of the metric strengths and weaknesses and how to use them.
    • insight into video analysis algorithms: for example, shadow handling or object merging

• Still some limitations :
    • Evaluation results are useful for developers but not for end-users.
         • This is acceptable: ETISEO is neither a competition nor a benchmark
         • But difficult to judge if one algorithm is good enough for a particular
           application, or type of videos.

ETISEO : Video Data Analysis
ETISEO limitations:

•       Generalization of evaluation results is subjective :
    •      comparing tested and new videos
•       Selection of videos according to difficulty levels is subjective
    •     Videos have only a qualitative scene description: e.g. strong or weak shadows
    •     Two annotators may assign 2 different difficulty levels
•       One video may contain several video processing problems
        at many difficulty levels
•       The global difficulty level is not sufficient to identify
        algorithm's specific problems for improvement

Video Data Analysis
Objectives of Video Data Analysis :
  •     Study dependencies between videos and video processing problems to
      •    Characterize videos with objective difficulty levels
      •    Determine an algorithm's capacity in solving one video processing problem.

  •    To treat each video processing problem separately
  •    Define a measure to compute difficulty levels of videos (or other input data)
  •    Select videos containing only the current problems at various difficulty levels
  •    For each algorithm, determine the highest difficulty level for which this
       algorithm still has acceptable performance.

Approach validation : applied to two problems
  •    Detection of weakly contrasted objects
  •    Detection of objects mixed with shadows

Video Data Analysis :
  Detection of weakly contrasted objects
•   Video processing problem definition :
     the lower the object contrast, the worse the object detection

•   For one algorithm, determine the lowest object
    contrast for which this algorithm has an acceptable
    performance
•   Issue: one blob may contain many regions at
    several contrast levels

Video Data Analysis : conclusion

•       Achievements:
    •     An evaluation approach to generalise evaluation results.
    •     Implementation of this approach for 2 problems.
•       Limitations:
    •      Need to validate this approach for more problems.
    •      Works well if the video contains only one problem.
    •      If not, detects the upper bound of algorithm capacity.
    •      The difference between the upper bound and the real performance
           may be significant if:
         •     The test video contains several video processing problems
         •     The same set of parameters is tuned differently to adapt to several
               dependent problems

General Conclusion
     • Good performance comparison per video: automatic, reliable, consistent metrics.
     • Emphasis on gaining insight into video analysis algorithms (shadows, merging, etc.)
•A few limitations:
    • Data and rule subjectivity: background, masks, ground truth,…
    • Partial solutions for Metric and Video dependencies
•Future improvements: flexible evaluation tool
    • Given a video processing problem:
          •   Selection of metrics
          •   Selection of reference videos
           •   Selection of Ground Truth : filters for reference data, sparse GT for long sequences

•ETISEO’s video dataset and automatic evaluation tools are publicly
available for research purposes: http://www-sop.inria.fr/orion/ETISEO/

Video Data Analysis :
 Detection of weakly contrasted objects
  •       At each contrast level, the algorithm performance is x/m
      •     x: number of blobs containing the current contrast level detected by a
            given algorithm
      •     m: number of all blobs containing the current contrast level
  •       Algorithm capacity: the lowest contrast level for which the
          algorithm performance is greater than a given threshold
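The capacity computation described above can be sketched as follows; the per-level counts and the acceptance threshold of 0.5 are illustrative assumptions:

```python
def algorithm_capacity(blobs_per_level, detected_per_level, threshold=0.5):
    # Scan contrast levels from weakest to strongest; performance at a
    # level is x/m, and the capacity is the lowest level whose
    # performance exceeds the threshold.
    for level in sorted(blobs_per_level):
        m = blobs_per_level[level]
        x = detected_per_level.get(level, 0)
        if m and x / m > threshold:
            return level
    return None  # no level reaches acceptable performance

# e.g. contrast levels 1 (weakest) to 4 (strongest), 10 blobs per level
blobs = {1: 10, 2: 10, 3: 10, 4: 10}
detected = {1: 2, 2: 4, 3: 7, 4: 9}
print(algorithm_capacity(blobs, detected))  # 3
```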
    Video Data Analysis :
      Detection of weakly contrasted objects

•   Error rate threshold to determine
    algorithm capacity: 0.5
