The MPEG Standard Brief Tutorial by mikesanye


									-- Outline --

Objectives of the MPEG-7 standard
An MPEG-7 application example
Main elements of MPEG-7
Conformance of new Descriptors / Description Schemes



                                     1
Why do we need MPEG-7?

Allow accurate access to audio-visual
 content;
Achieve the maximum interoperability:
  Content: modalities, feature aspects &
   applications
  Environments: content creation,
   management, distribution & consumption


                                            2
What is MPEG-7?

Is NOT a standard for feature extraction /
 matching (search engines);
Is NOT a compression standard;
Is known as the “Multimedia Content
 Description Interface”:
  structural, detailed descriptions of AV content
   at different levels of granularity & in different
   application areas.

                                                 3
MPEG-7 application example

 Region color descriptor proposal (IBM)
   To query images based on the color of one
    or more of their regions (e.g. find me
    “skin-colored” regions).
 Various colored surfaces can be modelled
  and clustered into surface classes.
   The region color descriptor of a region in an
    image can be represented by a surface class
    with a semantic class label & a class
    identifier.
                                                 4
Creating region color descriptor

 Image
   -> mapped onto the surface color class model
        class 1  label 1
        class 2  label 2
        …
        class n  label n
   -> creating the MP7 description:

 <Segment>
   ……..
   <RegionColor>
     <Label id="n"/>
     <classID id="n"/>
     ……….
   </RegionColor>
 </Segment>
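
The mapping and description steps on this slide can be sketched in Python. This is only an illustration: the surface classes, their mean colors, the nearest-mean rule and the helper names `nearest_class` and `describe_region` are all assumptions, not part of the IBM proposal, and the element layout simply mirrors the fragment shown on the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical surface color classes: (class id, label, mean RGB color).
SURFACE_CLASSES = [
    ("1", "skin", (224, 172, 105)),
    ("2", "grass", (60, 140, 60)),
    ("3", "sky", (120, 170, 230)),
]


def nearest_class(rgb):
    """Return (class id, label) of the class whose mean color is closest."""
    def squared_distance(mean):
        return sum((a - b) ** 2 for a, b in zip(rgb, mean))

    class_id, label, _ = min(SURFACE_CLASSES, key=lambda c: squared_distance(c[2]))
    return class_id, label


def describe_region(segment_id, rgb):
    """Build a <Segment> element shaped like the fragment on the slide."""
    class_id, label = nearest_class(rgb)
    segment = ET.Element("Segment", id=segment_id)
    region = ET.SubElement(segment, "RegionColor")
    ET.SubElement(region, "Label", id=label)        # semantic class label
    ET.SubElement(region, "classID", id=class_id)   # class identifier
    return segment


print(ET.tostring(describe_region("s5", (230, 180, 110)), encoding="unicode"))
```

A real system would cluster measured surface colors to obtain the classes; the fixed table here only stands in for that model.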




                                                                            5
Querying region color descriptor
 Region query
   -> mapping to the surface color class model
 Semantic query: find label "n" region
   -> looking up the label in the surface color classes
        class 1  label 1
        class 2  label 2
        …
        class n  label n
 Both query types
   -> search for regions with class ID "n" in the region color descriptor
   -> matching utility
   -> returned list

 Matched description:

 <Segment id="s5">
   ……..
   <RegionColor>
     <Label/>
     <classID id="n"/>
   </RegionColor>
 </Segment>
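
A minimal sketch of the semantic-query path, assuming the descriptions are available as un-namespaced XML like the fragment on the slide. The label table, the segment ids other than "s5", and the function names `find_segments` and `semantic_query` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical label-to-class table and stored descriptions.
LABEL_TO_CLASS = {"skin": "1", "grass": "2"}

DESCRIPTIONS = """
<Descriptions>
  <Segment id="s1"><RegionColor><Label id="grass"/><classID id="2"/></RegionColor></Segment>
  <Segment id="s5"><RegionColor><Label id="skin"/><classID id="1"/></RegionColor></Segment>
  <Segment id="s7"><RegionColor><Label id="skin"/><classID id="1"/></RegionColor></Segment>
</Descriptions>
"""


def find_segments(root, class_id):
    """Return ids of segments whose region color descriptor has this class id."""
    return [
        segment.get("id")
        for segment in root.iter("Segment")
        if any(c.get("id") == class_id
               for c in segment.iterfind("RegionColor/classID"))
    ]


def semantic_query(root, label):
    """Look the label up in the class table, then match on the class id."""
    return find_segments(root, LABEL_TO_CLASS[label])


root = ET.fromstring(DESCRIPTIONS)
print(semantic_query(root, "skin"))
```

A region query would differ only in the first step: the query region's color is mapped to a class id before `find_segments` is called.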
                                                                                   6
Main elements of MPEG-7 (1)

Descriptors (D’s)
  syntax & semantics of feature representation
Description Schemes (DS’s)
  semantics & structure of the relationships between
   components (D’s & DS’s)
Description Definition Language (DDL)
  creation of new D’s & DS’s
  modification / extension of existing D’s & DS’s
Systems tools
  issues of synchronisation, transmission
   mechanisms...
                                                        7
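Main elements of MPEG-7 (2)

 [Figure: how the main elements relate]
   DDL: defines the MP7 Schema + Extensions
   MP7 Schema + Extensions: models the DS's and D's; validates descriptions
   DS's and D's: instantiated as a description
       <Mp7>
         <tag1/>
         ……
       </Mp7>
   Systems: creates the coded form (010010)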



                                                                 8
MPEG-7 specifications

Part 1: Systems (Systems tools)
Part 2: Description Definition Language (DDL)
Part 3: Visual (D’s)
Part 4: Audio (D’s)
Part 5: Multimedia Description Schemes (DS’s & D’s)
Part 6: Reference Software
Part 7: Conformance

                                                      9
Description Definition Language is ...

XML Schema (W3C) + MPEG-7 extensions
Reusable MPEG-7 Schema
  Importing: to import type declarations of MPEG-7
   schemes
  Redefinition: to modify existing D’s or DS’s
  Restriction: to restrict certain aspects of existing
   D’s or DS’s
  Extension: to extend existing D’s or DS’s
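
Because the DDL builds on XML Schema, extension and restriction correspond to the `xs:extension` and `xs:restriction` constructs. The sketch below lists which types in a schema are derived and how. The schema fragment and its type names are hypothetical and are not taken from the MPEG-7 schema.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema fragment; the type names are illustrative only.
SCHEMA = f"""
<xs:schema xmlns:xs="{XS}">
  <xs:complexType name="BaseColorType">
    <xs:sequence>
      <xs:element name="Label" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="ExtendedColorType">
    <xs:complexContent>
      <xs:extension base="BaseColorType">
        <xs:sequence>
          <xs:element name="Confidence" type="xs:float"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
  <xs:simpleType name="ShortLabelType">
    <xs:restriction base="xs:string">
      <xs:maxLength value="16"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
"""


def derivations(schema_text):
    """Return {type name: (derivation kind, base type)} for derived types."""
    root = ET.fromstring(schema_text)
    found = {}
    for tag in ("complexType", "simpleType"):
        for type_def in root.iter(f"{{{XS}}}{tag}"):
            for kind in ("extension", "restriction"):
                node = type_def.find(f".//{{{XS}}}{kind}")
                if node is not None:
                    found[type_def.get("name")] = (kind, node.get("base"))
    return found


print(derivations(SCHEMA))
```

Importing and redefinition would use `xs:import` and `xs:redefine` in the same way; they are left out to keep the sketch short.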

                                                           10
MPEG-7 Audio
Low-level / generic tools
  Scale Tree (20 D’s): temporal envelope,
   spectral envelope, harmonicity…
  Silence segment (1 D): levels of silence
Application-specific tools
  Sound effects (5 D’s)
  Musical instrument timbre (3 D’s)
  Spoken content (12 D’s)
  Melody contour (4 D’s), Melody (5 D’s)
                                              11
<SoundEffectModel id="sfx1.1" SoundEffectCategoryRef="Bark">
  <ProbabilityModel xsi:type="ContinuousMarkovModelType" numberStates="7">
    <Initial dim="7">
      0.04 0.34 0.12 0.04 0.34 0.12 0.00
    </Initial>
    <Transitions dim="7 7">
      0.91 0.02 0.00 0.00 0.05 0.01 0.01
      0.01 0.99 0.00 0.00 0.00 0.00 0.00
      <!-- etc. -->
    </Transitions>
  </ProbabilityModel>
  <AudioSpectrumBasis loEdge="62.5" hiEdge="8000" resolution="1/4 octave">
    <Matrix dim="31 5">
      0.26 -0.05 0.01 -0.70 0.44
      0.34 0.09 0.21 -0.42 -0.05
      <!-- etc. -->
    </Matrix>
  </AudioSpectrumBasis>
</SoundEffectModel>

 SoundEffectCategoryRef: a category label defined in Controlled Terms
 ProbabilityModel: statistical model used for content classification
 AudioSpectrumBasis: a projection matrix to reduce the dimensionality of
  a frequency spectrum
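
The `dim` attribute gives the shape of the whitespace-separated values, so a reader can rebuild the vectors and matrices and check that each probability row sums to 1 (as the initial vector and the two transition rows shown above do). The following is a sketch on a made-up 2-state model; `read_values` and `is_distribution` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Toy 2-state model in the same layout as the slide; the numbers are made up.
MODEL = """
<ProbabilityModel numberStates="2">
  <Initial dim="2">0.25 0.75</Initial>
  <Transitions dim="2 2">
    0.9 0.1
    0.2 0.8
  </Transitions>
</ProbabilityModel>
"""


def read_values(element):
    """Parse whitespace-separated numbers and shape them by the dim attribute."""
    dims = [int(d) for d in element.get("dim").split()]
    values = [float(v) for v in element.text.split()]
    if len(dims) == 1:
        if len(values) != dims[0]:
            raise ValueError("value count does not match dim")
        return values
    rows, cols = dims
    if len(values) != rows * cols:
        raise ValueError("value count does not match dim")
    return [values[r * cols:(r + 1) * cols] for r in range(rows)]


def is_distribution(row, tol=1e-6):
    """True if the entries are non-negative and sum to 1 within tol."""
    return all(p >= 0 for p in row) and abs(sum(row) - 1.0) <= tol


root = ET.fromstring(MODEL)
initial = read_values(root.find("Initial"))
transitions = read_values(root.find("Transitions"))
print(is_distribution(initial), all(is_distribution(r) for r in transitions))
```

The same `read_values` helper would also read the `<Matrix dim="31 5">` of an AudioSpectrumBasis, whose rows are projection coefficients rather than probabilities.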
MPEG-7 Visual (1)
 Basic structures (5 D’s)
   Grid layout
   Time series
   Multiple view
   Spatial 2D co-ordinates
   Temporal interpolation
 Color (7 D’s)
   Color Space & Color Quantization
   Scalable Color: HSV color space & Haar transformation
   Dominant Color
   Color Layout, Color Structure
   Group-of-Frames/Group-of-Pictures color




                                                             13
MPEG-7 Visual (2)
 Texture (3 D’s)
   Homogeneous: directionality, coarseness and regularity of patterns
   Non-homogeneous (Edge Histogram)
 Shape (3 D’s)
   Contour-based: Curvature Scale-Space (CSS)
   Region-based: Angular Radial Transformation
   3D
 Motion (4 D’s)
   Motion Activity: intensity, direction, spatial distribution
   Camera Motion
   Motion Trajectory
 Localization (2 D’s)
   Region locator
   Spatial-temporal locator
 Face recognition (1 D)


                                                                  14
<ContourShape numberOfPeaks="54">
  <GlobalCurvatureVector>3090 … 13323</GlobalCurvatureVector>
  <PrototypeCurvatureVector>2980 … 8453</PrototypeCurvatureVector>
  <HighestPeak>17490</HighestPeak>
  <Peak><xpeak>20480</xpeak><ypeak>14390</ypeak></Peak>
  <Peak><xpeak>20480</xpeak><ypeak>14390</ypeak></Peak>
  <!-- etc… -->
</ContourShape>

  GlobalCurvatureVector: global parameters of the contour,
   i.e. eccentricity & circularity
  PrototypeCurvatureVector: eccentricity & circularity of the
   prototype contour
  HighestPeak: the parameters of the filter corresponding to
   the highest peak
  Peak: the parameters of the remaining prominent peaks
Overview of MDS

 [Figure: MDS overview]
   Upper blocks: Content organisation; Navigation & Access;
    User interaction; Content management; Content description
   Basic elements:
     Schema tools (Package, Root)
     Datatype & structure (Matrix, Vector)
     Link & Media localisation (Time, Duration)
     Basic DS’s (Annotation, Person, Place)

                                                                            16
Overview of MDS

 [Figure: MDS overview]
   Content organisation; Navigation & Access; User interaction
   Content management:
     Media (Format, Coding, instance)
     Creation & Production (Title, creator, classification)
     Usage (Usage Rights, Usage Record)
   Content description:
     Structural Aspects (Segment, segment relation graph)
     Conceptual Aspects (Event, Object, relation)
   Basic elements: Schema tools; Datatype & structure;
    Link & Media localisation; Basic DS’s
                                                                                17
Overview of MDS

 [Figure: MDS overview]
   Content organisation: Collection & Classification; Models
    (callout: collection of semantic concepts)
   Navigation & Access:
     Summaries (hierarchical / sequential summary)
     Partitions, Decompositions
     Variation (network condition, resolution)
   User interaction (filter, search, browse):
     User Preferences
     Usage History
   Content management: Media; Creation & Production; Usage
   Content description: Structural Aspects; Conceptual Aspects
   Basic elements: Schema tools; Datatype & structure;
    Link & Media localisation; Basic DS’s

                                                                               18
Structural aspects of a video clip

 [Figure: a video clip -> video segs. -> children V segs. -> key frames]

 Feature         Still region   Video segment   Moving region
 Time                 -              Yes
 Color               Yes             Yes             Yes
 Shape               Yes              -              Yes
 Texture             Yes             Yes
 Motion               -              Yes             Yes
 Camera motion        -              Yes


 A video clip is broken down into segments & sub-segments. Each segment
 may be described by a set of visual or audio D’s and DS’s.
                                                                          19
An example of textual description of
structural aspects in MPEG-7

 <Mpeg7>
   <VideoSegment id="Seg0">                <!-- a video clip -->
     <MediaTime/>
     <TemporalDecomposition>
       <VideoSegment id="vs001">           <!-- video seg. -->
         <MediaLocator/>
         <MediaTime/>
         <TemporalDecomposition>
           <StillRegion id="sr001">        <!-- key frame -->
             <MediaLocator/>
             <MediaRelTimePoint/>
             <Visual/>                     <!-- visual features -->
           </StillRegion>
         </TemporalDecomposition>
       </VideoSegment>
     </TemporalDecomposition>
   </VideoSegment>
 </Mpeg7>
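
The nesting of the example can be read back with a short recursive walk. This sketch assumes the un-namespaced element names used on the slide (real MPEG-7 descriptions carry XML namespaces), and `outline` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# The description from the slide, without namespaces.
DESCRIPTION = """
<Mpeg7>
  <VideoSegment id="Seg0">
    <MediaTime/>
    <TemporalDecomposition>
      <VideoSegment id="vs001">
        <MediaLocator/>
        <MediaTime/>
        <TemporalDecomposition>
          <StillRegion id="sr001">
            <MediaLocator/>
            <MediaRelTimePoint/>
            <Visual/>
          </StillRegion>
        </TemporalDecomposition>
      </VideoSegment>
    </TemporalDecomposition>
  </VideoSegment>
</Mpeg7>
"""


def outline(element, depth=0):
    """Return (depth, tag, id) for every element that carries an id."""
    rows = []
    if element.get("id") is not None:
        rows.append((depth, element.tag, element.get("id")))
        depth += 1  # children of an identified segment sit one level deeper
    for child in element:
        rows.extend(outline(child, depth))
    return rows


for depth, tag, segment_id in outline(ET.fromstring(DESCRIPTION)):
    print("  " * depth + f"{tag} {segment_id}")
```

Only elements with an `id` count as levels, so the `TemporalDecomposition` wrappers do not add depth of their own.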
                                                                       20
Hierarchical summary
 Segment tree (with key frame)
   Seg 1
     Sub-seg1
     Sub-seg2
     Sub-seg3
   Seg 2
   Seg 3
   Seg 4
 Theme list
   • Goal
   • Dribble
   • Interview
   • Discussion


                                               21
Conformance of extended D/DS

If a new DS is defined by *restriction* (not by
 extension) of an existing one through the DDL, it can
 be said to be compliant with MPEG-7.
If descriptions are instantiated from a new DS
 (defined by extension & redefinition), they can be
 said to be compliant with the DDL, but not with MPEG-7.
It is assumed that consuming terminals
 understand the semantics of the new DS in a
 non-normative way.

                                                22
What can we do about MPEG-7 based
on our current system?

 Sports hierarchical summary, maybe combined with
  user’s preference
 Simple structural video segments, maybe combined with
  text/semantic annotation
 Partial description generated from our specific
  visual/audio feature analysis tools for similarity retrieval




                                                            23
Summary of a given program
based on user’s preference


 AV content + AV content summaries
   -> Filter engine -> preferred summary
   -> Browsing engine -> preferred view
   -> a user

 User’s preference:
   Requires summary of a soccer program within a
    specific season
   Shows goals only
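
A minimal sketch of the filter-engine step, assuming each summary segment carries a program type, a season and a theme from the theme list. The record layout, the sample data and the name `filter_summary` are hypothetical and do not follow the MPEG-7 UserPreferences schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SummarySegment:
    program: str    # e.g. "soccer"
    season: str
    theme: str      # entry from the theme list, e.g. "Goal"
    key_frame: str


@dataclass(frozen=True)
class UserPreference:
    program: str
    season: str
    themes: frozenset


def filter_summary(segments, preference):
    """Keep only the summary segments that match the user's preference."""
    return [
        s for s in segments
        if s.program == preference.program
        and s.season == preference.season
        and s.theme in preference.themes
    ]


# Made-up summary data for illustration.
SEGMENTS = [
    SummarySegment("soccer", "2000/01", "Goal", "kf_012.jpg"),
    SummarySegment("soccer", "2000/01", "Dribble", "kf_020.jpg"),
    SummarySegment("soccer", "1999/00", "Goal", "kf_031.jpg"),
    SummarySegment("soccer", "2000/01", "Interview", "kf_044.jpg"),
    SummarySegment("soccer", "2000/01", "Goal", "kf_057.jpg"),
]

PREFERENCE = UserPreference("soccer", "2000/01", frozenset({"Goal"}))
preferred = filter_summary(SEGMENTS, PREFERENCE)
print([s.key_frame for s in preferred])
```

The browsing engine would then present the key frames of the `preferred` segments as the user's view.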

                                                           24
What to do if we cannot find the right
DS/D for our needs?


Extension by using DDL
Extension without using DDL
  through amendments (MPEG-7 Version 2)
  through private tools
  through private tools defined within a
   consortium for a specific application domain



                                                  25
Conclusion

The MPEG-7 standard covers a wide range of
 generic application needs.
The XML-based DDL is used to define various
 MPEG-7 functional models. It also provides ways to
 create/modify/extend D’s & DS’s.
MPEG-7 visual/audio parts allow similarity
 retrieval.
MPEG-7 MDS provides various metadata
 structures to describe AV content.
                                              26

								