

									Image Sequence Compression:
               • Principles

               • Standards

               Joseph RONSIN




                        Outlines

•   Introduction : Image and video sequences
•   Motion Estimation / Compensation
•   Block matching
•   Principles of video standards
•   MPEG-1 /MPEG-2
•   MPEG-4




                                           Video Coding CH 09




                              Introduction

• Video characteristics
    – Sequence of images
    – Frame rate (cadence)
       • fluidity: at least 20 images per second
       • typically 25 images/s
    – Bitrate (or rate) = number of bytes displayed (or
      transferred) per unit of time
       • bitrate (bytes/s) = image size (bytes) x number of images per second
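
As a rough illustration of the bitrate relation above, the following Python sketch computes the raw (uncompressed) rate of a sequence. The function name `raw_bitrate` and the choice of 2 bytes per pixel (8-bit samples with 4:2:2 chroma) are assumptions made for the example; they are not values given in the slides.

```python
def raw_bitrate(width, height, bytes_per_pixel, frames_per_second):
    """Raw bitrate in bytes per second: image size x number of images per second."""
    image_size = width * height * bytes_per_pixel  # bytes per image
    return image_size * frames_per_second


# Assumed example: 720x576 TV format, 2 bytes per pixel, 25 images/s.
rate = raw_bitrate(720, 576, 2, 25)
print(rate)                   # bytes per second
print(rate * 30 * 60 / 1e9)   # gigabytes needed for 30 minutes
```

Under these assumptions the raw rate is about 20.7 MB/s, so 30 minutes of video need roughly 37 GB before any compression.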




                             Introduction

• Classical video Formats
  – CIF : 352x288
  – QCIF : 176x144
  – sub-QCIF : 128x96

  – CCIR 601 : 720x576 (TV)

  – HDTV
    • 720p : 1280x720 HDTV video mode, progressive
    • 1080p30 : 1920x1080 full HD video, progressive, at 30 Hz






    Format      Horizontal         Vertical            Bandwidth
                subsampling        subsampling         reduction

    4:4:4       none               none                none
    4:2:2       ↓2                 none                33%
    4:2:0       ↓2                 ↓2                  50%
    4:1:1       ↓4                 ↓4                  62.5%

    Source: H.264/AVC
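
The bandwidth reductions in the table can be checked with a short calculation. The sketch below assumes three components (one luminance and two chrominance planes) and that only the two chrominance planes are subsampled; the function name `bandwidth_reduction` is hypothetical, and the factors are those listed in the table rows.

```python
def bandwidth_reduction(h_factor, v_factor):
    """Fraction of samples saved when both chroma planes are subsampled.

    h_factor, v_factor: horizontal and vertical chroma subsampling factors
    (1 means no subsampling). Luma is kept at full resolution.
    """
    full = 3.0                                  # Y + two chroma planes, full resolution
    kept = 1.0 + 2.0 / (h_factor * v_factor)    # Y + two subsampled chroma planes
    return 1.0 - kept / full


rows = [("4:4:4", 1, 1), ("4:2:2", 2, 1), ("4:2:0", 2, 2), ("4:1:1", 4, 4)]
for name, h, v in rows:
    print(name, f"{100 * bandwidth_reduction(h, v):.1f}%")
```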




                              Introduction

• Why compression?
   – storage capacity
       • 30 min of uncompressed video can reach 50 GB
   – Limited bandwidth
• Constraints for compression
   –    Quality
   –    Bitrate
   –    Complexity
   –    Cadence




                                Introduction

• Specific features of video compression
     – Exploitation of spatial redundancies
     – Exploitation of temporal redundancies
• "Interframe" coding
     – Human visual system (HVS): acts as a low-pass filter
     – Low sensitivity to high spatial and temporal
       frequencies


• Motion Estimation / Compensation

                 Estimation / compensation

• Estimation / compensation
     – coding / decoding

   Coder:   source image + previous image(s) → Estimation (motion analysis) → motion parameters → Encoding

   Decoder: Decoding → motion parameters → Motion compensation (with previous image(s)) → reconstructed image
                    Introduction

 • Example : test sequence « foreman »




                    Introduction

• Example : motion observation




   [Figure: frame t-1, frame t, and the difference between them]




                             Introduction

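• Block diagram of a simple video coder
   – Motion estimation compares the current image with the previous image and outputs motion vectors
   – Motion compensation applies the motion vectors to the previous image to build a prediction
   – The prediction is subtracted from the current image; the residual error is then encoded
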
               Estimation / compensation

• What is motion ?
   – Intensity variation
   – 2D Representation of evolution of a 3D scene

• Origins of motion
   – motion of an object
   – motion of a camera
   – motion of a camera and motion of an object

• And also…
   – Lighting changes of the scene
   – Optical Illusion
       apparent motion

               Estimation / compensation

• Influence of lighting




               real motion           Apparent motion



                  Estimation / compensation



• Barber's pole




  Rotating pole
                         Real motion         Apparent motion

                  Estimation / compensation



• Optical flow: characterization of apparent
  motion
  – Motion ≠ optical flow
     • Motion without optical flow (uniform objects)
     • Optical flow without motion (changes in lighting
       conditions)
     • Inconsistent optical flow and motion (barber's pole)

• Objectives:
  – Pixel trajectories estimation between successive images
     • Computes velocity vector fields
     • Transforms image into next one
     • Projects 3D motion vectors

                  Estimation / compensation


• Motion of a pixel: a geometrical concept
   – Pixel: $\mathbf{p} = (x, y)^T$
   – Associated velocity: $\mathbf{v}(x, y) = (v_x, v_y)^T = (dx/dt, dy/dt)^T$

   [Figure: 3D motion in the scene and the corresponding 2D motion in the image]





                  Optical flow


• Optical flow: radiometric concept
   – 2D velocity vector of a luminance value
   – Hypotheses:
       • Constant lighting
       • Lambertian surface

   [Figure: Lambertian surface. The incident light is reflected with the same intensity in every direction.]

                  Optical flow


• Interpretation

   [Figure: displacement of a pixel between frame t and frame t+1]





                  OFCE


• Optical Flow Constraint Equation (OFCE), or Apparent Motion Equation
      $\frac{dI}{dt} = 0$
   – Hypothesis: the light intensity of a point is preserved during the time interval $dt$
      $I(x + dx,\, y + dy,\, t + dt) = I(x, y, t)$
• Other wording
   – DFD (Displaced Frame Difference), for a displacement $\mathbf{d} = (d_x, d_y)^T$
      $\mathrm{DFD}(x, y, \mathbf{d}) = I(x + d_x,\, y + d_y,\, t + dt) - I(x, y, t)$

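                  OFCE


• OFCE (ECMA is the French acronym), starting from intensity preservation
      $I(x + dx,\, y + dy,\, t + dt) = I(x, y, t)$
   – First-order Taylor series expansion
      $I(x + dx,\, y + dy,\, t + dt) \approx I(x, y, t) + \frac{\partial I}{\partial x}\,dx + \frac{\partial I}{\partial y}\,dy + \frac{\partial I}{\partial t}\,dt$
   – That is:
      $\nabla I \cdot \mathbf{v} + \frac{\partial I}{\partial t} = 0$
     with $\mathbf{v} = (v_x, v_y)^T = (dx/dt, dy/dt)^T$ and
     $\nabla I = \left(\frac{\partial I}{\partial x}, \frac{\partial I}{\partial y}\right)^T$ the
     spatial gradient of the image $I(x, y, t)$ at the pixel
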




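                  OFCE


• Optical flow equation: geometrical interpretation
   – The intensity is constant along an iso-intensity contour; the gradient $\nabla I(x, y, t)$ is known
   – Decomposition of the velocity vector: $\mathbf{v} = v_n \mathbf{e}_n + v_t \mathbf{e}_t$
     (normal component along the gradient, tangential component along the contour)
   – The equation constrains only the normal component; the tangential component of the
     velocity vector remains free
   – If $\mathbf{v}$ is a candidate, $\mathbf{v}'$ is also a candidate: any vector whose tip lies on
     the constraint line, parallel to the tangent, satisfies the equation
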
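
The geometrical interpretation can be checked numerically. The sketch below is a minimal illustration, assuming the constraint is written as ix*vx + iy*vy + it = 0 with (ix, iy) the spatial gradient and it the temporal derivative; the function name `normal_flow` and the numeric values are made up for the example.

```python
import math


def normal_flow(ix, iy, it):
    """Velocity component along the gradient that satisfies ix*vx + iy*vy + it = 0."""
    norm = math.hypot(ix, iy)
    if norm == 0.0:
        return None               # homogeneous area: the equation gives no constraint
    vn = -it / norm               # signed speed along the gradient direction
    return (vn * ix / norm, vn * iy / norm)


ix, iy, it = 3.0, 4.0, -10.0      # made-up derivative values
vx, vy = normal_flow(ix, iy, it)
print(vx, vy)

# Adding any multiple of the tangent direction (-iy, ix) gives another candidate.
for k in (0.0, 1.0, -2.5):
    cx, cy = vx - k * iy, vy + k * ix
    print(abs(ix * cx + iy * cy + it) < 1e-9)
```

Every candidate printed by the loop satisfies the constraint, which is exactly why a single equation per pixel cannot determine the full velocity vector.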
                  OFCE


• Aperture problem
   – Homogeneous area: invariant to translation during dt
   – Several sets of motion parameters lead to the same sequence





                  OFCE


• Insufficient local constraints
   – Ill-posed problem
   – Add constraints
       • Global constraints
       • Spatio-temporal constraints
   – Smoothing of optical flow
       • Regularization constraint

                  Estimation


• Methodology
  – Representation
     • Motion model
     • Observation model
  – Estimation method
     • Techniques
     • Criteria
     • Optimization strategies
  – Direction of estimation





                     Representation - model

• Motion representation: models for motion
  – Translational model
      $\mathbf{v}(x, y) = (v_x, v_y)^T = \begin{pmatrix} dx/dt \\ dy/dt \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}$

     ⇒ compression of natural sequences

  – Affine model
      $\mathbf{v}(x, y) = (v_x, v_y)^T = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} + \begin{pmatrix} b_3 & b_4 \\ b_5 & b_6 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$

  – Quadratic models

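
To make the difference between the translational and affine models concrete, here is a minimal Python sketch that evaluates both velocity fields. The parameter names b1 to b6 follow the slide; the function names and the numeric values are hypothetical.

```python
def translational_velocity(b1, b2):
    """Translational model: the same velocity (b1, b2) at every pixel."""
    def v(x, y):
        return (b1, b2)
    return v


def affine_velocity(b1, b2, b3, b4, b5, b6):
    """Affine model: v(x, y) = (b1, b2) + [[b3, b4], [b5, b6]] (x, y)."""
    def v(x, y):
        vx = b1 + b3 * x + b4 * y
        vy = b2 + b5 * x + b6 * y
        return (vx, vy)
    return v


shift = translational_velocity(2.0, -1.0)
print(shift(0, 0), shift(10, 20))   # identical everywhere

# Hypothetical parameters: a zoom centred on the origin.
zoom = affine_velocity(0.0, 0.0, 0.1, 0.0, 0.0, 0.1)
print(zoom(0, 0), zoom(10, 20))     # velocity grows with the distance to the origin
```

Two parameters describe the translational field, six the affine one; the extra four let the model represent zoom, rotation and shear that a single vector cannot capture.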
                     Representation - support

• Spatial support
   – Global
   – Pixels
   – Blocks
   – Regions
   – Meshes





                     Methodology

                     Estimation techniques


• Differential methods
   – Satisfy the OFCE
   – Applied to all pixels

• Matching
   – Characteristics of a set of pixels
       • Support: a set of pixels rather than a single pixel
   – Tracking of the support over frames

• Pixel-recursive techniques

                     Methodology

                     Differential methods


• Differential methods
   – Optical flow: minimisation of a functional built on the OFCE
   – Ill-posed problem: additional regularization constraint
   – Cost function: OFCE term plus regularization term
   – Hypothesis: one object ⇔ one motion
   – Valid for small displacements only





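                     Methodology

                     Matching


• Minimization of DFD
   – Support $B$ (block), inside the image $I$
   – Search area $K \subset I$, with $B \subset K$
   – Matching measure, for a candidate displacement $\mathbf{d}$ that keeps the block inside $K$:
      $E_p(\mathbf{d}) = \sum_{(x, y) \in B} \left| \mathrm{DFD}(x, y, \mathbf{d}) \right|^p$
   – Solution: minimization of DFD
      $\hat{\mathbf{d}} = \arg\min_{\mathbf{d}} E_p(\mathbf{d})$
   – Matching criteria
       • p=1 : Mean of Absolute Differences (MAD) – SAD
       • p=2 : Mean Square Error
       • Intercorrelation coefficient
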
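
The matching criteria for p = 1 and p = 2 can be sketched as follows. This is an illustrative example, not the author's implementation: images are assumed to be lists of rows of integers, the function names are hypothetical, and the sign of the difference is irrelevant because only its absolute value or square is used.

```python
def displaced_differences(current, reference, x0, y0, size, dx, dy):
    """Differences between the block at (x0, y0) in `current` and the block
    displaced by (dx, dy) in `reference`."""
    return [
        current[y0 + j][x0 + i] - reference[y0 + dy + j][x0 + dx + i]
        for j in range(size)
        for i in range(size)
    ]


def matching_measure(diffs, p):
    """Mean of |difference|^p: p=1 gives the MAD, p=2 the mean square error."""
    return sum(abs(d) ** p for d in diffs) / len(diffs)


def sad(diffs):
    """Sum of absolute differences (the MAD without the normalisation)."""
    return sum(abs(d) for d in diffs)


reference = [[0, 0, 0, 0], [0, 5, 6, 0], [0, 7, 8, 0], [0, 0, 0, 0]]
current = [[5, 6, 0, 0], [7, 8, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

for d in [(0, 0), (1, 1)]:
    diffs = displaced_differences(current, reference, 0, 0, 2, d[0], d[1])
    print(d, sad(diffs), matching_measure(diffs, 1), matching_measure(diffs, 2))
```

In this toy data the 2x2 pattern has moved by one pixel in each direction, so all three measures drop to zero for the displacement (1, 1).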
                     Methodology

                     Recursive techniques


• Sub-class of differential techniques
    – Recursive algorithms
        • pel-recursive


• Min(DFD) and optical flow
    – Numerical methods: gradient descent
    – Search for a local minimum as close as possible to the optimal solution

    [Figure: gradient descent on a cost curve, comparing a bad and a correct initial condition, and an adjusted step with a step that is too large]





                     Optimisation

• DFD – looking for the minimum
    – Exhaustive search (full search)
       • Global optimum
       • Costly
            – For an N×N block and a K×K search area: complexity in $O(N^2 \cdot K^2)$

    – Idea: accelerate the search
        • Fast algorithms
        • Sub-optimal solution

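
A minimal sketch of the exhaustive search, assuming images stored as lists of rows, SAD as the criterion, and a search range of ±p pixels around the block position. The function names and the test images in the usage lines are hypothetical. The two nested loops over the displacement, each containing a loop over the block, are where the cost quoted above comes from.

```python
def block_sad(cur, ref, x0, y0, n, dx, dy):
    """SAD between the n x n block at (x0, y0) in cur and its displaced copy in ref."""
    return sum(
        abs(cur[y0 + j][x0 + i] - ref[y0 + dy + j][x0 + dx + i])
        for j in range(n)
        for i in range(n)
    )


def full_search(cur, ref, x0, y0, n, p):
    """Test every displacement in [-p, p] x [-p, p]; return (best (dx, dy), best SAD)."""
    height, width = len(ref), len(ref[0])
    best, best_cost = (0, 0), block_sad(cur, ref, x0, y0, n, 0, 0)
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            # Skip candidates that fall outside the reference image.
            if not (0 <= x0 + dx and x0 + dx + n <= width
                    and 0 <= y0 + dy and y0 + dy + n <= height):
                continue
            cost = block_sad(cur, ref, x0, y0, n, dx, dy)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost


# Made-up test images: `cur` is `ref` displaced by (2, 1), clamped at the borders.
ref = [[16 * y + x for x in range(8)] for y in range(8)]
cur = [[ref[min(y + 1, 7)][min(x + 2, 7)] for x in range(8)] for y in range(8)]
print(full_search(cur, ref, 2, 2, 2, 2))
```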
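                     Methodology

                     Strategies - optimisation


   • Three-Step Search
1. Initialisation: n = 3
2. Search space $K_n$ of size $(2^n + 1) \times (2^n + 1)$

3. MSE or MAE computed on 9 positions

4. Choice of the best candidate

5. n = n – 1

6. Back to step 2; stop once the pass with n = 1 has been done

   [Figure: step 1 uses K3, step 2 uses K2 around the best candidate of step 1, step 3 uses K1 around the best candidate of step 2]
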
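
The steps above can be sketched in Python as follows. This is an illustrative version under stated assumptions: SAD is used as the criterion (the slides mention MSE or MAE), images are lists of rows, candidates outside the reference image are given an infinite cost, and all names are hypothetical. The counter `tested` is returned only to show how many positions are evaluated.

```python
def block_sad(cur, ref, x0, y0, n, dx, dy):
    """SAD between the n x n block at (x0, y0) in cur and its displaced copy in ref."""
    return sum(
        abs(cur[y0 + j][x0 + i] - ref[y0 + dy + j][x0 + dx + i])
        for j in range(n)
        for i in range(n)
    )


def three_step_search(cur, ref, x0, y0, n, steps=3):
    """9 positions per pass, with spacing 2^(k-1) for k = steps, ..., 1."""
    height, width = len(ref), len(ref[0])

    def cost(dx, dy):
        inside = (0 <= x0 + dx and x0 + dx + n <= width
                  and 0 <= y0 + dy and y0 + dy + n <= height)
        return block_sad(cur, ref, x0, y0, n, dx, dy) if inside else float("inf")

    best, best_cost, tested = (0, 0), cost(0, 0), 1
    for k in range(steps, 0, -1):
        step = 2 ** (k - 1)              # 4, 2, 1 when steps = 3
        centre = best                    # best candidate of the previous pass
        for sy in (-step, 0, step):
            for sx in (-step, 0, step):
                if (sx, sy) == (0, 0):
                    continue             # the centre has already been evaluated
                cand = (centre[0] + sx, centre[1] + sy)
                c = cost(*cand)
                tested += 1
                if c < best_cost:
                    best, best_cost = cand, c
    return best, best_cost, tested


# Made-up test images: `cur` is `ref` displaced by (7, -4), clamped at the borders.
ref = [[100 * y + x for x in range(32)] for y in range(32)]
cur = [[ref[min(max(y - 4, 0), 31)][min(max(x + 7, 0), 31)] for x in range(32)]
       for y in range(32)]
print(three_step_search(cur, ref, 12, 12, 4))
```

Because each pass only looks at 9 positions, the search can settle on a local minimum when the cost surface is not smooth; the example data was chosen so that it does converge to the true displacement.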




                     Methodology

                     Strategies - optimisation


   • Advantages
        – Complexity
               • Number of operations (tested positions): $1 + 8 \log_2 2^n = 1 + 8n$, i.e. 25 for n = 3
        – Simple to implement
        – Quick convergence
               • Valid for small displacements


   • Techniques derived from the logarithmic search
        – Diamond search
        – Conjugate Direction Search
        – Modified Motion Estimation Algorithm…

                     Strategies - optimisation

• Multiresolution or hierarchical methods
   – Avoid local (non-global) minima
   – Accelerate convergence

   [Figure: pyramid of layers 2, 1 and 0; the estimate is initialised at layer 2, then enhanced at layers 1 and 0]





                     Strategies - optimisation

• Hierarchical methods
   – Take the scene content into account
   – Hierarchy of segmentation
   – Linked to objects

                     Direction of estimation

    • Estimation / prediction
            – For MPEG-2

    [Figure: two pairs of frames (t-1, t) divided into blocks numbered 1 to 12. Forward: the regular block grid is in frame t-1 and the displaced blocks are in frame t. Backward: the regular block grid is in frame t and the displaced blocks are in frame t-1.]





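                        Block matching (BMA)

• Basic building block of the standards
   – Current frame: macroblock of size N x M at position (x, y)
   – Reference frame: search region extending from -p to +p around (x, y), horizontally and vertically
   – Best match found at (x+u, y+v) in the reference frame; (u, v) is the motion vector
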




                        Block matching

• Example
   [Figure: search space explored to find the minimum]

                        Block matching

• Result:
   – Full search
   – Block size: 8x8
   – Search area: 15x15





                        Block matching


• Block searching: example
   – Sequence
        [Figure: reference image and image to be encoded]
   – Matching
        [Figure: macroblock to be encoded]

                        Block matching


   – Residual of prediction (error)

   [Figure: reference image, image to encode, predicted image, and prediction error image; the prediction error image is what is encoded and transmitted]

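
How the predicted image and the prediction error are obtained from the motion vectors can be sketched as below. The code is a simplified illustration: it assumes image dimensions that are multiples of the block size n, one integer vector (dx, dy) per block, and vectors that always point inside the reference image. The function names are hypothetical.

```python
def motion_compensate(ref, vectors, n):
    """Build the predicted image: each n x n block is copied from the reference
    image at the position displaced by its motion vector (dx, dy)."""
    height, width = len(ref), len(ref[0])
    pred = [[0] * width for _ in range(height)]
    for by in range(0, height, n):
        for bx in range(0, width, n):
            dx, dy = vectors[by // n][bx // n]
            for j in range(n):
                for i in range(n):
                    pred[by + j][bx + i] = ref[by + dy + j][bx + dx + i]
    return pred


def residual(cur, pred):
    """Prediction error image: current image minus predicted image."""
    return [[c - p for c, p in zip(crow, prow)] for crow, prow in zip(cur, pred)]


ref = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
vectors = [[(2, 0), (0, 0)], [(0, -2), (-2, -2)]]   # made-up vectors, one per 2x2 block
pred = motion_compensate(ref, vectors, 2)
cur = [row[:] for row in pred]
cur[0][0] += 1                                      # one pixel the prediction misses
print(pred)
print(residual(cur, pred))
```

When the prediction is good the residual is mostly zeros, which is what makes it cheaper to encode than the image itself.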




                        Block matching

• Block size
   – Small blocks: better estimation
   – But complexity and number of vectors increase

    ⇒ Adaptation of the block size
    ⇒ Fast algorithms suited to large blocks
• Estimation criterion
   – Classically SAD (sum of absolute differences)
   – DFD

                        Block matching

• Advantages of BMA (Block Matching
  Algorithm)
   – Low cost: one vector per block
   – Implementation on embedded systems:
     parallelism
   – Robust

• Drawbacks
   – One block can contain several objects
   – The arg min of the criterion is not necessarily the real motion
   – No detection of zoom, rotations, or local
     deformations
   – Blocking artifacts





                        Block matching


• Hierarchical Block Matching (HBMA)
  – Resolution-scalable video scheme (layers)

  [Figure: motion vectors estimated at level 2, level 1 and level 0]

                        Block matching

• Subpixel motion estimation
   – Real motion: not necessarily an integer number of pixels
   – Principle:
       • estimation at pixel level
       • interpolation (bilinear)
       • estimation at half-pixel level, etc.

   [Figure: half-pixel estimation, with two values of 5.5 labelled 1 and 2]

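
The bilinear interpolation used at the half-pixel level can be sketched as follows. This is a minimal illustration, assuming non-negative coordinates expressed in half-pixel units (so x2 = 3 means x = 1.5) and an image stored as a list of rows; the function name `sample_half_pel` is hypothetical.

```python
def sample_half_pel(img, x2, y2):
    """Sample img at (x2 / 2, y2 / 2), coordinates given in half-pixel units.
    Half positions are the bilinear average of the neighbouring pixels."""
    x0, y0 = x2 // 2, y2 // 2
    x1 = x0 + (x2 % 2)        # equals x0 at integer positions
    y1 = y0 + (y2 % 2)        # equals y0 at integer positions
    return (img[y0][x0] + img[y0][x1] + img[y1][x0] + img[y1][x1]) / 4.0


img = [[0, 10], [20, 30]]
print(sample_half_pel(img, 0, 0))   # integer position
print(sample_half_pel(img, 1, 0))   # halfway between two horizontal neighbours
print(sample_half_pel(img, 1, 1))   # centre of the four pixels
```

A half-pixel refinement would then evaluate the matching criterion with this sampler at the eight half-pixel positions around the best integer vector.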




                        Block matching

• Benefit of motion compensation
       example: H.263 coder





                     Standards: principles

• History of standards
    – ISO / ITU-T

                     Standards: principles

• Specification
   – Decoder
   – Bitstream

• Decomposed in parts
   –    Part 1: Systems (synchronisation, multiplexing...)
   –    Part 2: Video
   –    Part 3: Audio
   –    Part 4: Conformance
   –    Etc.

   ⇒ MPEG-1 is divided into 5 parts, MPEG-2 into 9 and MPEG-4 into 16.





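                     Standards: principles

• Hybrid coder
   – Prediction error: E(t) = I(t) − Ip(t), where Ip(t) is the predicted image
   – E(t) goes through DCT / Q, then the coder
   – Local decoding loop: DCT⁻¹ / Q⁻¹ gives Ê(t); Î(t) = Ê(t) + Ip(t) is stored in memory
   – Temporal predictor: the motion estimator compares I(t) with the reference image Î(t-1)
     and outputs motion vectors; motion compensation applies them to Î(t-1) to build Ip(t)
   – The motion vectors are also sent to the coder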
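
The reason the coder contains its own decoding loop can be shown with a deliberately simplified sketch. It keeps only the prediction loop and the quantiser: the DCT is omitted, motion compensation is replaced by a zero-motion prediction from the previous reconstructed frame, and frames are flat lists of pixel values. All names are hypothetical and this is not the structure of any particular standard.

```python
def quantise(e, q):
    """Uniform quantiser: index of the nearest multiple of q."""
    return int(round(e / q))


def encode_sequence(frames, q):
    """Closed-loop predictive coding. Returns the quantised residual indices
    and the reconstructions computed inside the encoder."""
    ref = [0] * len(frames[0])           # memory: previous reconstructed image
    indices, recon = [], []
    for frame in frames:
        pred = ref                       # predicted image (zero-motion assumption)
        idx = [quantise(i - p, q) for i, p in zip(frame, pred)]
        rec = [p + k * q for p, k in zip(pred, idx)]   # decoded error + prediction
        indices.append(idx)
        recon.append(rec)
        ref = rec                        # the reconstruction becomes the reference
    return indices, recon


def decode_sequence(indices, q):
    """Decoder: the same loop, driven only by the transmitted indices."""
    ref = [0] * len(indices[0])
    out = []
    for idx in indices:
        rec = [p + k * q for p, k in zip(ref, idx)]
        out.append(rec)
        ref = rec
    return out


frames = [[10, 23, 37], [12, 25, 41]]    # made-up pixel values
indices, recon = encode_sequence(frames, 4)
print(indices)
print(decode_sequence(indices, 4) == recon)
```

Because the encoder predicts from its own reconstructions rather than from the original frames, encoder and decoder stay in step and the quantisation error does not accumulate from frame to frame.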
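                     Standards: principles

• Associated decoder
   – DCT⁻¹ / Q⁻¹ reconstructs the prediction error Ê(t)
   – Motion compensation of the reference image Î(t-1), read from memory, gives Ip(t)
   – Reconstructed image: Î(t) = Ê(t) + Ip(t), stored in memory as the next reference image
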




                   Common characteristics
                                                                        58

• Hierarchy of video data: 6 layers
   – Video Sequence
   – Group Of Pictures (GOP)
   – Frame
   – Slice
   – Macroblock (4 blocks)
   – Block (8 x 8 pixels)

                    Common characteristics
                                                                      59

• 3 types of images: I, P, B

                I   B   B   P   B   B   P

• I frames (key frames)
   – Intraframe coding
   – Random access
• P frames (predicted)
   – Encoding from the previous I or P frame (inter)
   – Error propagation

• B frames (bidirectional)
   – Coding from the previous and the next reference frames
   – Best compression rate





                    Common characteristics                         60


• Interest of B frames

  [Figure: frame Y lies between the past reference frame X and the future
   reference frame Z. Part of the information in Y is available from X (past
   reference frame); the rest is available from Z (future reference frame).]

                   Common characteristics
                                                           61

• Group Of Pictures (GOP)
    – Group of pictures between two I frames (display order)
               I    B   B   P    B    B   P


    – Frames reordered for coding purposes (coding order)
               I    P   B   B    P    B   B


• Macroblock coding in P or B frames
    – Inter coding
    – Intra coding: when no prediction is possible (e.g. newly
      uncovered areas)
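
The reordering from display order to coding order can be sketched as follows. This is a minimal illustration, assuming frames are given as strings starting with their type letter; the function name is hypothetical.

```python
def display_to_coding_order(display):
    """Reorder frames so that every B frame follows both of its references."""
    coding, pending_b = [], []
    for frame in display:
        if frame[0] == "B":
            pending_b.append(frame)      # wait for the next I or P reference
        else:
            coding.append(frame)         # the reference frame is coded first
            coding.extend(pending_b)     # then the B frames that depend on it
            pending_b = []
    return coding + pending_b            # trailing B frames, if any, stay last
```

Applied to I B B P B B P, this yields I P B B P B B: each P frame is moved ahead of the two B frames that need it as a future reference.
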





                            Outlines
                                                           62

•   Introduction : Image and video sequences
•   Motion Estimation / compensation
•   Block matching
•   Principles of video standards
•   MPEG-1 /MPEG-2
•   MPEG-4




                MPEG introduction
                                             63




• MPEG-1, a standard for storage and
  retrieval of moving pictures and audio on
  storage media (1992)
• MPEG-2, a standard for digital television
  (1994)
• MPEG-4, a standard for multimedia
  applications (1998)
• MPEG-7, a content representation standard
  for information searching (2001).




                     Outlines
                                             64

 • MPEG-1 / MPEG-2




                     MPEG-1 / MPEG-2
                                                                 65

• MPEG-1: progressive video coding up to 1.5 Mb/s
  – 1992
  – Applications: CD-ROM, CD-i …
  – Similar to H.261.
     • Major difference is bi-directional motion compensation.
  – Bitrates: ~ 1.15 Mb/s for video, 128/256 kb/s for audio.
  – Origin of the “MP3” format (MPEG-1 Audio Layer III)
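
To give an order of magnitude for the compression involved, the sketch below compares the video bitrate quoted above with an uncompressed source. The source format is an assumption made for the example (352x288 pictures, 25 images/s, 4:2:0 sampling at 8 bits per sample, i.e. 12 bits per pixel); it is not stated on the slide.

```python
def raw_bitrate(width, height, fps, bits_per_pixel=12):
    """Uncompressed bitrate in bit/s (12 bits/pixel = 8-bit 4:2:0 sampling)."""
    return width * height * bits_per_pixel * fps


raw = raw_bitrate(352, 288, 25)     # assumed source format
ratio = raw / 1.15e6                # against ~1.15 Mb/s of MPEG-1 video
print(f"raw = {raw / 1e6:.1f} Mb/s, compression ratio ~ {ratio:.0f}:1")
```

Under these assumptions the raw stream is about 30.4 Mb/s, so the coder has to achieve a ratio of roughly 26:1.
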








                     MPEG-1 / MPEG-2
                                                                 66

• MPEG-2: digital TV broadcasting
  – 1994
  – Configurable by the user (several profiles).
  – Higher bitrates: 3-15 Mbit/s for standard TV (15-30 Mbit/s
    for HDTV).
  – DVDs, digital TV, HDTV (US and Europe).
  – Interlaced video.
  – Multi-channel sound




                                  MPEG-1 / MPEG-2
                                                                                             67

Sequence is a hierarchical data structure:

    1. Sequence layer: Sequence layer overhead +
    GOP layer data;
    2. GOP layer: GOP layer overhead + Picture layer
    data;
    3. Picture layer: Picture layer overhead + Slice
    layer data;
    4. Slice layer (sequence of MB): Slice layer
    overhead + MB layer data;
    5. MB layer: MB layer overhead + Block layer data;
    6. Block layer: Transformed coefficients + End-of-
    block (EOB).
Slice: if part of a slice is lost, the decoder will skip the rest of the slice when it
    detects the error and start decoding from the beginning of the next slice.
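
The slice-level error recovery described above can be sketched as follows. This is a simplified model, assuming each slice is a list of macroblocks in which a corrupted macroblock is represented by None; real decoders detect errors from invalid syntax and resynchronise on slice start codes.

```python
def decode_slices(slices):
    """Return the macroblocks recovered from each slice.

    On the first error inside a slice, the rest of that slice is
    skipped and decoding resumes at the start of the next slice.
    """
    recovered = []
    for macroblocks in slices:
        good = []
        for mb in macroblocks:
            if mb is None:       # error detected inside this slice
                break            # skip the remainder of the slice
            good.append(mb)
        recovered.append(good)
    return recovered
```

Shorter slices therefore limit the damage of a transmission error, at the cost of more slice layer overhead.
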




                                  MPEG-1 / MPEG-2
                                                                                             68

• Intra image coding
      intra coding for macroblocks
     – JPEG-like coding
     – Two quantizers:
           • intra, non-intra
     – high bitrate

  [Figure: a 16x16 macroblock made of four 8x8 Y blocks, plus one Cr block
   and one Cb block.]

                         MPEG-1 / MPEG-2
                                                                             69

• 2 Quantizations
JPEG’s scheme with quantization specific to intra blocks or to
  non-intra blocks (differentially coded pictures)

   – Intra block:
        contains all frequencies
        likely to produce blocking effects if coarsely quantized.
   – Non-intra block:
        predominance of high frequencies (interpolation or prediction
        error)
        tolerates coarse quantization


So, 2 different quantizers, both near uniform, but with
  different behaviour around zero
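
The difference in behaviour around zero can be illustrated as follows. This is a sketch, not the normative MPEG arithmetic: it assumes that the intra quantizer rounds to the nearest level while the non-intra quantizer truncates toward zero, which gives it a wider zero bin (dead zone). Quantization matrices are ignored and the function names are illustrative.

```python
def quantize_intra(coeff, step):
    """Uniform quantizer with rounding: no dead zone around zero."""
    sign = -1 if coeff < 0 else 1
    return sign * int(abs(coeff) / step + 0.5)


def quantize_non_intra(coeff, step):
    """Uniform quantizer with truncation toward zero: wider zero bin."""
    sign = -1 if coeff < 0 else 1
    return sign * int(abs(coeff) / step)
```

With a step of 8, a coefficient of 7 is kept as level 1 by the intra quantizer but set to 0 by the non-intra one, so small prediction residuals cost no bits.
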





                         MPEG-1 / MPEG-2
                                                                             70




                                  MPEG-1 / MPEG-2                                   71


• Inter image coding (on P and B images): each macroblock is coded
  with an I, P or B strategy
        • intra-coded blocks can appear inside inter-coded images
             – when temporal prediction is inefficient



    – Block types allowed per image type:

                         Image       I    P      B

                         Block I     X    X      X
                         Block P          X      X
                         Block B                 X








                                  MPEG-1 / MPEG-2                                   72


•   Inter image coding principles
    1. Intra coding (if inter coding performs badly…)
    2. Block matching algorithm (BMA) on luminance
        •   spatial or temporal DPCM on motion vectors
        •   Entropy coding
    3. Residual error (source – predicted):
        •   DCT on each 8x8 block (typically 6 blocks in 1 macroblock)
        •   Entropy coding as for intra (different quantization tables)
    4. Mode selection
        •   "Skip": macroblock is identical to the co-located one in the reference frame
        •   Otherwise choice between I, P or B coding
             – based on the overall number of bits
        •   Macroblock tagging

 Remark: B image
    – Comparison between the 3 prediction methods for temporal
      DPCM
    – hence higher complexity
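
The DPCM step applied to motion vectors in point 2 can be sketched as follows. This is a minimal illustration of spatial DPCM, assuming the predictor of each vector is simply the vector of the previous macroblock (starting from (0, 0)); the actual predictor and its reset rules are defined by each standard.

```python
def dpcm_encode(vectors):
    """Replace each (dx, dy) by its difference with the previous vector."""
    previous = (0, 0)
    diffs = []
    for dx, dy in vectors:
        diffs.append((dx - previous[0], dy - previous[1]))
        previous = (dx, dy)
    return diffs


def dpcm_decode(diffs):
    """Rebuild the vectors by accumulating the differences."""
    previous = (0, 0)
    vectors = []
    for ddx, ddy in diffs:
        previous = (previous[0] + ddx, previous[1] + ddy)
        vectors.append(previous)
    return vectors
```

Neighbouring macroblocks tend to move together, so the differences cluster around zero and are cheaper to entropy code than the vectors themselves.
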


                             MPEG-1 / MPEG-2
                                                                                    73

    – B image








                             MPEG-1 / MPEG-2
                                                                                    74

B-image coding
    Two best-match motion vectors are estimated for each MB:
      1: from the previous I or P image => d1 (forward)
      2: from the next I or P image => d2 (backward)

Compensation is based on these 2 vectors, with 3 possibilities:
        1.   d1,
        2.   d2: a just-uncovered area cannot be predicted from the past, only from the future,
        3.   mean of the compensations from d1 and d2, which decreases the effect of
             noise.

Intra or predictive mode is selected for each MB,
so for a MB of a B frame:
    –   Intra mode
    –   Inter or predictive mode: forward, backward, average.
             Motion estimation and compensation with ½ pixel precision.
    –   Skip
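
The choice between the three predictive modes can be sketched as follows. This is a simplified illustration: it assumes the forward and backward blocks have already been motion-compensated with d1 and d2, and it selects the mode by minimum sum of absolute differences (SAD), whereas a real encoder may also weigh the number of bits. The function names are illustrative.

```python
def sad(block, prediction):
    """Sum of absolute differences between two blocks (flat lists)."""
    return sum(abs(a - b) for a, b in zip(block, prediction))


def best_b_prediction(block, forward, backward):
    """Pick the forward, backward or averaged prediction by minimum SAD.

    `forward` and `backward` are the blocks compensated with d1 and d2;
    the average is rounded to the nearest integer.
    """
    average = [(f + b + 1) // 2 for f, b in zip(forward, backward)]
    candidates = {"forward": forward, "backward": backward, "average": average}
    mode = min(candidates, key=lambda name: sad(block, candidates[name]))
    return mode, candidates[mode]
```

When the two references carry independent noise, the averaged prediction often has the smallest residual, which is the noise-reduction effect mentioned above.
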


                                 MPEG-1 / MPEG-2
                                                                                                75




                         Prediction modes for Macroblock in B-frame
  x is the coordinate of pel inside MB, mv01 the motion relative to reference picture I0, m21
                            the motion relative to reference picture








                                                                                                76

• Bitrate in a GOP




                       MPEG-1 / MPEG-2
                                                                   77

Flexibility
• coding using only I pictures (I I I I...)
     highest degree of random access but low compression

•    coding without B pictures (I P P P P P P I P P P P...)
     some degree of random access (FF/FR functionality) and
     moderate compression.

•    coding with all 3 picture types (I, P, B)
     high compression, good random access (FF/FR
     functionality) but coding delay.





                   MPEG-2 versus MPEG-1
                                                                   78

• Efficient Encoding of interlaced sequences:
  broadcast
    –   other scanning modes
    –   Motion compensation on blocks 16x8
    –   New prediction modes
    –   Encoding 4:2:2

• Coding improvements:
    – New quantization characteristics
    – New VLC tables

• Scalability
                                          MPEG-2
                                                                                      79
MPEG-2 scalable extensions
• SNR scalable: similar to the JPEG progressive DCT-based method. 2
  layers encode the signal at the same resolution. Base layer: coarse
  quantization. Enhancement layer: encodes the difference between the
  non-quantized DCT coefficients and the quantized coefficients of the
  base layer.

• SPATIAL scalable: based on the classical pyramidal approach.

• TEMPORAL scalable: layering is achieved by providing a temporal
  prediction for the enhancement layer from the base layer.

• DATA PARTITIONING: intended to assist with error concealment in
  the presence of transmission or channel errors. Breaks the blocks of
  transformed coefficients into 2 layers:
     – more critical lower-frequency coefficients and side information (DC
       values, motion vectors),
     – lower-priority bitstream (higher-frequency AC coefficients).
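
The SNR scalable scheme can be sketched as follows. This is a minimal two-layer illustration on a list of DCT coefficients, assuming plain uniform quantizers; the step sizes and function names are arbitrary choices for the example, not values from the standard.

```python
def snr_scalable_encode(coeffs, base_step=16, enh_step=4):
    """Two-layer coding of DCT coefficients at the same resolution."""
    base = [int(round(c / base_step)) for c in coeffs]             # coarse layer
    residual = [c - b * base_step for c, b in zip(coeffs, base)]   # lost by base
    enhancement = [int(round(r / enh_step)) for r in residual]     # finer layer
    return base, enhancement


def snr_scalable_decode(base, enhancement=None, base_step=16, enh_step=4):
    """Decode the base layer alone, or refine it with the enhancement layer."""
    values = [b * base_step for b in base]
    if enhancement is not None:
        values = [v + e * enh_step for v, e in zip(values, enhancement)]
    return values
```

A decoder that only receives the base layer still gets a usable, coarser picture; adding the enhancement layer reduces the quantization error.
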








                                          MPEG-2
                                                                                      80

    Introduction of « profiles » and « levels »
     "Profiles" describe functionalities and "levels" describe
    resolutions; together they stipulate conformance between
    equipment not supporting the full implementation. Profiles and
    levels provide a means for defining sub-sets of the syntax and
    thus the decoder capabilities required to decode a particular
    bitstream.
•   A profile is a defined sub-set of the entire bitstream syntax. A
    profile may contain one or more levels. Each profile defines a new
    set of algorithms added as a superset to the algorithms of the
    profile below.
•   A level is a defined set of constraints imposed on parameters
    in the bitstream. A level specifies the range of the parameters
    that are supported by the implementation.
•   There are 4 levels: low, main, high-1440 and high.
•   There are 5 profiles: simple, main, SNR scalable, spatially
    scalable and high.



                                          MPEG-2
                                                                                               81


High Level                 4:2:0                                      4:2:0 or 4:2:2
                           1920x1152                                  1920x1152
                           60 im/s                                    60 im/s
                           80 Mbit/s                                  80 Mbit/s


H-1440 Level               4:2:0                      4:2:0           4:2:0 or 4:2:2
                           1440x1152                  1440x1152       1440x1152
                           60 im/s                    60 im/s         60 im/s
                           60 Mbit/s                  60 Mbit/s       60 Mbit/s


Main Level     4:2:0       4:2:0       4:2:0                          4:2:0 or 4:2:2
               720x576     720x576     720x576                        720x576
               30 im/s     30 im/s     30 im/s                        30 im/s
               15 Mbit/s   15 Mbit/s   15 Mbit/s                      20 Mbit/s


Low Level                  4:2:0       4:2:0
                           352x288     352x288
                           30 im/s     30 im/s
                           4 Mbit/s    4 Mbit/s

               SIMPLE      MAIN        SNR Scalable   Spatial         HIGH
               PROFILE     PROFILE     PROFILE        Scalable        PROFILE
                                                      PROFILE





                                         Outlines
                                                                                               82

•   Introduction : Image and video sequences
•   Motion Estimation / compensation
•   Block matching
•   Principles of video standards
•   MPEG-1 /MPEG-2
•   MPEG-4




                                  MPEG-4
                                                                      83

• MPEG-4 part 2 : "visual"
   – DivX
   – Object

• Needs
   – Interaction : content concept
   – Integration : integrating natural or synthetic objects in a scene
     (compositing)
   – Flexibility and extensibility (scalability)
   – Robustness to transmission errors

• Fusion of 3 worlds
   – telecommunications
   – computers
   – TV/film areas




                                  MPEG-4
                                                                      84

• Final applications:
   – Digital TV and videoconferencing
   – Real-time communications
        Latency minimization
   –   Multimedia on mobile networks
   –   TV production, augmented reality, virtual reality
   –   Games
   –   Streaming video




                                MPEG-4
                                                                   85

Overview
• An object based representation
• A scene is coded as a composition of natural as well as
  synthetic objects
• The central concept: the audio-visual object
• MPEG-4 is constructed as a tool box








                                MPEG-4
                                                                   86

Consequently MPEG-4 provides a standardized way to
  describe a scene, allowing for example to:
   – place media objects anywhere in a given coordinate
     system
   – apply transformations to the geometrical or acoustical
     appearance of a media object
   – group primitive media objects in order to form compound
     media objects
   – apply streamed data to media objects, in order to modify
     their attributes
   – change ( interactively ) the user’s viewing and listening
     points anywhere in the scene.




                         MPEG-4
                                                    87








                         MPEG-4
                                                    88

• Video Coding base layer
  – Based on MPEG-2
• Improvements over MPEG-1/2
  – Global motion estimation
  – Motion compensation with ¼-pixel precision
     • macroblock
     • block
  – Improved prediction of motion vectors




                        MPEG-4
                                                      89

Structure and syntax of video objects
An MPEG-4 visual scene may consist of one
  or more video objects (VOs), each corresponding to a
  particular object in the scene with its temporal
  and spatial information.








                        MPEG-4
                                                      90
• New data hierarchy
   – Visual Object Sequence
   – Visual Objects (VOs)
   – Video Object Layers (VOLs)
   – Groups of VOPs (GOVs)
   – Video Object Planes (VOPs)
   – Video packets

                                 MPEG-4
                                                                     91

The hierarchical levels are:
• Visual Object Sequence (VS): The complete MPEG-4 scene
  which may contain any 2-D or 3-D natural or synthetic objects
  and their enhancement layers.
• Video Object (VO): A video object corresponds to a particular
  (2-D) object in the scene. A rectangular frame, or an arbitrarily
  shaped object corresponding to an object or background of the
  scene.
• Video Object Layer (VOL): Each video object can be
  encoded in scalable (multi-layer) or non-scalable form (single
  layer), represented by the video object layer (VOL). The VOL
  provides support for scalable coding.
• Group of Video Object Planes (GOV). GOVs can provide
  points in the bitstream where video object planes are encoded
  independently from each other, and can thus provide random
  access points into the bitstream. GOVs are optional.
• Video Object Plane (VOP): A VOP is a time sample of a video
  object. VOPs can be encoded independently of each other, or
  dependent on each other by using motion compensation.





                                 MPEG-4
                                                                     92




                               MPEG-4
                                                       93

An MPEG-4 visual bitstream:
a hierarchical description of a visual scene.
 Each level of the hierarchy can be accessed
 in the bitstream by special code values
 called start codes.








                               MPEG-4
                                                       94

• Object Definition : VOP
   – Smallest rectangle containing the object
   – Integer Number of macroblocks


• The VOPs do not necessarily have same spatial and
  temporal resolutions
• Encoding
   – Shape, motion, texture
   – Layers VOL
   – Possible individual decoding




                                    MPEG-4
                                                                         95

• The shape information is used to form a VOP. A VOP is
  composed of macroblocks. The VOP is formed by first drawing
  the tightest rectangle around the object. The rectangle is then
  extended to a bounding rectangle that contains a multiple of
  macroblocks. This ensures that the VOP contains a minimum
  number of macroblocks to represent the object.
• A macroblock contains a section of the luminance component
  and the spatially subsampled chrominance components. The
  format is 4:2:0. In this format, each macroblock contains
        •   4 luminance blocks (Y, numbered 1 to 4),
        •   2 chrominance blocks (CB, numbered 5, and CR, numbered 6).
• Each block contains 8x8 pixels, and is encoded using the DCT
  transform.
• A macroblock carries the shape information, motion
  information, and texture information.
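
The construction of the VOP rectangle can be sketched as follows. This is a simplified illustration, assuming the shape is given as a binary mask (list of rows of 0/1) and that the tightest rectangle is extended to the right and downwards only; a real encoder may also shift the rectangle to minimise the number of macroblocks. MB_SIZE and the function name are illustrative.

```python
MB_SIZE = 16  # macroblock size in luminance pixels


def vop_bounding_box(mask):
    """Return (x, y, width, height) of a macroblock-aligned VOP rectangle.

    The tightest rectangle around the non-zero pixels is extended until
    its width and height are multiples of MB_SIZE.
    """
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for row in mask for x, v in enumerate(row) if v]
    if not rows:
        return None                               # empty shape: no VOP
    x0, y0 = min(cols), min(rows)
    width = max(cols) - x0 + 1
    height = max(rows) - y0 + 1
    # round both dimensions up to the next multiple of the macroblock size
    width = -(-width // MB_SIZE) * MB_SIZE
    height = -(-height // MB_SIZE) * MB_SIZE
    return x0, y0, width, height
```

For an object 20 pixels wide and 10 pixels high, the VOP becomes 32x16 pixels, i.e. 2 macroblocks.
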







                                    MPEG-4
                                                                         96




                           MPEG-4
                                            97



• Interest of VOP coding
   – Example "Akiyo"








                           MPEG-4
                                            98

• Binary shape (mask)




• Texture




• Fixed background




                  MPEG-4: coding VOPs
                                                            99

• Shape coding
  – binary mask Coding
  – Data per pixel on 8 bits

• Motion coding
  – Block inside the VOP: BMA
  – Block on border of the object: modified BMA

• Texture coding
  – residual error after motion compensation
  – Border Blocks : padding on pixels outside the
    object
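
The idea of padding border blocks can be sketched on a single row of pixels. This is a deliberately simplified illustration, assuming each pixel outside the object is replaced by the nearest object pixel of the same row; the padding process specified in MPEG-4 is more elaborate (horizontal then vertical passes, with averaging between two boundaries).

```python
def pad_row(pixels, mask):
    """Fill pixels outside the object with the nearest object pixel of the row."""
    inside = [i for i, m in enumerate(mask) if m]
    if not inside:
        return list(pixels)                  # no object pixel in this row
    padded = []
    for i, value in enumerate(pixels):
        if mask[i]:
            padded.append(value)             # object pixel: kept as is
        else:
            nearest = min(inside, key=lambda j: abs(j - i))
            padded.append(pixels[nearest])   # copy the closest object pixel
    return padded
```

Padding avoids the sharp artificial edge between object and background inside a border block, which would otherwise spread energy over many DCT coefficients.
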





                  MPEG-4: functionalities
                                                          100

• Interaction: some possible manipulations
  inside a scene
  –   Modification of spatial position of an object (VOP)
  –   Application of a zoom factor to an object
  –   Changing object speed
  –   Adding or suppressing objects




                      MPEG-4: functionalities
                                                           101

• Spatial Scalability
   – VOP layer
   – Main bitstream: base layer
   – Complementary Information: enhancement layer
         • B-VOP and P-VOP


              P                B                    B




         I                     P                     P





                      MPEG-4: functionalities
                                                           102

 • Temporal Scalability
    – Frame level
    – Main bitstream: base layer
    – Complementary Information:
      enhancement layer
         • Decoding according to terminal possibilities


Enhancement       B     B             B     B
   layer



 base     I                    P                      P
layer
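
Temporal scalability can be sketched as a simple frame selection at the decoder. This is a minimal illustration, assuming frames are given as strings starting with their type letter and that the base layer carries the I and P frames while the enhancement layer carries the B frames, as in the figure; the function name is hypothetical.

```python
def select_frames(frames, decode_enhancement):
    """Keep the base layer (I and P), plus the B frames if the terminal can decode them."""
    return [f for f in frames if decode_enhancement or f[0] in "IP"]
```

A terminal with limited resources decodes only I P P at a reduced frame rate; a more capable one decodes the full I B B P B B P sequence.
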
                  MPEG-4: functionalities
                                                                    103

• Robustness: error resilience
     1. Detection
     2. Resynchronisation
     3. Data recovery
     4. Error concealment

  [Figure: image sequence Image N-1, X (damaged image), Image N+1,
   illustrating steps 1 to 4.]

• Resynchronisation markers
• Reversible Codewords (RVLC)

  [Figure: a packet between two resynchronisation markers, containing motion
   data followed by errors (XXXX). 1. Detection during forward decoding.
   2. Resynchronisation at the next marker. 3. Data recovery by backward
   decoding.]





                               MPEG-4
                                                                    104

   Face Animation
   – The MPEG-4 standard allows using
     proprietary 3D face models that are
     resident at the decoder as well as
     transmission of face models such that the
     encoder can predict the quality of the
     presentation at the decoder.
   – Face animation integrates text to speech
     synthesis
   – Body animation.

                     MPEG-4
                                               105








                     MPEG-4
                                               106

Mesh Animation
– MPEG-4 version-1 supports 2D uniform or
  content-based (nonuniform) Delaunay
  triangular mesh representation of arbitrary
  visual objects, that includes an efficient
  method for animation of such meshes.
– 3D generic animation is integrated in
  MPEG4 version 2.




                  MPEG-4
                                        107








                 competitors
                                        108

• H.264/AVC FRExt
    the best performing
• VC-1 (2006)
    SMPTE standard, comparable to H.264/AVC
• Theora (BSD license)
    DCT-based (derived from VP3)
• WebM (Creative Commons license)
    free format from Google
                                 Future
                                                                109

• HEVC (High Efficiency Video Coding)
The successor of H.264/AVC
With
   • half the bitrate for the same subjective quality
   • hardware complexity no more than 3 times higher
   • specification planned for July 2012







