Introduction to Image and Video Compression
Need for Image & Video Compression
 Uncompressed video
    640 x 480 resolution, 8 bit (1 bytes) colour, 24 fps
       307.2 Kbytes per image (frame)
       7.37 Mbytes per second
       442 Mbytes per minute
       26.5 Gbytes per hour
    640 x 480 resolution, 24 bit (3 bytes) colour, 30 fps
       921.6 Kbytes per image (frame)
       27.6 Mbytes per second
       1.66 Gbytes per minute
       99.5 Gbytes per hour
    Given a 100 Gigabyte disk (100 x 10^9 bytes), can store about 1-4
    hours of high quality video
 MPEG-1 compresses video to 187 Kbytes/second
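
The figures above can be rechecked with a short calculation. This is a minimal sketch; the function name `raw_video_rate` is made up for illustration, and decimal units are assumed (1 KB = 1000 bytes), which is what the slide's numbers imply.

```python
def raw_video_rate(width, height, bytes_per_pixel, fps):
    """Return (bytes per frame, bytes per second) for uncompressed video."""
    frame_bytes = width * height * bytes_per_pixel
    return frame_bytes, frame_bytes * fps

# The two cases from the slide: 8-bit colour at 24 fps, 24-bit colour at 30 fps.
for bytes_per_pixel, fps in [(1, 24), (3, 30)]:
    frame, per_second = raw_video_rate(640, 480, bytes_per_pixel, fps)
    print(f"{bytes_per_pixel} byte(s)/pixel at {fps} fps: "
          f"{frame / 1e3:.1f} KB/frame, "
          f"{per_second / 1e6:.2f} MB/s, "
          f"{per_second * 60 / 1e9:.2f} GB/minute, "
          f"{per_second * 3600 / 1e9:.1f} GB/hour")
```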

  Need for Image & Video Compression

– Raw video contains an immense amount of data
– Communication and storage capabilities are limited and expensive
• Example HDTV video signal:
– 720x1280 pixels/frame, progressive scanning at 60 frames/s:




– 20 Mb/s HDTV channel bandwidth
→ Requires compression by a factor of about 70 (equivalent to about 0.35 bits/pixel)
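
The raw HDTV rate is not spelled out in the text, so here is a rough check. It assumes 24 bits per pixel, which is not stated on the slide but is what a factor near 70 and roughly 0.35 bits/pixel imply. The function name is illustrative only.

```python
def compression_needed(width, height, fps, bits_per_pixel, channel_bps):
    """Return (raw bit rate, required compression factor, channel bits per pixel)."""
    pixels_per_second = width * height * fps
    raw_bps = pixels_per_second * bits_per_pixel
    return raw_bps, raw_bps / channel_bps, channel_bps / pixels_per_second

# 720 x 1280 progressive at 60 frames/s over a 20 Mb/s channel.
# ASSUMPTION: 24 bits per pixel for the raw signal.
raw_bps, factor, bits_per_pixel = compression_needed(1280, 720, 60, 24, 20e6)
print(f"raw: {raw_bps / 1e9:.2f} Gb/s, factor: {factor:.0f}, "
      f"budget: {bits_per_pixel:.2f} bits/pixel")
```

Under that assumption the raw rate is about 1.3 Gb/s and the factor comes out in the high 60s, which the slide rounds to 70.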




Compression
 Based on Information Theory as described by
 Claude Shannon in 1948

 Lossless and lossy methods

 Efficient binary representation of information

 Remove redundancy
Compression Theory
(lossy/lossless)
[Figure: original data (5, 4, 2) passed through a lossy codec and a lossless codec. The lossy codec decodes to 4.9, 3.9, 1.9 (mild error) or 3.1, 2.1, 0 (high error); the lossless codec decodes to exactly 5, 4, 2.]
Compression theory

[Figure: original picture, 800 x 600, 1.37 MB, next to the decoded picture with low losses, 85 KB]
Compression theory

[Figure: original picture, 800 x 600, 1.37 MB, next to the decoded picture with high losses, 16 KB]
  Compression Theory (Entropy)

[Figure: a stream of data values divided into entropy and redundancy]
Compression Theory (Entropy)


[Figure: lossy compression preserves only part of the entropy; optimal lossless compression preserves the full entropy]
Image Compression and Formats
 RLE
 Huffman
 LZW
 GIF
 JPEG
 Fractals
 TIFF, PICT, BMP, etc.

Video Compression and Formats
 H.261/H.263
 Cinepak
 Sorenson
 Indeo
 Real Video
 MPEG-1, MPEG-2, MPEG-4, etc.
 QuickTime, AVI

Special Coding Requirements
 Viewing real-time source information
   End-to-end delay (EED) should not exceed 150-200 ms
   Face-to-face application needs EED of 50ms (including
   compression and decompression)
 Interactive viewing - random access
   Random access to single images and audio frames, access
   time should be less than 0.5 sec
   Decompression of images, video, audio should not be linked
   to other data units to support random access



Compression Steps

[Figure: Uncompressed Picture → Picture Preparation → Picture Processing → Quantization → Entropy Coding → Compressed Picture, with an adaptive feedback loop]
Picture Preparation
 Analog-to-digital conversion
 Generate appropriate digital representation
 Divide picture into blocks (usually 8x8 pixels)
 Fix the number of bits per pixel




Picture Processing
 Transform to frequency domain
   E.g., use the Discrete Cosine Transform (DCT)
 Compute motion vectors for each block




Quantization and Coding
 Map real numbers to integers
   E.g., the DC and AC coefficients from the DCT are
   real numbers, but we only want to store integers


 Entropy coding
   Compress a sequential bit stream without loss




Types of Compression
 Symmetric compression
   Requires same time for encoding and decoding
   Used for dialog mode applications (teleconference)
 Asymmetric compression
   Performed once when enough time is available
   Two Pass Encoding
   Used for retrieval mode applications (e.g., an
   interactive CD-ROM)


Broad Classification of
Compression Techniques
 Entropy Coding
    lossless encoding
    used regardless of media’s specific characteristics
    data taken as a simple digital sequence
    decompression process regenerates data completely
    e.g. Run-length coding, Huffman coding, LZW, Arithmetic coding
 Source Coding
    lossy encoding
    May take into account the semantics of the data
    degree of compression depends on data content
    E.g. DPCM, ADPCM
 Hybrid Coding (used by most multimedia systems)
    combine entropy with source encoding
    E.g. JPEG, H.263, MPEG-1, MPEG-2, MPEG-4
Compression Techniques
 Statistical techniques (Entropy)
 Predictive techniques (Source)
 Transform techniques (Hybrid)




Statistical Techniques
 Entropy (H) refers to how much variability is in data
    Low/high entropy means low/high variability
    Zero-order entropy model:   $H = \log_2 M$
    First-order entropy model:  $H = -\sum_{i=1}^{m} P(i) \log_2 P(i)$




 Huffman encoding, for example
    Use statistical profile to determine encoding scheme
    Use fewer/more bits to encode more/less frequent data
    Must transmit a codebook


   Huffman Coding
Source Data: ABABCCDBBABBAAABBAACCDDEEAAC...

Huffman Table
  Symbol   Count   Code
  A        20      0
  B        8       100
  C        6       101
  D        6       110
  E        4       111

Total size before encoding: 44 Bytes / 352 Bits
Total size after encoding: 92 Bits
Compression Ratio: 3.8
≅ 2 Bits/Pixel (92 bits for 44 symbols)

Compressed Data (encoder output): 010001001011011101001001100100000100100…
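
The code lengths in the table can be reproduced with a standard Huffman construction. This is a minimal sketch using Python's `heapq`; it derives only the code lengths, since the exact bit patterns depend on tie-breaking, and `huffman_code_lengths` is a name chosen for this example.

```python
import heapq

def huffman_code_lengths(counts):
    """Return {symbol: code length in bits} for a Huffman code over `counts`."""
    # Heap entries are (subtree count, unique tie-breaker, {symbol: depth}).
    heap = [(count, i, {sym: 0}) for i, (sym, count) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees; every symbol in them
        # moves one level deeper in the tree.
        c1, _, d1 = heapq.heappop(heap)
        c2, _, d2 = heapq.heappop(heap)
        merged = {sym: depth + 1 for sym, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (c1 + c2, next_id, merged))
        next_id += 1
    return heap[0][2]

counts = {"A": 20, "B": 8, "C": 6, "D": 6, "E": 4}
lengths = huffman_code_lengths(counts)
total_bits = sum(counts[sym] * lengths[sym] for sym in counts)
print(lengths, total_bits, f"ratio {44 * 8 / total_bits:.1f}")
```

With these counts A gets a 1-bit code and the other four symbols get 3-bit codes, which gives the 92 bits quoted on the slide.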
Predictive Techniques
 DPCM
   Compare adjacent pixels and only transmit the difference
   between them
   E.g., use 8 bits per pixel and use 4 bits for each difference


 ADPCM
   Similar to DPCM, but use a variable number of bits to
   transmit differences
   E.g., use 8 bits per pixel and use 1-5 bits for each difference



Transform Techniques
 Convert data to an alternate form that better
 supports specific operations
    Typically operates on blocks of data
    Larger blocks give better results, but require more
    computational overhead
 Discrete Cosine Transform
    Converts an image from spatial to frequency domain
    Operates on 8 x 8 blocks (64 pixels) of data
    DC coefficient represents zero spatial frequency, which is
    the average value for all the pixels in the 8 x 8 block
    AC coefficients represent amplitudes of progressively higher
    horizontal and vertical spatial frequency components

Run Length Encoding (RLE)
 Form of entropy coding (lossless)
   Content dependent coding scheme


 Series of repeated values replaced by a single
 value and a count
 Example:
   The sequence abbbbbbbccddddeeddd
   would be replaced by 1a7b2c4d2e3d
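
The count-then-value form of this example takes only a few lines to sketch. The function name `rle_encode` is chosen for illustration; it relies on `itertools.groupby` to find the runs.

```python
from itertools import groupby

def rle_encode(text):
    """Replace each run of a repeated character by '<count><character>'."""
    return "".join(f"{len(list(run))}{value}" for value, run in groupby(text))

print(rle_encode("abbbbbbbccddddeeddd"))  # prints 1a7b2c4d2e3d
```

Note that the single `a` grows from one character to two, which is why practical schemes treat non-repeating values separately.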

Example Implementation
 Use 8 bits (1 byte) data elements
    Unsigned [0, 255]; signed [-128, 127]
 Encode repeated values using two bytes
    First byte provides count (N) between [-1, -127]
    The repetition count is –N + 1
    Second byte contains the data value to repeat
 Repeated patterns cannot be longer than 128
    Must be broken into multiple runs
 A non-repeating data value will have a positive first
 byte, which is the data value
Exercise
 Assume same sequence from before
     abbbbbbbccddddeeddd
 Given previous implementation, what would
 be the transmitted code

    a -6b -1c -3d -1e -2d
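
The answer can be checked with a sketch of the two-byte scheme described earlier: a run of k equal values (2 <= k <= 128) is sent as a count byte N = -(k - 1) followed by the value, and a single value is sent as is. The function name is illustrative, and the sketch returns a Python list rather than packed bytes. It also assumes data values are positive, so that they cannot be confused with count bytes.

```python
from itertools import groupby

def rle_encode_signed(values, max_run=128):
    """Encode runs as [N, value] with repetition count -N + 1; singles as [value]."""
    out = []
    for value, run in groupby(values):
        remaining = len(list(run))
        while remaining > 0:
            n = min(remaining, max_run)   # longer runs are split into several
            if n == 1:
                out.append(value)         # non-repeating value: sent literally
            else:
                out.extend([-(n - 1), value])
            remaining -= n
    return out

print(rle_encode_signed("abbbbbbbccddddeeddd"))
```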




Differential Encoding (Source)
 Consider a sequence of values S1, S2, S3, etc.
 that differ in value, but not dramatically
   Encode differences from a specific value
   E.g., S1, S2-S1, S3-S2, etc.
 E.g. still image
   Calculate difference between nearby pixels
   Areas of rapid color change characterized by large
   values, other areas by small values
   After differential encoding, apply RLE
Differential Encoding Example


             0    0     0     0      0

             0    255   250   253    251


             0    255   251   254    255


             0    0     0     0      0


  DPCM: 0, 0, 0, 0, 0, 0, 255, -5, 3, -2, 0, 255, -4, 3, 1, …
  RLE: 6(0), 255, -5, 3, -2, 0, 255, -4, 3, 1, …
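
The two lines above can be reproduced with the sketch below. It assumes, as the slide's output suggests, that each row starts afresh: the first pixel of a row is sent unchanged and every later pixel as the difference from its left neighbour. Function names are illustrative, and runs are shown as `(count, value)` tuples.

```python
from itertools import groupby

def dpcm_rows(image):
    """Row-wise DPCM: first pixel of each row as is, then left-neighbour differences."""
    out = []
    for row in image:
        out.append(row[0])
        out.extend(cur - prev for prev, cur in zip(row, row[1:]))
    return out

def rle_runs(values):
    """Collapse runs longer than one into (count, value); keep single values as is."""
    out = []
    for value, run in groupby(values):
        n = len(list(run))
        out.append((n, value) if n > 1 else value)
    return out

image = [
    [0, 0, 0, 0, 0],
    [0, 255, 250, 253, 251],
    [0, 255, 251, 254, 255],
    [0, 0, 0, 0, 0],
]
differences = dpcm_rows(image)
print(differences)
print(rle_runs(differences))
```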



JPEG Compression Steps
 Prepare the image for compression
   Transform color space
   Down-sample components
   Interleave the color planes
   Partition into smaller blocks
 Apply the Discrete Cosine Transform
 Quantize the DC and AC coefficients
 Apply entropy encoding to quantized numbers

   JPEG Algorithm
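[Figure: the source image's R, G and B planes are divided into 8x8 blocks and fed to the DCT-based encoder: FDCT → Quantizer → Entropy Encoder → compressed image data. The quantizer and the entropy encoder are each driven by a table.]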
Color Transformation
 May want to transform RGB to YUV or YCbCr
   This is an optional step, so why do it?
 Human visual system detects changes better in
 luminance than in chrominance components
   YUV or YCbCr better supports this than RGB
 Facilitates better compression by enabling
   Down-sampling of the image in chrominance
   Elimination of more of the high-frequency changes
   in the chrominance dimensions
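
For reference, one common form of the RGB to YCbCr conversion is sketched below. The coefficients are the full-range ones used by JFIF files; the slides do not say which variant they have in mind, so treat the choice as an assumption.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (JFIF-style coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b                # luminance
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b   # blue-difference chrominance
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b   # red-difference chrominance
    return y, cb, cr

print(rgb_to_ycbcr(100, 100, 100))  # a grey pixel: chrominance sits at the midpoint
```

All the brightness information ends up in Y, so Cb and Cr can be stored more coarsely.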
Down-Sampling Chrominance
 Optional step, but increases compression ratio
   Only down-sample the chrominance components,
   never the luminance component
 Average groups of pixels in the horizontal
 and/or vertical resolution of the image
   Referred to as H2V1, H2V2, 4:2:2, 4:2:0, etc.
   E.g., H2V2 means average every 2 pixels in
   horizontal and vertical dimensions
   H4V4 means use the SAME resolution in the
   horizontal and vertical dimensions

Color sub-sampling



[Figure: sub-sampling from 4:4:4 to 4:2:2]
Color sub-sampling



[Figure: sub-sampling from 4:4:4 to 4:1:1]
Color sub-sampling



[Figure: sub-sampling from 4:4:4 to 4:2:0]
How does colour sub-sampling aid compression?
100 x 100 pixels frame requires:
  100 x 100 Y Pixels
  100 x 100 Cr Pixels
  100 x 100 Cb Pixels
Using 4:2:0 colour sub-sampling:
  100 x 100 Y Pixels
  50 x 50 Cr Pixels
  50 x 50 Cb Pixels
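
Counting the samples makes the saving explicit. A minimal sketch, with an illustrative function name; the two factors say by how much the chrominance planes are reduced horizontally and vertically.

```python
def samples_per_frame(width, height, h_factor=1, v_factor=1):
    """Total Y + Cb + Cr samples when chrominance is sub-sampled by the given factors."""
    luma = width * height
    chroma_w = -(-width // h_factor)    # ceiling division
    chroma_h = -(-height // v_factor)
    return luma + 2 * chroma_w * chroma_h

full = samples_per_frame(100, 100)            # 4:4:4, no sub-sampling
sub = samples_per_frame(100, 100, 2, 2)       # 4:2:0, halved in both directions
print(full, sub, sub / full)
```

With 4:2:0 the frame needs 15,000 samples instead of 30,000, so half the data is gone before any transform or entropy coding is applied.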

Blocks and Pixel Shift
 Divide each component into 8x8 blocks
 Send to the FDCT





 Shift each pixel from unsigned range [0, 255] into
 signed range [-128, 127] by subtracting 128
    DCT requires range be centered around 0

Forward DCT
 Convert from spatial to frequency domain
   Convert intensity function into weighted sum of
   elementary frequency components
   Identify pieces of spectral information that can be
   thrown away without noticeable loss of quality
 Intensity values in each color plane often
 change slowly (see next example)
   Contributions from higher frequency bands in the
   frequency domain can be ignored
   Better compression without noticeable loss of quality

Equations for 2D DCT
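  Forward DCT:

  $$F(u,v) = \frac{2}{\sqrt{nm}}\, C(u)\, C(v) \sum_{y=0}^{m-1} \sum_{x=0}^{n-1} I(x,y) \cos\!\left(\frac{(2x+1)u\pi}{2n}\right) \cos\!\left(\frac{(2y+1)v\pi}{2m}\right)$$

  Inverse DCT:

  $$I(x,y) = \frac{2}{\sqrt{nm}} \sum_{v=0}^{m-1} \sum_{u=0}^{n-1} F(u,v)\, C(u)\, C(v) \cos\!\left(\frac{(2x+1)u\pi}{2n}\right) \cos\!\left(\frac{(2y+1)v\pi}{2m}\right)$$

  where $C(k) = 1/\sqrt{2}$ for $k = 0$ and $C(k) = 1$ otherwise.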
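
A direct, unoptimised transcription of the transform pair is sketched below. It assumes the orthonormal scaling 2/sqrt(nm) with C(0) = 1/sqrt(2) and C(k) = 1 otherwise, which for 8 x 8 blocks matches the 1/4 factor used by JPEG. Blocks are stored as lists of rows, so `block[y][x]` holds I(x, y) and `coeffs[v][u]` holds F(u, v); the function names are illustrative.

```python
import math

def _c(k):
    """Normalisation factor C(k)."""
    return 1 / math.sqrt(2) if k == 0 else 1.0

def dct2(block):
    """Forward 2D DCT of an m-row by n-column block."""
    m, n = len(block), len(block[0])
    scale = 2 / math.sqrt(n * m)
    return [[scale * _c(u) * _c(v) * sum(
                 block[y][x]
                 * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                 * math.cos((2 * y + 1) * v * math.pi / (2 * m))
                 for y in range(m) for x in range(n))
             for u in range(n)]
            for v in range(m)]

def idct2(coeffs):
    """Inverse 2D DCT: rebuilds the block from its coefficients."""
    m, n = len(coeffs), len(coeffs[0])
    scale = 2 / math.sqrt(n * m)
    return [[scale * sum(
                 _c(u) * _c(v) * coeffs[v][u]
                 * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                 * math.cos((2 * y + 1) * v * math.pi / (2 * m))
                 for v in range(m) for u in range(n))
             for x in range(n)]
            for y in range(m)]

flat = [[10] * 8 for _ in range(8)]
print(round(dct2(flat)[0][0], 6))  # a flat block has only a DC coefficient
```

Real codecs use fast factorisations rather than this quadruple loop, but the result is the same up to rounding.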



Visualization of Basis Functions

[Figure: the 2D DCT basis functions, with frequency increasing along both the horizontal and the vertical axis]
Quantization
 Divide each coefficient in a block by an integer in the
 range [1, 255]
    Comes in the form of a table, same size as a block
    Divide the block of coefficients element by element by the table,
    and then round the result to the nearest integer
 In the decoding process, multiply the quantized
 coefficients by the corresponding table entries
    Get back a number close to, but not the same as, the original
    Error is never more than half of the quantization number
 Larger numbers in quantization table cause more loss
    This is the main source of loss in JPEG
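
A small sketch of the quantize and dequantize round trip: each coefficient is divided by its table entry and rounded on encoding, then multiplied by the same entry on decoding. The 2 x 2 coefficient values and table entries below are made up for illustration; a real block is 8 x 8.

```python
def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to the nearest integer."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, table)]

def dequantize(levels, table):
    """Multiply each quantized level by its table entry."""
    return [[level * q for level, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, table)]

coeffs = [[-415.4, -30.2], [4.5, -21.9]]   # hypothetical DCT coefficients
table = [[16, 11], [12, 12]]               # hypothetical quantization steps
levels = quantize(coeffs, table)
restored = dequantize(levels, table)
print(levels)
print(restored)
```

The small coefficient 4.5 is quantized to 0 and lost entirely, which is how larger table entries remove high-frequency detail.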

De facto Quantization Table

  (The eye becomes less sensitive moving right and moving down the table)

     16    11    12    14    12    10    16    14
     13    14    18    17    16    19    24    40
     26    24    22    22    24    49    35    37
     29    40    58    51    61    60    57    51
     56    55    64    72    92    78    64    68
     87    69    55    56    80   109    81    87
     95    98   103   104   103    62    77   113
    121   112   100   120    92   101   103    99

  Note: the values above are the JPEG standard's example luminance table read out in zig-zag scan order.
Entropy Encoding
 Compress the sequence of quantized DC and
 AC coefficients from the quantization step
   Further increase compression, but without loss


 Separate DC from AC components
   DC components change slowly, thus will be
   encoded using difference encoding



DC Encoding
 DC represents average intensity of a block
   Because image intensity tends to change slowly,
   DC values tend to change slowly
   Encode using difference encoding scheme
   Use 3x3 pattern of blocks
 Because the difference tends to be near zero, can
 use fewer bits in the encoding
   Categorize difference into difference classes
   Send the index of the difference class, followed by
   the bits representing the difference
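
The difference-class idea can be sketched as follows. It assumes the class index is the number of bits needed for the magnitude of the difference, which is how baseline JPEG defines its size categories, and that the predictor starts at 0. The DC values are made up for illustration.

```python
def dc_difference_classes(dc_values):
    """Return (class index, difference) for each block's DC value."""
    out = []
    previous = 0                      # ASSUMPTION: predictor starts at 0
    for dc in dc_values:
        diff = dc - previous
        # Class = number of bits needed to represent |diff| (0 for diff == 0).
        out.append((abs(diff).bit_length(), diff))
        previous = dc
    return out

print(dc_difference_classes([52, 55, 54, 54, 60]))
```

After the first block the differences are small, so the class indices stay low and few extra bits are needed.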
AC Encoding
 Use a zig-zag ordering of coefficients, why?
   Orders frequency components from low to high
   Should produce maximal series of 0s at the end
   Lends itself well to RLE
 Apply RLE to ordering
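
The zig-zag scan and the run-length step can be sketched together. The output format here, `(number of preceding zeros, value)` pairs followed by an end-of-block marker, is a simplification of what JPEG does, which also limits the run length. The function names and the 4 x 4 block are illustrative only.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs of an n x n block in zig-zag order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Walk the anti-diagonals in turn; odd diagonals run downwards
        # (increasing row), even diagonals run upwards (increasing column).
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )

def ac_run_lengths(block):
    """Zig-zag scan the AC coefficients into (zero run, value) pairs."""
    scan = [block[r][c] for r, c in zigzag_order(len(block))][1:]  # skip the DC term
    pairs, zeros = [], 0
    for value in scan:
        if value == 0:
            zeros += 1
        else:
            pairs.append((zeros, value))
            zeros = 0
    if zeros:
        pairs.append("EOB")           # only zeros remain: end of block
    return pairs

block = [
    [12, 5, 1, 0],
    [-3, 0, 0, 0],
    [2, 0, 0, 0],
    [0, 0, 0, 0],
]
print(ac_run_lengths(block))
```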




Huffman Encoding
 String together the RLE of the AC coefficients
 along with the DC difference indices and values
 Apply Huffman encoding to resulting sequence
 Attach appropriate headers
 Finally have the JPEG image!



