
Introduction to Image and Video Compression

Need for Image & Video Compression
• Uncompressed video, 640 x 480 resolution, 8-bit (1 byte) colour, 24 fps:
  - 307.2 Kbytes per image (frame)
  - 7.37 Mbytes per second
  - 442 Mbytes per minute
  - 26.5 Gbytes per hour
• Uncompressed video, 640 x 480 resolution, 24-bit (3 bytes) colour, 30 fps:
  - 921.6 Kbytes per image (frame)
  - 27.6 Mbytes per second
  - 1.66 Gbytes per minute
  - 99.5 Gbytes per hour
• A 100 Gigabyte disk (100 x 10^9 bytes) can store only about 1 to 4 hours of high-quality video at these rates
• MPEG-1 compresses video to 187 Kbytes/second
• Raw video contains an immense amount of data
• Communication and storage capabilities are limited and expensive
• Example, HDTV video signal:
  - 720 x 1280 pixels/frame, progressive scanning at 60 frames/s (roughly 1.3 Gb/s uncompressed at 24 bits/pixel)
  - 20 Mb/s HDTV channel bandwidth
  - → Requires compression by a factor of about 70 (equivalent to 0.35 bits/pixel)

Compression
• Based on information theory, as described by Claude Shannon in 1948
• Lossless and lossy methods
• Efficient binary representation of information
• Remove redundancy

Compression Theory (lossy/lossless)
[Figure: original data (5, 4, 2) passed through a lossless codec and through lossy codecs. Lossless decoding returns 5, 4, 2 exactly; lossy decoding returns approximations with mild error (4.9, 3.9, 1.9) or high error (3.1, 2.1, 0).]

Compression Theory (picture example)
• Original picture, 800 x 600: 1.37 MB
• Decoded picture (low losses): compressed to 85 KB
• Decoded picture (high losses): compressed to 16 KB

Compression Theory (Entropy)
[Figure: a data stream shown as entropy plus redundancy; compression removes the redundancy.]
• Optimal lossless compression: the full entropy is preserved
• Lossy compression: only part of the entropy is preserved

Image Compression and Formats
• RLE, Huffman, LZW
• GIF, JPEG, Fractals
• TIFF, PICT, BMP, etc.

Video Compression and Formats
• H.261/H.263
• Cinepak, Sorenson, Indeo, Real Video
• MPEG-1, MPEG-2, MPEG-4, etc.
• QuickTime, AVI

Special Coding Requirements
• Viewing real-time source information
  - End-to-end delay (EED) should not exceed 150-200 ms
  - Face-to-face applications need an EED of 50 ms (including compression and decompression)
• Interactive viewing (random access)
  - Random access to single images and audio frames; access time should be less than 0.5 s
  - Decompression of images, video and audio should not depend on other data units, so that random access is possible

Compression Steps
Uncompressed Picture → Picture Preparation → Picture Processing → Quantization → Entropy Coding → Compressed Picture
(The diagram also shows an adaptive feedback loop.)

Picture Preparation
• Analog-to-digital conversion
• Generate an appropriate digital representation
  - Divide the picture into blocks (usually 8 x 8 pixels)
  - Fix the number of bits per pixel

Picture Processing
• Transform to the frequency domain
  - E.g., use the Discrete Cosine Transform (DCT)
• Compute motion vectors for each block

Quantization and Coding
• Quantization: map real numbers to integers
  - E.g., the DC and AC coefficients from the DCT are real numbers, but we only want to store integers
• Entropy coding
  - Compress a sequential bit stream without loss

Types of Compression
• Symmetric compression
  - Requires the same time for encoding and decoding
  - Used for dialog-mode applications (teleconferencing)
• Asymmetric compression
  - Encoding is performed once, when enough time is available
  - Two-pass encoding
  - Used for retrieval-mode applications (e.g., an interactive CD-ROM)

Broad Classification of Compression Techniques
• Entropy coding
  - Lossless encoding
  - Used regardless of the media's specific characteristics
  - Data is taken as a simple digital sequence
  - The decompression process regenerates the data completely
  - E.g., run-length coding, Huffman coding, LZW, arithmetic coding
• Source coding
  - Lossy encoding
  - May take into account the semantics of the data
  - Degree of compression depends on the data content
  - E.g., DPCM, ADPCM
• Hybrid coding (used by most multimedia systems)
  - Combines entropy coding with source coding
  - E.g., JPEG, H.263, MPEG-1, MPEG-2, MPEG-4

Compression Techniques
• Statistical techniques (entropy)
• Predictive techniques (source)
• Transform techniques (hybrid)

Statistical Techniques
• Entropy (H) refers to how much variability is in the data
  - Low/high entropy means low/high variability
• Zero-order entropy model:
  $$H = \log_2 M$$
• First-order entropy model:
  $$H = -\sum_{i=1}^{M} P(i) \log_2 P(i)$$
  where M is the number of distinct symbols and P(i) is the probability of symbol i
• Huffman encoding, for example
  - Use a statistical profile to determine the encoding scheme
  - Use fewer bits to encode more frequent data and more bits to encode less frequent data
  - Must transmit a codebook

Huffman Coding
Source data: ABABCCDBBABBAAABBAACCDDEEAAC...

Huffman table:
  Symbol   Count   Code
  A        20      0
  B        8       100
  C        6       101
  D        6       110
  E        4       111

• Total size of the source data: 44 bytes (352 bits)
• Total size after encoding: 92 bits (20 x 1 + 24 x 3)
• Compression ratio: 3.8 (≅ 2 bits/pixel)
Encoder output (compressed data): 010001001011011101001001100100000100100…

Predictive Techniques
• DPCM
  - Compare adjacent pixels and transmit only the difference between them
  - E.g., use 8 bits per pixel and 4 bits for each difference
• ADPCM
  - Similar to DPCM, but uses a variable number of bits to transmit the differences
  - E.g., use 8 bits per pixel and 1-5 bits for each difference

Transform Techniques
• Convert data to an alternate form that better supports specific operations
• Typically operate on blocks of data
  - Larger blocks give better results, but require more computational overhead
• Discrete Cosine Transform
  - Converts an image from the spatial domain to the frequency domain
  - Operates on 8 x 8 blocks (64 pixels) of data
  - The DC coefficient represents zero spatial frequency, which is the average value of all the pixels in the 8 x 8 block
  - The AC coefficients represent amplitudes of progressively higher horizontal and vertical spatial frequency components

Run Length Encoding (RLE)
• Form of entropy coding (lossless)
• Content-dependent coding scheme
• A series of repeated values is replaced by a single value and a count
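The value-and-count idea can be sketched in a few lines of Python. This is a minimal illustration of the general technique, not the byte-oriented scheme of any particular file format; the function names `rle_encode` and `rle_decode` are assumptions made for this example, and `itertools.groupby` from the standard library does the run detection.

```python
from itertools import groupby


def rle_encode(data):
    """Collapse each run of equal values into a (count, value) pair."""
    return [(len(list(run)), value) for value, run in groupby(data)]


def rle_decode(pairs):
    """Expand (count, value) pairs back into the original string."""
    return "".join(value * count for count, value in pairs)


if __name__ == "__main__":
    sample = "aaabccdddd"                 # made-up input for illustration
    pairs = rle_encode(sample)
    print(pairs)                          # the (count, value) pairs
    print("".join(f"{n}{v}" for n, v in pairs))  # compact count-then-value text form
    assert rle_decode(pairs) == sample    # lossless round trip
```

The encoder returns pairs rather than a single string because the compact text form becomes ambiguous as soon as the data itself contains digits.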
Example: the sequence abbbbbbbccddddeeddd would be replaced by 1a7b2c4d2e3d.

Example Implementation
• Use 8-bit (1 byte) data elements
  - Unsigned: [0, 255]; signed: [-127, 127]
• Encode repeated values using two bytes
  - The first byte provides a count N in the range [-127, -1]
  - The repetition count is -N + 1
  - The second byte contains the data value to repeat
• Repeated patterns cannot be longer than 128
  - Longer patterns must be broken into multiple runs
• A non-repeating data value has a positive first byte, which is the data value itself

Exercise
• Assume the same sequence as before: abbbbbbbccddddeeddd
• Given the previous implementation, what would be the transmitted code?
• Answer: a -6b -1c -3d -1e -2d

Differential Encoding (Source)
• Consider a sequence of values S1, S2, S3, etc. that differ in value, but not dramatically
• Encode differences from a specific value
  - E.g., S1, S2-S1, S3-S2, etc.
• E.g., a still image
  - Calculate the difference between nearby pixels
  - Areas of rapid color change are characterized by large values, other areas by small values
• After differential encoding, apply RLE

Differential Encoding Example
Pixel values (4 rows of 5; the differences below are taken along each row, starting from the row's first pixel):
    0    0    0    0    0
    0  255  250  253  251
    0  255  251  254  255
    0    0    0    0    0
DPCM: 0, 0, 0, 0, 0, 0, 255, -5, 3, -2, 0, 255, -4, 3, 1, …
RLE: 6(0), 255, -5, 3, -2, 0, 255, -4, 3, 1, …

Compression Steps
Uncompressed Picture → Picture Preparation → Picture Processing → Quantization → Entropy Coding → Compressed Picture
(The diagram also shows an adaptive feedback loop.)

JPEG Compression Steps
• Prepare the image for compression
• Transform the color space
• Down-sample the components
• Interleave the color planes
• Partition into smaller blocks
• Apply the Discrete Cosine Transform
• Quantize the DC and AC coefficients
• Apply entropy encoding to the quantized numbers

JPEG Algorithm
[Figure: DCT-based encoding. The source image (R, G, B planes) is split into 8 x 8 blocks, which pass through the FDCT, the quantizer (with its table) and the entropy encoder (with its table) to produce the compressed image data.]

Color Transformation
• May want to transform RGB to YUV or YCbCr
• This is an optional step, so why do it?
  - The human visual system detects changes in luminance better than changes in chrominance
  - YUV or YCbCr supports this better than RGB
• Facilitates better compression by enabling
  - Down-sampling of the image in chrominance
  - Elimination of more of the high-frequency changes in the chrominance dimensions

Down-Sampling Chrominance
• Optional step, but increases the compression ratio
• Only down-sample the chrominance components, never the luminance component
• Average groups of pixels in the horizontal and/or vertical direction of the image
  - Referred to as H2V1, H2V2, 4:2:2, 4:2:0, etc.
  - E.g., H2V2 means average every 2 pixels in both the horizontal and vertical dimensions
  - H1V1 means keep the same resolution in the horizontal and vertical dimensions

Color sub-sampling
[Figures: sub-sampling patterns, comparing 4:4:4 with 4:2:2, with 4:1:1 and with 4:2:0.]

How does colour sub-sampling aid compression?
• A 100 x 100 pixel frame requires:
  - 100 x 100 Y pixels
  - 100 x 100 Cr pixels
  - 100 x 100 Cb pixels
• Using 4:2:0 colour sub-sampling:
  - 100 x 100 Y pixels
  - 50 x 50 Cr pixels
  - 50 x 50 Cb pixels
• That is 15,000 samples instead of 30,000, half the data before any further compression

Blocks and Pixel Shift
• Divide each component of the source image into 8 x 8 blocks
• Send each block to the FDCT
• Shift each pixel from the unsigned range [0, 255] into the signed range [-128, 127] by subtracting 128
  - The DCT requires the range to be centered around 0

Forward DCT
• Convert from the spatial domain to the frequency domain
  - Convert the intensity function into a weighted sum of elementary frequency components
• Identify pieces of spectral information that can be thrown away without noticeable loss of quality
  - Intensity values in each color plane often change slowly
  - Contributions from the higher frequency bands can therefore often be ignored
  - Better compression without noticeable loss of quality

Equations for 2D DCT
Forward DCT:
$$F(u, v) = \frac{2}{\sqrt{nm}}\, C(u)\, C(v) \sum_{y=0}^{m-1} \sum_{x=0}^{n-1} I(x, y) \cos\left(\frac{(2x+1)u\pi}{2n}\right) \cos\left(\frac{(2y+1)v\pi}{2m}\right)$$
Inverse DCT:
$$I(x, y) = \frac{2}{\sqrt{nm}} \sum_{v=0}^{m-1} \sum_{u=0}^{n-1} F(u, v)\, C(u)\, C(v) \cos\left(\frac{(2x+1)u\pi}{2n}\right) \cos\left(\frac{(2y+1)v\pi}{2m}\right)$$
where, in the usual convention, C(k) = 1/√2 for k = 0 and C(k) = 1 otherwise.

Visualization of Basis Functions
[Figure: the DCT basis functions, with frequency increasing along both the horizontal and the vertical axis.]

Quantization
• Divide each coefficient in a block by an integer in the range [1, 255]
  - The divisors come in the form of a table, the same size as a block
  - Divide the block of coefficients element by element by the table, then round each result to the nearest integer
• In the decoding process, multiply the quantized coefficients by the corresponding table entries
  - Get back a number close to, but not necessarily the same as, the original
  - The error is at most half of the quantization number
• Larger numbers in the quantization table cause more loss
• This is the main source of loss in JPEG

De facto Quantization Table
The eye becomes less sensitive toward the right (higher horizontal frequency) and toward the bottom (higher vertical frequency):
   16   11   10   16   24   40   51   61
   12   12   14   19   26   58   60   55
   14   13   16   24   40   57   69   56
   14   17   22   29   51   87   80   62
   18   22   37   56   68  109  103   77
   24   35   55   64   81  104  113   92
   49   64   78   87  103  121  120  101
   72   92   95   98  112  100  103   99

Entropy Encoding
• Compress the sequence of quantized DC and AC coefficients from the quantization step
  - Further increases compression, but without loss
• Separate the DC from the AC components
  - DC components change slowly, so they are encoded using difference encoding

DC Encoding
• DC represents the average intensity of a block
• Because image intensity tends to change slowly, DC values tend to change slowly from block to block
  - Encode using a difference encoding scheme
  - Use a 3 x 3 pattern of blocks
• Because the difference tends to be near zero, fewer bits can be used in the encoding
  - Categorize the difference into difference classes
  - Send the index of the difference class, followed by the bits representing the difference

AC Encoding
• Use a zig-zag ordering of coefficients. Why?
  - It orders the frequency components from low to high
  - It should produce a maximal run of 0s at the end
  - It lends itself well to RLE
• Apply RLE to the zig-zag ordering

Huffman Encoding
• String together the RLE of the AC coefficients along with the DC difference indices and values
• Apply Huffman encoding to the resulting sequence
• Attach appropriate headers
• Finally, have the JPEG image!
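The zig-zag ordering used for the AC coefficients can be sketched as follows. This is an illustrative Python sketch rather than code taken from the JPEG standard: `zigzag_indices`, `zigzag_scan` and `trailing_zero_run` are hypothetical helper names, and the sample block is made up. It walks the anti-diagonals of the block in alternating directions, so low-frequency coefficients come first and the zeros left by quantization collect at the end, where run-length coding can collapse them.

```python
def zigzag_indices(n=8):
    """Return the (row, col) pairs of an n x n block in zig-zag scan order."""
    order = []
    for s in range(2 * n - 1):          # s = row + col identifies one anti-diagonal
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        if s % 2 == 0:
            diagonal.reverse()          # even diagonals run from bottom-left to top-right
        order.extend(diagonal)
    return order


def zigzag_scan(block):
    """Flatten a square block (a list of rows) into zig-zag order."""
    return [block[r][c] for r, c in zigzag_indices(len(block))]


def trailing_zero_run(values):
    """Count the zeros at the end of a sequence (the run that RLE can collapse)."""
    count = 0
    for v in reversed(values):
        if v != 0:
            break
        count += 1
    return count


if __name__ == "__main__":
    # Made-up quantized block: only a few low-frequency coefficients are non-zero.
    block = [[0] * 8 for _ in range(8)]
    block[0][0], block[0][1], block[1][0], block[1][1] = 12, -3, 5, 1
    scanned = zigzag_scan(block)
    print(scanned[:6])                  # the first few coefficients in scan order
    print(trailing_zero_run(scanned))   # length of the final run of zeros
```

Building the order from anti-diagonals keeps the sketch independent of the block size; a production encoder would more likely use a precomputed 64-entry lookup table.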

Posted: 4/21/2012
