IMAGE COMPRESSION WITH A GEOMETRICAL ENTROPY CODER

Onur G. Guleryuz
DoCoMo USA Laboratories, Inc.
181 Metro Drive, Suite 300, San Jose, CA 95110
guleryuz@docomolabs-usa.com

Arthur L. da Cunha
University of Illinois at Urbana-Champaign
Dept. of Electrical and Computer Engineering, Urbana, IL 61801
arthur.cunha@ifp.uiuc.edu

ABSTRACT

In this paper we propose a geometrical entropy coder to compress images with geometry. Our proposed scheme works on quantized coefficients of the discrete wavelet transform (DWT). The geometry inherent in those coefficients is exploited by means of directional, predictive coding. We accomplish directional prediction with the aid of the coefficients of the undecimated wavelet transform (UDWT), which are estimated from the available DWT ones both at the encoder and decoder. Following directional prediction, a simple entropy coder takes care of the remaining redundancy. The resulting codec attains competitive performance.

Index Terms: Image coding, Wavelet transforms, Prediction methods, Quadtrees, Entropy codes.

1. INTRODUCTION

Digital images contain large amounts of information that needs to be compressed for further storage and transmission. Lossy image compression is a well-studied topic with many compression algorithms proposed in the literature. The conventional model for the lossy compression of images is that of transform coding. In such a model, the image is transformed through an invertible linear transform, followed by quantization and entropy coding. Decoding is accomplished by reversing the process. The goal of the transform stage is to decorrelate the pixels; the resulting coefficients are typically quantized with a scalar quantizer. The quantization indices are then entropy encoded and stored. The entropy encoder of choice in most compression algorithms is an arithmetic coder.

Recently there have been several efforts toward improving the transform used in a compression algorithm. The main motivation is that the widely used 2-D discrete wavelet transform (DWT) correctly exploits regularity on smooth regions of the image, but it otherwise fails to exploit the regularity existing on straight edges and contours. The reason is that 2-D wavelets are obtained from the tensor product of 1-D wavelets. The transform is therefore implemented as row → column processing. The result is a transform that is “geometry blind” in the sense that it ignores geometric structure. Being able to exploit edge regularity can considerably improve compression performance [1].

Notable examples of image compression algorithms that exploit edge regularity include [2] with bandelets, and [3, 4] with wedgelets. The common feature of those schemes is that they employ clever methods to decorrelate straight edges. Contours and other kinds of geometry are exploited by means of a quadtree segmentation. Such segmentation is typically done using a rate-distortion criterion. Those algorithms often perform at the state-of-the-art level and compare favorably to benchmark algorithms such as SPIHT [5] and the more recent JPEG2000 standard.

In this paper we propose a simple method to exploit edge regularity in a compression context. Our proposal can be viewed as a new entropy encoder that exploits edge regularity. Instead of using wedgelets or bandelets, we perform linear prediction in the quantized wavelet transform domain. Prediction is made effective by using coefficients of the undecimated wavelet transform (UDWT). These are estimated on the fly, using data from the quantized DWT. Results indicate that our proposed codec can be a valuable alternative to more complex schemes such as wedgelets and bandelets.

The paper is organized as follows. In section 2 we discuss the general coder, leaving the description of its main building blocks to subsequent sections. In section 3 we explain how the UDWT coefficients are estimated, while in section 4 we describe the directional predictions. Results are presented in section 5 and concluding remarks in section 6.

2. ENCODER OVERVIEW

[Fig. 1. Building blocks of the proposed codec. Block labels: Image, Wavelet Transform, Dead-zone Quantizer, Geometric DPCM, Entropy Coder, Bits, Side Information.]

The building blocks of the proposed encoder are shown in Fig. 1. It follows the transform coding paradigm. The image pixels are first transformed with a standard DWT using the 9-7 biorthogonal filters with symmetric extension. The coefficients are then scalar quantized. This results in a matrix of integer coefficients that is fed to the directional DPCM. The integer coefficients are obtained from

$$y_i = \operatorname{sign}(y) \left\lfloor \frac{|y|}{\Delta} \right\rfloor \qquad (1)$$

and the quantized values by $y_q = \Delta\,(y_i + \operatorname{sign}(y_i)/2)$. The geometrical entropy encoder consists of two units: a geometrical DPCM unit and a standard arithmetic coder unit. In the ideal scenario, the output of the geometrical DPCM unit will be quantized coefficients without any residual edge regularity. The geometrical DPCM takes the quantized wavelet coefficients in each subband and performs linear prediction along directions following a pre-computed flow field. Prediction is accomplished using causally computed UDWT coefficients. The flow field is a number for each coefficient that tells the encoder and decoder which direction is best for linear prediction. The flow field for each subband is transmitted as side information and used at the decoder to recover the predicted pixels. Notice that the geometrical DPCM is a lossless DPCM in that it does not add quantization error.

The rationale for doing prediction in the UDWT domain rather than in the DWT domain is that the former preserves the geometry much better. In other words, because of the downsamplers in the forward DWT, the resulting geometry in the subbands has missing parts that make prediction ineffective. By contrast, the UDWT preserves the geometry while annihilating the smooth regions. Fig. 2 shows the subbands in the two transforms for the elbow of “cameraman” and the pants of “Barbara”. As illustrated, one can discern the correct directionality much better with the aid of the UDWT. This is especially the case for Barbara, for which the DWT provides a very aliased picture.

[Fig. 2. Prediction in the DWT versus in the UDWT domain (LH subband). The shift-invariant UDWT preserves the geometry better (DWT images are shown upsampled by two using pixel replication for clarity, $\Delta = 10$). (a) Cameraman elbow DWT. (b) Cameraman elbow UDWT. (c) Barbara pants DWT. (d) Barbara pants UDWT.]

3. UDWT COEFFICIENT ESTIMATION

The UDWT coefficient estimation is best explained in one dimension using a single level of wavelet decomposition. While our algorithm uses well-known biorthogonal wavelets, for convenience in notation, assume an orthonormal wavelet decomposition of a signal $x$ with $N$ samples. The corresponding UDWT representation of the signal is obtained by adding the coefficients resulting from the expansion of the shifted signal. Denote the matrices representing the DWT and shifted DWT bases by $\Phi = (\phi_1, \ldots, \phi_N)$ and $\tilde{\Phi} = (\tilde{\phi}_1, \ldots, \tilde{\phi}_N)$ respectively. The two wavelet transforms are denoted by $y = \Phi^T x$ and $w = \tilde{\Phi}^T x$. Observe that to obtain the shifted DWT coefficient of index $j$, we simply compute

$$w_j = \tilde{\phi}_j^T x. \qquad (2)$$

In the proposed encoder, we will process the wavelet coefficients sequentially, so that at point $j$ ($j = 2, \ldots, N$) we have the wavelet coefficients $y_1, \ldots, y_{j-1}$ available at the encoder/decoder. At point $j$, because we have only a portion of the wavelet coefficients available, we can only approximate the shifted wavelet coefficients. We do so as follows.

Denote the partial reconstruction of $x$ at point $j$ by $x_j = \sum_{i=1}^{j-1} y_i \phi_i$. We obtain the shifted wavelet coefficients we will use in the prediction of $y_j$ with the aid of this partial reconstruction. The whole operation can be divided into two steps:

1. Obtain $x_j$,
   $$x_j = x_{j-1} + y_{j-1} \phi_{j-1}.$$

2. Compute an approximation $\hat{w}_i$ of $w_i$ ($i = 1, \ldots, j-1$),
   $$\hat{w}_i = \tilde{\phi}_i^T x_j.$$

The coefficients $\hat{w}_i$ are approximate (“noisy”) because $x_j \neq x$. Observe however that since the wavelet basis functions have compact support, for some constant $C$ and $i < j - C$, $\hat{w}_i = w_i$ and the updates in step (2) are only needed for a limited set of coefficients. When predicting $y_j$ we will use the UDWT sequence

$$[\hat{w}_{j-1},\, y_{j-1},\, \ldots,\, \hat{w}_{j-C},\, y_{j-C},\, w_{j-C-1},\, y_{j-C-1},\, \ldots] \qquad (3)$$

with the first few values of shifted wavelet coefficients approximate. Of course, the shifted coefficient values become progressively more accurate as one looks deeper into the UDWT array.

The coefficient estimation can be generalized to several levels of decomposition and to two dimensions in a straightforward way. The process requires only a few operations per sample as determined by the DWT filter sizes. For instance, if the size of the wavelet filters is $L$, then in 2D one needs $\sim C^2 L^2$ multiplications per coefficient update.

4. GEOMETRICAL DPCM

The geometrical DPCM is performed on the quantized subband coefficients, using causally estimated UDWT values. Fig. 3 illustrates how prediction is done. Suppose we want to predict the coefficient $X_{i,j}$ in a given subband. The directional angles are defined by ($1 \le d \le d_{\max}$):

$$\theta_d = \pi - \delta d, \qquad (4)$$

where $\delta = \pi / d_{\max}$. We thus trace a line along this angle and select the coefficients adjacent to the line. Given the prediction direction $1 \le d \le d_{\max}$ in the flow field, we select the coefficients along this direction, bilinearly interpolating the values that fall in between two coefficients.

[Fig. 3. Directional prediction in the proposed codec for a given flow.]

The number of coefficients selected varies according to the prediction method being used. For instance, when using an autoregressive (AR) model predictor of order 1 or 2, we typically select 10 coefficients to form a series, estimate the correlation values, determine prediction coefficients, but use just the nearest ones in the prediction. We scan coefficients in the subbands in an interleaved raster manner going from coarse to finer resolutions. As more coefficients become available we update the UDWT data accordingly. This operation is repeated at the decoder.

Care must be taken with the shifted DWT coefficients around the coefficient being predicted. In a given subband, the shifted DWT coefficients around the coefficient $X_{i,j}$ are estimated using data up to $X_{i,j}$. As a result, the shifted coefficients in a small neighborhood around $X_{i,j}$ tend to be noisy. A satisfactory solution that we have found is to avoid using the nearest coefficient in the prediction. Rather we use the next one, following the direction in which the coefficients are selected. We then adjust the DPCM predictor accordingly.

[Fig. 4. Flow field computation. The best flow direction is selected following the first order entropy of a given block. Block labels: Quantized Subband Data, Blockwise Tiling, Directional DPCM, Quadtree Pruning, Flow Field Data.]

4.1. Quadtree Segmentation and flow computation

In order to efficiently choose the directions where the prediction should be made, we use a quadtree structure that segments the subbands into square blocks. In each of those blocks, prediction is performed along a certain direction. The method that we use is a bottom-up approach in which we prune back the tree according to whether it is better to code the parent node with a single direction or to segment it into four nodes and use four directions. We thus test all directions and choose the one that results in the least rate cost for the segmentation and for the prediction errors combined (see Fig. 4). Since geometrical DPCM is lossless, distortion is not included in the optimization. Rate cost for the coefficients is determined based on their first order entropy. We choose this metric for its simplicity during the pruning process, but others can be used as well. Ideally, the geometrical prediction should annihilate edges so that straight lines are converted into a few dots. Thus, if the prediction is successful, the resulting subband should have a smaller first order entropy.

The segmentation tree has a prediction direction associated with each leaf (Fig. 5). Each leaf represents a square block from the subband that will be coded with the DPCM along that direction. The segmentation tree, including the directional values, forms the side information and is coded with an arithmetic coder.

[Fig. 5. Quadtree segmentation example in the proposed entropy coder. The segmentation obtained for the LH subband is upsampled and shown on top of the original image. Green lines determine the computed geometric flow direction inside each segmented block. (a) Cameraman. (b) Barbara.]

5. EXPERIMENTS

We apply the proposed codec to several images and compare its performance to that of an implementation of the SPIHT coder of [5]. The SPIHT algorithm offers state-of-the-art image compression performance and provides a valuable benchmark in assessing the proposed scheme. As we also encode the computed prediction error bands with SPIHT, the comparisons can be considered as SPIHT with geometrical prediction versus geometry-blind SPIHT encoding. (Unoptimized software that implements the geometric entropy decoder portion of the proposed codec runs in less than one second on a 3 GHz PC.)

Significant improvement is observed on contrived images such as those shown in Fig. 6 (a)-(b). In the figure G-DPCM stands for the proposed geometrical DPCM coder. G-DPCM results incorporate the rate required to transmit the geometric flow ($\sim 5\%$ of the bitrate). As the rate-distortion curves in Fig.
6 (c) and (d) indicate, there are gains of at least 4 dB on those images. The improvement is expected as prediction on those images is very straightforward. Improvements on more sophisticated computer graphics images such as those in Fig. 7 are smaller, but still noticeable.

[Fig. 6. Contrived images where the proposed codec attains substantial improvements. (a) Tilted rectangle and (b) Diamonds. (c), (d) R-D comparison on (a) and (b) respectively. Plots: PSNR [dB] versus rate [bpp], curves G-DPCM and SPIHT.]

[Fig. 7. Computer graphics images where the proposed codec attains moderate improvements. (a), (b) images obtained from [6]. (c), (d) R-D comparison on (a) and (b) respectively. Plots: PSNR [dB] versus rate [bpp], curves G-DPCM and SPIHT.]

For typical test images such as “Barbara”, the proposed scheme offers more modest improvements in performance, depending on the number of regions in the image where predicting the geometry can provide gains. Fig. 8 shows the rate-distortion performance on the “Barbara” image around 1 bpp. There is a $\sim 0.5$ dB improvement over the SPIHT coder.

[Fig. 8. “Barbara” image where the proposed codec attains modest improvements. Plot: PSNR [dB] versus rate [bpp], curves G-DPCM and SPIHT.]

In images such as “Barbara”, most of the improvements obtained with the geometrical predictor are compensated by the efficient context-sensitive, bit-plane based arithmetic coder. While our improvements when measured in first order entropy are better (Fig. 9), the absence of very long range linear singularities limits further gains in realistic scenarios.

[Fig. 9. First order entropy improvements over the images in Fig. 7 (a), (b), and Barbara respectively. Plots: PSNR versus rate (bpp), curves G-DPCM and No geometry.]

6. CONCLUSION

We have introduced an image compression system where the geometry inherent in images is taken advantage of in the wavelet domain. Using UDWT expansions and autoregressive predictors, our algorithm successfully discovers and exploits geometrical structure over otherwise highly aliased wavelet singularity coefficients. The geometrical segmentations generated in each wavelet band mostly coincide with the scene geometry except for some nonideal behavior close to the annihilated direction of that band. Even in such cases our predictive technique obtains compression improvements.

As our simulation results show, and similar to other reported techniques that take advantage of scene geometry, our approach provides the biggest improvements on cartoons and other images that exhibit strong singularities along curves. Further extension of these results to more general images is required. We note however that our formulation is more amenable to such generalizations compared to other methods, which are more sensitive to the type of singularities (step, impulse, etc.). Our work is also easier to generalize to curved geometry, which is omnipresent unless one restricts to images with very high resolutions.

7. REFERENCES

[1] E. J. Candès and D. L. Donoho, “New tight frames of curvelets and optimal representations of objects with piecewise $C^2$ singularities,” Comm. Pure and Appl. Math, vol. 57, no. 2, pp. 219–266, February 2004.
[2] E. Le Pennec and S. Mallat, “Sparse geometric image representation with bandelets,” IEEE Trans. Image Proc., vol. 14, no. 4, pp. 423–438, April 2005.
[3] M. B. Wakin, J. K. Romberg, H. Choi, and R. G. Baraniuk, “Wavelet-domain approximation and compression of piecewise smooth images,” IEEE Trans. Image Proc., to appear, 2005.
[4] R. Shukla, P. L. Dragotti, M. N. Do, and M. Vetterli, “Rate-distortion optimized tree structured compression algorithms for piecewise smooth images,” IEEE Transactions on Image Processing, vol. 14, pp. 342–359, 2005.
[5] A. Said and W. A. Pearlman, “A new fast and efficient image codec based on set partitioning in hierarchical trees,” IEEE Trans. on Circuits and Systems for Video Technology, vol. 6, pp. 243–250, 1996.
[6] http://www.graphics.cornell.edu/online/research/.
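As an illustration of the dead-zone quantizer of Eq. (1) in Section 2, the following is a minimal Python sketch of the index computation and of the reconstruction rule for the quantized values. The function names and the sample coefficient values are assumptions made for this example and are not part of the proposed codec.

```python
import math

def deadzone_quantize(y, delta):
    """Integer index sign(y) * floor(|y| / delta), following Eq. (1)."""
    sign = (y > 0) - (y < 0)
    return sign * math.floor(abs(y) / delta)

def deadzone_reconstruct(index, delta):
    """Quantized value delta * (index + sign(index) / 2)."""
    sign = (index > 0) - (index < 0)
    return delta * (index + sign / 2)

# Made-up coefficient values, quantized with step size delta = 10.
coefficients = [-27.0, -9.9, 0.0, 4.0, 10.0, 35.5]
indices = [deadzone_quantize(y, 10.0) for y in coefficients]
values = [deadzone_reconstruct(i, 10.0) for i in indices]
print(indices)  # [-2, 0, 0, 0, 1, 3]
print(values)   # [-25.0, 0.0, 0.0, 0.0, 15.0, 35.0]
```

Every coefficient with magnitude below the step size maps to index 0, so the zero bin is twice as wide as the other bins, and nonzero indices are reconstructed at the midpoint of their bin.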
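The two-step causal estimation of Section 3 can be sketched in one dimension as follows. This is only an illustration under stated assumptions: it uses a single-level orthonormal Haar basis instead of the 9-7 biorthogonal filters, a circular one-sample shift for the shifted basis, and unquantized coefficients. The helper names (`haar_basis`, `causal_shifted_estimates`) are hypothetical.

```python
import math

S = 1 / math.sqrt(2)

def haar_basis(n, shift=0):
    """n orthonormal Haar vectors (scaling, wavelet per sample pair),
    circularly shifted by `shift` samples. Assumes n is even."""
    basis = []
    for k in range(n // 2):
        a, b = (2 * k + shift) % n, (2 * k + 1 + shift) % n
        low, high = [0.0] * n, [0.0] * n
        low[a], low[b] = S, S
        high[a], high[b] = S, -S
        basis += [low, high]
    return basis

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def causal_shifted_estimates(x):
    n = len(x)
    phi, phi_shift = haar_basis(n), haar_basis(n, shift=1)
    y = [dot(p, x) for p in phi]        # y = Phi^T x
    w = [dot(p, x) for p in phi_shift]  # exact shifted coefficients, Eq. (2)
    x_partial = [0.0] * n               # nothing decoded yet
    history = []
    for known in range(1, n + 1):       # y_1 .. y_known are available
        # Step 1: add the newest coefficient to the partial reconstruction.
        newest = known - 1
        x_partial = [xp + y[newest] * p
                     for xp, p in zip(x_partial, phi[newest])]
        # Step 2: approximate shifted coefficients from the partial signal.
        history.append([dot(p, x_partial) for p in phi_shift[:known]])
    return y, w, history

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y, w, history = causal_shifted_estimates(x)
w_hat = history[3]  # estimates once four DWT coefficients are known
```

With four coefficients known, samples 0 to 3 are reconstructed exactly, so the first two shifted estimates in `w_hat` (supported on samples 1 and 2) already equal the true values, while the next two (supported on samples 3 and 4) are still approximate. This mirrors the role of the constant $C$: only estimates near the current position need updating.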
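The direction choice of Section 4.1 can be illustrated with the sketch below, which computes the angles of Eq. (4) and picks, for one block, the direction whose prediction residuals have the smallest first order entropy. It is a simplification and not the proposed predictor: it predicts each coefficient from a single nearest causal neighbor inside the block, without UDWT estimates, bilinear interpolation, AR fitting, or quadtree pruning. The `OFFSETS` table that maps direction indices to neighbors, and the toy block, are assumptions.

```python
import math
from collections import Counter

def direction_angles(d_max):
    """Angles theta_d = pi - (pi / d_max) * d for d = 1..d_max, Eq. (4)."""
    delta = math.pi / d_max
    return [math.pi - delta * d for d in range(1, d_max + 1)]

def first_order_entropy(symbols):
    """Empirical first order entropy in bits per symbol."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

# Hypothetical causal neighbor (row, col offset) for d_max = 4.
OFFSETS = {
    1: (-1, -1),  # 135 degrees
    2: (-1, 0),   # 90 degrees
    3: (-1, 1),   # 45 degrees
    4: (0, -1),   # 0 degrees
}

def residuals(block, d):
    """Prediction errors when each value is predicted from one neighbor."""
    dr, dc = OFFSETS[d]
    rows, cols = len(block), len(block[0])
    out = []
    for r in range(rows):
        for c in range(cols):
            pr, pc = r + dr, c + dc
            inside = 0 <= pr < rows and 0 <= pc < cols
            pred = block[pr][pc] if inside else 0
            out.append(block[r][c] - pred)
    return out

def best_direction(block):
    """Direction with the least first order entropy of the residuals."""
    costs = {d: first_order_entropy(residuals(block, d)) for d in OFFSETS}
    return min(costs, key=costs.get), costs

# Toy block of quantized coefficients with a diagonal edge.
block = [[5 if r == c else 0 for c in range(4)] for r in range(4)]
best, costs = best_direction(block)
```

On this toy block the diagonal direction (index 1) leaves a single nonzero residual, which matches the observation in the text that a successful prediction turns a straight line into a few dots and lowers the first order entropy.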