(IJCSIS) International Journal of Computer Science and Information Security, Vol. 8, No. 2, 2010

                 IMAGE SUPER RESOLUTION USING
                  MARGINAL DISTRIBUTION PRIOR
                       S.Ravishankar                                                            Dr.K.V.V.Murthy
      Department of Electronics and Communication                                 Department of Electronics and Communication
        Amrita Vishwa Vidyapeetham University                                       Amrita Vishwa Vidyapeetham University
                    Bangalore, India                                                            Bangalore, India
             s_ravishankar@blr.amrita.edu                                                 kvv_murthy@blr.amrita.edu


Abstract— In this paper, we propose a new technique for image super-resolution. Given a single low resolution (LR) observation and a database consisting of low resolution images and their high resolution versions, we obtain super-resolution for the LR observation using a regularization framework. First we obtain a close approximation of the super-resolved image using a learning based technique, in which the high frequency details of the observation are learnt using the Discrete Cosine Transform (DCT). The LR observation is represented using a linear model. We model the texture of the HR image using the marginal distribution and use it as prior information to preserve the texture. We extract the features of the texture in the image by computing histograms of the filtered images obtained by applying the filters in a filter bank, and match them to those of the close approximation. We arrive at a cost function consisting of a data fitting term and a prior term and optimize it using Particle Swarm Optimization (PSO). We show the efficacy of the proposed method by comparing the results with interpolation methods and existing super-resolution techniques. The advantage of the proposed method is that it converges quickly to the final solution and does not require multiple low resolution observations.

                          I. INTRODUCTION

In many applications high resolution images lead to better classification, analysis and interpretation. The resolution of an image depends on the density of sensing elements in the camera. A high end camera with large memory storage capability can be used to capture high resolution images. In some applications, such as wildlife sensor networks and video surveillance, it may not be feasible to employ a costly camera. In such applications algorithmic approaches can be helpful to obtain high resolution images from low resolution images captured using low cost cameras. The super-resolution idea was first proposed by Tsai and Huang [1]. They use a frequency domain approach and employ motion as a cue. In [2], the authors use a Maximum a Posteriori (MAP) framework for jointly estimating the registration parameters and the high-resolution image for severely aliased observations. The authors in [3] describe a MAP-MRF based super-resolution technique using blur as a cue and recover both the high-resolution scene intensity and the depth fields simultaneously. The authors in [4] present a technique for image interpolation using the wavelet transform. They estimate the wavelet coefficients at a higher scale from a single low resolution observation and achieve interpolation by taking the inverse wavelet transform. The authors in [5] propose a technique for super-resolving a single frame image using a database of high resolution images. They learn the high frequency details from the database, obtain an initial estimate of the image to be super-resolved, formulate regularization using a wavelet prior and an MRF model prior, and employ simulated annealing for optimization. Recently, learning based techniques have been employed for super-resolution, where the missing information of the high resolution image is learned from a database consisting of high resolution images. Freeman et al. [6] propose an example based super-resolution technique. They estimate the missing high-frequency details by interpolating the input low-resolution image to the desired scale; the super-resolution is performed by nearest neighbor based estimation of high-frequency patches based on the corresponding patches of the input low-frequency image. Brandi et al. [7] propose an example-based approach for video super-resolution. They restore the high-frequency information of an interpolated block by searching a database for a similar block and adding the high frequency of the chosen block to the interpolated one. They use the high frequency of key HR frames instead of the database to increase the quality of the non-key restored frames. In [8], the authors address the problem of super-resolution from a single image using a multi-scale tensor voting framework. They consider all three color channels simultaneously to produce a multi-scale edge representation that guides the reconstruction of the high-resolution color image, which is subjected to the back projection constraint. The authors in [9] recover the super-resolution image through a neighbor embedding algorithm and employ histogram matching for selecting more reasonable training images with related contents. In [10] the authors propose a neighbor embedding based super-resolution through edge detection and feature selection (NeedFS). They propose a combination of appropriate features for preserving edges as well as smoothing the color regions; the training patches are learned with different neighborhood sizes depending on edge detection. The authors in [11] propose a modeling methodology for texture images. They capture the features of texture using a set of filters which represents the marginal distribution of the image and match the same by feature fusion to infer the solution.

In this paper, we propose an approach to obtain super-resolution from a single image. First, we learn the high frequency content of the super-resolved image from the high-resolution training images in the database and use the learnt image as a close approximation to the final solution. We solve this ill-posed problem using prior information in the form of the marginal distribution. We apply different filters to the image and calculate the histograms, and we assume that these histograms should not deviate from those of the close approximation. We show the results of our method on real images and compare it with existing approaches.



          II. DCT BASED APPROACH FOR CLOSE APPROXIMATION

In this section, the DCT based approach to learn the high frequency details of the super-resolved image for a decimation factor of 2 (q = 2) is described. Each set in the database consists of a pair of low resolution and high resolution images. The test image and the LR training images are of size M × M pixels; the corresponding HR training images are of size 2M × 2M pixels. We first upsample the test image and all the low resolution training images by a factor of 2 and create images of size 2M × 2M pixels each. A standard interpolation technique can be used for this. We divide each of the images, i.e. the upsampled test image, the upsampled low resolution training images and their high resolution versions, into blocks of size 4 × 4. The motivation for dividing into 4 × 4 blocks comes from the theory of JPEG compression, where an image is divided into 8 × 8 blocks in order to extract the redundancy in each block. In this case, however, we are interested in learning the non-aliased frequency components from the HR training images using the aliased test image and the aliased LR training images. This is done by taking the DCT of each block of all the images in the database as well as of the test image. Fig. 1(a) shows the DCT blocks of the upsampled test image, whereas Fig. 1(b) shows the DCT blocks of the upsampled LR training images and the HR training images. We learn the DCT coefficients for each block in the test image from the corresponding blocks of the HR images in the database. It is reasonable to assume that when we interpolate the test image and the low resolution training images to 2M × 2M pixels, the distortion is minimum in the lower frequencies. Hence we can learn those DCT coefficients that correspond to high frequencies, which are already aliased and are now further distorted due to interpolation. We consider the upsampled LR training images to find the best matching DCT coefficients for each of the blocks in the test image.

Let C_T(i, j), 1 ≤ (i, j) ≤ 4, be the DCT coefficient at location (i, j) in a 4 × 4 block of the test image. Similarly, let C_LR^(m)(i, j) and C_HR^(m)(i, j), m = 1, 2, ..., L, be the DCT coefficients at location (i, j) in the block at the same position in the m-th upsampled LR image and the m-th HR image, where L is the number of training sets in the database. Now the best matching HR block for the considered (upsampled) low resolution image block is obtained as

    \hat{m} = \arg\min_m \sum_{(i+j) > Threshold} \left[ C_T(i, j) - C_{LR}^{(m)}(i, j) \right]^2 ,        (1)

where \hat{m} is the index of the training image that gives the minimum for the block. The non-aliased DCT coefficients of the best matching HR image are then copied into the corresponding locations of the block of the upsampled test image. In effect, we learn the non-aliased DCT coefficients for the test image block from the set of LR-HR images. The coefficients that correspond to low frequencies are not altered. Thus, at location (i, j) in a block we have

    C_T(i, j) = \begin{cases} C_{HR}^{(\hat{m})}(i, j) & \text{if } (i + j) > Threshold \\ C_T(i, j) & \text{otherwise.} \end{cases}        (2)

This is repeated for every block in the test image. We conducted experiments with different Threshold values. We began with Threshold = 2, where all the coefficients except the DC coefficient are learned, and subsequently increased the threshold value. The best results were obtained when the Threshold was set to 4, which corresponds to learning a total of 10 coefficients from the best matching HR image in the database. After learning the DCT coefficients for every block in the test image, we take the inverse DCT to obtain the high spatial resolution image and consider it as the close approximation to the HR image.
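The block-wise learning procedure of Eqs. (1) and (2) can be summarized in a short sketch. The following Python/NumPy code is only an illustrative reading of the method, not the authors' implementation; the function names, the use of SciPy's dctn/idctn and the 1-based index-sum mask for the Threshold are our own assumptions.

import numpy as np
from scipy.fft import dctn, idctn

def block_dct(img, bs=4):
    """Return a (rows/bs, cols/bs, bs, bs) array of per-block 2-D DCTs."""
    r, c = img.shape
    out = np.empty((r // bs, c // bs, bs, bs))
    for bi in range(r // bs):
        for bj in range(c // bs):
            blk = img[bi*bs:(bi+1)*bs, bj*bs:(bj+1)*bs]
            out[bi, bj] = dctn(blk, norm='ortho')
    return out

def learn_close_approximation(test_up, lr_up_list, hr_list, threshold=4, bs=4):
    """test_up: upsampled test image (2M x 2M); lr_up_list / hr_list: the
    upsampled LR and the HR training images. Returns the learnt HR estimate."""
    i_idx, j_idx = np.meshgrid(np.arange(1, bs+1), np.arange(1, bs+1), indexing='ij')
    high = (i_idx + j_idx) > threshold            # mask of aliased (learnt) coefficients
    C_T = block_dct(test_up, bs)
    C_LR = np.stack([block_dct(im, bs) for im in lr_up_list])   # (L, ...)
    C_HR = np.stack([block_dct(im, bs) for im in hr_list])
    out = np.empty_like(test_up, dtype=float)
    for bi in range(C_T.shape[0]):
        for bj in range(C_T.shape[1]):
            # Eq. (1): best matching training block over the high-frequency mask
            diff = C_LR[:, bi, bj][:, high] - C_T[bi, bj][high]
            m_hat = np.argmin((diff ** 2).sum(axis=1))
            # Eq. (2): copy only the high-frequency coefficients from the best HR block
            blk = C_T[bi, bj].copy()
            blk[high] = C_HR[m_hat, bi, bj][high]
            out[bi*bs:(bi+1)*bs, bj*bs:(bj+1)*bs] = idctn(blk, norm='ortho')
    return out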
                    III. IMAGE FORMATION MODEL

In this work, we obtain super-resolution for an image from a single observation. The observed image Y is of size M × M pixels. Let y represent the lexicographically ordered vector of size M^2 × 1 which contains the pixels of image Y, and let z be the super-resolved image. The observed image can be modeled as

    y = Dz + n,        (3)

where D is the decimation matrix which takes care of aliasing. For an integer decimation factor of q, the decimation matrix D consists of q^2 non-zero elements along each row at appropriate locations. We estimate this decimation matrix from the initial estimate; the procedure for estimating the decimation matrix is described below. Here n is the i.i.d. noise vector with zero mean and variance \sigma_n^2, and it is of size M^2 × 1. The multivariate noise probability density is given by

    P(n) = \frac{1}{(2\pi\sigma_n^2)^{M^2/2}} \exp\left( -\frac{n^T n}{2\sigma_n^2} \right) .

Our problem is to estimate z given y, which is an ill-posed inverse problem. It may be mentioned here that the captured observation is not blurred; in other words, we assume an identity matrix for the blur.




Generally, the decimation model to obtain the aliased pixel intensities from the high resolution pixels, for a decimation factor of q, has the form [12]

    D = \frac{1}{q^2} \begin{bmatrix} 1\;1\;\cdots\;1 & & & 0 \\ & 1\;1\;\cdots\;1 & & \\ & & \ddots & \\ 0 & & & 1\;1\;\cdots\;1 \end{bmatrix} ,        (4)

where each row contains q^2 ones. The decimation matrix in Eq. (4) indicates that a low resolution pixel intensity Y(i, j) is obtained by averaging the intensities of the q^2 pixels corresponding to the same scene location in the high resolution image and adding the noise intensity n(i, j).
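To make the observation model concrete, the sketch below constructs the decimation matrix of Eq. (4) for a q × q averaging of an N × N high resolution image and simulates Eq. (3). It is an illustrative construction under our own assumptions (a dense matrix and row-major lexicographic ordering), not the estimation procedure used in the paper.

import numpy as np

def decimation_matrix(N, q):
    """Decimation matrix D of Eq. (4): each LR pixel is the average of a
    q x q block of HR pixels (row-major lexicographic ordering assumed)."""
    M = N // q
    D = np.zeros((M * M, N * N))
    for i in range(M):
        for j in range(M):
            row = i * M + j
            for di in range(q):
                for dj in range(q):
                    col = (i * q + di) * N + (j * q + dj)
                    D[row, col] = 1.0 / q**2
    return D

# Eq. (3): simulate a low resolution observation y = Dz + n
N, q, sigma_n = 8, 2, 0.01
z = np.random.rand(N * N)                 # lexicographically ordered HR image
D = decimation_matrix(N, q)
y = D @ z + sigma_n * np.random.randn((N // q) ** 2)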
Figure-1. (a) DCT blocks of the upsampled test image; (b) DCT blocks of the upsampled LR training images and the HR training images, Training Set-1 ... Training Set-L.

                      IV. TEXTURE MODELLING

Natural images consist of smooth regions, edges and texture areas. We regularize the solution using a texture preserving prior. We capture the features of the texture by applying different filters to the image and computing histograms of the filtered images. These histograms estimate the marginal distribution of the image and are used as the features of the image. We use a filter bank that consists of two kinds of filters: Laplacian of Gaussian (LoG) filters and Gabor filters.

A. Filter Bank

Gaussian filters play an important role due to their nice low pass frequency property. The two dimensional Gaussian function can be defined as

    G(x, y \mid x_0, y_0, \sigma_x, \sigma_y) = \frac{1}{2\pi\sigma_x\sigma_y} \exp\left[ -\left( \frac{(x - x_0)^2}{2\sigma_x^2} + \frac{(y - y_0)^2}{2\sigma_y^2} \right) \right] .        (5)

Here (x_0, y_0) are the location parameters and (\sigma_x, \sigma_y) are the scale parameters. The Laplacian of Gaussian (LoG) filter is a radially symmetric filter centered around the Gaussian with (x_0, y_0) = (0, 0) and \sigma_x = \sigma_y = T. Hence the LoG filter can be represented by

    F(x, y \mid 0, 0, T) = c\,(x^2 + y^2 - T^2)\,\exp\left( -\frac{x^2 + y^2}{T^2} \right) ,        (6)

where c is a constant and T is the scale parameter. We can choose different scales with T = 1/\sqrt{2}, 1, 2, 3, and so on. The Gabor filter with sinusoidal frequency \omega, amplitude modulated by the Gaussian function, can be represented by

    F_\omega(x, y) = G(x, y \mid 0, 0, \sigma_x, \sigma_y)\, e^{-i\omega x} .        (7)

A simple case of Eq. (7) with both sine and cosine components can be chosen as

    G(x, y \mid 0, 0, T, \theta) = c \exp\left\{ -\frac{1}{2T^2} \left[ 4(x\cos\theta + y\sin\theta)^2 + (-x\sin\theta + y\cos\theta)^2 \right] \right\} .        (8)

By varying the frequency and rotating the filter in the x-y plane, we can obtain a bank of filters. We can choose different scales T = 2, 4, 6, 8 and so on. Similarly, the orientation can be varied as \theta = 0°, 30°, 60°, 90° and so on.
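A small filter bank along these lines can be built as follows. The sketch implements the LoG filter of Eq. (6) and a Gabor-type filter whose Gaussian envelope follows Eq. (8); the grid size, the constant c = 1 and the sinusoidal carrier multiplying the envelope are our own assumptions rather than details given in the paper.

import numpy as np

def log_filter(T, size=15, c=1.0):
    """Laplacian of Gaussian filter of Eq. (6) on a (size x size) grid."""
    r = size // 2
    x, y = np.meshgrid(np.arange(-r, r + 1), np.arange(-r, r + 1), indexing='ij')
    return c * (x**2 + y**2 - T**2) * np.exp(-(x**2 + y**2) / T**2)

def gabor_cos_filter(T, theta, size=15, c=1.0):
    """Gabor-type filter with the Gaussian envelope of Eq. (8)."""
    r = size // 2
    x, y = np.meshgrid(np.arange(-r, r + 1), np.arange(-r, r + 1), indexing='ij')
    u = x * np.cos(theta) + y * np.sin(theta)
    v = -x * np.sin(theta) + y * np.cos(theta)
    env = c * np.exp(-(4 * u**2 + v**2) / (2 * T**2))
    return env * np.cos(2 * np.pi * u / T)     # assumed sinusoidal carrier along u

def filter_bank():
    """Bank B: LoG filters at several scales plus Gabor filters at several
    scales and orientations (scale and orientation values follow the text)."""
    bank = [log_filter(T) for T in (1 / np.sqrt(2), 1, 2, 3)]
    for T in (2, 4, 6, 8):
        for deg in (0, 30, 60, 90):
            bank.append(gabor_cos_filter(T, np.deg2rad(deg)))
    return bank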


B. Marginal Distribution Prior

As mentioned earlier, the histograms of the filtered images estimate the marginal distribution of the image, and we use this marginal distribution as a prior. We obtain the close approximation Z_C of the HR image using the discrete cosine transform based learning approach described in Section II and assume that the marginal distribution of the super-resolved image should match that of the close approximation Z_C. Let B be a bank of filters. We apply each of the filters in B to Z_C to obtain the filtered images Z_C^(α), where α = 1, ..., |B|, and compute the histogram H_C^(α) of Z_C^(α). Similarly, we apply each of the filters in B to the initial HR estimate to obtain the filtered images Z^(α), α = 1, 2, ..., |B|, and compute the histogram H^(α) of Z^(α). We define the marginal distribution prior term as

    C_H = \sum_{\alpha=1}^{|B|} \left| H_C^{(\alpha)} - H^{(\alpha)} \right| .        (9)
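A minimal sketch of the prior term in Eq. (9) is given below, assuming the filtering is done by 2-D convolution and the histograms use a fixed number of bins over a fixed range (both our own assumptions).

import numpy as np
from scipy.signal import convolve2d

def marginal_histogram(img, filt, bins=32, rng=(-1.0, 1.0)):
    """Histogram of one filtered image; bin count and range are assumptions."""
    resp = convolve2d(img, filt, mode='same', boundary='symm')
    h, _ = np.histogram(resp, bins=bins, range=rng, density=True)
    return h

def marginal_prior(z, z_close, bank):
    """Eq. (9): C_H = sum over alpha of |H_C^(alpha) - H^(alpha)|, summed over bins."""
    cost = 0.0
    for filt in bank:
        h_c = marginal_histogram(z_close, filt)
        h = marginal_histogram(z, filt)
        cost += np.abs(h_c - h).sum()
    return cost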




                   V. SUPER-RESOLVING THE IMAGE

The final cost function, consisting of the data fitting term and the marginal distribution prior term, can be expressed as

    \hat{z} = \arg\min_z \left[ \frac{\| y - Dz \|^2}{2\sigma_n^2} + \lambda \sum_{\alpha=1}^{|B|} \left| H_C^{(\alpha)} - H^{(\alpha)} \right| \right] ,        (10)

where \lambda is a suitable weight for the regularization term. Since the cost function contains a non-linear term, it cannot be minimized using a simple gradient descent technique. We employ particle swarm optimization and avoid computationally complex optimization methods such as simulated annealing. Let S be the swarm. The swarm S is populated with images Z_p, p = 1, ..., |S|, expanded using existing interpolation techniques such as bi-cubic interpolation, Lanczos interpolation and learning based approaches. Each pixel in this swarm is a particle. The dimension of the search space for each image is D = N × N. The i-th image of the swarm can be represented by a D-dimensional vector Z_i = (z_{i1}, z_{i2}, ..., z_{iD})^T, and the velocity of the particles in this image by another D-dimensional vector V_i = (v_{i1}, v_{i2}, ..., v_{iD})^T. The best previously visited position of the i-th image is denoted as P_i = (p_{i1}, p_{i2}, ..., p_{iD})^T. Defining g as the index of the best particle in the swarm, the swarm is manipulated according to the following two equations [13]:

    v_{id}^{t+1} = w\, v_{id}^{t} + c_1 r_1 \left( p_{id}^{t} - z_{id}^{t} \right) + c_2 r_2 \left( p_{gd}^{t} - z_{id}^{t} \right) ,        (11)

    z_{id}^{t+1} = z_{id}^{t} + v_{id}^{t+1} ,        (12)

where d = 1, 2, ..., D; i = 1, 2, ..., |S|; w is the inertia weighting function; r_1 and r_2 are uniformly distributed random numbers; t is the iteration number; and c_1 and c_2 are the cognitive and social parameters, respectively. The fitness function in our case is the cost function that has to be minimized.
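A minimal particle swarm optimizer following Eqs. (11) and (12) is sketched below. The swarm is initialized with the interpolated and learnt HR estimates, and the fitness callable is assumed to evaluate the cost of Eq. (10) for a flattened image; the parameter values w, c1, c2 and the iteration count are our own assumptions, not values reported in the paper.

import numpy as np

def pso_minimize(init_images, fitness, iters=50, w=0.7, c1=1.5, c2=1.5):
    """Minimal PSO following Eqs. (11)-(12); parameter values are assumptions."""
    Z = np.stack([im.ravel() for im in init_images]).astype(float)   # particle positions
    V = np.zeros_like(Z)                                             # particle velocities
    P = Z.copy()                                                     # personal best positions
    P_cost = np.array([fitness(z) for z in Z])
    g = int(np.argmin(P_cost))                                       # index of the global best
    for _ in range(iters):
        r1 = np.random.rand(*Z.shape)
        r2 = np.random.rand(*Z.shape)
        V = w * V + c1 * r1 * (P - Z) + c2 * r2 * (P[g] - Z)         # Eq. (11)
        Z = Z + V                                                    # Eq. (12)
        cst = np.array([fitness(z) for z in Z])
        better = cst < P_cost
        P[better] = Z[better]
        P_cost[better] = cst[better]
        g = int(np.argmin(P_cost))
    return P[g].reshape(init_images[0].shape)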

                    VI. EXPERIMENTAL RESULTS

Figure-2. (a) HR Image; (b) Learnt Image; (c) PSO Optimized Image.

Figure-3. (a) HR Image; (b) Learnt Image; (c) PSO Optimized Image.

TABLE-1

Image Num    MMSE between HR and Learnt images    MMSE between HR and PSO images
    1                0.02173679178                        0.02154759509
    2                0.01117524672                        0.01107802761

In this section, we present the results (shown in Fig. 2, Fig. 3 and Table 1) of the proposed method for super-resolution. We compare the performance of the proposed method on the basis of the quality of the images. All the experiments were conducted on real images. Each observed image is of size 128 × 128 pixels, and the super-resolved images are also of size 128 × 128. We used the quantitative measure Mean Square Error (MSE) for comparison of the results. The MSE used here is

    MSE = \frac{ \sum_{i,j} \left| \hat{f}(i, j) - f(i, j) \right|^2 }{ \sum_{i,j} \left| f(i, j) \right|^2 } ,

where f(i, j) is the original high resolution image and \hat{f}(i, j) is the estimated super-resolution image.
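For completeness, the normalized MSE measure defined above can be computed as in the following sketch.

import numpy as np

def normalized_mse(f, f_hat):
    """Normalized mean square error used for comparison:
    sum |f_hat - f|^2 / sum |f|^2, as defined in the text."""
    f = np.asarray(f, dtype=float)
    f_hat = np.asarray(f_hat, dtype=float)
    return np.sum((f_hat - f) ** 2) / np.sum(f ** 2)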




                         VII. CONCLUSION

We have presented a technique to obtain super-resolution for an image captured using a low cost camera. The high frequency content of the super-resolved image is learnt from a database of low resolution images and their high resolution versions, and the suggested learning technique yields a close approximation to the solution. The LR observation is represented using a linear model, and the marginal distribution is used as prior information for regularization. The cost function, consisting of a data fitting term and a marginal distribution prior term, is optimized using particle swarm optimization, and the optimization process converges rapidly. It may be concluded that the proposed method yields better results in both smooth regions and texture regions and greatly reduces the optimization time.

                           REFERENCES

[1] R. Y. Tsai and T. S. Huang, "Multiframe image restoration and registration," Advances in Computer Vision and Image Processing, pp. 317-339, 1984.

[2] R. C. Hardie, K. J. Barnard, and E. E. Armstrong, "Joint MAP registration and high-resolution image estimation using a sequence of undersampled images," IEEE Trans. Image Process., vol. 6, no. 12, pp. 1621-1633, Dec. 1997.

[3] D. Rajan and S. Chaudhuri, "Generation of super-resolution images from blurred observations using an MRF model," Journal of Mathematical Imaging and Vision, vol. 16, pp. 5-15, 2002.

[4] S. Chaudhuri, Ed., Super-Resolution Imaging, Kluwer, 2001.

[5] C. V. Jiji, M. V. Joshi, and S. Chaudhuri, "Single frame image super-resolution using learned wavelet coefficients," International Journal of Imaging Systems and Technology, vol. 14, no. 3, pp. 105-112, 2004.

[6] W. Freeman, T. Jones, and E. Pasztor, "Example-based super-resolution," IEEE Computer Graphics and Applications, vol. 22, no. 2, pp. 56-65, 2002.

[7] F. Brandi, R. de Queiroz, and D. Mukherjee, "Super-resolution of video using key frames," IEEE International Symposium on Circuits and Systems, pp. 1608-1611, 2008.

[8] Y. W. Tai, W. S. Tong, and C. K. Tang, "Perceptually inspired and edge-directed color image super-resolution," IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 1948-1955, 2006.

[9] T. Chan and J. Zhang, "An improved super-resolution with manifold learning and histogram matching," Proc. IAPR International Conference on Biometrics, pp. 756-762, 2006.

[10] T. Chan, J. Zhang, J. Pu, and H. Huang, "Neighbor embedding based super-resolution algorithm through edge detection and feature selection," Pattern Recognition Letters, vol. 30, no. 5, pp. 494-502, 2009.

[11] S. C. Zhu, Y. N. Wu, and D. Mumford, "Filters, random fields and maximum entropy (FRAME): Towards a unified theory for texture modeling," International Journal of Computer Vision, vol. 27, no. 2, pp. 107-126, 1998.

[12] R. R. Schultz and R. L. Stevenson, "A Bayesian approach to image expansion for improved definition," IEEE Trans. Image Process., vol. 3, no. 3, pp. 233-242, May 1994.

[13] M. Vrahatis and K. Parsopoulos, "Natural Computing," Kluwer, 2002.

                       AUTHORS PROFILE





				