Ryan S. Prendergast and Truong Q. Nguyen

                                   Department of Electrical and Computer Engineering
                                           University of California, San Diego
                                                 La Jolla, CA 92093 USA

ABSTRACT

The super-resolution problem is considered using a mean-squared error minimizing solution of a generalized undersampling model in this non-iterative frequency domain approach. While previous frequency domain approaches have been based on a bandlimited image model, this approach uses a non-bandlimited stationary spectral model. This allows improved reconstruction of certain image features. The model and algorithm are presented along with an example.

1. INTRODUCTION

Over roughly the last two and a half decades there have emerged several families of solution types to the problem of super-resolution, the synthesis of a single high-resolution (HR) image from multiple overlapping lower-resolution (LR) portions of the same scene. Notable initial techniques used frequency domain based approaches for reconstruction [1, 2]. A later approach [3] used a spatial domain reconstruction technique based on generalized sampling theory [4] and determined a set of linear shift-invariant (LSI) filters which were applied to a set of upsampled LR images. Since an equivalent frequency domain version of this solution is found by taking the discrete Fourier transform of the determined filters, in some sense [3] can be considered a frequency domain approach. While some aspects of [2] are not found in [3] (namely advantages associated with its use of a weighted recursive least-squares approach), for the purposes of this paper the two techniques are comparable. Both consider known global translational shifts between the individual LR images, assume a bandlimited scene, and require the existence of a sufficient number of LR images guaranteeing an average sampling rate at least equal to the Nyquist rate.

This paper’s purpose is to examine how recent work finding the minimum mean-squared error (MMSE) reconstruction solution for the problem of generalized undersampling [5] can be applied to the problem of super-resolution. The term “generalized undersampling” refers to a scenario similar to classic generalized sampling [4], in which a signal is passed through multiple linear time-invariant filters, the outputs of which are individually sampled at a sub-Nyquist rate. However, unlike in the classic sense, the average sampling rate falls below the Nyquist rate, guaranteeing a loss of information. In the case of real-world images, the bandlimited assumption made in previous frequency domain approaches to super-resolution does not hold. If a sufficiently high resolution result is sought, solutions based on such an assumption can contain significant errors (e.g., rippling effects). This paper’s approach assumes knowledge of a spectral model and finds the MMSE linear reconstruction from the undersampled data. This enables reconstruction of features containing high-frequency data content which would be lost when a bandlimited reconstruction is forced. There also exist a wide variety of alternate approaches for solving the super-resolution problem (a small selection of which are [6–10]). Since the focus of this paper is specifically examining the use of frequency domain approaches, comparison with alternate methods cannot be made within what space is available.

The model used is described in Section 2, along with the assumptions required for its use. The approach is described in Section 3, followed by an implementation discussion in Section 4. Finally, an example is provided in Section 5.

This work is supported by a grant from the Office of Naval Research.

2. MODEL AND ASSUMPTIONS

In order to establish which super-resolution scenarios this paper’s approach can be applied to, the model will first be examined. Instead of a continuous real-world scene, the LR images are assumed taken from an HR discrete-space scene, the resolution of which should be equal to that of the super-resolved image. The most stringent requirement of
this approach is that this discrete-space scene has a known wide sense stationary (WSS) spectral model. Since the true scene is not known in practice, a spectral model will have to be assumed or estimated from the set of LR images (in which case the effects of noise and blurring must be considered). In addition, WSS models will be required for any additive noise. Finally, as with most super-resolution algorithms, this technique will also require registration information (which can be estimated with reasonable accuracy [11]) and an LSI model for any blurring.

A block diagram for the image degradation model and reconstruction process is shown in Fig. 1. Each LR image is found by passing the HR scene through a unique LSI filter modelling blur and translational motion, followed by a decimation block to model scene sampling, and then combining the result with additive stationary noise. These individual LR images are then passed through expansion blocks to increase the sampling rate, followed by LSI filters, then additively combined to form the super-resolved image.

[Figure 1: C parallel branches. Branch k applies filter H_k, decimation (↓), additive noise n_k, expansion (↑), and filter F_k. The stages up to the noise addition are labelled "Motion, blurring, sampling, and additive noise"; the remaining stages are labelled "Linear reconstruction".]

Fig. 1. Block diagram representing image degradation model and resolution enhancement.

3. OBTAINING THE SUPER-RESOLVED IMAGE

The filter bank model of Fig. 1 is examined using the approach in [5], which includes a more detailed mathematical analysis of the result. Assuming separable horizontal and vertical integer decimation operations of $D_H$ and $D_V$ respectively, the model can be analyzed in the frequency domain by subdividing the spectrum into $D = D_H D_V$ rectangular portions of area $2\pi/D_H \times 2\pi/D_V$. These portions correspond to the aliased sub-bands, which can be indexed in any consistent manner. The divided sub-bands of filters $H_1, H_2, \cdots, H_C$ are then collectively represented using the matrix

$$
\mathbf{H} =
\begin{bmatrix}
H_1(W^1) & H_2(W^1) & \cdots & H_C(W^1) \\
H_1(W^2) & H_2(W^2) & \cdots & H_C(W^2) \\
\vdots & \vdots & \ddots & \vdots \\
H_1(W^D) & H_2(W^D) & \cdots & H_C(W^D)
\end{bmatrix},
\tag{1}
$$

where $H_p(W^q)$ refers to the $q$th sub-band of the $p$th filter. This matrix is only defined for frequencies in the range

$$
|\omega_H| \le \pi/D_H, \qquad |\omega_V| \le \pi/D_V,
\tag{2}
$$

where $\omega_H$ and $\omega_V$ represent the respective normalized frequencies along the horizontal and vertical dimensions. An equivalent representation to (1) is made for the reconstruction filters $F_1, F_2, \cdots, F_C$. Using the same notation to index the sub-bands, a WSS spectral model for the input image can be represented through the diagonal matrix

$$
\mathbf{S}_{xx} =
\begin{bmatrix}
S_{xx}(W^1) & 0 & \cdots & 0 \\
0 & S_{xx}(W^2) & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & S_{xx}(W^D)
\end{bmatrix}.
\tag{3}
$$

A matrix representation for the additive noise is also found. It is assumed that the noise processes are statistically independent from the image. If not, a more complicated representation provided in [5] can be used. The cross-spectrum of the $k$th and $l$th noise components is contracted from its full normalized frequency range to that of a single sub-band. This is represented through

$$
\bar{S}_{n_k n_l} = S_{n_k n_l}\left(e^{j\omega_H D_H}, e^{j\omega_V D_V}\right).
\tag{4}
$$

The noise matrix is then defined through

$$
\mathbf{N} =
\begin{bmatrix}
\bar{S}_{n_1 n_1} & \cdots & \bar{S}_{n_1 n_C} \\
\vdots & \ddots & \vdots \\
\bar{S}_{n_C n_1} & \cdots & \bar{S}_{n_C n_C}
\end{bmatrix}.
\tag{5}
$$

As with $\mathbf{H}$, $\mathbf{F}$, and $\mathbf{S}_{xx}$, this noise matrix is only defined for the frequency ranges of (2), in this case due to the spectral contraction mentioned above.

The linear filters $F_1, F_2, \cdots, F_C$ are found by minimizing a function of the super-resolved image’s average MSE over $\mathbf{F}$. Piecewise recombination of this minimizing result,

$$
\mathbf{F}_{\mathrm{opt}} = D\,\mathbf{S}_{xx}\mathbf{H}^{*}\left(D\,\mathbf{N} + \mathbf{H}^{T}\mathbf{S}_{xx}\mathbf{H}^{*}\right)^{-1},
\tag{6}
$$

provides a frequency domain representation for the optimal resolution enhancement filters.

4. IMPLEMENTATION CONCERNS

Once the filters $F_1, F_2, \cdots, F_C$ are found, resolution enhancement can be performed at a relatively low cost. Each LR image is upsampled and transformed to its frequency domain representation, then multiplied with the frequency domain representation of the corresponding filter. The processed images are then additively combined and returned to the spatial domain to obtain the super-resolved image.
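The per-frequency solve behind (6) can be illustrated with a small numerical sketch. The Python below is a one-dimensional toy and not the authors' implementation. It assumes pure integer-shift filters H_p(e^{jω}) = e^{-jω s_p}, places sub-band q at frequency ω + 2πq/D (one of many consistent indexings), and reads (6) as F_opt = D S_xx H* (D N + H^T S_xx H*)^{-1}. The helper names `subband_matrix` and `mmse_filters` and all spectrum and noise values are hypothetical.

```python
import numpy as np

def subband_matrix(omega, shifts, D):
    """D x C matrix with entry [q, p] = H_p at sub-band q, laid out as in (1).

    Assumption: filter p is a pure delay of shifts[p] samples (1-D toy case),
    and sub-band q sits at frequency omega + 2*pi*q/D.
    """
    q = np.arange(D)[:, None]                       # sub-band index (rows)
    s = np.asarray(shifts, dtype=float)[None, :]    # shift per filter (columns)
    return np.exp(-1j * (omega + 2 * np.pi * q / D) * s)

def mmse_filters(H, sxx, N):
    """Evaluate (6) at one frequency: F = D Sxx H* (D N + H^T Sxx H*)^-1."""
    D = H.shape[0]
    Sxx = np.diag(sxx)                    # diagonal spectral model, as in (3)
    A = D * N + H.T @ Sxx @ H.conj()      # C x C matrix to be inverted
    B = D * Sxx @ H.conj()                # D x C left factor
    # Solve F A = B without forming an explicit inverse: A^T F^T = B^T.
    return np.linalg.solve(A.T, B.T).T

D = 4
omega = 0.1
sxx = np.array([10.0, 2.0, 0.5, 2.0])     # hypothetical sub-band power values

# Critically sampled, noise-free case: C = D channels with distinct shifts.
H_full = subband_matrix(omega, [0, 1, 2, 3], D)
F_full = mmse_filters(H_full, sxx, np.zeros((D, D)))

# Undersampled case: only C = 2 channels, with a small white-noise term.
N_under = 0.01 * np.eye(2)
H_under = subband_matrix(omega, [0, 1], D)
F_under = mmse_filters(H_under, sxx, N_under)
```

Under this reading of (6), the critically sampled, noise-free result satisfies F H^T = D I whatever spectrum is supplied, so the spectral model only shapes the filters once the data are undersampled or noisy.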
    Determining the enhancement filters will represent the
bulk of the computational cost. Equation (6) determines
the solution for all D sub-bands of all C reconstruction
filters simultaneously, but this process must be repeated for a
sufficient number of frequencies in the range (2) to obtain
an accurate reconstruction technique. For example, if a total
of Q × Q frequency domain samples are desired per filter,
(6) will have to be solved Q^2/D times.
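To give a feel for this count, the short sketch below tallies the number of solves for one grid size. The function name `solve_count` and the value Q = 512 are illustrative assumptions rather than figures taken from the paper; the decimation factors and channel count are also just sample values.

```python
def solve_count(Q, D_H, D_V):
    """Number of times (6) must be evaluated for a Q x Q frequency grid.

    Each evaluation yields all D = D_H * D_V sub-bands of every filter at
    once, so only Q*Q / D distinct base frequencies need to be visited.
    Assumes Q is divisible by both decimation factors.
    """
    if Q % D_H or Q % D_V:
        raise ValueError("Q must be a multiple of D_H and D_V")
    return (Q // D_H) * (Q // D_V)

Q, D_H, D_V, C = 512, 4, 4, 4           # hypothetical grid size and setup
n_solves = solve_count(Q, D_H, D_V)     # (512 / 4) * (512 / 4) base frequencies
# Assumes (6) is read with a C x C matrix being inverted at each frequency.
print(f"{n_solves} solves, each involving a {C} x {C} matrix")
```

Even at this moderate grid size the solve is repeated thousands of times, although each individual system is small.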
    To further complicate matters, the structure of these filters
depends on the spectral content of the scene, the combined
motion and blurring, and the noise statistics. This
makes it unlikely that the enhancement filters
used in one scenario will be effective in another. However,
since the components of (6) have a certain structure,
there may be cases where the computation can be
significantly reduced. If only certain types of scenes are
examined, a database of appropriate spectral models can be
stored, which can assist in the selection of a spectral model
from the LR images. However, the computational cost will
generally be far from trivial.

Fig. 2. Test image.

5. EXAMPLE

Evaluation is performed using the airplane test image in Fig. 2. To illustrate the advantages of this paper’s approach using an undersampled spectral model over a frequency domain approach assuming a bandlimited model, a simple scene degradation process will be used. The individual LR images will be uncorrupted by noise and free from blurring. The only differences between the LR frames will be known translational shifts. In the event of uncertain registration accuracy, this approach can be modified to consider random shifts, a problem considered in [12]. However, the modification will reduce overall reconstruction quality. Inaccurate registration in super-resolution was also considered in [9] using an adaptive approach. The spectrum of the scene is assumed known in this example, and found by taking the squared magnitude of the Fourier transform of the model scene. In practice, the spectrum would have to be estimated from the combined LR frames.

Four individual LR images are obtained by decimating the model scene of Fig. 2 by a factor of 4 horizontally and vertically. The super-resolved image is found at the model’s original resolution, representing a total undersampling factor of 4. Relative to the first LR image, the others have respective horizontal and vertical shifts of (2, 1), (3, 2), and (1, 3) pixels.

Fig. 3 shows a selected portion of the original scene in (a) and one of the four LR frames in (b), along with two super-resolved versions (c) and (d). The scene is assumed bandlimited and critically sampled to obtain (c), which contains significant ringing, a feature corresponding to an image being bandlimited. This paper’s approach is then used with the known spectral model to obtain (d), which has significantly decreased ringing and increased readability of the plane’s number. The PSNRs of the complete super-resolved images are 31.23 dB for reconstruction under the bandlimited assumption and 34.56 dB with a known image spectrum. Further simulation results can be found at

6. CONCLUSION

A new approach to frequency domain super-resolution was presented. An MMSE approach with a known spectral model was used to find a resolution enhancement technique specifically tailored to the scene. While a high computational cost is associated with filter calculation and certain limitations are induced by requiring LSI modelling of the LR image degradation, a higher-quality image can be produced. This was illustrated by the presented example, where results were significantly improved by using spectral modelling instead of a bandlimited assumption. This result serves to mitigate some of the weaknesses that have been commonly associated with frequency domain super-resolution. Future work will investigate improvements upon this approach such as computation reduction and the related problem of image spectra modelling.

7. REFERENCES

[1] T. S. Huang and R. Y. Tsai, “Multi-frame image restoration and registration,” in Advances in Computer Vision and Image Processing, T. S. Huang, ed. Greenwich, CT: JAI Press, 1984, vol. 1, pp. 317-339.

[2] S. P. Kim, N. K. Bose, and H. M. Valenzuela, “Recursive reconstruction of high resolution image from noisy undersampled multiframes,” IEEE Trans. Acoust., Speech, Signal Processing, vol. 38, pp. 1013-1027, June 1990.

[3] H. Ur and D. Gross, “Improved resolution from subpixel shifted pictures,” CVGIP: Graph. Models Image Processing, vol. 54, pp. 181-186, Mar. 1992.

[4] A. Papoulis, “Generalized sampling expansion,” IEEE Trans. Circuits Systems, vol. CAS-24, pp. 652-654, Nov. 1977.

[5] R. S. Prendergast and T. Q. Nguyen, “Minimum mean-squared error reconstruction for generalized undersampling of cyclostationary processes,” submitted to IEEE Trans. Sig. Processing.

[6] H. Stark and P. Oskoui, “High resolution image recovery from image-plane arrays, using convex projections,” J. Opt. Soc. Amer. A, vol. 6, pp. 1715-1726, Nov. 1989.

[7] R. C. Hardie, K. J. Barnard, and E. E. Armstrong, “Joint MAP registration and high-resolution image estimation using a sequence of undersampled images,” IEEE Trans. Image Processing, vol. 6, pp. 1621-1633, Dec.

[8] M. Elad and Y. Hel-Or, “A fast super-resolution reconstruction algorithm for pure translational motion and common space invariant blur,” IEEE Trans. Image Processing, vol. 10, pp. 1187-1193, Aug. 2001.

[9] E. S. Lee and M. G. Kang, “Regularized adaptive high-resolution image reconstruction considering inaccurate subpixel registration,” IEEE Trans. Image Processing, vol. 12, pp. 826-837, July 2003.

[10] S. Farsiu, M. D. Robinson, M. Elad, and P. Milanfar, “Fast and robust multiframe super resolution,” IEEE Trans. Image Processing, vol. 13, pp. 1327-1344, Oct.

[11] H. Shekarforoush, M. Berthod, and J. Zerubia, “Subpixel image registration by estimating the polyphase decomposition of cross power spectrum,” in Proc. 1996 IEEE Computer Society Conf. Computer Vision Pattern Recognition, June 1996, pp. 532-537.

[12] R. S. Prendergast and T. Q. Nguyen, “Optimal Reconstruction of Periodically Sampled Signals with Probabilistic Timing Delays,” to appear in Proc. 38th Asilomar Conf. Signals, Systems, and Computers, Pacific Grove, CA, 2004.

[Figure 3: four image panels, (a) through (d).]

Fig. 3. Magnified portions of test scene (a), one of four LR frames at 1/16th resolution (b), frequency-domain reconstruction under bandlimited assumption (c), this paper’s approach (d).
