
							IEEE COMSOC MMTC E-Letter
Focused Technology Advances Series

A Case for Compressive Video Streaming in Wireless Multimedia Sensor Networks
        Tommaso Melodia, Scott Pudlewski, University at Buffalo, SUNY, USA
                 tmelodia@eng.buffalo.edu, smp25@buffalo.edu

Introduction
Wireless Multimedia Sensor Networks (WMSNs) [1] are self-organizing wireless
systems of embedded devices deployed to retrieve, distributively process in
real-time, store, correlate, and fuse multimedia streams originated from
heterogeneous sources. WMSNs will enable new applications including
multimedia surveillance, storage and subsequent retrieval of potentially
relevant activities, and person locator services.
   In recent years, there has been intense research and considerable progress
in solving numerous wireless sensor networking challenges. However, the key
problem of enabling real-time quality-aware video streaming in large-scale
multi-hop wireless networks of embedded devices is still open and largely
unexplored. In fact, traditional video streaming systems based on
transmitting predictively-encoded video through a layered communication
protocol stack suffer from high complexity at the encoder and low resiliency
to channel errors.
   • Encoder Complexity. Predictive encoding requires complex processing
algorithms, leading to high energy consumption [1]. New video encoding
paradigms are needed to reverse the traditional balance of complex encoder
and simple decoder, which is unsuited for WMSNs. Recently developed
distributed video coding [2] algorithms exploit the source statistics at the
decoder, thus shifting the complexity to the decoder. While promising for
WMSNs, most practical Wyner-Ziv codecs require end-to-end feedback from the
decoder [3], which introduces overhead and delay. Furthermore, gains of
practical distributed video codecs are typically limited to 2-5 dB of PSNR.
   • Limited Resiliency to Channel Errors. In existing layered protocol
stacks based on the IEEE 802.11 and 802.15.4 standards, video frames are
split into multiple packets. If even a single bit is flipped due to channel
errors, the cyclic redundancy check fails and the entire packet is dropped at
a final or intermediate receiver. This packet loss can leave the video
decoder unable to decode an independently coded (I) frame, leading to the
loss of the entire sequence of video frames that depend on the I frame.
Instead, ideally, when one bit is in error, the effect on the reconstructed
video should be imperceptible, with minimal overhead. In addition, the video
quality should degrade gracefully and proportionally with decreasing channel
quality.

Compressive Video Streaming for WMSN
   Our preliminary investigation reveals that new cross-layer optimized
networking protocols integrated with video encoders based on the recently
proposed compressive sensing (CS) paradigm [4], [5] can offer a convincing
solution to the aforementioned problems. However, as will become clearer in
the following, this will require a careful rethinking of traditional wireless
networking functionalities across multiple layers. Compressed sensing (aka
“compressive sampling”) is a new paradigm that allows the recovery of signals
from far fewer measurements than methods based on Nyquist sampling. In
particular, the main result of CS is that an N-dimensional signal can be
reconstructed from M noise-like incoherent measurements as if one had
observed the M/log(N) most important coefficients in a suitable basis [6].
Hence, CS can offer an alternative to traditional video encoders by enabling
imaging systems that sense and compress data simultaneously at very low
computational complexity for the encoder. Image coding based on CS has been
recently explored [6], [7]. So-called single-pixel cameras that can operate
efficiently across a much broader spectral range (including infrared) than
conventional silicon-based cameras have also been studied [8]. However,
wireless networking protocols optimized for transmission of CS video, and
their statistical traffic characterization, are substantially unexplored
areas. In particular, in this position paper we show that CS-based image
representation has an inherent resiliency to random wireless channel errors
that should guide and inform protocol design optimized for wireless video
streaming in WMSNs.
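
The recovery claim at the heart of CS (a sparse N-dimensional signal rebuilt from M < N incoherent measurements) can be illustrated with a minimal sketch. It is not the encoder or decoder used in this work: it assumes a synthetic signal that is exactly K-sparse in the canonical basis, a dense Gaussian measurement matrix, and Orthogonal Matching Pursuit (OMP) as a stand-in reconstruction algorithm. The function names (`omp`, `cs_demo`) and the sizes N = 1024, M = 200, K = 5 are illustrative choices, not values from the paper. Only the Python standard library is used so that the sketch stays self-contained.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def solve(A, b):
    """Solve the small square system A z = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[r][j] -= f * aug[c][j]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][j] * z[j] for j in range(r + 1, n))
        z[r] = (aug[r][n] - tail) / aug[r][r]
    return z


def omp(cols, y, k):
    """Orthogonal Matching Pursuit (stand-in decoder, an assumption of this sketch).

    cols[j] is column j of the measurement matrix, y the measurement vector.
    Returns a dict {index: coefficient} with k entries.
    """
    residual = y[:]
    support, coef = [], []
    for _ in range(k):
        # Select the column most correlated with the current residual.
        j = max(range(len(cols)), key=lambda c: abs(dot(cols[c], residual)))
        support.append(j)
        # Least-squares fit on the selected columns via the normal equations.
        gram = [[dot(cols[a], cols[b]) for b in support] for a in support]
        rhs = [dot(cols[a], y) for a in support]
        coef = solve(gram, rhs)
        residual = [
            yi - sum(c * cols[s][i] for c, s in zip(coef, support))
            for i, yi in enumerate(y)
        ]
    return dict(zip(support, coef))


def cs_demo(n=1024, m=200, k=5, seed=1):
    """Encode a k-sparse signal with m random measurements and decode it again.

    Returns the largest absolute reconstruction error over all n entries.
    """
    rng = random.Random(seed)
    x = [0.0] * n
    support = rng.sample(range(n), k)
    for j in support:
        x[j] = rng.choice([-1.0, 1.0]) * rng.uniform(1.0, 2.0)
    # Encoder side: each measurement is a random linear combination of x.
    cols = [[rng.gauss(0.0, 1.0) for _ in range(m)] for _ in range(n)]
    y = [sum(cols[j][i] * x[j] for j in support) for i in range(m)]
    # Decoder side: sparse recovery from the m measurements only.
    estimate = omp(cols, y, k)
    x_hat = [estimate.get(j, 0.0) for j in range(n)]
    return max(abs(a - b) for a, b in zip(x, x_hat))


if __name__ == "__main__":
    print("max reconstruction error:", cs_demo())
```

Note where the work sits: the encoder computes only inner products, while the search for the sparse solution happens entirely at the decoder. This is the complex-decoder, simple-encoder balance argued for above. A real image is only approximately sparse, and only after a sparsifying transform, so practical reconstruction quality will be lower than in this idealized setting.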


http://www.comsoc.org/~mmc/                             23/30         Vol.4, No.9, October 2009




Fig. 1 Structural Similarity (SSIM) Index vs BER for (a) Reconstruction With and Without Incorrect
Samples (b) CS vs JPEG images (c) Adaptive Parity

Effect of Channel Errors on CS Video
   We conducted a preliminary investigation of the effect of channel errors
on wireless networked CS images and video. To assess the impact of channel
errors and interference on CS video quality, we evaluated the Structural
Similarity Index (SSIM) [9] between the original and the encoded image for a
standardized set of 25 images. We represented each frame of a quarter common
intermediate format (QCIF) video by 8-bit intensity values, i.e., a grayscale
bitmap. To satisfy the sparsity requirement of CS theory, the wavelet
transform is used as a sparsifying basis. The image is sampled using a
scrambled block Hadamard ensemble [10], and reconstructed through gradient
projection for sparse reconstruction (GPSR) [11]. In CS, the transmitted
samples constitute a random, incoherent combination of the original image
pixels. This means that, unlike traditional wireless imaging systems, in CS
no individual sample is more important for image reconstruction than any
other sample. Instead, the number of correctly received samples is the main
factor in determining the quality of the received image. Hence, a peculiar
characteristic of CS video is its inherent and fine-grained spatial
scalability. The video quality can be regulated at a much finer granularity
than with traditional video encoders, by simply varying the number of samples
per frame. Also, a small number of random channel errors does not affect the
perceptual quality of the received image at all, since, for moderate bit
error rates (BERs), the greater sparsity of the “correct” image will offset
the error caused by the incorrect bit. This is demonstrated in Fig. 1(a). For
any BER lower than 10^-4, there is no noticeable drop in the image quality.
For BERs below 10^-3, the SSIM is above 0.8, still an indicator of good image
quality. CS image representation is completely unstructured: this fact makes
CS video much more resilient than existing video coding schemes to random
channel errors. This simple fact has obvious, deep consequences on protocol
design for end-to-end wireless transport of CS video.
   This inherent resiliency of compressed sensing to random channel bit
errors is even more noticeable when compared to traditional compression
schemes. Figure 1(b) shows the average SSIM of 25 images transmitted through
a wireless channel with varying BER. The quality of CS-encoded images
degrades gracefully as the BER increases, and is still very high for BERs as
high as 10^-3. Instead, JPEG-encoded images deteriorate very quickly. This is
visually emphasized in Fig. 2, which shows a frame from a surveillance camera
at the University at Buffalo encoded with CS and JPEG and transmitted with
end-to-end bit error rates of 10^-5, 10^-4, and 10^-3, respectively. The
difference is stunning: the effect of bit errors is much more disruptive for
structured data like JPEG-encoded images. The effect on predictively-encoded
video is even worse, since even low bit error rates can lead to the loss of I
frames, causing the decoder to be unable to decode long sequences of frames
that depend on the I frame.
   Our preliminary investigation also reveals that while forward error
correction (FEC) is not beneficial for low to moderate values of BER up to
10^-2, the perceptual quality of CS images can be improved by dropping
errored samples that would contribute to image








Fig. 2 Surveillance Image with CS (above) and JPEG (below) for BER (a) 10^-5 (b) 10^-4 (c) 10^-3.



reconstruction with incorrect information (Fig. 1(c)). This calls for a data
protection strategy based on using even parity on a predefined number of
samples, which are all dropped at the receiver or at an intermediate node if
the parity check fails. This is particularly beneficial in situations when
the BER is still low, but too high to just ignore errors (above 10^-5). We
have analytically determined the optimal number of samples to be jointly
protected, for a given BER and quantization level, based on which the encoder
can adaptively regulate the level of protection to track the end-to-end BER.
This simple strategy is shown in Fig. 1(c) to considerably improve the
received video quality compared to protecting the CS samples through FEC with
different levels of protection using rate-compatible punctured codes (RCPC)
with rate-¼ mother codes.
   Based on the encouraging results of this preliminary investigation, we are
currently conducting a cross-layer analysis of the impact of functionalities
handled at all layers of the communication protocol stack on the perceived
video quality for competing CS-encoded video streams. In addition, we are
developing a tool to perform network simulations of CS video applications.
The tool will allow generating CS video traces and evaluating the impact of
CS video on network protocols over the widely used ns-2 simulator.

References
 [1] I. F. Akyildiz, T. Melodia, and K. R. Chowdhury, “A Survey on Wireless
     Multimedia Sensor Networks,” Computer Networks (Elsevier), vol. 51,
     no. 4, pp. 921-960, Mar. 2007.
 [2] B. Girod, A. Aaron, S. Rane, and D. Rebollo-Monedero, “Distributed
     Video Coding,” Proc. of the IEEE, vol. 93, no. 1, pp. 71-83, January
     2005.
 [3] A. Aaron, S. Rane, R. Zhang, and B. Girod, “Wyner-Ziv Coding for Video:
     Applications to Compression and Error Resilience,” in Proc. of IEEE
     Data Compression Conf. (DCC), Snowbird, UT, March 2003, pp. 93-102.
 [4] D. Donoho, “Compressed Sensing,” IEEE Transactions on Information
     Theory, vol. 52, no. 4, pp. 1289-1306, Apr. 2006.
 [5] E. Candes, J. Romberg, and T. Tao, “Robust uncertainty principles:
     exact signal reconstruction from highly incomplete frequency
     information,” IEEE Transactions on Information Theory, vol. 52, no. 2,
     pp. 489-509, Feb. 2006.
 [6] J. Romberg, “Imaging via Compressive Sampling,” IEEE Signal Processing
     Magazine, vol. 25, no. 2, pp. 14-20, 2008.
 [7] M. Wakin, J. Laska, M. Duarte, D. Baron, S. Sarvotham, D. Takhar, K.
     Kelly, and R. Baraniuk, “Compressive imaging for video representation
     and coding,” in Proc. Picture Coding



     Symposium (PCS), April 2006.
 [8] M. Duarte, M. Davenport, D. Takhar, J. Laska, T. Sun, K. Kelly, and R.
     Baraniuk, “Single-Pixel Imaging via Compressive Sampling,” IEEE Signal
     Processing Magazine, vol. 25, no. 2, pp. 83-91, 2008.
 [9] Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli, “Image quality
     assessment: from error visibility to structural similarity,” IEEE
     Transactions on Image Processing, vol. 13, no. 4, pp. 600-612, April
     2004.
[10] L. Gan, T. Do, and T. D. Tran, “Fast Compressive Imaging Using
     Scrambled Block Hadamard Ensemble,” Preprint, 2008.
[11] M. A. T. Figueiredo, R. D. Nowak, and S. J. Wright, “Gradient
     Projection for Sparse Reconstruction: Application to Compressed Sensing
     and Other Inverse Problems,” IEEE Journal of Selected Topics in Signal
     Processing, vol. 1, no. 4, pp. 586-598, 2007.

Tommaso Melodia is an Assistant Professor with the Department of Electrical
Engineering at the University at Buffalo, The State University of New York
(SUNY), where he directs the Wireless Networks and Embedded Systems
Laboratory. He received his Ph.D. in Electrical and Computer Engineering from
the Georgia Institute of Technology in 2007. He had previously received his
Laurea (integrated B.S. and M.S.) and Doctorate degrees in Telecommunications
Engineering from the University of Rome La Sapienza, Rome, Italy, in 2001 and
2005, respectively. He coauthored a paper that was recognized as the Fast
Breaking Paper in the field of Computer Science for February 2009 by Thomson
ISI Essential Science Indicators. He is an Associate Editor for the Computer
Networks (Elsevier) Journal, Transactions on Mobile Computing and
Applications (ICST) and for the Journal of Sensors (Hindawi). He serves in
the technical program committees of several leading conferences in wireless
communications and networking, including IEEE Infocom, ACM Mobicom, and ACM
Mobihoc. He was the technical co-chair of the Ad Hoc and Sensor Networks
Symposium for IEEE ICC 2009. His current research interests are in modeling
and optimization of multi-hop wireless networks, cross-layer design and
optimization, wireless multimedia sensor and actor networks, underwater
acoustic networks, and cognitive radio networks.

Scott Pudlewski is a PhD student in the Department of Electrical Engineering
at the University at Buffalo, The State University of New York (SUNY), in the
Wireless Networks and Embedded Systems Laboratory. He received his BS in
electrical engineering at the Rochester Institute of Technology in 2008. He
received the SAP America Scholarship in 2008. His current research interests
are in compressive sensing for wireless multimedia sensor networks and video
distortion based networking.



						