Shifted Codes
Sachin Agarwal
Deutsche Telekom A.G., Laboratories
Ernst-Reuter-Platz 7
10587 Berlin
Germany

Joint work with Andrew Hagedorn and Ari Trachtenberg at Boston University
Outline

1. Motivation & Problem Definition
2. Background
 a.   Rateless Codes
 b.   Digital Fountain Codes
3. Shifted Codes
 a.   Motivation – Inefficiency of LT codes
 b.   Construction of Shifted Codes
 c.   Analysis – Communication and Computation Complexity
4. Experimental Comparison
 a.   LT vs. Shifted Codes
 b.   Constrained Sensors – Deployment on TMotes
5. Discussion and Round-up

                    S. Agarwal, sachin.agarwal@telekom.de, January 2008   1
Partial Information

Transmission Channel with Erasures

[Figure: Transmitter (input symbols) → channel with erasures → Receiver (received symbols)]

Partial Information

Multiple receivers may have different erasures

Given multiple receivers that each hold partial information, how can all of them be updated to full information efficiently over a broadcast channel?

[Figure: one Transmitter broadcasting to Receiver 1, Receiver 2 and Receiver 3]
Partial Information
Another Example
Multiple mobile devices may have out-dated information
a. Mobile databases
b. Sensor network information aggregation
c. RSS updates for devices

[Figure: a Broadcaster holding the latest version of the information, surrounded by Mobile device 1, Mobile device 2 and Mobile device 3]
Problem Definition

Given an encoding host with k input symbols and a
decoding host with n out of the k input symbols, the goal
is to efficiently determine the remaining k-n input symbols
at the decoding host.
 - The encoding host has no information about which k-n input symbols are missing at the decoding host.
 - Different decoding hosts may be missing different input symbols.

Efficiency
1. Communication complexity – the information transmitted from the encoding host to the decoding host should be close in size to the missing k-n input symbols themselves.
2. Computational complexity – the algorithm must be computationally tractable.
Information Theoretic Lower Bound

Known Result
At a minimum, the encoding host would have to send only a little less than the exact contents of the missing input symbols to the decoding host:

$$C \ge (k-n)\,b - \frac{\lg(k-n)}{b}$$

k – Number of input symbols
n – Number of symbols known a priori at the decoding host
b – Field size of each symbol

Intuition
The decoding host is missing k-n input symbols. This is a special case of set reconciliation.
Rateless Codes

Definition
     “A class of erasure codes with the property that a potentially
     limitless sequence of encoding symbols can be generated from
     a given set of source symbols such that the original source
     symbols can be recovered from any subset of the encoding
     symbols of size equal to or only slightly larger than the number
     of source symbols. ”
                                                           Wikipedia.org
Examples
1.   Random Linear Codes
2.   LT Codes
3.   Raptor Codes
4.   Shifted Codes
5.   …

Rateless Codes - Encoding
Used for content distribution over error-prone channels
k input symbols: A, B, C
At least k encoded symbols, each the sum of a subset of the input symbols:
  1 = A+B
  2 = B
  3 = A+B+C
  4 = A+C

The edges between input symbols and encoded symbols are chosen at random, based on a probability density function.
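The encoding step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "+" on the slide is taken to be bitwise XOR, symbols are small integers, and `toy_dist` is a made-up degree distribution rather than the one used by any real code. The names `encode_symbol` and `toy_dist` are hypothetical.

```python
import random

def encode_symbol(source, degree_dist, rng):
    """Return (neighbour_indices, value) for one encoded symbol.

    source      -- list of integer input symbols ('+' is assumed to be XOR)
    degree_dist -- list of (degree, probability) pairs (illustrative input)
    rng         -- a random.Random instance
    """
    degrees = [d for d, _ in degree_dist]
    weights = [p for _, p in degree_dist]
    # Draw the degree, then draw that many distinct input symbols (the edges).
    d = rng.choices(degrees, weights=weights, k=1)[0]
    neighbours = sorted(rng.sample(range(len(source)), d))
    value = 0
    for i in neighbours:
        value ^= source[i]
    return neighbours, value

rng = random.Random(1)
source = [0x41, 0x42, 0x43]                  # the input symbols A, B, C
toy_dist = [(1, 0.3), (2, 0.5), (3, 0.2)]    # assumption, for illustration only
# A rateless encoder can keep producing symbols; four are generated here.
stream = [encode_symbol(source, toy_dist, rng) for _ in range(4)]
```

Because each encoded symbol carries its own neighbour list, the sender can keep generating symbols for as long as any receiver still needs them.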
Rateless Codes - Decoding
Used for content distribution over error-prone channels
k input symbols: A, B, C
At least k encoded symbols, forming a system of linear equations:
  1 = A+B
  2 = B
  3 = A+B+C
  4 = A+C

Solve using Gaussian elimination or belief propagation.

Irrespective of which encoded symbols are lost in the communication channel, as long as sufficient encoded symbols are received, the decoder can retrieve all k input symbols.
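A minimal sketch of the belief-propagation ("peeling") route, applied to the four example equations, might look as follows. It assumes "+" is XOR and that every encoded symbol arrives together with its neighbour list. The function name `peel_decode` and the optional `known` argument (input symbols already held by the decoder) are illustrative, not taken from the slides.

```python
def peel_decode(k, encoded, known=None):
    """Peeling decoder over XOR.

    k       -- number of input symbols
    encoded -- iterable of (neighbour_indices, value) pairs
    known   -- optional dict {index: value} of symbols known a priori
    Returns a dict of recovered symbols (fewer than k if decoding stalls).
    """
    recovered = dict(known or {})
    pending = [(set(nbrs), val) for nbrs, val in encoded]
    progress = True
    while progress and len(recovered) < k:
        progress = False
        remaining = []
        for nbrs, val in pending:
            # Remove every neighbour whose value is already recovered.
            for i in [i for i in nbrs if i in recovered]:
                nbrs.discard(i)
                val ^= recovered[i]
            if len(nbrs) == 1:
                # A degree-1 symbol releases one input symbol.
                (i,) = nbrs
                recovered[i] = val
                progress = True
            elif nbrs:
                remaining.append((nbrs, val))
            # An empty neighbour set means the symbol was redundant; drop it.
        pending = remaining
    return recovered

A, B, C = 0x41, 0x42, 0x43
received = [([0, 1], A ^ B), ([1], B), ([0, 1, 2], A ^ B ^ C), ([0, 2], A ^ C)]
decoded = peel_decode(3, received)   # recovers A, B and C
```

In this example the fourth equation ends up with no unknown neighbours by the time it is examined, so it contributes nothing.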
Decoding Using Belief Propagation

[Figure: the decoding host decodes k+ encoded symbols into the k input symbols; an encoded symbol whose input symbols are all already decoded is marked "Redundant!"]
Digital Fountain Codes
LT Codes
1. Class of rateless erasure codes invented by Michael Luby [1]
2. Computationally practical (as compared to Random Linear Codes)
3. Fast decoding algorithm based on belief propagation instead of Gaussian elimination
4. Form the inner code of Raptor Codes [3], which have linear decoding computational complexity
5. Designed for the case when no input symbols are available at the decoding host initially

Asymptotic Properties [2]
Expected number of encoded symbols required for successful decoding: $k + O(\sqrt{k}\,\ln^2 k)$
Expected decoding computational complexity: $O(k \ln k)$
k: number of input symbols

[1] Michael Luby, “LT codes,” in The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002, pp. 271–282.
[2] Assuming a constant probability of failure δ.
[3] Amin Shokrollahi, “Raptor codes,” IEEE Transactions on Information Theory, vol. 52, no. 6, 2006, pp. 2551–2567.
Digital Fountain Codes
LT Codes’ Robust Soliton Probability Distribution
Robust Soliton probability distribution $\mu_k$: the probability of an encoded symbol with degree d is $\mu_k(d)$.

Property: degree 1 symbols are released at a controlled, near-constant rate throughout the decoding process.

[Figure: log10(Probability) versus Degree (0 to 1000) for the LT code (Robust Soliton) distribution, with parameters k = 1000, c = 0.01, δ = 0.5]
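For reference, the robust soliton distribution can be computed directly from Luby's definition (ideal soliton plus a correction term with a spike near k/R, then normalised). The sketch below follows that definition. Rounding the spike position, clamping it to the range 1..k, and flooring a negative spike weight at zero are assumptions of this sketch for degenerate parameter choices.

```python
import math

def robust_soliton(k, c, delta):
    """Return a list mu with mu[d] = probability of degree d (mu[0] = 0)."""
    R = c * math.log(k / delta) * math.sqrt(k)
    # Position of the spike; clamped so that small k stays well defined.
    spike = max(1, min(int(round(k / R)), k))

    # Ideal soliton part: rho(1) = 1/k, rho(d) = 1 / (d (d - 1)).
    rho = [0.0] * (k + 1)
    rho[1] = 1.0 / k
    for d in range(2, k + 1):
        rho[d] = 1.0 / (d * (d - 1))

    # Correction part: tau(d) = R / (d k) below the spike, one spike term.
    tau = [0.0] * (k + 1)
    for d in range(1, spike):
        tau[d] = R / (d * k)
    tau[spike] = max(0.0, R * math.log(R / delta) / k)

    beta = sum(rho) + sum(tau)          # normalisation constant
    return [(rho[d] + tau[d]) / beta for d in range(k + 1)]

# Parameters quoted on the slide.
mu = robust_soliton(1000, 0.01, 0.5)
```

Plotting `math.log10(mu[d])` against d for d >= 1 should give a curve of the kind shown in the figure, with most of the mass at low degrees and a single spike.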
Inefficiency of LT Codes for our Problem

[Figure: the decoding host decodes k+ encoded symbols into the input symbols. Because n out of k input symbols are known a priori at the decoding host, many of the encoded symbols are redundant.]
Inefficiency of LT Codes for our Problem
The number of these redundant encoded symbols grows with the ratio of input symbols known at the decoder (n) to the total input symbols (k).

If n input symbols are known a priori, then an additional LT-encoded symbol will provide no new information to the decoding host with probability

$$\sum_{d=1}^{k} \mu_k(d) \prod_{i=0}^{d-1} \frac{n-i}{k-i}$$

…which quickly approaches 1 as n → k
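The redundancy probability can be evaluated numerically. The sketch below reads the formula as: a degree-d symbol is useless exactly when all d of its neighbours fall among the n known inputs, which happens with probability (n/k)((n-1)/(k-1))… over d factors. The helper `robust_soliton` is a compact restatement of Luby's distribution with clamping for degenerate cases, which is an assumption of this sketch.

```python
import math

def robust_soliton(k, c, delta):
    """mu[d] for d = 0..k (mu[0] = 0); degenerate cases are clamped."""
    R = c * math.log(k / delta) * math.sqrt(k)
    spike = max(1, min(int(round(k / R)), k))
    w = [0.0] * (k + 1)
    w[1] = 1.0 / k
    for d in range(2, k + 1):
        w[d] = 1.0 / (d * (d - 1))
    for d in range(1, spike):
        w[d] += R / (d * k)
    w[spike] += max(0.0, R * math.log(R / delta) / k)
    total = sum(w)
    return [x / total for x in w]

def redundancy_probability(mu, k, n):
    """Probability that a fresh encoded symbol touches only known inputs."""
    total = 0.0
    all_known = 1.0   # running product of (n - i) / (k - i) for i < d
    for d in range(1, k + 1):
        all_known *= (n - (d - 1)) / (k - (d - 1))
        if all_known <= 0.0:
            break     # degrees above n can never be fully covered
        total += mu[d] * all_known
    return total

k = 1000
mu = robust_soliton(k, 0.01, 0.5)
probs = {n: redundancy_probability(mu, k, n) for n in (0, 500, 900, 990, 1000)}
```

Inspecting `probs` shows the value rising from 0 at n = 0 to 1 at n = k.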
Intuitive Fix
n known input symbols serve the function of degree 1 encoded symbols, disproportionately skewing the degree distribution for LT encoding.

We thus propose to shift the Robust Soliton distribution to the right in order to compensate for the additional, functionally degree 1, symbols.

Questions
1) How?
2) By how much?

[Figure: log10(Probability) versus Degree (0 to 1000) for the LT code (Robust Soliton) distribution]
Shifted Code Construction
Definition
The shifted robust soliton distribution is given by

$$\gamma_{k,n}(j) = \sum_{i\,:\,\operatorname{round}\left(\frac{i}{1-n/k}\right) = j} \mu_{k-n}(i)$$

with $\gamma_{k,n}(j) = 0$ when no degree i maps to j.

Intuition
n known input symbols at the decoding host reduce the degree of each encoding symbol by an expected factor of

$$\frac{1}{1-\frac{n}{k}}$$

That is, an encoded symbol of degree d keeps about d(1 − n/k) unknown neighbours once the known input symbols are removed.
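One way to read the definition is as a re-binning of the robust soliton distribution for k − n symbols: each degree i is moved to round(i / (1 − n/k)), and degrees that nothing maps to keep probability 0. The sketch below implements that reading. The name `shifted_distribution`, the use of Python's built-in `round` (which rounds halves to even), and the clamping inside `robust_soliton` are all assumptions of this sketch.

```python
import math

def robust_soliton(k, c, delta):
    """mu[d] for d = 0..k (mu[0] = 0); degenerate cases are clamped."""
    R = c * math.log(k / delta) * math.sqrt(k)
    spike = max(1, min(int(round(k / R)), k))
    w = [0.0] * (k + 1)
    w[1] = 1.0 / k
    for d in range(2, k + 1):
        w[d] = 1.0 / (d * (d - 1))
    for d in range(1, spike):
        w[d] += R / (d * k)
    w[spike] += max(0.0, R * math.log(R / delta) / k)
    total = sum(w)
    return [x / total for x in w]

def shifted_distribution(k, n, c, delta):
    """Shifted robust soliton: list g with g[j] = probability of degree j."""
    base = robust_soliton(k - n, c, delta)     # distribution for k - n symbols
    scale = 1.0 - n / k
    shifted = [0.0] * (k + 1)
    for i in range(1, k - n + 1):
        j = min(k, int(round(i / scale)))      # guard against float overshoot
        shifted[j] += base[i]
    return shifted

# Parameters from the slides: k = 1000, n = 900, c = 0.01, delta = 0.5.
g = shifted_distribution(1000, 900, 0.01, 0.5)
```

With n = 900 the scale factor is 10, so in this sketch only degrees that are multiples of 10 receive any probability.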
Shifted Code Distribution

[Figure: log10(Probability) versus Degree (0 to 1000) for the LT Code (Robust Soliton) and the Shifted Code]

LT code distribution and proposed Shifted code distribution, with parameters k = 1000, c = 0.01, δ = 0.5. The number of known input symbols at the decoding host is set to n = 900 for the Shifted code distribution. Under the shifted code distribution, encoded symbols of some degrees occur with probability 0.
Shifted Code – Communication Complexity
Lemma IV.2
A decoder that knows n of k input symbols needs

$$m = (k-n) + O\!\left(\sqrt{k-n}\,\ln^2\frac{k-n}{\delta}\right)$$

encoding symbols under the shifted distribution to decode all k input symbols with probability at least 1−δ.

Proof
After the n known input symbols are removed from the decoding graph, the encoded symbols comprise only the k-n unknown input symbols. The expression follows from Luby’s analysis.
Shifted Code – Average Degree of
Encoded Symbol
Lemma IV.3
The average degree of an encoding node under the $\gamma_{k,n}$ distribution is given by

$$O\!\left(\frac{k}{k-n}\,\ln(k-n)\right)$$

Proof
The proof follows from the definitions, since a node with degree d in the $\mu_{k-n}$ distribution will correspond to a node with degree roughly

$$\frac{d}{1-\frac{n}{k}}$$

in the shifted code distribution. From Luby’s analysis, the expression for the average degree of an LT encoded symbol over k input symbols is $O(\ln k)$, which becomes $O(\ln(k-n))$ for the k-n unknown input symbols.
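Written out, the scaling argument in the proof runs roughly as follows. This is a sketch that ignores the rounding in the definition; it writes $\gamma_{k,n}$ for the shifted distribution and $\mu_{k-n}$ for the robust soliton distribution over k − n symbols, and that notation is an assumption here.

```latex
\begin{align*}
\mathbb{E}_{\gamma_{k,n}}[\text{degree}]
  &\approx \sum_{d} \frac{d}{1 - n/k}\,\mu_{k-n}(d)
   = \frac{k}{k-n} \sum_{d} d\,\mu_{k-n}(d) \\
  &= \frac{k}{k-n}\cdot O\bigl(\ln(k-n)\bigr)
   = O\!\left(\frac{k}{k-n}\,\ln(k-n)\right).
\end{align*}
```

The first step uses the degree map d → d / (1 − n/k), and the second uses Luby's O(ln k) bound with k replaced by k − n.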
Shifted Codes – Computational Complexity
Lemma IV.4*
For a fixed δ, the expected number of edges R removed from the decoding graph upon knowledge of n input symbols at the decoding host is given by

$$R = O(n \ln(k-n))$$

Theorem IV.5
For a fixed probability of decoding failure δ, the number of operations needed to decode using a shifted code is

$$O(k \ln(k-n))$$

Proof
Sum Lemma IV.4 and the computational complexity of (LT) decoding for the unknown k-n input symbols.

*Proof described in: S. Agarwal, A. Hagedorn and A. Trachtenberg, “Rateless Codes Under Partial Information”, Information Theory and Applications Workshop, UCSD, San Diego, 2008
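The summation in the proof of Theorem IV.5 can be spelled out as below. It is a sketch: the second term assumes that LT decoding of the k − n unknown symbols costs O((k − n) ln(k − n)), i.e. Luby's O(k ln k) bound with k replaced by k − n.

```latex
\begin{align*}
\underbrace{O\bigl(n \ln(k-n)\bigr)}_{\text{edges removed (Lemma IV.4)}}
 \;+\;
\underbrace{O\bigl((k-n)\ln(k-n)\bigr)}_{\text{LT decoding of } k-n \text{ symbols}}
 \;=\; O\bigl((n + k - n)\ln(k-n)\bigr)
 \;=\; O\bigl(k \ln(k-n)\bigr).
\end{align*}
```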
Experimental Comparison
LT Codes vs. Shifted Codes
Benefit
For k = 1000, n = 900, the decoding host needs to download about 700 encoded symbols using conventional LT codes. Using shifted codes, only about 180 encoded symbols are required.

[Figure: Required encoding symbols at the decoding host (0 to 1200) versus known input symbols n (0 to 1000), for LT ("Without Invention") and Shifted Code ("With Invention")]
Y-axis: number of encoded symbols required at the mobile device to obtain the whole data-set
X-axis: number of input symbols n available a priori at the mobile device

The experiment was repeated 100 times and the error bars of the standard deviation are also plotted in the graph.
Experimental Comparison
Constrained Sensors – Deployment on TMotes

[Figure, left panel: Total time to encode (measure of computational complexity). Time to Encode (s), 0 to 3, versus Number of Input Symbols, 100 to 400, for LT (Robust Soliton) and the Shifted Code distribution]
[Figure, right panel: Total time to decode (measure of computational complexity). Time to Decode (s), 0 to 12, versus Number of Input Symbols, 100 to 400, for LT (Robust Soliton) and the Shifted Code distribution]
More Data: Communication Savings
[Figure: Required encoded symbols for successful decoding (0 to 1200) versus n, the number of known input symbols at the decoding host, for LT Robust Soliton and Shifted Code; k=1000 input symbols, 20 randomized trials]
More Data: Communication Savings
[Figure: Encoded symbols required, normalized with LT-RS (0 to 1), versus n, the number of known input symbols at the decoding host (100 to 900), for LT Robust Soliton and Shifted Code; k=1000 input symbols, 20 randomized trials]
More Data: Time Savings, Normalized
[Figure: Time taken to decode, normalized with LT-RS (0 to 1), versus n, the number of known input symbols at the decoding host (100 to 900), for LT Robust Soliton and Shifted Code; k=1000 input symbols, 20 randomized trials]
Distribution Shifting
When the estimate of n at the encoding host is not accurate, the shift is spread over the candidate values θ of n, each weighted by p(θ):

$$\gamma_p(j) = \sum_{\theta=0}^{k-1} p(\theta) \sum_{i\,:\,\operatorname{round}\left(\frac{i}{1-\theta/k}\right) = j} \mu_{k-\theta}(i)$$

The Theta distribution shifting decodes input symbols much more quickly than the standard LT codes.
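A hedged sketch of this mixture: if the encoder holds a belief p(θ) over how many input symbols the decoder already knows, the degree distribution becomes a p-weighted average of shifted distributions. Treating p as a probability distribution over θ is an interpretation of the formula, and the helper names and the example belief are illustrative only.

```python
import math

def robust_soliton(k, c, delta):
    """mu[d] for d = 0..k (mu[0] = 0); degenerate cases are clamped."""
    R = c * math.log(k / delta) * math.sqrt(k)
    spike = max(1, min(int(round(k / R)), k))
    w = [0.0] * (k + 1)
    w[1] = 1.0 / k
    for d in range(2, k + 1):
        w[d] = 1.0 / (d * (d - 1))
    for d in range(1, spike):
        w[d] += R / (d * k)
    w[spike] += max(0.0, R * math.log(R / delta) / k)
    total = sum(w)
    return [x / total for x in w]

def shifted(k, n, c, delta):
    """Shifted robust soliton distribution for n known symbols."""
    base = robust_soliton(k - n, c, delta)
    out = [0.0] * (k + 1)
    for i in range(1, k - n + 1):
        j = min(k, int(round(i / (1.0 - n / k))))
        out[j] += base[i]
    return out

def mixture(k, belief, c, delta):
    """belief: dict {theta: p(theta)}, 0 <= theta <= k - 1, weights sum to 1."""
    out = [0.0] * (k + 1)
    for theta, p in belief.items():
        for j, q in enumerate(shifted(k, theta, c, delta)):
            out[j] += p * q
    return out

# Example belief: n is probably 900, but might be 850 or 950 (made-up weights).
mix = mixture(1000, {850: 0.25, 900: 0.5, 950: 0.25}, 0.01, 0.5)
```

A belief concentrated on a single θ reduces to the plain shifted distribution for n = θ.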
Many Applications
1.        Broadcasting coded updates to synchronize databases
2.        Adapting LT codes when partial information has been delivered
     a.    Continuous shifting of the distribution
     b.    Using the partial information in case of unsuccessful decoding (when only some of the input
           symbols were decoded)
3.        Efficient erasure correction when channel characteristics are already known
     a.    For example, input symbols can be first sent as plain-text, and then depending on the estimate of
           number of lost input symbols, shifted-coded symbols can be transmitted
4.        Heterogeneous channel data delivery
5.        Application in gossip protocols, particularly in later iterations
6.        Sensor networks - data aggregation, routing information, etc.
7.        Restoring storage media that are partially erased
…


Conclusions & Future Work
Conclusions
a. Generalization of LT codes to the case where some of the input symbols are already available at the decoding host
b. Many applications

Future Work
a. By adopting Raptor Code concepts (an outer pre-code), Shifted codes can be made more efficient
b. Analytical expressions for Distribution Shifting
c. Application-specific shifted code design
d. “Shifting” other rateless codes
Further Reading
1.   S. Agarwal, A. Hagedorn and A. Trachtenberg, “Rateless Codes Under Partial Information”,
     Information Theory and Applications Workshop, UCSD, San Diego, 2008
2.   S. Agarwal (Deutsche Telekom A.G.), “Method and System for Constructing and Decoding
     Rateless Codes with Partial Information”, European Patent Application EP 07 023 243.4
3.   Michael Luby, “LT codes,” in The 43rd Annual IEEE Symposium on Foundations of Computer
     Science, 2002, pp. 271–282.
4.   Amin Shokrollahi, “Raptor codes,” IEEE Transactions on Information Theory, vol. 52, no. 6,
     2006, pp. 2551–2567.



