Network Coding Theory

					      Network Coding for
      Error Correction and Security


                 Raymond W. Yeung
                 The Chinese University of Hong Kong




Raymond W. Yeung, CUHK           ACoRN Network Coding School   December 1-2, 2008
                               Outline
      •        Introduction
      •        Network Coding vs Algebraic Coding
      •        Network Error Correction
      •        Secure Network Coding
      •        Applications of Random Network Coding in P2P
      •        Concluding Remarks




                         Introduction




                     Butterfly Network I
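     [Figure: two copies of the butterfly network with source s, intermediate
     nodes 1, 2, 3, 4 and sinks t1, t2. In both, s sends b1 to node 1 and b2 to
     node 2; node 1 forwards b1 to node 3 and to t1; node 2 forwards b2 to
     node 3 and to t2.
     Left (routing only): channel (3,4) carries both b1 and b2; node 4 sends
     b2 to t1 and b1 to t2.
     Right (network coding): channel (3,4) carries b1+b2, which node 4
     forwards to both t1 and t2.]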


         Advantage of Network Coding

   • 9 instead of 10 bits need to be transmitted
     in order to multicast 2 bits.
   • If only 1 bit can be transmitted from node 3
     to node 4, only 1.5 bits on the average can
     be multicast.
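The coding gain above can be checked with a minimal sketch of one use of the butterfly network over GF(2), where "+" is XOR. The function name and the channel count constant are illustrative assumptions, not part of the original slides.

```python
from itertools import product


def butterfly_multicast(b1: int, b2: int):
    """Simulate one use of the butterfly network with coding at node 3."""
    # Node 3 receives b1 (from node 1) and b2 (from node 2) and sends their XOR.
    coded = b1 ^ b2
    # Sink t1 gets b1 directly from node 1 and the coded bit via node 4.
    t1 = (b1, b1 ^ coded)
    # Sink t2 gets b2 directly from node 2 and the coded bit via node 4.
    t2 = (coded ^ b2, b2)
    return t1, t2


# One bit on each of the 9 channels:
# (s,1), (s,2), (1,3), (2,3), (1,t1), (2,t2), (3,4), (4,t1), (4,t2).
BITS_WITH_CODING = 9

# Both sinks recover (b1, b2) for every input pair.
for b1, b2 in product((0, 1), repeat=2):
    assert butterfly_multicast(b1, b2) == ((b1, b2), (b1, b2))
```

Without coding, channel (3,4) would have to be used twice, which is where the tenth bit comes from.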




                     Butterfly Network II
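     [Figure: Butterfly Network II. Left: nodes 1 and 2 hold b1 and b2 and
     send them to node 3; node 3 sends b1+b2 to node 4, which forwards it to
     t1 and t2; t1 also receives b1 and t2 also receives b2. Right: nodes 3
     and 4 redrawn with nodes t1' and t2'.]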




     Wireless and Satellite Communications

   • A node is a broadcast node by the nature
     of wireless/satellite communication.
   • At a node
         1. all the output channels have the same
            capacity;
         2. the same symbol is sent on each of the
            output channels.



     A Graph-Theoretic Representation




           Butterfly Network II Revisited




     [Figure: Butterfly Network II redrawn, with nodes t1' and t2'.]




                   Wireless/Satellite Application
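     [Figure: two users, one holding b1 and the other holding b2, exchange
     their bits over three time slots.]
              t=1:  b1 is transmitted.
              t=2:  b2 is transmitted.
              t=3:  b1+b2 is broadcast to both users.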

                  50% saving for downlink bandwidth!

               A Network Coding Example

                         with Two Sources




               A Network Coding Example
                     with Two Sources


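     [Figure: a network with two sources, one generating b1 and the other b2.
     Left (routing only): the middle channel carries both b1 and b2.
     Right (network coding): the middle channel carries b1+b2.]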




               Two Themes of Network Coding

     • When there is 1 source to be multicast in a
       network, store-and-forward may fail to
       optimize bandwidth.
     • When there are 2 or more independent
       sources to be transmitted in a network
       (even for unicast), store-and-forward may
       fail to optimize bandwidth.
     In short, Information is NOT a commodity!

        Single-Source Linear Network Coding
                         Acyclic Networks




         Model of a Point-to-Point Network

      • A directed network is represented by G =
        (V,E) with node set V and edge (channel)
        set E.
      • A symbol from an alphabet F can be
        transmitted on each channel.
      • There can be multiple edges between a
        pair of nodes.
      • The source node is denoted by s.

                         Some Terminology

   • A pair of channels (d,e) is called an
     adjacent pair if e immediately follows d.
   • A directed network G is cyclic if it contains
     a directed cycle, otherwise G is acyclic.
   • The value of a maximum flow between the
     source node and a collection of nodes T is
     denoted by maxflow(T).


                         Acyclic Networks
   • If G is an acyclic network, then the nodes can
     be ordered such that if there is an edge from
     node i to node j, then node i is before node j.
   • Such an order is called an upstream-to-downstream
     order (ancestral order).
   • When the nodes encode according to an
     upstream-to-downstream order, whenever a
     node encodes, all the information needed
     would have already been received on the input
     channels of that node.

                                   Example
           [Figure: the butterfly network with nodes s, 1, 2, 3, 4, t1, t2.]

           An upstream-to-downstream order:
           s, 2, 1, 3, 4, t2, t1
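An upstream-to-downstream order is a topological order of the directed acyclic graph. The sketch below computes one for the butterfly network with Python's standard `graphlib` module and checks the order quoted on the slide; the edge list and helper names are illustrative assumptions.

```python
from graphlib import TopologicalSorter

# Edges of the butterfly network, written as (tail, head).
EDGES = [
    ("s", "1"), ("s", "2"),
    ("1", "3"), ("2", "3"),
    ("1", "t1"), ("2", "t2"),
    ("3", "4"), ("4", "t1"), ("4", "t2"),
]


def upstream_to_downstream(edges):
    """Return one ordering in which every edge goes from an earlier to a later node."""
    sorter = TopologicalSorter()
    for tail, head in edges:
        # add(node, *predecessors): the tail must come before the head.
        sorter.add(head, tail)
    # static_order() raises graphlib.CycleError if the network is cyclic.
    return list(sorter.static_order())


def is_valid_order(order, edges):
    """Check that node i is before node j for every edge (i, j)."""
    position = {node: index for index, node in enumerate(order)}
    return all(position[tail] < position[head] for tail, head in edges)


order = upstream_to_downstream(EDGES)
```

Such an order is generally not unique: both `order` and the sequence s, 2, 1, 3, 4, t2, t1 pass `is_valid_order`.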


                         The Max-Flow Bound
      • The source node s generates an information
        vector
                       x = (x1 x2 … xω)  Fω.
      • What is the condition for a node t to be able to
        receive the information vector x?
      • Max-Flow Bound. If maxflow(t) < ω, then node t
        cannot possibly receive x.




                         The Basic Results
   • If network coding is allowed, a node t can receive the
     information vector x iff
                             maxflow(t) ≥ ω,
     i.e., the max-flow bound can be achieved simultaneously by
     all such nodes t. (Ahlswede et al. 00)
   • Moreover, this can be achieved by linear network coding for
     a sufficiently large base field. (Li, Yeung and Cai; Koetter
     and Medard, 03)
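To make the condition maxflow(t) ≥ ω concrete, here is a minimal sketch that computes maxflow(t) on the butterfly network, assuming every channel carries one symbol per use. The Edmonds-Karp routine and the names `maxflow` and `BUTTERFLY` are illustrative choices, not something prescribed by the slides.

```python
from collections import deque

BUTTERFLY = [
    ("s", "1"), ("s", "2"), ("1", "3"), ("2", "3"),
    ("1", "t1"), ("2", "t2"), ("3", "4"), ("4", "t1"), ("4", "t2"),
]


def maxflow(edges, source, sink):
    """Edmonds-Karp on a directed multigraph whose edges each carry one symbol."""
    if source == sink:
        raise ValueError("source and sink must differ")
    capacity, neighbours = {}, {}
    for u, v in edges:
        capacity[(u, v)] = capacity.get((u, v), 0) + 1  # parallel edges add up
        capacity.setdefault((v, u), 0)                  # residual (reverse) arc
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in neighbours.get(u, ()):
                if v not in parent and capacity[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Capacities are integers, so one unit can always be pushed along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            capacity[(u, v)] -= 1
            capacity[(v, u)] += 1
            v = u
        flow += 1


OMEGA = 2  # dimension of the information vector x = (b1, b2)
can_receive = {t: maxflow(BUTTERFLY, "s", t) >= OMEGA for t in ("t1", "t2", "3", "4")}
```

With ω = 2, the sinks t1 and t2 (and node 3) meet the bound, while node 4 has maxflow 1 and cannot receive x.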




                     Linear Network Coding




       Description of a Linear Network Code

   Two equivalent descriptions
   • Local Encoding Kernels
         – specify the “gain” from channel d to channel e
         – local encoding kernel for adjacent pair (d,e)
           denoted by k_{d,e}
         – local encoding kernel at node t denoted by K_t
   • Global Encoding Kernels
         – specify the linear transformation between the
           message vector and symbol transmitted on each
           channel
         – global encoding kernel for channel e denoted by f_e

                                                Example

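     [Figure: the butterfly network with nodes s, t, u, w, x, y, z. Source s
     sends b1 to t and b2 to u; t and u forward b1 and b2 to w; w sends b1+b2
     to x; x forwards b1+b2 to both y and z; u also sends b2 to z.]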


    Constraints on Global Encoding Kernels

   • ω imaginary channels are installed at the
     source node s.
   • The vectors f_e for the ω imaginary
     channels form the standard basis of F^ω.
   • f_e = Σ_{d ∈ In(t)} k_{d,e} f_d   for e ∈ Out(t)



                Max-Flow Bound for Linear
                          Network Codes
       • V_T = ⟨{ f_e : e ∈ In(T) }⟩

       • Max-Flow Bound.
                                dim(V_T) ≤ min{ω, maxflow(T)}



                         Desirable Properties of a
                          Linear Network Code
   • Linear Multicast.
     dim(V_t) = ω for every non-source node t with maxflow(t) ≥ ω
   • Linear Broadcast.
     dim(V_t) = min{ω, maxflow(t)} for every non-source node t
   • Linear Dispersion.
     dim(V_T) = min{ω, maxflow(T)} for every collection T of
     non-source nodes
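These properties can be checked mechanically on a small code. The sketch below computes dim(V_t) over GF(2) for the butterfly code with ω = 2; the kernel table, the max-flow values (filled in by inspection) and the helper `rank_gf2` are assumptions made for the illustration.

```python
def rank_gf2(vectors):
    """Rank over GF(2) of a list of 0/1 tuples, via elimination on bitmasks."""
    pivots = {}  # leading-bit position -> basis vector with that leading bit
    for v in vectors:
        x = int("".join(str(bit) for bit in v), 2)
        while x:
            lead = x.bit_length()
            if lead not in pivots:
                pivots[lead] = x
                break
            x ^= pivots[lead]  # clears the leading bit of x
    return len(pivots)


OMEGA = 2
# Global encoding kernels on the input channels of each non-source node.
IN_KERNELS = {
    "1": [(1, 0)],
    "2": [(0, 1)],
    "3": [(1, 0), (0, 1)],
    "4": [(1, 1)],
    "t1": [(1, 0), (1, 1)],
    "t2": [(0, 1), (1, 1)],
}
# maxflow(t) for each node, read off the network by inspection.
MAXFLOW = {"1": 1, "2": 1, "3": 2, "4": 1, "t1": 2, "t2": 2}

DIM = {t: rank_gf2(kernels) for t, kernels in IN_KERNELS.items()}
is_multicast = all(DIM[t] == OMEGA for t in DIM if MAXFLOW[t] >= OMEGA)
is_broadcast = all(DIM[t] == min(OMEGA, MAXFLOW[t]) for t in DIM)
```

For this particular code both flags come out true, so it is a linear broadcast as well as a linear multicast; checking linear dispersion would additionally require maxflow(T) for collections T of nodes.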



                         Implementation of Linear
                             Network Codes
   • The global encoding kernels must be
     known at a sink node; they can be
     delivered on the fly (Ho et al. 2003).
   • Consider using the same linear network
     code n > ω times.
   • Send an “identity matrix” through during the
     first ω time slots.


                         Implementation of Linear
                             Network Codes

       \begin{bmatrix} m_1 \\ m_2 \\ \vdots \\ m_\omega \end{bmatrix} = I

       \begin{bmatrix} m_1 \\ \vdots \\ m_\omega \\ x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} F_{In(t)}
         = \begin{bmatrix} I \\ x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} F_{In(t)}
         = \begin{bmatrix} F_{In(t)} \\ x_1 F_{In(t)} \\ x_2 F_{In(t)} \\ \vdots \\ x_n F_{In(t)} \end{bmatrix}
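The identity-matrix trick can be sketched as follows: the sink reads F_In(t) off the first ω received rows and inverts it to decode the remaining rows. This is a minimal illustration over a prime field GF(257) with a square, invertible F_In(t); the field size, the matrix F and the messages are made-up values.

```python
P = 257  # assumed prime field size


def mat_mul(A, B, p=P):
    """Matrix product over GF(p)."""
    return [[sum(a * b for a, b in zip(row, col)) % p for col in zip(*B)] for row in A]


def mat_inv(M, p=P):
    """Invert a square matrix over GF(p) by Gauss-Jordan elimination."""
    n = len(M)
    aug = [[x % p for x in row] + [int(i == j) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col]), None)
        if pivot is None:
            raise ValueError("matrix is singular over GF(p)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, p)  # modular inverse (Python 3.8+)
        aug[col] = [x * inv % p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                factor = aug[r][col]
                aug[r] = [(x - factor * y) % p for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


OMEGA = 2
F = [[3, 5], [2, 7]]  # F_In(t): determined by the network, unknown to the sink
pilot = [[int(i == j) for j in range(OMEGA)] for i in range(OMEGA)]  # identity matrix
messages = [[10, 20], [30, 40], [250, 1]]                            # x_1, ..., x_n

# Each row is sent in one time slot; the sink observes (row) * F_In(t).
received = mat_mul(pilot + messages, F)

F_learned = received[:OMEGA]  # first omega rows equal I * F_In(t) = F_In(t)
decoded = mat_mul(received[OMEGA:], mat_inv(F_learned))
```

The overhead is ω time slots out of n + ω, which becomes negligible when the same code is reused many times.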



             Existence and Construction




                         Polynomial Equation
   Lemma (Koetter-Medard 03)
   Let g(z_1, z_2, …, z_n) be a nonzero polynomial
     with coefficients in a field F. If |F| is greater
     than the degree of g in every z_j, then there
     exist a_1, a_2, …, a_n ∈ F such that
                   g(a_1, a_2, …, a_n) ≠ 0
   Corollary. If |F| is large, by choosing a_1, a_2, …,
     a_n randomly from F,
               Pr{g(a_1, a_2, …, a_n) ≠ 0} ≈ 1
      The Butterfly Network Revisited




      The Butterfly Network Revisited

   • Consider a general linear network code on
     the network by letting the local encoding
     kernels a, b, …, l ∈ F be indeterminates.
   • Clearly,
            det(F_{In(w)}) det(F_{In(y)}) det(F_{In(z)}) ≠ 0 ∈ F[a, b, …, l]
   • By the previous lemma, there exist a, b,
     …, l ∈ F s.t. det(F_{In(w)}) det(F_{In(y)}) det(F_{In(z)})
     evaluates to a non-zero value in F.
      The Butterfly Network Revisited

   • This implies that det(F_{In(w)}), det(F_{In(y)}), det(F_{In(z)})
     all evaluate to non-zero values in F
     simultaneously.
   • In our previous solution,
          b = c = 0, others = 1
          The linear network code is a linear multicast.
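The argument can be tried numerically. The sketch below evaluates the three determinants for given local encoding kernels and estimates how often a random choice from GF(p) makes all of them non-zero. The slides do not show which kernel carries which letter, so the assignment used here (K_s = [[a, c], [b, d]], K_t = [e f], K_u = [g h], K_w = [i j]^T, K_x = [k l]) is an assumption, chosen so that b = c = 0 with all others 1 reproduces the earlier solution.

```python
import random

NAMES = "abcdefghijkl"


def butterfly_determinants(kernels, p):
    """Return det(F_In(w)), det(F_In(y)), det(F_In(z)) over GF(p)."""
    a, b, c, d, e, f, g, h, i, j, k, l = (kernels[name] % p for name in NAMES)

    def scale(coef, vec):
        return tuple(coef * x % p for x in vec)

    def add(u, v):
        return tuple((x + y) % p for x, y in zip(u, v))

    def det(u, v):
        return (u[0] * v[1] - u[1] * v[0]) % p

    f_st, f_su = (a, b), (c, d)                    # channels leaving s
    f_tw, f_ty = scale(e, f_st), scale(f, f_st)    # channels leaving t
    f_uw, f_uz = scale(g, f_su), scale(h, f_su)    # channels leaving u
    f_wx = add(scale(i, f_tw), scale(j, f_uw))     # channel (w, x)
    f_xy, f_xz = scale(k, f_wx), scale(l, f_wx)    # channels leaving x
    return det(f_tw, f_uw), det(f_ty, f_xy), det(f_uz, f_xz)


def success_rate(p, trials=2000, seed=0):
    """Fraction of random kernel choices for which all three determinants are non-zero."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        kernels = {name: rng.randrange(p) for name in NAMES}
        hits += all(butterfly_determinants(kernels, p))
    return hits / trials


previous = {name: 1 for name in NAMES}
previous["b"] = previous["c"] = 0
```

Comparing `success_rate(2)` with `success_rate(257)` illustrates the corollary: random kernels rarely work over GF(2) but almost always work over a larger field.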



           Existence of Linear Multicast

   Theorem (Koetter-Medard 03)
   There exists a linear multicast on an acyclic
    network for sufficiently large base field F.




                         Global Encoding Kernels of
                           a Linear Network Code
      • Recall that x = (x_1 x_2 … x_k) is the multicast message.
      • For each channel e, assign a column vector f_e such that the
        symbol sent on channel e is x f_e. The vector f_e is called the
        global encoding kernel of channel e.
      • The global encoding kernel of a channel is analogous to a
        column in the generator matrix of a classical block code.
      • The global encoding kernel of an output channel at a node must
        be a linear combination of the global encoding kernels of the
        input channels.




                          An Example

                         k = 2, let x = (b1, b2)




     [Figure: the butterfly network with each channel labelled by its global
     encoding kernel (a column vector) and the symbol it carries:
       channels carrying b1:      (1 0)^T
       channels carrying b2:      (0 1)^T
       channels carrying b1+b2:   (1 1)^T]
                         Network Coding vs
                          Algebraic Coding



                         A Linear Multicast
      • A message of k symbols from a base field F is generated
        at the source node s.
      • A k-dimensional linear multicast has the following
        property: A non-source node t can decode the message
        correctly if and only if
                              maxflow(t) ≥ k.
      • By the Max-flow bound, this is also a necessary condition
        for a node t to decode (for any given network code).
      • Thus the tightness of the Max-flow bound is achieved by
        a linear multicast, which always exists for sufficiently
        large base fields.

                         An (n,k) Code with dmin = d

      • Consider an (n,k) classical block code with
        minimum distance d.
      • Regard it as a network code on an (n choose n-d+1)
        combination network.
      • Since the (n,k) code can correct d-1 erasures, all
        the nodes at the bottom can decode.




                The (n choose n-d+1) Combination Network

     [Figure: source node s at the top with n outgoing channels; each node at
     the bottom has n-d+1 incoming channels.]



      • For the nodes at the bottom,
                            maxflow(t) = n-d+1.
      • By the Max-flow bound,
                          k ≤ maxflow(t) = n-d+1
        or d ≤ n-k+1, the Singleton bound.
      • Therefore, the Singleton bound is a special case of the
        Max-flow bound for network coding.
      • An MDS code is a classical block code that achieves
        tightness of the Singleton bound.
      • Since a linear multicast achieves tightness of the Max-
        flow bound, it is formally a network generalization of an
        MDS code.
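The correspondence can be checked on a toy MDS code. The sketch below uses a Reed-Solomon-style (4,2) code over GF(7), an assumed example not taken from the slides, and verifies that every bottom node of the combination network, which sees n-d+1 of the n coded symbols, can decode.

```python
from itertools import combinations, product

Q, N, K = 7, 4, 2      # assumed field size and code parameters
POINTS = range(N)      # distinct evaluation points 0, 1, 2, 3 in GF(7)


def encode(message):
    """Evaluate the polynomial m0 + m1*x at each point (Reed-Solomon style)."""
    return tuple(sum(m * pow(x, i, Q) for i, m in enumerate(message)) % Q
                 for x in POINTS)


def hamming_distance(u, v):
    return sum(a != b for a, b in zip(u, v))


CODEBOOK = {msg: encode(msg) for msg in product(range(Q), repeat=K)}
D_MIN = min(hamming_distance(u, v) for u, v in combinations(CODEBOOK.values(), 2))


def sink_can_decode(channels):
    """A bottom node decodes iff distinct messages give distinct observations."""
    seen = {tuple(codeword[i] for i in channels) for codeword in CODEBOOK.values()}
    return len(seen) == len(CODEBOOK)


# One bottom node per choice of n-d+1 of the n middle-layer channels.
SINKS = list(combinations(range(N), N - D_MIN + 1))
all_decode = all(sink_can_decode(sink) for sink in SINKS)
```

Here d = n-k+1, so each bottom node has maxflow n-d+1 = k and the Max-flow bound is met with equality at every sink.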



                      Two Ramifications of
                 Single-Source Network Coding
      • The starting point of classical coding theory and
        information-theoretic cryptography is the existence of a
        conduit through which we can transmit information from
        Point A to Point B without error.
      • Single-source network coding provides a new such
        conduit.
      • Therefore, we expect that both classical coding theory
        and information-theoretic cryptography can be extended
        to networks.



                 Network Error Correction




                 Point-to-Point Error Correction
                          in a Network
      • Classical error-correcting codes are devised for
        point-to-point communications.
      • Such codes are applied to networks on a link-by-
        link basis.




     [Figure: block diagram of a node under link-by-link error correction:
     two channel decoders feed a network encoder, which feeds a channel
     encoder.]




                             A Motivation for
                         Network Error Correction
      • Observation: Only the receiving nodes have to know
        the message transmitted; the intermediate nodes don’t.
      • In general, channel coding and network coding do not
        need to be separated ⇒ Network Error Correction
      • Network error correction generalizes classical
        point-to-point error correction.




                                     Network
                                      Codec




      What Does Network Error Correction Do?

       •       A distributed error-correcting scheme over the
               network.
       •       Does not explicitly decode at intermediate nodes
               as in point-to-point error correction.
       •       At a sink node t, if c errors can be corrected, it
               means that the transmitted message can be
               decoded correctly as long as the total number of
               errors, which can happen anywhere in the network,
               is at most c.




              Classical Algebraic Coding
                                      y = x + z

         where y is the received vector, x is the codeword, and z is the
         error vector.



                 y, x, and z are all in the same space.

          Minimum Distance: Classical Case

      •       Hamming distance is the most natural distance
              measure.
      •       For a code C, d_min = min d(v_1,v_2), where v_1, v_2 ∈ C
              and v_1 ≠ v_2.
      •       If dmin = 2c+1, then C can
              –      Correct c errors
              –      Detect 2c errors
              –      Correct 2c erasures
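As a concrete instance with c = 1, the sketch below uses a (7,4) Hamming code, which has d_min = 3. The generator matrix and helper names are assumptions chosen for the illustration.

```python
from itertools import product

# Generator matrix of a (7,4) Hamming code in systematic form (assumed example).
G = [
    (1, 0, 0, 0, 1, 1, 0),
    (0, 1, 0, 0, 1, 0, 1),
    (0, 0, 1, 0, 0, 1, 1),
    (0, 0, 0, 1, 1, 1, 1),
]


def encode(message):
    """Multiply the message row vector by G over GF(2)."""
    return tuple(sum(m * g for m, g in zip(message, column)) % 2
                 for column in zip(*G))


def hamming_distance(u, v):
    return sum(a != b for a, b in zip(u, v))


CODE = [encode(m) for m in product((0, 1), repeat=4)]
D_MIN = min(hamming_distance(u, v)
            for i, u in enumerate(CODE) for v in CODE[i + 1:])
C = (D_MIN - 1) // 2  # number of correctable errors when d_min = 2c + 1


def decode(received):
    """Minimum-distance decoding: return the closest codeword."""
    return min(CODE, key=lambda codeword: hamming_distance(codeword, received))


def consistent_codewords(received, erased):
    """Codewords that agree with `received` outside the erased positions."""
    return [codeword for codeword in CODE
            if all(a == b for i, (a, b) in enumerate(zip(codeword, received))
                   if i not in erased)]
```

Flipping any single bit of a codeword still decodes to that codeword, and erasing any 2c = 2 positions leaves exactly one consistent codeword.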

                         Sphere Packing

     [Figure: non-overlapping spheres around codewords, with d_min marked.]



      Coding Bounds: Classical Case

      • Upper bounds
            – Hamming bound
            – Singleton bound
      • Lower bound
            – Gilbert-Varshamov bound




                             Network Coding
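     [Figure: source node s sends codeword x into the network, where the
     error vector z is introduced; sink nodes t, u, v receive y_t, y_u, y_v
     respectively.]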

                         Input/Output Relation
      •       The network code is specified by the local encoding
              kernels at each non-source node.
      •       Fix a sink node t.
      •       The codeword x, the error vector z, and the received
              vector y_t are all in different spaces.
      •       In this tutorial, we consider only linear network
              codes. Then
                                 y_t = x F_{s,t} + z F_t
              where F_{s,t} and F_t depend on t.
      •       In the classical case, F_{s,t} and F_t are both the identity matrix.


           Distance Properties of Linear Network
                Codes (Yang, Yeung, Zhang 07)
      •       The network Hamming distance can be defined for
              linear network codes.
      •       Many concepts in algebraic coding based on the
              Hamming distance can be extended to network
              coding.




                  How to Measure the Distance
                   between Two Codewords?
      •    Fix both the network code and the codebook C, i.e., the set of all
           possible codewords transmitted into the network.
      •    For a sink node t,
                                    y_t(x,z) = x F_{s,t} + z F_t
      •    For two codewords x_1, x_2 ∈ C, define their distance by
                                D_t^msg(x_1,x_2) = min_z w_H(z)
           where the minimum is taken over all error vectors z such that
                                    y_t(x_1,0) = y_t(x_2,z), or
                                      y_t(x_1,z) = y_t(x_2,0)
      •    Idea: D_t^msg(x_1,x_2) is the minimum Hamming weight of an error vector
           z that makes x_1 and x_2 indistinguishable at node t.
      •    D_t^msg defines a metric on the input space of the linear network code.
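For small networks this distance can be computed by brute force over all error vectors. The sketch below works over GF(2) and uses two hand-built examples: the classical case, and sink t1 of the butterfly network. The matrices for the butterfly case are assumptions derived by tracing each channel to the inputs of t1, with channels ordered (s,1), (s,2), (1,3), (2,3), (1,t1), (2,t2), (3,4), (4,t1), (4,t2).

```python
from itertools import product


def vec_mat(v, M, q=2):
    """Row vector times matrix over GF(q), for prime q."""
    return tuple(sum(a * row[j] for a, row in zip(v, M)) % q
                 for j in range(len(M[0])))


def received(x, z, F_st, F_t, q=2):
    """y_t(x, z) = x F_{s,t} + z F_t."""
    a, b = vec_mat(x, F_st, q), vec_mat(z, F_t, q)
    return tuple((u + v) % q for u, v in zip(a, b))


def message_distance(x1, x2, F_st, F_t, q=2):
    """min w_H(z) such that y_t(x1, 0) = y_t(x2, z); None if no such z exists."""
    zero = (0,) * len(F_t)
    target = received(x1, zero, F_st, F_t, q)
    weights = [sum(s != 0 for s in z)
               for z in product(range(q), repeat=len(F_t))
               if received(x2, z, F_st, F_t, q) == target]
    return min(weights) if weights else None


# Classical case: F_{s,t} = F_t = I, so the distance is the Hamming distance.
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Butterfly network, sink t1 with input channels (1,t1) and (4,t1).
F_ST_T1 = [[1, 1], [0, 1]]   # rows: symbols on (s,1), (s,2)
F_T1 = [
    [1, 1],  # error on (s,1) reaches both inputs of t1
    [0, 1],  # (s,2)
    [0, 1],  # (1,3)
    [0, 1],  # (2,3)
    [1, 0],  # (1,t1)
    [0, 0],  # (2,t2) never reaches t1
    [0, 1],  # (3,4)
    [0, 1],  # (4,t1)
    [0, 0],  # (4,t2) never reaches t1
]
```

In the butterfly example every pair of distinct source inputs is at distance 1, reflecting that the uncoded inputs carry no redundancy against errors.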


        Minimum Distance for a Sink Node

      • For a sink node t,
                          d_{min,t} = min_{x_1 ≠ x_2} D_t^msg(x_1,x_2)
      • Each sink node has a different view of the codebook as
        each is associated with a different distance measure.
      • d_{min,t} is the minimum distance as seen by sink node t.
      • If the codebook C is linear, d_{min,t} has the following
        equivalent definition:
                         d_{min,t} = min { w_H(z) : z ∈ A_t }
        where
                    A_t = { z : y_t(x,z) = 0 for some nonzero x ∈ C }.


          Error Correction/Detection and Erasure
           Correction for a Linear Network Code
      •       If dmin,t = 2c+1, then sink node t can
              – Correct c errors
              –      Detect 2c errors
              –      Correct 2c erasures
      •       Some form of “sphere packing” is at work.
      •       Much more complicated when the network code
              is nonlinear.


                         Sphere Packing

     [Figure: non-overlapping spheres around codewords, with d_min marked.]



                 Remark on Error Detection

      • In network coding, some error patterns have no
        effect on the sink nodes. These are “invisible”
        error patterns that cannot be (or do not need to
        be) detected.
      • Also called “Byzantine modification detection”
        (Ho et al, ISIT 04)




          Remark on Erasure Correction
      •     In classical algebraic coding, erasure correction has
            three equivalent interpretations:
              – An erased symbol is not available.
              – An erased symbol is received as the erasure symbol.
              – The error locations are known.
      •     In our context, erasure correction means that the
            locations of the errors are known by the sink nodes
            but not the intermediate nodes.


        Coding Bounds for Network Codes

      • Cai and Yeung (02, 06) obtained the Hamming bound,
        the Singleton bound and the Gilbert-Varshamov
        bound for network codes.
      • These bounds are natural extensions of the
        bounds for algebraic codes.
      • Let the base field be GF(q), n = min_t maxflow(t)
        and d_min = min_t d_{min,t}.




                         Upper Bounds
         • Hamming bound
                 |C| ≤ q^n / Σ_{i=0}^{τ} \binom{n}{i} (q-1)^i
           where τ = ⌊(d_min - 1)/2⌋.
         • Singleton bound
                 |C| ≤ q^{n - d_min + 1}
         • The Singleton bound is asymptotically tight,
           i.e., when q is sufficiently large.

                   Refined Coding Bounds
      •       Observation Sink nodes with larger maximum
              flow can have better error correction capability.
      •       For a given linear network code, refined
              Hamming bounds and Singleton bounds
              specific to the individual sink nodes can be
              obtained.




                 Refined Hamming Bound

      •       A network code with rank(F_{s,t}) = m_t,
              codebook C, and d_{min,t} > 0, satisfies
                 |C| ≤ q^{m_t} / Σ_{i=0}^{τ_t} \binom{m_t}{i} (q-1)^i
              where τ_t = ⌊(d_{min,t} - 1)/2⌋, for all sink nodes t.


                  Refined Singleton Bound

      • A network code with rank(F_{s,t}) = m_t,
        codebook C, and d_{min,t} > 0, satisfies
                 |C| ≤ q^{m_t - d_{min,t} + 1}
           for all sink nodes t.




                         Remark
      • Note that m_t ≤ maxflow(t) for all sink nodes t.
      • Thus the refined Hamming bounds imply the
        Hamming bound, and the refined Singleton bounds
        imply the Singleton bound.




                         Tightness of the Refined
                            Singleton Bounds
      • These bounds are shown to be asymptotically tight
        for linear network codes by construction, i.e., it is
        possible to construct a codebook that achieves
        tightness of the individual bound at every sink node
        t for any given linear network code provided that q
        is sufficiently large.
      • This implies that for large base fields, only linear
        transformations need to be performed at the
        intermediate nodes! No decoding needed.



           Construction of Network Codes that
          Achieve the Refined Singleton bounds
      •       Deterministic algorithms
              –      Alg1: Yang, Ngai and Yeung (ISIT 07)
              –      Alg2: Matsumoto (IEICE, 07) obtained an algorithm
                     based on robust network codes.
              –      Alg3: Yang and Yeung (ITW, Bergen 07)
      •       All these algorithms have almost the same
              complexity in terms of the field size requirement and
              time complexity.
      •       These algorithms imply that when q is very large,
              network codes satisfying these bounds can be
              constructed randomly with high probability.
                         Gilbert Bound
   • Let n_s be the outgoing degree of source node s.
   • Let

              $\Delta_t(x, d) = \{ x' \in F^{n_s} : D_t^{msg}(x', x) \le d \}$

        be the d-ball about x with respect to the metric
        D_t^{msg}.




                         Gilbert Bound
      •       Given a network code, let |C|_max be the
              maximum possible size of the codebook such
              that d_{min,t} ≥ d_t for each sink node t. Then,

              $|C|_{\max} \ge \dfrac{q^{n_s}}{|\Delta(0)|}$

              where $\Delta(0) = \bigcup_t \Delta_t(0, d_t - 1)$.



                 Idea of the Gilbert Bound
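         • If a codebook with d_{min,t} ≥ d_t for each sink node t has maximum
           size, then for any x there exists a codeword v such that
           D_t^{msg}(v,x) < d_t for some sink node t; otherwise x could be
           added to the codebook as one more codeword.
         • Thus the (d_t - 1)-balls around the codewords cover the whole
           input space.

           [Figure: a ball of radius d_t - 1 centred at a codeword v, containing the point x.]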
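
The covering argument can be checked on a toy case. The sketch below is only an illustration: it swaps the network metric D_t^{msg} for the ordinary Hamming distance on F_2^n with a single sink, which is an assumption made to keep the example self-contained. The function names are hypothetical.

```python
from itertools import product
from math import comb

def hamming(u, v):
    """Hamming distance between two equal-length tuples."""
    return sum(a != b for a, b in zip(u, v))

def ball_size(n, radius):
    """Number of binary words within Hamming distance `radius` of a fixed word."""
    return sum(comb(n, i) for i in range(radius + 1))

def greedy_codebook(n, d):
    """Keep adding words at distance >= d from all chosen codewords.
    The result is maximal: no further word can be added."""
    code = []
    for x in product((0, 1), repeat=n):
        if all(hamming(x, c) >= d for c in code):
            code.append(x)
    return code

n, d = 5, 3
code = greedy_codebook(n, d)

# Maximality means every word lies within distance d - 1 of some codeword,
# so the (d - 1)-balls cover the space and |C| * |ball| >= 2^n.
covered = all(any(hamming(x, c) <= d - 1 for c in code)
              for x in product((0, 1), repeat=n))
print(len(code), covered, len(code) * ball_size(n, d - 1) >= 2 ** n)
```

The same counting step carries over to the network setting once the Hamming ball is replaced by the union of the per-sink balls.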



                         Varshamov Bound
      •       Given a set of local encoding kernels, let ω_max
              be the maximum possible dimension of the
              linear codebook such that d_{min,t} ≥ d_t for each
              sink node t. Then,

              $\omega_{\max} \ge n_s - \log_q |\Delta(0)|$

              where $\Delta(0) = \bigcup_t \Delta_t(0, d_t - 1)$.




                         Error Correction Capability of
                           Random Network Codes
   • Balli, Yan and Zhang 07
         – Study the distribution of dmin,t for random
           network codes based on a refined bound on
           the probability of decoding error for a random
           linear network code for multicast.




                             Algorithms for
                         Network Error Correction




                          For Deterministic and
                         Random Network Codes
   • Zhang 07 (to appear in IT)
      – Proposed the minimum rank decoding principle which is
        equivalent to minimum distance decoding.
      – Can decode up to ⌊(d_{min,t} - 1)/2⌋ errors for each sink node t.
      – A fast decoding algorithm for packet networks with random
        network coding (the same network code is used
        repeatedly).
   • Yan, Balli and Zhang 07
      – Decoding beyond the error correction capability.
   • Balli, Yan and Zhang 07
      – A hybrid approach that combines link-by-link error
        detection and network erasure correction.

                 For Random Network Codes

   • Jaggi, Langberg et al. (INFOCOM 07)
      – Consider packet networks (the same network code
        is used repeatedly).
      – Scenario 1: Alice and Bob have a low-rate secret
        channel.
                • A polynomial-time algorithm that achieves the optimal
                  rate asymptotically.
      – Scenario 2: Alice and Bob have no shared secret.
                • A polynomial-time algorithm that achieves the Singleton
                  bound asymptotically.
                • Extendable to the refined Singleton bounds?

                 For Random Network Codes

   • Koetter and Kschischang (ISIT 07)
         – Let the input space of the random network code be F^n,
           where n = min_t maxflow(t).
         – At a sink node t, the transfer matrix is likely to be full rank.
         – The codebook is the collection of all k-dimensional
           subspaces of F^n, each called a codeword.
         – If a codeword A is chosen, then transmit a set of vectors
           in A that span A. It does not matter which set.
         – If the transfer matrix at a sink node t is full rank (which holds with
           high probability), the received vectors also span A.
         – Can be regarded as a more general theoretical
           framework for random linear network coding.


   • Koetter and Kschischang (cont.)
      – Thus the codeword can be decoded correctly in the
        absence of error.
      – In the presence of error, decoding is done according to a
        distance measure between subspaces.
      – Yet to understand the performance of such codes in a
        given network.
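
A minimal sketch of subspace decoding over GF(2). It assumes the network acts on the transmitted vectors as an unknown invertible mixing, and it uses the subspace distance d(U, V) = dim(U + V) - dim(U ∩ V) of Koetter and Kschischang. The two-codeword codebook, the injected error vector and all function names are hypothetical; vectors are stored as integer bitmasks.

```python
def rank_gf2(vectors):
    """Rank over GF(2) of vectors given as integer bitmasks."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)  # clears b's leading bit from v if it is set
        if v:
            basis.append(v)
    return len(basis)

def subspace_distance(u, v):
    """d(U, V) = dim(U + V) - dim(U ∩ V) = 2 dim(U + V) - dim U - dim V."""
    return 2 * rank_gf2(u + v) - rank_gf2(u) - rank_gf2(v)

# Hypothetical codebook: two 2-dimensional subspaces of F_2^4, each given by a spanning set.
codebook = {"A": [0b1000, 0b0100], "B": [0b0010, 0b0001]}

def decode(received):
    """Minimum-distance decoding: pick the codeword closest to the received span."""
    return min(codebook, key=lambda name: subspace_distance(received, codebook[name]))

sent = codebook["A"]
received = [sent[0] ^ sent[1], sent[1]]   # invertible mixing of the sent vectors
corrupted = received + [0b0010]           # one injected error vector

print(subspace_distance(received, codebook["A"]))  # mixing leaves the span unchanged
print(decode(received), decode(corrupted))
```

The sink never needs to know which mixing the network applied; only the span of what it receives matters.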




                             Applications of
                         Network Error Correction




               Errors due to Noise in Channels

      •    Separation of channel coding and network coding is
           asymptotically optimal provided two conditions are satisfied:
              1. All channels are memoryless.
              2. The channels are independent.
           (Borade 02, Song & Y 06)
      •    If not, there is no separation theorem.
      •    Then applying turbo codes link-by-link does not guarantee
           optimality.
      •    Linear network error-correcting codes are an attractive solution because of
           their low encoding complexity.




                Malicious Injection of Errors

      • Malicious nodes in the network may inject
        errors deliberately to disturb data transmission.
      • Classical error correction does not help
        because redundancy is injected only in time.
      • Network error correction is a natural solution
        because redundancy is injected in both time
        and space.


                           Further Reading for
                         Network Error Correction
      •    R. W. Yeung and N. Cai, “Network error correction, Part I & II,” Communications in
           Information and Systems, 2006. First presented at ITW 2002.
      •    Ho et al, “Byzantine modification detection in multicast networks using randomized network
           coding,” ISIT 2004.
      •    R. W. Yeung, S.-Y. R. Li, N. Cai and Z. Zhang, Network Coding Theory, now Publishers, 2005
           (Foundations and Trends in Communications and Information Theory).
      •    S. Yang and R. W. Yeung, “Characterizations of network error correction/detection and
           erasure correction,” NetCod 2007.
      •    Z. Zhang, “Linear network error correction codes in packet networks,” to appear in IEEE IT.
      •    S. Yang, C. K. Ngai, and R. W. Yeung, “Construction of linear network codes that achieve a
           refined Singleton bound,” ISIT 2007.
      •    R. Koetter and F. Kschischang, “Coding for errors and erasures in random network coding,”
           ISIT 2007.
      •    S. Yang and R. W. Yeung, “Refined coding bounds for network error correction,” ITW, Bergen
           2007.
      •    S. Jaggi et al., “Resilient network coding in the presence of Byzantine adversaries,”
           INFOCOM 2007.
      •    Z. Zhang, “Some recent progresses in network error correction coding,” NetCod 2008.


   •    H. Balli, X. Yan, and Z. Zhang, “Error correction capability of random network error correction
        codes,” submitted to IT.
   •    X. Yan, H. Balli, and Z. Zhang, “Decode network error correction codes beyond error correction
        capability,” submitted to IT.
   •    H. Balli, X. Yan, and Z. Zhang, “A hybrid network error correction coding system,” in preparation.




                   Secure Network Coding




                         Problem Formulation
      • The underlying model is the same as network multicast using
        network coding except that some sets of channels can be
        wiretapped.
      • Let $\mathcal{A}$ be a collection of subsets of the edge set E.
      • A subset in $\mathcal{A}$ is called a wiretap set.
      • Each wiretap set may be fully accessed by a wiretapper.
      • No wiretapper can access more than one wiretap set.
      • The network code needs to be designed in a way such that no
        matter which wiretap set the wiretapper has access to, the
        multicast message is information-theoretically secure.
      • The model is a network generalization of secret sharing (Blakley,
        Shamir, 79) and wiretap channel II (Ozarow and Wyner 84).



           A Coding Scheme (Cai-Y 02)

      • The multicast message is (m,k), where
             – m is the secure message
             – k is the key (randomness)
      • Both m and k are generated at the
        source node.



                        An Example of
                   a Secure Network Code




    [Figure: butterfly network whose channels carry the symbols m-k, m+k and k.]

    One of the 3 red channels
     can be wiretapped
    m is the secure message
    k is the key



                    Another Example of
                   Secure Network Coding
                           The (1,2)-threshold
                         Secret Sharing Scheme



     [Figure: network with channels labelled k, m-k and m+k.]

     One of the 3 red channels
      can be wiretapped
     m is the secure message
     k is the key



                         Construction of Secure
                            Network Codes
      • Let n = min_t maxflow(t).
      • A sufficient condition under which a secure linear network code
        can be constructed has been obtained (Cai and Y, 02 and 07).
      • Important special case: if the collection of wiretap sets consists of
        all the r-subsets of E, where r < n, then we can construct a secure
        network code with multicast message (m,k) such that |m| = n - r and |k| = r.
      • For this case, the condition is also necessary.
      • Interpretation: for a sink node t, if r channels in the network are
        wiretapped, the number of “secure paths” from the source node
        to t is still at least n - r. So n - r symbols can go through
        securely.




               Idea of Code Construction
      • Start with a linear network code for multicasting n symbols.
      • For every wiretap set $A \in \mathcal{A}$, let f_A = { f_e : e ∈ A }, the set of global
        encoding kernels of the channels in A.
      • Let dim(span(f_A)) ≤ r for all $A \in \mathcal{A}$. [sufficient condition]
      • When the base field F is sufficiently large, we can find b_1, b_2, …,
        b_{n-r} ∈ F^n such that
               b_1, b_2, …, b_{n-r} are linearly independent of f_A
        for all $A \in \mathcal{A}$.
      • Extend b_1, b_2, …, b_{n-r} to b_1, b_2, …, b_{n-r}, b_{n-r+1}, …, b_n to form a
        basis for F^n, and let M = [b_1 b_2 … b_n].
      • M is invertible.
      • M is invertible.


      • Let the multicast message be (m,k), with |m| =
        n-r and |k| = r.
      • Take a linear transformation of the given linear
        network code by the matrix M^{-1} to obtain the
        desired secure network code.
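
The construction can be traced on the butterfly network with n = 2 and r = 1. The sketch below is an illustration under stated assumptions: the field is GF(3), each wiretap set is a single channel, the kernels (1,0), (0,1), (1,1) are those of the channels carrying x1, x2 and x1 + x2, and the second basis vector b2 = (0,1) is simply chosen by hand. All function names are hypothetical, and `pow(x, -1, P)` needs Python 3.8 or later.

```python
from collections import Counter
from itertools import product

P = 3  # field GF(3), assumed for illustration (m+k and m-k must differ)

# Global encoding kernels of the unsecured butterfly code.
kernels = [(1, 0), (0, 1), (1, 1)]

def in_span(v, f):
    """True if v is a scalar multiple of f over GF(P)."""
    return any(tuple((c * fi) % P for fi in f) == v for c in range(P))

def inv2(a, b):
    """Inverse of the 2x2 matrix with columns a and b over GF(P)."""
    det = (a[0] * b[1] - a[1] * b[0]) % P
    d_inv = pow(det, -1, P)
    return [[(d_inv * b[1]) % P, (-d_inv * b[0]) % P],
            [(-d_inv * a[1]) % P, (d_inv * a[0]) % P]]

def is_secure(g):
    """True if the symbol m*g[0] + k*g[1] is independent of m when k is uniform."""
    for y in range(P):
        hits = Counter(m for m, k in product(range(P), repeat=2)
                       if (m * g[0] + k * g[1]) % P == y)
        if len({hits[m] for m in range(P)}) != 1:
            return False
    return True

# Step 1: find b1 outside the span of every single kernel (n - r = 1 vector).
b1 = next(v for v in product(range(P), repeat=2)
          if any(v) and not any(in_span(v, f) for f in kernels))
# Step 2: extend to a basis; b2 is a hand-picked choice.
b2 = (0, 1)
M_inv = inv2(b1, b2)

# Step 3: transform every kernel by M^{-1}; a channel then carries (m, k) . g.
secure_kernels = [tuple(sum(M_inv[i][j] * f[j] for j in range(2)) % P
                        for i in range(2)) for f in kernels]
print(b1, secure_kernels)
print([is_secure(g) for g in secure_kernels])
```

With these choices the three channels carry m + k, k and m + 2k = m - k (mod 3), which matches the symbols in the butterfly example, and no single channel reveals anything about m.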




                             Optimality of the
                         Cai-Yeung Construction
      • When the collection of wiretap sets consists of all r-subsets of
        E, the construction is optimal in terms of
         – the size of the message (maximum)
         – the size of the key (minimum).
      • The proof of the latter involves a set of inequalities
        due to T. S. Han.




                         Han’s Inequalities

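      • Let

               $g_k = \dfrac{1}{\binom{n}{k}} \sum_{\alpha : |\alpha| = k} \dfrac{H(X_\alpha \mid X_{\bar\alpha})}{k}$,

        where α ranges over the subsets of {1, …, n} and $\bar\alpha$ is the complement of α.
      • Then
                              g_1 ≤ g_2 ≤ … ≤ g_n.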
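
A small numerical check of the inequalities, assuming g_k is the average over all k-subsets α of H(X_α | X_ᾱ)/k, where ᾱ is the complement of α. The example distribution (two fair bits and their XOR) and the function names are hypothetical choices for illustration.

```python
from itertools import combinations
from math import comb, log2

def entropy(pmf, keep):
    """Entropy in bits of the marginal of pmf on the coordinate indices in `keep`."""
    marginal = {}
    for outcome, p in pmf.items():
        key = tuple(outcome[i] for i in keep)
        marginal[key] = marginal.get(key, 0.0) + p
    return -sum(p * log2(p) for p in marginal.values() if p > 0)

def han_g(pmf, n, k):
    """Average over k-subsets alpha of H(X_alpha | X_complement) / k."""
    everything = tuple(range(n))
    total = 0.0
    for alpha in combinations(everything, k):
        rest = tuple(i for i in everything if i not in alpha)
        # H(X_alpha | X_rest) = H(X_all) - H(X_rest)
        total += (entropy(pmf, everything) - entropy(pmf, rest)) / k
    return total / comb(n, k)

# X1, X2 are independent fair bits and X3 = X1 XOR X2.
pmf = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}
g = [han_g(pmf, 3, k) for k in (1, 2, 3)]
print(g)
```

For this distribution any single variable is determined by the other two, so g_1 = 0, while g_2 = 1/2 and g_3 = 2/3.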


                             Algorithms for
                         Secure Network Coding
   • Jain 2004
         – A security protocol that uses both network coding and
           one-way functions.
   • Feldman et al, 2004
         – A characterization of secure network codes in terms of a
           generalized distance measure.
         – A smaller field size can be used by giving up a small
           amount of overall capacity.




                             Algorithms for
                         Secure Network Coding
   • Bhattad and Narayanan 05
     – Propose weakly secure network coding for which
       the wiretapper cannot obtain any “useful” information.
     – Very simple scheme.
     – Not information-theoretically secure.




                             Algorithms for
                         Secure Network Coding
   • Jaggi, Langberg et al., 07
      – An efficient algorithm using random network coding
        in an unknown network topology that achieves
        asymptotically the same optimal rate as Cai-Yeung.
      – Requires repeated use of the same random
        network code.




                           Further Reading for
                         Secure Network Coding
      •    N. Cai and R. W. Yeung, “Secure network coding,” ISIT 02. Full version
           available upon request.
      •    K. Jain, “Security based on network topology against the wiretapping
           attack,” IEEE Wireless Comm., Feb 2004.
      •    J. Feldman, T. Malkin, C. Stein, R. A. Servedio “On the capacity of secure
           network coding”, 2004 Allerton Conference.
      •    K. Bhattad and K. R. Narayanan, “Weakly secure network coding,” NetCod
           2005.
      •    N. Cai and R. W. Yeung, “A security condition for multi-source linear
           network coding”, ISIT 2007.
      •    S. Jaggi et al., “Resilient network coding in the presence of Byzantine
           adversaries,” INFOCOM 2007.
      •    E. Soljanin and S. El Rouayheb, “On wiretap networks II,” ISIT 2007.



                         Applications of Random
                         Network Coding in P2P




           What is Peer-to-Peer (P2P)?
   • Client-Server is the traditional architecture for
     content distribution in a computer network.
   • P2P is the new architecture in which users who
     download the file also help disseminate it.
   • Extremely efficient for large-scale content
     distribution, i.e., when there are a lot of clients.
   • P2P traffic occupies at least 70% of Internet
     bandwidth.
   • BitTorrent is the most popular P2P system.

                         What is Avalanche?
   • Avalanche is a Microsoft P2P prototype that uses
     random linear network coding.
   • It is one of the first applications / implementations of
     network coding by Gkantsidis and Rodriguez 05.
   • It has recently been further developed into Microsoft
     Secure Content Distribution (MSCD).




                   How Does Avalanche Work?
   • When the server or a client uploads to a neighbor, it
     transmits a random linear combination of the blocks
     it possesses. The linear coefficients are attached
     to the transmitted block.
   • Analogy: Color-mixing.
   • Each transmitted block is some linear combination
     of the original blocks of the seed file.
   • Download is complete if enough linearly
     independent blocks have been received, and
     decoding can be done accordingly.
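
The mechanism above can be sketched in a few lines. This is an illustration, not Avalanche's actual implementation: the prime field GF(257), the tiny three-block file, the two-peer relay scenario and all function names are assumptions. A packet is a pair (coefficient vector, payload), so re-mixing at a peer updates both parts together. `pow(x, -1, P)` needs Python 3.8 or later.

```python
import random

P = 257  # prime field size, chosen only for illustration

def combine(held, rng):
    """Random linear combination over GF(P) of the packets a node holds."""
    k, width = len(held[0][0]), len(held[0][1])
    coeffs, payload = [0] * k, [0] * width
    for c, p in held:
        w = rng.randrange(P)
        coeffs = [(x + w * y) % P for x, y in zip(coeffs, c)]
        payload = [(x + w * y) % P for x, y in zip(payload, p)]
    return coeffs, payload

def decode(packets, k):
    """Gauss-Jordan elimination on rows [coeffs | payload].
    Returns the k original blocks, or None while the rank is below k."""
    rows = [list(c) + list(p) for c, p in packets]
    rank = 0
    for col in range(k):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            return None
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, P)
        rows[rank] = [(x * inv) % P for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(x - f * y) % P for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return [row[k:] for row in rows[:k]]

blocks = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
k = len(blocks)
# The server holds the original blocks, tagged with unit coefficient vectors.
server = [([int(i == j) for j in range(k)], b) for i, b in enumerate(blocks)]

rng = random.Random(2008)
peer_a, peer_b, recovered = [], [], None
while recovered is None:
    peer_a.append(combine(server, rng))   # server uploads a mixture to peer A
    peer_b.append(combine(peer_a, rng))   # peer A re-mixes what it has for peer B
    recovered = decode(peer_b, k)
print(recovered == blocks)
```

Peer A never decodes; it only re-mixes, which is the color-mixing analogy in code form.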

           The Butterfly Network: A Review


      [Figure: the butterfly network carrying b1, b2 and b1+b2, with the annotation “Synchronization here”.]




        What Exactly is Avalanche Doing?

   • In Avalanche, there does not seem to be any need
     of synchronization.
   • Is Avalanche doing the same kind of network coding
     we have been talking about?
   • If not, what is it doing and is it optimal in any sense?




                  A Time-Parametrized Graph

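      [Figure: time-parametrized graph with rows Server, Client A, Client B and Client C and columns t = 0, 1, 2, 3; edges between consecutive time steps carry numeric labels (1, 2 and 4).]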




                         Analysis of Avalanche
                             (Y, NetCod 2007)

   • The time-parametrized graph, not the physical
     network, is the graph to look at.
   • By examining the maximum flows in this graph, the
     following questions can be answered:
      – When can a client receive the whole file?
      – If the server and/or some clients leave the system,
        can the remaining clients recover the whole file?
      – If some blocks are lost at some clients, how does it
        affect the recovery of the whole file?
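
A minimal sketch of the max-flow analysis on a time-parametrized graph. The transmission schedule below is hypothetical (it is not the one in the figure), and the code assumes the usual reading for random linear coding over a large field: a client can recover a file of k blocks by time T when the max-flow from (Server, 0) to (client, T) is at least k. Memory edges of very large capacity model a node keeping what it already has.

```python
from collections import defaultdict, deque

INF = 10 ** 9  # stands in for the unlimited capacity of memory edges

def time_graph(nodes, horizon, transmissions):
    """Nodes are (name, time). Each transmission is (src, dst, time, blocks)."""
    cap = defaultdict(dict)
    for t in range(horizon):
        for x in nodes:
            cap[(x, t)][(x, t + 1)] = INF          # a node keeps its blocks
    for src, dst, t, blocks in transmissions:
        cap[(src, t)][(dst, t + 1)] = blocks       # blocks sent during slot t
    return cap

def max_flow(cap, s, t):
    """Edmonds-Karp: repeatedly augment along shortest residual paths."""
    res = defaultdict(lambda: defaultdict(int))
    for u in cap:
        for v, c in cap[u].items():
            res[u][v] += c
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in list(res[u].items()):
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        bottleneck, v = INF, t
        while parent[v] is not None:
            bottleneck = min(bottleneck, res[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
            v = u
        flow += bottleneck

# Hypothetical schedule: (sender, receiver, time slot, number of blocks).
schedule = [("Server", "A", 0, 2), ("Server", "B", 1, 1),
            ("A", "B", 1, 1), ("A", "B", 2, 1)]
graph = time_graph(["Server", "A", "B"], 3, schedule)

for client, when in [("B", 2), ("B", 3), ("A", 3)]:
    print(client, when, max_flow(graph, ("Server", 0), (client, when)))
```

Under this schedule client B reaches a max-flow of 3 only at time 3, so a 3-block file would be recoverable there but not at time 2. Deleting a node's edges from the schedule shows the effect of that node leaving.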


                         Some Remarks
   • Avalanche is not doing the usual kind of random
     network coding, but it can be analyzed by the tools
     we are familiar with.
   • Avalanche minimizes delay with respect to the given
     transmission schedule if computation is ignored.
   • Extra computation is the price to pay.
   • Avalanche provides the maximum possible
     robustness for the system.
   • P2P is perhaps the most natural environment for
     applying random network coding because the
     subnet is formed on the fly.

                         Networks with Packet Loss:
                              A Toy Example

                         A -----------> B -----------> C


      • One packet is sent on each channel per unit time.
      • Packet loss rate = 0.1 on each channel.
      • By using a fountain code, information can be
        transmitted from A to C at rate (0.9)^2 = 0.81.
      • By using an Avalanche-type system, information
        can be transmitted from A to C at rate 0.9 = max-flow
        from A to C.
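
The two rates can be reproduced with a short simulation. This is a sketch under simplifying assumptions: losses are independent across channels and time slots, the end-to-end scheme has B merely forward whatever arrives in the same slot, and with recoding at B every packet that C receives is innovative whenever B knows more than C (reasonable over a large field). The function name and parameters are hypothetical.

```python
import random

def simulate(p_loss, slots, seed=0):
    """Return (end-to-end rate with plain forwarding at B, rate with recoding at B)."""
    rng = random.Random(seed)
    forwarded = 0          # packets that survive both hops when B only forwards
    rank_b = rank_c = 0    # innovative packets held by B and by C under recoding
    for _ in range(slots):
        ab_ok = rng.random() >= p_loss
        bc_ok = rng.random() >= p_loss
        if ab_ok and bc_ok:
            forwarded += 1
        # B sends a fresh mixture of what it holds; useful if C is behind B.
        if bc_ok and rank_c < rank_b:
            rank_c += 1
        if ab_ok:
            rank_b += 1
    return forwarded / slots, rank_c / slots

end_to_end, recoded = simulate(p_loss=0.1, slots=100_000)
print(round(end_to_end, 3), round(recoded, 3))
```

The first figure comes out close to 0.81 and the second close to 0.9: with recoding, a loss on one hop no longer wastes the successful transmission on the other hop.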

              Networks with Packet Loss
      • By using an Avalanche-type system, the max-flow
        from the source to the sink (amortized by the
        packet loss rate) can be achieved automatically,
        which is the fundamental limit.
      • Virtually nothing needs to be done.




                         Concluding Remarks
      • The theory of network coding naturally ramifies in the
        direction of error correction and information-theoretic
        cryptography.
      • Developments in these areas of network coding are
        still in their infancy.
      • Many potential applications in networking, wireless,
        information security, etc.
      • Applications are driven by theory.
      • A lot of very exciting research ahead.

