Performance Comparison of Scheduling Algorithms for Peer-to-Peer Collaborative File Distribution

       Presented by: Chan Siu Kei, Jonathan
        Supervisors: Prof. VOK Li, Dr. KS Lui
Overview
   Introduction
   Communication Model
   Analysis
   Scheduling Algorithms
    - Rarest Piece First
    - Most Demanding Node First
    - Maximum-Flow Algorithms
   Simulation Results
   Future Work
   Conclusion
Introduction
   P2P file sharing applications are highly popular on
    the Internet, e.g. BitTorrent, Gnutella, Kazaa,
    Napster, etc.
   More scalable (faster) than the traditional
    client/server approach (e.g. FTP)
   Prior research focuses on topics such as overlay
    topology formation, peer discovery, content search,
    fairness and incentive issues, but seldom looks
    into the data distribution scheduling problem
   We present the first effort on this problem and propose
    a novel Maximum-Flow algorithm to solve it better
Communication Model
    Synchronous Scheduling
     - same transmission time for every pair of nodes
    Asymmetric Bandwidth
     - in each cycle, node i sends out at most pi pieces and receives at most qi pieces
Notations and Definitions
   N = no. of peers, M = no. of file pieces
   F = {F1, F2, …, FM}
   P = NxM possession matrix,
    Pij = 1 iff node i possesses file piece Fj, otherwise Pij = 0
   Pt = possession matrix at time t
   p = {p1,p2,…,pN} (upload limit vector),
    q = {q1,q2,…,qN} (download limit vector)
   Example (N = 5, M = 8):

         [ 1 0 1 0 0 0 0 0 ]
         [ 1 0 0 1 0 0 0 0 ]
    P0 = [ 0 1 0 0 1 0 1 1 ]
         [ 1 0 1 0 0 1 0 1 ]
         [ 0 1 0 0 0 0 1 1 ]

    p = {1,1,2,2,2}, q = {2,3,2,3,3}
Schedule (1)
   Specifies which file pieces each peer has to send out and to
    whom
   A possible schedule for P0 with p={1,1,2,2,2}, q={2,3,2,3,3}
    - Node 1: send piece 3 to node 2
    - Node 2: send piece 4 to node 1
    - Node 3: send piece 5 to node 1
              send piece 5 to node 2
    - Node 4: send piece 6 to node 2
              send piece 6 to node 3
    - Node 5: send piece 2 to node 4
              send piece 7 to node 4
   Formally, we use the NxM matrix Sk to represent the schedule at
    cycle k: entry (i, j) of Sk is u if node i receives piece j from
    node u, and 0 otherwise. From Sk, we can derive the transmission
    matrix Tk (NxM), whose entry (i, j) is 1 iff entry (i, j) of Sk is nonzero

         [ 0 0 0 2 3 0 0 0 ]        [ 0 0 0 1 1 0 0 0 ]
         [ 0 0 1 0 3 4 0 0 ]        [ 0 0 1 0 1 1 0 0 ]
    S0 = [ 0 0 0 0 0 4 0 0 ]   T0 = [ 0 0 0 0 0 1 0 0 ]
         [ 0 5 0 0 0 0 5 0 ]        [ 0 1 0 0 0 0 1 0 ]
         [ 0 0 0 0 0 0 0 0 ]        [ 0 0 0 0 0 0 0 0 ]

    e.g. Node 1 receives piece 4 from Node 2 and piece 5 from
    Node 3 => $S^0_{14} = 2$ and $S^0_{15} = 3$
Schedule (2)
   Given Pk-1 and the schedule Sk-1, Tk-1, the possession
    matrix at the next cycle k is Pk = Pk-1 + Tk-1 (k > 0)
    e.g. P0 + T0 = P1:

    [ 1 0 1 0 0 0 0 0 ]   [ 0 0 0 1 1 0 0 0 ]   [ 1 0 1 1 1 0 0 0 ]
    [ 1 0 0 1 0 0 0 0 ]   [ 0 0 1 0 1 1 0 0 ]   [ 1 0 1 1 1 1 0 0 ]
    [ 0 1 0 0 1 0 1 1 ] + [ 0 0 0 0 0 1 0 0 ] = [ 0 1 0 0 1 1 1 1 ]
    [ 1 0 1 0 0 1 0 1 ]   [ 0 1 0 0 0 0 1 0 ]   [ 1 1 1 0 0 1 1 1 ]
    [ 0 1 0 0 0 0 1 1 ]   [ 0 0 0 0 0 0 0 0 ]   [ 0 1 0 0 0 0 1 1 ]

   The distribution terminates after a certain number of cycles, say k0,
    when $P^{k_0}_{ij} = 1, \ \forall i, j$
   Our goal is to minimize k0, which is the time needed
    for complete distribution
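The schedule and update rules above can be checked mechanically. The following is a minimal Python sketch, assuming that S stores 1-indexed sender numbers with 0 meaning "no transfer", as in the slide example. The function names (`transmission_matrix`, `is_valid_schedule`, `next_possession`, `is_complete`) are illustrative and do not come from the slides.

```python
# Example data from the slides: possession matrix P0, schedule S0, limits p and q.
P0 = [
    [1, 0, 1, 0, 0, 0, 0, 0],
    [1, 0, 0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 0, 1, 1],
    [1, 0, 1, 0, 0, 1, 0, 1],
    [0, 1, 0, 0, 0, 0, 1, 1],
]
S0 = [
    [0, 0, 0, 2, 3, 0, 0, 0],
    [0, 0, 1, 0, 3, 4, 0, 0],
    [0, 0, 0, 0, 0, 4, 0, 0],
    [0, 5, 0, 0, 0, 0, 5, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
p = [1, 1, 2, 2, 2]
q = [2, 3, 2, 3, 3]


def transmission_matrix(S):
    """T[i][j] = 1 iff node i receives piece j in this cycle."""
    return [[1 if sender else 0 for sender in row] for row in S]


def is_valid_schedule(P, S, p, q):
    """Check possession constraints and the per-node upload/download limits."""
    uploads = [0] * len(P)
    for i, row in enumerate(S):
        if sum(1 for sender in row if sender) > q[i]:
            return False  # node i would exceed its download limit
        for j, sender in enumerate(row):
            if sender == 0:
                continue
            # The receiver must lack the piece and the sender must hold it.
            if P[i][j] == 1 or P[sender - 1][j] == 0:
                return False
            uploads[sender - 1] += 1
    return all(u <= limit for u, limit in zip(uploads, p))


def next_possession(P, T):
    """P_k = P_{k-1} + T_{k-1}, computed entry by entry."""
    return [[a + b for a, b in zip(p_row, t_row)] for p_row, t_row in zip(P, T)]


def is_complete(P):
    """The distribution has terminated when every entry equals 1."""
    return all(x == 1 for row in P for x in row)


T0 = transmission_matrix(S0)
P1 = next_possession(P0, T0)
print(is_valid_schedule(P0, S0, p, q))  # True for the slide's example schedule
print(is_complete(P1))                  # False: more cycles are needed
```

Counting how many times `next_possession` must be applied before `is_complete` returns true gives k0 for a particular sequence of schedules.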
Analysis on Lower Bound (1)
   Let p = {p1,p2,…,pN}, q = {q1,q2,…,qN} be the upload and
    download limit vectors, with
    $p_{max} = \max_{i \in [1,N]} p_i$, $p_{sum} = \sum_{i=1}^{N} p_i$, $q_{sum} = \sum_{i=1}^{N} q_i$
   Let ri be the total no. of 0s across row i, i.e. $r_i = \sum_{j=1}^{M} (1 - P_{ij})$; the min.
    value of k0 is given by
    $k_0 \ge \max_{i \in [1,N]} \lceil r_i / q_i \rceil$    (1)
   Let cj be the total no. of 1s along column j, i.e. $c_j = \sum_{i=1}^{N} P_{ij}$; we can
    find the minimum no. of 1s along all columns, $c_{min} = \min_{j \in [1,M]} c_j$;
    the min. value of k0 is given by
    $k_0 \ge \lceil \log_{1+p_{max}} (N / c_{min}) \rceil$    (2)
   Let z be the total no. of 0s in P, i.e. $z = \sum_{i=1}^{N} \sum_{j=1}^{M} (1 - P_{ij})$; the min. value
    of k0 is given by
    $k_0 \ge \lceil z / \min\{p_{sum}, q_{sum}\} \rceil$    (3)
Analysis on Lower Bound (2)
   Combining (1),(2),(3), the lower bound on k0 is given by
    $k_0 \ge \max\{ \max_{i \in [1,N]} \lceil r_i / q_i \rceil, \ \lceil \log_{1+p_{max}} (N / c_{min}) \rceil, \ \lceil z / \min\{p_{sum}, q_{sum}\} \rceil \}$    (4)
   Example:

        [ 1 1 1 1 1 1 1 1 ]
        [ 0 1 0 0 0 0 0 0 ]
    P = [ 0 0 1 0 0 0 0 0 ]    p = {2,2,2,1,1,1}, q = {3,3,4,3,4,2}
        [ 0 0 0 1 0 0 0 0 ]
        [ 0 0 0 0 1 0 0 0 ]
        [ 0 0 0 0 0 1 1 1 ]

    From (1), $k_0 \ge \max\{ \lceil 0/3 \rceil, \lceil 7/3 \rceil, \lceil 7/4 \rceil, \lceil 7/3 \rceil, \lceil 7/4 \rceil, \lceil 5/2 \rceil \} = \max\{0,3,2,3,2,3\} = 3$
    From (2), $k_0 \ge \lceil \log_{1+2} (6/1) \rceil = 2$
    From (3), $k_0 \ge \lceil 33 / \min\{9, 19\} \rceil = 4$
    Hence $k_0 \ge \max\{3, 2, 4\} = 4$
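The three bounds can be evaluated directly from P, p and q. Below is a small Python sketch of bounds (1) to (3), assuming all upload and download limits are positive; the function name `lower_bound` is illustrative. Bound (2) is computed with integer arithmetic (the smallest k such that c_min · (1 + p_max)^k ≥ N), which equals the ceiling of the logarithm but avoids floating-point rounding.

```python
def lower_bound(P, p, q):
    """Return the bounds (1), (2), (3); their maximum is the combined bound (4)."""
    n = len(P)
    r = [row.count(0) for row in P]                 # r_i: 0s in row i
    c_min = min(sum(col) for col in zip(*P))        # fewest holders of any piece
    if c_min == 0:
        raise ValueError("some piece is held by no peer")

    # (1): node i can receive at most q_i pieces per cycle (ceiling division).
    b1 = max(-(-r_i // q_i) for r_i, q_i in zip(r, q))

    # (2): each holder can serve at most p_max new peers per cycle, so the
    # number of holders grows by a factor of at most (1 + p_max) per cycle.
    b2, holders = 0, c_min
    while holders < n:
        holders *= 1 + max(p)
        b2 += 1

    # (3): at most min(p_sum, q_sum) transmissions fit in one cycle.
    b3 = -(-sum(r) // min(sum(p), sum(q)))
    return b1, b2, b3


# Example from the slide.
P = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1, 1],
]
bounds = lower_bound(P, p=[2, 2, 2, 1, 1, 1], q=[3, 3, 4, 3, 4, 2])
print(bounds, max(bounds))  # (3, 2, 4) 4
```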
Rarest Piece First (RPF)
   Borrowed from the Rarest Element First algorithm
    employed in BitTorrent
   Rarity cj of piece j is the no. of peers who have piece j,
    i.e. $c_j = \sum_{i=1}^{N} P_{ij}$
   RPF – Node-Oriented: (p={1,1,2,2,2}, q={2,3,2,3,3})

         [ 1 0 1 0 0 0 0 0 ]        [ 1 0 1 1 1 0 0 0 ]
         [ 1 0 0 1 0 0 0 0 ]        [ 1 0 1 1 1 1 0 0 ]
    P0 = [ 0 1 0 0 1 0 1 1 ]   P1 = [ 0 1 0 0 1 1 1 1 ]   …
         [ 1 0 1 0 0 1 0 1 ]        [ 1 1 1 0 0 1 1 1 ]
         [ 0 1 0 0 0 0 1 1 ]        [ 0 1 0 0 0 0 1 1 ]

   RPF – Piece-Oriented: (p={1,1,2,2,2}, q={2,3,2,3,3})

         [ 1 0 1 0 0 0 0 0 ]        [ 1 0 1 1 1 0 0 0 ]
         [ 1 0 0 1 0 0 0 0 ]        [ 1 1 0 1 1 1 0 0 ]
    P0 = [ 0 1 0 0 1 0 1 1 ]   P1 = [ 0 1 1 0 1 1 1 1 ]   …
         [ 1 0 1 0 0 1 0 1 ]        [ 1 1 1 0 0 1 0 1 ]
         [ 0 1 0 0 0 0 1 1 ]        [ 0 1 0 0 0 0 1 1 ]
Most Demanding Node First (MDNF)
   Demand di of node i is the no. of un-received pieces for
    node i, i.e. $d_i = \sum_{j=1}^{M} (1 - P_{ij})$
   When choosing recipients, prefer sending to the node
    with the largest di
   MDNF – Node-Oriented: (p={1,1,2,2,2}, q={2,3,2,3,3})
    (the number to the right of each row of P0 is the demand di)

         [ 1 0 1 0 0 0 0 0 ] 6        [ 1 0 1 1 1 0 0 0 ]
         [ 1 0 0 1 0 0 0 0 ] 6        [ 1 0 1 1 1 1 0 0 ]
    P0 = [ 0 1 0 0 1 0 1 1 ] 4   P1 = [ 0 1 0 0 1 0 1 1 ]   …
         [ 1 0 1 0 0 1 0 1 ] 4        [ 1 1 1 0 0 1 1 1 ]
         [ 0 1 0 0 0 0 1 1 ] 5        [ 0 1 0 0 0 1 1 1 ]

   MDNF – Piece-Oriented: (p={1,1,2,2,2}, q={2,3,2,3,3})

         [ 1 0 1 0 0 0 0 0 ] 6        [ 1 0 1 1 1 0 0 0 ]
         [ 1 0 0 1 0 0 0 0 ] 6        [ 1 1 0 1 1 1 0 0 ]
    P0 = [ 0 1 0 0 1 0 1 1 ] 4   P1 = [ 0 1 0 0 1 0 1 1 ]   …
         [ 1 0 1 0 0 1 0 1 ] 4        [ 1 1 1 0 0 1 0 1 ]
         [ 0 1 0 0 0 0 1 1 ] 5        [ 0 1 1 0 0 1 1 1 ]
Problem with RPF and MDNF
   The max. no. of transmissions for each
    cycle is not always achieved
   Using MDNF – Piece-Oriented (p={2,2,2,1}, q={2,1,2,2}),
    only 6 transmissions can be scheduled (but the max. is 7)
    (the number to the right of each row is the demand di)

         [ 1 1 0 0 ] 2
    P0 = [ 1 0 1 1 ] 1
         [ 1 0 0 1 ] 2
         [ 0 1 1 0 ] 2

    [Figure: two schedules drawn on P0, the MDNF schedule (only 6 transmissions) and a maximum schedule (7 transmissions)]
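The greedy heuristics above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not spell out the tie-breaking rules or the exact difference between the node-oriented and piece-oriented variants, so the sketch below visits pieces from rarest to most common, optionally orders recipients by demand (the MDNF preference), and always picks the lowest-numbered holder with spare upload capacity. The name `greedy_cycle` and its `most_demanding_first` flag are hypothetical.

```python
def greedy_cycle(P, p, q, most_demanding_first=False):
    """Greedily schedule one cycle; returns S with 1-indexed senders (0 = none)."""
    n, m = len(P), len(P[0])
    up, down = list(p), list(q)                      # remaining budgets this cycle
    rarity = [sum(P[i][j] for i in range(n)) for j in range(m)]
    demand = [row.count(0) for row in P]
    S = [[0] * m for _ in range(n)]
    for j in sorted(range(m), key=lambda piece: rarity[piece]):   # rarest first
        receivers = [i for i in range(n) if P[i][j] == 0]
        if most_demanding_first:
            receivers.sort(key=lambda i: -demand[i])
        for i in receivers:
            if down[i] == 0:
                continue
            # Assumed tie-break: the first holder that can still upload.
            sender = next((u for u in range(n) if P[u][j] == 1 and up[u] > 0), None)
            if sender is None:
                break                                # nobody can send piece j any more
            S[i][j] = sender + 1
            up[sender] -= 1
            down[i] -= 1
    return S


# The 4-node example from the slide.
P0 = [
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]
S0 = greedy_cycle(P0, p=[2, 2, 2, 1], q=[2, 1, 2, 2], most_demanding_first=True)
print(sum(1 for row in S0 for sender in row if sender))  # 6
```

With these particular tie-breaking choices the sketch also stops at 6 transmissions on this example: nodes 1 to 3 use up their upload budget on the rarer pieces, so nobody is left to send piece 1 to node 4. This illustrates the limitation of greedy matching, although it is not necessarily the same schedule the original MDNF run produced.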
Maximum-Flow (MaxFlow)
   Example:

        [ 1 0 1 1 1 ]
        [ 1 0 0 1 1 ]
    P = [ 0 1 0 1 1 ]    p = {2,1,2,1,3}, q = {2,1,3,3,3}
        [ 0 0 1 0 0 ]
        [ 1 0 1 1 0 ]

   Let G = (V,E) be the flow network graph
    $V = L \cup R \cup B \cup \{s, t\}$
    $L = \{L_1, L_2, \ldots, L_N\}$
    $R = \{R_1, R_2, \ldots, R_N\}$
    $b_{ij} \in B \iff P_{ij} = 0$
    $E = \{(s, L_i) \mid L_i \in L\} \cup \{(R_i, t) \mid R_i \in R\} \cup \{(b_{ij}, R_i) \mid b_{ij} \in B, R_i \in R\} \cup H$
    $H = \{(L_u, b_{vj}) \mid L_u \in L, \ b_{vj} \in B, \text{ and } (P_{uj} = 1 \wedge P_{vj} = 0)\}$
   Capacities:
    $c(s, L_i) = p_i$
    $c(R_i, t) = q_i$
    $c(u, v) = 1, \ \forall (u, v) \in E \setminus (\{(s, L_i)\} \cup \{(R_i, t)\})$
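The network defined above can be built and solved with a short augmenting-path routine. The following Python sketch is an illustration rather than the authors' implementation: node labels such as `("L", i)` and `("b", v, j)` and the function names `build_network` and `max_flow` are assumptions, and augmenting paths are found by breadth-first search. It is run on the 4-node example from the "Problem with RPF and MDNF" slide, where the stated maximum is 7 transmissions.

```python
from collections import defaultdict, deque


def build_network(P, p, q):
    """Residual-capacity map for G: s -> L_u -> b_vj -> R_v -> t."""
    n, m = len(P), len(P[0])
    cap = defaultdict(lambda: defaultdict(int))
    for i in range(n):
        cap["s"][("L", i)] = p[i]            # c(s, L_i) = p_i
        cap[("R", i)]["t"] = q[i]            # c(R_i, t) = q_i
    for v in range(n):
        for j in range(m):
            if P[v][j] == 0:                 # b_vj exists iff node v lacks piece j
                cap[("b", v, j)][("R", v)] = 1
                for u in range(n):
                    if P[u][j] == 1:         # edge set H: u holds piece j
                        cap[("L", u)][("b", v, j)] = 1
    return cap


def max_flow(cap, source="s", sink="t"):
    """Augment along shortest (BFS) paths until none is left; mutates cap."""
    flow = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, residual in list(cap[u].items()):
                if residual > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Find the bottleneck along the path, then push that much flow.
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck          # reverse edge allows later rerouting
            v = u
        flow += bottleneck


P0 = [
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]
network = build_network(P0, p=[2, 2, 2, 1], q=[2, 1, 2, 2])
total = max_flow(network)

# Flow on an edge (L_u, b_vj) shows up as residual capacity on its reverse edge.
transfers = sorted(
    (node[1] + 1, b[2] + 1, b[1] + 1)        # (sender, piece, receiver), 1-indexed
    for b in list(network) if b[0] == "b"
    for node, residual in network[b].items()
    if node[0] == "L" and residual > 0
)
print(total)  # 7
```

Each tuple in `transfers` is one scheduled transmission for the cycle, so the list can be turned directly into a schedule matrix.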
Maximum-Flow (MaxFlow)
   Edmonds-Karp Algorithm:
    - Find augmenting paths using BFS
    - Guaranteed to find the maximum no. of
      transmissions in each cycle
    - Complexity = $O(E f^*) = O(N^2 M \cdot \min\{\sum_{i=1}^{N} p_i, \ \sum_{i=1}^{N} q_i\})$,
      where f* is the maximum flow value
MaxFlow – Counter Example
   Pure MaxFlow performance is unsatisfactory, as it does
    not consider whether we can match more in subsequent
    cycles
   Using MaxFlow, a total of 3 cycles are needed: (p={2,2,2,2,2}, q={3,3,3,3,3})

         [ 1 0 1 1 1 1 0 ]        [ 1 1 1 1 1 1 1 ]
         [ 1 1 0 1 0 1 1 ]        [ 1 1 1 1 1 1 1 ]
    P0 = [ 0 0 0 0 1 0 1 ]   P1 = [ 0 0 1 1 1 1 1 ]   …   P3 = 1 (all-ones matrix)
         [ 0 0 0 1 1 1 0 ]        [ 0 0 1 1 1 1 1 ]
         [ 0 0 1 1 1 0 0 ]        [ 0 0 1 1 1 0 1 ]

   Using RPF – Node-Oriented, only 2 cycles are needed: (p={2,2,2,2,2},
    q={3,3,3,3,3})

         [ 1 0 1 1 1 1 0 ]        [ 1 1 1 1 1 1 1 ]
         [ 1 1 0 1 0 1 1 ]        [ 1 1 1 1 0 1 1 ]
    P0 = [ 0 0 0 0 1 0 1 ]   P1 = [ 1 1 0 0 1 1 1 ]   P2 = 1 (all-ones matrix)
         [ 0 0 0 1 1 1 0 ]        [ 1 0 1 1 1 1 1 ]
         [ 0 0 1 1 1 0 0 ]        [ 0 0 1 1 1 1 0 ]
MaxFlow - Weighted
   Put weights on both sides to
    give priorities to some nodes
    during searching
   Weights on Li = γi (sum of
    the no. of 0s in other peers
    for those pieces that peer i
    has)
   Weights on Bij = δij (sum of the
    no. of 0s across row i and
    column j)
   E.g. for

        [ 1 0 1 1 1 ]
        [ 1 0 0 1 1 ]
    P = [ 0 1 0 1 1 ]
        [ 0 0 1 0 0 ]
        [ 1 0 1 1 0 ]

    γ3 = 4 + 1 + 2 = 7
    δ42 = 7
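The two weights can be computed directly from the possession matrix. The Python sketch below is illustrative (the names `gamma` and `delta` are not from the slides). It assumes that the entry b_ij itself is counted only once in δij, which is the reading that reproduces δ42 = 7 in the example above.

```python
def gamma(P):
    """gamma_i: for each piece node i holds, count the peers still missing it."""
    col_zeros = [col.count(0) for col in zip(*P)]
    return [sum(z for x, z in zip(row, col_zeros) if x == 1) for row in P]


def delta(P, i, j):
    """delta_ij: 0s across row i plus 0s down column j (entry (i, j) counted once)."""
    row_zeros = P[i].count(0)
    col_zeros = sum(1 for row in P if row[j] == 0)
    return row_zeros + col_zeros - 1


P = [
    [1, 0, 1, 1, 1],
    [1, 0, 0, 1, 1],
    [0, 1, 0, 1, 1],
    [0, 0, 1, 0, 0],
    [1, 0, 1, 1, 0],
]
print(gamma(P)[2])     # gamma_3 = 4 + 1 + 2 = 7
print(delta(P, 3, 1))  # delta_42 = 7
```

Indices in the code are 0-based, so `gamma(P)[2]` is γ3 and `delta(P, 3, 1)` is δ42.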
MaxFlow – Weighted Counter Example
   For pi = 2, qi = 3 (all 8 nodes)
    (the number to the right of each row of P0 is the no. of 0s in that row;
    the line below P0 is the no. of 0s in each column)
   Using MaxFlow – Weighted, a total of 3 cycles are needed:

         [ 1 1 0 1 0 0 1 0 ] 4        [ 1 1 1 1 1 0 1 1 ]
         [ 1 0 0 1 0 0 1 1 ] 4        [ 1 1 1 1 0 1 1 1 ]
         [ 0 1 1 1 1 0 1 1 ] 2        [ 0 1 1 1 1 0 1 1 ]
    P0 = [ 0 1 0 0 0 1 0 1 ] 5   P1 = [ 0 1 1 1 1 1 0 1 ]   …   P3 = 1
         [ 1 0 0 1 1 1 0 0 ] 4        [ 1 1 0 1 1 1 0 1 ]
         [ 1 0 0 0 0 1 1 0 ] 5        [ 1 1 1 1 0 1 1 0 ]
         [ 1 0 1 0 1 0 0 1 ] 4        [ 1 1 1 0 1 1 0 1 ]
         [ 0 0 1 0 1 1 1 0 ] 4        [ 0 0 1 0 1 1 1 0 ]
           3 5 5 4 4 4 3 4

   Using MDNF – Piece-Oriented, only 2 cycles are needed:

         [ 1 1 0 1 0 0 1 0 ] 4        [ 1 1 1 1 0 1 1 0 ]
         [ 1 0 0 1 0 0 1 1 ] 4        [ 1 1 1 1 0 1 1 1 ]
         [ 0 1 1 1 1 0 1 1 ] 2        [ 0 1 1 1 1 0 1 1 ]
    P0 = [ 0 1 0 0 0 1 0 1 ] 5   P1 = [ 0 1 1 1 0 1 0 1 ]   P2 = 1
         [ 1 0 0 1 1 1 0 0 ] 4        [ 1 1 0 1 1 1 0 0 ]
         [ 1 0 0 0 0 1 1 0 ] 5        [ 1 1 1 1 0 1 1 0 ]
         [ 1 0 1 0 1 0 0 1 ] 4        [ 1 1 1 1 1 1 0 1 ]
         [ 0 0 1 0 1 1 1 0 ] 4        [ 0 1 1 1 1 1 1 0 ]
           3 5 5 4 4 4 3 4
MaxFlow – Dynamically-Weighted
   Allows the weights to be dynamically varied
    within each scheduling cycle
   E.g. γ = {15,14,25,13,15,10,16,16} and δ43 = 9, which is the greatest value
    among all δij. Once piece 3 is assigned to node 4, the row and column
    counts of 0s (shown to the right of and below each matrix) are updated:

        [ 1 1 0 1 0 0 1 0 ] 4            [ 1 1 0 1 0 0 1 0 ] 4
        [ 1 0 0 1 0 0 1 1 ] 4            [ 1 0 0 1 0 0 1 1 ] 4
        [ 0 1 1 1 1 0 1 1 ] 2            [ 0 1 1 1 1 0 1 1 ] 2
    P = [ 0 1 0 0 0 1 0 1 ] 5    =>  P = [ 0 1 1 0 0 1 0 1 ] 4
        [ 1 0 0 1 1 1 0 0 ] 4            [ 1 0 0 1 1 1 0 0 ] 4
        [ 1 0 0 0 0 1 1 0 ] 5            [ 1 0 0 0 0 1 1 0 ] 5
        [ 1 0 1 0 1 0 0 1 ] 4            [ 1 0 1 0 1 0 0 1 ] 4
        [ 0 0 1 0 1 1 1 0 ] 4            [ 0 0 1 0 1 1 1 0 ] 4
          3 5 5 4 4 4 3 4                  3 5 4 4 4 4 3 4
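One step of the dynamic re-weighting shown above can be reproduced with a short Python sketch. It is only an illustration under assumptions: a tentative assignment is modelled by setting the entry to 1 in a working copy of the matrix, ties between equal δij values are broken by taking the first entry in row-major order, and the helper names (`zero_counts`, `gamma`, `pick_heaviest`) are hypothetical.

```python
P0 = [
    [1, 1, 0, 1, 0, 0, 1, 0],
    [1, 0, 0, 1, 0, 0, 1, 1],
    [0, 1, 1, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 0, 1, 0, 1],
    [1, 0, 0, 1, 1, 1, 0, 0],
    [1, 0, 0, 0, 0, 1, 1, 0],
    [1, 0, 1, 0, 1, 0, 0, 1],
    [0, 0, 1, 0, 1, 1, 1, 0],
]


def zero_counts(W):
    """No. of 0s in each row and in each column of W."""
    return [row.count(0) for row in W], [col.count(0) for col in zip(*W)]


def gamma(W):
    """Sender weights: 0s remaining in the columns of the pieces each node holds."""
    _, cols = zero_counts(W)
    return [sum(z for x, z in zip(row, cols) if x == 1) for row in W]


def pick_heaviest(W):
    """Return ((i, j), delta_ij) for the missing entry with the largest delta_ij."""
    rows, cols = zero_counts(W)
    missing = [(i, j) for i, row in enumerate(W) for j, x in enumerate(row) if x == 0]
    best = max(missing, key=lambda ij: rows[ij[0]] + cols[ij[1]] - 1)
    return best, rows[best[0]] + cols[best[1]] - 1


W = [row[:] for row in P0]           # working copy for the current cycle
print(gamma(W))                      # [15, 14, 25, 13, 15, 10, 16, 16]
(i, j), weight = pick_heaviest(W)    # 0-based (3, 2), i.e. delta_43 = 9
W[i][j] = 1                          # tentatively schedule piece 3 for node 4
rows, cols = zero_counts(W)          # weights are recomputed before the next pick
print(rows, cols)
```

After the update the row counts become 4, 4, 2, 4, 4, 5, 4, 4 and the column counts 3, 5, 4, 4, 4, 4, 3, 4, matching the right-hand matrix of the slide.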
 Simulation Results (1)




  Fig. 1 Performance comparison of various scheduling algorithms (All) with
varying peer sizes (file size = 100, pi = 2, qi = 3, equal probability for 1s and 0s)
   Simulation Results (2)




 Fig. 2 Performance comparison of various scheduling algorithms (Representative)
with varying peer sizes (file size = 100, pi = 2, qi = 3, equal probability for 1s and 0s)
  Simulation Results (3)




Fig. 3 Performance comparison of various scheduling algorithms (Representative)
with varying file sizes (peer size = 10, pi = 2, qi = 3, equal probability for 1s and 0s)
Future Work
 Study the case of asynchronous
  scheduling, where the transmission time is
  different for different pairs of nodes
 Study the case when the network is
  dynamic in nature, where peers can come
  and go at any instant and they may shift to
  communicate with different sets of peers
  during the distribution process
Conclusion
   The data distribution problem in P2P networks has
    not been well studied in previous research
   We formally define the collaborative file
    distribution problem with the possession and
    transmission matrix formulations
   We also derive a theoretical lower bound on the
    minimum distribution time required
   We develop several types of algorithms (RPF,
    MDNF, MaxFlow) for solving the problem
   In simulations, our novel dynamically-weighted
    max-flow algorithm outperforms all the other
    algorithms
Thank You!



       Q&A

				