P2P

Introduction

P2P has quickly grown in popularity
  Dozens or hundreds of file sharing applications
  Many millions of people worldwide use P2P
   networks
  Audio/video transfer now dominates traffic on
   the Internet




Overview

Centralized Database
   Napster

Swarming
   BitTorrent

Hierarchical Query Flooding
   KaZaA
   Skype (VOIP)
 Conclusion


P2P: centralized directory
Original “Napster” design (1999, S. Fanning)

1) When a peer connects, it informs the central server of its:
    IP address
    content
2) Alice queries the directory server for “Hey Jude”
3) Alice requests the file from Bob

[Figure: peers, including Alice and Bob, around a centralized directory server; arrows are labeled 1, 2, and 3 to match the steps above.]

Problems?
    Single point of failure
    Performance bottleneck
    Copyright infringement
Napster: Publish


 Publish: peer 123.2.21.23 tells the central server "I have X, Y, and Z!"
 The server records insert(X, 123.2.21.23), ...
Napster: Search

 Query: a client asks the central server "Where is file A?"
 Reply: search(A) --> 123.2.0.18
 Fetch: the client downloads the file directly from 123.2.0.18
Napster: Discussion
 Pros:
  Simple
  Search scope is O(1)
  Controllable (pro or con?)
 Cons:
  Server maintains O(N) State
  Server does all processing
  Single point of failure
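
The publish and search steps of the Napster slides can be sketched as a single in-memory index. This is only an illustration: the class and method names (`CentralDirectory`, `insert`, `search`, `remove_peer`) are hypothetical and not part of Napster's actual protocol. It shows why a search costs one lookup while the server's state grows with the number of peers and files.

```python
from collections import defaultdict


class CentralDirectory:
    """Hypothetical in-memory index kept by the central server."""

    def __init__(self):
        # file name -> set of peer IP addresses that announced it
        self.index = defaultdict(set)

    def insert(self, filename, peer_ip):
        """Publish: a peer announces that it holds a file."""
        self.index[filename].add(peer_ip)

    def remove_peer(self, peer_ip):
        """Drop every entry of a peer that disconnected."""
        for peers in self.index.values():
            peers.discard(peer_ip)

    def search(self, filename):
        """Search: one dictionary lookup; no other peer is contacted."""
        return sorted(self.index.get(filename, ()))


directory = CentralDirectory()
for name in ("X", "Y", "Z"):
    directory.insert(name, "123.2.21.23")
directory.insert("A", "123.2.0.18")
print(directory.search("A"))  # ['123.2.0.18']
```

The fetch itself happens directly between the two peers; only the index lives on the server, which is also why the server is a single point of failure.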


BitTorrent: History
 In 2002, B. Cohen debuted BitTorrent
 Key Motivation:
    Popularity exhibits temporal locality (Flash Crowds)
    E.g., CNN on 9/11, new movie/game release
 Focused on Efficient Fetching, not Searching
    Does not include a search mechanism, but rather, it relies on
     central-directory based search facilities as provided by Web sites
     such as suprnova.org
 Has some “real” publishers:
    Blizzard Entertainment using it to distribute games



BitTorrent – Measurement on SuprNova

[Figure: measurement results on SuprNova, with series labeled overall, videos, games, and music.]
BitTorrent: Characteristics

The more popular a file is, the faster it transfers
Reduce the load of file publisher
Reliable transmission
Fairness
  Choke/unchoke




BitTorrent: Overview
Swarming:
  Join: contact centralized “tracker” server, get a
   list of peers.
  Publish: Run a tracker server. Create a .torrent file.
  Search: Out-of-band. E.g., use Google to find a
   tracker for the file you want.
  Fetch: Download pieces of the file from your
   peers. Upload pieces you have to them.



Components
 Peers
     Seed (file owner)
     Leech (downloader)
 Data
     Divided into 256KB pieces
     A hash of each piece is used to verify its integrity
 Metainfo file (.torrent file)
     Information about the data and the hash value of each piece
     URL of the tracker
     Etc.
 Tracker
     HTTP/HTTPS service
     Maintains information about each peer in the torrent
     One way for peers to find other peers
 BT client
     BitComet
     µTorrent
     Etc.
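
The per-piece hashing can be sketched as follows. The slide only says "hash"; SHA-1 is used here because that is what the original BitTorrent metainfo format stores per piece. The function names `piece_hashes` and `verify_piece` are made up for this example.

```python
import hashlib

PIECE_SIZE = 256 * 1024  # 256KB pieces, as on the slide


def piece_hashes(data, piece_size=PIECE_SIZE):
    """Split data into fixed-size pieces and return one SHA-1 digest per piece."""
    return [
        hashlib.sha1(data[offset:offset + piece_size]).digest()
        for offset in range(0, len(data), piece_size)
    ]


def verify_piece(index, piece, hashes):
    """Check a downloaded piece against the hash listed in the metainfo file."""
    return hashlib.sha1(piece).digest() == hashes[index]


data = bytes(600 * 1024)       # stand-in for a shared file: 600KB of zeros
hashes = piece_hashes(data)    # 3 pieces: 256KB + 256KB + 88KB
good = data[:PIECE_SIZE]
corrupt = b"\x01" + good[1:]   # same piece with its first byte flipped
print(len(hashes), verify_piece(0, good, hashes), verify_piece(0, corrupt, hashes))
# 3 True False
```

Because every piece is checked independently, a leecher can accept pieces from untrusted peers and discard only the ones that fail verification.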
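File publish

 Create a metainfo file (the .torrent file) for the file
  to be shared; it records information about the
  shared data.
 Run a program called a tracker on a web server; the
  tracker records information about all peers in the
  torrent.
 Publish the metainfo file (.torrent) on the network so
  that others can obtain it.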
BitTorrent – Peer-set
 Peer-set
  The list of neighbors a peer is allowed to communicate
   with


 Peer-set construction
  Each peer (seed or leecher) contacts the tracker and
   gets a list of peers participating in the same session
  Typically 50 peers are chosen at random by the tracker
   for each peer
  The peer-set is augmented by peers that connect directly
   to this peer
  The peer-set size is limited to 80 peers
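
A minimal sketch of peer-set construction, using the two numbers from the slide (50 peers returned by the tracker, at most 80 peers in the set). The names `tracker_sample` and `PeerSet` are hypothetical, and refusing incoming connections once the set is full is an assumption about how the limit is enforced.

```python
import random

TRACKER_SAMPLE = 50   # peers returned by the tracker (slide value)
MAX_PEER_SET = 80     # upper bound on the peer-set size (slide value)


def tracker_sample(session_peers, requester, k=TRACKER_SAMPLE, rng=random):
    """Tracker side: pick up to k random peers, excluding the requester."""
    candidates = [p for p in session_peers if p != requester]
    return rng.sample(candidates, min(k, len(candidates)))


class PeerSet:
    def __init__(self, initial_peers, limit=MAX_PEER_SET):
        self.limit = limit
        self.peers = set(list(initial_peers)[:limit])

    def accept_incoming(self, peer):
        """Add a peer that connected to us, unless the set is already full."""
        if peer in self.peers:
            return True
        if len(self.peers) >= self.limit:
            return False  # assumption: extra connections are simply refused
        self.peers.add(peer)
        return True


session = [f"peer{i}" for i in range(200)]
peer_set = PeerSet(tracker_sample(session, "peer0"))
accepted = [peer_set.accept_incoming(f"incoming{i}") for i in range(40)]
print(len(peer_set.peers), accepted.count(True))  # 80 30
```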
Architecture




Join/fetch
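   Download the metainfo file
   Connect to the tracker and send a request
   Get the peer list from the tracker
   Pick some peers from the peer list and handshake with
    them
   After a successful handshake, the peers tell each other
    which pieces they have
   Check whether the other peer has pieces you are interested in
   If it does and you are not choked by that peer, start
    transferring pieces; if you are choked, either wait or
    connect to other peers
   Periodically repeat steps 2 and 3 so that both the peers and
    the tracker have up-to-date peer information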
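
The decision taken after the handshake can be sketched like this. The function name `after_handshake` and the returned strings are made up for illustration; real clients exchange this state through protocol messages, which are not modeled here.

```python
def after_handshake(my_pieces, their_pieces, choked_by_them):
    """Decide what to do with a peer once both sides have announced their pieces."""
    wanted = sorted(their_pieces - my_pieces)  # pieces we are interested in
    if not wanted:
        return "not interested", wanted
    if choked_by_them:
        return "wait or try other peers", wanted
    return "request pieces", wanted


print(after_handshake({0, 1}, {1, 2, 3}, choked_by_them=False))
# ('request pieces', [2, 3])
```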

File distribution: BitTorrent
    P2P file distribution
        tracker: tracks peers participating in torrent
        torrent: group of peers exchanging chunks of a file

[Figure: a peer obtains a list of peers from the tracker, then trades chunks with the other peers in the torrent.]
                                             2: Application Layer
BitTorrent - Algorithms

Two components in the BitTorrent downloading
 algorithm:

Peer Selection – determines from whom to
 download a piece

Piece Selection – determines which piece
 to download
Tit for Tat
 An agent using the "tit for tat" strategy responds
  in kind to its opponent's previous action.
 If the opponent was previously cooperative, the
  agent is cooperative. If not, the agent is not.
 This strategy depends on the following
  conditions:
    1. Unless provoked, the agent will always cooperate
    2. If provoked, the agent will retaliate
    3. The agent is quick to forgive
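
The three conditions can be seen in a toy repeated game. This sketch is not BitTorrent code; the move labels `C` (cooperate) and `D` (defect) and the helper `play` are assumptions made for the example.

```python
COOPERATE, DEFECT = "C", "D"


def tit_for_tat(opponent_history):
    """Cooperate first, then repeat the opponent's previous move."""
    if not opponent_history:
        return COOPERATE
    return opponent_history[-1]


def play(opponent_moves):
    """Return the agent's moves against a fixed sequence of opponent moves."""
    agent_moves = []
    for round_no in range(len(opponent_moves)):
        agent_moves.append(tit_for_tat(opponent_moves[:round_no]))
    return "".join(agent_moves)


print(play("CCDDCC"))  # CCCDDC
```

The agent cooperates until the opponent defects, retaliates once per defection, and returns to cooperation one round after the opponent does.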




BitTorrent - Peer selection

Choke Algorithm
  Choking is a temporary refusal to upload
  Each peer unchokes a fixed number of peers
   (default = 4)
    3 peers on tit-for-tat basis
    1 peer on optimistic unchoke basis




BitTorrent - Peer selection (cont’d)

Tit-for-tat peer selection
  Select the 3 peers from which you downloaded
   most and that are interested in your chunks
  Peer selection is done every 10 seconds, based
   on the download rates of the last 30
   seconds.




BitTorrent - Peer selection (cont’d)

Optimistic unchoke peer selection
  Select one peer at random that is interested in
   your chunks, regardless of the current
   download rate from it
  Rotates every 30 seconds.
Reason:
  To discover currently unused connections that
   are better than the ones being used
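
The choke algorithm from the last three slides can be sketched as one function called every 10 seconds. All names are hypothetical, `download_rate` stands for the measured rate over the recent window, and picking the optimistic peer only among peers that are not already unchoked is an assumption of this sketch.

```python
import random

REGULAR_SLOTS = 3          # tit-for-tat slots (slide value)
RECHOKE_INTERVAL = 10      # seconds between tit-for-tat rounds (slide value)
OPTIMISTIC_INTERVAL = 30   # seconds between optimistic rotations (slide value)


def tit_for_tat_unchokes(download_rate, interested):
    """Pick the interested peers we downloaded from fastest."""
    ranked = sorted(interested, key=lambda p: download_rate.get(p, 0.0), reverse=True)
    return ranked[:REGULAR_SLOTS]


def optimistic_unchoke(interested, already_unchoked, rng=random):
    """Pick one other interested peer at random, ignoring its rate."""
    candidates = [p for p in interested if p not in already_unchoked]
    return rng.choice(candidates) if candidates else None


def choke_round(now, download_rate, interested, current_optimistic, rng=random):
    """Return (regular_unchokes, optimistic_peer) for the round at time `now`."""
    regular = tit_for_tat_unchokes(download_rate, interested)
    stale = (
        now % OPTIMISTIC_INTERVAL == 0
        or current_optimistic is None
        or current_optimistic in regular
        or current_optimistic not in interested
    )
    if stale:
        current_optimistic = optimistic_unchoke(interested, regular, rng)
    return regular, current_optimistic


rates = {"A": 50.0, "B": 20.0, "C": 80.0, "D": 5.0, "E": 0.0}
interested = ["A", "B", "C", "D", "E"]
optimistic = None
for now in range(0, 40, RECHOKE_INTERVAL):
    regular, optimistic = choke_round(now, rates, interested, optimistic)
    print(now, regular, optimistic)
```

With these rates the three regular slots always go to C, A, and B, while the fourth slot goes to D or E at random and is redrawn every 30 seconds.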


BitTorrent

Pulling Chunks
 at any given time, different peers have different subsets of file
  chunks
 periodically, a peer (Alice) asks each neighbor for the list of
  chunks that they have
 Alice sends requests for her missing chunks

Sending Chunks: tit-for-tat
 Alice sends chunks to the four neighbors currently sending her
  chunks at the highest rate
    re-evaluate top 4 every 10 secs
 every 30 secs: randomly select another peer, start sending it
  chunks
    newly chosen peer may join top 4
BitTorrent: Tit-for-tat
  (1) Alice “optimistically unchokes” Bob
  (2) Alice becomes one of Bob’s top-four providers; Bob reciprocates
  (3) Bob becomes one of Alice’s top-four providers

  With a higher upload rate, a peer can find better trading
   partners and get the file faster!
BitTorrent: Sharing Strategy
“Tit-for-tat” sharing strategy
   “I’ll share with you if you share with me”
   Be optimistic: occasionally let freeloaders download
       Otherwise no one would ever start!
       Also allows you to discover better peers to download from
        when they reciprocate


 Super seed




BitTorrent - Piece selection
 Random first piece
   Only applies if leecher has downloaded less than 4
    pieces (chunks)
   Choose randomly the next piece to download

 Local rarest first policy
   Determine the pieces that are most rare among your
    peers and download those first
   Ensures that the most common pieces are left till the
    end to download
   Rarest first also ensures that a large variety of pieces
    are downloaded from the seed
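
Both piece selection policies fit in one short function. This is a sketch under assumptions: pieces are identified by integer indices, each peer's holdings are given as a set, ties among equally rare pieces are broken at random, and the name `next_piece` is made up.

```python
import random
from collections import Counter

RANDOM_FIRST_THRESHOLD = 4  # pieces held before rarest-first starts (slide value)


def next_piece(have, peer_bitfields, rng=random):
    """Choose the next piece index to request, or None if nothing is available.

    have: set of piece indices already downloaded.
    peer_bitfields: dict mapping peer -> set of piece indices that peer holds.
    """
    availability = Counter()
    for pieces in peer_bitfields.values():
        availability.update(pieces - have)  # count copies of pieces we still need
    if not availability:
        return None
    if len(have) < RANDOM_FIRST_THRESHOLD:
        # Random first piece: any needed piece will do.
        return rng.choice(sorted(availability))
    # Local rarest first: fewest copies in the peer-set, ties broken at random.
    rarest = min(availability.values())
    return rng.choice(sorted(p for p, n in availability.items() if n == rarest))


peers = {
    "p1": {0, 1, 2, 3, 4, 5},
    "p2": {0, 1, 2, 3, 4},
    "p3": {0, 1, 2, 6},
}
print(next_piece({0, 1, 2, 3}, peers))  # 5 or 6, each held by a single peer
```

Piece 4 is skipped in the example because two peers hold it, so it is less at risk of disappearing from the peer-set than pieces 5 and 6.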

BitTorrent - Summary
 Efficient file download thanks to simple incentive
  mechanisms
  Local rarest first
     High piece entropy
  Tit-for-tat
     Avoids free-riding
     Optimizes resource utilization
 Space for improvement?
  Central tracker server



BitTorrent References
Section inspired by:
 “Rarest First and Choke Algorithms are Enough”,
  Arnaud Legout, G. Urvoy-Keller, P. Michiardi,
  IMC 2006.
 “The Bittorrent P2P File-sharing System:
  Measurements and Analysis”, J.A Pouwelse, P.
  Garbacki, D.H.J Epema, H.J. Sips, IPTPS 05,
  February 2005.
 “Incentives Build Robustness in BitTorrent”,
  Bram Cohen, First Workshop on Economics of
  Peer-to-peer Systems, June 2003.
BTW: File Distribution: Server-Client vs P2P
 Question: How much time does it take to distribute a file
  of size F from one server to N peers?
    u_s: server upload bandwidth
    u_i: peer i upload bandwidth
    d_i: peer i download bandwidth

[Figure: a server holding the file (size F) with upload link u_s, and N peers with links u_1, d_1, u_2, d_2, ..., u_N, d_N, connected through a network with abundant bandwidth.]
BTW: File distribution time: server-client
 server sequentially sends N copies:
    NF/u_s time
 client i takes F/d_i time to download

 Time to distribute F to N clients using the client/server approach:
    $d_{cs} = \max\{ NF/u_s,\ F/\min_i d_i \}$

 increases linearly in N (for large N)
BTW: File distribution time: P2P
 server must send one copy: F/u_s time
 client i takes F/d_i time to download
 NF bits must be downloaded (aggregate)
    fastest possible upload rate: $u_s + \sum_i u_i$

    $d_{P2P} = \max\{ F/u_s,\ F/\min_i d_i,\ NF/(u_s + \sum_i u_i) \}$
Server-client vs. P2P: example
Client upload rate = u, F/u = 1 hour, u_s = 10u, d_min ≥ u_s

[Figure: Minimum Distribution Time (axis from 0 to 3.5) versus N (axis from 0 to 35), with one curve for P2P and one for Client-Server.]
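
The two bounds from the preceding slides can be evaluated for the example parameters. The sketch works in units where u = 1 and F = 1, so times come out in hours, and it assumes d_min = u_s (the slide only requires d_min ≥ u_s). The function names are made up.

```python
def client_server_time(F, u_s, downloads):
    """Lower bound max{N F / u_s, F / min_i d_i} for the client-server case."""
    n = len(downloads)
    return max(n * F / u_s, F / min(downloads))


def p2p_time(F, u_s, uploads, downloads):
    """Lower bound max{F / u_s, F / min_i d_i, N F / (u_s + sum_i u_i)} for P2P."""
    n = len(downloads)
    return max(F / u_s, F / min(downloads), n * F / (u_s + sum(uploads)))


u, F = 1.0, 1.0        # units chosen so that F/u = 1 hour
u_s = 10 * u
for n in (5, 10, 20, 30):
    uploads = [u] * n
    downloads = [u_s] * n   # assumption: d_min equals u_s
    cs = client_server_time(F, u_s, downloads)
    p2p = p2p_time(F, u_s, uploads, downloads)
    print(n, round(cs, 3), round(p2p, 3))
```

For N = 30 this gives 3 hours for client-server and 0.75 hours for P2P: the client-server time keeps growing linearly in N, while the P2P time N/(10 + N) stays below 1 hour.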
All Peers Equal?

[Figure: peers with heterogeneous access links: four on 1.5Mbps DSL, three on 56kbps modems, and one on a 10Mbps LAN.]
KaZaA: History
 In 2001, KaZaA was created by the Dutch company
  Kazaa BV
 Popular file sharing network with more than 10 million
  users (number varies)




KaZaA: Overview
 “Smart” Query Flooding:
  Join: on startup, a client contacts a “supernode”; it may at
   some point become one itself
  Publish: send list of files to supernode
  Search: send query to supernode; supernodes flood the
   query amongst themselves
  Fetch: get the file directly from peer(s)
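
A minimal sketch of publish and search in a two-tier overlay. The `Supernode` class, the `ttl` hop limit, and the `visited` set used to stop duplicate forwarding are assumptions for illustration; the slides do not describe how KaZaA bounds its flooding.

```python
from collections import defaultdict


class Supernode:
    """Hypothetical supernode: indexes its clients' files and floods queries."""

    def __init__(self, name):
        self.name = name
        self.neighbors = []             # other supernodes
        self.index = defaultdict(set)   # file name -> client addresses

    def publish(self, client_addr, filenames):
        """Publish: a client sends its list of files to this supernode."""
        for filename in filenames:
            self.index[filename].add(client_addr)

    def search(self, filename, ttl=2, visited=None):
        """Search: answer from the local index, then forward to neighbors."""
        visited = set() if visited is None else visited
        if self.name in visited:
            return set()
        visited.add(self.name)
        hits = set(self.index.get(filename, ()))
        if ttl > 0:
            for neighbor in self.neighbors:
                hits |= neighbor.search(filename, ttl - 1, visited)
        return hits


def link(a, b):
    """Connect two supernodes as neighbors."""
    a.neighbors.append(b)
    b.neighbors.append(a)


sn1, sn2, sn3 = Supernode("sn1"), Supernode("sn2"), Supernode("sn3")
link(sn1, sn2)
link(sn2, sn3)
sn2.publish("123.2.22.50", ["A"])
sn3.publish("123.2.0.18", ["A", "X"])
print(sorted(sn1.search("A")))  # ['123.2.0.18', '123.2.22.50']
```

Ordinary clients never see the flood; they only talk to their own supernode and then fetch the file directly from the peers returned.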




KaZaA: Network Design
[Figure: KaZaA network design, with the “Super Nodes” labeled.]
KaZaA: File Insert


 Publish: peer 123.2.21.23 tells its supernode "I have X!"
 The supernode records insert(X, 123.2.21.23), ...
   KaZaA: File Search
 Query: a client asks its supernode "Where is file A?"
 The supernodes forward search(A) among themselves
 Replies: search(A) --> 123.2.22.50 and search(A) --> 123.2.0.18
KaZaA: Discussion
 Pros:
    Tries to take into account node heterogeneity:
        Bandwidth
        Host Computational Resources
        Host Availability (?)


 Cons:
    Mechanisms easy to circumvent
    Still no real guarantees on search scope or search time



 P2P architecture used by Skype
Skype Overlay
 Protocol not fully understood today
  Content and control messages are encrypted

 Protocol reuses concepts of the FastTrack
  overlay used by KaZaA

 Builds upon an unstructured overlay
  Two tier hierarchy
      Super Nodes (SN)
      Ordinary Nodes (ON)



Skype Overlay (cont’d)
 Super Nodes (SN)
  Connect to each other, building a flat unstructured
   overlay (similar to the Gnutella overlay)


 Ordinary Nodes (ON)
  Connect to Super Nodes


 Skype login server
  Only central component
  Stores and verifies usernames and passwords

Skype Overlay (cont’d)


[Figure: Skype overlay. Super Nodes (SN) and Ordinary Nodes (ON) are linked by neighbor relationships; messages are exchanged with the Skype login server during login for authentication.]
How is the overlay constructed? - Super
Node Lists
Each node keeps a host cache (HC) with a
 list of Super Node IP address and port
 pairs
  Up to 200 entries
  The cache is stored in the Windows registry
Some Super Node IP addresses are
 hard-coded
How is the overlay constructed? -
Login
Contact login server and authenticate
Advertise your presence to other peers
  Contact a Super Node
  Contact your buddies (through a Super Node)
   and notify them of your presence




Login process: contact a super node
Super Nodes – Index servers

Super Nodes are index servers
  I.e. index of locally connected Skype users (and
   their IP addresses)
If buddy is not found in local index of a
 Super Node
  Spread node search to neighboring Super
   Nodes
  Not clear how this is implemented
     Possibly flood the request similar to Gnutella
Super Nodes – Relay nodes
Super nodes also act as relay nodes
  Enables NAT traversals
Alice would like to call Bob (or vice versa)

[Figure: Alice and Bob.]
Super Nodes – Relay nodes (cont’d)

Alice would like to call Bob (or vice versa)

[Figure: Alice and Bob connected through a Skype relay node; labels: Contact, Call, Relay Node.]
Super Node election

When does an ordinary node become a
 super node?
  High bandwidth and a public IP address, but the details
   are not clear




Super Node election

A world map of Skype Super Nodes




Skype References
Section inspired by:
 “An Analysis of the Skype Peer-to-Peer Internet
  Telephony Protocol”, S.A. Baset and H.G.
  Schulzrinne, Infocom 2006, April 2006.
 “An Experimental Study of the Skype Peer-to-
  Peer VoIP System”, Saikat Guha, Neil Daswani,
  Ravi Jain, IPTPS’06, February 2006.
 “Characterizing and Detecting Skype-Relayed
  Traffic”, K. Suh, D. R. Figueiredo, J. Kurose, D.
  Towsley, Infocom 2006, April 2006.
Conclusion and future of p2p




P2P Attracting Attention from the
Commercial World
 Startups providing P2P live programs: PPLive,
  CoolStreaming
 BBC legal download platform: iMP / Kontiki
  Allows users in the UK to download BBC TV and radio
   programs via a program guide for up to 7 days after
   broadcast




Will P2P Go Beyond Desktop?
 Current device requirements
   CPU, memory, and disk space requirements
   Platforms supported
   Internet connection requirements
 Three categories of P2P applications
   File downloading
      BitTorrent already runs on some set-top boxes and DSL
       routers
   Voice
      Skype mobile phones
   Video
      PPLive
      TVAnts
Will P2P Go Beyond Desktop?
(Discussion)
Mobile P2P?
  What benefits does P2P offer on mobile
   devices?
    ???
  What are potential issues?
    Power
    Connection speed
    ???



Future of P2P - Ad-hoc P2P
 Opportunistically use all available technologies!
 Access the knowledge and resources of devices you
  pass in the street

[Figure: nearby devices communicating opportunistically; label: GSM.]

 Local P2P content search
   Where is currently the best place to find a cab?
   What are the results of yesterday’s soccer match?
Conclusions and Future of P2P
 More commercial P2P applications
    The fight between legal and illegal content sharing will continue
    More P2P used in commercial environments
        Reduces distribution cost and competes with illegal content
 Secure P2P
 Better performance
    More intelligent sharing
    More scalable
 Supporting diversity – long tail content
    YouTube



