CMPT 880 P2P Systems

					       School of Computing Science
         Simon Fraser University

CMPT 765/408: P2P Systems

  Instructor: Dr. Mohamed Hefeeda

         P2P Computing: Definitions

   Peers cooperate to achieve desired functions
     - Peers:
        • End-systems (typically, user machines)
        • Interconnected through an overlay network
        • Peer ≡ Like the others (similar or behave in similar manner)
     - Cooperate:
        • Share resources, e.g., data, CPU cycles, storage, bandwidth
        • Participate in protocols, e.g., routing, replication, …
     - Functions:
        • File-sharing, distributed computing, communications,
          content distribution, …

   Note: the P2P concept is much wider than file sharing

                When Did P2P Start?

   Napster (Late 1990’s)
     - Court shut Napster down in 2001
   Gnutella (2000)
   Then the killer app: FastTrack (Kazaa, ...)
   BitTorrent, and many others
   Accompanied by significant research interest
   Claim
     - P2P is much older than Napster!
   Proof
     - The original Internet!
     - Remember UUCP (unix-to-unix copy)?
     What IS and IS NOT New in P2P?

   What is not new
    - Concepts!
   What is new
    - The term P2P (maybe!)
    - New characteristics of
       • The nodes that constitute the system, and
       • The system that we build

         What IS NOT New in P2P?

   Distributed architectures
   Distributed resource sharing
   Node management (join/leave/fail)
   Group communications
   Distributed state management
   ….

             What IS New in P2P?

   Nodes (Peers)
    - Quite heterogeneous
        • Several orders of magnitude difference in resources
       • Compare the bandwidth of a dial-up peer versus a
         high-speed LAN peer
    - Unreliable
       • Failure is the norm!
    - Offer limited capacity
       • Load sharing and balancing are critical
    - Autonomous
       • Rational, i.e., maximize their own benefits!
       • Motivations should be provided to peers to cooperate
         in a way that optimizes the system performance
       What IS New in P2P? (cont’d)
   System
    - Scale
        • Huge numbers of peers (millions)
    - Structure and topology
       • Ad-hoc: No control over peer joining/leaving
       • Highly dynamic
    - Membership/participation
       • Typically open 
    - More security concerns
       • Trust, privacy, data integrity, …
    - Cost of building and running
       • Small fraction of same-scale centralized systems
        • How much would it cost to build/run a supercomputer
          with the processing power of 3 million SETI@Home PCs?

        What IS New in P2P? (cont’d)

   So what?
   We need to design new lighter-weight
    algorithms and protocols to scale to
    millions (or billions!) of nodes given the
    new characteristics
   Question: why now, not two decades ago?
     - We did not have such abundant (and
       underutilized) computing resources back then!
     - And, network connectivity was very limited

     Why is it Important to Study P2P?

   P2P traffic is a major portion of Internet
    traffic (50+%), current killer app
   P2P traffic has exceeded web traffic
    (former killer app)!
   Direct implications on the design,
    administration, and use of computer
    networks and network resources
      - Think of ISP designers or campus network administrators
   Many potential distributed applications
          Sample P2P Applications

   File sharing
     - Gnutella, Kazaa, BitTorrent, …
   Distributed cycle sharing
     - SETI@home, Gnome@home, …
   File and storage systems
     - OceanStore, CFS, Freenet, Farsite, …
   Media streaming and content distribution
     - PROMISE
     - SplitStream, CoopNet, PeerCast, Bullet,
       Zigzag, NICE, …
    P2P vs. its Cousin (Grid Computing)

   Common Goal:
     - Aggregate resources (e.g., storage, CPU
       cycles, and data) into a common pool and
       provide efficient access to them
   Differences along five axes [Foster & Imanitchi 03]
     - Target communities and applications
     - Type of shared resources
     - Scalability of the system
     - Services provided
     - Software required

     P2P vs Grid Computing (cont’d)

 Issue          Grid                            P2P
 Communities    Communities, e.g.,              Grass-root communities
 and            scientific institutions;        (anonymous);
 Applications   computationally-intensive       mostly file-swapping
                problems
 Resources      Powerful and reliable           PCs with limited
 Shared         machines, clusters;             capacity and connectivity;
                high-speed connectivity;        unreliable; very diverse
                specialized
    P2P vs Grid Computing (cont’d)

 Issue          Grid                            P2P
 System         Hundreds to thousands           Hundreds of thousands
 Scalability    of nodes                        to millions of nodes
 Services       Sophisticated services:         Limited services:
 Provided       authentication, resource        resource discovery;
                discovery, scheduling,          limited trust among peers
                access control, and
                membership control;
                members usually trust
                each other
 Software       Sophisticated suites, e.g.,     Simple (screen saver),
 required       Globus, Condor                  e.g., Kazaa, SETI@Home
    P2P vs Grid Computing: Discussion

   The differences mentioned are based on the
    traditional view of each paradigm
      - It is anticipated that the two paradigms will converge
        and complement each other [e.g., Butt et al. 03]
   Target communities and applications
      - Grid: is opening up to broader communities
   Type of shared resources
      - P2P: is expanding to include more varied and more
        powerful resources
   Scalability of the system
      - Grid: is growing to larger numbers of nodes
   Services provided
      - P2P: is adding authentication, data integrity, trust
        management, …
           P2P Systems: Simple Model

   Two views:
    - Software architecture model on a peer (layers, top
      to bottom): P2P Application, P2P Substrate,
      Operating System
    - System architecture: peers form an overlay
      according to the P2P Substrate
                        Overlay Network

   An abstract layer built on top of physical network
   Neighbors in overlay can be several hops away in
    physical network
   Why do we need overlays?
     - Flexibility in
         • Choosing neighbors
         • Forming and customizing topology to fit application’s
           needs (e.g., short delay, reliability, high BW, …)
         • Designing communication protocols among nodes
     - Get around limitations in legacy networks
     - Enable new (and old!) network services

            Overlay Network (cont’d)

   Some applications that use overlays
     - Application level multicast, e.g., ESM, Zigzag, NICE, …
     - Reliable inter-domain routing, e.g., RON
     - Content Distribution Networks (CDN)
     - Peer-to-peer file sharing
   Overlay design issues
     - Select neighbors
     - Handle node arrivals, departures
     - Detect and handle failures (nodes, links)
     - Monitor and adapt to network dynamics
     - Match with the underlying physical network

Overlay Network (cont’d)
  Recall: IP Multicast


    Overlay Network (cont’d)
Application Level Multicast (ALM)


               Peer Software Model

   A software client installed on each peer
   Three components:
     - P2P Substrate
     - Middleware
     - P2P Application
    (Software model on a peer: P2P Application over
     P2P Substrate over Operating System)

        Peer Software Model (cont’d)

   P2P Substrate (key component)
     - Overlay management
       • Construction
        • Maintenance (peer join/leave/fail and network dynamics)
     - Resource management
       • Allocation (storage)
       • Discovery (routing and lookup)

   Ex: Pastry, CAN, Chord, …

   More on this later
       Peer Software Model (cont’d)

   Middleware
    - Provides auxiliary services to P2P applications:
       • Peer selection
       • Trust management
       • Data integrity validation
       • Authentication and authorization
       • Membership management
       • Accounting (Economics and rationality)
       • …
    - Ex: CollectCast, EigenTrust, Micro payment

           Peer Software Model (cont’d)

   P2P Application
    - Potentially, there could be multiple applications
      running on top of a single P2P substrate
    - Applications include
       •   File sharing
       •   File and storage systems
       •   Distributed cycle sharing
       •   Content distribution
    - This layer provides some functions and
      bookkeeping relevant to target application
       • File assembly (file sharing)
       • Buffering and rate smoothing (streaming)
   Ex: Promise, Bullet, CFS
                   P2P Substrate

   Key component, which
    - Manages the Overlay
    - Allocates and discovers objects
   P2P Substrates can be
    - Structured, or
    - Unstructured,
    based on the flexibility of placing objects at peers

          P2P Substrates: Classification

   Structured (or tightly controlled, DHT)
     − Objects are rigidly assigned to specific peers
      − Looks like a Distributed Hash Table (DHT)
     − Efficient search & guarantee of finding
     − Lack of partial name and keyword queries
     − Maintenance overhead
      − Ex: Chord, CAN, Pastry, Tapestry, Kademlia (Overnet)
   Unstructured (or loosely controlled)
     − Objects can be anywhere
     − Support partial name and keyword queries
     − Inefficient search & no guarantee of finding
     − Some heuristics exist to enhance performance
      − Ex: Gnutella, Kazaa (super node), GIA [Chawathe et al. 03]
         Structured P2P Substrates

   Objects are rigidly assigned to peers
     − Objects and peers have IDs (usually by
       hashing some attributes)
     − Objects are assigned to peers based on IDs
   Peers in overlay form specific geometrical
    shape, e.g.,
     - tree, ring, hypercube, butterfly network
   Shape (to some extent) determines
     − How neighbors are chosen, and
     − How messages are routed
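
The ID-based assignment can be sketched in a few lines (a minimal illustration assuming a ring-shaped ID space in the spirit of Chord; the 16-bit ID width and the peer names are arbitrary choices, not taken from any specific system):

```python
import hashlib
from bisect import bisect_left

def node_id(name, bits=16):
    # Hash an attribute (e.g., an IP address or file name) into the ID space
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)

def assign(object_key, peer_ids):
    # Ring rule: the object goes to the first peer whose ID is >= the
    # object's ID, wrapping around at the top of the ID space
    oid = node_id(object_key)
    ids = sorted(peer_ids)
    return ids[bisect_left(ids, oid) % len(ids)]

peers = [node_id("peer-%d" % i) for i in range(8)]
assert assign("song.mp3", peers) in peers   # some peer always owns the object
```

Because the rule is deterministic, every node that knows the peer IDs computes the same owner, which is what makes efficient routing toward the owner possible.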
      Structured P2P Substrates (cont’d)

   Substrate provides a Distributed Hash
    Table (DHT)-like interface
    − InsertObject (key, value), findObject (key), …
    − In the literature, many authors refer to
      structured P2P substrates as DHTs
   It also provides peer management (join,
    leave, fail) operations
   Most of these operations are done in O(log n)
    steps, where n is the number of peers

    Structured P2P Substrates (cont’d)

   DHTs: Efficient search & guarantee of finding objects
   However,
    − Lack of partial name and keyword queries
    − Maintenance overhead, even O(log n) may be too
      much in very dynamic environments
   Ex: Chord, CAN, Pastry, Tapestry, Kademlia

    Example: Content Addressable Network (CAN)
                          [Ratnasamy 01]

−   Nodes form an overlay in d-dimensional space
     − Node IDs are chosen randomly from the d-space
     − Object IDs (keys) are chosen from the same d-space
−   Space is dynamically partitioned into zones
−   Each node owns a zone
−   Zones are split and merged as nodes join and leave
−   Each node stores
     − The portion of the hash table that belongs to its zone
     − Information about its immediate neighbors in the d-space
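
The key-to-point mapping can be sketched as follows (a hypothetical construction: CAN only requires a uniform hash into the d-space, so the per-axis SHA-1 trick here is an illustration, not the actual scheme):

```python
import hashlib

def point_in_d_space(key, d=2):
    # One independent hash per axis, scaled into [0, 1)
    coords = []
    for axis in range(d):
        h = hashlib.sha1(("%d:%s" % (axis, key)).encode()).digest()
        coords.append(int.from_bytes(h[:8], "big") / 2.0 ** 64)
    return tuple(coords)

p = point_in_d_space("movie.avi", d=2)
assert all(0.0 <= c < 1.0 for c in p)   # always lands inside the unit square
```

The (key, value) pair is then stored at whichever node's zone contains this point.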

2-d CAN: Dynamic Space Division
  [Figure: 2-d coordinate space (0 to 7 on each axis) partitioned
   into rectangular zones, one zone per node]

2-d CAN: Key Assignment
  [Figure: key K2 hashes to a point in the space and is stored at
   node n2, the owner of the zone containing that point]

2-d CAN: Routing (Lookup)
  [Figure: a lookup for K2 is forwarded greedily from zone to
   zone until it reaches n2]

                  CAN: Routing

−   Nodes keep 2d = O(d) state information
    (neighbor coordinates, IPs)
    − Constant, does not depend on number of
      nodes n
−   Greedy routing
    - Route to the neighbor that is closest to the
      destination key
    - On average, a lookup is done in O(n^(1/d)) = O(log n)
      steps when d = (log n)/2
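
One greedy step can be sketched as below (assuming a unit d-torus and representing each neighbor by its zone-center coordinates; the helper names are illustrative):

```python
def torus_dist(a, b):
    # Euclidean distance on the unit d-torus: each coordinate wraps around
    return sum(min(abs(x - y), 1 - abs(x - y)) ** 2
               for x, y in zip(a, b)) ** 0.5

def next_hop(neighbor_coords, key_point):
    # Forward to the neighbor whose coordinates are closest to the key's point
    return min(neighbor_coords, key=lambda n: torus_dist(n, key_point))

# Wrapping matters: (0.1, 0.5) is 0.447 away from (0.9, 0.1), not 0.894
assert next_hop([(0.5, 0.1), (0.1, 0.5)], (0.9, 0.1)) == (0.5, 0.1)
```

Repeating this step zone by zone eventually reaches the node that owns the key's point.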

                   CAN: Node Join

−   New node finds a node already in the CAN
     − (bootstrap: one (or a few) dedicated nodes outside the
       CAN maintain a partial list of active nodes)
−   It finds a node whose zone will be split
     − Choose a random point P (will be its ID)
     − Forward a JOIN request to P through the existing node
−   The node that owns P splits its zone and sends
    half of its routing table to the new node
−   Neighbors of the split zone are notified
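
The zone-splitting step can be sketched as below (a simplified model where a zone is a list of (lo, hi) intervals, one per dimension; CAN picks the split axis by cycling through the dimensions in a fixed order):

```python
def split_zone(zone, axis):
    # Halve the zone along the given axis: the existing owner keeps the
    # lower half and the joining node takes the upper half
    lo, hi = zone[axis]
    mid = (lo + hi) / 2.0
    lower, upper = list(zone), list(zone)
    lower[axis] = (lo, mid)
    upper[axis] = (mid, hi)
    return lower, upper

whole = [(0.0, 1.0), (0.0, 1.0)]
old_zone, new_zone = split_zone(whole, axis=0)
assert old_zone == [(0.0, 0.5), (0.0, 1.0)]
assert new_zone == [(0.5, 1.0), (0.0, 1.0)]
```

After the split, the (key, value) entries whose points fall in the upper half migrate to the new node.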

              CAN: Node Leave, Fail
−   Graceful departure
     − The leaving node hands over its zone to one of its neighbors
−   Failure
     − Detected by the absence of heart beat messages sent
       periodically in regular operation
     − Neighbors initiate takeover timers, proportional to the
       volume of their zones
     − Neighbor with smallest timer takes over zone of dead node
        − notifies other neighbors so they cancel their timers (some
          negotiation between neighbors may occur)
     − Note: the (key, value) entries stored at the failed node
       are lost
        − Nodes that insert (key, value) pairs periodically refresh (or
          re-insert) them
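
The takeover rule can be sketched as below (a simplified simulation: real neighbors run distributed timers and exchange cancellation messages, here we just compute which timer would fire first):

```python
def zone_volume(zone):
    # Product of the side lengths of the zone's intervals
    vol = 1.0
    for lo, hi in zone:
        vol *= hi - lo
    return vol

def takeover_winner(neighbor_zones):
    # Each neighbor's timer is proportional to its own zone volume, so
    # the neighbor with the smallest zone fires first and takes over the
    # dead node's zone, which tends to balance load
    return min(neighbor_zones, key=lambda n: zone_volume(neighbor_zones[n]))

zones = {"a": [(0.0, 0.5), (0.0, 0.5)],   # volume 0.25
         "b": [(0.5, 1.0), (0.0, 1.0)]}   # volume 0.50
assert takeover_winner(zones) == "a"
```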

                  CAN: Discussion
−   Scalable
     − O(log n) steps for operations
     − State information is O(d) at each node
−   Locality
     − Nodes are neighbors in the overlay, not in the physical network
     − Suggestion (for better routing)
        − Each node measures the RTT between itself and its neighbors
        − Forward the request to the neighbor with maximum ratio
          of progress to RTT
−   Maintenance cost
     − Logarithmic
     − But, may still be too much for very dynamic P2P systems

           Unstructured P2P Substrates
−   Objects can be anywhere → Loosely-controlled
−   The loose control
     − Makes overlay tolerate transient behavior of nodes
        − For example, when a peer leaves, nothing needs to be done
          because there is no structure to restore
     − Enables system to support flexible search queries
        − Queries are sent in plain text and every node runs a mini-
          database engine
−   But, we lose on searching
     − Usually using flooding, inefficient
     − Some heuristics exist to enhance performance
     − No guarantee on locating a requested object (e.g., rarely
       requested objects)
−   Ex: Gnutella, Kazaa (super node), GIA [Chawathe et al. 03]
                Example: Gnutella
−   Peers are called servents
−   All peers form an unstructured overlay
−   Peer join
     − Find an active peer already in Gnutella (e.g., contact
       known Gnutella hosts)
     − Send a Ping message through the active peer
     − Peers willing to accept new neighbors reply with Pong
−   Peer leave, fail
     − Just drop out of the network!
−   To search for a file
     − Send a Query message to all neighbors with a TTL (=7)
     − Upon receiving a Query message
        − Check local database and reply with a QueryHit to the requester
        − Decrement TTL and forward to all neighbors if nonzero
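
The search procedure above can be sketched as a bounded flood (a simplified synchronous model: real servents forward asynchronously and suppress duplicates by message ID, here a visited set plays that role):

```python
def flood_query(graph, start, target, ttl=7):
    # Breadth-first flood: each round forwards the query one hop further
    # and decrements the TTL; peers holding the file generate a QueryHit
    hits, seen, frontier = [], {start}, [start]
    while frontier and ttl >= 0:
        nxt = []
        for node in frontier:
            if target in graph[node]["files"]:
                hits.append(node)          # QueryHit routed back to requester
            for nb in graph[node]["neighbors"]:
                if nb not in seen:         # duplicate suppression
                    seen.add(nb)
                    nxt.append(nb)
        frontier, ttl = nxt, ttl - 1
    return hits

g = {"a": {"files": set(),  "neighbors": ["b"]},
     "b": {"files": {"x"},  "neighbors": ["a", "c"]},
     "c": {"files": {"x"},  "neighbors": ["b"]}}
assert flood_query(g, "a", "x", ttl=1) == ["b"]        # c is out of reach
assert sorted(flood_query(g, "a", "x", ttl=2)) == ["b", "c"]
```

The example also shows why flooding gives no guarantee: anything beyond TTL hops is simply invisible to the requester.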
          Flooding in Gnutella

Scalability Problem

    Heuristics for Searching [Yang and Garcia-Molina 02]

−   Iterative deepening
     − Multiple BFS with increasing TTLs
     − Reduce traffic but increase response time
−   Directed BFS
     − Send to “good” neighbors (subset of your neighbors
       that returned many results in the past) → need to keep
       per-neighbor statistics
−   Local Indices
     − Keep a small index over files stored on neighbors (within
       a small number of hops)
     − May answer queries on behalf of them
     − Save cost of sending queries over the network
     − Index currency?
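
The iterative-deepening heuristic can be sketched on top of a bounded flood (a simplified model where the overlay is a dict of neighbor lists and local files; the TTL schedule (1, 3, 7) is illustrative, not from the paper):

```python
def bounded_flood(graph, start, target, ttl):
    # Plain BFS flood limited to `ttl` hops (the basic Gnutella search)
    hits, seen, frontier = [], {start}, [start]
    while frontier and ttl >= 0:
        nxt = []
        for node in frontier:
            if target in graph[node]["files"]:
                hits.append(node)
            for nb in graph[node]["neighbors"]:
                if nb not in seen:
                    seen.add(nb)
                    nxt.append(nb)
        frontier, ttl = nxt, ttl - 1
    return hits

def iterative_deepening(graph, start, target, ttls=(1, 3, 7)):
    # Re-issue the query with growing TTLs, stopping at the first success:
    # cheap when the object is nearby, slower when it is far away
    for ttl in ttls:
        hits = bounded_flood(graph, start, target, ttl)
        if hits:
            return hits
    return []
```

This is exactly the traffic/latency trade-off noted above: each retry repeats earlier hops, but most queries never need the largest TTL.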
Heuristics for Searching: Super Node

−   Used in Kazaa (signaling protocols are encrypted)
−   Studied in [Chawathe 03]
−   Relatively powerful nodes play special role
     − maintain indexes over other peers

   Unstructured Substrates with Super Nodes

  [Figure: two-tier overlay; interconnected Super Nodes (SNs),
   each serving a set of Ordinary Nodes (ONs)]
Example: FastTrack Networks (Kazaa)

−   Most of info/plots in following slides are from
    Understanding Kazaa by Liang et al.
−   The most popular P2P system (~3 million active users on a
    typical day), sharing 5,000 terabytes
−   Kazaa traffic exceeds Web traffic
−   Two-tier architecture (with Super Nodes and
    Ordinary Nodes)
−   SN maintains index on files stored at ONs
    attached to it
     − ON reports to SN the following metadata on each file:
     − File name, file size, ContentHash, file descriptors (artist
       name, album name, …)
        FastTrack Networks (cont’d)

−   Mainly two types of traffic
     − Signaling
        − Handshaking, connection establishment, uploading
          metadata, …
        − Encrypted! (some reverse engineering efforts)
        − Over TCP connections between SN—SN and SN—ON
        − Analyzed in [Liang et al. 04]
     − Content traffic
        − Files exchanged, not encrypted
        − All through HTTP between ON—ON
        − Detailed Analysis in [Gummadi et al. 03]

                  Kazaa (cont’d)

−   File search
     − ON sends a query to its SN
     − SN replies with a list of IPs of ONs that have
       the file
     − SN may forward the query to other SNs
−   Parallel downloads take place between
    supplying ONs and receiving ON
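
The two-tier search can be sketched as below (the index structures are hypothetical; Kazaa's actual signaling protocol is encrypted and only partially reverse-engineered):

```python
def sn_lookup(local_index, neighbor_sn_indexes, filename):
    # A super node first answers from the index built out of its ONs'
    # metadata reports, then forwards the query to neighboring SNs
    ips = list(local_index.get(filename, []))
    for idx in neighbor_sn_indexes:
        ips.extend(idx.get(filename, []))
    return ips      # the receiving ON downloads from these in parallel

my_index = {"song.mp3": ["10.0.0.5"]}
other_sns = [{"song.mp3": ["10.0.0.9"]}, {}]
assert sn_lookup(my_index, other_sns, "song.mp3") == ["10.0.0.5", "10.0.0.9"]
```

The key point of the design is that only small metadata flows through the SNs, while the (large) content flows directly between ONs.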

       FastTrack Networks (cont’d)

−   Measurement study of Liang et al.
    − Hook three machines to Kazaa and wait till one
      of them is promoted to be SN
    − Connect the other two (ONs) to that SN
    − Study several properties
       − Topology structure and dynamics
       − Neighbor selection
       − Super node lifetime
       − ….

Kazaa: Topology Structure [Liang et al. 04]

 ON to SN: 100 - 160 connections → Since there are ~3M
     nodes, we have ~30,000 SNs
 SN to SN: 30 - 50 connections → Each SN connects to
     ~0.1% of the total number of SNs
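
The two estimates follow from simple division; the arithmetic below reproduces them (using the low end of the ON-per-SN range and a midrange SN link count, so the results are approximate by construction):

```python
# Back-of-envelope check of the topology estimates above
total_nodes = 3_000_000
ons_per_sn = 100                       # low end of the 100-160 range
num_sns = total_nodes // ons_per_sn
assert num_sns == 30_000               # the ~30,000 SNs figure

sn_links = 40                          # within the 30-50 range
fraction = sn_links / num_sns
assert 0.001 <= fraction < 0.002       # each SN sees ~0.1% of all SNs
```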
Kazaa: Topology Dynamics [Liang et al. 04]

−   Average ON – SN connection duration
    − Is ~1 hour, after removing very short-lived
      connections (30 sec) used for shopping for SNs
−   Average SN – SN connection duration
    − 23 min, which is short because of
       − Connection shuffling between SNs to allow ONs to
         reach a larger set of objects
       − SNs search for other SNs with smaller loads
       − SNs connect to each other from time to time to
         exchange SN lists (each SN stores 200 other SNs in
         its cache)

Kazaa: Neighbor Selection [Liang et al. 04]

−   When ON first joins, it gets a list of 200 SNs
     − ON considers locality and SN workload in selecting its future SN
−   Locality
     − 40% of ON-SN connections have RTT < 5 msec
     − 60% of ON-SN connections have RTT < 50 msec
          − RTT: E. US ↔ Europe ~100 msec
        Kazaa: Lifetime and Signaling
            Overhead [Liang et al. 04]

−   Super node average lifetime is ~2.5 hours
−   Overhead:
     − 161 Kb/s upstream
     − 191 Kb/s downstream
     − → Most SNs are high-speed (campus network or cable)
Kazaa vs. Firewalls, NAT [Liang et al. 04]
−   Default port WAS 1214
     − Easy for firewalls to filter out Kazaa traffic
−   Now, Kazaa uses dynamic ports
     − Each peer chooses its random port
         − ON reports its port to its SN
          − Ports of SNs are part of the SN refresh list exchanged among SNs
     − Too bad for firewalls!
−   Network Address Translator (NAT)
     − A requesting peer can not establish a direct connection with a
       serving peer behind NAT
     − Solution: connection reversal
         − Send to SN of NATed peer, which already has a connection with it
          − SN tells NATed peer to establish a connection with the requesting peer
         − Transfer occurs happily through the NAT
         − Both peers behind NATs?
        Kazaa: Lessons [Liang et al. 04]
−   Distributed design
−   Exploit heterogeneity
−   Load balancing
−   Locality in neighbor selection
−   Connection Shuffling
     − If a peer searches for a file and does not find it, it may try
       again later and get it!
−   Efficient gossiping algorithms
     − To learn about other SNs and perform shuffling
     − Kazaa uses a “freshness” field in the SN refresh list → a
       peer ignores stale data
−   Consider peers behind NATs and Firewalls
     − They are everywhere!

                   Summary

   P2P is an active research area with many
    potential applications in industry and academia
   In P2P computing paradigm:
     - Peers cooperate to achieve desired functions
   New characteristics
     - heterogeneity, unreliability, rationality, scale, ad hoc
     -  new and lighter-weight algorithms are needed
   Simple model for P2P systems:
     - Peers form an abstract layer called overlay
     - A peer software client may have three components
         • P2P substrate, middleware, and P2P application
         • Borders between components may be blurred
               Summary (cont’d)

   P2P substrate: A key component, which
    - Manages the Overlay
    - Allocates and discovers objects
   P2P Substrates can be
    - Structured (DHT)
       • Example: CAN
    - Unstructured
       • Example 1: Gnutella
       • Example 2: Kazaa

