Gnutella2: A Better Gnutella

 COMP 5204: Data Networks
        Julie Thorpe
 School of Computer Science
     Carleton University
   Gnutella and Gnutella2 are P2P application protocols.
   Gnutella has an interesting history: essentially a
    reverse-engineered beta version.
     Changes are governed by the Gnutella Developers Forum (GDF).
   Gnutella2 is a completely different protocol from
    Gnutella; its developers claim it is what Gnutella should have been.
     The GDF rejects this claim and refuses to call Gnutella2 by its
      name; it instead calls it “Mike's Protocol”.
   It is unclear whether Gnutella2 is better than Gnutella.
     My project goal is to compare these protocols to determine
      which is theoretically better.
Presentation Outline
   Gnutella review
   Gnutella's problems
   Gnutella2 review
   Comparison:
     Network architecture
     Searching algorithms
     Cooperation incentives
     Security

   Concluding remarks
Gnutella Review (1)
   Purely decentralized, simple protocol for file sharing.
   Runs over TCP/IP connections.
   A new node enters the Gnutella network by connecting
    to a known server (sends a Ping); when the server
    responds (sends a Pong), the two are connected.
   It learns of other nodes if the server forwards the Ping to its
    peers (and gets a Pong back in response).
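The join procedure above can be sketched as a small in-memory simulation. This is only an illustration, not the wire protocol: the `Node` class, the `join` helper and the default TTL of 2 are assumptions made for the example, and duplicate suppression is modelled with a shared `seen` set rather than message GUIDs.

```python
class Node:
    """A hypothetical in-memory stand-in for a Gnutella servent."""

    def __init__(self, name):
        self.name = name
        self.peers = []

    def connect(self, other):
        # Connections are bidirectional.
        self.peers.append(other)
        other.peers.append(self)

    def handle_ping(self, ttl, seen):
        """Answer a Ping with a Pong and forward it while TTL allows."""
        if self.name in seen:          # already answered this Ping
            return []
        seen.add(self.name)
        pongs = [self.name]            # our own Pong
        if ttl > 1:
            for peer in self.peers:    # forward the Ping to our peers
                pongs.extend(peer.handle_ping(ttl - 1, seen))
        return pongs


def join(new_node, known_server, ttl=2):
    """Connect to a known server, Ping it, and collect the Pongs."""
    new_node.connect(known_server)
    # The new node never answers its own Ping.
    return known_server.handle_ping(ttl, seen={new_node.name})
```

With a server that already has two peers, `join(Node("new"), server)` returns the server's name followed by the names of those peers, which is how the new node learns of hosts beyond the one it connected to.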
Gnutella Review (2)
   Has no way to advertise
   Peers find files by “flooding”
    with query requests (stop
    after TTL hops).
   Query responses are
    routed back along the same
    path by which the request arrived.
   Addressed by GUIDs.
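A minimal sketch of flooding with a TTL and GUID-based reverse-path routing, under simplifying assumptions: the `Servent` class, its method names and the substring match on filenames are all hypothetical, and messages are plain method calls rather than TCP traffic.

```python
import uuid


class Servent:
    """Hypothetical in-memory servent; not an actual Gnutella implementation."""

    def __init__(self, name, files=()):
        self.name = name
        self.files = sorted(files)
        self.neighbours = []
        self.routes = {}   # query GUID -> neighbour it arrived from (None if ours)
        self.hits = []     # (guid, responder, filename) for our own queries

    def connect(self, other):
        self.neighbours.append(other)
        other.neighbours.append(self)

    def query(self, keyword, ttl):
        """Originate a query: flood it to every neighbour."""
        guid = uuid.uuid4().hex
        self.routes[guid] = None
        for n in self.neighbours:
            n.on_query(guid, keyword, ttl, sender=self)
        return guid

    def on_query(self, guid, keyword, ttl, sender):
        if guid in self.routes:            # duplicate GUID: drop it
            return
        self.routes[guid] = sender         # remember the reverse path
        for f in self.files:
            if keyword in f:
                self.on_hit(guid, self.name, f)
        if ttl > 1:                        # keep flooding while TTL allows
            for n in self.neighbours:
                if n is not sender:
                    n.on_query(guid, keyword, ttl - 1, sender=self)

    def on_hit(self, guid, responder, filename):
        if guid not in self.routes:        # unknown GUID: nowhere to route
            return
        back = self.routes[guid]
        if back is None:                   # we are the originator
            self.hits.append((guid, responder, filename))
        else:                              # pass the response one hop back
            back.on_hit(guid, responder, filename)
```

Note that the response never carries the requester's address: each hop only knows which neighbour the GUID arrived from.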
Gnutella's Problems
   Scalability.
   Performance.
   Lack of cooperation incentives.
   Abuse by servents (Gnutella client/server program).
Gnutella's Problems - Scalability
• For a node to be reached by a query, it (and all other
  nodes on the path to it) must be forwarded the query.
• The reach is determined by n (# connections to other
  hosts) and TTL (# hops each request is permitted to
  take):
      $\sum_{t=1}^{\mathrm{TTL}} n\,(n-1)^{t-1}$

• Assumption: nodes all have the same n and TTL.
• Ritter [1] estimated that for reasonable parameters, to
  achieve a reach of $10^6$ nodes, Gnutella nodes must
  have a bandwidth between 19.2 and 64 Gbps!
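As a quick check of the reach estimate, the sketch below evaluates the sum of n(n-1)^(t-1) for t = 1..TTL, the form used in Ritter's analysis [1], under the same assumption that every node has identical n and TTL. The function name `reach` is chosen for this example.

```python
def reach(n, ttl):
    """Number of nodes a query can reach if every node has n connections.

    The first hop reaches n nodes; each later hop multiplies by (n - 1),
    since a node does not send the query back where it came from.
    """
    return sum(n * (n - 1) ** (t - 1) for t in range(1, ttl + 1))


# reach(4, 2) == 16
# reach(8, 7) == 1098056   (roughly the 10^6 nodes mentioned above)
```

The geometric growth is the point: raising TTL by one multiplies the newly reached nodes (and the messages generated) by n - 1.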
Gnutella's Problems - Performance
   Messages are forwarded
    through many other peers on the
   Connection speeds are
    effectively restricted to the
    bandwidth of the slowest peer
    along the route.
   If A and B have high-speed
    connections, and C has a
    modem connection, the
    download rate between A and B
    is limited to C's speed.
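The bottleneck effect amounts to taking a minimum over the peers on the route. A tiny sketch, with made-up speeds in kbit/s and a hypothetical helper name:

```python
def route_throughput(peer_speeds_kbps, route):
    """Effective rate along a route: the speed of its slowest peer."""
    return min(peer_speeds_kbps[peer] for peer in route)


# Illustrative values only: A and B on fast links, C on a modem.
speeds = {"A": 10_000, "B": 10_000, "C": 56}
```

Here `route_throughput(speeds, ["A", "C", "B"])` is limited to C's 56 kbit/s, even though A and B could exchange data far faster over a direct route.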
Gnutella's Problems – Lack of
Cooperation Incentives
   One study found that over 70% of users shared no
    files, and that 50% of all responses were returned by the
    top 1% of sharing hosts [2].
   Implications of free-riders on Gnutella:
     Increased search horizon (farthest set of hosts reachable
      by a search request, directly related to its TTL).
     The top 1% that is providing most files reaches
      connection saturation.
   This is generally a difficult problem to solve for purely
    decentralized systems.
Gnutella's Problems – Abuse by Servents
   Since Gnutella is only a protocol, servents can implement
    it as they please (in theory).
   Servents can act selfishly to improve their performance
     Increasing TTL to increase search horizon (generating
      geometrically more messages).
     Frequent re-querying (generating more messages, degrading
Gnutella2 Review
   GDF recommended (in version
    0.6) “ultrapeers” to improve
    Gnutella's performance and
     Gnutella2 enforces a variation of
      this to improve performance and
   Decentralized, 2-tier hierarchy
    of peers (“leaf nodes” and “hubs”).
   Other important differences will
    come out in comparison.
Comparison: Gnutella vs. Gnutella2
   Network architecture
   Searching algorithms
   Cooperation incentives
   Security
Gnutella's Network Architecture
   Recall it is completely decentralized.
Gnutella2's Network Architecture
   Decentralized, 2-tier.
   This architecture is
    recommended for Gnutella in v0.6.
   New node enters by connecting
    to a known hub (almost identical
    to Gnutella's handshake).
   Hubs typically accept 300-500
    leaves, and connect to 5-30
    other hubs.
   Leaves typically connect to 3
Comparison – Network Architecture
   Gnutella's purely decentralized architecture is much simpler, but
    not as scalable.
   No real difference between Gnutella's v0.6
    “ultrapeer” structure and Gnutella2's “hub” structure.
     Either of these strategies should reduce searching traffic,
      as explained under searching algorithms.
Gnutella's Searching Algorithm
   Recall that using the purely decentralized version, packets are
    flooded throughout the network.
   If the v0.6 ultrapeer recommendation is implemented, searching
    is optimized using query hash tables (QHTs).
     A QHT is maintained by each node, and describes the
       content it is sharing.
     An ultrapeer maintains an aggregate of its leaves' QHTs and its
       own QHT.
     Searches are performed by forwarding a query to an
       ultrapeer, which checks its aggregate QHT for a match.
           If there is a match, the query is forwarded to the
            appropriate leaf; otherwise the query is forwarded to
            neighbouring ultrapeers by “flooding”.
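The QHT lookup can be sketched roughly as follows. Everything here is an assumption for illustration: the table size (`TABLE_BITS`), the use of SHA-1 to pick a slot, the word splitting, and the `Ultrapeer` class are not taken from the Gnutella 0.6 specification. A QHT is modelled as the set of occupied slots, so it can report false positives but never misses a keyword that was inserted.

```python
import hashlib

TABLE_BITS = 20  # assumed table size: 2**20 slots


def slot(keyword):
    """Map a keyword to a table slot (hash choice is an assumption)."""
    digest = hashlib.sha1(keyword.lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % (1 << TABLE_BITS)


def build_qht(filenames):
    """A QHT here is just the set of slots occupied by the shared keywords."""
    return {slot(w) for name in filenames
            for w in name.lower().replace(".", " ").split()}


def qht_matches(qht, query):
    """True if every query word hashes to an occupied slot."""
    return all(slot(w) in qht for w in query.split())


class Ultrapeer:
    def __init__(self, own_files=()):
        self.own_qht = build_qht(own_files)
        self.leaf_qhts = {}                 # leaf name -> that leaf's QHT

    def add_leaf(self, name, filenames):
        self.leaf_qhts[name] = build_qht(filenames)

    def aggregate(self):
        """Union of our own QHT and every leaf's QHT."""
        agg = set(self.own_qht)
        for qht in self.leaf_qhts.values():
            agg |= qht
        return agg

    def route(self, query):
        """Leaves to forward the query to; an empty list means no leaf
        matched and the query would go to neighbouring ultrapeers."""
        if not qht_matches(self.aggregate(), query):
            return []
        return [leaf for leaf, qht in self.leaf_qhts.items()
                if qht_matches(qht, query)]
```

The benefit for leaves is that they only ever see queries their own table says they might answer.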
Gnutella2's Searching Algorithm
   Ultrapeers are called “hubs”.
   Uses a QHT like Gnutella, but if a hub cannot match a
    query to its aggregate QHT, it checks a set of caches:
     Each hub maintains a cached copy of each neighbouring
      hub's aggregate QHT.
     Upon a search miss, a hub will try to match the query against
      its cached copies of its neighbours' QHTs.
     If the query matches, it will forward the query once, and the
      node that receives the query processes it and sends
      the result directly back to the client.
     If no match is made, the searching client will continue at
      another untried hub.
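A rough sketch of this search walk, with heavy simplifications: plain keyword sets stand in for hashed QHTs, neighbour caches are snapshots taken when two hubs link, and the `Hub` class and `client_search` helper are hypothetical names rather than anything defined by Gnutella2.

```python
class Hub:
    """Hypothetical hub; keyword sets stand in for hashed QHTs."""

    def __init__(self, name):
        self.name = name
        self.leaves = {}            # leaf name -> set of shared keywords
        self.neighbour_cache = {}   # hub name -> (hub, cached aggregate)

    def add_leaf(self, leaf, keywords):
        self.leaves[leaf] = set(keywords)

    def aggregate(self):
        agg = set()
        for keywords in self.leaves.values():
            agg |= keywords
        return agg

    def link(self, other):
        """Exchange snapshots of each other's aggregate table."""
        self.neighbour_cache[other.name] = (other, other.aggregate())
        other.neighbour_cache[self.name] = (self, self.aggregate())

    def local_search(self, keyword):
        return [(self.name, leaf) for leaf, kws in self.leaves.items()
                if keyword in kws]

    def search(self, keyword):
        hits = self.local_search(keyword)
        if hits:
            return hits
        # Miss: consult cached neighbour tables, forward at most once each.
        for hub, cached in self.neighbour_cache.values():
            if keyword in cached:
                # The neighbour answers the client directly.
                hits.extend(hub.local_search(keyword))
        return hits


def client_search(hubs, keyword):
    """Try hubs in turn until one yields results; report which were tried."""
    tried = []
    for hub in hubs:
        tried.append(hub.name)
        results = hub.search(keyword)
        if results:
            return results, tried
    return [], tried
```

Unlike flooding, the walk is driven by the client: a hub forwards a query at most one hop, and an unsuccessful hub costs only one more attempt elsewhere.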
Comparison – Searching Algorithms
   Both Gnutella with Ultrapeers and Gnutella2
    significantly reduce the number of messages.
     Should increase performance and scalability.
   Gnutella2's method further reduces the number of
    messages sent for a query:
     Fewer query request messages, due to caching neighbours'
      QHTs.
     Fewer response messages, since Gnutella2's method allows
      the responding node to contact the requesting node directly,
      rather than sending the response back along the path the
      query took.
Comparison: Cooperation Incentives
   Neither Gnutella nor Gnutella2 specifies cooperation
    incentives.
   Implementations often will not allow connections to the
    network unless you share something, and by default
    make downloads shared.
   The problem of cooperation incentives in a
    decentralized environment is interesting, since nodes
    can avoid connecting to those that have a profile of
    their behaviour.
Gnutella's Security
   Gnutella's query messages are routed through peers,
    and the query does not contain the querying node's IP
    address, but a Globally Unique Identifier (GUID).
     Provides anonymity by masking requester's identity.
   Denial-of-service (DOS) attacks are possible by
    flooding the network with many requests with a fake
   Another node could be similarly DOS'ed if the GUID of
    one of its requests is known.
   Response IP addresses could be spoofed and
    malicious content provided.
Gnutella2's Security
   Gnutella2 does not use GUIDs for queries.
     Sends the response directly back to the requesting node.
   The QHTs do not contain information about the content
    stored on a neighbouring node, providing privacy.
   Queries make use of query keys (to verify the query
    return address is that of the original sender).
     Prevents malicious users from sending out queries for the
      purpose of flooding the network with spoofed requests.
     Search clients are only permitted to query a hub after obtaining
      from it a “query key”, which is unique to each (hub, search client
      return address) pair and must be included in the transmission.
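One way a hub could issue and check such keys without storing them is sketched below. The derivation (an HMAC over the return address with a hub-local secret, truncated to 8 bytes) is purely an assumption for illustration; the slides only say that a key is unique to each (hub, return address) pair. The class and method names are hypothetical.

```python
import hashlib
import hmac
import os


class Hub:
    """Hypothetical hub that issues and verifies query keys."""

    def __init__(self):
        # Per-hub secret, so keys from one hub are useless at another.
        self._secret = os.urandom(16)

    def issue_query_key(self, return_addr):
        """Derive a key bound to (this hub, return address)."""
        host, port = return_addr
        msg = f"{host}:{port}".encode("utf-8")
        return hmac.new(self._secret, msg, hashlib.sha256).digest()[:8]

    def accept_query(self, return_addr, key):
        """Accept a query only if its key matches its return address."""
        expected = self.issue_query_key(return_addr)
        return hmac.compare_digest(key, expected)
```

The protection comes from where the key is delivered: if the hub sends it to the claimed return address, a client that spoofs someone else's address never receives the key and so cannot direct responses at that victim.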
Comparison - Security
   Both Gnutella with ultrapeers and Gnutella2 provide
    privacy through their caching.
   Both Gnutella and Gnutella2 are susceptible to spoofed
    response IP addresses.
   In favour of Gnutella2:
     Gnutella does not provide authentication of nodes for
      querying, thus it is susceptible to request flooding attacks.
     Gnutella ultrapeers cannot block certain hosts (they do not have
      query keys or unique request addresses).
   In favour of Gnutella:
     Gnutella2 does not provide anonymous queries.
Concluding Remarks
   Gnutella has some serious flaws (scalability,
    performance, lack of cooperation incentives and
    servent abuse).
     Gnutella2 solves all but cooperation incentives.
     Gnutella with ultrapeers solves scalability and performance,
      but its searching algorithm and caching are less sophisticated.
   Gnutella2 has many more features outside of this
    comparison, primarily being more extensible (yet
    specific) to support applications other than file sharing.
   Although they are different protocols, Gnutella2 is in
    essence, an improved version of Gnutella with
Open Problems
   Is it possible to create useful cooperation incentives in
    a large, truly distributed environment like Gnutella
    where peers may reconnect to different hubs upon
References
1. Jordan Ritter, “Why Gnutella Can't Scale. No, Really.”
2. Eytan Adar and Bernardo A. Huberman, “Free-Riding on Gnutella”.
3. Farhad Manjoo, “Gnutella Bandwidth Bandits”, Aug. 8, 2002.
4. RFC-Gnutella 0.6.
5. Anurag Singla and Christopher Rohrs, “Ultrapeers: Another Step Towards Gnutella Scalability”, Version 1.0.
6. Gnutella vs. Gnutella2, Part 1.
7. The Gnutella2 Developers Network.
8. “LimeWire: Network Improvements”.
9. P2P Networking Technologies.
