The BitTorrent content distribution system
Nikitas Liogkas
CS219 – P2P Systems
University of California, Los Angeles

Motivation
• flash crowd (aka slashdot) effect
  • many clients, few servers
• Problem: servers cannot handle load
• Solution: swarming
  • clients download pieces of the file from each other
  • has been proven to have good scaling and performance properties

Presentation outline
• Joining the system
• Encoding / metadata file
• Tracker protocol
• Peer wire protocol
• Piece selection
• Peer selection
• Client implementations
• Resources

Joining a torrent
[Diagram: a new leecher, a website, the tracker, and a seed/leecher; arrows numbered 1 to 4, labeled "metadata file", "join", "peer list", "data request"]
Peers divided into:
• seeds: have the entire file
• leechers: still downloading
1. obtain the metadata file (out of band)
2. contact the tracker
3. obtain a peer list (contains seeds & leechers)
4. contact peers from that list for data

Exchanging data
[Diagram: seed and leechers A, B, C; speech bubble "I have !"]
• Verify pieces using hashes
• Download sub-pieces (blocks) in parallel
• Advertise received pieces to the entire peer list
• interested: need pieces that a given peer has

Bencoding
• encoding format of all exchanged messages
• four types
  • byte strings
  • integers
  • lists
  • dictionaries (mapping keys to values)
• examples
  • 4:spam represents the string “spam”
  • i10e represents the integer 10

Metadata file structure
• contains information necessary to contact the tracker and describes the files in the torrent
  • announce URL of tracker
  • file name
  • file length
  • piece length (typically 256KB)
  • SHA-1 hashes of pieces for verification
  • and creation date, comment, creator, …

Tracker protocol
• communicates with clients via HTTP/HTTPS
• client GET request
  • info_hash: uniquely identifies the file
  • peer_id: uniquely identifies the client
  • client IP and port (typically 6881-6889)
  • numwant: how many peers to return (defaults to 50)
  • stats: bytes uploaded, downloaded, left
• tracker GET response
  • interval: how often to contact the tracker
  • list of peers, containing peer id, IP and port
  • stats: complete, incomplete
• tracker-less mode; based on the Kademlia DHT

Peer wire protocol
• implemented on top of TCP
• messages
  • handshake (maybe with bitfield)
  • keep-alive
  • choke / unchoke
  • interested / not interested
  • have (advertisement of a newly-acquired piece)
  • request / piece
  • cancel (only used in “endgame mode”)
  • port (used in tracker-less mode)

Piece selection
• when downloading starts: choose at random
  • get complete pieces as quickly as possible
  • obtain something to trade
• after we have 4 pieces: (local) rarest first
  • achieves the fastest replication of rare pieces
  • obtain something of value to trade
  • get unique pieces from the seed
• endgame mode
  • defense against the “last-block problem”
  • send requests for missing sub-pieces to all peers in our peer list
  • send cancel messages upon receipt of a sub-piece

Last-block problem
• at the end of the download, a peer may have trouble finding the missing pieces
• based on anecdotal evidence
• other proposals
  • network coding [Gkantsidis et al., Infocom’05]
  • prefer to upload to peers with similar file completeness; unfair for the peers with most of the pieces [Tian et al., Infocom’06]

Last-block problem – a myth?
• is it a problem after all?
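
Sketch for the Bencoding slide: a minimal Python encoder and decoder for the four types, reproducing the two examples given there. The function names `bencode` and `bdecode` are illustrative and not taken from any client. Two assumptions go beyond the slide: text strings are encoded as UTF-8, and dictionary keys are emitted in sorted order, which the BitTorrent specification requires but the slide does not mention.

```python
def bencode(value):
    """Encode bytes/str, int, list, or dict into bencoded bytes."""
    if isinstance(value, bool):
        raise TypeError("bencoding has no boolean type")
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")  # assumption: text is sent as UTF-8
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Keys are byte strings, written in sorted order (per the spec).
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def bdecode(data, pos=0):
    """Decode one value starting at data[pos]; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            val, pos = bdecode(data, pos)
            result[key] = val
        return result, pos + 1
    # Otherwise a byte string: <length>:<bytes>
    colon = data.index(b":", pos)
    length = int(data[pos:colon])
    start = colon + 1
    return data[start:start + length], start + length
```

Decoded strings come back as `bytes`, since bencoded strings are byte strings and need not be valid text (the SHA-1 piece hashes in the metadata file, for instance).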
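
Sketch for the Piece selection slide: one way the "random first, then (local) rarest first" policy could look in Python. The function name `pick_piece`, its arguments, and the random tie-break among equally rare pieces are assumptions for illustration; only the threshold of 4 pieces comes from the slide. Endgame mode is not modeled.

```python
import random
from collections import Counter

RANDOM_FIRST_THRESHOLD = 4  # slide: rarest first "after we have 4 pieces"


def pick_piece(have, peer_pieces, rng=random):
    """Return the index of the next piece to request, or None.

    have:        set of piece indices we already hold
    peer_pieces: dict mapping a peer to the set of piece indices
                 it has advertised (hypothetical representation)
    """
    # Count, among the peers we know, how many hold each piece we lack.
    # This is the "local" view of rarity.
    availability = Counter()
    for pieces in peer_pieces.values():
        for index in pieces:
            if index not in have:
                availability[index] += 1
    if not availability:
        return None  # nothing useful is on offer

    candidates = sorted(availability)
    if len(have) < RANDOM_FIRST_THRESHOLD:
        # Start of download: any piece will do, we need something to trade.
        return rng.choice(candidates)

    # Rarest first; ties broken at random (an assumption of this sketch).
    rarest = min(availability.values())
    return rng.choice([i for i in candidates if availability[i] == rarest])
```

For example, with `have = {0, 1, 2, 3}` and two peers advertising `{4, 5}` and `{5}`, piece 4 is held by one peer and piece 5 by two, so piece 4 is selected.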
[Figure for the “Last-block problem – a myth?” slide, from [Legout et al., INRIA-TR-2006], with permission]

Peer selection - unchoking
[Diagram: seed and leechers A, B, C]
• calculate data-receiving rates
• upload to (unchoke) the fastest
• rate calculation is performed periodically (a round occurs typically every 10 seconds)
• constant number of unchoking slots
• attempt to achieve Pareto efficiency

Optimistic unchoking
• periodically select a peer at random and upload to it
• typically performed every 3 rounds (30 seconds)
• multi-purpose mechanism
  • allow bootstrapping of new clients
  • continuously look for the fastest partners
  • keep the network connected; every peer has a non-zero chance of interacting with any other peer

Seed unchoking
• old algorithm
  • unchoke the fastest downloaders
  • problem: fastest peers may monopolize seeds
• new algorithm
  • periodically sort all leechers according to their last unchoke time
  • prefer the most recently unchoked leechers; on a tie, prefer the fastest
  • (presumably) achieves equal spread of seed bandwidth

Downloading only from seeds
[Diagram: tracker, seed, and leechers A, B, C; arrows labeled "new list request" and "peer list"]
• Repeatedly query the tracker for peer lists
• Distinguish the seeds, and receive data from them
• Violates fairness model; may be harmful to honest peers

Evaluation in private torrents
[Plot: download rates for all peers; legend: max, 75%ile, median, 25%ile, min; annotation: 22%]
• Limit bandwidth of leechers 1 to 6, no limit on seed.
• Modest fairness violation (22% better rate) when selfish peer is fast
• Robustness does not suffer: most honest slower by <15%

Evaluation with modified seed
[Plot: download rates for all peers; annotation: 155%]
• Seed only unchokes one leecher at a time
• Considerable fairness violation: selfish peer faster by 155%
• Robustness suffers: honest peers slower by at least 32%

Rate- vs. volume-based selection
• Proponents of rate-based decision metrics: [Cohen, P2PECON’03] and [INRIA TR’2006]
• Proponents of volume-based metrics: [Bharambe et al., MSR-TR-2005], [Gkantsidis et al., Infocom’05], [Jun et al., P2PECON’05], and eDonkey file-sharing system
• No clear winner yet!

Client implementations
• mainline: written in Python; right now, the only one employing the new seed unchoking algorithm
• Azureus: the most popular, written in Java; implements a special protocol between clients (e.g. peers can exchange peer lists)
• other popular clients: ABC, BitComet, BitLord, BitTornado, μTorrent, Opera browser
• various non-standard extensions
  • retaliation mode: detect compromised/malicious peers
  • anti-snubbing: ignore a peer who ignores us
  • super seeding: masqueraded seed

Resources #1
• Basic BitTorrent mechanisms: [Cohen, P2PECON’03]
• BitTorrent specification Wiki: http://wiki.theory.org/BitTorrentSpecification
• Measurement studies: [Izal et al., PAM’04], [Pouwelse et al., Delft TR 2004 and IPTPS’05], [Guo et al., IMC’05], and [Legout et al., INRIA-TR-2006]

Resources #2
• Theoretical analysis and modeling: [Qiu et al., SIGCOMM’04], and [Tian et al., Infocom’06]
• Simulations: [Bharambe et al., MSR-TR-2005]
• Incentives and exploiting them: [Shneidman et al., PINS’04], [Jun et al., P2PECON’05], and [Liogkas et al., IPTPS’06]

Conclusion and food for thought
• BitTorrent is fast and robust
• Yet, many parameters are arbitrarily set
  • number of unchoking slots
  • round duration
  • size of pieces/sub-pieces
• What can we learn from BitTorrent for the design of future P2P content distribution protocols?
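
Sketch for the unchoking slides: the parameters called "arbitrarily set" in the conclusion show up as plain constants in even a minimal leecher-side unchoke round. The Python below is an illustration under stated assumptions, not the mainline client's code. The function name `choose_unchoked`, the slot count of 4, and the choice to reserve one slot for the optimistic unchoke are assumptions; the 10-second round and the 3-round optimistic period come from the slides.

```python
import random

UNCHOKE_SLOTS = 4      # assumed value; the slides only say "constant number"
ROUND_SECONDS = 10     # slide: a round occurs typically every 10 seconds
OPTIMISTIC_EVERY = 3   # slide: optimistic unchoke every 3 rounds (30 seconds)


def choose_unchoked(rates, interested, round_number,
                    current_optimistic=None, rng=random):
    """Pick the peers to upload to for one round.

    rates:      dict peer -> bytes/s we received from that peer last round
    interested: set of peers that want pieces we have
    Returns (regular, optimistic): a list of the fastest peers and one
    randomly chosen peer (or None).
    """
    # Rank interested peers by the rate at which they send data to us.
    ranked = sorted(interested,
                    key=lambda p: (-rates.get(p, 0.0), str(p)))

    # Assumption: one slot is kept for the optimistic unchoke.
    regular = ranked[:UNCHOKE_SLOTS - 1]
    rest = [p for p in ranked if p not in regular]

    # Rotate the optimistic peer every OPTIMISTIC_EVERY rounds, or when the
    # current one is no longer eligible (gone, or promoted to a regular slot).
    optimistic = current_optimistic
    if round_number % OPTIMISTIC_EVERY == 0 or optimistic not in rest:
        optimistic = rng.choice(rest) if rest else None
    return regular, optimistic
```

A caller would invoke this once every `ROUND_SECONDS` with fresh rate measurements. Changing any of the three constants changes who gets bandwidth, which is the design question the conclusion raises.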