
Dynamics of Distributed Hash Tables
Wolf-Tilo Balke and Wolf Siberski
L3S Research Center, University of Hannover
21.11.2007

*Original slides provided by S. Rieche, H. Niedermayer, S. Götz, K. Wehrle (University of Tübingen) and A. Datta, K. Aberer (EPFL Lausanne)
Peer-to-Peer Systems and Applications, Springer LNCS 3485

Overview
1. CHORD (again)
2. New node joins: Stabilization
3. Reliability in Distributed Hash Tables
4. Storage Load Balancing in Distributed Hash Tables
5. The P-GRID Approach

Distributed Hash Tables
• Distributed Hash Tables (DHTs)
  • Also known as structured Peer-to-Peer systems
  • Efficient, scalable, and self-organizing algorithms
  • For data retrieval and management
• Chord (Stoica et al., 2001)
  • Scalable Peer-to-peer Lookup Service for Internet Applications
  • Nodes and data are mapped with a hash function onto a Chord ring
  • Routing: routing ("finger") tables, O(log N)

1. CHORD (again)

CHORD: identifiers
• A consistent hash function assigns each node and each key an m-bit identifier using SHA-1 (Secure Hash Standard)
  • m = any number big enough to make collisions improbable
  • Key identifier = SHA-1(key)
  • Node identifier = SHA-1(IP address)
• Both are uniformly distributed
• Both exist in the same ID space
• Identifiers are arranged on a closed identifier circle (e.g. modulo 2^6) => Chord ring

CHORD: key assignment
• A key k is assigned to the node whose identifier is equal to or follows the key's identifier on the ring
• This node is called successor(k) and is the first node clockwise from k

CHORD: simple lookup
  // ask node n to find the successor of id
  n.find_successor(id)
    if (id ∈ (n, successor])
      return successor;
    else
      // forward the query around the circle
      return successor.find_successor(id);
=> The number of messages is linear in the number of nodes!
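The simple lookup above can be tried out with a minimal Python sketch. The class name `Node`, the helpers `in_half_open` and `build_ring`, and the hop counter are illustrative assumptions, not part of the slides; the node identifiers are an arbitrary example on a ring modulo 2^6.

```python
M = 6            # identifier bits; the example ring is modulo 2**6
SIZE = 2 ** M


def in_half_open(x, a, b):
    """True if x lies in the ring interval (a, b], walking clockwise."""
    x, a, b = x % SIZE, a % SIZE, b % SIZE
    if a < b:
        return a < x <= b
    # The interval wraps past 0; a == b means the whole ring.
    return x > a or x <= b


class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self  # a single node is its own successor

    def find_successor(self, key_id, hops=0):
        """Forward the query around the circle, one successor at a time."""
        if in_half_open(key_id, self.id, self.successor.id):
            return self.successor, hops
        return self.successor.find_successor(key_id, hops + 1)


def build_ring(ids):
    """Hypothetical helper: wire up successor pointers for a static ring."""
    nodes = [Node(i) for i in sorted(ids)]
    for i, node in enumerate(nodes):
        node.successor = nodes[(i + 1) % len(nodes)]
    return nodes


ring = build_ring([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])
owner, hops = ring[1].find_successor(54)   # start the lookup at node 8
print(owner.id, hops)
```

Starting at node 8, the query for key 54 is passed along every node up to node 51 before node 56 is returned, which illustrates the linear message cost.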
CHORD: finger table
• Additional routing information to accelerate lookups
• Each node n contains a routing table with up to m entries (m: number of bits of the identifiers) => finger table
• The i-th entry in the table at node n contains the first node s that succeeds n by at least 2^(i-1)
  • s = successor(n + 2^(i-1)), with i = 1, ..., m
  • s is called the i-th finger of node n
• With zero-based indexing the same table reads: finger[i] := successor(n + 2^i), with i = 0, ..., m-1 (all arithmetic modulo 2^m)

CHORD: lookup with finger tables
• Search in the finger table for the node which most immediately precedes id
• Invoke find_successor from that node
=> Number of messages: O(log N)

CHORD: important characteristics of this scheme
• Each node stores information about only a small number of nodes (m)
• Each node knows more about nodes closely following it than about nodes farther away
• A finger table generally does not contain enough information to directly determine the successor of an arbitrary key k

2. New node joins: Stabilization

Volatility
• Stable CHORD networks can always rely on their finger tables for routing
• But what happens if the network is volatile (churn)?
• Lecture 3: nodes can fail, depart or join
  • Failure is handled by a "soft-state" approach with periodic republishing by nodes
  • Departure can be handled by a clean shutdown (unpublish) or simply as failure
  • Joining needs stabilization of the CHORD ring
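Returning to the finger tables introduced earlier, the following Python sketch shows how the closest preceding finger shortens a lookup. It uses the zero-based definition finger[i] = successor(n + 2^i). The names `between`, `successor_of`, `finger_table` and the global `RING` list are assumptions made for illustration; in a real deployment no node has such a global view, and it is used here only to fill the tables.

```python
M = 6
SIZE = 2 ** M
RING = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]   # sorted example node ids


def between(x, a, b):
    """True if x lies in the open ring interval (a, b), walking clockwise."""
    x, a, b = x % SIZE, a % SIZE, b % SIZE
    if a < b:
        return a < x < b
    return x > a or x < b


def successor_of(ident, ring):
    """First node id at or clockwise after ident (global view, setup only)."""
    ident %= SIZE
    for node_id in ring:
        if node_id >= ident:
            return node_id
    return ring[0]  # wrapped around the circle


def finger_table(n, ring):
    """finger[i] = successor(n + 2**i) for i = 0 .. M-1."""
    return [successor_of(n + 2 ** i, ring) for i in range(M)]


fingers = {n: finger_table(n, RING) for n in RING}


def find_successor(n, key, hops=0):
    succ = fingers[n][0]
    if key % SIZE == succ or between(key, n, succ):   # key in (n, succ]
        return succ, hops
    # Jump to the finger that most immediately precedes the key.
    for f in reversed(fingers[n]):
        if between(f, n, key):
            return find_successor(f, key, hops + 1)
    return find_successor(succ, key, hops + 1)


print(fingers[8])              # finger table of node 8
print(find_successor(8, 54))   # lookup of key 54 starting at node 8
```

On this example ring the lookup for key 54 travels from node 8 via nodes 42 and 51 to node 56, instead of visiting every node in between.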
CHORD Stabilization
• Stabilization protocol for a node x:
  • x.stabilize(): ask successor y for its predecessor p; if p ∈ (x, y) then p is x's new successor
  • x.notify(): notify x's successor of x's existence; the notified node may change its predecessor to x

CHORD Stabilization: example
• N26 joins the system
• N26 acquires N32 as its successor
• N26 notifies N32
• N32 acquires N26 as its predecessor
• N26 copies keys
• N21 runs stabilize() and asks its successor N32 for its predecessor, which is N26
• N21 acquires N26 as its successor
• N21 notifies N26 of its existence
• N26 acquires N21 as its predecessor
[Figure: the three stabilization steps (1), (2), (3)]

3. Reliability in Distributed Hash Tables

"Stabilize" Function
• Stabilize function to correct inconsistent connections
• Remember: periodically done by each node n
  • n asks its successor for its predecessor p
  • n checks if p equals n
  • n also periodically refreshes a random finger x by (re)locating its successor
• Successor list to find a new successor
  • If the successor is not reachable, use the next node in the successor list
  • Start the stabilize function
• But what happens to data in case of node failure?
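The join of N26 between N21 and N32 can be replayed with a small Python sketch of stabilize() and notify(). The `Node` class, the `join` method that receives the successor directly (a real node would find it through a lookup), and the omission of key copying are simplifying assumptions.

```python
def between(x, a, b):
    """True if x lies in the open ring interval (a, b), walking clockwise."""
    if a < b:
        return a < x < b
    return x > a or x < b


class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self
        self.predecessor = None

    def join(self, known_successor):
        """Assumption: the successor was already found via a lookup."""
        self.predecessor = None
        self.successor = known_successor

    def stabilize(self):
        """Ask the successor for its predecessor p and adopt p if it is closer."""
        p = self.successor.predecessor
        if p is not None and between(p.id, self.id, self.successor.id):
            self.successor = p
        self.successor.notify(self)

    def notify(self, candidate):
        """candidate believes it might be our predecessor."""
        if self.predecessor is None or between(
            candidate.id, self.predecessor.id, self.id
        ):
            self.predecessor = candidate


# Existing two-node ring: N21 <-> N32
n21, n32 = Node(21), Node(32)
n21.successor, n32.successor = n32, n21
n21.predecessor, n32.predecessor = n32, n21

n26 = Node(26)
n26.join(n32)      # N26 acquires N32 as its successor
n26.stabilize()    # N26 notifies N32; N32 acquires N26 as predecessor
n21.stabilize()    # N21 learns about N26 and notifies it

print(n21.successor.id, n26.successor.id, n32.predecessor.id, n26.predecessor.id)
```

After the two stabilize() calls the ring pointers read N21 -> N26 -> N32, matching the steps listed above.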
Reliability of Data in Chord
• Original: no reliability of data
• Recommendation
  • Use of the successor list
  • The reliability of data is an application task
  • Replicate inserted data to the next f other nodes
  • Chord informs the application of arriving or failing nodes

Properties
• Advantages
  • After the failure of a node, its successor already has the data stored
• Disadvantages
  • A node stores f intervals: more data load
  • After the breakdown of a node: find a new successor and replicate data to the next node, which means more message overhead at breakdown
  • The stabilize function has to check every successor list to find inconsistent links: more message overhead

Multiple Nodes in One Interval
• Fixed positive number f
  • Indicates how many nodes at least have to act within one interval
• Procedure
  • The first node takes a random position
  • A new node is assigned to any existing node
  • The node is announced to all other nodes in the same interval
• Effects of the algorithm
  • Reliability of data
  • Better load balancing
  • Higher security
[Figure: nodes 1 to 10 grouped into shared intervals]

Reliability of Data: insertion
• Copy of documents
  • Always necessary for replication
• Less additional expense
  • Nodes only have to store pointers to nodes from the same interval
  • Nodes store only data of one interval

Reliability of Data: reliability
• Failure: no copy of data needed
  • Data are already stored within the same interval
• Use the stabilization procedure to correct fingers
  • As in original Chord

Properties
• Advantages
  • Failure: no copy of data needed
  • Rebuild intervals with neighbors only if critical
  • Requests can be answered by f different nodes
• Disadvantages
  • Fewer intervals
    than in original Chord
  • Solution: Virtual Servers

Fault Tolerance: Replication vs. Redundancy
• Replication
  • Each data item is replicated K times
  • K replicas are stored on different nodes
• Redundancy
  • Each data item is split into M fragments
  • K redundant fragments are computed
  • Use of an "erasure code" (see e.g. V. Pless: Introduction to the Theory of Error-Correcting Codes. Wiley-Interscience, 1998)
  • Any M fragments allow the original data to be reconstructed
  • For each fragment we compute its key
  • The M + K different fragments have different keys

4. Storage Load Balancing in Distributed Hash Tables

Storage Load Balancing in Distributed Hash Tables
• Suitable hash function (easy to compute, few collisions)
• Standard assumption 1: uniform key distribution
  • Every node carries an equal load
  • No load balancing is needed
• Standard assumption 2: equal distribution
  • Nodes across the address space
  • Data across nodes
• But is this assumption justifiable?
  • Analysis of the distribution of data using simulation

Storage Load Balancing in Distributed Hash Tables: analysis of the distribution of data
• Optimal distribution of documents across nodes
• Example parameters
  • 4,096 nodes
  • 500,000 documents
  • Optimum: ~122 documents per node
• No optimal distribution in Chord without load balancing

Storage Load Balancing in Distributed Hash Tables: number of nodes not storing any document
• Parameters
  • 4,096 nodes
  • 100,000 to 1,000,000 documents
• Some nodes are without any load
• Why is the load unbalanced?
• We need load balancing to keep the complexity of DHT management low

Definitions
• System with N nodes
• The load is optimally balanced if
  • the load of each node is around 1/N of the total load
• A node is overloaded (heavy) if
  • it has a significantly higher load compared to the optimal distribution of load
• Otherwise the node is light

Load Balancing Algorithms
• Problem
  • Significant difference in the load of nodes
• Several techniques to ensure an equal data distribution
  • Power of Two Choices (Byers et al., 2003)
  • Virtual Servers (Rao et al., 2003)
  • Thermal-Dissipation-based Approach (Rieche et al., 2004)
  • A Simple Address-Space and Item Balancing (Karger et al., 2004)
  • …

Outline
• Algorithms
  • Power of Two Choices (Byers et al., 2003)
  • Virtual Servers (Rao et al., 2003)
John Byers, Jeffrey Considine, and Michael Mitzenmacher: "Simple Load Balancing for Distributed Hash Tables" in Second International Workshop on Peer-to-Peer Systems (IPTPS), Berkeley, CA, USA, 2003.
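The storage-load analysis and the heavy/light definitions above can be explored with a rough Python simulation. It is a sketch under stated assumptions, not the simulation behind the slides: node and document names such as `node-0` and `doc-0` are made up, the `heavy_factor` threshold for "significantly higher load" is a hypothetical choice, and colliding node identifiers are simply merged.

```python
import hashlib
from bisect import bisect_left


def ident(name, m):
    """m-bit identifier: SHA-1 of the name, reduced modulo 2**m."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def simulate(n_nodes, n_docs, m=22):
    """Return the number of documents stored per node on a Chord-like ring."""
    # A set merges nodes whose identifiers collide in the m-bit space.
    ring = sorted({ident(f"node-{i}", m) for i in range(n_nodes)})
    loads = [0] * len(ring)
    for j in range(n_docs):
        # successor(k): first node id equal to or following the document id
        pos = bisect_left(ring, ident(f"doc-{j}", m))
        loads[pos % len(ring)] += 1   # wrap around past the largest id
    return loads


def summarize(loads, heavy_factor=2.0):
    """heavy_factor is an assumed threshold, not a value from the slides."""
    mean = sum(loads) / len(loads)
    return {
        "nodes": len(loads),
        "mean": mean,
        "max": max(loads),
        "empty": sum(1 for load in loads if load == 0),
        "heavy": sum(1 for load in loads if load > heavy_factor * mean),
    }


print(summarize(simulate(4096, 100_000)))
```

The load of a node is proportional to the length of the ring segment between it and its predecessor, and these segment lengths vary widely when node identifiers are hashed, so the per-node counts typically spread far around the mean.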
Power of Two Choices
• Idea
  • One hash function for all nodes: h0
  • Multiple hash functions for data: h1, h2, h3, …, hd
• Two options
  • Data is stored at one node
  • Data is stored at one node and the other nodes store a pointer

Power of Two Choices: inserting data
• The results of all hash functions are calculated: h1(x), h2(x), h3(x), …, hd(x)
• The data is stored on the retrieved node with the lowest load
• Alternative
  • The other nodes store a pointer
• The owner of a data item has to insert the document periodically
  • Prevents removal of the data after a timeout (soft state)

Power of Two Choices (cont'd): retrieving
• Without pointers
  • The results of all hash functions are calculated
  • Request all of the possible nodes in parallel
  • One node will answer
• With pointers
  • Request only one of the possible nodes
  • The node can forward the request directly to the final node

Power of Two Choices (cont'd)
• Advantages
  • Simple
• Disadvantages
  • Message overhead when inserting data
  • With pointers
    • Additional administration of pointers
    • More load
  • Without pointers
    • Message overhead at every search

Outline
• Algorithms
  • Power of Two Choices (Byers et al., 2003)
  • Virtual Servers (Rao et al., 2003)
Ananth Rao, Karthik Lakshminarayanan, Sonesh Surana, Richard Karp, and Ion Stoica: "Load Balancing in Structured P2P Systems" in Second International Workshop on Peer-to-Peer Systems (IPTPS), Berkeley, CA, USA, 2003.

Virtual Server
• Each node is responsible for several intervals
  • "Virtual server"
• Example: Chord
[Figure: Chord ring on which nodes A, B and C each own several intervals; Rao 2003]

Rules
• Rules for transferring a virtual server
  • From a heavy node to a light node
1. The transfer of a virtual server must not make the receiving node heavy
2. The virtual server is the lightest virtual server that makes the heavy node light
3. If there is no virtual server whose transfer can make a node light, the heaviest virtual server from this node is transferred

Virtual Server
• Each node is responsible for several intervals
  • log(n) virtual servers
• Load balancing
  • Different possibilities to change servers
    • One-to-one
    • One-to-many
    • Many-to-many
  • Copying an interval is like removing and inserting a node in a DHT

Scheme 1: One-to-One
• A light node picks a random ID
• It contacts the node x responsible for it
• It accepts load if x is heavy
[Figure: ring of heavy (H) and light (L) nodes; Rao 2003]

Scheme 2: One-to-Many
• Light nodes report their load information to directories
• A heavy node H gets this information by contacting a directory
• H contacts the light node which can accept the excess load
[Figure: light nodes L1 to L5, directories D1 and D2, heavy nodes H1 to H3; Rao 2003]

Scheme 3: Many-to-Many
• Many heavy and light nodes rendezvous at each step
• Directories periodically compute the transfer schedule and report it back to the nodes, which then do the actual transfer
[Figure: light nodes L1 to L5, directories D1 and D2, heavy nodes H1 to H3; Rao 2003]

Virtual Server
• Advantages
  • Easy shifting of load
    • Whole virtual servers are shifted
• Disadvantages
  • Increased administrative and message overhead
    • Maintenance of all finger tables
  • Much load is shifted
[Rao 2003]

Simulation
• Scenario
  • 4,096 nodes (comparison with other measurements)
  • 100,000 to 1,000,000 documents
• Chord
  • m = 22 bits.
  • Consequently, 2^22 = 4,194,304 identifiers for nodes and documents
• Hash function
  • SHA-1 (mod 2^m)
  • random
• Analysis
  • Up to 25 runs per test

Results
• Without load balancing
  + Simple
  + Original
  - Bad load balancing
• Power of Two Choices
  + Simple
  + Lower load
  - Nodes without load

Results (cont'd)
• Virtual server
  + No nodes without load
  - Higher maximum load than Power of Two Choices

5. The P-GRID Approach

Toy example: distributing skewed load
[Figure: peers 1 to 8 over the key space from 0 to 1, with the load distribution]

A globally coordinated recursive bisection approach
• The key space can be divided into two partitions
  • Assign peers proportional to the load in the two sub-partitions
• Recursively repeat the process to repartition the sub-partitions
• Partitioning of the key space such that there is equal load in each partition
  • Uniform replication of the partitions
  • Important for fault tolerance
• Note: a novel and general load-balancing problem

Lessons from the globally coordinated algorithm
• Achieves an approximate load balance
• The intermediate partitions may be such that they cannot be perfectly repartitioned
  • There is a fundamental limitation with any bisection-based approach, as well as for any fixed key-space partitioned overlay network
• Nonetheless practical
  • For realistic load skews and peer populations

Distributed proportional partitioning for overlay construction
• Random interaction
  • A mechanism to meet other random peers
  • A parameter p for partitioning the space
• Proportional partitioning: peers partition proportional to the load distribution
  • In a ratio p : 1-p
  • Let us call the sub-partitions 0 and 1
• Referential integrity: obtain a reference to the other partition
  • Needed to enable overlay routing
• Sorting the load/keys: peers exchange locally stored keys in order to store only keys for their own partition
[Figure: peers 1 and 3 hold keys 000, 010, 100 and 101, 001 before the exchange; afterwards peer 1 takes partition 0 with keys 000, 010, 001 and a routing entry 1: 3, and peer 3 takes partition 1 with keys 101, 100 and a routing entry 0: 1. Legend: pid, routing table, keys (only part of the prefix is shown)]

P-Grid
• Scalable distributed search tries (prefix tree)
[Figure: binary trie index with leaves assigned to peers 1 to 4]
• Peer 1 stores all data with prefixes 000 or 001
• A single peer should not hold the entire index
  • Distribute the index disjointly over the peers
• Scalable distribution of the index
• Routing information at each peer is only logarithmic (height of trie)
• Prefix routing

P-Grid: two basic approaches for new nodes to join
• Splitting approach (P-Grid)
  • Peers meet (randomly) and decide whether to extend the search tree by splitting the data space
  • Peers can perform load balancing considering their storage load
  • Networks with different origins can merge, like Gnutella, FreeNet (loose coupling)
• Node insertion approach (Plaxton, OceanStore, …)
  • Peers determine their "leaf position" based on their IP address
  • Nodes route from a gateway node to their node-id to populate the routing table
  • The network has to start from a single origin (strong coupling)

P-Grid: load balance effect
• The algorithm converges quickly
  • Peers have similar load
• E.g. leaf load in case of 2^5 = 32 possible prefixes of length 5
[Figure: leaf load per prefix]

P-Grid
• Replication of data items and routing table entries is used to increase failure resilience
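The prefix routing described for P-Grid can be illustrated with a small Python sketch over four peers. The peer names, their paths and the routing references in `PEERS` are made-up example data, chosen so that peer 1 holds path 00 (prefixes 000 and 001). The rule used here, one reference per level of a peer's path pointing to a peer on the other side of the trie at that level, is a simplified reading of the scheme rather than the full P-Grid protocol.

```python
# Each peer knows its own binary path and, for every level of that path,
# one peer whose path shares the earlier bits but differs at that level.
PEERS = {
    "peer1": {"path": "00", "refs": {0: "peer3", 1: "peer2"}},
    "peer2": {"path": "01", "refs": {0: "peer4", 1: "peer1"}},
    "peer3": {"path": "10", "refs": {0: "peer1", 1: "peer4"}},
    "peer4": {"path": "11", "refs": {0: "peer2", 1: "peer3"}},
}


def route(start, key):
    """Forward a query for a binary key until a peer's path is a prefix of it.

    Assumes the key has at least as many bits as every peer path.
    """
    peer, visited = start, [start]
    while True:
        path = PEERS[peer]["path"]
        # First level at which the key leaves this peer's part of the trie.
        level = next(
            (i for i, bit in enumerate(path) if key[i] != bit), None
        )
        if level is None:
            return peer, visited          # this peer is responsible
        peer = PEERS[peer]["refs"][level]
        visited.append(peer)


print(route("peer1", "110"))
print(route("peer2", "001"))
```

Each hop fixes at least one more bit of the key, so the number of hops is bounded by the length of the peer paths, which is the height of the trie.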
