"Announcements EECS Introduction to Computer Networks Multicast and Overlay"
EECS 122: Introduction to Computer Networks
Multicast and Overlay Networks

Ion Stoica (and Brighten Godfrey)
TAs: Lucian Popa, David Zats and Ganesh Ananthanarayanan
http://inst.eecs.berkeley.edu/~ee122/
Materials with thanks to Vern Paxson, Jennifer Rexford, and colleagues at Princeton and UC Berkeley
EE122 F08

Announcements
- No class Wednesday. Happy Thanksgiving!
- Project 2a code posted; use at will for part 2b
  - but it is better, and probably easier, to use your own code

Motivational Example: Streaming Media
- Live 8 concert
  - Send ~300 Kb/s video streams
  - Peak usage > 100,000 simultaneous users
  - Consumes > 30 Gb/s (100,000 streams × 300 Kb/s)
- If 1000 people are in Berkeley, and if the concert is broadcast from a single location, 1000 unicast streams are sent from that location to Berkeley

This approach does not scale…
[Figure: Broadcast Center, Backbone ISP; one unicast stream per receiver]

Instead build trees
- Copy data at routers
- At most one copy of a data packet per link
- Routers keep track of groups in real time
- Routers compute trees and forward packets along them
- LANs implement link-layer multicast by broadcasting
[Figure: Broadcast Center, Backbone ISP; data copied at routers along a tree]

Multicast Routing Approaches
- Kinds of trees
  - Source Specific Trees
  - Shared Tree
- Tree computation methods
  - Link state
  - Distance vector

Source Specific Trees
- Each source is the root of its own tree
- One tree per source
- Tree can consist of shortest paths to each receiver
- Very good performance but expensive to construct/maintain; routers need to manage a tree per source
[Figure: example topology with nodes 1-13; legend: members of the multicast tree, sender]
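A source-specific tree built from shortest paths can be sketched as follows. This is a minimal illustration, not the method of any particular protocol: it assumes a connected graph with non-negative link weights, and the function name `shortest_path_tree` and the small topology are hypothetical.

```python
import heapq

def shortest_path_tree(graph, source, receivers):
    """Return the set of (parent, child) links on the source-rooted tree.

    graph: dict mapping node -> {neighbor: link weight}.
    Assumes every receiver is reachable from the source.
    """
    dist = {source: 0}
    parent = {}
    heap = [(0, source)]
    # Dijkstra's algorithm: shortest path from the source to every node.
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    # Keep only the links that lie on a shortest path to some receiver.
    edges = set()
    for r in receivers:
        node = r
        while node != source:
            edges.add((parent[node], node))
            node = parent[node]
    return edges

# Hypothetical topology: S is the sender, R1 and R2 are receivers.
graph = {
    "S": {"A": 1, "B": 4},
    "A": {"S": 1, "B": 1, "R1": 5},
    "B": {"S": 4, "A": 1, "R2": 1},
    "R1": {"A": 5},
    "R2": {"B": 1},
}
tree = shortest_path_tree(graph, "S", ["R1", "R2"])
print(sorted(tree))
```

Both receivers share the link S-A, so a packet crosses it once and is copied at A. Each additional source would need its own run and its own tree state, which is the cost the slide points out.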
Shared Tree
- One tree used by all members in a group
- Less state to construct, but hard to pick "good" trees for everyone!
- Ideally, find a Steiner tree: the minimum-weighted tree connecting only the multicast members
  - Finding a Steiner tree is NP-hard
  - Heuristics are known
- Example heuristic: find a minimum spanning tree, the minimum-weighted tree connecting all nodes in the network
  - Finding a minimum spanning tree is much easier. How?
  - Prune back to get the multicast tree
[Figure: example topology with nodes 1-13 and weighted links, showing the shared tree]

Multicast Service Model
- Unicast: packets delivered to one destination
- Broadcast: packets delivered to all end-hosts
- Multicast: packets delivered to multiple destinations (those that have joined the multicast group)
[Figure: sender S, network, receivers R0, R1, …, Rn-1; labels: join G, [G, data]]

Multicast Service Model (cont'd)
- Membership access control
  - Open group: anyone can join
  - Closed group: restrictions on joining
- Sender access control
  - Anyone can send to group
  - Anyone in group can send to group
  - Restrictions on which host can send to group
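Returning to the shared-tree heuristic (build a minimum spanning tree, then prune back), a minimal sketch is given below. The slides do not name an MST algorithm; Prim's algorithm is used here as one common choice, and the function names, node names and weights are hypothetical. The result is generally not a Steiner tree, only an approximation of one.

```python
import heapq

def minimum_spanning_tree(graph, start):
    """Prim's algorithm. graph: node -> {neighbor: weight}, undirected and connected.

    Returns the MST as adjacency sets: node -> set of tree neighbors.
    """
    tree = {node: set() for node in graph}
    visited = {start}
    heap = [(w, start, v) for v, w in graph[start].items()]
    heapq.heapify(heap)
    while heap:
        w, u, v = heapq.heappop(heap)
        if v in visited:
            continue
        visited.add(v)
        tree[u].add(v)
        tree[v].add(u)
        for nxt, w2 in graph[v].items():
            if nxt not in visited:
                heapq.heappush(heap, (w2, v, nxt))
    return tree

def prune_to_members(tree, members):
    """Repeatedly remove leaves that are not group members."""
    tree = {n: set(nbrs) for n, nbrs in tree.items()}  # work on a copy
    leaves = [n for n, nbrs in tree.items() if len(nbrs) <= 1 and n not in members]
    while leaves:
        leaf = leaves.pop()
        if leaf not in tree:
            continue  # already removed
        for nbr in tree.pop(leaf):
            tree[nbr].discard(leaf)
            if len(tree[nbr]) <= 1 and nbr not in members:
                leaves.append(nbr)
    return tree

# Hypothetical weighted topology; group members are A and C.
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "E": 3},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"C": 1},
    "E": {"B": 3},
}
mst = minimum_spanning_tree(graph, "A")
multicast_tree = prune_to_members(mst, {"A", "C"})
print(multicast_tree)
```

Router B stays on the tree even though it is not a member, because it connects A and C; D and E are pruned because no member lies behind them.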
Multicast Service Model (cont'd)
- Receivers join a multicast group, which is identified by a multicast address (e.g., G)
- Sender(s) send data to address G
- Network routes data to each of the receivers

Multicast and Layering
- Multicast can be implemented at different layers
  - data link layer
    - e.g., Ethernet multicast
  - network layer
    - e.g., IP multicast
  - application layer
    - e.g., End system multicast
- Which layer is best?

Multicast Implementation Issues
- How are multicast packets addressed?
- How is join implemented?
- How is send implemented?
- How much state is kept and who keeps it?

Data Link Layer Multicast
- Recall: end-hosts in the same local area network (LAN) can hear from each other at the data link layer (e.g., Ethernet)
- Reserve some data link layer addresses for multicast
- Join group at multicast address G
  - Network interface card (NIC) normally only listens for packets sent to unicast address A and broadcast address B
  - To join group G, NIC also listens for packets sent to multicast address G (NIC limits number of groups joined)
  - Implemented in hardware, thus efficient
- Send to group G
  - Packet is flooded on all LAN segments, like broadcast
  - Can waste bandwidth, but LANs should not be very large
- Only host NICs keep state about who has joined → scalable to large number of receivers, groups

Problems with Data Link Layer Multicast
- Single data link technology
- Single LAN
  - Limited to small number of hosts
  - Limited to low diameter latency
  - Essentially all the limitations of LANs compared to internetworks

Network Layer (IP) Multicast
- Overcomes limitations of data link layer multicast
- Performs inter-network multicast routing
  - Relies on data link layer multicast for intra-network routing

IP Multicast Routing
- Intra-domain
  - Distance-vector multicast
  - Link-state multicast
IP Multicast Routing (cont'd)
- Inter-domain
  - Protocol Independent Multicast
  - Single Source Multicast

Network Layer (IP) Multicast (cont'd)
- Portion of IP address space defined as multicast addresses
  - 2^28 addresses for entire Internet
- Open group membership
- Anyone can send to group
  - Flexible, but leads to problems

Distance Vector Multicast Routing Protocol (DVMRP)
- An elegant extension to DV routing
- Use shortest path DV routes to determine if link is on the source-rooted spanning tree
- Three steps in developing DVMRP
  - Reverse Path Flooding
  - Reverse Path Broadcasting
  - Truncated Reverse Path Broadcasting

Reverse Path Flooding (RPF)
- Extension to DV unicast routing
- Packet forwarding
  - If incoming link is shortest path to source
  - Send on all links except incoming
  - Packets always take shortest path
    - assuming delay is symmetric
- Issues
  - Some links (LANs) may receive multiple copies
  - Every link receives each multicast packet, even if no interested hosts
[Figure: DV shortest paths (distances s:1, s:2, s:3) and RPF data flow from source s to receiver r]

Example
- Flooding can cause a given packet to be sent multiple times over the same link
- Solution: Reverse Path Broadcasting
[Figure: source S, routers x, y, z and links a, b; a duplicate packet is delivered on link a]

Reverse Path Broadcasting (RPB)
- Choose parent of each link along reverse shortest path to source
- Only parent forwards to a link (child link)
- Identify child links
- Routing updates identify parent
- Since distances are known, each router can easily figure out if it's the parent for a given link
- In case of tie, lower address wins
[Figure: source S, routers x (distance 5), y (distance 6), z and links a, b; a is a child link of x for S; forward only to child link]

Don't Really Want to Flood!
- This is still a broadcast algorithm: the traffic goes everywhere
- Need to "prune" the tree when there are subtrees with no group members
- Solution: Truncated Reverse Path Broadcasting

Truncated Reverse Path Broadcasting (TRPB)
- Extend DV/RPB to eliminate unneeded forwarding
- Identify leaves
  - Routers announce that a link is their next link to source S
  - Parent router can determine that it is not a leaf
- Explicit group joining on LAN
  - Members periodically (with random offset) multicast report locally
  - Hear a report, then suppress own
- Packet forwarding
  - If not a leaf router or have members
  - Out all links except incoming
[Figure: tree rooted at S with receivers r1, r2; L = leaf node, NL = non-leaf node]

Pruning Details
- Prune (Source, Group) at leaf if no members
  - Send Non-Membership Report (NMR) up tree
- If all children of router R send NMR, prune (S,G)
  - Propagate prune for (S,G) to parent of R
- On timeout:
  - Prune dropped
  - Flow is reinstated
  - Downstream routers re-prune
- Note: a soft-state approach
- How to pick prune timers?
  - Too long → large join time
  - Too short → high control overhead
- What do you do when a member of a group (re)joins?
  - Issue prune-cancellation message (graft)

Distance Vector Multicast Scaling
- State requirements:
  - O(Sources × Groups) active state
- How to get better scaling?
  - Hierarchical Multicast
  - Core-based Trees

Core Based Trees (CBT)
- Pick a "rendezvous point" for the group called the core
  - One tree per group (same tree for all senders in group)
- Unicast packet to core and bounce it back to multicast group
- Tree construction is receiver-based
  - Joins can be tunneled if required
  - Only nodes on tree involved
- Reduce routing table state from O(S × G) to O(G)

Example
- Group members: M1, M2, M3
- M1 sends data
[Figure: tree rooted at the core ("root") connecting M1, M2, M3; legend: control (join) messages, data]

CBT vs. DV Multicast
- DV Multicast: One tree for each source
  - More state in routers
  - Better optimized trees
- CBT: Trees shared among all sources in a group
  - Less state in routers
  - Shared tree may not be the best tree for any particular source
  - Need to pick a good core node
    - Sub-optimal setup delay
    - Optimal choice (computing topological center) is NP-hard
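For comparison with CBT, the per-packet forwarding decision in DV multicast (the reverse path flooding rule and its truncated refinement) can be sketched as below. This is a simplified illustration under stated assumptions, not DVMRP itself: the function and parameter names are hypothetical, link names are plain strings, and the truncation is applied per link (skip a leaf link with no members), which is one reading of the TRPB slide.

```python
def rpf_output_links(links, link_to_source, arrival_link):
    """Reverse Path Flooding.

    links: all links attached to this router.
    link_to_source: the link this router's DV table uses to reach the source.
    arrival_link: the link the multicast packet arrived on.
    """
    if arrival_link != link_to_source:
        return []  # not on the reverse shortest path: drop the packet
    return [l for l in links if l != arrival_link]

def trpb_output_links(links, link_to_source, arrival_link, leaf_links, member_links):
    """Truncated variant: additionally skip leaf links with no group members.

    leaf_links: links with no downstream router that uses them to reach the source.
    member_links: links on which a local membership report was heard.
    """
    out = rpf_output_links(links, link_to_source, arrival_link)
    return [l for l in out if l not in leaf_links or l in member_links]

links = ["a", "b", "c"]
print(rpf_output_links(links, "a", "a"))   # arrived on reverse path: flood to b and c
print(rpf_output_links(links, "a", "b"))   # arrived elsewhere: dropped
print(trpb_output_links(links, "a", "a", leaf_links={"b", "c"}, member_links={"c"}))
```

The check needs only the unicast routing table, but the router still keeps this state per (source, group) pair once pruning is added, which is the O(S × G) cost that CBT reduces to O(G).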
Problems with Network Layer Multicast (NLM)
- Scales poorly with number of groups
  - A router must maintain state for every group that traverses it
  - Many groups traverse core routers
- Supporting higher level functionality is difficult
  - NLM: best-effort multi-point delivery service
  - Reliability and congestion control for NLM complicated
- Deployment is difficult and slow
  - ISPs reluctant to turn on NLM

NLM Reliability
- Assume reliability through retransmission
- Sender cannot keep state about each receiver
  - E.g., what receivers have received
  - Number of receivers unknown and possibly very large
- Sender cannot retransmit every lost packet
  - Even if only one receiver misses packet, sender must retransmit, lowering throughput
- (N)ACK implosion
  - Described next

(N)ACK Implosion
- (Positive) acknowledgements
  - Ack every n received packets
  - What happens for multicast?
- Negative acknowledgements
  - Only ack when data is lost
  - Assume packet 2 is lost
[Figure: sender S sends packets 1, 2, 3 to receivers R1, R2, R3; packet 2 is lost]

NACK Implosion
- When a packet is lost, all receivers in the sub-tree rooted at the link where the packet is lost send NACKs
[Figure: R1, R2, R3 have received packet 3 and each send a NACK ("2?") to S]

Barriers to Multicast
- Hard to change IP
  - Multicast means changes to IP
  - Details of multicast were very hard to get right
- Not always consistent with ISP economic model
  - Charging done at edge, but single packet from edge can explode into millions of packets within network
- Troublesome security model
  - Anyone can send to a group
  - Denial-of-service attacks on known groups

Overlay Networks

Overlay Networks: Motivations
- Protocol changes in the network happen very slowly
- Why?
  - The Internet is a shared infrastructure; need to achieve consensus (IETF)
  - Many proposals require changing a large number of routers (e.g., IP Multicast, QoS); otherwise end-users won't benefit

Motivations (cont'd)
- One size does not fit all
- Applications need different levels of
  - Reliability
  - Performance (latency)
  - Security
  - Access control (e.g., who is allowed to join a multicast group)
  - …
- Proposed changes that haven't happened yet on large scale:
  - More Addresses (IPv6 '91)
  - Security (IPSEC '93); Multicast (IP multicast '90)

Goals
- Make it easy to deploy new functionalities in the network → accelerate the pace of innovation
- Allow users to customize their service

Solution
- Deploy processing in the network
- Have packets processed as they traverse the network
[Figure: AS-1, IP, and an overlay network (over IP)]

Overlay network overview
- Application Layer (Overlay) multicast
- Resilient Overlay Network (RON)
- Next lecture: Peer-to-peer systems

Application Layer Multicast (ALM)
- Let the hosts do all the "special" work
  - Only require unicast from infrastructure
- Basic idea:
  - Hosts do the copying of packets
  - Set up tree between hosts
- Example: Narada [Yang-hua et al, 2000]
  - Small group sizes: <= hundreds of nodes
  - Typical application: streaming video
- (What do you use that's a lot like overlay multicast?)

Narada: End System Multicast
[Figure: hosts Stan1, Stan2 (Stanford), Berk1, Berk2 (Berkeley), Gatech and CMU on the physical network, and the "overlay" tree connecting them]

Algorithmic Challenge
- Choosing replication/forwarding points among hosts
  - how do the hosts know about each other
  - and know which hosts should forward to other hosts

Advantages of ALM
- No need for changes to IP or routers
- No need for ISP cooperation
- End hosts can prevent other hosts from sending
- Easy to implement reliability
  - use hop-by-hop retransmissions

Performance Concerns
- Stretch
  - ratio of latency in the overlay to latency in the underlying network
- Stress
  - number of duplicate packets sent over the same physical link
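The two metrics just defined, stretch and stress, can be computed as in the sketch below. All host names, link names and latencies are hypothetical values chosen for illustration; they are not measurements from Narada or from the figures in these slides.

```python
from collections import Counter

def stretch(overlay_latency, unicast_latency):
    """Ratio of latency along the overlay path to latency of the direct IP path."""
    return overlay_latency / unicast_latency

def stress(overlay_edges, physical_path):
    """Count how many copies of one packet cross each physical link.

    overlay_edges: list of (host, host) pairs forming the overlay tree.
    physical_path: maps each overlay edge to the physical links it traverses.
    """
    counts = Counter()
    for edge in overlay_edges:
        for link in physical_path[edge]:
            counts[link] += 1
    return counts

# Hypothetical overlay tree: CMU sends to Stan1, which forwards to Stan2 and Berk1.
overlay_edges = [("CMU", "Stan1"), ("Stan1", "Stan2"), ("Stan1", "Berk1")]
physical_path = {
    ("CMU", "Stan1"): ["CMU-R1", "R1-R2", "R2-Stan1"],
    ("Stan1", "Stan2"): ["R2-Stan1", "R2-Stan2"],
    ("Stan1", "Berk1"): ["R2-Stan1", "R2-R3", "R3-Berk1"],
}
link_stress = stress(overlay_edges, physical_path)
print(link_stress.most_common(1))

# Hypothetical latencies in ms: CMU -> Stan1 -> Berk1 over the overlay vs. direct unicast.
print(stretch(overlay_latency=30 + 10, unicast_latency=25))
```

In this example Stan1's access link carries the packet three times (once inbound, twice outbound), which is the kind of duplication that router-level copying avoids.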
Performance Concerns (cont'd)
- Network Layer Multicast can get perfect or near-perfect stretch; ALM can't in general
  - but it can come pretty close
[Figure: Narada overlay tree over the physical topology (Gatech, Stanford, Stan1, Stan2, CMU, Berkeley, Berk1, Berk2); annotations: "Delay from CMU to Berk1 increases" and "Duplicate packets: bandwidth wastage"]

Overlay network overview
- Overlay Multicast (Narada)
- Resilient Overlay Network (RON) [current topic]
- Next lecture: Peer-to-peer systems

Resilient Overlay Network (RON)
- Premise: by building an application overlay network, can increase performance and reliability of routing
- Install N computers at different Internet locations
- Each computer acts as an overlay network router
  - Between each overlay router is an IP tunnel (logical link)
  - Logical overlay topology is all-to-all (N^2 total links)
- Run a link-state routing algorithm in that overlay topology
  - Computers actively measure each logical link in real time for packet loss rate, latency, throughput, etc.
  - These define link costs
  - Route overlay network traffic based on measured characteristics

Example
- Default IP path determined by BGP & OSPF
- Reroute traffic using the red alternative overlay network path, avoid congestion point
[Figure: MIT, Berkeley, UCLA; the intermediate node acts as overlay router]

Summary: What You Need to Know
- Multicast protocols
  - DVMRP
  - CBT
  - How they compare
- Overlay networks
  - Benefits and drawbacks
  - More to come: P2P
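As a closing illustration of the RON idea (run link-state routing over an all-to-all overlay whose link costs are measured), a minimal sketch follows. The node names come from the RON example slide, but the latency values and the function name `best_overlay_path` are hypothetical, and a real system would also weigh loss rate and throughput rather than latency alone.

```python
import heapq

def best_overlay_path(cost, src, dst):
    """Dijkstra over the overlay's logical links.

    cost: node -> {node: measured cost of the logical link (here latency in ms)}.
    Returns (path as a list of nodes, total cost). Assumes dst is reachable.
    """
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist[u]:
            continue  # stale heap entry
        for v, c in cost[u].items():
            nd = d + c
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1], dist[dst]

# Hypothetical measurements: the direct MIT-Berkeley IP path is congested.
measured_latency_ms = {
    "MIT": {"Berkeley": 200.0, "UCLA": 40.0},
    "UCLA": {"MIT": 40.0, "Berkeley": 10.0},
    "Berkeley": {"MIT": 200.0, "UCLA": 10.0},
}
path, total = best_overlay_path(measured_latency_ms, "MIT", "Berkeley")
print(path, total)
```

With these numbers the overlay relays traffic through UCLA instead of using the default IP path, which is the rerouting shown in the example slide.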