IC 2003

Distributed Systems, Network Protocols & Applications
Srinivasan Seshan
Computer Science Department
Carnegie Mellon University
Three Major Projects

   Measurement/analysis of networks
   Sensor networks
   Distributed virtual reality




Measurement/Analysis of Networks
   Selfish TCP behavior
   Bottleneck discovery
   Scaling properties of the Internet
   Multihoming

   People:
       Aditya Akella, Jeff Pang, Bruce Maggs and
        Srinivasan Seshan
       Anees Shaikh (IBM)




Measuring the Internet from Everywhere

   What could you learn if you could…
       Have a machine in almost every ISP
       Collect routing information (E-BGP/I-BGP) from
        these ISPs
       Be part of a significant fraction of all Web
        transfers
       Be queried by almost every DNS server in the
        world
   We have access to such a testbed: Akamai

Bottleneck Discovery
   Where are bottlenecks in the Internet?
       Ignoring access links
   What is the capacity of these bottlenecks?
   Initial results
       There is a lot of available bandwidth in the Internet today
            > 45 Mbps on 50% of paths!
       Quantified relative benefit of using larger Tier-1 ISPs over smaller ISPs
       Internal ISP links are bottlenecks more often than expected
       Peering between ISPs not as significant a bottleneck as expected

   [Figure: stub networks connected through ISP1, ISP2, and more ISPs; label: "Bottlenecks?"]

Scaling Properties of the Internet
   How will these bottlenecks change over time?
   Analyzing the combination of Internet topology and routing
   Identifying changes that are needed to make the Internet scale with
    hardware improvements from Moore’s law
   Initial results
        Congestion scales poorly in Internet-like graphs
        Policy-routing does not worsen the congestion
        Alleviation possible via simple, straightforward mechanisms

   [Figure: network graph with congested hot-spots. Callouts: Uniformly scale all capacities? Scale some links faster? Moore’s-law-like scaling sufficient?]

Multihoming
   The effective use of multiple ISPs (multihoming) by stub networks
   How can stub networks like CMU route traffic around bottlenecks?
   Using multiple ISPs can…
        Improve performance and reliability of Internet connectivity
        Make Internet routing robust to failures and attacks
   But need…
        Techniques for stub domains to choose providers
        Monitoring tools to track changes in Internet performance
        Dynamic control over chosen routes

   [Figure: CMU connected via ISP 1 and ISP 2 through the Internet to a destination]

Multihoming
   In a given metro area…
        What maximum performance benefits can multihoming offer?
        How can multihomed networks realize these benefits in practice?

   Initial results
        Multihoming helps, but not much beyond 4 providers
        Careful choice necessary
             Cannot just pick top individual performers
             Performance can be 50% worse for a poor choice of providers
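
The observation that the best set of providers is not simply the top individual performers can be illustrated with a small sketch. The latency table below is invented purely for illustration (it is not measurement data from this study), and `avg_latency`, `top_individuals`, and `best_subset` are hypothetical names. The sketch assumes a multihomed network sends each destination's traffic through whichever chosen provider is fastest for it.

```python
from itertools import combinations

# Hypothetical latencies (ms) from each ISP to three destinations.
# Invented numbers, for illustration only.
LATENCY = {
    "ISP_A": [10, 10, 90],
    "ISP_B": [12, 12, 95],
    "ISP_C": [80, 80, 15],
}

def avg_latency(isps):
    """Average latency when each destination uses the best chosen ISP."""
    n_dest = len(next(iter(LATENCY.values())))
    best = [min(LATENCY[i][d] for i in isps) for d in range(n_dest)]
    return sum(best) / n_dest

def top_individuals(k):
    """Pick the k ISPs that perform best on their own."""
    return tuple(sorted(LATENCY, key=lambda i: avg_latency([i]))[:k])

def best_subset(k):
    """Exhaustively pick the k-subset with the lowest combined latency."""
    return min(combinations(sorted(LATENCY), k), key=avg_latency)

print(top_individuals(2), avg_latency(top_individuals(2)))
print(best_subset(2), avg_latency(best_subset(2)))
```

With these made-up numbers, the two best individual ISPs are slow to the same destination, so pairing one of them with the individually worst ISP gives a lower combined latency. Exhaustive search is only practical for a handful of providers.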

   Future work
        Reasons for observed performance benefit: can we relate
         route/ISP selection to bottleneck observations?
        Impact of ISP cost structure: what are the best choices for a given
         cost?
        How will Internet operation be affected by such “smart” routing?

Sensor Networks

   IrisNet

   People:
       Suman Nath, Yan Ke, and Srinivasan Seshan
       Phil Gibbons, Babu Pillai, Rahul Sukthankar (Intel
        Research)




What if Sensors Were Everywhere?

   Network monitoring
       Packet sniffers as sensors
   Persistent queries/triggered actions
       Show an image when you hear a honk
   Person Locator System
       Where’s Fred?
   Characterization of human activity
       Is the cafeteria busy?

Sensor Services

   Need: infrastructure to simplify creation of sensor-enriched services

   Remove deployment overhead
       Provide a common shared infrastructure of sensors

   Automate common tasks
       Sensor reading collection and storage
       Efficient query processing over readings
       Address privacy concerns of users



IrisNet: Internet-scale Resource-Intensive Sensor Network Services
IrisNet                        Sensor Networks
PCs/PDAs                       mote hardware
Linux, Java, XML, C++          TinyOS, TinyDB, etc.
Internet-scale                 campus-scale
intensive sensor processing    minimal sensor processing
powered nodes                  energy is a key concern
multimedia sensors             scalar sensors
wide variety of services       narrowly focused services
direct Internet connectivity   ad hoc wireless connectivity

Example: Parking Space Finder

   A distributed database maintains
       Spot availability data
       Address of parking spot
       Meter description
       Historical availability data




   Query: Where is the cheapest empty parking spot
    near school?
       Returns driving directions to the best spot
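
A query of this kind can be sketched against a small hierarchical XML document. The element and attribute names below are illustrative assumptions, not IrisNet's actual schema, and `cheapest_empty_spot` is a hypothetical helper. The sketch uses Python's standard `xml.etree.ElementTree` on a single in-memory document, so it does not cover distribution across sites or the driving directions step.

```python
import xml.etree.ElementTree as ET

# Hypothetical hierarchical document; element and attribute names are
# illustrative assumptions, not IrisNet's real schema.
DOC = """
<city>
  <neighborhood name="University">
    <block address="Forbes Ave">
      <spot id="1" available="no" price="0.50"/>
      <spot id="2" available="yes" price="1.00"/>
    </block>
    <block address="Fifth Ave">
      <spot id="3" available="yes" price="0.75"/>
    </block>
  </neighborhood>
  <neighborhood name="Downtown">
    <block address="Grant St">
      <spot id="4" available="yes" price="0.25"/>
    </block>
  </neighborhood>
</city>
"""

def cheapest_empty_spot(root, neighborhood):
    """Return (address, spot id, price) of the cheapest free spot, or None."""
    candidates = []
    for block in root.findall(f"./neighborhood[@name='{neighborhood}']/block"):
        for spot in block.findall("spot[@available='yes']"):
            candidates.append((float(spot.get("price")),
                               block.get("address"), spot.get("id")))
    if not candidates:
        return None
    price, address, spot_id = min(candidates)
    return address, spot_id, price

root = ET.fromstring(DOC)
print(cheapest_empty_spot(root, "University"))
```

The neighborhood level of the hierarchy narrows the search before any individual spot is examined.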




IrisNet Architecture
   [Figure: Sensing Agents and Organizing Agents connected through the Internet. Parking Space Finder organizing agents: University, Downtown, Hill District. Person Finder organizing agents: Amy-John, Kim-Steve, Tom-Zoe.]

Design Decisions

   Sensor feeds processed in an application-specific way near the
    source
       Reduces demand on network
       Requires relatively intensive processing on sensor device
   Distributed, hierarchical XML database stores readings
       Accommodates frequent updates to different readings
       XML supports hierarchical and heterogeneous/evolving
        description of data
       Hierarchical organization enables scalability and rich query
        styles
   Challenges in database processing, image processing
    & distributed systems


Distributed Virtual Reality


   Distributed multiplayer games

   People:
       Ashwin Bharambe, Jeff Pang and
        Srinivasan Seshan




What do Multiplayer Games Look Like?
   Large shared world
       Composed of map information, textures, etc.
       Populated by active entities: user avatars, computer AIs, etc.
   Only parts of world relevant to particular user/player

   [Figure: Game World containing Player 1 and Player 2]

Individual Player’s View


   Interactive environment (e.g. door, rooms)
   Live ammo
   Monsters
   Players
   Game state

Current Game Architectures
   Distributed broadcast-based (e.g., DOOM)
       Every update sent to all participants
       Advantages/disadvantages
         + No central server
         - Waste of bandwidth
         - Synchronized game state: difficult for players to join
           at arbitrary times
   Centralized client-server (e.g., Quake)
       Every update sent to server, which maintains “true” state
       Advantages/disadvantages
         + Reduces overall bandwidth requirements
         + State management, cheat proofing much easier
         - Bottleneck for computation and bandwidth: current
           games limited to about 6000 players
         - Single point of failure
         - Response time limited by client-server latency
   Do not scale well

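
The bandwidth contrast between the two architectures can be made concrete with a back-of-envelope count of messages per game tick. This is a simplified sketch under stated assumptions: every player emits exactly one update per tick, and the server replies with one aggregated state message per client. Real games batch, filter, and compress updates, so the function names and the model are illustrative only.

```python
def broadcast_messages(n_players):
    """Each player sends its update directly to every other player."""
    return n_players * (n_players - 1)

def client_server_messages(n_players):
    """Each player sends one update to the server and, by assumption,
    receives one aggregated state message back."""
    return 2 * n_players

for n in (4, 16, 64):
    print(n, broadcast_messages(n), client_server_messages(n))
```

Under this model the broadcast design grows quadratically in the number of players, while the client-server total grows linearly but passes entirely through one machine.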
Large-Scale Distributed Games
   Need to distribute responsibility for maintaining
    world state and running computer AIs
   Avoid any single point of failure
   Efficient use of available bandwidth
     Every player only receives “relevant” updates: subscribes to updates
   Solution: model game with Publish-Subscribe
       Events: x 100, y 200
       Interests: x ≥ 50, x ≤ 150, y ≥ 150, y ≤ 250

   [Figure: virtual world (arena) with a player at (100,200) and an interest rectangle with corners (50,250) and (150,150)]

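
Using the interest and event values from this slide, matching a publication against a subscription reduces to a rectangle containment check. The sketch below is a minimal illustration; the `Interest` class and its field names are hypothetical and not taken from any actual game or from the system described here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interest:
    """Axis-aligned region of the virtual world a player subscribes to."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float

    def matches(self, event):
        """True if the event's position lies inside the region (inclusive)."""
        return (self.x_min <= event["x"] <= self.x_max
                and self.y_min <= event["y"] <= self.y_max)

# Values taken from the slide: interest 50 <= x <= 150, 150 <= y <= 250,
# and an event published at x = 100, y = 200.
interest = Interest(x_min=50, x_max=150, y_min=150, y_max=250)
event = {"x": 100, "y": 200}

print(interest.matches(event))                  # True: inside the region
print(interest.matches({"x": 300, "y": 200}))   # False: outside in x
```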
Publish-Subscribe Overview
    Publishers produce publications
    Subscribers register their interests via subscriptions

    Key feature: subscription language
        Rich database-like subscription languages (e.g. all publications with
         stock price > 100)
        Subject/channel-based subscriptions (e.g. all publications on the IBM
         stock channel)

    State-of-the-art
        Centralized designs with rich subscriptions
        Scalable distributed designs with channel-based subscriptions
        Unscalable designs with rich subscriptions
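
The difference between the two subscription styles can be sketched as follows. Subscriber names, the dictionary layout, and the function names are hypothetical; the two examples (the IBM stock channel and stock price > 100) come from the slide. Channel-based matching is a single lookup by subject, whereas content-based matching evaluates a predicate per subscription, which hints at why rich languages are harder to scale.

```python
from collections import defaultdict

# Channel-based: subscribers are indexed by subject, so matching a
# publication is a single dictionary lookup.
channels = defaultdict(list)
channels["IBM"].append("alice")

def channel_match(publication):
    return list(channels.get(publication["channel"], []))

# Content-based ("rich"): each subscription is a predicate over the
# publication's attributes, here checked one by one.
predicates = [("bob", lambda p: p["price"] > 100)]

def content_match(publication):
    return [name for name, pred in predicates if pred(publication)]

pub = {"channel": "IBM", "price": 120}
print(channel_match(pub), content_match(pub))
```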



Publish-Subscribe Critical Components

   Subscription language
       Subjects vs. attribute/values
       Exact matches vs. regular expressions?
   Routing mechanism
       Where are subscriptions stored in the system?
       How are publications routed so that they “meet”
        subscriptions?
       How are publications delivered from this
        rendezvous point to subscribers?

Related Systems

   Scribe, Herald
       Scalable, but –
       Restricted subscription language
   Siena, Gryphon
       Flexible subscription language, but –
       Poor scalability due to message flooding


          Delicate balance between expressiveness of
               language and scalability of routing

MERCURY: Subscription Language

   SQL-like but more limited: tradeoff to achieve scalability
       Example: int x ≤ 200
       Enough to support SQL-like range predicates
       Need sortable attribute-values
       Sufficient for modeling games
            Game arenas
            Player statistics, etc.


   How to support this subscription language scalably?
       Use techniques derived from distributed hash tables (DHT)
       Existing DHT-based designs only support exact-match lookup
            Need range-based lookups
            Eliminate the use of cryptographic hashes: must explicitly handle
             load-balancing


MERCURY: Routing Protocol

   Each node responsible for range of attribute values
   For each attribute, nodes arranged into circle
   Each node compares value in message to its range
    and routes along the circle

   [Figure: attribute hub Hx, a circle of nodes responsible for [0, 80), [80, 160), [160, 240), [240, 320)]

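
The routing rule above can be sketched with a ring of nodes that each own a half-open range. This is a minimal illustration, not the MERCURY implementation: forwarding only to the immediate successor is an assumption made for brevity (the slides do not specify direction or shortcut links), and `HubNode`, `build_hub`, and `route` are hypothetical names. The ranges are those shown for hub Hx.

```python
class HubNode:
    """One node of an attribute hub, responsible for values in [lo, hi)."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
        self.successor = None  # next node along the circle

    def owns(self, value):
        return self.lo <= value < self.hi

def build_hub(boundaries):
    """Create nodes for consecutive ranges and link them into a circle."""
    nodes = [HubNode(lo, hi) for lo, hi in zip(boundaries, boundaries[1:])]
    for node, nxt in zip(nodes, nodes[1:] + nodes[:1]):
        node.successor = nxt
    return nodes

def route(start, value):
    """Forward hop by hop along successors until the owner is reached.
    Returns (owner, number of hops)."""
    node, hops = start, 0
    while not node.owns(value):
        node, hops = node.successor, hops + 1
        if node is start:
            raise ValueError("value outside the hub's range")
    return node, hops

hx = build_hub([0, 80, 160, 240, 320])  # ranges from the slide's Hx hub
owner, hops = route(hx[3], 100)         # start at the [240, 320) node
print((owner.lo, owner.hi), hops)
```

A successor-only walk takes a number of hops linear in the hub size, which is the cost that shortcut pointers would aim to reduce.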
Routing Illustrated

   Send subscription to any one attribute hub
   Send publications to all attribute hubs

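   [Figure: hub Hx with node ranges [0, 80), [80, 160), [160, 240), [240, 320); hub Hy with node ranges [0, 105), [105, 210), [210, 320). Subscription: 50 ≤ x ≤ 150, 150 ≤ y ≤ 250. Publication: x 100, y 200. Rendezvous point labeled at the Hx node for [80, 160).]
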
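
The two rules on this slide (a subscription goes to one attribute hub, a publication goes to all of them) can be sketched with the hub ranges and values from the figure. Storing the subscription at every node whose range overlaps it is an assumption about how the rendezvous is set up, and `subscribe`, `publish`, `stored`, and the subscriber name are hypothetical. Hop-by-hop routing within a hub is omitted here.

```python
# Hub layouts from the slide: each hub partitions one attribute's values.
HUBS = {
    "x": [(0, 80), (80, 160), (160, 240), (240, 320)],
    "y": [(0, 105), (105, 210), (210, 320)],
}
# stored[(attribute, node_range)] -> subscriptions held at that node
stored = {(a, r): [] for a, ranges in HUBS.items() for r in ranges}

def subscribe(name, ranges, hub):
    """Store the whole subscription at every node of ONE hub whose range
    overlaps the subscription's range for that hub's attribute."""
    lo, hi = ranges[hub]
    for node_lo, node_hi in HUBS[hub]:
        if lo < node_hi and hi >= node_lo:
            stored[(hub, (node_lo, node_hi))].append((name, ranges))

def publish(event):
    """Send the event to ALL hubs; in each, the node owning the event's
    value checks it against every subscription stored there."""
    matched = set()
    for attr, value in event.items():
        for node_lo, node_hi in HUBS[attr]:
            if node_lo <= value < node_hi:  # rendezvous node in this hub
                for name, ranges in stored[(attr, (node_lo, node_hi))]:
                    if all(a_lo <= event[a] <= a_hi
                           for a, (a_lo, a_hi) in ranges.items()):
                        matched.add(name)
    return matched

subscribe("player1", {"x": (50, 150), "y": (150, 250)}, hub="x")
print(publish({"x": 100, "y": 200}))
```

Because the publication visits every hub, it meets the subscription regardless of which single hub the subscriber chose.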
Why Not Use DHTs (and Cryptographic Hashing)?
   Hashing is good for exact matches, e.g., DHTs
   Want to support range queries
   Possible approach
       Hash each value in the range
       Problems
          Can only be used for discrete-valued attributes
          Too many subscriptions

   [Figure: a range subscription on int x with bounds 1 and 10, expanded into exact-match subscriptions int x = 1, ..., int x = 9, int x = 10]

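
The "hash each value in the range" approach and its cost can be sketched as follows. This is a simplified illustration: mapping a SHA-1 digest to one of `n_nodes` with a modulo stands in for a real DHT's key-space assignment, and `node_for` and `expand_range` are hypothetical names.

```python
import hashlib

def node_for(value, n_nodes=8):
    """Map an exact value to a node by cryptographic hashing (DHT-style)."""
    digest = hashlib.sha1(str(value).encode()).digest()
    return int.from_bytes(digest, "big") % n_nodes

def expand_range(lo, hi):
    """Turn an integer range subscription into one exact-match
    subscription per value, as in the 'hash each value' approach."""
    return [("x", v) for v in range(lo, hi + 1)]

subs = expand_range(1, 10)
print(len(subs))                                # one subscription per value
print(sorted({node_for(v) for _, v in subs}))   # nodes they land on
print(len(expand_range(1, 100_000)))            # grows with range width
```

The number of subscriptions grows with the width of the range, and hashing gives no guarantee that adjacent values land on nearby nodes. A continuous attribute cannot be enumerated this way at all.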
Future Work

   Performance
       Cached pointers: reduce number of overlay hops
       Network-aware placement of nodes: delay competitive
        with centralized systems
   Robustness: need to survive node failures
   Workload: need system to self-tune to workload
   Cheating: detecting various forms of cheating
       Routing, subscriptions, state ownership




Future Work

   Distributed VR faces challenges similar to those of
    many other distributed applications
   Other applications we plan to explore:
       Collaborative applications (whiteboard, shared
        applications, chat servers, etc.)
       Distributed databases
       Distributed simulation (ns-2)
       …

