Overview of Mesh Networking Research @ MSR





Jitendra Padhye
Microsoft Research




January 23, 2006
What are mesh networks?

• Multi-hop wireless networks

• Mostly static nodes

• Unplanned node placement

• Applications: disaster relief, backhaul for city-wide
  wireless networks, meeting meshes, neighborhood meshes,
  Internet connection sharing

• Many startups ….
Three main problems in mesh networking


• Capacity

• Capacity

• Capacity
Why is capacity a problem?




   [Figure: Source → Mesh Router → Destination]



  With a single radio, a node cannot transmit and receive simultaneously.

  A two-hop path has half the capacity of a one-hop path.
         Other interference patterns also possible.

  Seminal result by Gupta and Kumar (2000):
  Per-node capacity = O(1/sqrt(n)) for a network of n nodes
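
To make the scaling concrete, the short Python sketch below evaluates the 1/sqrt(n) bound for a few network sizes. The function name and the normalization constant `c` are illustrative assumptions, not part of the original result; because the bound is asymptotic, only the ratios between network sizes carry meaning.

```python
import math


def per_node_capacity_bound(n: int, c: float = 1.0) -> float:
    """Relative per-node capacity under the O(1/sqrt(n)) scaling.

    `c` is a hypothetical normalization constant, not a measured value.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return c / math.sqrt(n)


if __name__ == "__main__":
    # Quadrupling the number of nodes halves the per-node bound.
    for n in (25, 100, 400):
        print(n, per_node_capacity_bound(n))
```

Under this scaling, growing a mesh from 100 to 400 nodes halves the capacity available to each node.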
MSR’s research on Mesh Network Capacity


• Capacity estimation

• Capacity improvement using multiple radios
  and other techniques

• Feasibility study using realistic traffic
Mesh Network Capacity Estimation

• New framework for estimating capacity of multi-hop wireless networks
   – Gupta-Kumar result is asymptotic
   – Our framework calculates the optimal capacity of a given mesh network for
     a given set of flows
        MobiCom 2003 (Jain, Padhye, Padmanabhan and Qiu)


• Our framework requires knowledge of which links interfere with one
  another
   – Problem of “conflict graph” estimation
   – N nodes → O(N^2) links → O(N^4) link pairs!
   – We developed an approximation technique that takes O(N^2) time
       IMC 2005 (Padhye, Agarwal, Padmanabhan, Qiu, Rao and Zill)
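
The growth in the number of link pairs can be checked with a few lines of Python. This is only a counting sketch: the function names are hypothetical, and it assumes every ordered pair of distinct nodes is a candidate link, which is the worst case.

```python
def count_directed_links(n_nodes: int) -> int:
    """Worst case: every ordered pair of distinct nodes is a link, N*(N-1)."""
    return n_nodes * (n_nodes - 1)


def count_link_pairs(n_nodes: int) -> int:
    """Unordered pairs of distinct links that could interfere with each other."""
    links = count_directed_links(n_nodes)
    return links * (links - 1) // 2


if __name__ == "__main__":
    # A 23-node network, the size of the testbed used in these slides.
    print(count_directed_links(23))  # 506
    print(count_link_pairs(23))      # 127765
```

Even at 23 nodes, testing every link pair directly would take more than a hundred thousand experiments, which is why an O(N^2) approximation matters.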



Key Insight: Multiple radios necessary to improve capacity
Improving capacity using Multiple Radios

• Select best radio to send each packet using locally available
  information
    – Multi-radio unification protocol
        IEEE BroadNets 2004 (Adya, Bahl, Padhye, Wolman and Zhou)
    – Problem: sub-optimal in many cases
• Optimize entire path for a given flow
    – Take into account interference and link capacity along entire path
    – Implemented in Mesh Connectivity Layer (MCL)
         MobiCom 2004 (Padhye, Draves, Zill)
• If second radio has very low bandwidth, can we use it to offload
  signaling?
    – Simulation-based study of separating control and data into different
      frequency bands
         IEEE BroadNets 2005 (Kyasanur, Padhye, Bahl)


   How do we know how much capacity is “enough”?
Feasibility study using realistic traffic

• Collect traffic traces from Microsoft’s wired network

• Replay on mesh testbed

• Study delay characteristics of replayed traffic

• Conclusions:
    – Factors such as specific card brands and placement of servers have
      significant impact; routing metrics have less impact.
    – 2-radio mesh network likely sufficient for supporting normal office traffic
    – Some large delay spikes.


• MobiSys 2006 (Eriksson, Agarwal, Bahl, Padhye)
Ongoing work related to capacity:

• Capacity improvement using network coding

• Use of directional antennas to reduce interference

• Use of spectrum etiquettes and cognitive radios to
  improve spectrum utilization
Other challenges:

• Self-management
   – Network without administrator – is it possible?
   – Engineering challenges such as automatic address assignment


• Security and Fairness
   – Freeloaders
   – Information leakage by observing traffic
   – Malicious nodes can disrupt routing
Backup slides
Mesh Connectivity Layer (MCL)
Design & Implementation

Design Choice
         Multi-hop networking at layer 2.5

Framework
     –    NDIS miniport – provides virtual adapter on virtual link
     –    NDIS protocol – binds to physical adapters that provide next-hop
          connectivity
     –    Inserts a new L2.5 header

Why Layer 2.5?
     –    Works over heterogeneous links (e.g. wireless, powerline)
     –    Transparent to higher layer protocols.
             • works equally well with IPv4 and IPv6
     –    ARP etc. continue to work without any changes

 Features
     –    DSR-like routing with optimizations at virtual link layer
             – Link Quality Source Routing (LQSR)
     –    Incorporates 5 different link selection metrics:
             – Hop count, RTT, Packet Pair, ETX, WCETT
Scope: Technical Problems we looked at
Range and Capacity
    – Off-the-shelf wireless hardware is severely range-limited
    – Throughput of 802.11 MAC degrades rapidly with the number of hops
    Our Solution: multi-radio meshbox, directional ant., NLDP, Interference management, Capacity-cal

Routing
    – Network connectivity is highly dynamic
    – Classical single path & shortest path routing perform poorly in a dense network
    Our Solution: LQSR & MR-LQSR, WCETT (ETX, PacketPair, RTT,..)

Security and Fairness
    – Mesh is susceptible to freeloaders and malicious users
    – Achieving “fairness” without topological and traffic information is difficult
    Our Solution: “Windows certificate”, greedy behavior detection, watchdog mechanism, intrusion detection

Self Management
    – End users are non-technical
    – A no-network operator model is challenging
    Our Solution: M3, watchdog mechanism, data cleaning, liar detection, on-line network simulation, beacon
       stuffing, server placement

Spectrum Management
    – Tragedy of the commons
    – Exploit spectrum white space
    Our Solution: Control channel, dual-frequency meshes, 700-900 MHz, Spectrum etiquettes
Impact of path length on throughput
Experimental Setup

• 23 node testbed

• One IEEE 802.11a radio per node (NetGear card)

• Randomly selected 100 sender-receiver pairs (out of 23x22 = 506)

• 3-minute TCP transfer, only one connection at a time

[Figure: Throughput (Kbps) vs. Byte-Averaged Path Length (Hops)]
If a connection takes multiple paths over lifetime, lengths are byte-averaged.
Total 506 points.
    Solution: Multi-Radio Meshes
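
The byte-averaged path length used on the x-axis of the plot can be sketched as follows. The input format, a list of (hop count, bytes sent) pairs with one entry per path the connection used, is an assumption made for illustration; the slides do not describe how the measurements were recorded.

```python
from typing import Iterable, Tuple


def byte_averaged_path_length(segments: Iterable[Tuple[int, int]]) -> float:
    """Average hop count weighted by the bytes carried on each path.

    `segments` holds (hop_count, bytes_sent) for every path the
    connection used during its lifetime (hypothetical input format).
    """
    segments = list(segments)
    total_bytes = sum(sent for _, sent in segments)
    if total_bytes == 0:
        raise ValueError("connection carried no bytes")
    return sum(hops * sent for hops, sent in segments) / total_bytes


if __name__ == "__main__":
    # 300 bytes over a 2-hop path, then 100 bytes over a 4-hop path.
    print(byte_averaged_path_length([(2, 300), (4, 100)]))  # 2.5
```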
Link Selection Metrics
Many metrics have been studied in the literature
    –   Hop count
    –   Round trip time
    –   Packet pair
    –   Expected data transmission count incl. retransmission
    –   Weighted cumulative expected transmission time
    –   Signal strength stability
    –   Energy related
    –   Link error rate
    –   Location related
    –   …




The first five (shown in red on the original slide) are implemented in MCL
Link Selection Metric for Single Radio: ETX
• Each node periodically broadcasts a probe

• The probe carries information about probes received from neighbors

• Each node can calculate loss rate on forward (Pf) and reverse (Pr) link to each neighbor

• Selects the path with least total ETX

$$\mathrm{ETX} = \frac{1}{(1 - P_f)(1 - P_r)}$$

Advantages
   – Explicitly takes loss rate into account
   – Implicitly takes interference between successive hops into account
   – Low overhead

Disadvantages
   – PHY-layer loss rate of broadcast probe packets is not the same as
     PHY-layer loss rate of data packets
        • Broadcast probe packets are smaller
        • Broadcast packets are sent at lower data rate
   – Does not take data rate or link load into account

Developed by De Couto et al @ MIT (2003)
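
As a minimal sketch of the metric, the Python below computes ETX = 1 / ((1 - Pf) * (1 - Pr)) for each link and picks the path with the least total ETX. The function names, the dictionary layout, and the example loss rates are illustrative assumptions; probing and loss-rate estimation are not modeled.

```python
from typing import Dict, Sequence, Tuple

Link = Tuple[float, float]  # (forward loss rate Pf, reverse loss rate Pr)


def link_etx(pf: float, pr: float) -> float:
    """Expected transmission count of one link, including retransmissions."""
    if not (0.0 <= pf < 1.0 and 0.0 <= pr < 1.0):
        raise ValueError("loss rates must be in [0, 1)")
    return 1.0 / ((1.0 - pf) * (1.0 - pr))


def path_etx(links: Sequence[Link]) -> float:
    """Total ETX of a path is the sum of its link ETX values."""
    return sum(link_etx(pf, pr) for pf, pr in links)


def best_path(paths: Dict[str, Sequence[Link]]) -> str:
    """Name of the candidate path with the least total ETX."""
    return min(paths, key=lambda name: path_etx(paths[name]))


if __name__ == "__main__":
    candidates = {
        "one_lossy_hop": [(0.5, 0.5)],              # ETX = 4
        "two_clean_hops": [(0.1, 0.1), (0.1, 0.1)],  # ETX is about 2.47
    }
    print(best_path(candidates))  # two_clean_hops
```

The example shows why ETX can prefer a longer path: one hop that loses half its packets in each direction costs more expected transmissions than two reliable hops.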
Baseline comparison of Metrics
Single Radio Mesh

Experimental Setup

• 23 node testbed

• One IEEE 802.11a radio per node (NetGear card)

• Randomly selected 100 sender-receiver pairs (out of 23x22 = 506)

• 3-minute TCP transfer, only one connection at a time

Median path length: HOP: 2, ETX: 3.01, RTT: 3.43, PktPair: 3.46

[Figure: Median Throughput (Kbps) for HOP, ETX, RTT, PktPair]

ETX performs the best
Link Selection Metric for Multiple Radios: WCETT

State-of-the-art metrics (shortest path, Packet Pair, RTT, ETX)
  do not leverage channel, range, or data rate diversity


Multi-Radio Link Quality Source Routing (MR-LQSR)
   – Link metric: Expected Transmission Time (ETT)
        Takes bandwidth and loss rate of the link into account

   – Path metric: Weighted Cumulative ETTs (WCETT)
       Combine link ETTs of links along the path
       Takes channel diversity into account

   – Incorporated into source routing



Developed by Draves, Padhye et al @ MSR (2004)
Expected Transmission Time (ETT)
Given:
    –   Loss rate p
    –   Bandwidth B
    –   Mean packet size S
    –   Min backoff window CWmin

Takes bandwidth and loss rate of the link into account

$$\mathrm{ETT} = ET_{xmit} + ET_{backoff}$$

where

$$ET_{xmit} = \frac{S}{B(1 - p)}, \qquad ET_{backoff} = \frac{CW_{min} \, f(p)}{2(1 - p)}$$

$$f(p) = 1 + \sum_{i=0}^{7} 2^{(i-1)} p^{i}$$
WCETT = Combines link ETTs

Need to avoid unnecessarily long paths
   - bad for TCP performance
   - bad for global resources

All hops on a path on the same channel interfere
 – Add ETTs of hops that are on the same channel
 – Path throughput is dominated by the maximum of these sums

Given an n-hop path, where each hop can be on any one of k channels, and
two tuning parameters, a and b:

$$\mathrm{WCETT} = \frac{a \sum_{i=1}^{n} \mathrm{ETT}_i + b \max_{1 \le j \le k} X_j}{a + b}$$

where

$$X_j = \sum_{\text{hop } i \text{ is on channel } j} \mathrm{ETT}_i$$

Select the path with min WCETT
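
A minimal Python sketch of the path metric follows. It assumes the formula reads WCETT = (a * sum of hop ETTs + b * max over channels of X_j) / (a + b), where X_j is the sum of ETTs of the hops on channel j. The function name, the (ETT, channel) input format, and the default a = b = 1 are illustrative choices, not values taken from the slides.

```python
from collections import defaultdict
from typing import Dict, Sequence, Tuple

Hop = Tuple[float, int]  # (ETT of the hop, channel the hop uses)


def wcett(hops: Sequence[Hop], a: float = 1.0, b: float = 1.0) -> float:
    """Weighted cumulative ETT of a path (sketch; a and b are tuning knobs)."""
    if not hops:
        raise ValueError("path must contain at least one hop")
    if a < 0 or b < 0 or a + b == 0:
        raise ValueError("a and b must be non-negative and not both zero")

    total_ett = sum(ett for ett, _ in hops)

    # X_j: sum of ETTs of the hops that share channel j.
    per_channel: Dict[int, float] = defaultdict(float)
    for ett, channel in hops:
        per_channel[channel] += ett
    bottleneck = max(per_channel.values())

    return (a * total_ett + b * bottleneck) / (a + b)


if __name__ == "__main__":
    same_channel = [(1.0, 1), (1.0, 1)]  # both hops interfere
    two_channels = [(1.0, 1), (1.0, 2)]  # hops on different channels
    print(wcett(same_channel))  # 2.0
    print(wcett(two_channels))  # 1.5
```

Both example paths have the same total ETT, but the channel-diverse path scores lower because its busiest channel carries only one hop.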
Baseline Comparison of Metrics
Two Radio Mesh

Experimental Setup

• 23 node testbed

• Randomly selected 100 sender-receiver pairs (out of 23x22 = 506)

• 3-minute TCP transfer

• Two scenarios:
    – Baseline (Single radio):
        802.11a NetGear cards
    – Two radios
        802.11a NetGear cards
        802.11g Proxim cards

Median path length: HOP: 2, ETX: 2.4, WCETT: 3

Median Throughput of 100 transfers (Kbps)

   Metric           Single Radio    Two Radios
   WCETT            1601            2989.5
   ETX              1379            1508
   Shortest Path    1155            844

WCETT utilizes 2nd radio better than ETX or shortest path
Path Length and Throughput
Which metric is best?
Experimental Setup

•   23 node testbed

•   Randomly selected 100 sender-receiver pairs (out of 23x22 = 506)

•   3-minute TCP transfer (transmit as many bytes as possible in 2
    minutes, followed by 1 minute of silence)

[Figure: Hop Length per Testbed Configuration (A, C, D, E, F) for WCETT, ETX, HOP]
[Figure: Throughput (Kbps) per Testbed Configuration (A, C, D, E, F) for WCETT, ETX, HOP]

For 1- or 2-hop paths the choice of metric doesn’t matter
Comparison of Metrics
Wireless Office Scenario

23 node indoor testbed. Two radios (both 802.11a) per node.
11 active clients, 4 servers.

Light Office Traffic: 1 hour, 415 sessions, 19.72 MB total
[Figure: Additional Delay (ms), log scale, per metric. Bar labels: WCETT 89 / 11 / 4; ETX 120 / 6 / 4; HOP 82 / 5 / 3; PKTPAIR 474 / 8 / 3; RTT 179 / 6 / 2]

Heavy Office Traffic: 1 hour, 308 sessions, 587.5 MB total
[Figure: Additional Delay (ms), log scale, per metric. Bar labels: WCETT 590 / 27 / 4; ETX 862 / 31 / 3; HOP 943 / 30 / 3]

Relatively light traffic means performance is okay for all metrics.
WCETT does better under heavy load (worst case delay)
Management:
Resiliency against Liars/Lossy Links

Problem
•   Identify nodes that report incorrect information (liars)
•   Detect lossy links

Assume
•   Nodes monitor neighboring traffic, build traffic reports and periodically share info.
•   Most nodes provide reliable information

Challenge
    Wireless links are error prone and unstable

Approach
•   Watchdogs
•   Find the smallest number of lying nodes to explain inconsistency in traffic reports
•   Use the consistent information to estimate link loss rates

Simulation Results
[Figure: Detect liars. Fraction of lying nodes identified for NL = 1, 2, 5, 8, 10, 15, 20; series: coverage, false positive]
[Figure: Detect lossy links. Fraction of lossy links identified for NL = 1, 2, 5, 8, 10, 15, 20; series: coverage, false positive]
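
One way to picture the "smallest number of lying nodes" step is as a search for the smallest set of nodes that touches every inconsistent pair of reports. The Python sketch below does this by brute force. It is only an illustration under stated assumptions: the slides do not say how inconsistencies are represented or how the search is performed, so the pairwise-conflict input and the function name are hypothetical, and the exhaustive search is practical only for small networks.

```python
from itertools import combinations
from typing import Iterable, Set, Tuple


def smallest_liar_set(nodes: Iterable[str],
                      conflicts: Iterable[Tuple[str, str]]) -> Set[str]:
    """Smallest set of nodes that explains every conflicting report pair.

    `conflicts` lists pairs of nodes whose traffic reports disagree
    (hypothetical input format). At least one node of each pair must lie.
    """
    node_list = sorted(set(nodes))
    conflict_sets = [frozenset(pair) for pair in conflicts]

    # Try candidate sets in order of increasing size; the first hit is minimal.
    for size in range(len(node_list) + 1):
        for candidate in combinations(node_list, size):
            chosen = set(candidate)
            if all(pair & chosen for pair in conflict_sets):
                return chosen
    return set(node_list)


if __name__ == "__main__":
    nodes = ["A", "B", "C", "D"]
    # A's report disagrees with every neighbor; the neighbors agree with each other.
    conflicts = [("A", "B"), ("A", "C"), ("A", "D")]
    print(smallest_liar_set(nodes, conflicts))  # {'A'}
```

Blaming A alone explains all three disagreements, which fits the assumption that most nodes provide reliable information.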