Oral Qualifying Examination
David V. Schuehler


•Papers reviewed:
   –Packet Classification on Multiple Fields
       •Gupta and McKeown

   –Scalable Packet Classification
       •Baboescu and Varghese

   –What Packets May Come: Automata for Network Monitoring
       •Bhargavan, Chandra, McCann and Gunter

   –Protocol Boosters
       •Feldmeier, McAuley, Smith, Bakin, Marcus and Raleigh

Schuehler                                                      1
Services Provided by Packet Classifiers

• Packet Filtering

• Policy Routing

• Accounting & Billing

• Traffic Rate Limiting

• Traffic Shaping


Network Monitoring

• Troubleshoot problems

• Analyze performance

• Validate correctness of operations

• Data gathering

• Network tuning


Heterogeneous Internet
• Fiber Optic
• Copper
• Wireless
• Satellite




First Paper

• Packet Classification on Multiple Fields
    – Pankaj Gupta and Nick McKeown
    – Computer Systems Laboratory
    – Stanford University

• Published in SIGCOMM 1999
    – August, 1999
    – Cambridge, MA




Challenge


• Develop a high performance packet classifier

• Exploit structure and redundancy found in
  existing classifier rule sets




Analysis of 793 Classifiers from 101 ISPs
• 41,505 total rules
• Small rule sets
    – 99% contained < 1000 rules, mean of 50 rules
• Filter on maximum of 8 fields
    – src/dst addr, src/dst port, TOS, protocol, flags
• Small number of protocols filtered
• 10% contain ranges
• 14% contain non-contiguous mask
    – Ex. 137.98.217.0/8.22.160.80
• Duplication found in rule field specifications
• 8% of rules were redundant

Structure of Classifiers

• Small amount of rule intersection in existing
  classifiers
• For 1734 rules in 4 dimensions, found 4316
  overlapping regions – worst case is 10^13




Recursive Flow Classification (RFC)

 • Perform mapping from packet header fields to
   classification ID in multiple phases
 • Each phase consists of multiple parallel lookups
 • Each lookup is a reduction in bit length
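
The multi-phase reduction can be illustrated with a minimal Python sketch. It assumes two 4-bit header chunks and only two phases, far smaller than the layouts in the paper. The function names (build_chunk_table, build_final_table, rfc_classify) and the range-based rule format are hypothetical. Phase 0 maps each chunk value to a small equivalence-class ID, and the final phase maps a pair of IDs to the highest-priority matching rule (lowest index).

```python
def build_chunk_table(ranges, width):
    """Phase 0: map every value of a width-bit chunk to an equivalence-class ID."""
    table, bitmaps, ids = [], [], {}
    for value in range(1 << width):
        bits = 0
        for rule, (lo, hi) in enumerate(ranges):
            if lo <= value <= hi:
                bits |= 1 << rule          # this rule matches the chunk value
        if bits not in ids:                # first time this set of rules is seen
            ids[bits] = len(bitmaps)
            bitmaps.append(bits)
        table.append(ids[bits])
    return table, bitmaps


def build_final_table(bitmaps_a, bitmaps_b):
    """Final phase: index by a pair of class IDs, store the best rule."""
    table = []
    for a in bitmaps_a:
        row = []
        for b in bitmaps_b:
            both = a & b                   # rules matching both chunks
            row.append((both & -both).bit_length() - 1 if both else None)
        table.append(row)
    return table


def rfc_classify(chunk_tables, final_table, packet):
    """One lookup per chunk in phase 0, then one lookup in the final phase."""
    ids = [table[value] for table, value in zip(chunk_tables, packet)]
    return final_table[ids[0]][ids[1]]


# Three illustrative rules over two 4-bit fields, listed in priority order.
rules = [((2, 5), (0, 15)), ((0, 15), (8, 8)), ((0, 15), (0, 15))]
table_a, classes_a = build_chunk_table([r[0] for r in rules], 4)
table_b, classes_b = build_chunk_table([r[1] for r in rules], 4)
final = build_final_table(classes_a, classes_b)
print(rfc_classify([table_a, table_b], final, (3, 8)))
```

In this toy rule set each 4-bit chunk collapses to one of two class IDs, which is the reduction in bit length the slide refers to. All of the matching work is done during preprocessing, so classification is only table lookups.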




Packet Classification using RFC




RFC Performance Tuning
• Number of phases
    – Time (# of lookups)
• Reduction tree selected
    – Space (memory utilization)
• Tuning operation
    – Select number of phases
    – Combine chunks with most
      correlation
    – Combine as many chunks as
      possible
• Tree A is optimal



Memory – Time Tradeoff




• 2 Phases: < 10 GBytes
• 3 Phases: < 2.5 MBytes
• 4 Phases: < 1.1 MBytes

Rule Preprocessing Time




Software Performance
• 333 MHz Pentium-II (Windows NT)
• Worst case time double that of average
• Average time for 100,000 classifications




Adjacency Groups
• Combine rules which contain differences in
  one dimension, but are otherwise identical

• Lose knowledge of which rule a packet
  matched

• Additional preprocessing work required

• Reduces the total number of rules

• Handles 15,000 rules in 3.85 MB
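
A minimal sketch of the rule-combining idea follows. It assumes each field is a set of allowed values and that only rules with the same action may be merged (an assumption, since a merged group no longer records which member matched). The names try_merge and merge_adjacent are hypothetical, and the sketch ignores the rule-priority handling that a real preprocessor would need.

```python
def try_merge(r, s):
    """Merge two rules that share an action and differ in exactly one field."""
    if r["action"] != s["action"]:
        return None
    differing = [i for i, (a, b) in enumerate(zip(r["fields"], s["fields"]))
                 if a != b]
    if len(differing) != 1:
        return None
    i = differing[0]
    fields = list(r["fields"])
    fields[i] = r["fields"][i] | s["fields"][i]   # union of the one differing field
    return {"fields": tuple(fields), "action": r["action"]}


def merge_adjacent(rules):
    """Repeatedly merge pairs of rules until no pair can be combined."""
    rules = list(rules)
    merged = True
    while merged:
        merged = False
        for a in range(len(rules)):
            for b in range(a + 1, len(rules)):
                combined = try_merge(rules[a], rules[b])
                if combined is not None:
                    rules[a] = combined
                    del rules[b]
                    merged = True
                    break
            if merged:
                break
    return rules


# Illustrative rules: (source address set, destination port set).
rules = [
    {"fields": (frozenset({"10.0.0.1"}), frozenset({80})), "action": "permit"},
    {"fields": (frozenset({"10.0.0.2"}), frozenset({80})), "action": "permit"},
    {"fields": (frozenset({"10.0.0.2"}), frozenset({443})), "action": "deny"},
]
groups = merge_adjacent(rules)
print(len(groups))
```

The first two rules differ only in the source address, so they collapse into one group; the third has a different action and stays separate.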

 Summary
• Exploit structure & redundancy in rules
• Recursive Flow Classification (RFC)
   – 1 million packets/sec in S/W
   – 30 million packets/sec in H/W
• Supports < 6000 rules, < 15,000 with Adj Grp
• Utilizing knowledge of rule set to reduce complexity
• Combine rules (adjacency groups) to reduce the number
  of chunk equivalence classes
• Hardware performance optimistic
• Problems with small number of phases and large rule sets



Second Paper

• Scalable Packet Classification
    – Florin Baboescu & George Varghese
    – Dept. of Computer Science & Engineering
    – University of California, San Diego

• Published in SIGCOMM 2001
    – August, 2001
    – San Diego, CA




Challenge

• Develop a high performance packet classifier
  that supports large rule sets (100,000 rules)

• Exploit structure and redundancy found in
  existing classifier rule sets

• Extend Bell Labs/Lucent Bit Vector search
  algorithm




Lucent Bit Vector

• Point location in multi-dimensional space

• Parallel lookups for each dimension

• Bit vector generated for each field (dimension)

• Take intersection of result vectors

• Search is linear with respect to number of rules

• Scales to 10,000 rules
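
The bit vector search can be sketched in Python as follows. This is a minimal illustration that assumes non-negative integer fields and inclusive (lo, hi) ranges per rule. The names build_dimension, lookup and bv_classify are hypothetical, and Python integers stand in for the wide bit vectors.

```python
from bisect import bisect_right


def build_dimension(ranges):
    """Precompute one bit vector per elementary interval of a single field."""
    # Every range start and every (range end + 1) begins a new interval.
    points = sorted({0} | {lo for lo, _ in ranges} | {hi + 1 for _, hi in ranges})
    bitmaps = []
    for point in points:
        bits = 0
        for rule, (lo, hi) in enumerate(ranges):
            if lo <= point <= hi:
                bits |= 1 << rule          # rule covers this whole interval
        bitmaps.append(bits)
    return points, bitmaps


def lookup(dimension, value):
    """Find the interval containing value and return its bit vector."""
    points, bitmaps = dimension
    return bitmaps[bisect_right(points, value) - 1]


def bv_classify(dimensions, packet):
    """AND the per-field bit vectors; the lowest set bit is the best rule."""
    result = (1 << 64) - 1                 # all ones; assumes at most 64 rules
    for dimension, value in zip(dimensions, packet):
        result &= lookup(dimension, value)
    return (result & -result).bit_length() - 1 if result else None


# Illustrative rules in priority order: (source range, destination port range).
rules = [((10, 20), (80, 80)), ((0, 255), (0, 1023)), ((15, 30), (80, 443))]
dims = [build_dimension([r[0] for r in rules]),
        build_dimension([r[1] for r in rules])]
print(bv_classify(dims, (16, 80)))
```

Each field contributes at most 2n+1 interval start points for n rules. The per-field lookups are independent and could run in parallel, while the final AND touches every bit of the vectors, which is why the search cost grows linearly with the number of rules.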



Lucent Bit Vector (continued)




  Max 2n+1
  intervals for n
  rules


Aggregate Bit Vector
• Rule Aggregation
    – Bit vectors are large (scale with # of rules)
    – Bit vectors are sparsely populated
    – Packets match at most 4 rules
    – Large rule sets created by combining smaller
      disjoint rule sets

• Rule Rearrangement
    – Rearrange rules to improve aggregation
    – Reduce false matches
    – Must compute lowest cost for all matches


Aggregation Example




Rearrangement Example
 • Aggregation size = 2
 • Packet from source X to destination Y
            Rule   Field1   Field2    Rule   Field1   Field2
             F1      X       A1        F1      X       A1
             F2     A1        Y        F2      X       A2
             F3      X       A2        F3      X       A3
             F4     A2        Y        …       …        …
             F5      X       A3       F30      X       A30
             F6     A3        Y       F31      X        Y
             F7      X       A4       F32     A1        Y
             …       ...      …       F33     A2        Y
            F60     A30       Y        …       …        …
            F61      X        Y       F61     A30       Y

      Before Rearrangement           After Rearrangement
      30 false matches               No false matches
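
The false-match counts in this example can be reproduced with a short Python sketch. It assumes exact-match fields, as the table suggests, and treats a false match as a block whose aggregate bits are set in every field although no rule in the block matches. The helper names aggregate and count_false_matches are hypothetical.

```python
def aggregate(bits, size):
    """OR together each block of `size` bits into one aggregate bit."""
    return [any(bits[i:i + size]) for i in range(0, len(bits), size)]


def count_false_matches(rules, packet, size):
    # One bit vector per field: bit i is set when rule i matches that field.
    vectors = [[rule[f] == packet[f] for rule in rules]
               for f in range(len(packet))]
    aggregates = [aggregate(v, size) for v in vectors]
    false_matches = 0
    for block, start in enumerate(range(0, len(rules), size)):
        if all(agg[block] for agg in aggregates):
            # The aggregates suggest a match, so the real block must be read.
            end = min(start + size, len(rules))
            real = any(all(v[i] for v in vectors) for i in range(start, end))
            if not real:
                false_matches += 1
    return false_matches


# Rule order before rearrangement: F1=(X,A1), F2=(A1,Y), ..., F61=(X,Y).
before = []
for k in range(1, 31):
    before.append(("X", f"A{k}"))
    before.append((f"A{k}", "Y"))
before.append(("X", "Y"))

# Rule order after rearrangement: all (X,Ak), then (X,Y), then all (Ak,Y).
after = ([("X", f"A{k}") for k in range(1, 31)] + [("X", "Y")]
         + [(f"A{k}", "Y") for k in range(1, 31)])

packet = ("X", "Y")
print(count_false_matches(before, packet, 2))   # 30
print(count_false_matches(after, packet, 2))    # 0
```

Before rearrangement every block of two rules holds one rule matching Field1 and one matching Field2, so all 30 leading blocks look promising and must be fetched. After rearrangement the set bits of the two fields overlap only in the block that contains the real match.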


 Results
Worst case memory access for 4 databases with 5 fields (A=32)
Improvement: 27% - 54% unsorted       40% - 75% sorted




Synthetic Database Results




Multiple Levels of Aggregation
• Comparison of one & two levels of aggregation
• Zero length prefixes are injected
• 60% improvement for large rule set
• Number of memory accesses required




Summary
• Add aggregation & rearrangement to Lucent
  Bit Vector algorithm

• Order of magnitude faster than BV scheme

• Suitable for large rule sets (100,000 rules)

• Multiple levels of aggregation reduce memory
  operations for large databases

• Wide memory widths improve efficiency

Third Paper
• What Packets May Come: Automata for
  Network Monitoring
    – Karthikeyan Bhargavan & Carl A. Gunter
    – University of Pennsylvania
    – Satish Chandra & Peter J. McCann
    – Bell Laboratories

• Published in POPL 2001
    – Principles of Programming Languages
    – January, 2001
    – London, UK


Challenge

• Formulate an external network protocol
  monitor as a language recognition problem

• Given a language specification of input &
  output sequences, develop a second specification
  that corresponds to the sequences observed
  externally




Complications

• Traffic observed by the monitor could differ from
  traffic seen by the target

• Protocol specifications are often vague

• Implementations of protocols vary

• Observed language could be significantly
  different from language that target device
  processes

Basic Monitor

• Sequence at M
     iqa iqb iqc iqe oqd
• Sequence at S
            ida idb odd


            id             iq

            od             oq



Admissibility
Given string at S: i1 i2 o1 i3 o2 i4 i5
Queue sizes: input = 3, output = 2

A: iq1 id1 iq2 id2 od1 oq1 iq3 id3 od2 oq2 iq4 id4 iq5 id5
B: iq1 iq2 id1 id2 od1 oq1 iq3 id3 od2 oq2 iq4 id4 iq5 id5

C: iq1 iq2 iq3 id1 id2 od1 oq1 id3 od2 oq2 iq4 id4 iq5 id5
D: iq1 iq2 iq3 id1 id2 od1 id3 od2 oq1 oq2 iq4 id4 iq5 id5

E: iq1 iq2 iq3 id1 id2 od1 iq4 iq5 id3 od2 oq1 oq2 id4 id5
F: iq1 iq2 iq3 iq4 id1 id2 od1 iq5 id3 od2 oq1 oq2 id4 id5
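
One way to read these sequences is as a bounded-queue check, sketched below in Python. The model is an assumption: an input occupies the input queue between its iq and id tokens, an output occupies the output queue between its od and oq tokens, and tokens of each kind appear in index order. The function name admissible is hypothetical.

```python
def admissible(trace, input_size, output_size):
    """Check a token interleaving against the assumed bounded-queue model."""
    limit = {"i": input_size, "o": output_size}
    queued = {"i": 0, "o": 0}
    seen = {"iq": 0, "id": 0, "od": 0, "oq": 0}
    enters = {"iq", "od"}                  # tokens that add to a queue
    for token in trace.split():
        kind, index = token[:2], int(token[2:])
        if index != seen[kind] + 1:
            return False                   # tokens of one kind must be in order
        seen[kind] = index
        stream = kind[0]                   # "i" for inputs, "o" for outputs
        if kind in enters:
            if queued[stream] == limit[stream]:
                return False               # queue would overflow
            queued[stream] += 1
        else:
            if queued[stream] == 0:
                return False               # nothing queued to deliver
            queued[stream] -= 1
    return True


sequences = {
    "A": "iq1 id1 iq2 id2 od1 oq1 iq3 id3 od2 oq2 iq4 id4 iq5 id5",
    "B": "iq1 iq2 id1 id2 od1 oq1 iq3 id3 od2 oq2 iq4 id4 iq5 id5",
    "C": "iq1 iq2 iq3 id1 id2 od1 oq1 id3 od2 oq2 iq4 id4 iq5 id5",
    "D": "iq1 iq2 iq3 id1 id2 od1 id3 od2 oq1 oq2 iq4 id4 iq5 id5",
    "E": "iq1 iq2 iq3 id1 id2 od1 iq4 iq5 id3 od2 oq1 oq2 id4 id5",
    "F": "iq1 iq2 iq3 iq4 id1 id2 od1 iq5 id3 od2 oq1 oq2 id4 id5",
}
results = {name: admissible(seq, 3, 2) for name, seq in sequences.items()}
print(results)
```

Under this reading, sequences A through E stay within an input queue of 3 and an output queue of 2, while F fails because iq4 arrives when three inputs are already queued.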

Elimination of Output Buffer
• CU: the maximum number of input symbols
  without an intervening output symbol
• M(S, m, n) => M(S, m+CU*n, 0)

• Example m = 2, n = 2, CU = 2
iq1 iq2 od1 id1 iq3 id2 iq4 od2 id3 iq5 id4 iq6 oq1 oq2

Move iq and oq tokens as far left as possible
iq1 iq2 iq3 iq4 iq5 iq6 od1 oq1 id1 id2 od2 oq2 id3 id4

Maximum input buffer size = 6 (2 + 2 * 2)

Dealing with Packet Loss

• CL: the number of dropped tokens between two id
  tokens must be less than CL
• LM(S,m,n) => LM(S,m+CU*CL*n,0)


• Example        iq1 il1 iq2 iq3 id2 od1 il3 oq1

• Tokens at M iq1 iq2 iq3 oq1

• Tokens at S    id2 od1


Brute Force Search
• g is a function that checks S on a sequence of tokens
  and indicates whether it is in LS
• F(g,T) is a function that tells us whether trace T
  corresponds to proper execution with respect to S

• Construct all possible token sequences at S based on
  tokens observed at M
• Iterate through each sequence checking for an
  admissible string
• If found, observed string is in LS
• Otherwise, failure



No Data Loss Optimizations (CL= 1)
• P1: Counting Properties
    – Every output must consume between cmin & cmax inputs
• P2: Independent Inputs and Outputs
    – Validate input and output sequences separately
• P3: Periodic Outputs
    – Output is produced every P inputs
• P4: Deterministic Placement of Outputs
    – One position for output after sequence of inputs
• P5: Contiguously Enabled Commutative Outputs
    – Output is valid for a contiguous range of inputs
• P6: Output-checkpointed Automata
    – For each output, there is at most one next state
• P7: Finite State Machines
    – If g is FSM, BFS has polynomial bound in # of states &
      size of buffers (|T| * B^2)

Optimizations with Data Loss
• P1*: Counting Properties
   – Buffer limit becomes m + cmax * CL * (n + 1)
• P2o: Independent Output Properties
   – Same as no loss case
• P8: Insert-closed Commutative Outputs
   – If string is accepted, so is string with arbitrary inputs added
• P7*: Finite State Machines
   – Still bounded, but must consider 2^B lossy substrings
• P9: Deterministic Stateless Transducers
   – Stateless automata where all inputs are distinct
• P10: Output-checkpointed Stateful Transducers
   – Unique state after consuming odx
• P6*: Output-checkpointed Automata
   – Check maximum of 2^(B+CU*CL) strings against g at output

 Complexities
P1: Counting (P6, P7)
P2: Independent In & Outputs (P5)
P2o: Independent Outputs (P2, P8)
P3: Periodic Outputs (P4)
P4: Deterministic Placement (P5)
P5: Commutative Outputs (ALL)
P6: Checkpointed Automata (ALL)
P7: Finite State Machines (ALL)
P8: Commutative Outputs (P5)
P9: Finite State Machines (P7, P10)
P10: Stateless Transducers (P4, P6)
(implies)

Monitoring TCP
• Property 1 describes counting property
    – Monitors ACKs generated for at least every
      other message

• Property 2 describes independent inputs &
  outputs
    – Monitors non-decreasing sequence numbers

• Property 3 describes periodic outputs (no loss)
    – Monitors ACKs generated for contiguously
      received set of segments
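
The counting property behind the first TCP example can be sketched as a simple check over the token sequence at the device. This is a simplified illustration that assumes each output resets the count of unacknowledged inputs. The function name satisfies_counting and the "i"/"o" token encoding are hypothetical.

```python
def satisfies_counting(tokens, cmin, cmax):
    """Check that every output follows between cmin and cmax inputs."""
    pending = 0                            # inputs seen since the last output
    for token in tokens:
        if token == "i":
            pending += 1
            if pending > cmax:
                return False               # too many inputs without an output
        else:
            if pending < cmin:
                return False               # output produced too early
            pending = 0
    return True


# "ACK at least every other message" read as cmin = 1, cmax = 2.
trace = ["i", "i", "o", "i", "o", "i", "i"]
print(satisfies_counting(trace, 1, 2))
```

The check needs only one counter, which is what makes counting properties cheap to monitor compared with the brute-force search.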


Summary

• External monitor developed as a language
  recognition problem
• Problem unbounded with respect to space & time
• Properties defined to limit complexity
• Impressive goal to attempt monitoring of complex
  protocols with finite automata
• Disappointed at TCP monitoring examples
• Does not account for loss of output events
• Monitor should be placed close to endpoint


Fourth Paper

• Protocol Boosters
    – D.C. Feldmeier, A.J. McAuley, J.M. Smith, D.S.
      Bakin, W.S. Marcus, T.M. Raleigh
    – Bellcore and University of Pennsylvania

• Published in IEEE JSAC
    – Journal on Selected Areas in Communications
    – April, 1998




Challenge

• Develop a new methodology for protocol
  design

• Support localized customization in
  heterogeneous networks

• Provide for rapid protocol evolution




Current Limitations with IP Internet
• Protocols evolve slowly with respect to
  advances in networking technology
    – IPv6
    – Multicast
    – Short duration connections (HTTP)

• Sacrifice efficiency in order to support a large
  heterogeneous network
    – Satellite communication
    – Wi-Fi wireless Ethernet
    – ATM


Protocol Booster

• Software or hardware module that transparently
  improves protocol performance




One-Element Protocol Boosters

• UDP checksum generation
    – Generate UDP checksum within network
• TCP ACK compression
    – Compress multiple ACKs on slow link
• TCP congestion control
    – Generate duplicate ACKs to reduce window size
• TCP ARQ booster
    – Caches packets and performs retransmission


    ARQ (automatic repeat request)


Two-Element Protocol Boosters

• Forward error correction coding
    – Add parity and correction bits
    – Regenerates missing data
• Jitter elimination for real-time communication
    – Match packet arrival rate at other end
    – Eliminates jitter by increasing latency
• TCP Selective ARQ
    – Cache packets and add sequence numbers
    – Generate NACK for missing packet
    – Retransmit packet on receipt of NACK
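
The forward error correction booster pairs an encoder at one end of a lossy segment with a decoder at the other. The sketch below uses a single XOR parity packet as a stand-in, since the slide does not name the actual code used. It assumes equal-length packets and at most one loss per group, and the function names add_parity and regenerate are hypothetical.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def add_parity(packets):
    """Encoder side: append one parity packet covering the whole group."""
    parity = bytes(len(packets[0]))        # all-zero block
    for packet in packets:
        parity = xor_bytes(parity, packet)
    return packets + [parity]


def regenerate(received):
    """Decoder side: rebuild the single missing packet (marked as None)."""
    missing = received.index(None)
    present = [p for p in received if p is not None]
    value = bytes(len(present[0]))
    for packet in present:
        value = xor_bytes(value, packet)   # XOR of the rest equals the lost one
    restored = list(received)
    restored[missing] = value
    return restored[:-1]                   # strip the parity packet


sent = add_parity([b"ABCD", b"EFGH", b"IJKL"])
damaged = [sent[0], None, sent[2], sent[3]]
print(regenerate(damaged))
```

Because the decoder strips the parity packet and restores the original data, the end systems see the same protocol as before, which is the transparency a booster is meant to preserve.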

Fast Evolution
• No standards body

• Developed by small team

• Contained insertion into network

• Free market supports competition and
  collaboration

• Proprietary boosters offer competitive
  advantage

Targeted Improvements

• Quick fix applied to individual network segments

• Rapid deployment

• Isolated boosters

• Targeted trouble spots

• Doesn’t affect other areas of the network


Comparisons to Other Approaches
• Link Layer Adaptation
    – Only operates at link layer

• Protocol Conversion
    – Conversion changes message syntax

• Protocol Termination
    – Loses end-to-end properties

• Special Purpose End-to-End Protocols
    – Cannot account for changes in network

Example Implementation

• Protocol boosters added to Linux & NetBSD systems
• Forward error correction booster implemented
• UDP data traffic
• Random and bursty error models used
• Booster successfully reduced effective packet loss




Summary

• Targeted improvements
• Help solve problems with heterogeneous
  Internet
• Boosters can be nested
• Booster should be invisible to end systems
• Placement important
• Rapid development & deployment
• Two-element boosters need to be paired



Topics Covered
• Packet Classification
    – Examine data set for structure
    – Develop targeted solutions
    – Optimize for lookups

• Network Monitoring
    – Automata for validating aspects of a protocol’s behavior
      based on monitored traffic

• Protocol Boosters
    – Improve IP protocol performance through a
      heterogeneous network


Final Thoughts
• An enhanced packet classifier can be considered a
  one-element protocol booster

• Both packet classification papers take a divide and
  conquer approach performing multiple lookups in
  parallel

• Classifiers could be combined with protocol booster to
  determine which packets to process

• Automata based monitor could validate properties of
  protocol booster

Questions





								