					Computer Networks
          Lecture 4: IP Addressing-route lookup



              Younghee Lee




                Prof. Younghee Lee
        The Internet Protocol
   Identifier: A sequence number to identify a datagram uniquely.
   Flags: More Fragments bit (cleared only on the last fragment of the original
    datagram), Don't Fragment bit (the datagram may be discarded at a subnet that
    cannot carry it unfragmented, so source routing is advisable)
   Fragment offset: indicates where in the original datagram this fragment belongs
   Time to live: somewhat similar to a hop count
   Protocol: the next higher-level protocol




      Type of Service
   TOS subfield: guidance to the IP entity indicating the
    type or quality of service
    – The way in which a router learns which routes support
      which TOS
        » The domain administrator preconfigures the TOS associated with the
          routes
        » A routing protocol monitors the TOS along the routes by tracking
          delays, throughputs, and dropped datagrams (e.g. OSPF)
   Typically ignored now
   Replaced by DiffServ




       IPv4 Options

   Security:
    – Security label to be attached to a datagram
   Source routing
    – A sequenced list of router addresses that specifies the routes to be
      followed. May be strict or loose
   Route recording
    – allocated to record the sequence of routers visited by the datagram
   Timestamping
    – The source IP entity and some intermediate routers add a time
      stamp (precision to milliseconds)




     Naming and Addressing
   Naming versus addressing
    – naming is typically a high-level description
    – addresses refer to specific physical resources
    – distinction hard to define but often clear:
        » icu.ac.kr
        » 128.9.23.93
        » D74A049C2384
   Naming/addressing formats
    – structure: flat versus partitioned (hierarchical)
    – duration: dynamic versus static
    – scope: local versus global

   Domain Name System (DNS) names are names of hosts
   DNS binds host names to interfaces
   Routing binds interface names to paths
Name/Address Structure
   Hierarchical address space
    – address space has structure: sequence of fields
         » fields identify autonomous organizations, geographical
           location, ..
    –   hierarchy can simplify routing
    –   easily supports distributed assignment of addresses
    –   can result in inefficient use of the address space
    –   example: IP addresses, postal address, telephone
        numbers, ..
   Flat address space
    –   address has no structure: single field
    –   easier to use full address space
    –   lacks support for routing
    –   example: IEEE addresses (48 bits)

    IP Addressing: introduction
   IP address: 32-bit identifier for host, router interface
   interface: connection between host, router and physical link
     – routers typically have multiple interfaces
     – host may have multiple interfaces
     – IP addresses associated with interface, not host, router
   [Figure: example network whose interfaces are numbered 223.1.1.1, 223.1.1.2,
    223.1.1.3, 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.9, 223.1.3.1, 223.1.3.2, 223.1.3.27]

   223.1.1.1 = 11011111 00000001 00000001 00000001
                  223       1        1        1
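The dotted-decimal to binary correspondence shown for 223.1.1.1 can be checked with a short Python sketch. The function name `to_binary` is made up for this illustration.

```python
def to_binary(dotted: str) -> str:
    """Render a dotted-decimal IPv4 address as four 8-bit groups."""
    return " ".join(f"{int(octet):08b}" for octet in dotted.split("."))

# Each decimal field is one byte of the 32-bit address.
print(to_binary("223.1.1.1"))
```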

    IP addresses: how to get one?
Hosts (host portion):
 hard-coded by system admin in a file
 DHCP: Dynamic Host Configuration Protocol: dynamically get address:
    “plug-and-play”
     – host broadcasts “DHCP discover” msg
     – DHCP server responds with “DHCP offer” msg
     – host requests IP address: “DHCP request” msg
     – DHCP server sends address: “DHCP ack” msg
   Auto-configuration: no DHCP?
     – IPv6 stateless autoconfiguration
     – MANET AUTOCONF :
         » Standalone
         » With gateway: can be relatively simple but how to select gateway?
         » Stand-alone for most of the time but temporarily connected to the infrastructured network
               e.g. car network connected while parked and disconnected otherwise

         » Strong DAD, Prophet, AROD


Hierarchical addressing: route aggregation
  Hierarchical addressing allows efficient advertisement of routing
  information:

  Customers of Fly-By-Night-ISP:
    Organization 0    200.23.16.0/23
    Organization 1    200.23.18.0/23
    Organization 2    200.23.20.0/23
    ...
    Organization 7    200.23.30.0/23

  Fly-By-Night-ISP advertises to the Internet:
    “Send me anything with addresses beginning 200.23.16.0/20”
  ISPs-R-Us advertises to the Internet:
    “Send me anything with addresses beginning 199.31.0.0/16”
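A minimal sketch of the aggregation arithmetic, using Python's standard `ipaddress` module. It assumes that Organizations 3 to 6, elided on the slide, hold the /23 blocks in between (200.23.22.0/23 through 200.23.28.0/23).

```python
import ipaddress

# Eight consecutive /23 blocks: third octet 16, 18, ..., 30 (middle ones assumed).
orgs = [ipaddress.ip_network(f"200.23.{third}.0/23") for third in range(16, 32, 2)]

# collapse_addresses merges adjacent networks into the fewest covering prefixes.
aggregate = list(ipaddress.collapse_addresses(orgs))
print(aggregate)
```

Eight blocks of 512 addresses fill exactly one block of 4096 addresses, which is why the ISP can advertise a single /20.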

     Addressing in IP v4
   Addresses are hierarchical.
     – address contains hint about location
   Original design: five address classes (classful addressing)
   Total IP address space: about 4 billion (2**32) addresses
     –   Class A: 128 networks, 16M hosts
     –   Class B: 16K networks, 64K hosts
     –   Class C: 2M networks, 256 hosts
     –   Class D: for multicast
     –   Class E: prefix 1111, reserved for experimental use
   127.0.0.1: local host (a.k.a. the loopback address)
   Host bits all set to 0: network address
   Host bits all set to 1: broadcast address

                   class   type bits   network bits   host bits

                   A       0            7             24
                   B       10          14             16
                   C       110         21              8
                   D       1110        28 (multicast group)
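The leading "type" bits decide the class, so a classful router only needs the first byte. A small sketch of that test follows; `address_class` is a made-up name.

```python
def address_class(dotted: str) -> str:
    """Classify an IPv4 address by the leading bits of its first byte."""
    first = int(dotted.split(".")[0])
    if first >> 7 == 0b0:
        return "A"
    if first >> 6 == 0b10:
        return "B"
    if first >> 5 == 0b110:
        return "C"
    if first >> 4 == 0b1110:
        return "D"
    return "E"  # remaining pattern: 1111

for addr in ("10.0.0.1", "128.9.23.93", "223.1.1.1", "224.0.0.1"):
    print(addr, address_class(addr))
```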
    Subnetting
   Hierarchy can be extended to more than two layers.
   Makes it possible to break up a network into multiple subnets.
     – provides flexibility to manage networks
     – packet forwarding between subnets is also done using
       routers, i.e. same as in the Internet
   Provides autonomy.
     – subnets inside network are not visible outside the network
   [Figure: the two-part address (Network | Host) becomes a three-part address
    (Network | Subnet | Host); one network contains Subnet 1, Subnet 2 and Subnet 3]

IP Addressing: Issues
   Running out of IP address space: short term
    solutions.
    – Classless inter-domain routing
    – Dynamic address assignment
    – Network address translation
   Longer term solution for IP address shortage:
    IPv6.
    – Move to longer addresses: IPv6




IP Address Utilization (‘98)




  http://www.caida.org/outreach/resources/learn/ipv4space/
Problems with Simple Address Structure
   Address space is not used very efficiently.
     – Address spaces for networks can only be 2**8, 2**16, 2**24 in size
         » Sizes differ by two orders of magnitude
     – Organizations that do not fit in smaller network (e.g. 257 hosts) need to
       use a size that is significantly larger

   Running out of addresses.
     – Especially true for mid-sized networks
     – Class B – greatest problem
         » Sparsely populated – but people refuse to give it back
     – Class C too small for most domains
     – Very few class A – IANA (Internet Assigned Numbers Authority) very
       careful about giving them out

   Routing tables are becoming too big.
     – Hundreds of thousands of entries

  Ideas Behind Classless Inter-Domain Routing

 Use address space more efficiently by relaxing the
 strict address structure.
  – length of network address is variable
  – generalization of subnetting idea
  – makes network use more efficient
 Have Internet service providers hand out blocks of
 addresses to their customers.
  – customers of ISPs appear like subnets of the ISP to other
    ISPs
  – reduces size of the routing tables


    CIDR Addressing
   Length of network address is variable and specified using a netmask.
     – Can make the address space just large enough
   Can merge a group of adjacent class C addresses to form a larger
    network address.
   [Figure: two adjacent networks (Network 0 | Hosts, Network 1 | Hosts) are merged
    into one Network | Hosts address; the netmask has 1s over the network part and 0s
    over the host part]
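To make "just large enough" concrete, here is a hedged sketch that picks the longest prefix able to hold a given number of hosts. It assumes two addresses per network are reserved (network and broadcast); `prefix_for_hosts` is a made-up helper.

```python
def prefix_for_hosts(hosts: int) -> int:
    """Longest IPv4 prefix length whose block still fits `hosts` hosts."""
    needed = hosts + 2                      # assumption: network + broadcast reserved
    host_bits = (needed - 1).bit_length()   # smallest b with 2**b >= needed
    return 32 - host_bits

for n in (254, 257, 1000):
    p = prefix_for_hosts(n)
    print(f"{n} hosts -> /{p} ({2 ** (32 - p)} addresses)")
```

Under classful addressing the 257-host organization would need a class B (65,536 addresses); with a variable-length netmask a block of 512 is enough.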


    CIDR Address Allocation: Example
   ISP: 128.5.X.X
     – Customer 1: 128.5.010xxxxx.X
     – Customer 2: 128.5.110xxxxx.X
     – Customer 3: 128.5.011xxxxx.X
   Single route entry held by the other ISPs: 128.5/16
   [Figure: ISP 2, ISP 3, ISP 4 and ISP 5, each with hosts, connect to the ISP,
    which serves Customer 1, Customer 2 and Customer 3 and their hosts]
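Written in prefix notation, the three customer patterns fix the top three bits of the third byte, so each is a /19 inside the ISP's /16. The sketch below checks that with the standard `ipaddress` module; the dictionary layout is only an illustration.

```python
import ipaddress

isp = ipaddress.ip_network("128.5.0.0/16")

# Third byte patterns 010xxxxx, 110xxxxx, 011xxxxx -> 3 extra network bits -> /19.
customers = {
    "Customer 1": ipaddress.ip_network(f"128.5.{0b01000000}.0/19"),
    "Customer 2": ipaddress.ip_network(f"128.5.{0b11000000}.0/19"),
    "Customer 3": ipaddress.ip_network(f"128.5.{0b01100000}.0/19"),
}

for name, block in customers.items():
    # Every customer block is covered by the ISP's single advertised route.
    print(name, block, block.subnet_of(isp))
```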
    Route Lookup with CIDR
   Need to store a netmask with each entry to indicate the size of the
    network identifier.
     – can no longer rely on type field
   Problem: with CIDR there can be multiple matches when looking up
    an address.
     – Can for example happen when a customer switches ISPs but keeps
       addresses
   Solution: lookup is based on longest prefix match.
     – when there are multiple matches, the match with the most bits
       (longest netmask) wins
     – Complicates route lookup!
   Example:
     – Ex-ISP entry:  10110110       -> ISP 1
     – My entry:      10110110 010   -> ISP 2
     – Address 10110110 010 0100011 matches both entries; the longer one wins (ISP 2)
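A minimal sketch of longest prefix match as a linear scan, using the two entries from the slide's example as bit strings. Real routers use faster structures; the function name and the table layout are assumptions made for illustration.

```python
def longest_prefix_match(table, address_bits):
    """Return the next hop of the longest prefix that the address starts with."""
    best_prefix, best_hop = None, None
    for prefix, next_hop in table.items():
        if address_bits.startswith(prefix):
            if best_prefix is None or len(prefix) > len(best_prefix):
                best_prefix, best_hop = prefix, next_hop
    return best_hop

table = {
    "10110110": "ISP 1",      # the ex-ISP's aggregate
    "10110110010": "ISP 2",   # the more specific entry for the moved customer
}

print(longest_prefix_match(table, "101101100100100011"))  # both entries match
print(longest_prefix_match(table, "101101101110000000"))  # only the aggregate matches
```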

    NATs
   NAT maps (private source IP, source port) onto (public
    source IP, unique source port)
     – reverse mapping on the way back
     – destination host does not know that this process is happening
   Very simple working solution.
     – NAT functionality fits well with firewalls

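    [Figure: host A sends a packet with source (Priv A IP, A Port) and destination
     (B IP, B Port). The NAT rewrites the source to (Publ A IP, A Port’) before the
     packet reaches B. The reply from B is addressed to (Publ A IP, A Port’) and is
     rewritten back to (Priv A IP, A Port) on the way in.]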
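The mapping and its reverse can be sketched as two dictionaries. This is only an illustration of the idea, not a real NAT: the class name `Napt`, the starting port 40000, and the addresses used below are all assumptions.

```python
class Napt:
    """Toy port-translating NAT: one public IP, one public port per private flow."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out = {}    # (private ip, private port) -> public port
        self.back = {}   # public port -> (private ip, private port)

    def outbound(self, src_ip, src_port):
        """Translate the source of an outgoing packet, creating a mapping if needed."""
        key = (src_ip, src_port)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.out[key]

    def inbound(self, dst_port):
        """Reverse mapping for a reply; None if no session created this port."""
        return self.back.get(dst_port)

nat = Napt("203.0.113.7")
print(nat.outbound("10.0.0.2", 5060))   # first flow gets the first public port
print(nat.inbound(40000))               # reply is mapped back to the private host
```

The mapping stays fixed for as long as the entry exists, which is the consistency requirement a NAT has to meet during a session.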
    NAT Considerations
   NAT has to be consistent during a session.
    – Set up mapping at the beginning of a session and maintain it
      during the session
    – Recycle the mapping at the end of the session
        » May be hard to detect
   NAT only works for certain applications.
    – Some applications (e.g. ftp) pass IP information in payload
    – Need application level gateways to do a matching translation
   NAT has to be consistent with other protocols.
    – ICMP, routing, …
   Many flavors of NAT exist.
    – Basic, network address port translation (NAPT), bi-directional,..



    NAT/firewall traversal of VoIP
 Types of NAT functionality.
  – Full Cone: If a host behind a NAT sends a packet from address:port
    {A:B}, the NAT process translates the address:port {A:B} to {X:Y} and
    causes a binding of {A:B} to {X:Y}. Any incoming packets (from any
    address) destined for {X:Y} are translated to {A:B}.
  – Partial/Restricted Cone: Starts out like a full cone. However, once the first
    packet comes inward, the bindings are turned into complete
    four-component bindings. From then on, only packets from that source are
    accepted and NATed.
  – Symmetric NAT: If a host behind a NAT sends a packet from
    address:port {A:B} to {C:D}, the NAT process translates the source
    address:port {A:B} to {X:Y} and causes a binding of {A:B} to {C:D} to
    {X:Y}. Only packets from {C:D} to {X:Y} are accepted in the reverse
    direction and these are NATed to {A:B}.
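The difference between the two extremes comes down to which inbound sources a binding accepts. The sketch below covers only the full cone and symmetric cases as described above; the function `accepts` and its arguments are invented for illustration.

```python
def accepts(nat_type, binding_remote, packet_source):
    """Decide whether an inbound packet to {X:Y} is let through.

    binding_remote is the {C:D} the inside host originally sent to;
    packet_source is the (address, port) the inbound packet comes from.
    """
    if nat_type == "full cone":
        return True                             # any outside source may use {X:Y}
    if nat_type == "symmetric":
        return packet_source == binding_remote  # only {C:D} may answer
    raise ValueError(f"unknown NAT type: {nat_type}")

remote = ("198.51.100.1", 5060)     # hypothetical {C:D}
stranger = ("192.0.2.9", 1234)      # hypothetical third party

print(accepts("full cone", remote, stranger))
print(accepts("symmetric", remote, stranger))
```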
 NAT/firewall traversal of VoIP
 NAT problem
  – 'Bindings' can only be initiated by outgoing traffic.
  – Unsolicited incoming calls cannot be supported.
     » Like an incoming call to a PABX, which can't be put through to an extension without an attendant.




   NAT/firewall traversal of VoIP
 Solutions to NAT problem
  – Universal Plug and Play (UPnP)
       » limited to small installations.
  – Simple Traversal of UDP Through Network Address
    Translation devices (STUN)
       » STUN does not work with the type most commonly found in corporate
         networks - the symmetric NAT.
  –   TURN
  –   ICE
  –   Application Layer Gateway
  –   Manual Configuration
  –   Tunnel Techniques
    NAT/firewall traversal of VoIP
   STUN
    – The STUN protocol enables a SIP
      client to discover whether it is
      behind a NAT, and to determine
      the type of NAT.
        » STUN server: “This is what I see as
          the source address and port”

   TURN
    – Server that is inserted in the
      media and signalling path. This
      TURN server is located either in
      the customer's DMZ or in the
      Service Provider network.
        » Increases latency and packet loss


    Skype : From the KaZaA community
   A peer-to-peer VoIP client developed
    by the creators of KaZaA in 2003 : P2P – SIP
   It has better voice quality than the
    MSN and Yahoo IM applications
   It encrypts calls end-to-end, and
    stores user information in a
    decentralized fashion
   Auto-detect NAT/firewall settings
    – STUN and TURN
   Allows searching a user (e.g., kun*)
   Promote to super node
    – Based on availability, capacity
   Conferencing
      Kazaa
   FastTrack (aka Kazaa)
    – Modifies the Gnutella protocol into a two-level hierarchy
        » Hybrid of Gnutella and Napster
    – Group leader
        » Nodes that have better connection to Internet
        » Act as temporary directory servers for other nodes in group
        » Maintains database, mapping names of content to
          IP address of its group member
        » Not a dedicated server; an ordinary node
    – Bootstrapping node
        » A peer that wants to join the network contacts this node.
        » This node can designate this peer as new bootstrapping node.
    – Standard nodes
        » Connect to super nodes and report list of files
        » Allows slower nodes to participate
    – Broadcast (Gnutella-style) search across group leader peers; query
      flooding
    – Drawbacks
        » Fairly complex protocol to construct and maintain the overlay network
        » Group leaders have more responsibility. Not truly decentralized
        » Still not purely serverless (bootstrapping node is an “always up” server)
    [Figure: overlay network of ordinary peers and group leader peers, with the
     neighboring relationships in the overlay drawn between them]
     IPv6

   Initial motivation: 32-bit address space
    projected to be completely allocated by 2008.
    – => 128 bit address
   Additional motivation:
    – header format helps speed processing/forwarding
    – header changes to facilitate QoS
    – new “anycast” address: route to “best” of several
      replicated servers
   IPv6 datagram format:
    – fixed-length 40 byte header
    – no fragmentation allowed at intermediate routers (only the source may fragment)
 IPv6 Header (Cont)
Priority: identify priority among datagrams in flow
Flow Label: identify datagrams in same “flow”
           (concept of “flow” not well defined).
Next header: identify upper layer protocol for data




     IPv6 Header: Flow Label
   A flow:
    – A sequence of packets sent from a particular source to a
      particular (unicast or multicast) destination for which the source
      desires special handling by the intervening routers.
        » A flow may comprise multiple TCP connections: file transfer application
        » A single application may generate multiple flows: multimedia conferencing
              one flow for audio, one for graphic window, ... with different
               requirements
   Rules applied to the flow label
    – The source assigns a flow label to a flow. Chosen randomly in range 1
      to 2**24 - 1.
      * a table with 2**24 (16 million) entries: memory burden.
      * one entry in the table per active flow: search the entire table
      => hash table approach, CAM?




    Other Changes from IPv4
   Checksum: removed entirely to reduce processing time at
    each hop
   Options: allowed, but outside of header, indicated by “Next
    Header” field
   ICMPv6: new version of ICMP
    – additional message types, e.g. “Packet Too Big”
    – multicast group management functions
   IPv6 eliminates fragmentation at routers (only the source may fragment)
   Easy configuration
    – Provides stateless auto-configuration using hardware MAC address
      to provide unique base
   Additional requirements
    – Support for security
    – Support for mobility
Migration from IPv4 to IPv6
   Interoperability with IPv4 is necessary for gradual
    deployment.
   Two mechanisms:
    – dual stack operation: IPv6 nodes support both address types
    – tunneling: tunnel IPv6 packets through IPv4 clouds
   Unfortunately there is little motivation for any one
    organization to move to IPv6.
    – the challenge is the existing hosts (using IPv4 addresses)
    – little benefit unless one can consistently use IPv6
        » can no longer talk to IPv4 nodes
    – stretching address space through address translation seems to
      work reasonably well


Dual Stack Approach




Tunneling


                  IPv6 inside IPv4 where needed




      IPv6 Addresses
   An interface may have multiple unicast addresses.
    – Allows a subscriber that uses multiple access providers across the same
      interface to have separate addresses aggregated under each
      provider's address space
   Longer Internet addresses allow for aggregating addresses by
    hierarchies of network, access provider, geography, corporation…
    – smaller routing tables, faster table lookups
   Address types
    – Unicast: an identifier for a single interface
    – Anycast: an identifier for a set of interfaces. Delivered to one of the
      interfaces (the “nearest” one, for example)
    – Multicast: an identifier for a set of interfaces. Delivered to all interfaces.



     IPv6 Stateless Autoconfiguration
   Local communication with no intervention
    – Generate link-local address
        » corresponds to installed Ethernet network adapters. The last 64 bits of
          the IPv6 address are known as the interface identifier, which is derived from
          the 48-bit MAC address of the network adapter.
        » Perform Duplicate Address Detection (DAD)
    – This looks like this:
        »   FE80:0:0:0:XXXX:XXXX:XXXX:XXXX, i.e. a prefix of FE80::/64
        »   The X's are the EUI-64 address (extended unique identifier; 24 bits for company id)
        »   They could also be a random 64-bit value.
        »   The only requirement is that the address be unique.
    – Start sending data
   Global communication with no stateful server
   Adds devices with no user configuration
   Stateful configuration: DHCP
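A sketch of how a link-local address can be built from a 48-bit MAC address. The slides only say the interface identifier is derived from the MAC; the steps used here (flip the universal/local bit, insert FF FE in the middle) are the common modified EUI-64 construction and are an assumption beyond the slide text, as are the function name and the sample MAC.

```python
import ipaddress

def link_local_from_mac(mac: str) -> ipaddress.IPv6Address:
    """Build FE80::/64 + modified EUI-64 interface identifier from a MAC address."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    octets[0] ^= 0x02                      # flip the universal/local bit
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    prefix = bytes.fromhex("fe80") + bytes(6)   # FE80:0:0:0 as 8 bytes
    return ipaddress.IPv6Address(prefix + interface_id)

# Hypothetical MAC address, used only for illustration.
print(link_local_from_mac("00:1a:2b:3c:4d:5e"))
```

The host would still run Duplicate Address Detection on the result before using it.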
     Routing : source routing
   Source routing
    – List entire path in packet
   Router processing
    – Examine first step in directions
    – Strip first step from packet
    – Forward to step just stripped off
   Advantages
    – Switches can be very simple and fast
   Disadvantages
    – Variable (unbounded) header size
    – Sources must know or discover topology (e.g., failures)
   Typical use
    – Ad-hoc networks (DSR)
    – Machine room networks (Myrinet)
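The three router-processing steps listed above can be sketched as follows. The packet is modelled as a plain dictionary and the hop names are hypothetical.

```python
def forward(packet):
    """One router's work: examine the first step, strip it, forward to it."""
    next_hop = packet["route"][0]
    remaining = {"route": packet["route"][1:], "payload": packet["payload"]}
    return next_hop, remaining

packet = {"route": ["R1", "R2", "B"], "payload": "hello"}
while packet["route"]:
    hop, packet = forward(packet)
    print("forward to", hop, "remaining route:", packet["route"])
```

The router keeps no table at all, but the header grows with the path length, which is the trade-off listed under advantages and disadvantages.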

     Routing : Virtual Circuits/Tag Switching
   Connection setup phase
     – Each router allocates flow ID on local link
     – VC connection id
   Each packet carries connection ID
   Router processing
     – Lookup flow ID – simple table lookup
     – Replace flow ID with outgoing flow ID
     – Forward to output port
   Advantages
     –   More efficient lookup (simple table lookup)
     –   More flexible (different path for each flow)
     –   QoS: reserve bandwidth at connection setup
     –   Easier for hardware implementations
   Disadvantages
     – Complex signalling to route connection setup request : stateful
     – More complex failure recovery – must recreate connection state
   Typical uses
     – ATM – combined with fixed-size cells
     – MPLS – tag switching for IP networks
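The per-packet work in a virtual-circuit router is a single exact-match lookup followed by a label swap. A minimal sketch, with an invented table that stands in for the state installed at connection setup:

```python
# (input port, incoming flow ID) -> (output port, outgoing flow ID)
# Entries are hypothetical; in practice they are created during connection setup.
vc_table = {
    (1, 17): (3, 42),
    (2, 17): (3, 9),   # same flow ID on another link is a different connection
}

def switch(in_port, flow_id, table=vc_table):
    """Look up the flow ID, replace it, and choose the output port."""
    out_port, out_flow_id = table[(in_port, flow_id)]
    return out_port, out_flow_id

print(switch(1, 17))
print(switch(2, 17))
```

Flow IDs only need to be unique per link, which is why the key includes the input port.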

     Routing : IP routing
   Each switch has forwarding table of destination -> next hop
   Distributed routing algorithm for calculating forwarding tables
   Routing table size
     – One entry for every host on the Internet
         » 100M entries, doubling every year
     – One entry for every LAN
         » Every host on LAN shares prefix
         » Still too many, doubling every year
     – One entry for every organization
         » Every host in organization shares prefix
         » Requires careful address allocation
   Advantages
     – Stateless – simple error recovery
   Disadvantages
     – Every switch knows about every destination
         » Potentially large tables
     – All packets to destination take same route
     Lookup mechanism
   Exact match search:
    – MPLS, ATM..
        » Direct lookup
        » Associative lookup: Content Addressable
          Memory (CAM)
              Ternary CAM: 0, 1, x

        » Hashing: binary search
              Perfect hash function <= not easy,
               Update <= complex
                  – Multiple hash function, bloom filter
        » Binary search tree

   Longest Prefix match
    – IP
    – Radix trie
    – Binary search on prefix interval
    Longest Prefix Match is Harder than Exact Match
   The destination address of an arriving packet does not
    carry with it the information to determine the length of the
    longest matching prefix
   Hence, one needs to search among the space of all prefix
    lengths; as well as the space of all prefixes of a given
    length
   Metrics for Lookup Algorithms
    –   Speed (= number of memory accesses)
    –   Storage requirements (= amount of memory)
    –   Low update time (support ~5K updates/s)
    –   Scalability
         » With length of prefix: IPv4 unicast (32b), Ethernet (48b), IPv4 multicast (64b),
           IPv6 unicast (128b)
         » With size of routing table: (sweetspot for today’s designs = 1 million)
    – Flexibility in implementation
    – Low preprocessing time
Longest Prefix Match
    LPM in IPv4
     Use 32 exact match algorithms for LPM!


    [Figure: the network address is fed to 32 exact-match units, one for prefixes of
     each length (1, 2, ..., 32); a priority encoder picks the longest length that
     matched and outputs the port]
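A software sketch of the same idea: one exact-match table (a Python dict) per prefix length, probed from longest to shortest. Hardware would probe all lengths in parallel and use a priority encoder; the sequential loop here is a simplification, and the routes and port names are invented.

```python
import ipaddress

def prefix_bits(network: str) -> str:
    """Bit string of a prefix, e.g. '200.23.16.0/20' -> its first 20 bits."""
    net = ipaddress.ip_network(network)
    return f"{int(net.network_address):032b}"[:net.prefixlen]

def build_tables(routes):
    tables = [dict() for _ in range(33)]          # index = prefix length 0..32
    for network, port in routes.items():
        bits = prefix_bits(network)
        tables[len(bits)][bits] = port
    return tables

def lookup(tables, address: str):
    bits = f"{int(ipaddress.ip_address(address)):032b}"
    for length in range(32, -1, -1):              # longest length has priority
        port = tables[length].get(bits[:length])
        if port is not None:
            return port
    return None

tables = build_tables({"200.23.16.0/20": "port 1", "200.23.18.0/23": "port 2"})
print(lookup(tables, "200.23.19.5"))   # matches both prefixes
print(lookup(tables, "200.23.20.1"))   # matches only the /20
```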


        Tree, Tries
   Trie: Prefix tree
   Binary search trie: use binary tree paths to encode prefixes

   [Figure: three structures side by side: a binary search tree over N entries
    (depth log2 N), a binary search trie, and a prefix tree]
   Example prefixes stored in the trie (prefix, next hop):
       001xx   2
       0100x   3
       10xxx   1
       01100   5
   Advantage: simple to implement
   Disadvantage: one lookup may take O(m), where m is number of bits
    (32 in the case of IPv4)
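A minimal binary (unibit) trie holding the four example prefixes 001, 0100, 10 and 01100 with next hops 2, 3, 1 and 5. Lookup walks one bit per step and remembers the last next hop seen, so it costs at most m steps for an m-bit address. The class and function names are assumptions for this sketch.

```python
class Node:
    def __init__(self):
        self.child = {}        # "0" or "1" -> Node
        self.next_hop = None   # set only where a prefix ends

def insert(root, prefix, next_hop):
    node = root
    for bit in prefix:
        node = node.child.setdefault(bit, Node())
    node.next_hop = next_hop

def lookup(root, address_bits):
    node, best = root, root.next_hop
    for bit in address_bits:
        node = node.child.get(bit)
        if node is None:
            break                      # fell off the trie
        if node.next_hop is not None:
            best = node.next_hop       # longest match so far
    return best

root = Node()
for prefix, hop in (("001", 2), ("0100", 3), ("10", 1), ("01100", 5)):
    insert(root, prefix, hop)

for address in ("00111", "01001", "10101", "01100", "11000"):
    print(address, "->", lookup(root, address))
```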
     Skip Count vs. Path Compression
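   [Figure: a binary trie holding prefixes P1 to P4 (left) and the same trie with its
    one-way branch removed (right); the removed branch is recorded either as a skip
    count ("Skip 2") or as the skipped bits themselves ("11", path compressed)]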


   Removing one-way branches ensures the number of trie nodes is at most twice the
    number of prefixes (case: trie containing a small number of very long strings)
     – Patricia tries (Practical Algorithm To Retrieve Information Coded In Alphanumeric);
        radix trie
   Using a skip count requires exact match at end and backtracking on
    failure -> path compression is simpler
         Multibit Tries
   Binary trie
     – Depth: w, Degree: 2, Stride: 1 bit
   Multi-bit trie
     – Depth: w/k, Degree: 2**k, Stride: k bits

   Expanded trie
     – If stride = k bits, prefix lengths that are not a multiple of k need to be expanded
     – To speed up lookup, branch on multiple bits at each decision instead of just one.
     – The number of bits used is the “stride length”
     – Expansion uses up more space
     – Also, each entry requires two fields, because some entries hold both a pointer and
       a prefix (e.g. P2, P5, and P6)
     – Update speed versus memory size tradeoff
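
The expansion step can be sketched as follows, assuming prefixes are bit strings mapped to next hops. The function names and the sample table are hypothetical. A prefix whose length is not a multiple of the stride is replaced by all of its extensions to the next multiple; when an expansion collides with a longer original prefix, the longer one wins.

```python
from typing import Dict, List

def expand_prefix(bits: str, stride: int) -> List[str]:
    """Expand one prefix to the next length that is a multiple of the stride."""
    pad = (-len(bits)) % stride
    if pad == 0:
        return [bits]
    return [bits + format(i, f"0{pad}b") for i in range(2 ** pad)]

def expand_table(prefixes: Dict[str, str], stride: int) -> Dict[str, str]:
    """Expand a whole table; longer original prefixes overwrite shorter ones."""
    expanded: Dict[str, str] = {}
    # Shorter prefixes are written first so that more specific ones overwrite them.
    for bits, next_hop in sorted(prefixes.items(), key=lambda kv: len(kv[0])):
        for e in expand_prefix(bits, stride):
            expanded[e] = next_hop
    return expanded

# Hypothetical table: three prefixes, stride k = 2.
table = expand_table({"1": "A", "10": "B", "101": "C"}, stride=2)
# table == {"10": "B", "11": "A", "1010": "C", "1011": "C"}
```

Three original prefixes become four entries here, which is the space cost of expansion noted above.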




       Binary Search on Prefix Intervals
       [Lampson98]


                      Name      Prefix           Interval
                      P1        /0               0000…1111
                      P2        00/2             0000…0011
                      P3        1/1              1000…1111
                      P4        1101/4           1101…1101
                      P5        001/3            0010…0011

       [Figure: the address line from 0000 to 1111 with the ranges of P1 to P5 drawn above
        it; their endpoints cut the line into six disjoint intervals, I1 to I6.]
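
A small sketch of the idea on the 4-bit example above, assuming the longest matching prefix is precomputed for each interval so that a lookup is a single binary search. Function names are hypothetical, and this is an illustration rather than the data structure of [Lampson98].

```python
from bisect import bisect_right

W = 4  # address width in bits, as in the example table

# Prefixes from the table, written as bit strings ("" is the /0 default).
PREFIXES = {"P1": "", "P2": "00", "P3": "1", "P4": "1101", "P5": "001"}

def prefix_range(bits, width=W):
    """Lowest and highest address covered by a prefix."""
    low = int(bits.ljust(width, "0"), 2)
    high = int(bits.ljust(width, "1"), 2)
    return low, high

def build_intervals(prefixes, width=W):
    # Every range start, and every address just past a range end, opens a new interval.
    points = set()
    for bits in prefixes.values():
        low, high = prefix_range(bits, width)
        points.add(low)
        if high + 1 < 2 ** width:
            points.add(high + 1)
    starts = sorted(points)
    # Precompute the longest prefix covering each interval.
    best = []
    for s in starts:
        covering = []
        for name, bits in prefixes.items():
            low, high = prefix_range(bits, width)
            if low <= s <= high:
                covering.append((len(bits), name))
        best.append(max(covering)[1] if covering else None)
    return starts, best

def lookup(starts, best, addr):
    """Binary search for the interval containing addr."""
    i = bisect_right(starts, addr) - 1
    return best[i] if i >= 0 else None

STARTS, BEST = build_intervals(PREFIXES)
# STARTS == [0, 2, 4, 8, 13, 14]: six intervals
# BEST   == ["P2", "P5", "P1", "P3", "P4", "P3"]
```

The five prefixes produce six intervals, so the search takes O(log N) comparisons per lookup.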
     Tree Bitmap
   Used in high-speed routers (e.g. Cisco)
   Goal:
    – Wire-speed forwarding at OC-192 (10 Gbps)
    – Minimize memory accesses
   Goes back to the unibit trie to avoid the problems of expansion
    and leaf pushing




    Fast Longest Prefix Match
   Lulea's Routing Lookup Algorithm (SIGCOMM '97)
    – use a three-level data structure
   Multi-bit Tries
   Controlled Prefix Expansion [Sri98]
   Binary Search on Prefix Intervals [Lampson98]
   Binary search on prefixes: Waldvogel, SIGCOMM '97
   Longest prefix matching using bloom filters
   Route caches
    – Temporal locality
    – Many packets to same destination



     Bloom Filter
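   Method for representing a set A = {a1, a2, …, an} of n elements (keys) to support
    membership queries.
   Probability of a false positive (m = number of bits in the filter,
    k = number of hash functions):
      $$\left(1 - \left(1 - \frac{1}{m}\right)^{kn}\right)^{k} \approx \left(1 - e^{-kn/m}\right)^{k}$$
   The right-hand side is minimized
    for $k = \ln 2 \cdot \frac{m}{n}$, in which case it
    becomes $\left(\frac{1}{2}\right)^{k} \approx (0.6185)^{m/n}$.
   Figure: A Bloom filter with 4 hash functions.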
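
A minimal Bloom filter sketch, assuming m bits and k hash functions (the standard symbols). Deriving the k hash values by salting SHA-256 is an illustrative choice, not something the slide prescribes, and the class and function names are hypothetical.

```python
import hashlib
import math

class BloomFilter:
    def __init__(self, m: int, k: int):
        self.m, self.k = m, k
        self.bits = bytearray(m)  # one byte per bit, for clarity rather than compactness

    def _positions(self, key: str):
        # k hash values obtained by salting SHA-256 with the hash index (assumption).
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, key: str) -> None:
        for p in self._positions(key):
            self.bits[p] = 1

    def __contains__(self, key: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(self.bits[p] for p in self._positions(key))

def false_positive_rate(m: int, k: float, n: int) -> float:
    """Approximate false-positive probability (1 - e^(-kn/m))^k."""
    return (1 - math.exp(-k * n / m)) ** k

def optimal_k(m: int, n: int) -> float:
    """Number of hash functions that minimizes the approximation: ln 2 * m / n."""
    return math.log(2) * m / n

bf = BloomFilter(m=1024, k=4)
for prefix in ["00", "001", "1", "1101"]:
    bf.add(prefix)
```

Membership tests on inserted keys always succeed; a key that was never inserted may still test positive, with probability approximated by `false_positive_rate`.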



  Fast Longest Prefix Match
 Content addressable memory (CAM)
  – Hardware-based route lookup
  – Input = tag, output = value associated with tag
  – Requires exact match with tag
     » Multiple cycles (1 per prefix length searched) with a single CAM
     » Multiple CAMs (1 per prefix length) searched in parallel
  – Ternary CAM
     » 0, 1, and don't-care values in tag match
     » Priority (i.e. longest prefix) by order of entries in CAM
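
A software sketch of ternary matching, assuming each entry is a string over `0`, `1`, and `*` (don't care) and that entries are kept ordered with the longest prefix first. The class name and the 4-bit sample prefixes are hypothetical. Real TCAM hardware compares all entries in parallel; the loop below only imitates the priority encoder that selects the first match.

```python
def to_ternary(prefix_bits: str, width: int) -> str:
    """Pad a prefix with don't-care symbols up to the full tag width."""
    return prefix_bits + "*" * (width - len(prefix_bits))

def matches(pattern: str, key: str) -> bool:
    """A tag bit matches if the pattern bit is equal or is a don't care."""
    return all(p in ("*", k) for p, k in zip(pattern, key))

class TernaryCAM:
    def __init__(self, width: int):
        self.width = width
        self.entries = []  # (pattern, value), kept sorted longest prefix first

    def insert(self, prefix_bits: str, value: str) -> None:
        self.entries.append((to_ternary(prefix_bits, self.width), value))
        # Fewer don't-care bits means a longer prefix, so it sorts earlier.
        self.entries.sort(key=lambda e: e[0].count("*"))

    def lookup(self, key: str):
        # First match in priority order is the longest matching prefix.
        for pattern, value in self.entries:
            if matches(pattern, key):
                return value
        return None

tcam = TernaryCAM(width=4)
for name, bits in {"P1": "", "P2": "00", "P3": "1", "P4": "1101", "P5": "001"}.items():
    tcam.insert(bits, name)
```

Keeping the entries sorted is what makes updates costly compared with lookups: an insertion may have to move existing entries to preserve the priority order.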




Memory Technology (2006)

Technology        Max single-chip density   $/chip ($/MByte)        Access speed   Watts/chip
Networking DRAM   64 MB                     $30-$50 ($0.50-$0.75)   40-80 ns       0.5-2 W
SRAM              8 MB                      $50-$60 ($5-$8)         3-4 ns         2-3 W
TCAM              2 MB                      $200-$250 ($100-$125)   4-8 ns         15-30 W



    Performance Comparison: Complexity
Algorithm                      Lookup          Storage   Update
Binary trie                    W               NW        W
Patricia                       W^2             N         W
Path-compressed trie           W               N         W
Multi-ary trie                 W/k             N*2^k     -
LC trie                        W               N         -
Lulea                          -               -         -
Binary search on trie levels   log W           N log W   -
Binary search on intervals     log(2N)         N         -
TCAM                           1               N         W

   Performance Comparison
Algorithm                                        Lookup (ns) Storage (KB)
Patricia (BSD)                                   2500        3262

Multi-way fixed-stride optimal trie (3-levels)   298         1930

Multi-way fixed-stride optimal trie (5-levels)   428         660
LC trie                                          -           700

Lulea                                            409         160

Binary search on trie levels                     650         1600

6-way search on intervals                        490         950

Lookups with direct access                       15-60       9-33 * 1000

TCAM                                             15-20       512

  Packet classification
 Packet   classification
  – The process of categorizing packets into “flows” in an
    Internet router
  – All packets belonging to the same flow obey a
    predefined rule and are processed in a similar manner
    by the router
 Flow-aware router: keeps track of flows and
 performs similar processing on packets in a flow
  – Non-best-effort services, firewalls, QoS
 Flow-unaware router (packet-by-packet router):
 treats each incoming packet individually

Example of Classification Rules
   Access-control in firewalls
     – Deny all e-mail traffic from ISP-X to Y
   Policy-based routing
     – Route IP telephony traffic from X to Y via ATM
   Differentiate quality of service
     – Ensure that no more than 50 Mbps are injected from ISP-X
   Committed Access Rate (rate limiting)
     – Rate limit WWW traffic from sub-interface#739 to 10Mbps
   Traffic measurement: ftp?, p2p?...




Complexity: Hard Problem
   N rules and k header fields, for k > 2
    – $O((\log N)^{k-1})$ time and $O(N)$ space, or
    – $O(\log N)$ time and $O(N^k)$ space
   How many rules?
    – Largest for firewalls and similar: ~1700
    – DiffServ/QoS: much larger, ~100k (?)




     Multi-field Packet Classification

          Field 1        Field 2            …     Field k   Action
 Rule 1   5.3.90/21      2.13.8.11/32       …     UDP       A1

 Rule 2   5.168.3/24     152.133/16         …     TCP       A2

 …        …              …                  …     …         …

 Rule N   5.168/16       152/8              …     ANY       AN



Given a classifier with N rules, find the action associated with
the highest priority rule matching an incoming packet.
Example: packet (5.168.3.32, 152.133.171.71, …, TCP)
Flow-aware Router: Basic
Architectural Components


     [Figure: two planes of a flow-aware router.
      Control: routing, resource reservation, admission control, SLAs.
      Datapath (per-packet processing): routing lookup, packet classification,
      special processing, switching, scheduling.]


    Packet Classification: Problem
    Definition
Given a classifier C with N rules, $R_j$, $1 \le j \le N$, where $R_j$
   consists of three entities:
1) A regular expression $R_j[i]$, $1 \le i \le d$, on each of the d header
   fields,
2) A number, $\mathrm{pri}(R_j)$, indicating the priority of the rule in the
   classifier, and
3) An action, referred to as $\mathrm{action}(R_j)$.
For an incoming packet P with the header considered as a d-tuple
of points $(P_1, P_2, \ldots, P_d)$, the d-dimensional packet classification
problem is to find the rule $R_m$ with the highest priority among all the
rules $R_j$ matching the d-tuple; i.e., $\mathrm{pri}(R_m) > \mathrm{pri}(R_j)$, $\forall j \ne m$,
$1 \le j \le N$, such that $P_i$ matches $R_j[i]$, $1 \le i \le d$. We call rule $R_m$ the best
matching rule for packet P.

     Example 4D classifier
Rule   L3-DA                            L3-SA                             L4-DP         L4-PROT   Action
R1     152.163.190.69/255.255.255.255   152.163.80.11/255.255.255.255     *             *         Deny
R2     152.168.3/255.255.255            152.163.200.157/255.255.255.255   eq www        udp       Deny
R3     152.168.3/255.255.255            152.163.200.157/255.255.255.255   range 20-21   udp       Permit
R4     152.168.3/255.255.255            152.163.200.157/255.255.255.255   eq www        tcp       Deny
R5     *                                *                                 *             *         Deny

      Example Classification Results




Pkt Hdr   L3-DA            L3-SA              L4-DP       L4-PROT   Rule, Action
P1        152.163.190.69   152.163.80.11      www         tcp       R1, Deny
P2        152.168.3.21     152.163.200.157    www         udp       R2, Deny



  Classification is a Generalization
  of Lookup
 Classifier = routing table
 One-dimension (destination address)
 Rule = routing table entry
 Regular expression = prefix
 Action = (next-hop-address, port)
 Priority = prefix-length


 Longest   Prefix Matching for routing lookups is a
  special-case of one-dimensional packet
  classification
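
The correspondence above can be sketched as a one-field classifier whose priority is the prefix length. Addresses are bit strings for brevity; the `Rule` type and the next-hop values are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class Rule:
    prefix: str               # the "regular expression": a bit-string prefix
    action: Tuple[str, int]   # (next-hop address, port)

    @property
    def priority(self) -> int:
        return len(self.prefix)  # priority = prefix length

def classify(rules, dest_bits: str):
    """Return the highest-priority rule matching the destination, or None."""
    matching = [r for r in rules if dest_bits.startswith(r.prefix)]
    return max(matching, key=lambda r: r.priority, default=None)

# Hypothetical routing table with three entries.
ROUTES = [
    Rule("", ("10.0.0.1", 0)),      # default route
    Rule("1", ("10.0.0.2", 1)),
    Rule("1101", ("10.0.0.3", 2)),
]
```

Choosing the matching rule with the highest priority is then exactly longest prefix matching.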
Example
   Two-dimension space, i.e., classification based
    on two fields
   Complexity depends on the layout, i.e., how
    many distinct regions are created




     Classification algorithm
   Linear search
    – The simplest data structure is a linked list of rules stored in order of
      decreasing priority
   O(N) storage, O(N) lookup time, O(1) update complexity
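
A sketch of linear search over the example 4D classifier from the earlier slides. Two assumptions are made for illustration: the destination `152.168.3/255.255.255` is read as the network `152.168.3.0/24`, and `www` is port 80. A Python list stands in for the linked list, and the names are hypothetical.

```python
import ipaddress
from typing import NamedTuple, Optional, Tuple

class Rule(NamedTuple):
    name: str
    dst: Optional[str]                # CIDR network, None = wildcard (*)
    src: Optional[str]
    dport: Optional[Tuple[int, int]]  # inclusive port range, None = wildcard
    proto: Optional[str]
    action: str

# Rules stored in order of decreasing priority.
RULES = [
    Rule("R1", "152.163.190.69/32", "152.163.80.11/32", None, None, "Deny"),
    Rule("R2", "152.168.3.0/24", "152.163.200.157/32", (80, 80), "udp", "Deny"),
    Rule("R3", "152.168.3.0/24", "152.163.200.157/32", (20, 21), "udp", "Permit"),
    Rule("R4", "152.168.3.0/24", "152.163.200.157/32", (80, 80), "tcp", "Deny"),
    Rule("R5", None, None, None, None, "Deny"),
]

def rule_matches(rule, dst, src, dport, proto):
    return (
        (rule.dst is None or ipaddress.ip_address(dst) in ipaddress.ip_network(rule.dst))
        and (rule.src is None or ipaddress.ip_address(src) in ipaddress.ip_network(rule.src))
        and (rule.dport is None or rule.dport[0] <= dport <= rule.dport[1])
        and (rule.proto is None or rule.proto == proto)
    )

def classify(rules, dst, src, dport, proto):
    """O(N) scan: the first matching rule is the highest-priority match."""
    for rule in rules:
        if rule_matches(rule, dst, src, dport, proto):
            return rule.name, rule.action
    return None
```

Under these assumptions the two packets from the classification results slide come out as R1/Deny and R2/Deny.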




 Recursive Flow Classification
 [Gupta99]
Observations:
  Difficult to achieve both high classification
   rate and reasonable storage in the worst case
  Real classifiers exhibit structure and
   redundancy
  A practical scheme could exploit this structure
   and redundancy




    RFC: Classifier Dataset

   793 classifiers from 101 ISP and enterprise networks
    with a total of 41505 rules.
     – Classifier (policy database)
   40 classifiers: more than 100 rules. Biggest classifier
    had 1733 rules.
   Maximum of 4 fields per rule: source IP address,
    destination IP address, protocol and destination port
    number.




     RFC:
   Problem formulation:
    – Map S bits (i.e., the bits of all the F fields) to T bits (i.e., the class identifier)
   Main idea:
    – Create a table of size $2^S$ with pre-computed values; each entry contains the class
      identifier
         » Only one memory access needed
    – …but this is impractical => requires huge memory
    – Use recursion: trade speed (number of memory accesses) for memory footprint




       The RFC Algorithm
 At each stage the algorithm maps one set of values
  to a smaller set
   – A set of memories return a value shorter than the index of
     the memory access


 Split the F fields into chunks
   1. Use the value of each chunk to index into a table
       Indexing is done in parallel
   2. Combine the results from the previous phase, and repeat
   3. In the final phase we obtain a single value: the action

The RFC Algorithm

   [Figure: the packet header split into chunks #0 through #7. Fields shown: source L3
    address, destination L3 address, L4 protocol and flags, source L4 port,
    destination L4 port, type of service.]
Chunking of a Packet
   Transport-layer destination: chunk #6
     • (a) {www=80} (b) {20,21} (c) {>1023} (d) {all remaining numbers in the range
       0-65535} => can be encoded using two bits, 00 through 11
     • These two-bit values are the Equivalence Class IDs (eqIDs)
   Transport-layer protocol: chunk #4
     • (a) {tcp} (b) {udp} (c) {all remaining numbers in the range 0-255}: can
       be encoded using two-bit eqIDs
   Second phase
     • (a) {({80}, {udp})} (b) {({20-21}, {udp})} (c) {({80}, {tcp})}
       (d) {({gt 1023}, {tcp})} (e) {all remaining crossproducts} => can be
       represented using three-bit eqIDs
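
The two phases described above can be sketched as table lookups. The numeric eqID assigned to each class is an arbitrary choice made here for illustration, and the protocol numbers 6 (tcp) and 17 (udp) are the standard IANA values; table and function names are hypothetical.

```python
def port_eqid(port: int) -> int:
    if port == 80:
        return 0            # (a) {www=80}
    if port in (20, 21):
        return 1            # (b) {20,21}
    if port > 1023:
        return 2            # (c) {>1023}
    return 3                # (d) all remaining port numbers

def proto_eqid(proto: int) -> int:
    if proto == 6:
        return 0            # (a) {tcp}
    if proto == 17:
        return 1            # (b) {udp}
    return 2                # (c) all remaining protocol numbers

# Phase 0: one table per chunk, indexed directly by the chunk value.
PORT_TABLE = [port_eqid(p) for p in range(65536)]
PROTO_TABLE = [proto_eqid(p) for p in range(256)]
N_PROTO_CLASSES = 3

# Phase 1: crossproduct classes keyed by (port eqID, protocol eqID).
SECOND_PHASE = {
    (0, 1): 0,  # (a) ({80}, {udp})
    (1, 1): 1,  # (b) ({20-21}, {udp})
    (0, 0): 2,  # (c) ({80}, {tcp})
    (2, 0): 3,  # (d) ({gt 1023}, {tcp})
}
REMAINING = 4   # (e) all remaining crossproducts
PHASE1_TABLE = [
    SECOND_PHASE.get((pe, qe), REMAINING)
    for pe in range(4)
    for qe in range(N_PROTO_CLASSES)
]

def classify(port: int, proto: int) -> int:
    """Two memory accesses in phase 0, one in phase 1."""
    c_port = PORT_TABLE[port]
    c_proto = PROTO_TABLE[proto]
    index = c_port * N_PROTO_CLASSES + c_proto  # row-major index into the phase-1 table
    return PHASE1_TABLE[index]
```

The phase-1 table has only 4 x 3 = 12 entries, compared with 2^24 entries for a single table indexed by the raw 16-bit port and 8-bit protocol together.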


Complete Example
   [Figure: a complete RFC example. Index computations annotated on the figure:
    indx = c10*5 + c11 and indx = c02*6 + c03*3 + c05.]




    Choice of Reduction Tree
    [Figure: two alternative reduction trees combining chunks 0 to 5.]
    Left tree: number of phases = P = 3, 10 memory accesses
    Right tree: number of phases = P = 4, 11 memory accesses
RFC: Classification Time
 Pipelined hardware: 30 Mpps (worst case
  OC-192) using two 4 Mb SRAMs and two 64 Mb
  SDRAMs at 125 MHz.
 Software: (3 phases) 1 Mpps in the worst case
  and 1.4-1.7 Mpps in the average case (average
  case OC-48) [performance measured using the Intel VTune simulator on
    a Windows NT platform]




RFC: Pros and Cons


Advantages
 Exploits structure of real-life classifiers
 Suitable for multiple fields
 Supports non-contiguous masks
 Fast accesses

Disadvantages
 Depends on structure of classifiers
 Large pre-processing time
 Incremental updates slow
 Large worst-case storage requirements




Summary of classification schemes




   Lookup/Classification Chip Vendors
    –   Switch-on
    –   Fastchip
    –   Agere
    –   Solidum
    –   Siliconaccess
    –   TCAM vendors: Netlogic, Lara, Sibercore, Mosaid,
        Klsi etc.

   Packet classification still an area of active
    research


				