ATM NETWORKS by yurtgc548


                 10. IP QoS Service
The current Internet: IP Protocol
 Best-Effort Service – no Quality of
  Service (QoS) guarantees are provided.

   Connectionless Service – no connection is
    established prior to sending the packets.
    Each packet carries the full destination
    address. Routing is performed using a
    shortest path algorithm, independently
    for each packet.

         Why do we need a New Protocol?

   The emerging multimedia applications require QoS

   Real-time applications require connection-oriented
    service

   Other routing algorithms may be more appropriate than
    the shortest path algorithm in order to increase
    network efficiency and provide QoS.

      IETF* Proposed Solutions

   Integrated Services (IntServ)
   Resource Reservation Protocol (RSVP)
    – Disadvantage: Scalability (per-flow reservations)
   Differentiated Services (DiffServ)
    – Disadvantage: No per-flow QoS guarantee
   Multiprotocol Label Switching (MPLS)
*IETF – Internet Engineering Task Force

           REQUIREMENTS for IP QoS

   A network is characterized as having EDGE and
    CORE routers.
   Edge routers accept customer traffic, i.e.,
    packets from any source outside the network into
    the network.

   Core routers provide transit packet forwarding
    service between other Core routers and/or Edge
    routers.

             REQUIREMENTS for IP QoS

   Edge routers characterize, police, and mark customer
    traffic being admitted to the network.

   Edge routers may decline requests signaled by outside
    sources (Admission Control).

   Core routers differentiate traffic insofar as necessary to
    cope with transient congestion within the network itself.

   Statistical multiplexing must be utilized wherever
    appropriate to maximize utilization of core resources.

Network Architecture

            Integrated Services (IntServ)

GOAL: Augment existing Best Effort Internet
        with a range of end-to-end services for
        real-time streaming in interactive applications.

IntServ developed an architecture requiring per-flow
traffic handling at every hop along an application’s end-
to-end path and explicit a priori signaling using RSVP
(Resource Reservation Protocol) of each flow’s

         Integrated Services (IntServ)

   IntServ model requires resources such as
    bandwidth and buffers to be explicitly reserved
    for a given data flow to ensure that the
    application receives its requested QoS.

   A flow is composed of a stream of packets with
    the same source and destination addresses and
    port numbers.

   A flow descriptor is used to describe the traffic
    and QoS requirements of a flow.

   Per-flow QoS guarantees are provided at the
    expense of installing and maintaining flow-specific
    state in each router along the flow’s path.

   Basic components of the IntServ architecture:
    Setup Protocol, Traffic Control (filterspec),
    flowspec and Traffic Classes.

            Architecture Basic Components

   Setup Protocol – enables a host or an application to
    request a specific amount of resources from the
    network; realized by the
    Resource Reservation Protocol (RSVP).

   Traffic Control (filterspec) – includes packet
    classifier, packet scheduler, and admission control.
   flowspec – objects such as token bucket parameters.
   Traffic Classes – best-effort, controlled load, and
    guaranteed services.

              Setup Protocol: RSVP
   Every application is presumed to use some form of
    signaling to negotiate service with an IntServ-capable
    network.

   IntServ signaling has 2 functions:
        Negotiation: When the network decides whether it
          can support the application’s requested service
         (Admission Control)
       Configuration: When the network configures the
        routers along the path to support the negotiated
       flow characteristics.

The applications use
RSVP: Resource Reservation Protocol.
         Goals for the Design of RSVP

   Must support both unicast and multicast
    traffic flows (i.e., RSVP sessions).

   Must allow parties of a multicast session to
    request different levels of QoS.

   Must be deployable on top of existing IP
    networks.

              Basics of RSVP

   Performs resource reservations for unicast and multicast flows
   Requests resources in one direction from a sender to a receiver
    (simplex resource reservation)
   Requires the receiver to initiate and maintain the resource reservation
   Maintains soft state at each intermediate router: A resource
    reservation at a router is maintained for a limited time only, and
    so the sender must periodically refresh its reservation.
   Does not require each router to be RSVP capable.
   Non-RSVP capable routers use Best Effort delivery technique.
   Provides different reservation styles so that requests may be
    merged in several ways according to the applications.
   Supports both IPv4 and IPv6.

            RSVP: Receiver Initiated Reservation

   Similar to “Leaf Join Case” in ATM Multicasting.
   Motivation: RSVP is primarily designed to support multiparty
    conferencing with heterogeneous receivers.
   In this environment the receiver actually knows how much
    bandwidth it needs.
   If the sender were to make the reservation request, then the
    sender must obtain the bandwidth requirement from each receiver.
   This may cause an implosion problem for large multicast groups.

Problem: Receiver does not directly know the path taken by the data packets.

Solution:    Use Path messages.

The application source transmits a “Path” message
along the routed path to the unicast or multicast
destination.
 – The Path message has two purposes:
      * to mark the routed path in each router (store
        the “path state”) between sender/receiver and
     * to collect information about the QoS viability
       of each router along that path.
 – Upon receiving the Path message, the destination
   host(s) can determine what services the network
   can support (e.g., guaranteed service or controlled
    load) and then generate an RSVP reservation (Resv)
    message.

   Resv messages are sent back towards the sender along
    the reverse path.

   The Resv message carries reservation requests to the
    routers along the path.

   The Resv message contains traffic and QoS objects that
    are processed by the traffic control component of each
     router as it follows the reverse path upstream toward the
     sender.

   If the router has sufficient capacity, then resources
    along the path back towards the receiver are reserved for
    that flow. If resources are not available, RSVP error
    messages are generated and returned to the receiver.
            SOFT STATE in RSVP

   RSVP Path and Resv messages are periodically
    sent by senders and receivers, respectively, to
    refresh the reservations performed.
   When a state is not refreshed within a certain
    time out, the state is deleted.
   The type of state that is maintained by a timer
    is called “Soft State” as opposed to hard state
    where the establishment and teardown of a
    state are explicitly controlled by signaling

   Wildcard Filter Reservation
    A single reservation shared by all senders: a kind of shared
    pipe whose resource is the largest of the resource
    requests from all receivers, independent of the number of
    senders (e.g., audioconferencing).

   Fixed Filter Reservation
    A distinct reservation is created for each sender. S_i is
    the selected sender and Q_i is the resource request for
    sender i. The total reservation on a link for a given
    session is the sum of all Q_i’s.

   Shared Explicit Reservation
    A single reservation shared by a set of explicit senders
    where S_i is the selected sender and Q is the flowspec.
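How a router might compute the total reservation on a link under each style can be sketched as follows. This is a simplified illustration: a single bandwidth number stands in for a full flowspec, and the function names are invented here:

```python
# `requests` maps a receiver (or sender) to its requested bandwidth Q.

def merge_wildcard(requests):
    """Wildcard Filter: one reservation shared by all senders,
    sized as the largest request from any receiver."""
    return max(requests.values())

def merge_fixed(per_sender_requests):
    """Fixed Filter: a distinct reservation Q_i per selected sender;
    the total on the link is the sum of all Q_i."""
    return sum(per_sender_requests.values())

def merge_shared_explicit(requests):
    """Shared Explicit: a single shared reservation, sized like
    Wildcard Filter but usable only by an explicit sender set."""
    return max(requests.values())
```

For example, two receivers asking for 64 and 128 kb/s yield a 128 kb/s shared pipe under Wildcard Filter, but 192 kb/s of total reservation under Fixed Filter.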
               flowspec and filterspec

   flowspec is used to set parameters in the
    router’s packet scheduler.
   flowspec (Flow Specification) consists of a traffic
    specification (Tspec, T for traffic) and a
    service request specification (Rspec, R for request).
   Tspec describes the sender’s traffic
    characteristics, i.e., it specifies the traffic
    behavior of the flow in terms of a token bucket.

          Flow Specification (flowspec)

   Rspec reserves a service class which defines
    the requested QoS,
    i.e., it specifies the requested QoS in terms
    of bandwidth, packet delay or packet loss.

   flowspec is carried by RSVP messages into
    the network and defines the application’s QoS
    requirements as a series of objects, such as
    token bucket parameters.

         Traffic Control Components (filterspec)

filterspec (Filter Specification) provides the information
required by the packet classifier to identify the packets
                 that belong to the flow.

    Classifier - examines the source and destination
     addresses, and port number fields in each packet to
     determine what class the packet belongs to.
    Scheduler - determines which packet will be served next
    Admission Control - determines whether a new flow
     can be granted the requested QoS without affecting
     other flows existing in the network.

           Traffic Classes Components

   Best-Effort - same as in the traditional
    IP networks.

   Controlled Load - approximates best-effort
    service over an uncongested network.

   Guaranteed Service - supports real-time
    traffic flows that require a delay bound.

            Controlled Load Service

   Under CL service, the packets of a given flow will
    experience loss and delays comparable to a network
    with a light traffic load, assuming the flow complies
    with the traffic contract.
   No guarantees are provided but both loss probability
    and delay are expected to be very low.
   The application provides the network with an estimate
    of the traffic it will generate.
   This estimate is done by specifying the data flow’s
    desired traffic parameters (Tspec) to the network

              Controlled Load Service

Tspec (Traffic Specification) Model:

   It is a refinement of the Token Bucket model.
   A source characterizes itself with the following
    SENDER-Tspec (traffic characteristics) parameters:
            * Token bucket rate r (bytes/sec) and size
              b (bytes)
            * Peak data rate p
            * Minimum policed unit m
            * Maximum packet size M
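A token-bucket conformance test over the rate r and bucket size b can be sketched as follows. This is a simplified illustration: the peak rate p and the policed-unit parameters m and M are omitted, and the class name is an assumption:

```python
class TokenBucket:
    def __init__(self, r, b):
        self.r = r          # token fill rate, bytes/sec
        self.b = b          # bucket depth, bytes
        self.tokens = b     # bucket starts full
        self.last = 0.0     # time of the previous arrival

    def conforms(self, pkt_len, t):
        """Return True if a pkt_len-byte packet arriving at time t conforms."""
        # accumulate tokens since the last arrival, capped at the depth b
        self.tokens = min(self.b, self.tokens + (t - self.last) * self.r)
        self.last = t
        if pkt_len <= self.tokens:
            self.tokens -= pkt_len
            return True
        return False
```

A burst of up to b bytes conforms immediately; sustained traffic conforms only at rate r.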

           Controlled Load Service
   Admission Control is performed in order to deliver the expected
    service.
   Traffic flows are policed.
   Non-conformant packets are either dropped or delivered when
    possible using the best-effort service.
   Packets larger than the agreed maximum packet size will also be
    considered as non-conformant.
   Adaptive real-time applications are supposed to use the controlled
    load service.
   These applications perform well when the network is not heavily
    loaded, but suffer rapid degradation in performance as the
    network load increases.

             Guaranteed Service

   GS guarantees the packets will arrive within a
    certain delivery time, and that they will not be
    discarded due to queue overflow, provided that
     the flow’s traffic complies with the traffic
     contract.
   GS also uses the Tspec model.
   The service is requested by a sender
    specifying Tspec and the receiver subsequently
    requesting a desired service level (Rspec).

               Guaranteed Service
Rspec (Reservation Specification) Model:
 Works together with the Tspec model to guarantee a desired
  service level.
 The desired service level is described using the following
  parameters (R data rate and S slack term)
  in addition to r,b,p,m and M used for CL service:
   – Data rate R is measured in the same units as r and must
      be equal to or more than r (token rate). R reflects the
      theoretical service rate that, at each router, will result in
      a desirable delay bound.
    – Slack term S is measured in microseconds and reflects how far
      each router is allowed to deviate from the ideal delay
      bound, i.e., the difference between the desired delay and
     the delay obtained by using a reservation level R.

   REMARK: Larger values for R and smaller values for S represent
          stricter delay bounds.
             Guaranteed Service

   Making use of TSpec and RSpec, a certain
    amount of bandwidth and buffer space is
    allocated at each node for each flow.

   Resources are allocated using worst-case analysis.

   Upper bounds for the end-to-end delay and
    the packet loss probability can be evaluated


Sources emit regular PATH messages downstream
toward the receiver(s) to set up reservations
   Two message objects relevant to IntServ are
    carried in PATH messages: SENDER_Tspec
    (describing the traffic) and ADspec (modified at
    each hop to reflect the network characteristics
    between source and receiver).

   ADspec informs the receiver which service classes
    (CL, GS or both) are appropriate for the traffic.

   Along the way, IntServ-capable routers may modify
    the ADspec to reflect restrictions or
    modifications required by the network.

Receiver(s) respond with Resv messages upstream
toward the sender

   Receiver uses the SENDER_Tspec and (possibly
    modified) ADspec to determine which parameters to
    send back upstream in a flowspec element.

   flowspec selects either CL or GS and carries
    parameters required by the routers along the upstream
    path to determine whether the request can be honored
    or not.

   One message object relevant to IntServ is carried in
    Resv messages: flowspec (describing the receiver’s
     desired QoS service to be applied to the sources’
     traffic).
                IntServ Drawbacks

   Scalability – per-flow resource reservation.
   Flexibility – IntServ provides a small number
    of pre-specified traffic classes: Guaranteed
    and Controlled Load Services.
   Efficiency – The Guaranteed Service of the
    IntServ model is based on the worst case
    analysis and thus, is very conservative.
    Moreover, bandwidth and delay requirements
    are coupled, causing network inefficiency.

        Resource Reservation Protocol Drawbacks

   Complicated RSVP signaling (unidirectional,
    frequent refresh messages).

   The current version of RSVP lacks both
    adequate security mechanisms to prevent
    unauthorized parties from instigating theft-
    of-service attacks, and policy control.

   Looking for a New Solution…

Because of the difficulty in
implementing and deploying IntServ and
RSVP, the IETF proposed the
Differentiated Services (DiffServ)
architecture.

              Differentiated Services (DiffServ)

   Solves scalability and flexibility problems
   Pushes as much complexity as possible to the
    edge nodes, which process lower volumes of
    traffic and fewer flows.
   Offers service per aggregate traffic, rather than
    per flow.
   Reservations are made for a set of related flows.
   It does not require new applications or extensive
    router upgrades.
   It does not define specific services or service
    classes, as IntServ does.
           Differentiated Services

The objective of DiffServ is
to propose a small, well-defined set
of building blocks from which
a variety of services may
be constructed.

Complexity is moved from
the core of the network to
the edge of the network.

Packet forwarding in the
core network is simple and
per-aggregate rather than
per-flow.

               Differentiated Services

   A DiffServ Domain is a set of contiguous DS nodes defining the same per
    hop behaviors (PHBs) and under the same policy strategy.
   A DS domain consists of DS interior, edge, and boundary nodes.
   A boundary node interconnects the DS domain to other DS or
    non-DS-compliant nodes.
   Edge and interior nodes only connect to other interior, edge, or boundary
    nodes within the same DS domain.

                    Differentiated Services

The DS byte (carrying the DSCP, DiffServ Code Point) is used
to specify the forwarding treatment (or per-hop behavior) to
be used for a packet.

The DS byte coincides with the TOS octet in IPv4 and the
Traffic Class octet in IPv6.

                 Edge and Core Nodes

   Edge nodes handle a relatively small number of traffic
    flows.
   Therefore, they can execute per-flow traffic
    management.
   Edge nodes are responsible for policing and shaping.
   They are also responsible for admission control, if any.
   Core nodes handle a large amount of traffic flows.
   They perform per-aggregate rather than per-flow
    traffic management.

            Basic Approach

• Traffic is divided into a small number of
  groups called forwarding classes

• Forwarding class that a packet belongs to
  is encoded into a field in the IP packet

• Each forwarding class represents a
  predefined forwarding treatment in terms
  of drop priority and bandwidth allocation.

            Basic Approach (cont.)

Achieves scalability by implementing traffic classification
and conditioning functions at network boundary nodes

Classification involves mapping packets to different
forwarding classes.

Conditioning: checking whether traffic flows meet the
service agreement and dropping/remarking non-conformant
packets.

Interior nodes forward packets based solely on the
forwarding class.

           Basic Approach (cont.)

Resource allocation for aggregated traffic rather than
individual flows

   Performance assurance to individual flows in a
   forwarding class provided through prioritization and
   provisioning rather than per-flow reservation

Traffic policing on the edge and class-based forwarding
in the core

Defines forwarding behaviors, not services

            Basic Approach (cont.)

Guarantee by provisioning rather than reservation

   Allocate resources to forwarding class and control the
   amount of traffic for these classes

   Provides only service assurance; no BW or delay
   guarantees
Based on SLAs, not dynamic signaling

Focus on a single domain, not end-to-end

   Forwarding classes can be defined for a single domain
   and between domains service providers can extend or
   map their definitions through bilateral agreement
         Services and Forwarding Treatment

Two important concepts in DiffServ architecture

Forwarding treatment refers to the externally observable
behavior of a specific algorithm or mechanism that is
implemented in a node e.g. Express forwarding (using
priority queue)

Service is defined by the overall performance that a
customer’s traffic receives e.g. a no-loss service
provided by Express Forwarding

           Per Hop Behavior (PHB)

Forwarding treatments at a node

Each PHB is represented by a 6-bit value called DSCP

All packets with the same code points are referred to as
a behavior aggregate (BA) and they receive the same
forwarding treatment.

                     PHB (cont.)

Describes forwarding behavior in either relative or absolute
terms:

      * Minimal BW for BA: absolute term
      * Allocate BW proportionally: relative

Typically implemented by means of buffer management and
packet scheduling.

                 Per-Hop Behavior

   The PHB defines the service a packet receives at each
    hop as it is forwarded through the network.
   It is realized through internal queue management and
    scheduling techniques.
   6 bits of the DS byte can be used to specify the PHB.
   Therefore, 2^6 = 64 PHBs can be defined.
   The IETF intends to standardize only a few of them.
   Packets marked with different DS byte values should
    receive different PHB and, accordingly, should
    experience different services in the core network.
   Services can be differentiated using appropriate mechanisms:
     – Scheduling
     – Queue Management
                  Services (cont.)

SLAs may be static or dynamic

Services can be defined in either quantitative or qualitative
terms

Services may have different scopes:
      * All traffic from ingress node A and any egress node
      * All traffic between ingress node A and egress node B

           IETF Per-Hop Behaviors

   The IETF DiffServ Working Group is
    finishing work on two PHBs:
     – Expedited Forwarding (EF)
     – Assured Forwarding (AF)

               Expedited Forwarding PHB

   The EF PHB was designed to support low loss, low delay, and low
    jitter connections.
   It appears as a point-to-point virtual leased line (VLL) service
    between endpoints with a peak bandwidth.
   To minimize jitter and delay, packets must spend little or no
    time in router queues.
   Therefore, the EF PHB requires that the traffic be conditioned
    to conform to the peak rate at the boundary, and the network
    of routers be provisioned such that this peak rate is less than
     the minimum packet departure rate at each router in the
     network.
   The EF PHB uses a single DSCP value to indicate that the packet
    should be placed in a high-priority queue on the outbound link of
    each router hop.

            Assured Forwarding PHB

   The AF PHB defines four relative classes of service
     with each service supporting three levels of drop
     precedence.
   Twelve distinct DSCP bit combinations define the AF
    classes and the drop precedence within each class.
   When congestion is encountered at a router, packets
    with a higher drop precedence will be discarded ahead
    of those with a lower drop precedence.
   The four AF classes define no specific bandwidth or
     delay constraints other than that AF class 1 is distinct
    from AF class 2, and so on.


             Services

Describes the overall treatment of a customer’s traffic
within a DS domain or end-to-end.

This is what is visible to the customers; PHBs are hidden
inside the network node.

Realizing a service involves many components working
together:
      * Mapping of traffic to specific PHBs,
      * Traffic conditioning at the boundary,
      * Network provisioning,
      * PHB-based forwarding in the core

            Services (cont.)

In Diffserv, services are defined in the form of a Service
Level Agreement (SLA) between a customer and its service
provider.

One important element of SLA in Diffserv is the Traffic
Conditioning Agreement (TCA).

TCA details the service parameters for traffic profiles and
policing actions.

             Services (cont.)

This may include:
            Traffic profiles, such as token bucket
            parameters for each of the classes

   Performance metrics: throughput, delay

   Actions for non-conformant packets
In addition to TCA, an SLA may also contain other
characteristics and business-related agreements such as
availability, security, monitoring, auditing, billing.

          Packet Classifier and Traffic Conditioner

                 Traffic Conditioning Components

 Packets -> Classifier -> Marker -> Shaper & Dropper
 (a Meter measures the classified stream and drives the Marker)

– Meter: A meter measures the temporal properties of the stream of
  packets selected by the classifier against a traffic profile.
– Marker: A packet is marked by setting its DS field to a particular
  codepoint. The packet now belongs to a certain behavior aggregate.
– Shaper: A shaper holds (delays) some or all of the packets in a traffic
  stream to bring the stream into compliance with the traffic profile.
– Dropper: A dropper discards some or all the packets in a traffic
  stream to bring the stream into compliance with the traffic profile.

              Classifier

Divides an incoming packet stream into multiple groups
based on predefined rules

Two basic types of classifiers:
         * Behavior Aggregate (BA)
         * Multifield (MF)

BA classifier selects packets based solely on the DSCP
(DiffServ Code Point) value in the packet header

BA classifier is used when DSCP has been set (marked)
before the packet reaches the classifier

             Classifier (Cont.)

MF classifier uses a combination of one or more fields of
the five-tuple

(src addr, src port, dest addr, dest port, proto ID)

in the packet header for classification

Classification policies may specify a set of rules and
corresponding DSCP values for marking the matched
packets.

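An MF classifier of this kind can be sketched as a first-match rule table over the five-tuple. The rule format, the wildcard convention, and the example DSCP values below are hypothetical, chosen only for illustration:

```python
WILDCARD = None  # a None field matches anything

RULES = [
    # (src_addr, src_port, dst_addr, dst_port, proto) -> DSCP
    (("10.0.0.1", WILDCARD, WILDCARD, 80, "tcp"), 0b101110),     # e.g. EF
    ((WILDCARD, WILDCARD, WILDCARD, WILDCARD, "udp"), 0b001010), # e.g. AF11
]

def classify(pkt):
    """Return the DSCP of the first matching rule, or 0 (best effort)."""
    five_tuple = (pkt["src"], pkt["sport"], pkt["dst"], pkt["dport"], pkt["proto"])
    for rule, dscp in RULES:
        if all(r is WILDCARD or r == f for r, f in zip(rule, five_tuple)):
            return dscp
    return 0
```

Packets that match no rule fall through to DSCP 0, i.e. default best-effort forwarding.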
         Traffic Conditioner

Performs traffic policing function to enforce the TCA
(Traffic Conditioning Agreement) between customer
and service providers

Four basic elements:
   • Meter
   • Marker
   • Shaper
   • Dropper


             Meter

For each forwarding class, the meter measures the traffic flow
from a customer against its traffic profile

In-profile packets are allowed to enter the network

Out-profile packets are further conditioned based on TCA


             Marker

Sets the DS field of a packet to a particular DSCP,
adding the marked packet to a forwarding class.

May act on unmarked packets or remark previously
marked packets.

Can occur at different locations:
         * Can be marked by the application
         * Marked by the first-hop routers

                 Marker (cont.)

Marking is done on non-conforming packets:

     * Packets may be marked with a special DSCP to
       indicate non-conformance

     * These packets would be dropped first in the event
       of network congestion

Since packets travel through different domains, packets
that have been marked may be remarked (to a different
DSCP).

            Marker (cont.)

When a packet remarked with a new DSCP receives a
worse forwarding treatment than before:
 PHB demotion

When it receives a better forwarding treatment:
 PHB promotion


             Shaper

Shapers delay non-conformant packets in order to bring
the stream into compliance.

A stronger form of policing than marking

Shaping may also be needed at a boundary node to a
different domain (to make sure that the traffic is
conformant before entering the next domain)

Usually has finite buffer, so may also drop packets when
buffer is full


             Dropper

Discards packets in a traffic stream in order to bring
the stream into compliance with a traffic profile.

Strongest policing entity

Can be implemented as a special case of a shaper by
setting the shaper buffer size to zero.

           Differentiated Services Field

Uses 6 bits in the IP header to encode forwarding treatment

These 6 bits are taken from the IP TOS field (8 bits long)

DiffServ redefines existing IP TOS field to indicate
forwarding behavior

Replacement field, called DS field supersedes existing
definition of TOS

First 6 bits used as DSCP to encode the PHB, remaining 2
bits are currently unused (CU).
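The layout above translates directly into bit operations. A minimal sketch, where a packet is represented simply by its raw DS octet:

```python
def get_dscp(ds_byte):
    """Extract the 6-bit DSCP from the 8-bit DS (former TOS) field."""
    return (ds_byte >> 2) & 0x3F

def set_dscp(ds_byte, dscp):
    """Write a 6-bit DSCP into the DS field, preserving the 2 CU bits."""
    return ((dscp & 0x3F) << 2) | (ds_byte & 0x03)
```

For instance, the EF codepoint 101110 occupies the upper six bits, so an EF packet carries DS byte 10111000.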

           Differentiated Services Field (cont.)

xxxxx0 – standards action
xxxx11 – experimental and local use
xxxx01 – experimental and local use, but may be subject
         to standards action (in case pool 1 is exhausted)
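The three codepoint pools above are distinguished purely by the low-order bits of the DSCP, which a short helper can illustrate:

```python
def dscp_pool(dscp):
    """Return the allocation pool (1-3) of a 6-bit DSCP from its low bits."""
    if dscp & 0b1 == 0:
        return 1   # xxxxx0: standards action
    if dscp & 0b11 == 0b11:
        return 2   # xxxx11: experimental / local use
    return 3       # xxxx01: experimental / local use, may be reclaimed
```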

          Assured Forwarding (AF)

The basic idea came from the RIO scheme

In the RIO scheme, packets are marked as In or Out

During congestion, out packets are dropped first:
in/out bit indicates drop priorities

AF standard extended the basic in or out marking in RIO
into four forwarding classes and within each forwarding
class, three drop precedences
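The twelve resulting AF codepoints (RFC 2597) follow a regular layout: the top three DSCP bits carry the class (1-4) and the next two bits the drop precedence (1-3). A small helper can generate any of them:

```python
def af_dscp(af_class, drop_prec):
    """Return the DSCP for AF class 1-4 with drop precedence 1-3."""
    assert 1 <= af_class <= 4 and 1 <= drop_prec <= 3
    return (af_class << 3) | (drop_prec << 1)
```

Numerically this is DSCP = 8·class + 2·precedence, e.g. AF11 = 10 and AF43 = 38.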

       Assured Forwarding (AF) (cont.)

Customers can subscribe to the service built with AF
forwarding class and their packets will be marked with
appropriate AF DSCPs.

Drop priorities within each forwarding class are used to
select which packets to drop during congestion

When backlogged packets from an AF forwarding class
exceed a specified threshold, packets with the highest drop
priority are dropped first, then packets with lower drop priority.

        AF Implementation

Can be implemented as BW partition between
classes and drop priorities within a class

BW partition is specified in terms of minimum BW

Can be achieved by WFQ scheduling and assigning
weights according to min BW requirement
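Deriving WFQ weights from per-class minimum-bandwidth guarantees can be sketched as below. The class names and bandwidth figures are illustrative assumptions only:

```python
# Hypothetical minimum-BW guarantees (bits/s) for the four AF classes.
MIN_BW = {"AF1": 4_000_000, "AF2": 3_000_000, "AF3": 2_000_000, "AF4": 1_000_000}

def wfq_weights(min_bw):
    """Normalize per-class minimum-BW guarantees into WFQ weights."""
    total = sum(min_bw.values())
    return {cls: bw / total for cls, bw in min_bw.items()}
```

Each class is then served in proportion to its weight, so its minimum bandwidth is met whenever the link itself is sufficiently provisioned.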

                AF Implementation (cont.)

AF standard specifies certain properties

   Attempt to minimize short-term fluctuation in congestion:
      Some smoothing function should be applied.

   Dropping mechanism should be insensitive to the short term traffic
   characteristics and discard packets from flows of the same long
   term characteristics with equal probability:
      Use random function for dropping

   Discard rate of a flow within a drop priority should be proportional
   to the flow’s percentage of the total amount of traffic passing
   through that drop priority level
       Can use RED or RIO for dropping

                    Buffer Management

When a router runs out of buffer
space, packets must be dropped.

In DiffServ, dropping decisions
take the DS byte value into
account.

For example, if Weighted Random
Early Detection (WRED) is used, each
drop precedence is assigned its own
set of RED thresholds.

           Random Early Detection (RED)

(Figure: average queue size scale with MIN-thr and MAX-thr thresholds.)
           RED Algorithm (Cont.)

for each packet arrival
            calculate the average queue size “avg”
            if min-thr <= avg < max-thr
                   calculate probability pa
                   with probability pa mark the
                   arriving packet
            else if max-thr <= avg
                   mark the arriving packet

           RED Algorithm (Cont.)

pb = maxp (avg – min_thr) / (max_thr – min_thr)

pa = pb / (1 – count * pb)

count is the number of packets unmarked since the last
      packet marking.

pa ensures that the edge (ingress) router
does not wait too long before marking a packet.
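The marking rule and the pb/pa formulas above can be transcribed almost directly into code. The average queue size avg is assumed to be maintained elsewhere (typically as an exponentially weighted moving average); the class below is an illustrative sketch with an injectable random source so it can be tested deterministically:

```python
import random

class RedMarker:
    def __init__(self, min_thr, max_thr, max_p, rand=random.random):
        self.min_thr, self.max_thr, self.max_p = min_thr, max_thr, max_p
        self.count = 0       # packets unmarked since the last marking
        self.rand = rand     # injectable RNG for testing

    def arrive(self, avg):
        """Return True if the arriving packet should be marked."""
        if avg >= self.max_thr:
            self.count = 0
            return True                      # max-thr <= avg: always mark
        if avg >= self.min_thr:
            pb = self.max_p * (avg - self.min_thr) / (self.max_thr - self.min_thr)
            # pa grows with count; clamp to 1 when the denominator vanishes
            pa = 1.0 if self.count * pb >= 1 else pb / (1 - self.count * pb)
            if self.rand() < pa:
                self.count = 0
                return True
            self.count += 1
            return False
        self.count = 0                       # below min-thr: never mark
        return False
```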

       RED Algorithm (Cont.)

Avoids global synchronization problem by
virtue of its randomness

No bias against bursty traffic

             RED-In/Out (RIO)

Uses same mechanism as RED, but is configured with
two sets of parameters,
(in-profile packets and out-profile packets)

Out-packets are dropped more aggressively than in-packets

                RED-In/Out (RIO)

P_out = Pmax_out (avg_total – min_out) / (max_out – min_out)

P_in = Pmax_in (avg_in – min_in) / (max_in – min_in)

where avg_total is the average queue size over In and Out
packets together, and avg_in is averaged over In packets only.

If avg_total < min_out, no packet is dropped;
if avg_total > max_out, all “Out” packets are dropped.

If avg_in < min_in, no “In” packet is dropped;
if avg_in > max_in, all “In” packets are dropped.
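A minimal RIO sketch follows the two sets of thresholds described above: "In" packets are judged against avg_in, "Out" packets against avg_total. The averages themselves are inputs here, and the threshold values used in the example are hypothetical:

```python
def red_drop_prob(avg, min_thr, max_thr, max_p):
    """RED-style drop probability for one (min, max, Pmax) parameter set."""
    if avg < min_thr:
        return 0.0
    if avg >= max_thr:
        return 1.0
    return max_p * (avg - min_thr) / (max_thr - min_thr)

def rio_drop_prob(packet_is_in, avg_in, avg_total, params_in, params_out):
    """Drop probability for one packet under RIO."""
    if packet_is_in:
        return red_drop_prob(avg_in, *params_in)
    return red_drop_prob(avg_total, *params_out)
```

Choosing min_out < min_in and Pmax_out > Pmax_in, as discussed below, makes "Out" packets suffer first and hardest.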

                        RIO (Cont.)

(Figure: P_in(drop) ramps from 0 to 1.0 between MIN-in and MAX-in as a
function of avg_in; P_out(drop) ramps from 0 to 1.0 between MIN-out and
MAX-out as a function of avg_total.)

                RIO (cont.)

Discrimination against “out” packets is created by carefully
choosing the parameters

   (min_in, max_in, Pmax_in) and (min_out, max_out, Pmax_out)

Drops “out packets” earlier than “in packets”:
done by choosing
      min_out < min_in

Drops “out packets” with a higher probability:
     Pmax_out > Pmax_in (Congestion Avoidance Phase)

                      RIO (cont.)
Goes into congestion control phase for “out packets” much
earlier than for “in packets” by choosing
       max_out << max_in.

So, RIO drops “out packets” first when it detects some
congestion and drops all “out packets” if congestion persists

Only as a last resort may it drop “in packets” to control
congestion.

   If a router is consistently dropping in packets then the
   router may be under-provisioned

         Expedited Forwarding (EF)

Proposed to characterize a forwarding treatment similar
to that of simple priority queueing.

Forwarding treatment of traffic aggregate must equal or
exceed a configurable rate

Should receive this rate independent of load of other
traffic passing through the node

Provides low delay and low loss service

Code point <101110> used for EF PHB

               EF Implementation

Several queueing mechanisms can be used to implement EF PHB

   Priority queueing with token bucket

      1. Priority of EF traffic should be the highest in the
         node.

      2. Token bucket is used to limit the total amount of
         EF traffic so that other traffic will not starve

   WFQ can be used such that the weight assigned to EF
   traffic gives it priority over other traffic
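The first option, a strict-priority queue guarded by a token bucket, can be sketched as follows. The class name, rates, and byte-based token accounting are illustrative assumptions; EF packets that exceed the token budget are demoted to the best-effort queue here rather than dropped:

```python
from collections import deque

class EfScheduler:
    def __init__(self, ef_rate, ef_burst):
        self.ef_q, self.be_q = deque(), deque()
        self.rate, self.burst = ef_rate, ef_burst   # bytes/sec, bytes
        self.tokens, self.last = ef_burst, 0.0

    def enqueue(self, pkt_len, is_ef, t):
        # refill the token bucket that caps total EF throughput
        self.tokens = min(self.burst, self.tokens + (t - self.last) * self.rate)
        self.last = t
        if is_ef and pkt_len <= self.tokens:
            self.tokens -= pkt_len
            self.ef_q.append(pkt_len)   # admitted to the high-priority queue
        else:
            self.be_q.append(pkt_len)   # non-EF, or EF over its token budget

    def dequeue(self):
        """Strict priority: serve the EF queue whenever it is non-empty."""
        if self.ef_q:
            return "EF", self.ef_q.popleft()
        if self.be_q:
            return "BE", self.be_q.popleft()
        return None
```

The token bucket is what keeps the strict priority from starving best-effort traffic: once the EF aggregate exceeds its configured rate, further EF packets no longer preempt the other queue.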

         Interoperability with Non-DS-Compliant Node

Non-DS-compliant node is a node that does not implement
some or all of the standardized PHBs.

A special case of a non-DS-compliant node is a legacy
node, which implements IPv4 Precedence classification as
defined in RFC 1812 and RFC 791

Nodes that are non-DS-compliant and not legacy nodes
may exhibit unpredictable forwarding behavior for packets
with non-zero DSCP.

        Non-DS-Compliant Node within a Domain

When the links connected to a non-DS-compliant node are
lightly loaded, the performance degradation may be
acceptable.

However, in general, the lack of PHB support in a node will
make it impossible to offer low-delay, low-loss service

Use of a legacy node may be acceptable if the precedence
implementation in the legacy node is compatible with the
services offered along the path

         Transit Non-DS-Compliant Domain

DS domain and non-DS domain may negotiate how egress
traffic from the DS domain is marked before entry into the
non-DS domain

When there is no traffic management service available or
no agreement in place, the DS domain egress node may remark
the DSCP to zero, on the assumption that the non-DS
domain will treat the traffic uniformly as best-effort
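The remarking step above amounts to rewriting the DSCP (the upper 6 bits of the IPv4 TOS/DS byte) while preserving the 2 ECN bits. A sketch (the helper name is ours; a real implementation would also update the IPv4 header checksum):

```python
EF_DSCP = 0b101110   # EF PHB code point from the slides

def remark_dscp(tos_byte, dscp):
    """Replace the DSCP (upper 6 bits of the IPv4 TOS/DS byte),
    preserving the lower 2 (ECN) bits."""
    assert 0 <= dscp < 64 and 0 <= tos_byte < 256
    return (dscp << 2) | (tos_byte & 0x03)

# Egress remarking to best-effort: DSCP set to zero, ECN bits kept.
# e.g. remark_dscp(0xB9, 0) leaves only the low ECN bit set.
```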

              Differentiated Services (DiffServ)

Scalable: Only simple functions in the core, and relatively
complex functions at edge routers (or hosts)
Flexible: Does not define service classes; instead provides
functional components with which service classes can be built
Simple: Users only specify a qualitative notion of service


             DiffServ Drawbacks

   The QoS enjoyed by a flow is dependent
    on the behavior of the other flows
    belonging to the same aggregate.

   There are no per-flow guarantees.

             IntServ over DiffServ

Since IntServ has scaling issues in the core of the
network, DiffServ was proposed.

IntServ provides guaranteed service per flow whereas
DiffServ only provides assurance for aggregated traffic

Thus, applications would still like to use IntServ up to the
edge of the DiffServ core on the ingress side, and from the
edge of the DiffServ core to the end host/router on the
egress side

Hence  the need for IntServ over DiffServ

            IntServ over DiffServ (cont.)

Request for Intserv services needs to be mapped onto
underlying capabilities of Diffserv network:

    * Selecting appropriate PHB for the requested service

    * Performing appropriate policing at the edge of the
      Diffserv network

    * Performing admission control on the Intserv flows

          IntServ over DiffServ (cont.)

When PHB has been selected for a particular Intserv
flow, it may be necessary to communicate the choice to
other network elements, e.g. when marking is not done
at the edge

Two schemes may be used to achieve this:

          * Network Driven Mapping (Default)
          * Microflow Separation

           IntServ over DiffServ (cont.)

1. Network Driven Mapping

   •   RSVP capable routers in Diffserv network (perhaps at
       the edge) may do the well-known mapping

           IntServ over DiffServ (cont.)

2. Microflow Separation

•   Boundary nodes at the edge of Diffserv network police
    traffic from outside Diffserv network

•   But this policing is applied to aggregate traffic

                MicroFlow Separation

So it is possible for a misbehaving microflow to claim
more than its fair share of resources within the
aggregate and degrade the service provided to other
microflows

This problem can be addressed in several ways:
    * Provide per-microflow policing at border routers:
      but this approach puts a management burden on the
      Diffserv region
    * Rely on upstream elements to do per-microflow shaping
      and policing

    IntServ over DiffServ (cont.)

Two scenarios in this framework:

* Diffserv Network is RSVP-unaware
* Diffserv Network is RSVP-aware

        Diffserv Network is RSVP-Unaware

1. Diffserv network and the customer of this network
   have negotiated SLAs, e.g., amount of BW Diffserv
   will provide for each SLA

2. RSVP messages just pass through the Diffserv
   network transparently (e.g., in tunnels), without any
   action being taken.

3. The edge router in Intserv network will identify the
   service level (DSCP) of the flow and will run
   admission control to make sure that resources are
   available in the Diffserv network at the
   corresponding service level.

          Diffserv Network is RSVP-Aware

1. Border routers and possibly some/all core routers in Diffserv
   network are RSVP-aware

2. These routers participate in RSVP signaling, but schedule
   traffic in aggregate (i.e., their control plane is RSVP while their
   data plane is Diffserv)

3. Admission control agent is part of Diffserv network.

          Multiprotocol Label Switching (MPLS)

   MPLS is a forwarding paradigm.
   Choosing the next hop can be thought as the
    composition of two functions:
     – Partitioning the entire set of possible
       packets into a set of Forwarding
       Equivalence Classes (FECs).
     – Mapping each FEC to a next hop.
   In the Multiprotocol Label Switching (MPLS),
    the assignment of a packet to a particular
    FEC is done just once: when the packet
    enters the network.
Operation of MPLS

[Figure: at each hop in the MPLS network, the Layer 2 header is removed, a small tag (label) lookup is performed at the link layer instead of a full Layer 3 route lookup, and a new Layer 2 header is attached before the packet is forwarded on toward the IP network.]
            “Label Substitution” What is it?

 One of the many ways of getting from A to B:

 * Flooding: go everywhere, stop when you get to B, never ask for
   directions.
 * Hop-by-hop routing: continually ask who’s closer to B, go there,
   repeat; stop when you get to B. “Going to B? You’d better go to X,
   it’s on the way.”
 * Source routing: ask for a list (that you carry with you) of places
   to go that eventually lead you to B. “Going to B? Go straight 5
   blocks, take the next left, 6 more blocks and take a right at the
   lights.”

                   Label Substitution

Have a friend go to B ahead of you using one of the previous methods.
At every road they reserve a lane just for you.
At every intersection they post a big sign that says, for a given lane,
which way to turn and what new lane to take.

                LANE#1: TURN RIGHT, USE LANE#2


          A Label by Any Other Name ...

 There are many examples of label substitution
 protocols already in existence.

• ATM - the label is called a VPI/VCI and travels with the cell.
• Frame Relay - the label is called a DLCI and travels with the frame.
• TDM - the label is a timeslot; it is implied, like a lane.
• X.25 - the label is an LCN.
• Proprietary TAGs, etc.
• One day, perhaps frequency substitution, where the label is a
  light frequency?
         SO WHAT IS MPLS ?

• Hop-by-hop or source routing
  to establish labels
• Uses label native to the media
• Multi level label substitution transport


[Figure: an IP packet crosses an MPLS domain. IP forwarding is used at the ingress and egress; inside the domain the packet carries a label that is swapped at each hop (#L1 → #L2 → #L3). Labels are established by label request messages flowing downstream and label mapping messages returning upstream.]
               WHY MPLS ?

 Leverage existing ATM hardware
 Ultra fast forwarding
 IP Traffic Engineering
    – Constraint-based Routing
   Virtual Private Networks
    – Controllable tunneling mechanism
   Voice/Video on IP
    – Delay variation + QoS constraints
             Need for MPLS

IP Routing
  • Slow

  • No path choice towards destination

  • No QoS guarantees

  • IP/ATM/SONET/DWDM architecture is not scalable
    for very large traffic, and very cost-ineffective

             BEST OF BOTH WORLDS

PACKET                                               CIRCUIT
ROUTING                      HYBRID                  SWITCHING

    IP                    MPLS                        ATM
• MPLS + IP form a middle ground that combines the best of IP and
 the best of circuit switching technologies.
• ATM and Frame Relay cannot easily come to the middle so IP has!!
           MPLS Terminology

   LDP: Label Distribution Protocol
   LSP: Label Switched Path
   FEC: Forwarding Equivalence Class
   LSR: Label Switching Router
   LER: Label Edge Router (Useful term
    not in standards)

                Forwarding Equivalence Classes
[Figure: packets IP1 and IP2 enter at the ingress LER, are carried with common labels (#L1 → #L2 → #L3) across the interior LSRs, and exit at the egress LER.]

          Packets are destined for different address prefixes, but can be
          mapped to a common path

• FEC = “A subset of packets that are all treated the same way by a router”
• The concept of FECs provides for a great deal of flexibility and scalability
• In conventional routing, a packet is assigned to a FEC at each hop (i.e., L3
  look-up), in MPLS it is only done once at the network ingress
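The split described above can be sketched as two lookups: the ingress LER classifies a packet into a FEC exactly once, and interior LSRs then forward by exact match on (interface, label). Table contents and interface names below are illustrative only:

```python
# Hypothetical ingress FEC table: different prefixes may map to the
# same (out_interface, label), i.e. the same LSP.
ingress_fib = {
    '47.1': ('if1', 40),
    '47.2': ('if1', 40),   # different prefix, same FEC/path
}

# Hypothetical label forwarding table at an interior LSR:
# exact match on (in_interface, in_label) -> (out_interface, out_label).
lsr_lft = {
    ('if3', 40): ('if1', 50),
}

def ingress_classify(dest_prefix):
    """One-time FEC assignment at the network ingress."""
    return ingress_fib[dest_prefix]

def lsr_forward(in_if, in_label):
    """Interior LSR: exact-match label swap, no L3 lookup."""
    return lsr_lft[(in_if, in_label)]
```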
             LABEL SWITCHED PATH (vanilla)
[Figure: a vanilla LSP as a tree: labels from multiple upstream sources (#216, #612, #14, #99, #963) merge toward a single destination via label #311 and then #462.]
- A Vanilla LSP is actually part of a tree from every
  source to that destination (unidirectional).
- Vanilla LDP builds that tree using existing IP forwarding
  tables to route the control messages.

[Figure: each hop’s IP forwarding table maps destination prefixes (47.1, 47.2, 47.3) to outgoing interfaces (1, 2, 3); LDP control messages are routed using these existing tables.]

              MPLS Label Distribution
[Figure: downstream-on-demand label distribution for prefix 47.1. A “Label request: 47.1” travels downstream toward the egress; a “Label mapping: 0.40” returns upstream, and each LSR installs an entry such as (in-if 3, in-label 0.50, dest 47.1) → (out-if 1, out-label 0.40).]
              Label Switched Path (LSP)

[Figure: the resulting LSP toward 47.1. Each LSR forwards by exact match on the incoming label, e.g. (in-if 3, label 0.50) → (out-if 1, label 0.40) at one hop, then (in-if 3, label 0.40) → (out-if 1) at the next.]

[Figure: an explicitly routed LSP from node A, using labels #14 and #972 along the operator-chosen path.]

ER-LSP follows route that source chooses.
In other words, the control message to establish the LSP
(label request) is source routed.
                       EXPLICITLY ROUTED LSP ER-LSP

[Figure: ER-LSP label tables. The ingress LSR holds an extra entry, e.g. (in-if 3, dest 47.1.1) → (out-if 2, label 1.33), steering that traffic over an explicitly chosen path, while 47.1 traffic still follows (out-if 1, label 0.50).]

                ER LSP - Advantages

• Operator has routing flexibility (policy-based, QoS-based)
• Can use routes other than shortest path
• Can compute routes based on constraints in exactly the
  same manner as ATM based on distributed topology
  database. (traffic engineering)

                MPLS Link Layers

•MPLS is intended to run over multiple link layers

•Specifications for the following link layers currently exist:

   • ATM: label contained in VCI/VPI field of ATM header

   • Frame Relay: label contained in DLCI field in FR header

   • PPP/LAN: uses ‘shim’ header inserted between L2 and L3

Translation between link layers types must be supported

  MPLS intended to be “multi-protocol” below as well as above

                        MPLS Encapsulation - ATM
An ATM LSR is constrained by the cell format imposed by existing ATM standards.

The 5-octet ATM header carries the VPI, VCI, PT, CLP, and HEC fields. Three options exist for carrying labels in it:
        Option 1: one label in the VPI field and one in the VCI field
        Option 2: a single combined label across the VPI/VCI fields
        Option 3: ATM VPI used as a tunnel, label carried in the VCI field

[Figure: an AAL5 PDU frame (n x 48 bytes) carrying the generic label encapsulation (PPP/LAN format), the network layer header and packet (e.g. IP), and the AAL5 trailer, segmented into 48-byte ATM cell payloads.]

• Top 1 or 2 labels are contained in the VPI/VCI fields of ATM header
        - one in each or single label in combined field, negotiated by LDP
• Further fields in stack are encoded with ‘shim’ header in PPP/LAN format
        - must be at least one, with bottom label distinguished with ‘explicit NULL’
• TTL is carried in top label in stack, as a proxy for ATM header (that lacks TTL)
           MPLS Encapsulation - PPP & LAN Data Links
MPLS ‘shim’ headers (1-n) are inserted between the Layer 2 header
(e.g. PPP, 802.3) and the network layer header and packet (e.g. IP).

Label stack entry format (4 octets):
        Label: Label Value, 20 bits (0-16 reserved)
        Exp.:  Experimental, 3 bits (was Class of Service)
        S:     Bottom of Stack, 1 bit (1 = last entry in label stack)
        TTL:   Time to Live, 8 bits

• Network layer must be inferable from value of bottom label of the stack
• TTL must be set to the value of the IP TTL field when packet is first labelled
• When last label is popped off stack, MPLS TTL to be copied to IP TTL field
• Pushing multiple labels may cause length of frame to exceed layer-2 MTU
  - LSR must support “Max. IP Datagram Size for Labelling” parameter
  - any unlabelled datagram greater in size than this parameter is to be fragmented

              MPLS on PPP links and LANs uses ‘Shim’ Header Inserted
                       Between Layer 2 and Layer 3 Headers
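The 4-octet stack entry described above can be packed and parsed with simple bit operations. A sketch of the field layout (function names are ours):

```python
import struct

def pack_shim(label, exp, s, ttl):
    """Pack one 4-octet MPLS shim entry:
    Label (20 bits) | Exp (3 bits) | S (1 bit) | TTL (8 bits)."""
    assert 0 <= label < (1 << 20) and 0 <= exp < 8
    assert s in (0, 1) and 0 <= ttl < 256
    word = (label << 12) | (exp << 9) | (s << 8) | ttl
    return struct.pack('!I', word)    # network byte order

def unpack_shim(data):
    """Parse a shim entry back into (label, exp, s, ttl)."""
    (word,) = struct.unpack('!I', data[:4])
    return (word >> 12, (word >> 9) & 0x7, (word >> 8) & 0x1, word & 0xFF)
```

Setting s=1 marks the bottom of the stack; when the last label is popped, the TTL here would be copied back into the IP TTL field, as described above.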
          MPLS & ATM
Several Models for running MPLS on ATM:
1. Label-Controlled ATM:
   • Use ATM hardware for label switching
   • Replace ATM Forum SW by IP/MPLS

[Figure: IP/MPLS routing software running directly on ATM switching hardware.]

                               Label-Controlled ATM
   • Label switching is used to forward network-layer packets
   • It combines the fast, simple forwarding technique of ATM with
     network layer routing and control of the TCP/IP protocol suite
[Figure: a Label Switching Router. The switched-path topology is formed using network-layer routing (e.g. OSPF, BGP4), i.e. the TCP/IP technique; packets are then forwarded by swapping short, fixed-length labels (e.g. 17 → 05) against per-node forwarding tables, i.e. the ATM technique.]

ATM Label Switching is the combination of L3 routing and L2 ATM switching
            MPLS Over ATM

[Figure: two MPLS LSRs interconnected across an ATM network.]

Two models: VP and VC.

                 Internet Draft:
                     VCID notification over ATM Link

            Ships in the Night

[Figure: MPLS LSR and ATM switch control planes running side by side on the same ATM hardware.]

   ATM and MPLS control planes both run on the
    same hardware but are isolated from each
    other, i.e. they do not interact.
   This allows a single device to simultaneously
    operate as both an MPLS LSR and an ATM switch.
   Important for migrating MPLS into an ATM
    network
   Ships in the Night Requirements

Resource Management
 – VPI/VCI Space Partitioning
 – Traffic Management
   • Bandwidth Reservation
   • Admission Control
   • Queuing & Scheduling
 – Processing Capacity
                                  Bandwidth Management

[Figure: three ways to divide port capacity between MPLS and ATM:
   A. Full Sharing - one pool shared by MPLS and ATM, remainder available
   B. Protocol Partition - Pool 1 (50%) for MPLS, Pool 2 (50%) for ATM
   C. Service Partition - Pool 1 (50%) for MPLS and rt-VBR ATM,
      Pool 2 (50%) for MPLS and nrt-VBR ATM]

                                     • Bandwidth Guarantees
                                     • Flexibility
                        ATM Merge
   Multipoint-to-point capability
 Motivation
    – Stream Merge to achieve scalability in MPLS:
        • O(n) VCs with merge, as opposed to O(n^2) for a
          full mesh
        • fewer labels required
    – Reduce the number of receive VCs on terminals
 Alternatives
    – Frame-based VC Merge
    – Cell-based VP Merge

                  Stream Merge
[Figure: stream merge. Non-VC merging (N in, N out): input cell streams on VCs 1, 2, 3 are mapped to distinct outgoing VCs 7, 6, 9, so the output interleaves cells of different frames but each frame remains distinguishable. VC merging (N in, 1 out): VCs 1, 2, 3 all map to outgoing VC 7; naive merging interleaves cells of different AAL5 frames on one VC (the AAL5 cell interleaving problem), so a VC-merging switch must emit each frame’s cells contiguously, with no cell interleaving.]
   VC-Merge: Output Module

[Figure: VC-merge output module. Per-VC reassembly buffers collect the cells of each frame; complete frames are then forwarded through a single output buffer.]

Option 1: Dynamic VCI Mapping - no cell interleaving problem,
          since each VCI is unique.

Option 2: Root-Assigned VCI (VP merge):
             – merge multiple VPs into one VP
             – use separate VCIs within the VPs to distinguish frames
             – less efficient use of VPI/VCI space, needs support of SVP
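Frame-based VC merge can be sketched as per-VC reassembly in front of a shared output (a simplified model; the class and names are ours):

```python
from collections import defaultdict

class VCMerge:
    """Frame-based VC merge: buffer cells per incoming VC until the
    AAL5 end-of-frame cell arrives, then emit the whole frame on the
    single outgoing VC, so cells of different frames never interleave."""
    def __init__(self, out_vci):
        self.out_vci = out_vci
        self.buffers = defaultdict(list)   # per-VC reassembly buffers
        self.output = []                   # shared output buffer

    def receive(self, in_vci, cell, end_of_frame):
        self.buffers[in_vci].append(cell)
        if end_of_frame:                   # AAL5 EOF indication in PTI
            frame = self.buffers.pop(in_vci)
            self.output.extend((self.out_vci, c) for c in frame)
```

Even with interleaved arrivals on VCs 1 and 2, each frame's cells leave contiguously on the merged VC.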

         Summary of Motivations for MPLS

• Simplified forwarding based on exact match of fixed-length label
        - initial drive for MPLS was based on the existence of cheap, fast ATM hardware
• Separation of routing and forwarding in IP networks
       - facilitates evolution of routing techniques by fixing the forwarding
       - new routing functionality can be deployed without changing the
        forwarding techniques of every router in the Internet
• Facilitates the integration of ATM and IP
       - allows carriers to leverage their large investment of ATM
       - eliminates the adjacency problem of VC-mesh over ATM
•Enables the use of explicit routing/source routing in IP networks
       - can be easily used for such things as traffic management, QoS

           Summary of Motivations for MPLS

• Promotes the partitioning of functionality within the network
       - move granular processing of packets to edge; restrict core to
         packet forwarding
       - assists in maintaining scalability of IP protocols in large networks
• Improved routing scalability through stacking of labels
       - removes the need for full routing tables from interior routers in
        transit domain; only routes to border routers are required
• Applicability to both cell and packet link-layers
       - can be deployed on both cell (eg. ATM) and packet (eg. FR,
        Ethernet) media
       - common management and techniques simplifies engineering

Many drivers exist for MPLS above and beyond high speed forwarding

                          IP and ATM Integration
IP over ATM VCs:
• ATM cloud invisible to Layer 3 routing
• Full mesh of VCs within the ATM cloud
• Many adjacencies between edge routers
• Topology change generates many route updates
• Routing algorithm made more complex

IP over MPLS:
• ATM network visible to Layer 3 routing
• Single adjacency possible with edge router
• Hierarchical network design possible
• Reduces route update traffic and the power needed to process it

            MPLS eliminates the “n-squared” problem of IP over ATM VCs

                      Traffic Engineering

     Traffic engineering is the process of mapping traffic demand onto a network


  Purpose of traffic engineering:
  • Maximize utilization of links and nodes throughout the network
  • Engineer links to achieve required delay, grade-of-service
  • Spread the network traffic across network links, minimize impact of single failure
  • Ensure available spare link capacity for re-routing traffic on failure
  • Meet policy requirements imposed by the network operator

      Traffic engineering key to optimizing cost/performance
                   Traffic Engineering Alternatives
                        Current Methods of Traffic Engineering:
       Manipulating routing metrics        → difficult to manage
       Using PVCs over an ATM backbone     → not scalable
       Over-provisioning bandwidth         → not economical

MPLS provides a new method to do traffic engineering (traffic steering)
          Example Network:

[Figure: the routing protocol chooses the least-cost path, which passes through a congested node; the ingress node instead explicitly routes traffic over an uncongested path chosen by traffic engineering (least congestion).]

   Potential benefits of MPLS for traffic engineering:
             - allows explicitly routed paths            operator control
             - no “n-squared” problem                  scalable
             - per FEC traffic monitoring                granularity of feedback
             - backup paths may be configured              redundancy/restoration

         MPLS combines benefits of ATM and IP-layer traffic engineering
             MPLS Traffic Engineering Methods

• MPLS can use the source routing capability to steer traffic on desired path

• Operator may manually configure these in each LSR along the desired path
         - analogous to setting up PVCs in ATM switches

• Ingress LSR may be configured with the path, RSVP used to set up LSP
          - some vendors have extended RSVP for MPLS path set-up

• Ingress LSR may be configured with the path, LDP used to set up LSP
          - many vendors believe RSVP not suited

• Ingress LSR may be configured with one or more LSRs along the desired path,
  hop-by-hop routing may be used to set up the rest of the path
          - a.k.a loose source routing, less configuration required

• If desired for control, route discovered by hop-by-hop routing can be frozen
          - a.k.a “route pinning”

• In the future, constraint-based routing will offload traffic engineering tasks from
  the operator to the network itself

                    MPLS: Scalability Through Routing Hierarchy
[Figure: transit domain AS1 between AS2 and AS3. The ingress border router receives a packet and labels it based on the egress router; interior transit routers TR1-TR4 forward based on the IGP route; the egress border router pops the label and forwards the packet.]
• Border routers BR1-4 run an EGP, providing inter-domain routing
• Interior transit routers TR1-4 run an IGP, providing intra-domain routing
• Normal layer 3 forwarding requires interior routers to carry full routing tables
      - transit router must be able to identify the correct destination ASBR (BR1-4)
• Carrying full routing tables in all routers limits scalability of interior routing
      - slower convergence, larger routing tables, poorer fault isolation
• MPLS enables the ingress node to identify the egress router and label the packet based on it
• Interior LSRs would only require enough information to forward the packet to the egress
  MPLS increases scalability by partitioning exterior routing from interior routing

           MPLS: Partitioning Routing and Forwarding
[Figure: routing protocols (OSPF, IS-IS, BGP, RIP) build a forwarding table. Conventional forwarding may be based on a classful or classless address prefix, a multicast address, a port number, or the ToS field; MPLS forwarding is based on an exact match on a fixed-length label.]
• Current network has multiple forwarding paradigms
      - class-ful longest prefix match (Class A,B,C boundaries)
      - classless longest prefix match (variable boundaries)
      - multicast (exact match on source and destination)
      - type-of-service (longest prefix. match on addr. + exact match on ToS)
• As new routing methods change, new route look-up algorithms are required
      - introduction of CIDR
• Next generation routers will be based on hardware for route look-up
      - changes will require new hardware with new algorithm
• MPLS has a consistent algorithm for all types of forwarding; partitions routing/fwding
      - minimizes impact of the introduction of new forwarding methods

      MPLS introduces flexibility through consistent forwarding paradigm
       Upper Layer Consistency Across Link Layers

                            PPP             ATM                Frame
        Ethernet       (SONET, DS-3 etc.)                      Relay

• MPLS is “multiprotocol” below (link layer) as well as above (network layer)

• Provides for consistent operations, engineering across multiple technologies

• Allows operators to leverage existing infrastructure

• Co-existence with other protocols is provided for
      - eg. “Ships in the Night” operation with ATM, muxing over PPP

           MPLS positioned as end-to-end forwarding paradigm

          Common Misconceptions

IP QoS is not ready for real, production networks.
QoS is not useful unless it is deployed end-to-end.
Only ATM networks can support true, end-to-end QoS.

