4 Measuring Network Delay

4.1 Measurements
Measurements are the key aspect of Internet Tomography and play an important role in
the estimation of the internal characteristics of the network. They are the basis on which
the inference strategy builds its estimates, and the observations are the only data that
Network Tomography can use. That is why the choice of measurements is so important for
obtaining as precise an estimate as possible. The inference techniques and the measurements
must work together. If the inference strategy is able to provide detailed estimates but is
fed with the wrong measurements, the results will be useless. The measurements must follow
the dynamics of the network parameters to be computed. It is important to implement the
measurements correctly and to find a trade-off between accuracy and complexity. Accuracy
is the degree of correctness of the measurements and the capacity to probe the network
without large measurement errors. Complexity covers not only the computational burden of
the measurement algorithm, but also the number of machines that cooperate to obtain the
measurements and the way they are managed. For example, the result of a surgical operation
will certainly be better if the surgeon is supported by medical personnel and operates
with state-of-the-art equipment than if he works alone and uses simple instruments.

An active external measurement is the measurement of a generated traffic flow that
crosses the network from an external source host to an external destination host.

The focus of the present work is link-level Network Tomography, that is, the estimation
of link-level parameters, such as delay distributions, from path-level measurements. The
only available measurement is the one-way delay a packet needs to travel a path from its
source to its destination. Let us define the measurement of the one-way delay along a path.

4.2 Synchronization of Host
The measurements are not easy to obtain. Time synchronization between the source
and destination hosts is required. There are many possible techniques to synchronize two
or more machines. One of them is the Global Positioning System (GPS). GPS uses
the satellite system as the reference time for the synchronization of the source
and the destination. GPS provides the highest synchronization accuracy [15].
Emanuele Orlando

The advantage of this method is that the error caused by propagation and network
jitter is significantly reduced. The synchronization accuracy depends on the precision with
which the source and destination hosts are able to synchronize their internal clocks to the
GPS signal. This method provides an accuracy of a few microseconds [15]. However,
GPS also has several drawbacks. In particular, synchronizing several devices, each
of which must be equipped with a GPS receiver, can be expensive. Furthermore, the GPS antenna
has to be located within a specific distance from the receiver, which depends on the card
and the antenna and limits the positioning of the monitoring devices.

Another synchronization system is based on Network Time Protocol (NTP) servers [15,16]. In
this system the synchronization is obtained through the time reference offered by public
NTP servers located across the Internet. This is the cheapest synchronization technique, but
it does not provide results as good as those of GPS. With NTP the accuracy depends both
on the characteristics of the paths followed by the NTP synchronization messages and on
the distance between the NTP server and the source and destination that must be
synchronized. The synchronization accuracy is of the order of ten milliseconds [15].
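The effect of the synchronization accuracy on a one-way delay measurement can be
illustrated with a minimal sketch. The function names and the 0.2 ms path delay are
illustrative assumptions; only the accuracy figures (a few microseconds for GPS, about
ten milliseconds for NTP) come from the text above.

```python
def measured_one_way_delay(send_time_s, recv_time_s):
    """One-way delay as seen by the hosts: receive timestamp minus send timestamp."""
    return recv_time_s - send_time_s


def worst_case_error(src_sync_error_s, dst_sync_error_s):
    """Worst-case error of the measured delay when each clock deviates
    from the reference time by at most the given amount."""
    return src_sync_error_s + dst_sync_error_s


# Hypothetical LAN path with a true one-way delay of 0.2 ms.
true_delay = 0.0002
gps_error = worst_case_error(5e-6, 5e-6)      # a few microseconds per host
ntp_error = worst_case_error(10e-3, 10e-3)    # about ten milliseconds per host

print(f"GPS: relative error up to {gps_error / true_delay:.0%}")
print(f"NTP: relative error up to {ntp_error / true_delay:.0%}")
```

Under these assumed numbers the NTP error is far larger than the delay being measured,
which is why the synchronization method matters most on short paths.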

    [Diagram: a GPS signal or an NTP server synchronizes the source and the destination
    hosts; the one-way delay is measured between them.]

    Figure 4.1: The GPS signal or NTP server synchronizes the source and the
    destination. It is then possible to calculate the one-way delay along a specific
    path of the network.

The techniques described above show how accuracy comes at the price of the complexity of
the measurement. A measurement synchronized with GPS provides the best results, but its
implementation is not simple.


Network Tomography was devised to analyze large-scale networks. The present work
describes how Network Tomography can be applied to a small-scale network, such as a
Local Area Network (LAN). In this context it is not reasonable to use a GPS receiver or any
other complex and expensive synchronization system. Even when such systems are available,
the complexity of the synchronization needed to obtain the one-way delay is higher than the
complexity of the inference algorithm. It is necessary to find new measurement tools and
to understand how the loss of accuracy influences the estimate.

In the next sections some of these tools, such as Ping, Traceroute and Pathchar, will be
described. It is important to balance the complexity and accuracy of a measurement in order
to judge its quality.

4.3 Ping
Ping is commonly expanded as ‘Packet Internet Groper’ [17]. The ping utility is essentially a
system administrator’s tool used to check whether a computer is operating and whether network
connections are intact. Ping uses the Internet Control Message Protocol (ICMP) Echo function,
documented in RFC 792, which all hosts should implement. ICMP messages are special IP control
messages used to send network information between two hosts; they will be described in
detail in Section 4.4. When a machine receives an Echo Request, it answers with an Echo
Reply, placing the data of the original Echo Request in the data field of the Echo Reply. In
particular, a host sends an ICMP Echo Request to an IP address and then waits (or ‘listens’)
for the returning Echo Reply. If the connection works and the target computer is active, a
valid return packet will be received.

To ping a host, it is only necessary to enter ping hostname, where hostname is either a
machine name or an IP address. Figure 4.2 shows the trace of a ping run in the
Router Lab of the Department of Distributed System of the University of Wuerzburg
(Germany). In this example the host Latona is pinged by the host Ull. Ull sends one ICMP
Echo Request packet every second to Latona. When the ping program gets an Echo Reply
back from Latona, it prints out the response, providing the following information. First, the
IP address the reply came from; usually the host name is pinged and its IP address is
given in the response. Second, the sequence number, and, third, the Time To Live (TTL) field.
Finally, it returns the Round Trip Time (RTT), expressed in milliseconds.

The sequence number starts at 0. The ICMP Echo Request number 0 is the first
packet sent to the pinged host. Ping places a unique sequence number on each packet it
transmits and reports which sequence numbers it receives back.

If a sequence number is skipped in the output, some packet was lost or dropped,
because no Echo Reply came back with the missing sequence number.
The reason for a missing sequence number is the loss of either the Echo Request or the Echo
Reply. In fact, it is possible that the Echo Request arrives at the destination and is answered by an


              PING Latona( byte
              64 bytes from to to : icmp_seq=0 ttl=255 time=0.298 ms
              64 bytes from to to : icmp_seq=1 ttl=255 time=0.352 ms
              64 bytes from to to : icmp_seq=3 ttl=255 time=0.354 ms
              64 bytes from to to : icmp_seq=4 ttl=255 time=0.353 ms
              64 bytes from to to : icmp_seq=5 ttl=255 time=0.351 ms
              Statistics Ping for
              Packets : Transmitted = 5, Received= 5, Lost = 0 <0% lost>

                   Figure 4.2. The result of ping for the host Latona.

Echo Reply, which for some reason fails to arrive at the pinging host. If the loss rate of
Echo Requests/Replies is around one per cent of the total number sent, the
link can be considered to be in a proper state. Usually, the probability of a loss is very low
inside a LAN and increases as the number of nodes in the network increases.
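The detection of skipped sequence numbers described above can be sketched as follows.
This is a minimal illustration, not part of the ping program: the function names are
hypothetical, the word "host" stands in for the address fields that are missing from
Figure 4.2, and the number of Echo Requests sent is assumed to be known.

```python
import re

# Matches the fields that ping prints for each Echo Reply.
REPLY_PATTERN = re.compile(r"icmp_seq=(\d+)\s+ttl=(\d+)\s+time=([\d.]+)\s*ms")


def parse_replies(lines):
    """Return a list of (sequence number, TTL, RTT in ms) tuples."""
    replies = []
    for line in lines:
        match = REPLY_PATTERN.search(line)
        if match:
            seq, ttl, rtt = match.groups()
            replies.append((int(seq), int(ttl), float(rtt)))
    return replies


def missing_sequence_numbers(replies, sent):
    """Sequence numbers in 0..sent-1 for which no Echo Reply was received."""
    received = {seq for seq, _, _ in replies}
    return [seq for seq in range(sent) if seq not in received]


def loss_rate(replies, sent):
    """Fraction of Echo Requests that were not answered."""
    return len(missing_sequence_numbers(replies, sent)) / sent


sample = [
    "64 bytes from host: icmp_seq=0 ttl=255 time=0.298 ms",
    "64 bytes from host: icmp_seq=1 ttl=255 time=0.352 ms",
    "64 bytes from host: icmp_seq=3 ttl=255 time=0.354 ms",
]
replies = parse_replies(sample)
print(missing_sequence_numbers(replies, sent=4))  # [2]
print(loss_rate(replies, sent=4))                 # 0.25
```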

The Time To Live (TTL) provides some further information. When an IP packet is
sent, the TTL field is set to an initial value chosen by the sender. The packet traverses the
network and the TTL field is decreased by one by each router it goes through. When the TTL
value becomes 0, the packet is discarded by the router. The ping program uses the TTL to
determine approximately how many router hops the packet has gone through.

Let us imagine the initial TTL is set to 255. In this situation the number of hops traversed
is 255 minus N, where N is the TTL of the returning Echo Reply. If the TTL field varies in
successive pings, it could indicate that the successive reply packets are travelling via
different routes. In a LAN, the TTL value can be set to 64, and usually the returned TTL does
not vary in consecutive pings.
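The hop count arithmetic above can be written as a short sketch. The function names are
hypothetical, and the initial TTL is assumed to be known to whoever reads the reply.

```python
def hops_traversed(initial_ttl, received_ttl):
    """Number of routers crossed: each router decrements the TTL by one."""
    if not 0 <= received_ttl <= initial_ttl:
        raise ValueError("received TTL must lie between 0 and the initial TTL")
    return initial_ttl - received_ttl


def route_may_have_changed(received_ttls):
    """True if successive replies carry different TTL values."""
    return len(set(received_ttls)) > 1


print(hops_traversed(255, 252))                 # 3
print(route_may_have_changed([252, 252, 251]))  # True
```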

Ping places a timestamp in each packet, which is echoed back and can easily be used
to compute the round-trip time to the remote host. The time a packet needs to reach its
destination is called latency. An important index when several Echo Requests are sent
is the variance of the RTT, called jitter. A high jitter means that the measurements are
not reliable. In a LAN the jitter does not vary considerably: the measurements change, but
always in a small band around the average RTT. The latency can be considered negligible,
because the network is small. The main cause of a varying jitter is the queuing delay at
routers, which depends on the traffic state of the LAN. When ping terminates, it prints out
a summary containing the minimum, the average and the maximum RTT, so the jitter can be
easily computed. In addition, the number of packets transmitted, the number received, and
the percentage of packets lost are printed out. Usually, a reliable jitter is of the order of
ten per cent of the average RTT.
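A minimal sketch of the summary statistics follows. It rests on one assumption that
departs from the wording above: the jitter is computed as the standard deviation of the
RTT rather than its variance, so that it has the same unit as the RTT and can be compared
with ten per cent of the average. The function names are hypothetical; the sample values
are the RTTs shown in Figure 4.2.

```python
import statistics


def rtt_summary(rtts_ms):
    """Minimum, average and maximum RTT plus the jitter, all in milliseconds."""
    return {
        "min": min(rtts_ms),
        "avg": statistics.mean(rtts_ms),
        "max": max(rtts_ms),
        "jitter": statistics.pstdev(rtts_ms),  # standard deviation (assumption)
    }


def is_reliable(rtts_ms, max_ratio=0.10):
    """True if the jitter stays within max_ratio of the average RTT."""
    summary = rtt_summary(rtts_ms)
    return summary["jitter"] <= max_ratio * summary["avg"]


rtts = [0.298, 0.352, 0.354, 0.353, 0.351]
print(rtt_summary(rtts))
print(is_reliable(rtts))
```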


Ping is simple to use and is available on all hosts. This is a significant
advantage and is sometimes more important than a precise measurement.

Later sections discuss in detail some other tools to obtain the RTT, such as Traceroute
and Pathchar. They provide more accurate results, but require a more complex
analysis in terms of acquisition and computation of the data.

The next section briefly explains how ping operates and describes the
structure of an ICMP packet.

4.4 ICMP
The Internet Control Message Protocol (ICMP) is a network-layer Internet protocol that
provides message packets to report errors and other information regarding IP packet
processing back to the source [18]. ICMP is documented in RFC 792.

ICMP messages are carried within standard IP datagrams. They provide a way for IP
stacks to send simple messages containing information or errors. ICMP is important for IP
networks to work correctly, because the Internet Protocol is not designed to be absolutely
reliable. The goal of these control messages is to provide feedback about problems in the
communication environment. There is still no guarantee that a datagram will be delivered
or that a control message will be returned. ICMP messages typically report errors in the
processing of datagrams, as specified in RFC 1122. In
this section the most important aspects are explained to provide a better understanding
of how ping operates.

ICMP defines several kinds of important messages, including Destination Unreachable,
Echo Request and Reply, Time Exceeded, and Timestamp Request and Reply [20]. If an ICMP
message cannot be delivered, no further ICMP message is generated about it, in order to
prevent an endless flood of ICMP messages.

An ICMP Destination-unreachable message is sent by a router when it is unable to deliver
the packet to its final destination. The router then discards the original packet. There are
different reasons why a destination might not be reached: for example, the source host could
specify a nonexistent address, or the router might not have a route to the destination.
Destination-unreachable messages include Network-unreachable, which indicates a failure in
the routing or forwarding of a packet. Another kind is the Host-unreachable message, which
indicates a delivery failure, such as one caused by a wrong subnet mask. Port-unreachable
messages are sent when the destination port is not available.

The Echo Request is an ICMP message which sends a packet of data to the host and
expects the data to be sent in return by an Echo Reply. The host must respond to an Echo


Request with an Echo Reply containing the exact data received in the request.

The ICMP Time-exceeded message is sent by a router if the Time-to-Live field of the IP
packet (expressed in hops or seconds) reaches zero. The basic format of ICMP messages is
shown in Figure 4.3. The 8-bit type field identifies the type of message. The type values
for the ICMP messages mentioned are the following:

      -   Echo Reply : Type 0
      -   Destination unreachable : Type 3
      -   Echo Request : Type 8
      -   Time exceeded : Type 11
      -   Timestamp request : Type 13
      -   Timestamp reply : Type 14

The code field specifies the kind of error within the given ICMP message type. For
example, in the case of ICMP Destination-unreachable, the codes for the causes already
mentioned are:

      - Net unreachable : code 0
      - Host unreachable : code 1
      - Port unreachable : code 3

The checksum field is the 16-bit one’s complement of the one’s complement sum of the ICMP
message, computed with the checksum field set to zero.

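                   0          7 8         15 16                    31
                  +------------+------------+-------------------------+
                  | 8-bit type | 8-bit code |     16-bit checksum     |
                  +------------+------------+-------------------------+
                  |        (Contents depend on type and code)         |
                  +---------------------------------------------------+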

                        Figure 4.3: Format of ICMP packet.
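The header layout and the checksum can be illustrated with a minimal sketch that builds
an Echo Request in memory without sending it. The function names are hypothetical. The
identifier and sequence number fields that follow the checksum are the type-specific
contents that RFC 792 defines for Echo messages.

```python
import struct

ICMP_ECHO_REQUEST = 8  # type value listed above


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes) -> bytes:
    """Type 8, code 0, checksum, then identifier, sequence number and data."""
    # First pass: checksum field set to zero.
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    # Second pass: insert the computed checksum.
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, checksum, identifier, sequence)
    return header + payload


packet = build_echo_request(identifier=0x1234, sequence=0, payload=b"abcdefgh")
print(packet.hex())
print(internet_checksum(packet))  # 0 for a message with a valid checksum
```

A receiver can verify a message the same way: recomputing the checksum over the whole
message, checksum field included, gives zero when nothing was corrupted.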

Figure 4.4 depicts the way ping works and the specific ICMP messages exchanged when the
connection between the source and the destination node is working. It is the responsibility
of the network layer (IP) protocol to ensure that the ICMP message is sent to the correct
destination. This is achieved by setting the destination address of the IP packet. The source
address is set to the address of the computer generating the ICMP Echo Request, and the IP
protocol type is set to "ICMP" to mark the packet as a control message. When the
destination receives the ICMP Echo Request, it copies all the data into the new ICMP


Echo Reply and sets the source address of the IP packet. When the ICMP Echo Reply is
received, it is possible to analyze the sequence number and the round-trip time. The sender
records the time immediately before transmission as a timestamp in the message. When the
destination receives the message, the timestamp is returned in the reply, together with an
additional timestamp taken just before the replier transmits. With these two values the
source can calculate the RTT, expressed in milliseconds.
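The timestamp mechanism can be sketched as follows. This is a simplification under
stated assumptions: only the sender's timestamp is used, it is stored as an 8-byte value
at the start of the payload, and the echo is simulated locally instead of crossing a
network. The function names and the payload layout are hypothetical.

```python
import struct
import time


def make_payload(send_time: float, padding: int = 48) -> bytes:
    """Payload carrying the send time, padded to 56 bytes by default."""
    return struct.pack("!d", send_time) + bytes(padding)


def rtt_ms(echoed_payload: bytes, receive_time: float) -> float:
    """RTT in milliseconds from the timestamp echoed back in the reply."""
    (send_time,) = struct.unpack("!d", echoed_payload[:8])
    return (receive_time - send_time) * 1000.0


sent_at = time.monotonic()
payload = make_payload(sent_at)
# ... the payload travels to the destination and is echoed back unchanged ...
print(f"{rtt_ms(payload, time.monotonic()):.3f} ms")
```

Because both readings come from the clock of the source, no synchronization with the
destination is needed.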

   [Diagram: the source sends the Echo Request through an intermediate system (router) to
   the destination specified in the IP destination address. The destination host receives
   the Echo Request, copies the payload data and returns an Echo Reply with the source and
   destination IP addresses. The RTT is measured at the source.]

   Figure 4.4: A source sends an ICMP Echo Request and listens for an ICMP Echo Reply.

4.5 Traceroute
Traceroute is a network debugging utility that attempts to trace the path a packet takes
through the network [21,22]. The Internet path between two locations contains many routers,
computers and other devices that help move IP packets along. The goal is to obtain the route
of the packet sent. One possibility is to use IP packet options such as record route and its
variants. These options are usually poorly specified and rarely implemented in a useful way.
In addition, they can be disabled for security reasons. These are some of the reasons for
introducing Traceroute as a tool that does not depend on any of these facilities.

Traceroute is an IP utility that allows the user to determine the route packets take to
a particular host. It observes the path the packets travel as they leave the sender and
head for their destination. Traceroute reports how many routers the packets travel
through and how long it takes them to travel between routers. It can also give the names of
the routers and their network affiliation and geographic location, if the routers have DNS
(Domain Name System) entries.


Traceroute also takes advantage of a feature of the Internet Protocol called TTL, or Time
To Live. Originally, this field in the IP packet header was supposed to contain a value
representing the actual amount of time a packet could travel around the Internet before a
router would simply discard it. It is used to prevent packets from looping endlessly
around the network. The TTL field is now interpreted as the maximum number of routers a
packet can transit. Every router that handles a packet subtracts one from the packet's TTL.
If the TTL reaches zero, the packet is discarded and an ICMP (Internet Control Message
Protocol) error message is transmitted to the originator of the packet. Traceroute depends
on the common router practice of sending an ICMP Time Exceeded Message (TEM) back to the
sender when this occurs. Traceroute causes the routers along a packet's normal delivery path
to generate these ICMP messages, which identify the router.

Traceroute works as follows. It sets the TTL field of the first packet to 1. The first router
in the path receives the packet and decrements the TTL value by one; since the resulting TTL
value is zero, this router responds with an ICMP TEM indicating that the datagram has
expired. Traceroute records the IP address and, if available, the DNS name of that router,
then sends out another packet with a TTL value of two. This packet makes it through the
first router, then expires at the next router in the path. This second router also sends an
error message back to the originating host. The process continues, incrementing the TTL by
one and recording the IP address and name of each router, until a packet finally reaches the
final destination, or until Traceroute decides that the host is unreachable. Since the probe
datagram tries to access an invalid port at the destination host, an ICMP Port Unreachable
Message (PUM) is returned; this event signals to the Traceroute program that it is
finished. The purpose of this sequence of packets sent with different TTL values is to record
the source of each ICMP Time Exceeded Message and so provide a trace of the path the
packet took to reach the destination host. Figure 4.5 shows how Traceroute works to find
the route of a path between a source and a receiver.

    [Diagram, two panels: a) the source sends UDP probes with TTL=1 and TTL=2 through a
    router towards the receiver; b) the resulting path from the source through the router
    to the receiver.]

    Figure 4.5: a) A sequence of UDP packets with different TTL values is sent until the
    receiver is reached. Each TEM gives back the RTT, the IP address and, if available, the
    DNS name of that router. b) Knowing all the IP addresses, it is possible to
    define the path between the source and the receiver.
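The probing loop can be illustrated with a small simulation. No packets are sent: the
path is given as a list, and the hypothetical function send_probe plays the role of the
network by returning the kind of ICMP message that would come back and the node that
would send it. All names are illustrative assumptions.

```python
def send_probe(path, ttl):
    """Simulate one UDP probe sent with the given TTL along a known path.

    `path` lists the routers in order, followed by the destination host.
    Each router decrements the TTL; the node where it reaches zero answers.
    """
    for hop, node in enumerate(path, start=1):
        if hop == len(path):
            return "port_unreachable", node   # probe reached the receiver
        ttl -= 1
        if ttl == 0:
            return "time_exceeded", node      # router discards the probe
    raise ValueError("empty path")


def traceroute(path, max_hops=64):
    """Increase the TTL by one per probe until the destination answers."""
    route = []
    for ttl in range(1, max_hops + 1):
        reply, node = send_probe(path, ttl)
        route.append((ttl, node))
        if reply == "port_unreachable":
            break
    return route


print(traceroute(["router-1", "router-2", "receiver"]))
# [(1, 'router-1'), (2, 'router-2'), (3, 'receiver')]
```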


A practical application is depicted in Figure 4.6. The numbered output lines give the
TTL value, the name and IP address of the host or router, and the round-trip times of the
three datagrams that are sent. Figure 4.7 shows that the path obtained exists.

  Traceroute to (, 64 hops max, 40 byte packets
   1 ( 0.248ms 0.254ms 0.254ms
   2 (        0.562ms 0.599ms 0.598ms
   3      (       0.96ms 0.98ms 1ms
   4 ( 1.2ms 1.3ms 1.3ms

   Figure 4.6: Reading from left to right, it is possible to recognize the TTL, the
   host along the path and the RTT of each router.

   [Network diagram: Switch wi3k0, Switch, Firewall, Switch, Hestia (132.187.106.46), Zeus.]

   Figure 4.7: Traceroute applied from Ull to Hestia and its relative network path.

4.6 Pathchar
Pathchar is a tool devised to infer the characteristics of the links along an Internet path
[23,24,25,26,27]. From the global view of a path, link-level parameters such as the
bandwidth, the latency or the queuing delays can be estimated by means of Pathchar.
Once a path is chosen, the characteristics of a link along it can be inferred by measuring
the round-trip time (RTT) of packets sent from a single host.

The substantial difference between Pathchar and Traceroute is that Pathchar conducts a so-
called packet train analysis. A packet train is a sequence of at least two packets sent along


the path. By varying the characteristics of the packets of different trains, such as
their size, it is possible to obtain estimates of the link-level parameters. In this case the
analysis along the path is based on the RTT measurements of the packet trains.

Like Ping and Traceroute, Pathchar uses the TTL field of the IP packet. TTL is a field
in the IP header that indicates how many more hops the packet is allowed to make
before being discarded. If a router receives a packet whose TTL has run out, the
packet has expired. In this case the router drops the packet and sends an
ICMP Time Exceeded message back to the sender. The source address of the ICMP packet
identifies the router where the packet expired. The RTT is computed by measuring, for each
probe, the time until the ICMP error message is received. Pathchar operates by sending a
series of probes with varying TTL values and varying packet sizes. By conducting a
statistical analysis of these measurements it is possible to infer the link-level parameters for
each link along a path, for example the bandwidth, the latency and the queuing delay
distribution. Figure 4.8 shows how these measurements are conducted.

    [Diagram: the sender sends UDP probes with TTL=1 to TTL=n across the links towards
    Router 1 to Router n; the router where a probe expires answers with an ICMP message.]

    Figure 4.8: A source sends UDP packets along a path, waiting for the ICMP
    response. By varying the IP TTL it is possible to control how far the packets
    travel into the network and to obtain the number of links traversed. Bandwidth and
    latency can be estimated by changing the size of the packets. Repeating this
    multiple times gives queuing and loss information.

In order to calculate the RTT, it is important to observe how a packet travels along a path.
Figure 4.9 shows a typical example of calculating the RTT between two different routers.
Setting the TTL to n, let us focus on the (n-1)-th and n-th nodes. When the
packet reaches the (n-1)-th node, it waits for a queuing time q1 to get onto the outgoing link.
The transmit time is the time a packet spends on a link. It is a linear function of the
packet size, expressed in Equation 4.1, which depends on the latency and the bandwidth of
the link.


    $$\mathit{transmit} = \mathit{lat} + \frac{\mathit{packet\_size}}{\mathit{bw}} \qquad (4.1)$$

At node n the packet waits for another queuing time, q2. The router reads the null TTL
value of the packet and generates the ICMP message. The ICMP packet also spends a
queuing time q3 before being forwarded, then returns to node (n-1) with a transmit time
given by Equation 4.1. Finally, it waits in the queue of node (n-1) for a time q4.

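   [Diagram: node n-1 and node n joined by a link. q1 and q4 are the queuing delays at
   node n-1, q2 and q3 those at node n; the transmit time is spent on the link and the
   forward message is generated at node n.]

   Figure 4.9: The components of the RTT are the queuing delays q_i, the transmit
   times and the forwarding time.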

The RTT from the (n-1)-th to the n-th node and back is:

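    $$\mathit{rtt} = q_1 + \left(\mathit{lat} + \frac{\mathit{packet\_size}}{\mathit{bw}}\right) + q_2 + \mathit{forward} + q_3 + \left(\mathit{lat} + \frac{\mathit{message\_size}}{\mathit{bw}}\right) + q_4 \qquad (4.2)$$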

To simplify Equation 4.2, three hypotheses are made:

      - The size of the ICMP message is small enough to consider message_size/bw negligible.
      - The forwarding time of a router is usually small enough to consider the forward
        time, forward, negligible.
      - For a high number of measurements on a given path, the probability of observing
        negligible queuing delays q_i is not null.

Using these assumptions, the Equation 4.2 can be replaced by the following Equation 4.3.


    $$\mathit{rtt} = \left(\mathit{lat} + \frac{\mathit{packet\_size}}{\mathit{bw}}\right) + \mathit{lat} \qquad (4.3)$$

Pathchar models the RTT with Equation 4.3. The measurement is repeated until the
shortest possible value is found. In fact, the shortest value is the RTT that
most likely satisfies Equation 4.3 and its hypotheses.
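The relation between Equations 4.2 and 4.3 can be checked numerically with a minimal
sketch. All numeric values are hypothetical, and the first two hypotheses are idealized
by setting the ICMP message size and the forwarding time to zero. Times are in seconds,
sizes in bits and the bandwidth in bits per second.

```python
def rtt_full(q, lat, bw, packet_size, message_size, forward):
    """Equation 4.2. q = (q1, q2, q3, q4) are the queuing delays."""
    q1, q2, q3, q4 = q
    return (q1 + (lat + packet_size / bw) + q2 + forward
            + q3 + (lat + message_size / bw) + q4)


def rtt_simplified(lat, bw, packet_size):
    """Equation 4.3: negligible ICMP size, forwarding time and queuing."""
    return (lat + packet_size / bw) + lat


# Hypothetical link: 50 us latency, 100 Mbit/s, 1500-byte probes.
LAT, BW, PACKET_BITS = 50e-6, 100e6, 1500 * 8

# Hypothetical queuing delays of three repeated probes; the last one is unqueued.
queues = [(40e-6, 10e-6, 0.0, 25e-6), (5e-6, 0.0, 0.0, 60e-6), (0.0, 0.0, 0.0, 0.0)]
samples = [rtt_full(q, LAT, BW, PACKET_BITS, message_size=0, forward=0.0)
           for q in queues]

print(min(samples), rtt_simplified(LAT, BW, PACKET_BITS))
```

Queuing can only add time, so the shortest of the repeated samples is the one that
coincides with the simplified model.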
Pathchar infers the latency and the bandwidth by measuring the RTT of Equation 4.3 for a
range of packet sizes. The RTT is plotted against the packet size and, for each size, the
shortest one is chosen. Figure 4.10 shows the scatter plot and the RTTs chosen. Pathchar
estimates the link parameters by subtracting the vertical intercepts of adjacent links. The
vertical intercept is the total latency; hence, subtracting the (n-1)-th value from the n-th
gives twice the latency of link n and, dividing by two, the latency of link n is estimated.

Pathchar infers the bandwidth by subtracting the slopes of adjacent links. The
difference between the slopes of nodes n and (n-1) gives the inverse of the bandwidth of
that link. Inverting this value, the bandwidth of link n is estimated.
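The estimation steps described above (shortest RTT per packet size, line fit, difference
of intercepts and slopes) can be sketched as follows. This is an illustration under
stated assumptions, not Pathchar's actual implementation: an ordinary least-squares fit
is assumed for the line, all function names are hypothetical, and the data are synthetic.

```python
def min_rtt_per_size(samples):
    """samples: iterable of (packet_size, rtt). Keep the shortest RTT per size."""
    best = {}
    for size, rtt in samples:
        best[size] = min(rtt, best.get(size, rtt))
    return sorted(best.items())


def fit_line(points):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def link_parameters(samples_prev, samples_curr):
    """Latency and bandwidth of link n from the probes that expired at
    node n-1 (samples_prev) and at node n (samples_curr)."""
    slope_prev, icpt_prev = fit_line(min_rtt_per_size(samples_prev))
    slope_curr, icpt_curr = fit_line(min_rtt_per_size(samples_curr))
    latency = (icpt_curr - icpt_prev) / 2      # intercepts differ by twice the latency
    bandwidth = 1 / (slope_curr - slope_prev)  # slopes differ by 1/bandwidth
    return latency, bandwidth


# Synthetic probes: link 1 (0.1 ms, 10 Mbit/s), link 2 (0.2 ms, 5 Mbit/s).
sizes = [4000, 8000, 12000]  # packet sizes in bits
node1 = [(s, 2 * 1e-4 + s / 1e7) for s in sizes]
node2 = [(s, 2 * (1e-4 + 2e-4) + s / 1e7 + s / 5e6) for s in sizes]
node2 += [(s, rtt + 3e-4) for s, rtt in node2]  # extra samples delayed by queuing

print(link_parameters(node1, node2))  # close to (0.0002, 5000000.0)
```

At least two distinct packet sizes are needed for the fit, and for the first link the
previous node is the source itself, with zero slope and zero intercept.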

Pathchar is used to estimate the minimum RTT for a given packet size. An increment of the
packet size causes an inevitable increment of the RTT. Given a packet size, only the
minimum RTT observed is considered, assuming that any excess time a packet takes is
caused by queuing delay. The distribution of queuing delay on the node n can be estimated
by using Pathchar, because it is obtained by convolution of the queuing delays of previous

    [Plots, three panels of rtt against packet size: a) all measured rtt; b) the shortest
    rtt for each packet size on link n; c) the lines of link n and link n-1, whose
    intercepts and slopes are subtracted.]

    Figure 4.10: a) A scatter plot of all measured rtt. b) Pathchar chooses the shortest rtt
    from the global scatter plot. c) Pathchar finds latency and bandwidth by
    subtracting the intercepts and the slopes of adjacent links.


Pathchar provides the latency estimate. The inference of the bandwidth depends on the
precision of the interpolation that yields the line in Figure 4.10b. It is impossible to obtain
a perfect line, and there are inevitable errors in its slope. On fast links the difference in
RTT between the largest packet and the smallest one is usually insignificant, so the higher
the bandwidth, the more difficult the estimation process. The quality of the bandwidth
estimate also depends on the number of packets sent: when the number of probes is high, it is
more likely that some packets travel along the path without incurring any queuing delay.

Pathchar also provides the queuing delay probability distribution over a link. The key
assumption is that the network traffic is in a steady state.

4.7 Advantages
The biggest advantage of the three tools described above is that no
synchronization between the source and the destination node is needed. Ping, for example,
copies the timestamp of the Echo Request into the Echo Reply without using any other
mechanism to connect the source and the destination hosts. The tools work autonomously
and provide the RTTs with little effort.

A measurement was defined as the one-way delay along a path for link-level delay
estimation. Ping, Traceroute and Pathchar compute the RTT, which is the time required to
go and come back. The one-way delay is obtained by dividing the RTT by two. This is
possible only under the hypothesis that the forward route of a packet is the same as the
return route. This is an empirical approach, but it is the simplest to use. It is also quite
reliable under the hypothesis that the network is in a steady state and the local traffic
does not change during the measurement. In this case, the average queuing delays at the
routers are the same in both directions. These assumptions can easily be applied in a LAN,
where a closed network can be completely managed. In this context the jitter of the RTT
can be expected to be small. An upper bound for this jitter is ten per cent of the total RTT
computed.
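The conversion from RTT to one-way delay is a single division, shown here as a minimal
sketch. The function name is hypothetical, and the result is only meaningful under the
hypotheses stated above: identical forward and return routes, a steady-state network and
unchanged local traffic. The sample value is one of the RTTs of Figure 4.2.

```python
def one_way_delay(rtt):
    """Estimate the one-way delay as half the RTT (symmetric-path assumption)."""
    if rtt < 0:
        raise ValueError("RTT cannot be negative")
    return rtt / 2.0


rtt_ms = 0.352
print(one_way_delay(rtt_ms))  # 0.176
```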

The next chapter explains why the one-way measurement is not always
necessary to apply the link-level delay algorithm described in Section 3. It is possible to
perform link-level delay estimation using the RTT, without requiring any synchronization
between the source and the destination hosts.
