On the Resilience of SACK and NewReno TCP
                                              Qiang Ye, Mike H. MacGregor
                    Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada
                                          Email: {qye,macg}@cs.ualberta.ca

  Abstract-The de facto requirement in traditional telephone networks is
to restore failures in 50 milliseconds or less. The same standard has
been assumed in data networks. In this study we consider the reaction of
TCP to a failure in a continental-scale network. Our goal is to
determine whether there are particular values for outage duration at
which file transfer times increase markedly. Such values would indicate
significant objectives for the restoration of networks carrying TCP
traffic. For SACK and NewReno TCP, we find that a restoration objective
of 600 ms to 1 s is appropriate. In addition, we also find that receive
buffers can be sized at 2rτ to maximize link utilization and resilience.

  Index Terms-TCP, resilience, data network.

I. INTRODUCTION

  The de facto requirement in traditional telephone networks is for
restoration to occur in 50 milliseconds or less [1][2]. This was adopted
as the result of considering the impact of outage duration on voice
calls. Outages of greater than 50 ms will likely result in many calls
being dropped, due to various voice switch design parameters. Once these
calls have been dropped, there is the potential for an inrush of
reattempts which has the potential to overload and crash the network.
However, the same considerations do not necessarily apply to data
traffic. Despite this, the same 50 ms objective has been assumed in the
development of data networks. Recent studies contend that for network
and application layers, 50 ms restoration is not necessary [3][4]. For
Internet packet transport, the question that needs to be answered is:
how does TCP behave in case of network failures? Does TCP really need 50
millisecond restoration?
  This paper focuses on the TCP-layer view of failures. That is, the
goal of this study is to find out, in the absence of any other
compensating mechanisms (e.g. Automatic Protection Switching), how TCP
control mechanisms react to outages. This data is fundamental to
designing any restoration mechanisms for networks carrying TCP traffic.
  In this paper we consider the reaction of a single TCP session to
network link failures. Interactions between multiple TCP flows are not
taken into account. We believe that the TCP protocol itself is complex
enough that it is necessary to first understand how TCP behaves in this
baseline scenario, before setting standards, proposing mechanisms, or
exploring the impact of additional variables.
  Our other concern is to be able to size client buffers. The
"bandwidth-delay" product is usually used to set receive buffer size, to
fully utilize transport network bandwidth. When network failures are
considered, we would like to know what buffer size leads to the best
resilience of TCP.
  The rest of the paper is organized as follows. Section 2 gives the
background of TCP resilience mechanisms, and Section 3 presents the
details of our simulations. In Section 4 we discuss the behavior of TCP
in the case of network failures. The paper closes with our conclusions
and recommendations in Section 5.

II. RESILIENCE MECHANISMS IN TCP

  TCP does not have any resilience mechanisms that are specially
designed to deal with network failures. From the viewpoint of TCP, there
is no difference between network failure and network congestion. As a
result, when part of the network fails and some segments are dropped,
TCP will assume that there is congestion somewhere in the network, and
the TCP congestion control mechanisms will start dealing with the
segment loss.
  TCP congestion control mechanisms have improved over time. The main
versions of TCP are Tahoe TCP, Reno TCP, NewReno TCP and SACK TCP. Tahoe
TCP is the oldest version and only a few old systems use it. Reno TCP,
NewReno TCP and SACK TCP are widely implemented [5]. This paper focuses
on SACK and NewReno TCP because they are the newer versions and are more
widely deployed. Details about TCP congestion control can be found in
[6][7][8][9][10]. In our experiments, the TCP implementation conforms to
the one illustrated in [9].
  Both SACK and NewReno TCP congestion control are composed of three
parts: slow start, congestion avoidance and fast retransmit/fast
recovery. Three state variables, cwnd (congestion window), rwnd
(receiver’s advertised window) and ssthresh (slow start threshold), are
maintained at the sender to deal with network congestion. In addition,
SACK TCP has an extra variable called pipe at the sender that represents
the estimated number of outstanding segments. SACK TCP also has a data
structure called scoreboard at the sender side that keeps track of the
contiguous data blocks that have arrived at the receiver. Retransmission
timeout (RTO) is an important parameter in TCP congestion control. It
has a
minimum of one second and RFC 2988 [11] suggests that a maximum value
may be placed on RTO. In our simulation, this maximum value is 64
seconds.

III. SIMULATION

  Our simulations were carried out using OPNET [12] because it has more
up-to-date versions of protocols of interest than other simulators such
as ns [13]. For this study, a client and server are connected across a
continental-scale network. Each node is connected to a local router via
a high-speed LAN link. The local routers are connected to the core
network via access links. Three access link rates are commonly used in
real-life systems: DS0 (64 Kbps), DS1 (1.544 Mbps) and OC-3c (155 Mbps).
Based on the fact that servers are usually connected to the Internet via
high-speed links while client-side access link rates vary a lot, our
simulation fixes the server-side access at OC-3c and varies the
client-side access link from DS0 to DS1 and OC-3c. The core network in
our simulation has an NSFNET-like topology shown in Figure 1. Core
routers (Cisco 12008) are connected via OC-192, which is common in
backbone networks nowadays. The client resides in Palo Alto and the
server is located at Princeton.
  As shown in Figure 1, a packet discarder model, used to simulate
outages, is on the link connecting Salt Lake City to Palo Alto. We can
specify either the number of packets to be dropped or a certain time
period during which all packets are dropped. Our experiments simulate a
unidirectional failure of packets going from Salt Lake City to Palo Alto
(i.e. in the server-to-client path). Packets traveling the other way get
to their destination safely. A unidirectional failure would be unusual
in a transport network. However, this assumption was made by many
network researchers to reflect the reality of today’s Internet: routes
for IP packets are often asymmetric [14]. Thus a failure in the
underlying network will often only affect a session in one direction.
  There is only one routing domain in our simulations, and the
NSFNET-like topology is relatively old. However, this paper focuses on
the TCP-layer view of failures. That is, this paper tries to find out,
in the absence of any compensating mechanisms, how TCP congestion
control mechanisms will react to outages. This first-step experiment
generated many valuable results, some of which are presented in detail
in Section 4. In practice, it usually takes routing protocols (both IGP
and EGP) tens of seconds to detect and react to lower-layer failures
[15][16]. If the failures can be restored within the time horizons
recommended in this paper, the routing protocol will not detect the
failure, and any failure will be restored long before the routing
protocol could converge. For these reasons, we do not consider the
potential reaction of routing protocols to the failures under study.
  The receive buffer at the client plays an important role in TCP
performance. During a TCP session, the sending TCP continuously compares
the outstanding unacknowledged traffic with cwnd and rwnd. Whenever the
outstanding traffic is less than the smaller of these two variables by
at least one SMSS (sender maximum segment size), the sender will send
out some segments if there are any waiting to be sent. Generally the
receive buffer size (rbuff) is set as:

$$ \mathit{rbuff} = \text{bandwidth} \times \text{round-trip time} = r \tau \tag{1} $$

where r stands for bandwidth and τ is round-trip time (RTT). This is
commonly called the bandwidth-delay product [7].
  The TCP session in our simulation must be long enough to test
scenarios with varying failure durations. We chose FTP as the
application-layer protocol and made the transmitted file large enough to
fulfill this requirement. For DS0, DS1 and OC-3c client-side access
links, we used 5 MB, 10 MB and 20 MB files, respectively. In reality,
the duration of TCP flows covers a very large range. Although there are
many short flows in terms of number, long-running flows account for up
to 50% of all traffic [17]. This paper focuses on long-running TCP
flows.

IV. EXPERIMENTAL RESULTS

  We first describe the general behavior of SACK and NewReno TCP in the
case of network failures, and then present some additional details of
TCP resilience for different scenarios. We use Transmission Time
Increase (TTI) to quantify the impact of a network failure:

$$ \mathit{TTI} = \mathit{ATT} - \mathit{NTT} \tag{2} $$

where ATT stands for Actual Transmission Time in the case of network
failure, and NTT means Normal Transmission Time in the case of no
network failure.

A. General Behavior of TCP

  We use a scenario with DS1 client access and 32 KB receive buffer as a
typical example to illustrate the general behavior of TCP in the case of
network failures.

1. SACK TCP General Behavior

  In our example, a long SACK TCP session starts at time 0 and the
Packet Discarder begins to drop packets at 30 seconds, as shown in
Figure 2. The four curves in Figure 2 illustrate the changes in the
sender’s congestion window over time in four different cases. Before the
failure starting point (marked by “X”), the four curves overlap each
other because during that period they describe essentially the same
conditions. After point “X”, they split into four different curves.
  When the TCP session starts, the congestion window is initially one
SMSS (1460 bytes in our simulation) and TCP is in slow start. cwnd
increases exponentially as the
sender receives acknowledgements, until cwnd equals ssthresh (initially
ssthresh is 64 KB). The sparse points at the beginning of the curve
correspond to this slow start period. Then TCP transitions into
congestion avoidance during which cwnd increases by 1460 bytes every
RTT. This increase is comparatively slow and as a result the points for
the congestion avoidance period are very close together.
  Without a network failure, TCP stays in congestion avoidance until the
file is completely transmitted and the TCP session is terminated. This
is shown by the curve labeled “0 Drop”.
  Now we consider the case in which some segments are lost during a
failure. In this case, after the lost segments, the client receives
subsequent segments over the restored link. As a result, the sender gets
three duplicate acknowledgements, and TCP transitions into fast
retransmit/fast recovery. It retransmits the earliest unacknowledged
segment and sets pipe to the number of outstanding segments, 23 in this
case. It also sets ssthresh to half the current flight size (16060 B in
this case) and sets cwnd to (ssthresh + 3 * SMSS) = 20440 bytes [8].
Theoretically ssthresh should be set to exactly half the current flight
size, but OPNET uses a slightly different calculation: after halving the
flight size (33580 B), the result of 16790 B is rounded down to a
multiple of SMSS (16060 B). Given the method used to calculate ssthresh,
we have the following relation between cwnd and pipe:

$$ \mathit{cwnd} = (\lfloor \mathit{pipe}/2 \rfloor + 3) \cdot \mathit{SMSS} \tag{3} $$

  During fast recovery, pipe is increased by one when the sender either
sends a new segment or retransmits an old segment, and it is decreased
by one for each additional duplicate ACK. For each partial ACK (an ACK
received during fast recovery that acknowledges new data but does not
take TCP out of fast recovery), pipe is decreased by two rather than
one. When (pipe * SMSS) becomes at least one SMSS less than cwnd, the
sender will check the scoreboard and either retransmit the earliest
unacknowledged segment or transmit a new segment when there are no
unacknowledged segments. We use nD to denote the number of duplicate
ACKs received by the sender when (pipe * SMSS) becomes exactly one SMSS
less than cwnd. Then we have:

$$ \mathit{pipe} \cdot \mathit{SMSS} - n_D \cdot \mathit{SMSS} = \mathit{cwnd} - \mathit{SMSS} \tag{4} $$

So:

$$ n_D = \mathit{pipe} - \mathit{cwnd}/\mathit{SMSS} + 1 \tag{5} $$

From Eq. (3) and (5), we have:

$$ n_D = \mathit{pipe} - (\lfloor \mathit{pipe}/2 \rfloor + 3) + 1 = \mathit{pipe} - \lfloor \mathit{pipe}/2 \rfloor - 2 \tag{6} $$

If we use nC,S to denote the critical number of lost segments in this
case, we arrive at:

$$ n_{C,S} = \mathit{pipe} - n_D = \lfloor \mathit{pipe}/2 \rfloor + 2 \tag{7} $$

  In the normal state the receiver only sends out an ACK for every
second full-sized segment, or within 200 ms of the arrival of the first
unacknowledged segment. Also, out-of-order data segments should be
acknowledged immediately. Thus when the first out-of-order segment in
the window arrives, if there is no unacknowledged segment at the
receiver, this segment will trigger the receiver sending out a duplicate
ACK (we call this case Type I Failure). Eq. (7) above applies to this
scenario. But if the first out-of-order segment arrives within 200 ms of
an unacknowledged segment, the receiver will not send out a duplicate
ACK. Instead the receiver transmits an acknowledgement of the
unacknowledged segment. Of course each segment following the first
out-of-order segment results in a duplicate ACK. We call this case Type
II Failure. In this scenario, Eq. (7) should be modified slightly: nC,S
should be decreased by one. Hence, we should calculate nC,S according to
the following formula:

$$ n_{C,S} = \begin{cases} \lfloor \mathit{pipe}/2 \rfloor + 2 & \text{in Type I Failure} \\ \lfloor \mathit{pipe}/2 \rfloor + 1 & \text{in Type II Failure} \end{cases} \tag{8} $$

Note that the SACK DS1-32K example illustrates a Type II failure, thus
nC,S is ⌊23/2⌋ + 1 = 12.
  If no more than nC,S segments within one window are lost, there will
be enough segments that arrive at the receiver and trigger enough
duplicate selective ACKs. These ACKs will then make (pipe * SMSS) at
least one SMSS less than cwnd at the sender. In this case, the sender
can retransmit other lost segments after retransmitting the earliest
unacknowledged segment. When a non-duplicate ACK arrives acknowledging
all data that was outstanding when fast retransmit/fast recovery was
entered, TCP exits fast retransmit/fast recovery and switches into
congestion avoidance. In short, SACK TCP can usually recover quickly
from the loss of no more than nC,S segments. The overall transmission
time does not increase much in this case. In the DS1-32K case, nC,S is
12. Thus from 1 lost segment to 12 lost segments, all the curves are
similar. For clarity, we only include the 12-drop curve in Figure 2.
  On the other hand, if more than nC,S segments are dropped and there
are still three or more non-retransmitted segments following the lost
segments that arrive at the receiver, enough duplicate ACKs will reach
the sender to trigger fast retransmit/fast recovery. But in this case
the duplicate selective ACKs will never make (pipe * SMSS) at least one
SMSS less than cwnd, so the sender will not retransmit other lost
segments after retransmitting the earliest unacknowledged segment. It is
doomed to timeout, which will force TCP into slow start. TCP will then
have to retransmit the earliest segment at that moment, which in this
case is the second lost segment.
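The relations in Eqs. (3) through (8) can be checked numerically. The following is a minimal Python sketch, assuming the OPNET rounding described above (half the flight size rounded down to a multiple of SMSS); the function name and dictionary keys are illustrative and do not come from the paper or from OPNET.

```python
SMSS = 1460  # bytes; segment size used in the paper's simulation


def sack_fast_recovery_numbers(pipe, type_ii=False, smss=SMSS):
    """Evaluate Eqs. (3)-(8) for a given number of outstanding segments.

    pipe    -- outstanding segments when fast recovery is entered
    type_ii -- True for a Type II Failure (first out-of-order segment
               does not produce a duplicate ACK)
    """
    half = pipe // 2                 # floor(pipe / 2)
    ssthresh = half * smss           # half the flight size, rounded down to a multiple of SMSS
    cwnd = (half + 3) * smss         # Eq. (3)
    n_d = pipe - cwnd // smss + 1    # Eq. (5): duplicate ACKs needed before sending resumes
    n_c = pipe - n_d                 # Eq. (7): critical number of lost segments (Type I)
    if type_ii:
        n_c -= 1                     # Eq. (8): one fewer in a Type II Failure
    return {"ssthresh": ssthresh, "cwnd": cwnd, "n_d": n_d, "n_c": n_c}


# DS1-32K example from the text: 23 outstanding segments, Type II Failure.
print(sack_fast_recovery_numbers(23, type_ii=True))
# {'ssthresh': 16060, 'cwnd': 20440, 'n_d': 10, 'n_c': 12}
```

With pipe = 23 the sketch reproduces the values quoted in the text: ssthresh of 16060 B, cwnd of 20440 B, and a critical number of 12 lost segments for the Type II case (13 for Type I).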
  If fewer than three segments follow the lost segments, fast
retransmit/fast recovery will not occur because there will not be enough
duplicate ACKs. This also leads to timeout. When the retransmission
timer expires TCP will transition into slow start, but in this case it
retransmits the first lost segment instead of the second. Although TCP
experiences different transitions in the above two cases, total
transmission time does not change dramatically because timeout is the
main factor. Hence, from 13 lost segments to 23 lost segments, all the
curves are similar to the 13-drop curve. We only include the 13-drop
curve in Figure 2 for clarity.
  If the network failure lasts long enough so that the segment
retransmitted due to timeout is also dropped, things change again. This
is because when the retransmitted segment is sent out, the
retransmission timer has been doubled. If the retransmission fails, the
sender will wait for twice the previous RTO before timing out and
retransmitting the earliest unacknowledged segment again. This
corresponds to the 24-drop curve in Figure 2. If the repeated
retransmission does not succeed, the sender has to wait for four times
the original RTO to retransmit a third time. This process goes on until
TCP gives up this connection. For clarity, we did not include the curves
of 25, 26, etc. dropped segments, but it is not difficult to imagine
what they should look like in Figure 2. Figure 3 presents TTI changes
vs. the number of dropped segments, and illustrates this trend.

2. NewReno TCP General Behavior

  For NewReno TCP, a similar experimental setup is used, but different
experimental results are obtained. As shown in Figure 4, a long NewReno
TCP session also starts at time 0 and the Packet Discarder begins to
drop packets at 30 seconds. The four curves in Figure 4 illustrate the
changes in the sender’s congestion window over time in four different
cases. Again, before the failure starting point (marked by “X”), the
four curves overlap each other; after point “X”, they transition into
four different curves.
  The curve labeled “0 Drop” is the same one illustrated in Section
4.A.1. Without network failures SACK and NewReno TCP behave in the same
way.
  Now we consider the case in which some segments are lost during a
failure. In this case, similar to the SACK scenario, if the sender can
get three duplicate acknowledgements, TCP will transition into fast
retransmit/fast recovery and set ssthresh and cwnd in the same way.
NewReno TCP does not set pipe because pipe only appears in SACK TCP. If
we use nC,NR to denote the critical number of lost segments when there
are just enough subsequent surviving segments in the window of data to
trigger three duplicate acknowledgements, it is easy to see that the
critical number is usually equal to (rbuff/SMSS - 3). Due to the TCP
acknowledging mechanism illustrated in Section 4.A.1, we can get the
following formula:

$$ n_{C,NR} = \begin{cases} \mathit{rbuff}/\mathit{SMSS} - 3 & \text{in Type I Failure} \\ \mathit{rbuff}/\mathit{SMSS} - 4 & \text{in Type II Failure} \end{cases} \tag{9} $$

Note that the NewReno DS1-32K example also illustrates a Type II
failure, thus nC,NR is (33580/1460) - 4 = 19.
  If no more than nC,NR segments in a window of data are lost, enough
surviving segments can arrive at the receiver and trigger enough
duplicate ACKs to make TCP transition into fast retransmit/fast
recovery. In this case, the earliest unacknowledged segment is
retransmitted and the retransmission leads to a partial ACK. The partial
ACK will then make the sender retransmit the earliest unacknowledged
segment at that moment. This retransmitted segment will result in
another partial ACK, and thus leads to another retransmission. This
process goes on until a non-duplicate ACK arrives acknowledging all data
that was outstanding when TCP transitioned into fast retransmit/fast
recovery; TCP then switches into congestion avoidance by setting cwnd
back to ssthresh. We should note that each time the sender gets a
partial ACK, it does one retransmission and thus recovers one lost
segment. That is, it takes NewReno TCP a whole RTT to recover one lost
segment. Thus in a sense, RTT determines the final TTI value. If RTT is
comparatively long, TTI increases dramatically with the number of lost
segments; otherwise, TTI remains almost unchanged. In the NewReno
DS1-32K case, RTT is relatively small, thus TTI does not increase much.
In this case nC,NR is 19, so from 1 lost segment to 19 lost segments,
all the curves are similar. For clarity, we only include the 19-drop
curve in Figure 4.
  SACK TCP has a different mechanism to deal with partial ACKs. In
Section 4.A.1, we have mentioned that pipe is decremented by one for
each additional duplicate ACK, but it is decreased by two rather than
one for each partial ACK. This additional decrease in pipe results in a
faster recovery process: one partial ACK leads to two retransmissions.
The two retransmissions will trigger another two partial ACKs and
eventually will lead to another four retransmissions. This process goes
on until a non-duplicate ACK arrives acknowledging all data that was
outstanding when TCP transitioned into fast retransmit/fast recovery.
Hence, within one RTT, usually many more lost segments can be recovered
with SACK TCP than with NewReno TCP. This is why with SACK TCP, TTI does
not increase much when no more than nC,S segments within one window are
lost, regardless of the length of RTT. In contrast, the TTI of NewReno
is influenced by RTT in this situation.
  On the other hand, if more than nC,NR segments in a window of data are
dropped, fast retransmit/fast recovery will not occur because there will
not be enough duplicate ACKs. This leads to timeout. When the
retransmission timer expires TCP will transition into slow start and
retransmit the first lost segment. In this scenario, timeout plays the
major role in terms of TTI, and thus the overall transmission time does
not increase much with the number of lost segments. In the NewReno
DS1-32K case, from 20 lost segments to 23 lost segments, all the curves
are similar. We only include the 20-drop curve in Figure 4 for clarity.
  If the network failure lasts long enough so that the segment
retransmitted due to timeout is also dropped, NewReno TCP will
experience the same doubling calculation that is illustrated in Section
4.A.1. A similar 24-drop curve is included in Figure 4. For clarity, we
did not include the curves of 25, 26, etc. dropped segments. Figure 5
illustrates the overall trend by presenting TTI changes vs. the number
of dropped segments.

B. Several Details of TCP Behavior

  The bandwidth-delay product, rτ, is commonly used to size the receive
buffer. In our simulations, the RTT for DS0, DS1 and OC-3c access is
210 ms, 41 ms and 26 ms respectively, so rτ has values of 1680, 7913 and
505440 bytes respectively. For each access link rate, we experimented
with 8 different receive buffer sizes, from 8 KB to 1024 KB. By 8 KB, we
mean a multiple of SMSS that is just above 8 KB. For example, in our
simulation, SMSS is 1460 bytes, so by 8 KB, we mean 1460 * 6 = 8760
bytes.

1. SACK TCP Details

  We have demonstrated that losing no more than nC,S segments typically
does not increase SACK TCP transmission time significantly. Losing
(nC,S + 1) segments makes a difference, and subsequent losses have
little impact until it comes to the loss of the retransmitted copy.
Knowing that pipe * SMSS = rbuff, we can easily translate nC,S into
(⌊rbuff/(2*SMSS)⌋ + 2) or (⌊rbuff/(2*SMSS)⌋ + 1). To link the number of
lost segments to outage duration, we define SACK Level-1 Fault Tolerance
Time (τ1,S) as the period from the moment network failure occurs to the
moment just before the segment following the dropped nC,S segments
arrives. We define SACK Level-2 Fault Tolerance Time (τ2,S) as the
period from the moment network failure occurs to the moment just before
the copy retransmitted due to timeout arrives. Thus τ1,S is the time
during which (⌊rbuff/(2*SMSS)⌋ + 2) or (⌊rbuff/(2*SMSS)⌋ + 1) segments
pass the failure point. In our simulations, this is mostly influenced by
the client access rate, which is essentially the bandwidth “r” used to
calculate the bandwidth-delay product, so that:

$$ \tau_{1,S} = \begin{cases} \big( (\lfloor \mathit{rbuff}/(2 \cdot \mathit{SMSS}) \rfloor + 2) \cdot \mathit{SMSS} \big) / r & \text{in Type I Failure} \\ \big( (\lfloor \mathit{rbuff}/(2 \cdot \mathit{SMSS}) \rfloor + 1) \cdot \mathit{SMSS} \big) / r & \text{in Type II Failure} \end{cases} \tag{10} $$

Obviously, τ1,S increases with rbuff. If rbuff is large enough so that
we can approximate τ1,S as follows:

$$ \tau_{1,S} = \big( \lfloor \mathit{rbuff}/(2 \cdot \mathit{SMSS}) \rfloor \cdot \mathit{SMSS} \big) / r \tag{11} $$

then τ1,S approximately doubles as rbuff doubles. This is illustrated in
Figures 6, 7 and 8. τ2,S is not as straightforward because it is mainly
related to RTO, and RTO is influenced by many factors [11]. RTO usually
increases with rbuff, and has a minimum value of 1 second. Thus, when
RTO is greater than 1 second, τ2,S increases with rbuff. This is
illustrated in Figures 6 and 7. When RTO is at its minimum of 1 second,
τ2,S does not change much and is independent of rbuff. This can be
observed in Figure 8. In any case, τ2,S is always greater than 1 second.
  Either τ1,S or τ2,S can be chosen as a restoration objective. If the
restoration can be finished within τ1,S, the overall transmission time
will not increase much in the case of network failures. If the
restoration time is in the range of τ1,S to τ2,S, the overall
transmission time is increased but it is guaranteed that the TTI is a
fixed value. Other thresholds can be defined on the basis of a third
timeout and so on. However, we know that τ2,S is certainly greater than
1 second. This is already much larger than the de facto target of 50 ms.
  An interesting observation is that TTI is not always greater than
outage duration. When the receive buffer is greater than rτ, some
segments will be buffered in the network. These buffered segments help
keep traffic flowing between the sender and the receiver in the case of
network failures. Thus TTI could be shorter than outage duration in some
scenarios. We can observe this trend in Figures 6 and 7.
  There are some exceptions to these typical cases. First, the 512 KB
and 1024 KB curves in Figure 6 illustrate situations in which very large
receive buffers lead to a calculated value of RTO that is greater than
the TCP-defined maximum of 64 seconds. This puts TCP into slow start
many times unnecessarily and dramatically changes the normal recovery
process. Thus the 512 KB and 1024 KB curves are very irregular.
Secondly, in Figure 8, we note that a 1024 KB buffer mostly leads to a
shorter TTI than does a 512 KB buffer. This is exceptional because
generally TTI increases with rbuff. However, the bandwidth-delay product
for OC-3c access is 505440 B, and after the failure ssthresh is set to
half the current flight size, which is around 256 KB in the case of a
512 KB buffer. Setting ssthresh to a value less than rτ hurts link
utilization and leads to a longer TTI. Thirdly, in Figure 8, we observe
that when outage duration is between τ1,S and τ2,S, TTI decreases
dramatically with outage duration. This is again the result of the large
value of rτ. When outage duration is in this range, the sender times out
and finally gets into congestion avoidance. In the case of OC-3c access,
cwnd increases with the number of lost segments (corresponding to longer
outage duration) when TCP
transitions into congestion avoidance. In this scenario, the network connection is not fully utilized after the failure because cwnd is always less than rτ, so a larger cwnd due to a longer failure time leads to a shorter TTI. Fourthly, in Figure 8, after τ1,S, TTI increases as the receive buffer increases from 8 KB to 256 KB and decreases as the receive buffer increases from 256 KB to 512 KB. We know that in the case of OC-3c access rτ is 505440 bytes. Hence, the curves for 8 KB to 256 KB are for receive buffer sizes less than rτ and those for 512 KB to 1024 KB are for sizes greater than rτ. A value for rbuff less than rτ leads to poorer link utilization and so to larger NTT [7]. NTT is the baseline value used to calculate TTI in Eq. (2). Thus, we have two different classes in terms of TTI, above and below rτ. They are essentially not comparable.

   The receive buffer size plays a significant role in SACK TCP resilience. First, from the viewpoint of network link utilization, receive buffers should be set to at least 2rτ to anticipate the case that one network failure takes place during the transmission. Anything lower results in impaired resilience. This is because even if a timeout occurs, ssthresh is still equal to at least rτ when it is initially set to 2rτ. Secondly, the receive buffer should be set as large as possible in order to increase τ1,S. Finally, when RTO is at its minimum of 1 second, receive buffer size does not affect τ2,S; when RTO is greater than 1 second, larger receive buffers lead to longer τ2,S. We can observe these trends in Figures 6, 7 and 8.

2. NewReno TCP Details

   We have illustrated that when fewer than nC,NR segments are lost, TTI is affected by RTT. Losing (nC,NR + 1) segments makes a difference, and subsequent losses have little impact until it comes to the loss of the retransmitted copy. We define NewReno Level-1 Fault Tolerance Time (τ1,NR) as the period from the moment of network failure to the moment just before the segment following the dropped nC,NR segments arrives. τ1,NR is the time during which (rbuff/SMSS - 3) or (rbuff/SMSS - 4) segments pass the failure point. In our simulation, it is mostly affected by the client access rate, which is essentially r, so:

$$
\tau_{1,NR} =
\begin{cases}
\dfrac{\left(\mathit{rbuff}/\mathit{SMSS} - 3\right) \cdot \mathit{SMSS}}{r} & \text{in Type I Failure} \\[2ex]
\dfrac{\left(\mathit{rbuff}/\mathit{SMSS} - 4\right) \cdot \mathit{SMSS}}{r} & \text{in Type II Failure}
\end{cases}
\tag{12}
$$

Clearly, τ1,NR increases with rbuff. If rbuff is large enough so that we can approximate τ1,NR as follows:

$$
\tau_{1,NR} = \mathit{rbuff} / r \tag{13}
$$

then τ1,NR approximately doubles as rbuff doubles. This is illustrated in Figures 9, 10 and 11. From Eq. (11) and (13), we conclude that τ1,NR is approximately twice as large as τ1,S when rbuff is large.

   We define NewReno Level-2 Fault Tolerance Time (τ2,NR) as the period from the moment network failure occurs to the moment just before the copy retransmitted due to timeout arrives. τ2,NR is essentially the same as τ2,S, so all conclusions about τ2,S also apply to τ2,NR.

   Either τ1,NR or τ2,NR can be chosen as a restoration objective. But with NewReno TCP, if the restoration can be finished within τ1,NR, the overall transmission time is influenced by RTT. If RTT is relatively small, the overall transmission time does not change much as restoration time increases; otherwise, TTI increases with restoration time. In Figures 9, 10 and 11, we observe that, when restoration time is less than τ1,NR, TTI does not change much in the DS0 scenario, but RTT plays a role in the DS1 scenario and TTI increases dramatically with outage duration in the OC-3c scenario. It is interesting that in the OC-3c scenario, for large receive buffers, restoration times longer than τ1,NR lead to better resilience (i.e. decreased TTI). In addition, TTI can be less than the outage duration in some cases owing to a large receive buffer; this can be observed in Figures 9 and 10.

   The exceptions due to large receive buffers, ssthresh halving and insufficient rτ presented in Section 4.B.1 also apply to NewReno TCP and can be observed in Figures 9, 10 and 11. The exception with SACK TCP, in which TTI decreases with outage duration when restoration is finished between τ1,S and τ2,S in the OC-3c access case, does not occur in NewReno TCP. With NewReno TCP, after τ1,NR there are only 3 or 4 segments left in the window of data; these segments do not make a significant change to TTI.

   For receive buffer sizing, all rules illustrated previously for SACK TCP also apply to NewReno TCP.

V. CONCLUSIONS

   The results presented here demonstrate that the traditional 50 ms restoration time is not suitable for TCP transport on the Internet. With SACK TCP, we found two restoration objectives, τ1,S and τ2,S. τ1,S is given by Eq. (10), and τ2,S is closely related to RTO. With NewReno TCP, we also found two restoration objectives, τ1,NR and τ2,NR. τ1,NR is given by Eq. (12), and τ2,NR is essentially the same as τ2,S. τ1,NR is approximately twice as large as τ1,S when rbuff is large. For SACK TCP, if restoration can be finished within τ1,S, TTI does not increase much with restoration time. For NewReno TCP, if network failures can be restored within τ1,NR, TTI is influenced by RTT. If RTT is relatively small, TTI does not change much; otherwise, TTI increases with restoration time. In addition, we find that TTI can be less than the outage duration in some cases owing to a large receive buffer.

   For low-rate access, we recommend τ1,S or τ1,NR to be the restoration objective. This is because in this situation
τ1,S or τ1,NR is the threshold after which TTI increases markedly. In the DS0 access scenario, both values are above 600 ms in all experimental cases. For high-rate access, τ2,S or τ2,NR is recommended because τ1,S or τ1,NR is probably too small to be realistically attainable, and any additional outage up to τ2,S or τ2,NR does not increase TTI significantly. In the OC-3c access scenario, both τ1,S and τ1,NR are below 15 ms in almost all experimental cases. Hence, a restoration objective of τ2,S or τ2,NR is appropriate; in the OC-3c access scenario, τ2,S and τ2,NR are approximately 1 s.

   We also find that receive buffers can be sized to meet various utilization and resilience goals. First, from the viewpoint of network link utilization, receive buffers should be set to at least 2rτ to allow for one network failure taking place during the transmission. Anything lower results in impaired resilience. This is because even if a timeout occurs, ssthresh is still equal to at least rτ when it is initially set to 2rτ. Secondly, the receive buffer should be set as large as possible in order to increase τ1,S and τ1,NR. Finally, when RTO stays at its minimum of 1 second, receive buffer size does not affect τ2,S and τ2,NR; when RTO is greater than 1 second, larger receive buffers lead to longer τ2,S and τ2,NR.

REFERENCES
[1] Transport Systems Generic Requirements (TSGR): Common Requirements. Generic Requirements: GR-499-CORE. Dec. 1998.
[2] Types and Characteristics of SDH Network Protection Architectures. ITU-T G.841. Oct. 1998.
[3] S. Mokbel. Canada's Optical Research and Education Network: CA*net3. Proceedings DRCN 2000, Munich, April 2000, pp. 10-32.
[4] J. Schallenburg. Is 50 ms Restoration Necessary? IEEE Bandwidth Management Workshop, Montebello, Quebec, Canada, Jun. 2001.
[5] J. Padhye and S. Floyd. On Inferring TCP Behavior. Proceedings of the 2001 conference on applications, technologies, architectures and protocols for computer communications.
[6] M. Allman and V. Paxson. RFC 2581: TCP Congestion Control. Apr. 1999.
[7] W.R. Stevens. TCP/IP Illustrated, Volume 1. Addison Wesley Press, Apr. 2000.
[8] M. Mathis, J. Mahdavi et al. RFC 2018: TCP Selective Acknowledgement Options. Oct. 1996.
[9] K. Fall and S. Floyd. Simulation-based Comparisons of Tahoe, Reno and SACK TCP. Computer Communication Review, V. 26 N. 3, Jul. 1996, pp. 5-21.
[10] S. Floyd and T. Henderson. RFC 2582: The NewReno Modification to TCP's Fast Recovery Algorithm. Apr. 1999.
[11] V. Paxson and M. Allman. RFC 2988: Computing TCP's Retransmission Timer. Nov. 2000.
[12] OPNET Modeler. Version: 8.1.A PL3. May 2002.
[13] LBNL Network Simulator. http://www-nrg.ee.lbl.gov/ns/.
[14] K. Ramakrishnan, S. Floyd et al. RFC 3168: The Addition of Explicit Congestion Notification (ECN) to IP. Sep. 2001.
[15] J. Moy. RFC 2328: OSPF Version 2. Apr. 1998.
[16] K. Lougheed and Y. Rekhter. RFC 1267: Border Gateway Protocol 3. Oct. 1991.
[17] N. Brownlee and K.C. Claffy. Understanding Internet traffic streams: dragonflies and tortoises. IEEE Communications Magazine, 40(10), Oct. 2002, pp. 110-117.
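As a rough numerical companion to Eq. (10), (12) and (13) and to the 2rτ buffer-sizing rule, the following Python sketch evaluates the Level-1 fault tolerance times and the recommended minimum receive buffer. It is only an illustration under stated assumptions: r is taken in bytes per second, the segment counts are treated as real numbers rather than rounded, and the function names, the 1460-byte SMSS and the 8000 B/s (DS0) rate in the usage lines are hypothetical choices that do not come from the simulations.

```python
def tau1_sack(rbuff, smss, r, failure_type="I"):
    """SACK Level-1 fault tolerance time in seconds, following Eq. (10).

    rbuff and smss are in bytes; r is the access rate in bytes per second
    (an assumed unit). failure_type is "I" or "II".
    """
    if failure_type not in ("I", "II"):
        raise ValueError("failure_type must be 'I' or 'II'")
    extra_segments = 2 if failure_type == "I" else 1
    return (rbuff / (2 * smss) + extra_segments) * smss / r


def tau1_newreno(rbuff, smss, r, failure_type="I"):
    """NewReno Level-1 fault tolerance time in seconds, following Eq. (12)."""
    if failure_type not in ("I", "II"):
        raise ValueError("failure_type must be 'I' or 'II'")
    fewer_segments = 3 if failure_type == "I" else 4
    return (rbuff / smss - fewer_segments) * smss / r


def tau1_newreno_approx(rbuff, r):
    """Large-buffer approximation of the NewReno time, following Eq. (13)."""
    return rbuff / r


def min_receive_buffer(r, rtt):
    """Recommended minimum receive buffer in bytes: 2 * r * tau.

    r is in bytes per second and rtt (tau) in seconds, so r * rtt is the
    bandwidth-delay product.
    """
    return 2 * r * rtt


if __name__ == "__main__":
    SMSS = 1460        # assumed segment size in bytes (illustrative only)
    DS0_RATE = 8000.0  # 64 kb/s expressed in bytes per second
    for kb in (8, 32, 128, 512):
        rbuff = kb * 1024
        print(kb, "KB:",
              "SACK", tau1_sack(rbuff, SMSS, DS0_RATE),
              "NewReno", tau1_newreno(rbuff, SMSS, DS0_RATE),
              "approx", tau1_newreno_approx(rbuff, DS0_RATE))
```

For large rbuff the ratio of the NewReno value to the SACK value approaches 2, which is the relationship stated in the conclusions. The same helper reproduces the buffer recommendation when given the bandwidth-delay product: with rτ = 505440 bytes for OC-3c access, the suggested minimum is twice that figure.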




Figure 1. Core Network Topology.
Figure 2. Congestion Window (KB) vs. Transmission Time (s) (SACK DS1-32K Case). Curves: 0 Drop, 12 Drops, 13 Drops, 24 Drops; the failure starting point is marked.
Figure 3. TTI vs. No. of Lost Segments (SACK DS1-32K Case).
Figure 4. Congestion Window (KB) vs. Transmission Time (s) (NewReno DS1-32K Case). Curves: 0 Drop, 19 Drops, 20 Drops, 24 Drops; the failure starting point is marked.
Figure 5. TTI vs. No. of Lost Segments (NewReno DS1-32K Case).
Figures 6 to 11 plot Transmission Time Increase (s) against Outage Duration (s), with one curve per receive buffer size (8k, 16k, 32k, 64k, 128k, 256k, 512k and 1024k).
Figure 6. TTI vs. Outage Duration (SACK DS0 Access).
Figure 7. TTI vs. Outage Duration (SACK DS1 Access).
Figure 8. TTI vs. Outage Duration (SACK OC-3c Access).
Figure 9. TTI vs. Outage Duration (NewReno DS0 Access).
Figure 10. TTI vs. Outage Duration (NewReno DS1 Access).
Figure 11. TTI vs. Outage Duration
                                                                                                                                                                                                                                                    (NewReno OC-3c Access).

				