# TCP Congestion Control Algorithms and Analysis

```
TCP Congestion Control:
Algorithms and Analysis
Simon S. Lam
Department of Computer Sciences
The University of Texas at Austin

TCP Congestion Control (SSL)   1

Little’s Law

average population = (average delay) x (throughput)

average delay = \frac{1}{N} \sum_{i=1}^{N} delay_i
where N is number of departures
throughput = N/T
where T is duration of observation
average population (to be defined)

Try homework problem at
http://www.cs.utexas.edu/users/lam/cs356/homework/hw2.html

[Figure: number in system n(t) versus time t]

average population = \frac{1}{T} \int_0^T n(t) \, dt

TCP sliding window protocol
Pre-1988
Go-back-N ARQ
Sender detects loss from timeout
Retransmits from lost packet onward
Window flow control
Prevents overflow at receive buffer of
destination host
Self-clocking: sender uses ACKs to trigger new
transmissions

Window Flow Control

[Figure: timing diagram; the source sends packets 1, 2, ..., W in each RTT and the destination returns ACKs]

~ W packets per RTT when no loss
Lost packet detected by missing ACK
(note: timeout value TO > RTT)

Throughput (sending rate)
Limit the number of unacked transmitted
packets in the network to window size W

Throughput \approx \frac{W}{RTT} packets/sec = \frac{W \times MSS}{RTT} bytes/sec

Where did we apply Little’s Law?

Clarifications
Average number in the send buffer is typically
less than W unless packet arrival rate to send
buffer is infinite -> previous formula provides a
throughput upper bound

If each packet may be lost with probability p, then the
average delay is
(1 − p) × RTT + p × TO
Since TO > RTT, actual throughput is smaller.
With loss, goodput is
(1 − p ) × throughput

Note: in some papers and other context (e.g., random
access protocols), the term throughput is called send
rate and goodput is called throughput.

Effect of Congestion
W too big for each of many flows -> congestion
Packet loss -> transmissions on links prior to packet
loss are wasted
Congestion collapse due to too many retransmissions
and too much waste
October 1986, Internet had its first congestion
collapse
[Figure: goodput plot]

TCP Window Control

Network congestion control
It infers available network capacity from “loss
indications”
cwnd:     congestion window

Sender sets W = min (cwnd, rwnd)

Size of rwnd indicates available space in
receive buffer
decreased when data is received from IP layer and
ack’d
increased when data is consumed by application
process

Network Congestion Control
Sender calculates cwnd from indications of
network congestion
Congestion indications
timeout (loss)
dupACK (loss likely)
queueing delay
mark (needs ECN)
TCP algorithms to calculate cwnd
Tahoe, Reno, Vegas, …
AQM: RED, REM …

TCP & AQM

[Figure: source rates x_i(t) and link congestion measures p_l(t)]

Example congestion measure p_l(t)
Loss (DropTail)
Queue length (RED)
Queueing delay (Vegas)
Price (REM)

Equilibrium – duality theory
Congestion control problem
\max_{x_s \ge 0} \sum_s U_s(x_s)
subject to x^l \le c_l, \forall l \in L
Primal-dual algorithm
x(t + 1) = F(p(t), x(t))    (Reno, Vegas)
p(t + 1) = G(p(t), x(t))    (DropTail, RED, REM)

TCP, AQM protocols (F, G)
Maximize aggregate source utility
With different utility functions U_s(x_s)

TCP Congestion Control
Tahoe (Jacobson 1988)
Slow Start
Congestion Avoidance
Fast Retransmit
Reno (Jacobson 1990)
Fast Recovery
Its variants: NewReno, SACK
Vegas (Brakmo & Peterson 1994)
New Congestion Avoidance
AQM
RED (Floyd & Jacobson 1993)
• Probabilistic marking or dropping
REM (Athuraliya & Low 2000)
• Clear buffer, match rate
Others…

Slow Start
On each successful ACK, increment cwnd
cwnd ← cwnd + 1
Exponential growth of cwnd
each RTT: cwnd ← 2 x cwnd

Enter CA when cwnd >= ssthresh

For initial slow start, ssthresh is set to a very large
value (e.g., 65 Kbytes)

Note: for clarity, cwnd, rwnd, and ssthresh are
counted in packets (segments) rather than in bytes

Slow Start
[Figure: slow-start timing diagram of data packets and ACKs; cwnd goes 1, 2, 3-4, 5-8 in successive RTTs]
cwnd ← cwnd + 1 (for each ACK)

Congestion Avoidance
cwnd ≥ ssthresh
On each successful ACK:
cwnd ← cwnd + 1/cwnd
Linear growth of cwnd
each RTT:
cwnd ← cwnd + 1
[Figure: congestion-avoidance timing diagram of data packets and ACKs; cwnd goes 1, 2, 3, 4 in successive RTTs]

Packet Loss
Assumption: loss indicates congestion
Packet loss detected by
Retransmission timeout (RTO timer)
Duplicate ACKs (at least 3)

Packets:           1  2  3  4  5  6  7
Acknowledgements:  1  2  3     3  3  3

Fast Retransmit
A timeout is quite long (> RTT)
Upon receiving 3 dupACKs, immediately
retransmit without waiting for timeout

ssthresh ← max(flightsize/2, 2)
where flightsize is number of outstanding packets,
which may be less than W = min(rwnd, cwnd)

Enter Slow Start (cwnd = 1)

TCP Tahoe (Jacobson 1988)
[Figure: cwnd versus time (in RTTs), alternating SS and CA phases]
SS: Slow Start
CA: Congestion Avoidance

Successive Timeouts
When there is another timeout, double the
timeout value
Keep doing so for each additional loss retransmission
Exponential backoff up to max timeout value equal
to 64 times initial timeout value

Note: red line in figure denotes a loss indication

Summary: Tahoe
Basic ideas
Probe network for spare capacity during SS and
CA and increase send rate
Drastically reduce rate on congestion indication
Self-clocking
Error recovery by retransmission
Round trip time estimation (to get TO value)
for every ACK {
    if (W < ssthresh) then W++      (SS)
    else W += 1/W                   (CA)
}
for every loss indication {
    ssthresh = W/2
    W = 1
}

TCP Tahoe (Jacobson 1988)
[Figure: cwnd versus time, alternating SS and CA phases]
SS: Slow Start
CA: Congestion Avoidance

TCP Reno (Jacobson 1990)
[Figure: cwnd versus time; SS, then CA with fast retransmission/fast recovery]
SS: Slow Start
CA: Congestion Avoidance

TCP Reno (another scenario)
[Figure (credit: Nick McKeown): cwnd versus time t; initial slow start, cwnd halved on 3 dupACKs, then a timeout (TO) followed by slow start until cwnd reaches ssthresh]

Fast recovery

Idea: each dupACK represents a packet
successfully received. Therefore, no need for
very drastic action
Enter FR/FR after 3 dupACKs
Set ssthresh ← max(flightsize/2, 2)
Retransmit lost packet
Set cwnd ← ssthresh + #dupACKs (window inflation)
Wait till W=min(rwnd, cwnd) is large enough; transmit
new packet(s)
On non-dup ACK (1 RTT later), set cwnd ← ssthresh
(window deflation)
Enter CA

Example: FR/FR

S           1 2 3 4 5 6 7 8     1     9 10 11
R           0 0 0 0 0 0 0       8
cwnd        8     7  9  11  4 (on exit from FR/FR)
ssthresh          4  4  4   4

Above scenario: Packet 1 is lost, packets 2, 3, and
4 are received; dupACKs with seq. no. 0 returned
Fast retransmit
Retransmit on 3 dupACKs
Fast recovery
Inflate window such that new packets 9, 10, and 11 can be
sent while repairing loss

Summary: Reno
Basic ideas
Fast recovery avoids slow start
dupACKs: fast retransmit + fast recovery
Timeout: retransmit + slow start

[Figure: state diagram; congestion avoidance moves to FR/FR on dupACKs, and to retransmit and slow start on timeout]

[Figure: congestion window (axis marks at 8, 16, and 24 Kbytes) versus time for a long-lived TCP connection; cwnd increases by 1 MSS every RTT in the absence of any loss event and is cut in half after 3 dupACKs]

TCP throughput (send rate)

We derived the approximate formula

throughput = \frac{W}{RTT} packets/sec
W changes with the arrival of each
congestion indication
To calculate (average) send rate, we need
the average value of W
Q: W is a function of what parameter?

First approximation
M. Mathis, et al., “The Macroscopic Behavior of the TCP Congestion
Avoidance Algorithm,” ACM Computer Communication Review, 27(3), 1997.
No slow-start, no timeout, long-lived TCP
connection
Independent identically distributed “periods”
Each packet may be lost with probability p

Geometric Distribution
Ave. no. of transmissions to get first loss
n = \sum_{i=1}^{\infty} i b_i = \sum_{i=1}^{\infty} i (1 - p)^{i-1} p
  = p \sum_{i=1}^{\infty} i (1 - p)^{i-1}
  = -p \frac{d}{dp} \sum_{i=1}^{\infty} (1 - p)^i = -p \frac{d}{dp} \sum_{i=0}^{\infty} (1 - p)^i
  = -p \frac{d}{dp} \frac{1}{1 - 1 + p} = p \frac{1}{p^2}
  = 1/p

Similarly, ave. no. of transmissions to get first success is
1/(1-p)

First approximation (cont.)
Average number of packets delivered in one period (area under one saw-tooth):
(\frac{W}{2})^2 + \frac{1}{2} (\frac{W}{2})^2 = \frac{3}{8} W^2
Average number of packets sent per period (incl. loss at the end) is 1/p
Equating the two and solving for W, we get
W = \sqrt{\frac{8}{3p}}
sending rate (in packets/sec)
= \frac{no. of packets/period}{time per period} = \frac{\frac{3}{8} W^2}{RTT (\frac{W}{2})}
= \frac{1/p}{RTT \sqrt{\frac{2}{3p}}} = \frac{1}{RTT} \sqrt{\frac{3}{2p}}

TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver -> TCP receiver action

Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed
-> Delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK

Arrival of in-order segment with expected seq #. One other segment has ACK pending
-> Immediately send single cumulative ACK, ACKing both in-order segments

Arrival of out-of-order segment with higher-than-expected seq. #. Gap detected
-> Immediately send duplicate ACK, indicating seq. # of next expected byte

Arrival of segment that partially or completely fills gap
-> Immediately send ACK, provided that segment starts at lower end of gap

Receiver sends one ACK for every two packets
received -> each saw-tooth is W x RTT wide
-> area under a saw-tooth is \frac{3}{4} W^2
Throughput is \frac{1}{RTT} \sqrt{\frac{3}{4p}}
One ACK for every b packets received ->
throughput is \frac{1}{RTT} \sqrt{\frac{3}{2bp}}

Modeling TCP Throughput: A Simple
Model and its Empirical Validation
Jitendra Padhye, Victor Firoiu, Don Towsley, and Jim Kurose
Proc. ACM SIGCOMM, 1998

Motivation
Previous formulas not so accurate when
loss rates are high
TCP traces show that there are more loss
indications due to timeouts (TO) than due
to triple dupACKs (TD)

Objectives
Throughput formula as a function of loss rate and RTT
that also accounts for TO behavior of a
TCP connection
Formula applicable over a wider range of
loss rates
Explicit statements of assumptions and
approximations used in derivation of
throughput formula
Formula to include the impact of a small
rwnd

Many assumptions and approximations
A1. TCP sender is saturated, i.e., source
application process always has a packet to
send when send window has space available
i.e., bulk transfer application
A2. Slow Start not modeled
A3. Time to send all packets in a window is
smaller than RTT
i.e., transmission rate is not too low

A3. Time to send W packets is less than RTT
• ACK reception marks the end of current round and beginning of next round.
• Approximation: For b > 1, ACK is not immediately after one RTT, but it is so assumed in the analysis
[Figure: time-space diagram marking start of round and end of round]

AIMD evolution of Window Size over time

Each TD period is ended by a TD loss indication.
TDPi period has duration Ai rounds
A4. Duration of a round (RTT) is independent of
window size
approximation (poor for a slow line)
A5. No window inflation in Fast Recovery
approximation

Markov regenerative assumption
For the i-th TD period, Wi is window size at
the end of the period, Yi is the number of
packets sent in the period
A6. Assume {Wi} to be a Markov
regenerative process with rewards {Yi}
Given A6, the steady-state TCP throughput
is
B = \lim_{t \to \infty} B_t = \lim_{t \to \infty} \frac{N_t}{t} = \frac{E[Y_i]}{E[A_i]} = \frac{E[Y]}{E[A]}

Consider i-th TD period

[Figure: packets sent per round in the i-th TD period]

One ACK after receiving b packets (b = 2 in above
figure) -> linear increase has a slope of 1/b packet per
RTT
Number of rounds is X_i + 1
α_i is the first packet lost in i-th TD period

Loss assumptions
A7. Losses in different rounds are
independent
approximation
A8. Losses within the same round are
correlated as follows: If a packet is lost,
all remaining packets transmitted until the
end of that round are also lost
approximation – bursty loss behavior but only
within the same round
all lost packets in the same round are counted
as a single loss indication when estimating p

AIMD throughput derivation (1)
Notation:
α  seq. no. of first loss
r  round trip time
Y  no. of packets sent
W  window size
X  no. of rounds
A  time duration of a period

E[α] = 1/p
E[r] = RTT
E[Y] = E[α] + E[W] - 1 = \frac{1}{p} - 1 + E[W]
From W_i = \frac{W_{i-1}}{2} + \frac{X_i}{b}, we have
E[X] = \frac{b}{2} E[W]
E[A] = (E[X] + 1) E[r] = (\frac{b}{2} E[W] + 1) RTT    <- from A4 that round trip times are independent of W_i
send rate B = \frac{E[Y]}{E[A]} = \frac{\frac{1}{p} - 1 + E[W]}{(\frac{b}{2} E[W] + 1) RTT}

AIMD throughput derivation (2)
Another way to compute E[Y]
Y_i = \sum_{k=0}^{X_i/b - 1} (\frac{W_{i-1}}{2} + k) b + β_i
    = \frac{X_i W_{i-1}}{2} + \frac{X_i}{2} (\frac{X_i}{b} - 1) + β_i
    = \frac{X_i}{2} (W_i + \frac{W_{i-1}}{2} - 1) + β_i
Let E[β] be E[\frac{W}{2}] and we have
E[Y] = \frac{E[X]}{2} (E[W] + \frac{E[W]}{2} - 1) + E[β]
     = \frac{b E[W]}{4} (E[W] + \frac{E[W]}{2} - 1) + E[\frac{W}{2}]
<- A9. Assume that {X_i} and {W_i} are mutually independent i.i.d. sequences of random variables

AIMD throughput (3)
<- Equate the two previous formulas for E[Y]. Solve the resulting equation with E[W] as the only unknown
E[W] = \sqrt{\frac{8}{3bp}} + o(1/\sqrt{p})
E[X] = \frac{b}{2} E[W] = \sqrt{\frac{2b}{3p}} + o(1/\sqrt{p})
send rate B(p) = \frac{1/p + o(1/p)}{RTT (\sqrt{\frac{2b}{3p}} + o(1/\sqrt{p}))} \approx \frac{1}{RTT} \sqrt{\frac{3}{2bp}} + o(1/\sqrt{p})
same as before!

AIMD with TO

Let ni denote the number of TD periods within a
cycle ending in i-th TO period, Ri denote no. of
retransmissions in i-th TO period
A10. {ni } form an i.i.d. sequence, independent of
{Yij} and {Aij}

Throughput of AIMD with TO (1)
E[M] = E[n] E[Y] + E[R]
E[S] = E[n] E[A] + E[Z^{TO}]
(Assumption of Markov regenerative process again.)
send rate B = \frac{E[M]}{E[S]} = \frac{E[n] E[Y] + E[R]}{E[n] E[A] + E[Z^{TO}]}
B = \frac{E[Y] + Q \times E[R]}{E[A] + Q \times E[Z^{TO}]}
where Q = \frac{1}{E[n]}    <- Probability that a given loss indication is a TO
E[R] = \frac{1}{1 - p}
with Q and E[Z^{TO}] to be determined

Approximate solution for Q

The event that a given loss indication is a TO is the union of two
events: two or fewer acked packets in the penultimate
round, or two or fewer acked packets in the final round

Approximate solution for Q (cont.)
A(w, k) = \frac{(1 - p)^k p}{1 - (1 - p)^w}    <- penultimate round of w packets, first k packets ack’d given there is a loss
C(k, m) = (1 - p)^m p,    m \le k - 1    <- for last round, k packets sent, m packets ack’d in sequence
C(k, m) = (1 - p)^k,      m = k
Q(w) = 1    if w \le 3    <- at most 2 dupACKs
     = \sum_{k=0}^{2} A(w, k) + \sum_{k=3}^{w} A(w, k) \sum_{m=0}^{2} C(k, m)    if w \ge 4
<- probability of fewer than 3 packets sent successfully in penultimate round or less than 3 acks in last round

Approximate solution for Q (cont.)

Q is E[Q(w)]
But we don’t know the probability distribution of W_i
Approximation
Q \approx Q(E[W]) \approx \min(1, \frac{3}{E[W]}) \approx \min(1, 3 \sqrt{\frac{3bp}{8}})

Throughput of AIMD with TO (2)
P[R = k] = p^{k-1} (1 - p)    for k = 1, 2, ...
L_k = (2^k - 1) TO            for k \le 6    <- duration of k TOs in a row
    = (63 + 64 (k - 6)) TO    for k \ge 7
E[Z^{TO}] = TO \frac{1 + p + 2p^2 + 4p^3 + 8p^4 + 16p^5 + 32p^6}{1 - p} = TO \frac{f(p)}{1 - p} \approx TO (1 + 32p^2)    <- approximation
send rate B(p) = \frac{E[Y] + Q \times E[R]}{E[A] + Q \times E[Z^{TO}]}
B(p) \approx \frac{\frac{1-p}{p} + E[W] + Q(E[W]) \frac{1}{1-p}}{RTT (E[X] + 1) + Q(E[W]) TO \frac{f(p)}{1-p}}

Throughput of AIMD with TO (3)
B(p) \approx \frac{\frac{1-p}{p} + E[W] + Q(E[W]) \frac{1}{1-p}}{RTT (E[X] + 1) + Q(E[W]) TO \frac{f(p)}{1-p}}    <- Eq. (27), more accurate version of throughput formula
     \approx \frac{1/p}{RTT \sqrt{\frac{2b}{3p}} + \min(1, 3 \sqrt{\frac{3bp}{8}}) (1 + 32p^2) TO}
     = \frac{1}{RTT \sqrt{\frac{2bp}{3}} + \min(1, 3 \sqrt{\frac{3bp}{8}}) p (1 + 32p^2) TO}    <- Eq. (29), most well-known version of throughput formula

Full model Eq. (31)
Compute E[W]. If E[W] < Wmax, use Eq. (27):
B(p) \approx \frac{\frac{1-p}{p} + E[W] + Q(E[W]) \frac{1}{1-p}}{RTT (E[X] + 1) + Q(E[W]) TO \frac{f(p)}{1-p}}    if E[W] < Wmax,
otherwise, use Wmax for E[W] and recompute E[X] (derivation omitted):
B(p) \approx \frac{\frac{1-p}{p} + Wmax + Q(Wmax) \frac{1}{1-p}}{RTT (\frac{b}{8} Wmax + \frac{1-p}{p Wmax} + 2) + Q(Wmax) TO \frac{f(p)}{1-p}}

limitation—approximate model
Use the well-known Eq. (29) from before,

B(p) \approx \min( \frac{Wmax}{RTT}, \frac{1}{RTT \sqrt{\frac{2bp}{3}} + \min(1, 3 \sqrt{\frac{3bp}{8}}) p (1 + 32p^2) TO} )

which is referred to as Eq. (32)

Summary data from traces (1 hour)
Saturated TCP sender
p computed from dividing total no. of loss indications by
total number of packets sent
RTT and TO values are averaged over entire 1-hour trace

Summary data from 100s traces

Each row represents results of 100 traces each of
100 seconds in duration for same S-D pair
Totals are cumulative over 100 traces
RTT and TO are average values over 100 traces for
same S-D pair

Experimental comparison (1)

Each point represents number of packets in 100s interval of trace
T0 ~ single TO, T1 ~ at least 1 double TO in trace, etc.
“TD Only” is analytic model by Mathis et al.
Note: Wmax is only 6 in Figure 7

Experimental comparison (2)

[Figure: two panels, Wmax = 33 and Wmax = 44]

Experimental comparison (3)

[Figure: two panels, Wmax = 8 and Wmax = 48]

Accuracy of approximate model

Figure 18: manic to spiff, with predictions by both full and
approximate models (Wmax = 32)

Average errors

ave. error = \frac{\sum_{observations} \frac{|N_{predicted} - N_{observed}|}{N_{observed}}}{no. of observations}

Conclusions
A much more rigorous analysis than the one by
Mathis et al.
Numerous assumptions and approximations used
but (almost) all of them are explicitly stated
Large amount of experimental measurements on
the Internet to validate accuracy of the full model
(less for the approximate model)
Throughput formula accounts for loss indications
due to TO as well as rwnd restriction
Using the formula requires accurate measurements of
loss rate and RTT values (which could be tricky)
For TCP Reno and drop-tail router
Accuracy (like beauty) is in the eye of the
beholder. What do you think?

TCP Throughput limited by loss rate
TCP average throughput (approximate) in
terms of loss rate, p:
\frac{1.22 \cdot MSS}{RTT \sqrt{p}}
Example: 1500-byte segments, 100ms RTT,
to get 10 Gbps throughput, loss rate needs
to be very low
p = 2 x 10^{-10}
New version of TCP needed for connections
with high bandwidth-delay product
addressed in paper by Katabi et al.

The End

```
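
The Tahoe window rules summarized in the slides (W++ per ACK in slow start, W += 1/W per ACK in congestion avoidance, and ssthresh = W/2 with W = 1 on a loss indication) can be sketched in Python as follows. This is a minimal illustration and not code from the lecture: the function names, the one-round-per-RTT loop, and the assumption that every packet sent in a round is ACKed are hypothetical simplifications, and windows are counted in packets as in the slides.

```python
def tahoe_on_ack(w: float, ssthresh: float) -> float:
    """Window after one successful ACK (window counted in packets)."""
    if w < ssthresh:
        return w + 1.0        # slow start: +1 per ACK, doubles each RTT
    return w + 1.0 / w        # congestion avoidance: about +1 per RTT


def tahoe_on_loss(w: float) -> tuple[float, float]:
    """Return (new window, new ssthresh) after a loss indication."""
    return 1.0, w / 2.0


def simulate(rounds: int, loss_rounds: set[int], ssthresh: float = 64.0) -> list[float]:
    """Window at the start of each round, where one round is one RTT.

    Assumption (illustrative only): all packets of a round are ACKed in
    that round, and a loss indication arrives at the end of every round
    listed in loss_rounds.
    """
    w = 1.0
    history = []
    for r in range(rounds):
        history.append(w)
        if r in loss_rounds:
            w, ssthresh = tahoe_on_loss(w)
            continue
        for _ in range(int(w)):   # one ACK per packet sent this round
            w = tahoe_on_ack(w, ssthresh)
    return history
```

For example, `simulate(5, {3}, ssthresh=8.0)` returns `[1.0, 2.0, 4.0, 8.0, 1.0]`: the window doubles each round during slow start and falls back to 1 after the loss indication in round 3.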
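
The two closed-form send-rate formulas in the slides, the TD-only result (1/RTT) sqrt(3/(2bp)) and the approximate model of Eq. (29) with the Wmax cap of Eq. (32), can be evaluated numerically with a short Python sketch. The function and parameter names below are hypothetical choices for illustration; rates are in packets/sec, and `rtt` and `to` are in seconds.

```python
import math


def mathis_rate(p, rtt, b=1):
    """TD-only send rate in packets/sec: (1/RTT) * sqrt(3 / (2 b p))."""
    return math.sqrt(3.0 / (2.0 * b * p)) / rtt


def approx_rate(p, rtt, to, b=2, wmax=None):
    """Approximate model, Eq. (29); capped at wmax/rtt as in Eq. (32)."""
    td_term = rtt * math.sqrt(2.0 * b * p / 3.0)
    q = min(1.0, 3.0 * math.sqrt(3.0 * b * p / 8.0))  # approx. P[loss indication is a TO]
    to_term = to * q * p * (1.0 + 32.0 * p ** 2)
    rate = 1.0 / (td_term + to_term)
    if wmax is not None:
        rate = min(wmax / rtt, rate)
    return rate


def loss_for_target(bits_per_sec, mss_bytes, rtt, b=1):
    """Loss probability at which mathis_rate reaches the target bit rate."""
    pkts_per_sec = bits_per_sec / (8.0 * mss_bytes)
    return 3.0 / (2.0 * b * (pkts_per_sec * rtt) ** 2)
```

With 1500-byte segments and a 100 ms RTT, `loss_for_target(10e9, 1500, 0.1)` gives roughly 2.2e-10, in line with the p = 2 x 10^{-10} example in the slides. Setting `to=0.0` in `approx_rate` removes the timeout term, so it reduces to `mathis_rate` for the same `b`.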