Transport Layer Computer Networking
Section 3: Transport Layer
Our goals:
r understand principles behind transport layer services:
m multiplexing/demultiplexing
m reliable data transfer
m flow control
m congestion control
r instantiation and implementation in the Internet
Overview:
r transport layer services
r multiplexing/demultiplexing
r connectionless transport: UDP
r principles of reliable data transfer
r connection-oriented transport: TCP
m reliable transfer
m flow control
m connection management
r principles of congestion control
r TCP congestion control
3: Transport Layer 3-1
Transport services and protocols
r provide logical communication between app processes running on different hosts
r transport protocols run in end systems
r transport vs network layer services:
r network layer: data transfer between end systems
r transport layer: data transfer between processes
m relies on, enhances, network layer services
[Figure: full protocol stacks (application, transport, network, data link, physical) at the two end systems; network, data link, physical layers only at the nodes in between]

Transport-layer protocols
Internet transport services:
r reliable, in-order unicast delivery (TCP)
m congestion control
m flow control
m connection setup
r unreliable (“best-effort”), unordered unicast or multicast delivery: UDP
r services not available:
m real-time
m bandwidth guarantees
m reliable multicast

Multiplexing/demultiplexing
Recall: segment - unit of data exchanged between transport layer entities
m aka TPDU: transport protocol data unit
Demultiplexing: delivering received segments to correct app layer processes
[Figure: processes P1 and P2 on the sending hosts, P3 and P4 on the receiver; application-layer data M gets a segment header Ht at the transport layer and a header Hn at the network layer]

Multiplexing/demultiplexing
Multiplexing: gathering data from multiple app processes, enveloping data with header (later used for demultiplexing)
multiplexing/demultiplexing:
r based on sender, receiver port numbers, IP addresses
m source, dest port #s in each segment
m recall: well-known port numbers for specific applications
TCP/UDP segment format (32 bits wide): source port #, dest port #; other header fields; application data (message)

Multiplexing/demultiplexing: examples
port use: simple telnet app
r host A -> server B: source port: x, dest. port: 23
r server B -> host A: source port: 23, dest. port: x
port use: Web server
r Web client host A -> Web server B: Source IP: A, Dest IP: B, source port: x, dest. port: 80
r Web client host C -> Web server B: Source IP: C, Dest IP: B, source port: y, dest. port: 80
r Web client host C -> Web server B: Source IP: C, Dest IP: B, source port: x, dest. port: 80

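The Web server example can be sketched as a table lookup. This is only an illustration: the table layout, the function name `demultiplex`, and the concrete values chosen for the placeholder ports x and y are assumptions, not part of the slides.

```python
from typing import Dict, Tuple

# (source IP, source port, dest IP, dest port), the fields used for demultiplexing
FourTuple = Tuple[str, int, str, int]

def demultiplex(table: Dict[FourTuple, str], src_ip: str, src_port: int,
                dst_ip: str, dst_port: int) -> str:
    """Return the name of the process that should receive the segment."""
    return table.get((src_ip, src_port, dst_ip, dst_port), "no matching socket")

# x and y are placeholders on the slide; 1025 and 1026 are made-up values.
X, Y = 1025, 1026
connections = {
    ("A", X, "B", 80): "server process for host A",
    ("C", Y, "B", 80): "server process 1 for host C",
    ("C", X, "B", 80): "server process 2 for host C",
}

# Same source port x and same destination port 80, but different source IPs,
# so the two segments reach different processes.
to_a = demultiplex(connections, "A", X, "B", 80)
to_c = demultiplex(connections, "C", X, "B", 80)
```

The point of the sketch is that the destination port alone does not identify the receiving process here; the source IP and source port are part of the key.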
UDP: User Datagram Protocol [RFC 768]
r “no frills,” “bare bones” Internet transport protocol
r “best effort” service, UDP segments may be:
m lost
m delivered out of order to app
r connectionless:
m no handshaking between UDP sender, receiver
m each UDP segment handled independently of others
Why is there a UDP?
r no connection establishment (which can add delay)
r simple: no connection state at sender, receiver
r small segment header
r no congestion control: UDP can blast away as fast as desired

UDP: more
r often used for streaming multimedia apps
m loss tolerant
m rate sensitive
r other UDP uses (why?):
m DNS
m SNMP
r reliable transfer over UDP: add reliability at application layer
m application-specific error recovery!
UDP segment format (32 bits wide): source port #, dest port #; length, checksum; application data (message)
m length: length, in bytes, of UDP segment, including header

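As a rough illustration of the segment format, the four 16-bit header fields can be packed with Python's `struct` module. The helper names and the port numbers are made up for the example, and the checksum is left at 0 as a placeholder rather than computed.

```python
import struct

UDP_HEADER = "!HHHH"  # network byte order: source port, dest port, length, checksum

def build_udp_segment(src_port: int, dst_port: int, payload: bytes,
                      checksum: int = 0) -> bytes:
    """Prepend an 8-byte UDP header to the payload (checksum is a placeholder)."""
    length = 8 + len(payload)  # length covers header plus data, in bytes
    return struct.pack(UDP_HEADER, src_port, dst_port, length, checksum) + payload

def parse_udp_header(segment: bytes):
    """Return (source port, dest port, length, checksum) from a segment."""
    return struct.unpack(UDP_HEADER, segment[:8])

segment = build_udp_segment(5353, 53, b"hello")
fields = parse_udp_header(segment)
```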
UDP checksum
Goal: detect “errors” (e.g., flipped bits) in transmitted segment
Sender:
r treat segment contents as sequence of 16-bit integers
r checksum: addition (1’s complement sum) of segment contents
r sender puts checksum value into UDP checksum field
Receiver:
r compute checksum of received segment
r check if computed checksum equals checksum field value:
m NO - error detected
m YES - no error detected. But maybe errors nonetheless? More later ….

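A minimal sketch of the sender and receiver steps follows. It works on raw bytes only and ignores the pseudo-header that real UDP also covers; following RFC 768, the stored value is the 1's complement of the 1's complement sum. The function names are illustrative.

```python
def ones_complement_sum16(data: bytes) -> int:
    """Add the data as 16-bit integers, wrapping any carry back into the sum."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total

def checksum16(data: bytes) -> int:
    """Sender side: complement of the 1's complement sum."""
    return ~ones_complement_sum16(data) & 0xFFFF

def verify(data: bytes, checksum_field: int) -> bool:
    """Receiver side: recompute and compare with the checksum field."""
    return checksum16(data) == checksum_field

original = b"\x12\x34\x56\x78"
field = checksum16(original)
flipped = b"\x12\x35\x56\x78"   # one bit flipped: detected
swapped = b"\x56\x78\x12\x34"   # two 16-bit words swapped: same sum, not detected
```

The `swapped` case is one concrete answer to "But maybe errors nonetheless?": reordering 16-bit words leaves the sum unchanged.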
Principles of Reliable data transfer
r important in app., transport, link layers
r top-10 list of important networking topics!
r characteristics of unreliable channel will determine
complexity of reliable data transfer protocol (rdt)
Reliable data transfer: getting started
send side:
r rdt_send(): called from above (e.g., by app.). Passed data to deliver to receiver upper layer
r udt_send(): called by rdt, to transfer packet over unreliable channel to receiver
receive side:
r rdt_rcv(): called when packet arrives on rcv-side of channel
r deliver_data(): called by rdt to deliver data to upper layer

Reliable data transfer: getting started
We’ll:
r incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
r consider only unidirectional data transfer
m but control info will flow in both directions!
r use finite state machines (FSM) to specify sender, receiver
state: when in this “state”, next state uniquely determined by next event
[Figure: state 1 -> state 2; the transition arrow is labeled with the event causing the state transition (above the line) and the actions taken on the state transition (below the line)]

Rdt1.0: reliable transfer over a reliable channel
r underlying channel perfectly reliable
m no bit errors
m no loss of packets
r separate FSMs for sender, receiver:
m sender sends data into underlying channel
m receiver reads data from underlying channel
sender (single state: Wait for call from above):
rdt_send(data)
  packet = make_pkt(data)
  udt_send(packet)
receiver (single state: Wait for call from below):
rdt_rcv(packet)
  extract(packet,data)
  deliver_data(data)

Rdt2.0: channel with bit errors
r underlying channel may flip bits in packet
m recall: UDP checksum to detect bit errors
r the question: how to recover from errors:
m acknowledgements (ACKs): receiver explicitly tells sender
that pkt received OK
m negative acknowledgements (NAKs): receiver explicitly
tells sender that pkt had errors
m sender retransmits pkt on receipt of NAK
m human scenarios using ACKs, NAKs?
r new mechanisms in rdt2.0 (beyond rdt1.0):
m error detection
m receiver feedback: control msgs (ACK,NAK) rcvr->sender
rdt2.0: FSM specification
sender (states: Wait for call from above, Wait for ACK or NAK):
Wait for call from above, on rdt_send(data):
  sndpkt = make_pkt(data, checksum)
  udt_send(sndpkt)
  -> Wait for ACK or NAK
Wait for ACK or NAK, on rdt_rcv(rcvpkt) && isNAK(rcvpkt):
  udt_send(sndpkt)
Wait for ACK or NAK, on rdt_rcv(rcvpkt) && isACK(rcvpkt):
  Λ (no action)
  -> Wait for call from above
receiver (single state: Wait for call from below):
on rdt_rcv(rcvpkt) && corrupt(rcvpkt):
  udt_send(NAK)
on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt):
  extract(rcvpkt,data)
  deliver_data(data)
  udt_send(ACK)

rdt2.0: operation with no errors
[Figure: the rdt2.0 FSM with the error-free path: sender makes and sends sndpkt; receiver finds it notcorrupt, extracts and delivers the data, sends ACK; sender receives the ACK and returns to Wait for call from above]

rdt2.0: error scenario
[Figure: the rdt2.0 FSM with the error path: receiver finds rcvpkt corrupt and sends NAK; sender receives the NAK and retransmits sndpkt while staying in Wait for ACK or NAK]

rdt2.0 has a fatal flaw!
What happens if ACK/NAK corrupted?
r sender doesn’t know what happened at receiver!
r can’t just retransmit: possible duplicate
What to do?
r sender ACKs/NAKs receiver’s ACK/NAK? What if sender ACK/NAK lost?
r retransmit, but this might cause retransmission of correctly received pkt!
Handling duplicates:
r sender adds sequence number to each pkt
r sender retransmits current pkt if ACK/NAK garbled
r receiver discards (doesn’t deliver up) duplicate pkt
stop and wait: Sender sends one packet, then waits for receiver response

rdt2.1: sender, handles garbled ACK/NAKs
Wait for call 0 from above, on rdt_send(data):
  sndpkt = make_pkt(0, data, checksum)
  udt_send(sndpkt)
  -> Wait for ACK or NAK 0
Wait for ACK or NAK 0, on rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) ):
  udt_send(sndpkt)
Wait for ACK or NAK 0, on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt):
  Λ (no action)
  -> Wait for call 1 from above
Wait for call 1 from above, on rdt_send(data):
  sndpkt = make_pkt(1, data, checksum)
  udt_send(sndpkt)
  -> Wait for ACK or NAK 1
Wait for ACK or NAK 1, on rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) ):
  udt_send(sndpkt)
Wait for ACK or NAK 1, on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt):
  Λ (no action)
  -> Wait for call 0 from above

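rdt2.1: receiver, handles garbled ACK/NAKs
Wait for 0 from below, on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt):
  extract(rcvpkt,data)
  deliver_data(data)
  sndpkt = make_pkt(ACK, chksum)
  udt_send(sndpkt)
  -> Wait for 1 from below
Wait for 0 from below, on rdt_rcv(rcvpkt) && corrupt(rcvpkt):
  sndpkt = make_pkt(NAK, chksum)
  udt_send(sndpkt)
Wait for 0 from below, on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt):
  (duplicate: re-ACK rather than NAK, otherwise the sender would retransmit it forever)
  sndpkt = make_pkt(ACK, chksum)
  udt_send(sndpkt)
Wait for 1 from below, on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt):
  extract(rcvpkt,data)
  deliver_data(data)
  sndpkt = make_pkt(ACK, chksum)
  udt_send(sndpkt)
  -> Wait for 0 from below
Wait for 1 from below, on rdt_rcv(rcvpkt) && corrupt(rcvpkt):
  sndpkt = make_pkt(NAK, chksum)
  udt_send(sndpkt)
Wait for 1 from below, on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt):
  (duplicate: re-ACK)
  sndpkt = make_pkt(ACK, chksum)
  udt_send(sndpkt)
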
rdt2.1: discussion
Sender:
r seq # added to pkt
r two seq. #’s (0,1) will suffice. Why?
r must check if received ACK/NAK corrupted
r twice as many states
m state must “remember” whether “current” pkt has 0 or 1 seq. #
Receiver:
r must check if received packet is duplicate
m state indicates whether 0 or 1 is expected pkt seq #
r note: receiver can not know if its last ACK/NAK received OK at sender

rdt2.2: a NAK-free protocol
r same functionality as rdt2.1, using ACKs only
r instead of NAK, receiver sends ACK for last pkt
received OK
m receiver must explicitly include seq # of pkt being ACKed
r duplicate ACK at sender results in same action as
NAK: retransmit current pkt
rdt2.2: sender, receiver fragments
sender FSM fragment:
Wait for call 0 from above, on rdt_send(data):
  sndpkt = make_pkt(0, data, checksum)
  udt_send(sndpkt)
  -> Wait for ACK 0
Wait for ACK 0, on rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) ):
  udt_send(sndpkt)
Wait for ACK 0, on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0):
  Λ (no action)
receiver FSM fragment:
Wait for 0 from below, on rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt)):
  udt_send(sndpkt)
transition into Wait for 0 from below, on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt):
  extract(rcvpkt,data)
  deliver_data(data)
  sndpkt = make_pkt(ACK1, chksum)
  udt_send(sndpkt)

rdt3.0: channels with errors and loss
New assumption: underlying channel can also lose packets (data or ACKs)
m checksum, seq. #, ACKs, retransmissions will be of help, but not enough
Q: how to deal with loss?
m sender waits until certain data or ACK lost, then retransmits
m yuck: drawbacks?
Approach: sender waits “reasonable” amount of time for ACK
r retransmits if no ACK received in this time
r if pkt (or ACK) just delayed (not lost):
m retransmission will be duplicate, but use of seq. #’s already handles this
m receiver must specify seq # of pkt being ACKed
r requires countdown timer

rdt3.0 sender
Wait for call 0 from above, on rdt_send(data):
  sndpkt = make_pkt(0, data, checksum)
  udt_send(sndpkt)
  start_timer
  -> Wait for ACK0
Wait for call 0 from above, on rdt_rcv(rcvpkt):
  Λ (no action)
Wait for ACK0, on rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) ):
  Λ (no action)
Wait for ACK0, on timeout:
  udt_send(sndpkt)
  start_timer
Wait for ACK0, on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0):
  stop_timer
  -> Wait for call 1 from above
Wait for call 1 from above, on rdt_send(data):
  sndpkt = make_pkt(1, data, checksum)
  udt_send(sndpkt)
  start_timer
  -> Wait for ACK1
Wait for call 1 from above, on rdt_rcv(rcvpkt):
  Λ (no action)
Wait for ACK1, on rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,0) ):
  Λ (no action)
Wait for ACK1, on timeout:
  udt_send(sndpkt)
  start_timer
Wait for ACK1, on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1):
  stop_timer
  -> Wait for call 0 from above

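The alternating-bit behaviour of the rdt3.0 sender can be tried out with a small deterministic simulation. This is a sketch under simplifying assumptions: the channel only loses packets (no corruption, no delay), a loss is modelled as an immediate timeout, and the function name and the `drop_data` / `drop_ack` parameters are invented for the example.

```python
def rdt3_transfer(messages, drop_data=(), drop_ack=()):
    """Simulate a stop-and-wait transfer with sequence bits 0/1.

    drop_data / drop_ack hold the 0-based indices of the data transmissions
    and ACK transmissions that the hypothetical channel loses.
    Returns (messages delivered to the upper layer, number of data transmissions).
    """
    delivered = []
    expected = 0           # receiver: sequence bit it is waiting for
    seq = 0                # sender: sequence bit of the current packet
    data_tx = ack_tx = 0   # transmission counters used to decide losses
    for msg in messages:
        while True:
            lost = data_tx in drop_data
            data_tx += 1
            if lost:
                continue   # no ACK arrives: timeout, retransmit
            if seq == expected:        # new packet: deliver it
                delivered.append(msg)
                expected ^= 1
            # otherwise it is a duplicate: not delivered, but still ACKed
            ack_lost = ack_tx in drop_ack
            ack_tx += 1
            if ack_lost:
                continue   # ACK lost: timeout, retransmit
            break          # ACK for the current seq bit received: stop timer
        seq ^= 1
    return delivered, data_tx

# First ACK and second data transmission are lost.
delivered, transmissions = rdt3_transfer(["a", "b", "c"], drop_data={1}, drop_ack={0})
```

In this run message "a" is transmitted three times but delivered only once, because the receiver recognises the repeated sequence bit as a duplicate.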
rdt3.0 in action
[Figures: two slides of sender/receiver timing diagrams, not preserved in this text]

Performance of rdt3.0
r rdt3.0 works, but performance stinks
r example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet:
$$T_{transmit} = \frac{L \text{ (packet length in bits)}}{R \text{ (transmission rate, bps)}} = \frac{8 \text{ kb/pkt}}{10^9 \text{ b/sec}} = 8 \text{ microsec}$$
$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008 \text{ msec}}{30.008 \text{ msec}} = 0.00027$$
m U sender: utilization - fraction of time sender busy sending
m 1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link
m network protocol limits use of physical resources!

rdt3.0: stop-and-wait operation
sender receiver
first packet bit transmitted, t = 0
last packet bit transmitted, t = L / R
first packet bit arrives
RTT last packet bit arrives, send ACK
ACK arrives, send next
packet, t = RTT + L / R
$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008 \text{ msec}}{30.008 \text{ msec}} = 0.00027$$

Pipelined protocols
Pipelining: sender allows multiple, “in-flight”, yet-to-be-acknowledged pkts
m range of sequence numbers must be increased
m buffering at sender and/or receiver
r Two generic forms of pipelined protocols: Go-Back-N, selective repeat

Pipelining: increased utilization
sender receiver
first packet bit transmitted, t = 0
last bit transmitted, t = L / R
first packet bit arrives
RTT last packet bit arrives, send ACK
last bit of 2nd packet arrives, send ACK
ACK arrives, send next last bit of 3rd packet arrives, send ACK
packet, t = RTT + L / R
Increase utilization by a factor of 3!
$$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024 \text{ msec}}{30.008 \text{ msec}} = 0.0008$$

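The stop-and-wait and pipelined utilization figures can be reproduced with a few lines of arithmetic. The function name and the `window` parameter are assumptions for the sketch; the numbers are the ones from the example (8000-bit packet, 1 Gbps link, 30 ms RTT).

```python
def sender_utilization(packet_bits: float, rate_bps: float, rtt_s: float,
                       window: int = 1) -> float:
    """Fraction of time the sender is busy when `window` packets are sent per RTT + L/R."""
    t_transmit = packet_bits / rate_bps          # L / R, in seconds
    return min(1.0, window * t_transmit / (rtt_s + t_transmit))

u_stop_and_wait = sender_utilization(8000, 1e9, 0.030)            # about 0.00027
u_pipelined_3 = sender_utilization(8000, 1e9, 0.030, window=3)    # about 0.0008
```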
Go-Back-N
Sender:
r k-bit seq # in pkt header
r “window” of up to N consecutive unack’ed pkts allowed
r ACK(n): ACKs all pkts up to, including seq # n - “cumulative ACK”
m may receive duplicate ACKs (see receiver)
r timer for each in-flight pkt
r timeout(n): retransmit pkt n and all higher seq # pkts in window

GBN: sender extended FSM
initially (Λ):
  base=0
  nextseqnum=0
single state: Wait
rdt_send(data)
  if (nextseqnum < base+N) {
    sndpkt[nextseqnum] = make_pkt(nextseqnum,data,chksum)
    udt_send(sndpkt[nextseqnum])
    if (base == nextseqnum)
      start_timer
    nextseqnum++
  }
  else
    refuse_data(data)
timeout
  start_timer
  udt_send(sndpkt[base])
  udt_send(sndpkt[base+1])
  …
  udt_send(sndpkt[nextseqnum-1])
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
  base = getacknum(rcvpkt)+1
  if (base == nextseqnum)
    stop_timer
  else
    start_timer
rdt_rcv(rcvpkt) && corrupt(rcvpkt)
  (no action)

GBN: receiver extended FSM
initially (Λ):
  expectedseqnum=0
single state: Wait
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt,expectedseqnum)
  extract(rcvpkt,data)
  deliver_data(data)
  sndpkt = make_pkt(expectedseqnum,ACK,chksum)
  udt_send(sndpkt)
  expectedseqnum++
default
  udt_send(sndpkt)
ACK-only: always send ACK for correctly-received pkt with highest in-order seq #
m may generate duplicate ACKs
m need only remember expectedseqnum
r out-of-order pkt:
m discard (don’t buffer) -> no receiver buffering!
m Re-ACK pkt with highest in-order seq #

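The two GBN state machines can be sketched as small Python classes. This is an illustration only: sequence numbers are unbounded here (no k-bit wraparound), the timer is reduced to a boolean flag, the `max` guard against stale ACKs is an addition, and the class and method names are chosen for the example.

```python
class GBNSender:
    def __init__(self, window: int):
        self.N = window
        self.base = 0
        self.nextseqnum = 0
        self.sndpkt = {}            # seq -> data, kept until cumulatively ACKed
        self.timer_running = False
        self.sent = []              # log of every (seq, data) given to the channel

    def rdt_send(self, data) -> bool:
        if self.nextseqnum >= self.base + self.N:
            return False            # window full: refuse_data
        self.sndpkt[self.nextseqnum] = data
        self.sent.append((self.nextseqnum, data))
        if self.base == self.nextseqnum:
            self.timer_running = True
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum: int) -> None:
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)
        self.base = max(self.base, acknum + 1)   # guard: ignore stale ACKs
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self) -> None:
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):   # go back N
            self.sent.append((seq, self.sndpkt[seq]))

class GBNReceiver:
    def __init__(self):
        self.expectedseqnum = 0
        self.delivered = []

    def rdt_rcv(self, seq: int, data):
        """Return the cumulative ACK number to send, or None before any delivery."""
        if seq == self.expectedseqnum:
            self.delivered.append(data)
            self.expectedseqnum += 1
        return self.expectedseqnum - 1 if self.expectedseqnum > 0 else None

sender, receiver = GBNSender(window=4), GBNReceiver()
for d in "abcd":
    sender.rdt_send(d)
# Hypothetical channel: packet 1 is lost, packets 0, 2, 3 arrive.
for seq, data in [(0, "a"), (2, "c"), (3, "d")]:
    ack = receiver.rdt_rcv(seq, data)
    if ack is not None:
        sender.on_ack(ack)
sender.on_timeout()                      # resends packets 1, 2, 3
for seq, data in sender.sent[-3:]:
    sender.on_ack(receiver.rdt_rcv(seq, data))
```

Packets 2 and 3 are discarded on first arrival and only delivered after the retransmission, which is the cost of having no receiver buffering.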
GBN in action
[Figure: not preserved in this text]

Selective Repeat
r receiver individually acknowledges all correctly
received pkts
m buffers pkts, as needed, for eventual in-order delivery
to upper layer
r sender only resends pkts for which ACK not
received
m sender timer for each unACKed pkt
r sender window
m N consecutive seq #’s
m again limits seq #s of sent, unACKed pkts
Selective repeat: sender, receiver windows
Selective repeat
sender:
data from above:
r if next available seq # in window, send pkt
timeout(n):
r resend pkt n, restart timer
ACK(n) in [sendbase,sendbase+N-1]:
r mark pkt n as received
r if n smallest unACKed pkt, advance window base to next unACKed seq #
receiver:
pkt n in [rcvbase, rcvbase+N-1]
r send ACK(n)
r out-of-order: buffer
r in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
pkt n in [rcvbase-N,rcvbase-1]
r ACK(n)
otherwise:
r ignore

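The receiver rules can be sketched as follows. The class is hypothetical and uses unbounded sequence numbers for simplicity, so it does not show the wraparound issue that a finite sequence space introduces.

```python
class SRReceiver:
    def __init__(self, window: int):
        self.N = window
        self.rcvbase = 0
        self.buffer = {}       # out-of-order packets waiting for the gap to fill
        self.delivered = []

    def rdt_rcv(self, n: int, data):
        """Return n if the packet is ACKed, or None if it is ignored."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer[n] = data
            # deliver this packet and any buffered in-order successors
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n           # already delivered: re-ACK so the sender can advance
        return None            # otherwise: ignore

r = SRReceiver(window=4)
ack_out_of_order = r.rdt_rcv(1, "b")   # buffered, ACK(1), nothing delivered yet
ack_in_order = r.rdt_rcv(0, "a")       # fills the gap: delivers "a" then "b"
ack_duplicate = r.rdt_rcv(0, "a")      # below the window: re-ACKed, not redelivered
ack_far_ahead = r.rdt_rcv(9, "z")      # outside both ranges: ignored
```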
Selective repeat in action
Selective repeat: dilemma
Example:
r seq #’s: 0, 1, 2, 3
r window size=3
r receiver sees no difference in two scenarios!
r incorrectly passes duplicate data as new in (a)
Q: what relationship between seq # size and window size?

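One way to explore the question is to check when a retransmitted old packet can carry the same sequence number as a packet in the receiver's next window. The sketch below assumes the worst case in which the receiver has accepted a full window but every ACK was lost; the function name is made up.

```python
def is_ambiguous(seq_space: int, window: int) -> bool:
    """True if old retransmissions and new packets can share a sequence number."""
    old = {n % seq_space for n in range(0, window)}             # possible retransmissions
    new = {n % seq_space for n in range(window, 2 * window)}    # receiver's next window
    return bool(old & new)

slide_example = is_ambiguous(seq_space=4, window=3)   # seq #'s 0..3, window size 3
smaller_window = is_ambiguous(seq_space=4, window=2)
```

Under this model the overlap disappears exactly when the window size is at most half the size of the sequence number space.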
TCP: Overview RFCs: 793, 1122, 1323, 2018, 2581
r point-to-point:
m one sender, one receiver
r reliable, in-order byte stream:
m no “message boundaries”
r pipelined:
m TCP congestion and flow control set window size
r send & receive buffers
r full duplex data:
m bi-directional data flow in same connection
m MSS: maximum segment size
r connection-oriented:
m handshaking (exchange of control msgs) init’s sender, receiver state before data exchange
r flow controlled:
m sender will not overwhelm receiver
[Figure: application writes data through the socket door into the TCP send buffer; segments carry it to the TCP receive buffer, from which the application reads data through its socket door]

TCP segment structure
Fields (each row 32 bits wide):
r source port #, dest port #
r sequence number
r acknowledgement number
r head len, not used, flag bits U A P R S F, rcvr window size
r checksum, ptr urgent data
r Options (variable length)
r application data (variable length)
Notes:
r sequence and acknowledgement numbers: counting by bytes of data (not segments!)
r URG: urgent data (generally not used)
r ACK: ACK # valid
r PSH: push data now (generally not used)
r RST, SYN, FIN: connection estab (setup, teardown commands)
r rcvr window size: # bytes rcvr willing to accept
r checksum: Internet checksum (as in UDP)

TCP seq. #’s and ACKs
Seq. #’s:
m byte stream “number” of first byte in segment’s data
ACKs:
m seq # of next byte expected from other side
m cumulative ACK
Q: how receiver handles out-of-order segments
m A: TCP spec doesn’t say, - up to implementor
simple telnet scenario (Host A, Host B, time running downward):
r User types ‘C’ at Host A
r Host B ACKs receipt of ‘C’, echoes back ‘C’
r Host A ACKs receipt of echoed ‘C’

TCP: reliable data transfer
simplified sender, assuming
•one way data transfer
•no flow, congestion control
sender waits for an event, handles it, then waits for the next event:
r event: data received from application above -> create, send segment
r event: timer timeout for segment with seq # y -> retransmit segment
r event: ACK received, with ACK # y -> ACK processing

TCP: reliable data transfer
Simplified TCP sender
sendbase = initial_sequence number
nextseqnum = initial_sequence number

loop (forever) {
  switch(event)
  event: data received from application above
    create TCP segment with sequence number nextseqnum
    start timer for segment nextseqnum
    pass segment to IP
    nextseqnum = nextseqnum + length(data)
  event: timer timeout for segment with sequence number y
    retransmit segment with sequence number y
    compute new timeout interval for segment y
    restart timer for sequence number y
  event: ACK received, with ACK field value of y
    if (y > sendbase) { /* cumulative ACK of all data up to y */
      cancel all timers for segments with sequence numbers < y
      sendbase = y
    }
    else { /* a duplicate ACK for already ACKed segment */
      increment number of duplicate ACKs received for y
      if (number of duplicate ACKs received for y == 3) {
        /* TCP fast retransmit */
        resend segment with sequence number y
        restart timer for segment y
      }
    }
} /* end of loop forever */

TCP ACK generation [RFC 1122, RFC 2581]
Event | TCP Receiver action
in-order segment arrival, no gaps, everything else already ACKed | delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK
in-order segment arrival, no gaps, one delayed ACK pending | immediately send single cumulative ACK
out-of-order segment arrival, higher-than-expect seq. #, gap detected | send duplicate ACK, indicating seq. # of next expected byte
arrival of segment that partially or completely fills gap | immediate ACK if segment starts at lower end of gap

TCP: retransmission scenarios
[Figure: two Host A / Host B timelines. Left, “lost ACK scenario”: an ACK is lost (X) and the segment is retransmitted after the timeout. Right, “premature timeout, cumulative ACKs”: separate timeouts shown for Seq=92 and Seq=100]

TCP Flow Control
flow control: sender won’t overrun receiver’s buffers by transmitting too much, too fast
RcvBuffer = size of TCP Receive Buffer
RcvWindow = amount of spare room in Buffer
receiver: explicitly informs sender of (dynamically changing) amount of free buffer space
m RcvWindow field in TCP segment
sender: keeps the amount of transmitted, unACKed data less than most recently received RcvWindow
[Figure: receiver buffering]

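The bookkeeping behind RcvWindow can be sketched with byte counters. The counter names (`last_byte_rcvd`, `last_byte_read`, `last_byte_sent`, `last_byte_acked`) are assumptions introduced for the example, and the sender check allows unACKed data up to and including RcvWindow.

```python
def rcv_window(rcv_buffer: int, last_byte_rcvd: int, last_byte_read: int) -> int:
    """Spare room in the receive buffer: buffer size minus bytes not yet read by the app."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)

def sender_may_send(last_byte_sent: int, last_byte_acked: int,
                    advertised_window: int, nbytes: int) -> bool:
    """True if sending nbytes more keeps unACKed data within the advertised window."""
    in_flight = last_byte_sent - last_byte_acked
    return in_flight + nbytes <= advertised_window

window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
ok_small = sender_may_send(5000, 4000, window, nbytes=1000)
ok_large = sender_may_send(5000, 4000, window, nbytes=1200)
```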
TCP Round Trip Time and Timeout
Q: how to set TCP timeout value?
r longer than RTT
m note: RTT will vary
r too short: premature timeout
m unnecessary retransmissions
r too long: slow reaction to segment loss
Q: how to estimate RTT?
r SampleRTT: measured time from segment transmission until ACK receipt
m ignore retransmissions, cumulatively ACKed segments
r SampleRTT will vary, want estimated RTT “smoother”
m use several recent measurements, not just current SampleRTT

TCP Round Trip Time and Timeout
EstimatedRTT = (1-x)*EstimatedRTT + x*SampleRTT
r Exponential weighted moving average
r influence of given sample decreases exponentially fast
r typical value of x: 0.1
Setting the timeout
r EstimatedRTT plus “safety margin”
r large variation in EstimatedRTT -> larger safety margin
Timeout = EstimatedRTT + 4*Deviation
Deviation = (1-x)*Deviation + x*|SampleRTT-EstimatedRTT|

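The update rules translate directly into code. One detail the formulas leave open is whether Deviation uses the old or the freshly updated EstimatedRTT; the sketch below assumes the updated value, and the function name and sample numbers are invented.

```python
def update_rtt(estimated: float, deviation: float, sample: float, x: float = 0.1):
    """Apply one SampleRTT to the running estimates; returns (EstimatedRTT, Deviation, Timeout)."""
    new_estimated = (1 - x) * estimated + x * sample
    # assumption: deviation is measured against the updated estimate
    new_deviation = (1 - x) * deviation + x * abs(sample - new_estimated)
    timeout = new_estimated + 4 * new_deviation
    return new_estimated, new_deviation, timeout

# Hypothetical values in milliseconds: a 200 ms sample against a 100 ms estimate.
est, dev, timeout = update_rtt(estimated=100.0, deviation=10.0, sample=200.0)
```

A single outlier moves the estimate only a tenth of the way toward the sample, but it widens the safety margin noticeably through the Deviation term.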
Example RTT estimation:
[Plot: RTT: gaia.cs.umass.edu to fantasia.eurecom.fr; x-axis: time (seconds); y-axis: RTT (milliseconds), 100 to 350; series: SampleRTT and Estimated RTT]

TCP Connection Management
Recall: TCP sender, receiver establish “connection” before exchanging data segments
r initialize TCP variables:
m seq. #s
m buffers, flow control info (e.g. RcvWindow)
r client: connection initiator
Socket clientSocket = new Socket("hostname","port number");
r server: contacted by client
Socket connectionSocket = welcomeSocket.accept();
Three way handshake:
Step 1: client end system sends TCP SYN control segment to server
m specifies initial seq #
Step 2: server end system receives SYN, replies with SYNACK control segment
m ACKs received SYN
m allocates buffers
m specifies server->receiver initial seq. #
Step 3: client end system receives SYNACK, replies with ACK segment

TCP Connection Management (cont.)
Closing a connection:
client closes socket: clientSocket.close();
Step 1: client end system sends TCP FIN control segment to server
Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.
[Figure: client/server timeline: client close, server close, client timed wait, closed]

TCP Connection Management (cont.)
Step 3: client receives FIN, replies with ACK.
m Enters “timed wait” - will respond with ACK to received FINs
Step 4: server, receives ACK. Connection closed.
Note: with small modification, can handle simultaneous FINs.
[Figure: client/server timeline: client closing, server closing, client timed wait, both closed]

TCP Connection Management (cont)
[Figures: TCP server lifecycle and TCP client lifecycle state diagrams]

Principles of Congestion Control
Congestion:
r informally: “too many sources sending too much
data too fast for network to handle”
r different from flow control!
r manifestations:
m lost packets (buffer overflow at routers)
m long delays (queueing in router buffers)
r a top-10 problem!
Causes/costs of congestion: scenario 1
r two senders, two receivers
r one router, infinite buffers
r no retransmission
r large delays when congested
r maximum achievable throughput
[Figure: Host A and Host B send original data at rate λin into a router with unlimited shared output link buffers; delivered rate λout]

Causes/costs of congestion: scenario 2
r one router, finite buffers
r sender retransmission of lost packet
[Figure: Host A and Host B; λin: original data; λ'in: original data, plus retransmitted data; finite shared output link buffers; λout]

Causes/costs of congestion: scenario 2
r always: λin = λout (goodput)
r “perfect” retransmission only when loss: λ'in > λout
r retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout
“costs” of congestion:
r more work (retrans) for given “goodput”
r unneeded retransmissions: link carries multiple copies of pkt

Causes/costs of congestion: scenario 3
r four senders
r multihop paths
r timeout/retransmit
Q: what happens as λin and λ'in increase?
[Figure: Host A and Host B among the senders; λin: original data; λ'in: original data, plus retransmitted data; finite shared output link buffers; λout]

Causes/costs of congestion: scenario 3
[Figure: Host A, Host B, λout]
Another “cost” of congestion:
r when packet dropped, any “upstream” transmission capacity used for that packet was wasted!

Approaches towards congestion control
Two broad approaches towards congestion control:
End-end congestion control:
r no explicit feedback from network
r congestion inferred from end-system observed loss, delay
r approach taken by TCP
Network-assisted congestion control:
r routers provide feedback to end systems
m single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
m explicit rate sender should send at

Case study: ATM ABR congestion control
ABR: available bit rate:
r “elastic service”
r if sender’s path “underloaded”:
m sender should use available bandwidth
r if sender’s path congested:
m sender throttled to minimum guaranteed rate
RM (resource management) cells:
r sent by sender, interspersed with data cells
r bits in RM cell set by switches (“network-assisted”)
m NI bit: no increase in rate (mild congestion)
m CI bit: congestion indication
r RM cells returned to sender by receiver, with bits intact

Case study: ATM ABR congestion control
r two-byte ER (explicit rate) field in RM cell
m congested switch may lower ER value in cell
m sender’s send rate thus minimum supportable rate on path
r EFCI bit in data cells: set to 1 in congested switch
m if data cell preceding RM cell has EFCI set, receiver sets CI bit in returned RM cell

TCP Congestion Control
r end-end control (no network assistance)
r transmission rate limited by congestion window size, Congwin, over segments
r w segments, each with MSS bytes sent in one RTT:
$$\text{throughput} = \frac{w \cdot MSS}{RTT} \text{ Bytes/sec}$$

TCP congestion control:
r “probing” for usable bandwidth:
m ideally: transmit as fast as possible (Congwin as large as possible) without loss
m increase Congwin until loss (congestion)
m loss: decrease Congwin, then begin probing (increasing) again
r two “phases”
m slow start
m congestion avoidance
r important variables:
m Congwin
m threshold: defines threshold between slow start phase and congestion avoidance phase

TCP Slowstart
Slowstart algorithm
initialize: Congwin = 1
for (each segment ACKed)
  Congwin++
until (loss event OR CongWin > threshold)
r exponential increase (per RTT) in window size (not so slow!)
r loss event: timeout (Tahoe TCP) and/or three duplicate ACKs (Reno TCP)
[Figure: Host A / Host B timeline, with the number of segments sent growing each RTT]

TCP Congestion Avoidance: Tahoe
TCP Tahoe Congestion avoidance
/* slowstart is over */
/* Congwin > threshold */
Until (loss event) {
every w segments ACKed:
Congwin++
}
threshold = Congwin/2
Congwin = 1
perform slowstart
TCP Congestion Avoidance: Reno
r three duplicate ACKs (Reno TCP):
r some segments are getting through correctly!
r don’t “overreact” by decreasing window to 1 as in Tahoe
m decrease window size by half
TCP Reno Congestion avoidance
/* slowstart is over */
/* Congwin > threshold */
Until (loss event) {
  every w segments ACKed:
    Congwin++
}
threshold = Congwin/2
If (loss detected by timeout) {
  Congwin = 1
  perform slowstart }
If (loss detected by triple duplicate ACK)
  Congwin = Congwin/2

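The difference between Tahoe and Reno shows up clearly when the window is tracked per transmission round. The simulation below is a simplified sketch: it works in whole rounds rather than per ACK, caps slow start at the threshold, and takes the loss pattern as an input; the function name, the initial threshold of 8, and the loss in round 7 are assumptions for the example.

```python
def simulate(variant: str, rounds: int, loss_rounds: dict, threshold: int = 8):
    """Return Congwin (in segments) at the start of each transmission round.

    loss_rounds maps a round index to "timeout" or "3dup" (hypothetical losses).
    """
    congwin, history = 1, []
    for r in range(rounds):
        history.append(congwin)
        loss = loss_rounds.get(r)
        if loss:
            threshold = max(congwin // 2, 1)
            if variant == "reno" and loss == "3dup":
                congwin = threshold          # Reno: halve the window
            else:
                congwin = 1                  # Tahoe, or timeout: back to slow start
        elif congwin < threshold:
            congwin = min(congwin * 2, threshold)   # slow start: doubles per RTT
        else:
            congwin += 1                     # congestion avoidance: +1 per RTT
    return history

tahoe = simulate("tahoe", 15, {7: "3dup"})
reno = simulate("reno", 15, {7: "3dup"})
```

After the same triple-duplicate-ACK loss, Tahoe restarts from 1 while Reno continues from half the previous window.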
Congestion Avoidance: Reno
r increase window by one per RTT if no loss: Congwin++
r decrease window by half on detection of loss by triple duplicate ACK: CongWin = Congwin/2
[Figures: sender/receiver diagrams showing a window of W segments, and W <- W/2 after loss]

TCP Reno versus TCP Tahoe:
Figure 3.49 (revised): Evolution of TCP’s Congestion window (Tahoe and Reno)
[Plot: congestion window size (segments, 0 to 14) against transmission round (1 to 15); one series each for TCP Tahoe and TCP Reno; threshold marked]

AIMD
TCP congestion avoidance:
r AIMD: additive increase, multiplicative decrease
m increase window by 1 per RTT
m decrease window by factor of 2 on loss event
TCP Fairness
Fairness goal: if N TCP sessions share same bottleneck link, each should get 1/N of link capacity
[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

Why is TCP fair?
Two competing sessions:
r Additive increase gives slope of 1, as throughput increases
r multiplicative decrease decreases throughput proportionally
[Figure: Connection 1 throughput on the x-axis (0 to R), the other connection on the y-axis (0 to R), with the equal bandwidth share line; the trajectory alternates between “congestion avoidance: additive increase” and “loss: decrease window by factor of 2”]

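The convergence argument can be checked numerically. The model below is a deliberate simplification: both sessions see every loss at the same moment, a loss occurs whenever their combined rate exceeds the capacity, and the starting rates and capacity are made-up numbers.

```python
def aimd(x1: float, x2: float, capacity: float, steps: int):
    """Evolve two AIMD flows sharing one bottleneck; returns their final rates."""
    for _ in range(steps):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2      # multiplicative decrease on loss
        else:
            x1, x2 = x1 + 1, x2 + 1      # additive increase, slope 1
    return x1, x2

# Very unequal start: one flow at 1, the other at 50, bottleneck capacity 60.
a, b = aimd(1.0, 50.0, capacity=60.0, steps=500)
```

Additive increase leaves the gap between the two flows unchanged and each multiplicative decrease halves it, so the rates drift toward the equal-share line.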
TCP latency modeling
Q: How long does it take to receive an object from a Web server after sending a request?
r TCP connection establishment
r data transfer delay
Notation, assumptions:
r Assume one link between client and server of rate R
r Assume: fixed congestion window, W segments
r S: MSS (bits)
r O: object size (bits)
r no retransmissions (no loss, no corruption)
Two cases to consider:
r WS/R > RTT + S/R: ACK for first segment in window returns before window’s worth of data sent
r WS/R < RTT + S/R: wait for ACK after sending window’s worth of data sent

TCP latency Modeling
K := O/WS
Case 1: latency = 2RTT + O/R
Case 2: latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

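The two cases can be evaluated with a short function. Rounding K up to a whole number of windows is an assumption made here for objects that are not an exact multiple of WS, and the numeric values in the usage lines are invented (S/R = 1 second, RTT = 1 second).

```python
import math

def static_window_latency(O: float, S: float, R: float, RTT: float, W: int) -> float:
    """Latency for a fixed congestion window of W segments (O, S in bits, R in bps)."""
    K = math.ceil(O / (W * S))        # number of windows that cover the object
    if W * S / R > RTT + S / R:
        return 2 * RTT + O / R        # case 1: ACKs return before the window is exhausted
    # case 2: the server stalls after each of the first K-1 windows
    return 2 * RTT + O / R + (K - 1) * (S / R + RTT - W * S / R)

latency_case1 = static_window_latency(O=8000, S=1000, R=1000, RTT=1.0, W=4)
latency_case2 = static_window_latency(O=8000, S=1000, R=1000, RTT=1.0, W=1)
```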
TCP Latency Modeling: Slow Start
r Now suppose window grows according to slow start.
r Will show that the latency of one object of size O is:
$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$
where P is the number of times TCP stalls at server:
$$P = \min\{Q, K-1\}$$
- where Q is the number of times the server would stall if the object were of infinite size.
- and K is the number of windows that cover the object.

TCP Latency Modeling: Slow Start (cont.)
Example:
O/S = 15 segments
K = 4 windows
Q = 2
P = min{K-1,Q} = 2
Server stalls P=2 times.
[Figure: client/server timing diagram: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered; time at client, time at server; RTT marked]

TCP Latency Modeling: Slow Start (cont.)
$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$
$$2^{k-1}\frac{S}{R} = \text{time to transmit the kth window}$$
$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{stall time after the kth window}$$
$$\text{latency} = \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{stallTime}_p$$
$$= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]$$
$$= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$
[Figure: client/server timing diagram with first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R]

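The closed form can be evaluated numerically. The expressions used for K and Q below follow from the window sizes 1, 2, 4, ... and from requiring a non-negative stall time, but they are derived here rather than stated in the slides; the concrete values of S, R and RTT are assumptions chosen so that O/S = 15 and Q = 2.

```python
import math

def slow_start_latency(O: float, S: float, R: float, RTT: float):
    """Return (K, Q, P, latency) for an object of O bits sent with slow start."""
    # K: smallest number of windows 1, 2, 4, ... whose total covers O/S segments
    K = math.ceil(math.log2(O / S + 1))
    # Q: largest k for which the stall time after the kth window is non-negative
    Q = math.floor(math.log2(1 + RTT / (S / R))) + 1
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency

# Assumed numbers: S/R = 1 second, RTT = 2 seconds, object of 15 segments.
K, Q, P, latency = slow_start_latency(O=15000, S=1000, R=1000, RTT=2.0)
```

With these numbers the server stalls for 2 seconds after the first window and 1 second after the second, which adds 3 seconds to the 2 RTT + O/R baseline of 19 seconds.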
Chapter 3: Summary
r principles behind transport layer services:
m multiplexing/demultiplexing
m reliable data transfer
m flow control
m congestion control
r instantiation and implementation in the Internet
m UDP
m TCP
Next:
r leaving the network “edge” (application, transport layers)
r into the network “core”