Microscopic Behavior of TCP Congestion Control


Microscopic Behavior of Internet Control
Xiaoliang (David) Wei
NetLab, CS&EE, California Institute of Technology

Internet Control: Problem -> solution -> understanding
1986: First Internet Congestion Collapse
1988~1990: TCP-Tahoe, DEC-bit
1993~1995: Tri-S, DUAL, TCP-Vegas
[Timeline axis: 1986, 1989, 1995, 1999, 2003, …]

Outline
Motivation
Overview of Microscopic behavior
Stability of Delay-based Congestion Control Algorithms
Fairness of Loss-based Congestion control algorithms
Future works
Summary

Motivation

Macroscopic View of TCP Control
TCP/AQM: a feedback control system
[Figure: TCP Sender 1 and TCP Sender 2 send at rates x_i(t) through a bottleneck of capacity C to TCP Receiver 1 and TCP Receiver 2; delays τ_F and τ_B; congestion signal q(t).]
TCP: Reno, Vegas, FAST
AQM: DropTail / RED, Delay, ECN

Fluid Models
$\dot{x}_i(t) = F\big(x_i(t),\; q(t - \tau_B)\big)$
$\dot{q}(t) = G\big(q(t),\; \sum_i x_i(t - \tau_F) - c\big)$
Assumptions:
TCP algorithms directly control the transmission rates;
The transmission rates are differentiable (smooth);
Each TCP packet observes the same congestion price (loss, delay or ECN).

Methodology based on Fluid Models
Equilibrium: Efficiency? Fairness?
Dynamics: Stability? Responsiveness?

Gap 1: Stability of TCP Vegas
Analysis: “TCP Vegas is stable if (and only if) the number of flows is large, and capacity is small, and delay is small.”
Experiment: a single TCP Vegas flow is stable with arbitrary delay and capacity.

Gap 2: Fairness of Scalable TCP
Analysis: “Scalable TCP is fair in homogeneous network” [Kelly’03]
Analysis: [Chiu&Jain’90] → Scalable TCP is unfair.
Experiment: in most cases, Scalable TCP is unfair in homogeneous network.

Gap 3: TCP vs TFRC
Analysis: “We designed TCP Friendly Rate Control (TFRC) algorithm to have the same equilibrium as TCP when they co-exist.”
Experiment: TCP flows do not fairly coexist with TFRC flows.

Gaps
Stability: TCP-Vegas
Fairness: Scalable TCP
Friendliness: TCP vs TFRC
Current analytical models ignore microscopic behavior in TCP congestion control.

Overview of Microscopic behavior

Microscopic View (Packet level)
Two level timescales:
On each RTT -- TCP congestion control algorithm;
On each packet arrival -- Ack-clocking:
    p--;
    while (p < w(t)) do
        send a packet;
        p++;
(p: number of packets in flight)

[Figure sequence: a sender transmits packets 1-5 through bottleneck C to a receiver, and acks A1-A5 return; each frame plots the sending rate x(t) against time t, with capacity c and the RTT marked.]
Frame captions, in order:
W: 0 -> 5
Packets queued in bottleneck
Packets leave the bottleneck at rate c
Acknowledgments return at rate c
New packets are sent at rate c
No queue in the 2nd round trip
No need to control rate x(t)!
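
The frame sequence above can be reproduced with a few lines of Python. This is a minimal sketch under assumptions that are not in the slides: a single flow, a fixed window, a FIFO bottleneck that serves one packet every 1/capacity seconds, and a fixed round-trip propagation delay. The function name `ack_clocked_flow` and its parameters are illustrative only.

```python
def ack_clocked_flow(window, capacity, rtt_prop, num_packets):
    """Simulate one ack-clocked flow with a fixed window through one bottleneck.

    Assumed model (not from the slides): FIFO bottleneck, service time
    1/capacity per packet, fixed round-trip propagation delay rtt_prop.
    Returns (send_times, queue_waits, ack_times), one entry per packet.
    """
    send_times, queue_waits, ack_times = [], [], []
    link_free_at = 0.0  # time at which the bottleneck has served all earlier packets
    for j in range(num_packets):
        # The first `window` packets leave as one burst at t = 0; afterwards
        # packet j is released by the ack of packet j - window (ack-clocking).
        s = 0.0 if j < window else ack_times[j - window]
        start_service = max(s, link_free_at)
        link_free_at = start_service + 1.0 / capacity
        send_times.append(s)
        queue_waits.append(start_service - s)
        ack_times.append(link_free_at + rtt_prop)
    return send_times, queue_waits, ack_times


if __name__ == "__main__":
    sends, waits, _ = ack_clocked_flow(window=5, capacity=1.0,
                                       rtt_prop=10.0, num_packets=10)
    print("round 1 send times:", sends[:5], "queue waits:", waits[:5])
    print("round 2 send times:", sends[5:], "queue waits:", waits[5:])
```

In this sketch the first round is sent as a burst and queues at the bottleneck, while the second round is spaced 1/capacity apart by the returning acks and sees no queueing, which is the behavior the captions describe.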

Two Flows
[Figure sequence: TCP1 and TCP2 send packets 1-4 through a shared bottleneck C to Rcv1 and Rcv2, and acks A1-A4 return; each frame plots x(t) against time t, with capacity c and the RTT marked.]
On-off pattern for each flow

Sub-RTT Burstiness: NS-2 Measurement
[Figure: NS-2 measurement of sub-RTT burstiness.]

Two levels of Burstiness
[Figure: x(t) against time t over two RTTs, with capacity c marked.]
Micro Burst: pulse function; input rate >> c; extra queue & loss; transient
Sub-RTT burstiness: on-off function; input rate <= c; no extra queue & loss; persistent

Microscopic Effects: known
Micro Burst, loss-based TCP: low throughput with small buffer; pacing improves throughput (clearly understood)
Micro Burst, delay-based TCP: noise to delay signal, should be eliminated (partially…)
Sub-RTT Burstiness: observed in Internet traffic (“Why do we care?”)

Microscopic Effects: new
Micro Burst, loss-based TCP: low throughput with small buffer; pacing improves throughput (clearly understood)
Micro Burst, delay-based TCP: fast convergence in queuing delay and better stability
Sub-RTT Burstiness, loss-based TCP: low loss synchronization rate with DropTail routers
Sub-RTT Burstiness, delay-based TCP: no effect

New Understandings
Micro Burst with Delay-based TCP: fast queue convergence
1. A single TCP-Vegas flow is always stable, regardless of delay and capacity.
Sub-RTT Burstiness and Loss-based TCP: low loss sync rate
1. Scalable TCP is (usually) unfair;
2. TCP is unfriendly to TFRC.

Stability of Delay-based Congestion Control Algorithms

A packet level model: basis
Ack-clocking: on each ack arrival
    p--;
    while (p < w(t)) do
        send a packet;
        p++;
(p: number of packets in flight)
Packets can only be sent upon arrival of an acknowledgment;
A micro burst of packets can be sent at a moment;
Window size w(t) can be an arbitrary given process.

A packet level model: variables
$p_j$: number of packets in flight when packet j is sent
$s_j$: sending time of packet j
$b_j$: backlog experienced by packet j
$a_j$: ack arrival time of packet j
[Figure sequence: sender, bottleneck C and receiver with packets and acks A3-A5 in flight, illustrating each variable in turn.]

Sending process:
$p_j = \max_{0 \le k \le p_{j-1}} \{\, p_{j-1} - k + 1 \;:\; p_{j-1} - k + 1 \le w(a_{j-1-p_{j-1}+k}) \,\}$
$s_j = a_{j - p_j}$
k: number of acks that arrive between $s_{j-1}$ and $s_j$
$a_{j - p_j}$: ack arrival time of the packet sent one RTT ago
For example, with k = 0:
$p_j = p_{j-1} + 1$
$s_j = a_{j - p_j} = a_{j - p_{j-1} - 1} = a_{j-1-p_{j-1}} = s_{j-1}$

Bottleneck:
[Figure: packets j-1 and j arrive at bottleneck C; the link serves $c\,(s_j - s_{j-1})$ packets between the two arrivals.]
$b_j = \max\{\, b_{j-1} + 1 - c\,(s_j - s_{j-1}),\; 0 \,\}$
$a_j = s_j + d + b_j / c$
$b_j$: experienced backlog; c: bottleneck capacity; $a_j$: ack arrival time; d: propagation delay

A packet level model
$p_j = \max_{0 \le k \le p_{j-1}} \{\, p_{j-1} - k + 1 \;:\; p_{j-1} - k + 1 \le w(a_{j-1-p_{j-1}+k}) \,\}$
$s_j = a_{j - p_j}$
$b_j = \max\{\, b_{j-1} + 1 - c\,(s_j - s_{j-1}),\; 0 \,\}$
$a_j = s_j + d + b_j / c$

Ack-clocking: quick sending process
Theorem: For any time $s_j$ at which a packet j is sent, there is always a packet $j^* := j^*(j)$ such that
$s_j = s_{j^*}$ and $p_{j^*} = w(s_j)$.
The number of packets in flight at any packet sending time is synced up with the congestion window.
[Figure: w(t) and p(t) against time t.]

Ack-clocking: fast queue convergence
Theorem: If $p_k \ge cd$ for all k with $j - p_j \le k \le j$, then
$p_j = cd + b_j$.
The queue converges instantly if the window size is larger than the BDP during the entire previous RTT.
[Figure: w(t) and q(t) against time t.]

Window Control and Ack-clocking
Per-RTT window control: makes a decision once every RTT with the measurement from the latest acknowledgement (a subsequence of sequence numbers $k_1, k_2, k_3, \dots$).
[Figure: w(t) and p(t) against time t, with $a_{k_1}$, $a_{k_2}$ and $s_{k_1}$, $s_{k_2}$, $s_{k_3}$ marked.]

Stability of TCP Vegas
Theorem: Given the packet level model, if $\alpha d > 1$, a single TCP Vegas flow converges to equilibrium with arbitrary capacity c and propagation delay d. That is, there exists a sequence number J such that for all $j \ge J$:
$cd + \alpha d - 1 \le w(s_j) \le cd + \alpha d + 1$
$\alpha d - 1 \le b_j \le \alpha d + 1$

Stability of Vegas: 100-flow simulation
[Figure: 100-flow simulation.]
Stability of Vegas: Avg Window Size
Window oscillation: 1 packet
Stability of Vegas: Queue Size
Queue oscillation: 100 packets (because 100 flows are synchronized)

Gap 1: Stability of TCP Vegas
Analysis: “TCP Vegas is stable if (and only if) the number of flows is large, and capacity is small, and delay is small.”
Reason: micro burst leads to fast queue convergence
Experiment: a single TCP Vegas flow is stable with arbitrary delay and capacity.
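
The fast queue convergence claim can be checked numerically with the bottleneck recursion of the packet level model. The following is a minimal sketch, assuming a constant window (so that $p_j = w$ and $s_j = a_{j-w}$ after the initial burst) and an initially empty queue; the slides allow an arbitrary window process w(t), which this sketch does not cover. The function name `packet_level_trace` is illustrative only.

```python
def packet_level_trace(window, capacity, prop_delay, num_packets):
    """Iterate the packet level model for a constant window.

    Assumptions (simplifications, not from the slides): the window is fixed,
    the first `window` packets are sent as one burst at t = 0, and the queue
    starts empty. Uses
        b_j = max(b_{j-1} + 1 - c * (s_j - s_{j-1}), 0)
        a_j = s_j + d + b_j / c
        s_j = a_{j - window}
    Returns (s, b, a): sending times, backlogs and ack arrival times.
    """
    s, b, a = [], [], []
    for j in range(num_packets):
        s_j = 0.0 if j < window else a[j - window]
        if j == 0:
            b_j = 0.0  # the first packet finds an empty queue
        else:
            b_j = max(b[-1] + 1.0 - capacity * (s_j - s[-1]), 0.0)
        s.append(s_j)
        b.append(b_j)
        a.append(s_j + prop_delay + b_j / capacity)
    return s, b, a


if __name__ == "__main__":
    # Window above the BDP (c * d = 10 packets): backlog settles at w - c*d.
    _, backlog, _ = packet_level_trace(window=15, capacity=1.0,
                                       prop_delay=10.0, num_packets=45)
    print("backlog in the first burst:", backlog[:15])
    print("backlog afterwards:", backlog[15:])
```

Under these assumptions the backlog seen by every packet after the first round trip equals the window minus cd, with no transient oscillation, which matches $p_j = cd + b_j$.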

FAST: stable and responsive
Designed based on the intuition that the queue is directly a function of the congestion window size.
A FAST flow does the following every other RTT:
[Window update equation not recoverable from this transcript; surviving symbols: p_j, w(t), d, b_j, c, and the constants 1 and 2.]

FAST: stability
Theorem: Given the packet level model, homogeneous FAST flows converge to equilibrium regardless of capacity c, propagation delay d and number of flows N.
[Tang, Jacobsson, Andrew, Low’07]: FAST is stable with a single bottleneck link regardless of capacity c, propagation delay d and number of flows N (with an extended fluid model capturing microburst effects).

Micro-burst: Summary
[Figure: x(t) against time t over two RTTs, with capacity c marked.]
Effects:
Fast queue convergence
Stability of homogeneous Vegas for arbitrary delay
Possibility of very responsive & stable TCP control
Stability of FAST for arbitrary delay

Fairness of Loss-based Congestion control algorithms

New Understandings
Micro Burst with Delay-based TCP: fast queue convergence
1. A single (homogeneous) TCP-Vegas flow is always stable, regardless of delay and capacity.
Sub-RTT Burstiness and Loss-based TCP: low loss sync rate
1. Scalable TCP is (usually) unfair;
2. TCP is unfriendly to TFRC.

Loss Synchronization Rate: Definition
Loss Synchronization Rate [Baccelli,Hong’02]: the probability that a flow observes a packet loss during a congestion event.
Congestion event (loss event): a round-trip time interval in which at least one packet is dropped by the bottleneck router due to congestion (buffer overflow at the router).

Loss Synchronization Rate: Effects
Intuitions:
Individual flow: the smaller the better (selfishness)
System design: the higher the better (for fairness and convergence)
Theoretic results:
Aggregate throughput [Baccelli,Hong’02]
Instantaneous fairness [Baccelli,Hong’02]
Fairness convergence [Shorten, Wirth, Leith’06]

Loss Sync. Rate: Existing Model
[Shorten, Wirth, Leith’06]: No model.
Measure from NS-2 and feed into a model for computational results.
[Baccelli,Hong’02]: Assume each packet has the same probability of being dropped in the loss event.

Packet loss is bursty
In the Internet, ~50% of losses happen in bursts.

Loss process is bursty: on-off
[Figure: incoming packets from all flows during the RTT of the loss event; the loss signal is a burst period in which L consecutive incoming packets are dropped.]
In each loss event (one RTT), the packet loss process is an on-off process.

Data packet process is bursty: on-off
[Figure: incoming packets from all flows during the RTT of the loss event; the burst period of one flow i covers its $w_i$ packets; x(t) against time t with capacity c and the RTT marked.]
In each loss event (one RTT), the TCP data packet process is an on-off process.

Loss Sync. Rate: A Sampling Perspective
[Figure: the burst period of flow i ($w_i$ packets) and the burst period of the loss signal (L dropped packets) within the RTT of the loss event.]
Loss sync. rate: the efficiency with which a (bursty) TCP data process samples the loss signal in a (bursty) loss process.
Assumption 1: Within the RTT of the loss event, the position of an individual flow’s burst is uniformly distributed.
Assumption 2: The loss process does not depend on the data packet process of individual flows.

In the formulas below, $\lambda_i$ denotes the loss synchronization rate of flow i.

Loss Sync. Rate Case 1: TCP+DropTail
[Figure: burst period of one flow ($w_i$ packets) and burst period of the loss signal (L incoming packets dropped).]
$\lambda_i = \dfrac{L + w_i - 1}{cd + B + L}$
$w_i$: window of a TCP flow
L: number of dropped packets
cd+B+L: number of packets going through the bottleneck in the loss event (c: capacity; d: propagation delay; B: buffer size)

Loss Sync. Rate: TCP+DropTail
[Figure.]

Loss Sync. Rate Case 2: Pacing+DropTail
[Figure: the $w_i$ packets of flow i are distributed over the entire RTT of the loss event; the loss signal is a burst period of L incoming packets.]
$\lambda_i = 1 - \left(1 - \dfrac{L}{cd + B + L}\right)^{w_i}$
$w_i$: window of a TCP flow
L: number of dropped packets
cd+B+L: number of packets going through the bottleneck in the loss event

Loss Sync. Rate: Pacing + DropTail
[Figure.]

Loss Sync. Rate Case 3: TCP+RED
[Figure: burst period of one flow ($w_i$ packets); packet loss is distributed over the entire RTT of the loss event.]
$\lambda_i = 1 - \left(1 - \dfrac{w_i}{cd + B + L}\right)^{L}$
$w_i$: window of a TCP flow
L: number of dropped packets
cd+B+L: number of packets going through the bottleneck in the loss event

Model for Loss Sync. Rate: General form
[Figure: cd+B incoming packets during the RTT of the loss event; the burst period of flow i spans K incoming packets; the loss signal randomly drops packets from a burst period of M incoming packets.]
cd+B: number of packets going through the bottleneck in the loss event (c: capacity; d: propagation delay; B: buffer size)
$w_i$: window of a TCP flow in the loss event
L: number of dropped packets in the loss event
$K_i$: length of the burst period of flow i (in packets)
M: length of the burst period of the loss process (in packets)
$\lambda_i$ = ?

Loss Sync. Rate: MatLab Computation
cd+B = 1080; $w_i$ = 60; L = 16; K, M vary

Measurement: TCP + DropTail
Averaged sync. rate; cd+B = 3340; M = L = N/2; K = w = (cd+B)/N
Measurement: Pacing + DropTail
Averaged sync. rate; cd+B = 3340; M = L = N/2; K = w = (cd+B)/N
Measurement: TCP + RED
Averaged sync. rate; cd+B = 3340; M = L = N/2; K = w = (cd+B)/N

Loss Sync. Rate: Qualitative Results
With DropTail and bursty TCP (the most widely deployed combination), the loss synchronization rate is very low;
TCP Pacing increases the loss synchronization rate;
RED increases the loss synchronization rate.

Loss Sync. Rate: Asymptotic Result
If the number of flows N is large: $L \gg w_i$
TCP: $\lambda_i = \dfrac{L + w_i - 1}{cd + B + L} \approx \dfrac{L}{cd + B + L}$
Very weak dependency of the loss sync rate on window size: all flows see the same loss.
TCP Pacing: $\lambda_i = 1 - \left(1 - \dfrac{L}{cd + B + L}\right)^{w_i} \approx \dfrac{w_i L}{cd + B + L}$
The loss sync rate is proportional to window size: rich guys see more loss.

Asymptotic Result: MatLab Computation
cd+B = 1080; L = N/2; N varies
Fair share window size: (cd+B)/N

Implications
1. Scalable TCP is (usually) unfair with bursty TCP;
2. TCP is unfriendly to TFRC;
3. …

Fairness of Scalable TCP
For each RTT without a loss: $w_i(t+1) = \alpha\, w_i(t)$; α = 1.01
For each RTT with a loss (loss event): $w_i(t+1) = \beta\, w_i(t)$; β = 0.875
[Chiu,Jain’90]: MIMD algorithms cannot converge to fairness with the synchronization model
[Kelly’03]: Scalable TCP (MIMD) converges to fairness in theory with the fluid model
[Wei, Jin, Low’06][Li,Leith,Shorten’07]: Scalable TCP is unfair in experiments

Fairness of Scalable TCP: Chiu vs Kelly
[Chiu,Jain’90]: MIMD is not fair
Assumption: loss event rate is independent of window size (simplified synchronization model)
Sync. Rate Model: true with many bursty TCP flows
[Kelly’03]: Scalable TCP (MIMD) is fair
Assumption: loss event rate is proportional to window size (fluid model)
Sync. Rate Model: true with very few bursty TCP flows or with paced TCP flows

Scalable TCP: simulations
Capacity = 100 Mbps; delay = 200 ms; buffer size: BDP; MTU = 1500; N varies; rate averaged over a 600-second run

Gap 2: Fairness of Scalable TCP
Analysis: “Scalable TCP is fair in homogeneous network” [Kelly’03]
Analysis: “MIMD in general is unfair.” [Chiu&Jain’90] → Scalable TCP is unfair.
Reason: sub-RTT burstiness leads to similar loss sync. rates for different flows
Experiment: in most cases, Scalable TCP is unfair in homogeneous network.

TFRC vs TCP
[Figure: incoming packets during the RTT of the loss event; packets of TCP (flow 1) arrive in one burst period of w packets, packets of TFRC (flow 2) are spread over the RTT; the loss signal is a burst period of L incoming packets.]
TCP: $\lambda_i = \dfrac{L + w_i - 1}{cd + B + L}$
TFRC (same as Pacing): $\lambda_i = 1 - \left(1 - \dfrac{L}{cd + B + L}\right)^{w_i}$

TFRC vs TCP: simulation
[Figure.]

Gap 3: TCP vs TFRC
Analysis: “We designed TCP Friendly Rate Control (TFRC) algorithm to have the same equilibrium as TCP when they co-exist.”
Reason: sub-RTT burstiness leads to different loss sync. rates for TFRC and TCP
Experiment: TCP flows do not fairly coexist with TFRC flows.

Sub-RTT Burstiness: Summary
[Figure: x(t) against time t over two RTTs, with capacity c marked.]
Effects: low loss sync. rate with DropTail router
Poor convergence
MIMD unfairness
TFRC unfriendliness
Possible solutions:
Eliminate sub-RTT burstiness: Pacing
Randomize loss signal: RED
Persistent loss signal: ECN

Future works

Future: a research framework on microscopic Internet behavior
Experiment tools: help to observe, analyze and validate microscopic behavior in the Internet: WAN-in-Lab, NS-2 TCP-Linux, …
Theoretic model: more accurate models to capture the dynamics of the Internet at microscopic timescales.
New algorithms: new algorithms that utilize and control the microscopic Internet behavior

NS-2 TCP-Linux
The first tool that can run a congestion control algorithm directly from Linux source code with the same simulation speed (sometimes even faster)
700+ local downloads (2400+ tutorial visits worldwide)
5+ Linux kernel fixes
2+ papers
Outreach: BIC/Cubic-TCP (NCSU), H-TCP (Hamilton), TCP Westwood (UCLA/Politecnico di Bari), A-Reno (NEC), …
[Diagram labels: NS-2 Simulator, Linux Implementation]

Thank you! Q&A
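
Supplementary sketch: the three loss synchronization rate cases (TCP+DropTail, Pacing+DropTail, TCP+RED) can be compared numerically. The formulas in the code are a reading of the slide equations, which are damaged in this transcript, so they should be treated as assumptions: a contiguous burst of w packets overlaps a contiguous loss burst of L packets with probability (L + w - 1)/(cd + B + L), while spread-out packets or spread-out losses are treated as independent samples. The function names are illustrative, and the parameter values reuse cd+B = 1080, w = 60 and L = 16 from the MatLab computation slide purely as an example.

```python
def sync_rate_tcp_droptail(w, L, cd_plus_B):
    """Bursty flow (w contiguous packets) against a bursty loss (L contiguous drops)."""
    return min(1.0, (L + w - 1) / (cd_plus_B + L))


def sync_rate_pacing_droptail(w, L, cd_plus_B):
    """Paced flow: each of the w packets independently lands in the loss burst."""
    return 1.0 - (1.0 - L / (cd_plus_B + L)) ** w


def sync_rate_tcp_red(w, L, cd_plus_B):
    """Randomized loss: each of the L drops independently lands in the flow's burst."""
    return 1.0 - (1.0 - w / (cd_plus_B + L)) ** L


if __name__ == "__main__":
    w, L, cd_plus_B = 60, 16, 1080  # example values, see lead-in
    print("TCP + DropTail:   ", sync_rate_tcp_droptail(w, L, cd_plus_B))
    print("Pacing + DropTail:", sync_rate_pacing_droptail(w, L, cd_plus_B))
    print("TCP + RED:        ", sync_rate_tcp_red(w, L, cd_plus_B))
```

With these example values the bursty TCP and DropTail combination gives a much lower rate than either pacing or RED, in line with the qualitative results stated in the slides.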