IP Network Traffic Measurement and Modelling
Presented to the COST 282 MCM meeting on 24-25 September 2003, Istanbul
Dr. Zhili Sun and Mr. Lei Liang, Centre for Communication System Research, University of Surrey, Guildford, Surrey GU2 7XH, Z.Sun@surrey.ac.uk
To study IP network traffic by measurement.
To find mathematical formulas that fit the measurement results, so that they can be used in traffic modelling to capture the relevant network traffic features, attributes and characteristics.
Traffic Measurement Parameters
QoS parameters for traffic engineering include:
delay, jitter and packet loss
The IETF IPPM working group is defining metrics for these parameters.
Traffic parameters at the packet level include:
Throughput, packet length, packet interarrival time, packet burstiness and so on
Packet interarrival time is measured in this paper.
$$\mathrm{ArrivalInterval} = \mathrm{ArrivalTime}(i) - \mathrm{ArrivalTime}(i-1)$$
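The interarrival definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' measurement tool: the function name `interarrival_times` and the sample timestamps are hypothetical, and sorting the timestamps first is an assumption made here in case a capture file is not strictly ordered.

```python
def interarrival_times(arrival_times):
    """Return ArrivalTime(i) - ArrivalTime(i-1) for consecutive packets.

    arrival_times: packet arrival timestamps in seconds.
    Sorting first is an assumption of this sketch.
    """
    ordered = sorted(arrival_times)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


# Hypothetical timestamps (seconds) of four packets.
timestamps = [0.0, 0.5, 0.75, 2.0]
intervals = interarrival_times(timestamps)
print(intervals)  # [0.5, 0.25, 1.25]
```

A trace of n packets yields n - 1 interarrival samples, which are the raw data for the distributions fitted later.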
Parameter Measurement Algorithm
In each measurement, packets are classified in terms of flow direction.
Uplink stream: packets from the local machine to remote servers
Downlink stream: packets from remote servers to the local machine
The two flow directions are expected to have different performance and characteristics.
The TCP traffic generated by FTP applications at the measurement node was measured.
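The direction classification described above could look roughly like the following sketch. The `Packet` record, the `split_by_direction` helper and the addresses (taken from documentation address ranges) are all hypothetical. The slides do not say how the capture tool identified the local machine, so matching on the source and destination address is an assumption.

```python
from collections import namedtuple

# Hypothetical minimal packet record: arrival time plus endpoints.
Packet = namedtuple("Packet", ["timestamp", "src", "dst"])


def split_by_direction(packets, local_addr):
    """Split packets into (uplink, downlink) relative to local_addr."""
    uplink = [p for p in packets if p.src == local_addr]    # local -> remote
    downlink = [p for p in packets if p.dst == local_addr]  # remote -> local
    return uplink, downlink


LOCAL = "192.0.2.10"     # assumed address of the measurement node
SERVER = "198.51.100.7"  # assumed address of a remote FTP server

packets = [
    Packet(0.000, LOCAL, SERVER),
    Packet(0.020, SERVER, LOCAL),
    Packet(0.021, SERVER, LOCAL),
    Packet(0.045, LOCAL, SERVER),
]
uplink, downlink = split_by_direction(packets, LOCAL)
print(len(uplink), len(downlink))  # 2 2
```

Interarrival times would then be computed separately for each stream, so that the two directions can be compared.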
Packet capture method
[Figure: block diagram of the packet capture components: network adapter, capture driver, capture filter, customer application and output text file]
Packet Interarrival Time Analysis
Downloading files always produces very small interarrival times.
For both small and big file downloads, the RTT has a significant effect on the packet interarrival time.
The file size also affects the FTP packet interarrival time.
Fitting Using Pareto+Pareto Distribution
Fitting Using Pareto+Rayleigh Distribution
FTP Packet Interarrival Time Formula (1/3)
No single standard distribution was found to fit the measured interarrival time distributions well for both small and big file downloads.
The Pareto distribution fits the measurement curve very well around 0 seconds.
A sharp rise cuts off the distribution around the RTT point.
Two different standard distributions were therefore combined to model this kind of cut-off distribution.
The combination must guarantee that the final density $f_X(x)$ integrates to one:
$$F_X(\infty) = \int_{-\infty}^{\infty} f_X(x)\,dx = 1$$
FTP Packet Interarrival Time Formula (2/3)
For the small file download, the rise is very sharp. To model this distribution, we chose Pareto+Pareto distribution as the ideal model.
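$$
f_T(t) =
\begin{cases}
\dfrac{c_1 T_{min}^{c_1}}{t^{c_1+1}}, & T_{min} \le t < T_{RTT} \\[2ex]
\alpha \, \dfrac{c_2 T_{RTT}^{c_2}}{t^{c_2+1}}, & T_{RTT} \le t \le T_{max}
\end{cases}
$$
with the weight of the second branch
$$
\alpha = 1 - \int_{T_{min}}^{T_{RTT}} \frac{c_1 T_{min}^{c_1}}{x^{c_1+1}}\,dx
$$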
where $T_{RTT}$ is the cut-off point, and $T_{min}$ and $T_{max}$ are the minimum and maximum values of the FTP packet interarrival time respectively.
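As a rough numerical illustration of the Pareto+Pareto model, the sketch below evaluates a two-branch density and checks how much probability mass it carries. It assumes the second Pareto branch is scaled by the mass left over by the first branch (one minus the integral of the first Pareto from Tmin to TRTT, which has the closed form (Tmin/TRTT)^c1). The parameter values and the names `pareto_pareto_pdf` and `integrate` are hypothetical and are not fitted values from the measurements.

```python
import math


def pareto_pareto_pdf(t, c1, c2, t_min, t_rtt, t_max):
    """Two-branch Pareto density with a cut-off at t_rtt (sketch)."""
    if t < t_min or t > t_max:
        return 0.0
    if t < t_rtt:
        return c1 * t_min**c1 / t ** (c1 + 1)
    # Mass left after the first branch: 1 - integral over [t_min, t_rtt].
    tail_weight = (t_min / t_rtt) ** c1
    return tail_weight * c2 * t_rtt**c2 / t ** (c2 + 1)


def integrate(f, a, b, steps=20000):
    """Midpoint rule on a logarithmic grid (suits heavy-tailed densities)."""
    log_a, log_b = math.log(a), math.log(b)
    h = (log_b - log_a) / steps
    total = 0.0
    for k in range(steps):
        t = math.exp(log_a + (k + 0.5) * h)
        total += f(t) * t * h  # dt = t * d(log t)
    return total


# Hypothetical parameters (seconds), chosen only for illustration.
C1, C2, T_MIN, T_RTT, T_MAX = 1.5, 2.0, 0.001, 0.1, 10.0


def pdf(t):
    return pareto_pareto_pdf(t, C1, C2, T_MIN, T_RTT, T_MAX)


# Integrate each branch separately to avoid straddling the cut-off.
mass = integrate(pdf, T_MIN, T_RTT) + integrate(pdf, T_RTT, T_MAX)
print(mass)
```

With these assumptions the total mass is 1 - (Tmin/TRTT)^c1 (TRTT/Tmax)^c2, which is very close to one whenever Tmax is much larger than TRTT.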
FTP Packet Interarrival Time Formula (3/3)
It was found that the Pareto+Rayleigh distribution could model the packet interarrival time very well for the big file case.
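$$
f_T(t) =
\begin{cases}
\dfrac{c\, T_{min}^{c}}{t^{c+1}}, & T_{min} \le t < T_{RTT} \\[2ex]
\alpha \, \dfrac{t}{b^2} \exp\!\left(-\dfrac{t^2}{2b^2}\right), & T_{RTT} \le t \le T_{max}
\end{cases}
$$
with the weight of the second branch
$$
\alpha = 1 - \int_{T_{min}}^{T_{RTT}} \frac{c\, T_{min}^{c}}{x^{c+1}}\,dx
$$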
where $T_{RTT}$, $T_{min}$ and $T_{max}$ are the same as on the previous slide.
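The Pareto+Rayleigh model can be sketched in the same way. The slide gives the weight of the second branch as one minus the Pareto integral up to TRTT. This sketch additionally divides by the Rayleigh mass on [TRTT, Tmax] so that the truncated density integrates to one. That extra normalisation is an assumption of the sketch and is not stated in the slides. The parameter values and function names are hypothetical.

```python
import math


def pareto_rayleigh_pdf(t, c, b, t_min, t_rtt, t_max):
    """Pareto below the cut-off, rescaled Rayleigh above it (sketch)."""
    if t < t_min or t > t_max:
        return 0.0
    if t < t_rtt:
        return c * t_min**c / t ** (c + 1)
    # Mass left after the Pareto branch: 1 - integral over [t_min, t_rtt].
    remaining = (t_min / t_rtt) ** c
    # Rayleigh mass on [t_rtt, t_max]; dividing by it is an assumption
    # made here so that the whole density integrates to one.
    rayleigh_mass = (math.exp(-t_rtt**2 / (2 * b**2))
                     - math.exp(-t_max**2 / (2 * b**2)))
    rayleigh = (t / b**2) * math.exp(-t**2 / (2 * b**2))
    return remaining * rayleigh / rayleigh_mass


def integrate(f, a, b, steps=20000):
    """Midpoint rule on a logarithmic grid."""
    log_a, log_b = math.log(a), math.log(b)
    h = (log_b - log_a) / steps
    total = 0.0
    for k in range(steps):
        t = math.exp(log_a + (k + 0.5) * h)
        total += f(t) * t * h  # dt = t * d(log t)
    return total


# Hypothetical parameters (seconds), chosen only for illustration.
C, B, T_MIN, T_RTT, T_MAX = 1.2, 0.15, 0.001, 0.1, 2.0


def pdf(t):
    return pareto_rayleigh_pdf(t, C, B, T_MIN, T_RTT, T_MAX)


total_mass = integrate(pdf, T_MIN, T_RTT) + integrate(pdf, T_RTT, T_MAX)
print(total_mass)
```

Unlike a second Pareto branch, the Rayleigh branch has a mode at t = b, so the shape above the cut-off depends strongly on where b sits relative to TRTT.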
WIDE Backbone Traces
To verify the method described above, further analysis was carried out on 6 TCP traces provided by the MAWI (Measurement and Analysis on the WIDE Internet) Working Group.
The 6 traces were collected on an IPv6 line connected to WIDE-6Bone in January and February 2003.
In total they contain around 6 million TCP packets.
All of the traces were captured with TCPDUMP.EXE and saved in dump file format.
The arrival timestamp of each TCP packet in the 6 traces was extracted to calculate the packet interarrival time.
WIDE Backbone Traces Information
Trace | Measurement interval (seconds) | No. of TCP packets | TCP packet volume (bytes)
Tue., 14 Jan. 2003
Mon., 20 Jan. 2003
Sun., 26 Jan. 2003
Sun., 02 Feb. 2003
Tue., 11 Feb. 2003
Fri., 28 Feb. 2003
Total
All of the traces share a common characteristic: their packet interarrival time CDFs have a sharp cut-off around 0.11 seconds.
The cut-off is more pronounced when the TCP traffic load is lighter.
A possible cause is a pair of hosts communicating constantly through the measurement point, contributing a significant fixed RTT throughout all of the capture intervals.
This cut-off implies that a combination of more than one well-known distribution should be used to model the measured results.
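One simple way to look for such a cut-off in measured data is to build the empirical CDF and find the time bin, away from zero, that holds the largest share of samples. The sketch below is an illustrative heuristic rather than the procedure used for the WIDE traces. The helper names, the bin width, the `ignore_below` threshold and the sample values are all hypothetical.

```python
def empirical_cdf(samples):
    """Return (value, cumulative fraction) pairs for sorted samples."""
    ordered = sorted(samples)
    n = len(ordered)
    return [(value, (rank + 1) / n) for rank, value in enumerate(ordered)]


def steepest_bin(samples, bin_width, ignore_below=0.0):
    """Return (bin_start, fraction of all samples) for the fullest bin.

    Samples below ignore_below are skipped so that the mass of very
    small interarrival times near zero does not mask the cut-off.
    Returns None if no sample is at or above ignore_below.
    """
    counts = {}
    for value in samples:
        if value < ignore_below:
            continue
        index = int(value / bin_width)
        counts[index] = counts.get(index, 0) + 1
    if not counts:
        return None
    index, count = max(counts.items(), key=lambda item: item[1])
    return index * bin_width, count / len(samples)


# Hypothetical interarrival times (seconds) with a cluster near 0.11 s.
samples = [0.001, 0.002, 0.002, 0.003, 0.004,
           0.112, 0.113, 0.114, 0.115, 0.3, 0.7]

cdf = empirical_cdf(samples)
cutoff_start, jump = steepest_bin(samples, bin_width=0.01, ignore_below=0.05)
print(cutoff_start, jump)
```

The returned fraction is the height of the CDF step inside that bin, which gives a crude measure of how pronounced the cut-off is.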
TCP Traces Modelling
TCP Traces Modelling formula
Two Inverse Gaussian CDFs connected at the cut-off point could fit the measurement curve reasonably well
Inverse Gaussian Plus model
We can mathematically represent the TCP packet interarrival time using the following PDF formula
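$$
f_T(t) =
\begin{cases}
\left(\dfrac{w_1}{2\pi t^3}\right)^{1/2} \exp\!\left(-\dfrac{w_1 (t-u_1)^2}{2 u_1^2 t}\right), & T_{min} \le t < T_{CUT} \\[2ex]
\alpha \left(\dfrac{w_2}{2\pi t^3}\right)^{1/2} \exp\!\left(-\dfrac{w_2 (t-u_2)^2}{2 u_2^2 t}\right), & T_{CUT} \le t \le T_{max}
\end{cases}
$$
where
$$
\alpha = 1 - \int_{T_{min}}^{T_{CUT}} \left(\frac{w_1}{2\pi t^3}\right)^{1/2} \exp\!\left(-\frac{w_1 (t-u_1)^2}{2 u_1^2 t}\right) dt,
$$
$T_{CUT}$ is the cut-off point, and $T_{min}$ and $T_{max}$ are the minimum and maximum interarrival times respectively.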
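The Inverse Gaussian Plus density can be evaluated numerically as sketched below. The sketch assumes the second branch is scaled by one minus the integral of the first Inverse Gaussian branch from Tmin to TCUT, and it computes that integral with a simple midpoint rule. The parameter values and the names `inverse_gaussian_pdf`, `integrate` and `make_ig_plus_pdf` are hypothetical. They are not the values fitted to the WIDE traces.

```python
import math


def inverse_gaussian_pdf(t, w, u):
    """Inverse Gaussian density with shape w and mean u, for t > 0."""
    return (math.sqrt(w / (2.0 * math.pi * t**3))
            * math.exp(-w * (t - u) ** 2 / (2.0 * u**2 * t)))


def integrate(f, a, b, steps=20000):
    """Midpoint rule on a logarithmic grid."""
    log_a, log_b = math.log(a), math.log(b)
    h = (log_b - log_a) / steps
    total = 0.0
    for k in range(steps):
        t = math.exp(log_a + (k + 0.5) * h)
        total += f(t) * t * h  # dt = t * d(log t)
    return total


def make_ig_plus_pdf(w1, u1, w2, u2, t_min, t_cut, t_max):
    """Return (pdf, weight) for two Inverse Gaussians joined at t_cut."""
    first_mass = integrate(lambda t: inverse_gaussian_pdf(t, w1, u1),
                           t_min, t_cut)
    weight = 1.0 - first_mass  # weight applied to the second branch

    def pdf(t):
        if t < t_min or t > t_max:
            return 0.0
        if t < t_cut:
            return inverse_gaussian_pdf(t, w1, u1)
        return weight * inverse_gaussian_pdf(t, w2, u2)

    return pdf, weight


# Hypothetical parameters (seconds); the 0.11 s cut-off follows the slides.
pdf, weight = make_ig_plus_pdf(w1=0.05, u1=0.05, w2=1.0, u2=0.3,
                               t_min=0.0001, t_cut=0.11, t_max=5.0)
print(weight, pdf(0.05), pdf(0.3))
```

In a real fit, w1, u1, w2 and u2 would be estimated from the measured interarrival times on each side of the cut-off.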
The packet interarrival time distribution of IP traffic is sensitive to the RTT, which causes a cut-off point on the curve.
Two distribution functions are needed to fit the data.
To account for the difference caused by the size of the transferred file, two models were established for the FTP packet interarrival time distribution.
For transmitting small files: Pareto+Pareto model
For transmitting big files: Pareto+Rayleigh model
The modelling approach was also used to fit 6 backbone traces from the public domain.