14th Telecommunications forum TELFOR 2006, Serbia, Belgrade, November 21-23, 2006

Clock Skew Compensation by Speech Interpolation

Dejan Miljković, Tõnu Trump

Abstract: A common synchronization approach used in telecommunication networks is clock adjustment via a synchronous digital hierarchy. In packet based networks this is however not always feasible, and differences in local clock frequencies (clock skew) need to be handled by proper signal processing. This paper proposes a clock skew compensation algorithm that effectively changes the number of samples in the receiver's play-out buffer, dependent on the estimated clock skew, thereby avoiding systematic over/under-runs of the buffer.

Keywords: synchronization, clock skew, jitter, compensation.

I. INTRODUCTION

In Internet telephony, packets containing voice samples are forwarded from router to router on their way from transmitter to receiver. The packets are temporarily stored in the buffers of these routers. The exact storage time of each voice packet in a router's buffers depends on the network load. The accumulated storage times contribute to the variable component of the transport delay through the network. This phenomenon is known as delay jitter and it can be modeled as a random process.

Even when initially set accurately, non-synchronized real clocks will differ after some amount of time, as the clocks count time at slightly different rates. This difference accumulates over time and causes inconsistencies in operation in a network consisting of many end-points, each having its own clock. We shall call the difference in frequencies of two free-running clocks a clock skew. Because of its origin, clock skew can be considered a deterministic phenomenon.

Reference [1] evaluates several hardware and software implementations of IP phones. The results reported show that in most cases the clock skew is within 60 ppm, but it can be as high as 300 ppm. For comparison, we note that ITU-T Rec. G.823 [2] requires that the maximum permissible long-term mean controlled slip rate is no more than one slip in 70 days in a node based on the 2048 kbit/s hierarchy which connects to an independently-synchronized network.

As suggested in [3], clock skew compensation algorithms can be divided into two groups: the clock adjustment group and the signal processing group.

In the clock adjustment group the most popular approach is the use of the global positioning system (GPS) to make all relevant clocks precise enough [4]. Some researchers (e.g. [5]) propose more efficient algorithms than the traditional Phase Lock Loop for adjusting the receiver's clock. The underlying assumption made in this approach is that adjusting the clock is possible. Even though this is true in many cases, in others it is not feasible. For example, it may be difficult in the case of software IP telephony clients running on PCs, as the equipment may use an independent clock on the audio card to control the sampling rate [3]. Another complication occurs if one of the parties is a gateway interconnecting an IP network and the Public Switched Telephone Network (PSTN). The playout rate in such a gateway is typically synchronized to the PSTN clock and cannot be arbitrarily changed for individual calls. In these cases a signal processing algorithm must be used for clock skew compensation.

Paper [6] proposes one signal processing algorithm for clock skew compensation. It uses the temporal redundancy inherent in audio signals to identify segments of audio that may be repeated or cut from the stream (depending on the sign of the skew). The weakness of this algorithm is its dependency on the existence of such suitable segments, and it may have problems with rich content.

A more general approach using resampling techniques is proposed in [3]. The paper argues that the classic resampling technique becomes impractical for rational factors close to one due to its computational complexity, and proposes a lower complexity alternative: spline interpolation. This paper continues the work on the clock skew compensator started in [3]. Instead of applying spline interpolation on a complete test vector at once, we propose a clock skew compensation scheme that can be applied on a packet-by-packet basis and is therefore possible to implement in practical systems.

The paper is organized as follows. In Section II we introduce the problem and give a mathematical formalization of it. Section III describes the proposed clock skew compensator. Some important properties of the proposed algorithm are further discussed in Section IV. Section V provides our simulation results. Finally, Section VI presents the conclusions.

Authors are with Ericsson AB, Armborstvägen 14, Box 1505, Stockholm, Sweden. They can be reached by email: Dejan.Miljkovic@ericsson.com, Tonu.Trump@ericsson.com.

II. PROBLEM DESCRIPTION

Fig. 1 shows two terminals that communicate over a packet switched network. Since these two terminals are not synchronized, a signal that is sampled in the transmitter with sampling frequency $\omega_{Tx}$ is replayed with sampling frequency $\omega_{Rx}$ at the receiver.

[Block diagram: Terminal Tx, packet switched network, Terminal Rx]
Fig. 1. Communication between two terminals

Let us assume that the transmitter sends packets containing M samples at time instances:

$$t_{Tx}(t) = \frac{M}{\omega_{Tx}}\, t, \quad t = 1, 2, \ldots \tag{1}$$

The receiver expects them at time instances:

$$t_{Rx}(t) = \frac{M}{\omega_{Rx}}\, t + d_{Rx}, \quad t = 1, 2, \ldots \tag{2}$$

where $d_{Rx}$ is some transmission delay expected by the receiver. The actual arrival times are given by:

$$t_{R}(t) = \frac{M}{\omega_{Tx}}\, t + d_{R} + v(t), \quad t = 1, 2, \ldots \tag{3}$$

where $d_R$ is the minimum transmission delay from the transmitter to the receiver over the network, and $v(t)$ is a random variable characterizing the extra delay added by the network (delay jitter) at time t. The jitter can be modeled as an independent and identically distributed random process with an exponential probability density function.

Now the difference between the actual and expected arrival times of the packets can be computed as:

$$y(t) = t_R(t) - t_{Rx}(t) = M\left(\frac{1}{\omega_{Tx}} - \frac{1}{\omega_{Rx}}\right) t + d_R - d_{Rx} + v(t) = \theta\, t + a + v(t). \tag{4}$$

Apart from the jitter term $v(t)$, this difference is a linear function of time. The parameter $a = d_R - d_{Rx}$ can be interpreted as a correction to the initial guess about the minimum transportation delay. The parameter $\theta = (M/\omega_{Tx}) - (M/\omega_{Rx})$ is the difference of the packet periods in transmitter and receiver. $\theta$ will also be referred to as the clock skew parameter.

Let us assume that the receiver has an input buffer that can accommodate just one packet, and that data is read from the buffer when it is required for play-out. Then, if a new packet arrives before the previous one has been read out, the data is overwritten and a slip occurs. Likewise, if the system attempts to read from the input buffer before a packet has arrived, a slip occurs as there is no data to read.

Let us assume for a moment that there is no jitter present. Then, in the presence of clock skew, packet slips happen when the accumulated error of the expected packet arrival time reaches the packet period. It can be shown that the slip period is given by:

$$t_{PER} = \frac{M}{\left|\omega_{Tx} - \omega_{Rx}\right|} = \frac{1}{\left|A_{Tx}\right|} \cdot \frac{M}{\omega_{Rx}}. \tag{5}$$

$A_{Tx}$ represents the accuracy of the transmitter's clock when the receiver's clock is considered as a reference. The following table shows some values of $\omega_{Tx}$, $\theta$ and $t_{PER}$ for different accuracies $A_{Tx}$ of the transmitter's clock in a system where $\omega_{Rx}$ = 8000 Hz and M = 40.

TABLE 1: SOME CLOCK SKEW EXAMPLES

A_Tx [ppm]    ω_Tx [Hz]    θ [s]            t_PER [s]
60            8 000.48     -0.30 · 10^-6    83.333
300           8 002.40     -1.50 · 10^-6    16.667
142 857       9 142.86     -625 · 10^-6     0.035

III. CLOCK SKEW COMPENSATOR

The block diagram of the proposed algorithm is shown in Fig. 2.

[Block diagram: incoming packets of M samples are written to the FIFO at moments $t_R$; the clock skew estimator computes $\hat{\theta}_i$ from $y_i$; Converter to N outputs $N_j$; the Limiter monitors the FIFO fill level b and outputs $\tilde{N}_j$; the Interpolator reads N samples from the FIFO at moments $t_{Rx}$ and outputs packets of M samples]
Fig. 2. Clock skew compensator

The clock skew compensator is located in the receiver. At moments $t_R$ packets of M samples of data are received and written into a FIFO buffer. Then the difference between the actual and expected arrival time ($y_i$) is calculated according to (4).

The clock skew estimator algorithm is then used to update the estimate of the clock skew value from each obtained $y_i$. Converter to N calculates $N_j$, the number of samples to be read from the FIFO buffer at moments $t_{Rx}$. This number is then checked by the Limiter, which closes the loop by monitoring the number of samples in the FIFO buffer (b). This is the main difference compared to the solution with a de-jittering buffer. Instead of reading the same number (M) of data samples from the buffer each time, an adjustable number of samples ($\tilde{N}_j$) is read to compensate for the clock skew.

Since the receiver needs to provide M samples of data each packet period $M/\omega_{Rx}$, the Interpolator is needed to resample the signal from N to M samples.

In the remainder of this section each building block in Fig. 2 is explained in more detail (except the FIFO queue, which is a standard component).

A. Interpolator

The Interpolator's task is to provide a packet of M samples on the output by using N samples from the input by means of interpolation. This is equivalent to changing the sampling rate by the factor M/N. The classical digital signal processing method for resampling with ratio M/N is impractical here. This is because the clock skew is small in practice, hence the ratio M/N is close to one, which causes the complexity to become significant.

As suggested in [3], spline interpolation has lower complexity. Splines are piecewise polynomials with pieces that connect together smoothly. A zero order B-spline, $\beta^0(t)$, is a rectangular pulse, and the n-th order B-spline is obtained by convolving n + 1 copies of $\beta^0(t)$. Following the results of [3], we use a cubic spline interpolator in this paper.

B. Clock skew estimator

The recursive clock skew estimator proposed in [7] was used. The algorithm is able to provide rather accurate clock skew estimates in most network scenarios. The algorithm has a reasonably low computational complexity and memory consumption.

C. Converter to N

This block implements the control logic of the compensation algorithm, which needs to decide the number of samples to be read from the buffer at moments $t_{Rx}$ so that no buffer over/under-runs happen.

To make this possible, the ratio of the number of samples to be read to the packet size needs to be equal to the frequency ratio $\omega_{Tx}/\omega_{Rx}$. Hence, we need to read:

$$n = \frac{\omega_{Tx}}{\omega_{Rx}}\, M = \left(1 + A_{Tx}\right) M \tag{6}$$

samples from the FIFO buffer. $\omega_{Tx}$ is not known. The estimated value of the clock skew parameter $\theta$ is available as input and has to be used instead. $A_{Tx}$ can be expressed as:

$$A_{Tx} = -\frac{\theta\, \omega_{Rx}}{M + \theta\, \omega_{Rx}}. \tag{7}$$

Substitution of (7) into (6) gives:

$$n = \left(1 - \frac{\theta\, \omega_{Rx}}{M + \theta\, \omega_{Rx}}\right) M = \frac{M^2}{M + \theta\, \omega_{Rx}}. \tag{8}$$

Since n is in general not an integer, it needs to be rounded to obtain N. The rounding error, e, must be taken into account the next time N is calculated. One iteration of the algorithm is given by the following equations:

$$\begin{aligned} n_j &= \frac{M^2}{M + \hat{\theta}_i\, \omega_{Rx}} \\ N_j &= \operatorname{round}\left(n_j + e_{j-1}\right) \\ e_j &= n_j + e_{j-1} - N_j. \end{aligned} \tag{9}$$

D. Limiter

The role of the Limiter is to keep the number of samples in the FIFO buffer (b) between 0 and 2M. The reason why it is needed will be explained in the next section.

If more samples are requested than there are in the buffer ($N_j > b$), the Limiter will read all remaining samples ($\tilde{N}_j = b$), thus preventing buffer underflow.

If $b > 2M$, the Limiter will override the value $N_j$ recommended by Converter to N and read all remaining samples above M ($\tilde{N}_j = b - M$).

IV. PROPERTIES OF PROPOSED ALGORITHM

A. Processing delay

The clock skew compensator introduces a minimal constant delay of one packet period $M/\omega_{Rx}$. If the processing delay is smaller than $M/\omega_{Rx}$, slips will occur (e.g. in the situation when $\omega_{Tx} < \omega_{Rx}$, because two frames of M samples have to be sent one after another).

This means that the average number of samples in the buffer is M, and the minimal size of the FIFO buffer is 2M.

B. Resilience to clock skew estimation error

A statistical analysis of the clock skew estimation algorithm can be found in [6], where it is shown that the estimation error variance of the clock skew parameter is inversely proportional to the fourth power of the number of received packets. At the beginning of operation the estimation error ($\theta_{error}$) can be significant because too little data is available to get a reliable estimate.

It can be shown that in an open loop configuration, when there is no Limiter (i.e. $\tilde{N}_j = N_j$), the estimation error causes packet slips with period:

$$t_{CSC\_PER} = \frac{\theta}{\theta_{error}}\, t_{PER}. \tag{10}$$

Thus, slips would still occur, but with a far longer period, since $\theta_{error}$ is much smaller than $\theta$.

The Limiter closes the loop by monitoring the number of samples in the FIFO buffer. It not only eliminates any remaining packet slips, but also keeps the mean processing delay at the minimal value $M/\omega_{Rx}$.

C. Resilience to jitter

A de-jittering buffer is traditionally used to remove variations in packet delay. All packets are intentionally delayed for time $T_{Jitter}$. Any packet with delay greater than $T_{Jitter}$ will be discarded, as it comes too late to be processed. The value of $T_{Jitter}$ (either fixed or adaptive) is selected as a compromise between the probability of packet loss and the introduced delay.

As the clock skew compensator already uses a FIFO buffer, it can be used for de-jittering as well. The delay of $T_{Jitter}$ is achieved by keeping $B_0$ samples in the buffer:

$$B_0 = \omega_{Rx}\, T_{Jitter}. \tag{11}$$

Fig. 3.
Number of samples in the FIFO buffer

The Limiter is initialized with value $B_0$ to set the upper limit to $B_0 + 2M$. The mean introduced delay in this configuration is $(B_0 + M)/\omega_{Rx}$. Sample numbers 0 to $B_0$ in the buffer are used to protect against jitter. Samples $B_0$ to $B_0 + 2M$ are used to compensate for clock skew.

The clock skew compensator can be configured for any value of $T_{Jitter}$ and will not introduce any distortions as long as packets arrive in order. The compensator can be extended with packet sorting functionality if there is a need for it, but it then becomes application dependent.

V. SIMULATION RESULTS

In our simulation study we used speech sampled at 8 kHz, spoken by different speakers in different ambient conditions, and different types of music sampled at 44.1 kHz and 8 kHz. The packets always contained 5 ms of data independently of the sampling frequency.

In the first set of simulations we investigated a scenario where clock skew was present but no jitter. Different values of clock skew were applied in the range $\theta$ = -625 µs to 625 µs.

We conclude that the proposed algorithm works well in this situation, as no distortions caused by clock skew could be heard in the speech test vectors. The result was the same for music vectors sampled at 44.1 kHz. However, distortions were noticed for music vectors sampled at 8 kHz for $|\theta|$ > 25 µs. These distortions were similar to the aliasing distortions reported in [3].

In the second set of simulations jitter was added on top of two values of clock skew: $\theta$ = ±625 µs.

Two different sources of jitter were used. The first was taken from real network measurements. The second source was artificially created so that the jitter values have an exponential distribution. In both cases lost packets were excluded from the experiment (the influence of packet loss on quality is not in the scope of this paper).

Fig. 3 shows the number of samples in the buffer over time for one of the experiments. (The number of processed frames, i, is on the x axis.) In this simulation the second jitter source is used and the system parameters had the following values: $\omega_{Rx}$ = 8000 Hz, M = 40, $\theta$ = 25.1 µs and $T_{Jitter}$ = 30 ms. This figure best illustrates the operation of the clock skew compensation algorithm. $B_0$ samples (240) are used for de-jittering. The algorithm accumulates samples in the buffer so that there are $B_0 + 2M$ (320) samples available when the next slip would have happened. Two packets of M samples can be sent at that moment without causing buffer under-run.

The first period in Fig. 3 shows that the Limiter successfully kept the number of samples in the predefined range even though the estimation algorithm provided an inaccurate estimate due to too little data.

Listening tests showed no difference when compared to experiments done under the same conditions but with no jitter.

VI. CONCLUSION

This paper proposes an algorithm that successfully repairs any packet slips caused by clock skew. This is done by changing the number of samples in the receiver's play-out buffer using digital interpolation techniques. The proposed algorithm is able to work on a packet-by-packet basis, which makes it suitable for implementation in real-time systems.

The performance of the proposed algorithm was investigated in the simulation study. The algorithm was found to have good performance with both speech and music signals over a range of skew and jitter conditions.

REFERENCES

[1] W. Jiang, K. Koguchi, H. Schulzrinne, "QoS Evaluation of VoIP End-points", Proc. IEEE International Conference on Communications, Anchorage, Alaska, May 2003, pp. 1917-1921.
[2] ITU-T Recommendation G.823, "The control of jitter and wander within digital networks which are based on the 2048 kbit/s hierarchy", March 2000.
[3] T. Trump, "Compensator for Clock Skew in Voice Over Packet Networks by Speech Interpolation", Proc. IEEE International Symposium on Circuits and Systems, Vancouver, Canada, May 2004, vol. 5, pp. V-608-V-611.
[4] A. Pasztor, D. Veitch, "A Precision Infrastructure for Active Probing", PAM2001 - A workshop on Passive and Active Measurement, April 2001.
[5] R. Noro, "Synchronization over Packet-Switching Networks: Theory and Applications", Thèse No. 2178 (2000), Lausanne, EPFL.
[6] O. Hodson, C. Perkins, V. Hardman, "Skew detection and compensation for internet audio applications", Proc. IEEE International Conf. on Multimedia and Expo, New York, July 2000, vol. 3, pp. 1687-1690.
[7] T. Trump, "Maximum Likelihood Trend Estimation in Exponential Noise", IEEE Transactions on Signal Processing, vol. 49, no. 9, September 2001, pp. 2087-2095.
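
As an illustration of Sections III.C and III.D, one iteration of (9) followed by the Limiter rules can be sketched in Python as below. This is a minimal sketch rather than the authors' implementation. The function names (`converter_to_n`, `limiter`, `requested_samples`) are hypothetical, the clock skew estimate is simply passed in as a number instead of coming from the estimator of [7], and round-half-up is an assumed rounding mode, since the paper does not specify one.

```python
import math


def converter_to_n(theta_hat, m, omega_rx, e_prev):
    """One iteration of (9): return (N_j, e_j) for a skew estimate theta_hat."""
    n_j = m * m / (m + theta_hat * omega_rx)  # non-integer sample count, as in (8)
    total = n_j + e_prev                      # carry over the previous rounding error
    big_n = math.floor(total + 0.5)           # round half up (assumed rounding mode)
    return big_n, total - big_n               # new rounding error e_j


def limiter(n_req, b, m):
    """Limiter rules of Section III.D for a FIFO currently holding b samples."""
    if n_req > b:       # fewer samples than requested: read all that remain
        return b
    if b > 2 * m:       # buffer above 2M: read everything above M
        return b - m
    return n_req        # otherwise accept the Converter to N value


def requested_samples(theta_hat, m, omega_rx, packets):
    """Open-loop run of (9) with a fixed estimate; returns the list of N_j."""
    e = 0.0
    counts = []
    for _ in range(packets):
        n, e = converter_to_n(theta_hat, m, omega_rx, e)
        counts.append(n)
    return counts
```

For the third row of Table 1 (a clock skew parameter of -625 µs, M = 40, receiver sampling frequency 8000 Hz), (8) gives 1600/35, about 45.714 samples per packet. Calling `requested_samples(-625e-6, 40, 8000.0, 7)` returns counts of 45 or 46 that add up to 320 over seven packets, so the rounding error is carried forward and does not accumulate.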