Clock Skew Compensation by Speech Interpolation


14th Telecommunications forum TELFOR 2006 Serbia, Belgrade, November 21-23, 2006
Dejan Miljković, Tõnu Trump

Abstract – A common synchronization approach used in telecommunication networks is clock adjustment via a synchronous digital hierarchy. In packet-based networks this is, however, not always feasible, and differences in local clock frequencies (clock skew) need to be handled by proper signal processing. This paper proposes a clock skew compensation algorithm that effectively changes the number of samples in the receiver’s play-out buffer, depending on the estimated clock skew, thereby avoiding systematic over/under-runs of the buffer.

Keywords – synchronization, clock skew, jitter, compensation.

I. INTRODUCTION

In Internet telephony, the packets containing voice samples are forwarded from router to router on their way from transmitter to receiver. The packets are temporarily stored in the buffers of these routers. The exact storage time of each voice packet in a router’s buffers depends on the network load. The accumulated storage times contribute to the variable component of the transport delay through the network. This phenomenon is known as delay jitter, and it can be modeled as a random process.

Even when initially set accurately, non-synchronized real clocks will differ after some amount of time, as the clocks count time at slightly different rates. This difference accumulates over time and causes inconsistencies in the operation of a network consisting of many end-points, each having its own clock. We shall call the difference in frequencies of two free-running clocks a clock skew. Because of its origin, clock skew can be considered a deterministic phenomenon.

Reference [1] evaluates several hardware and software implementations of IP phones. The reported results show that in most cases the clock skew is within 60 ppm, but it can be as high as 300 ppm. For comparison, we note that ITU-T Rec. G.823 [2] requires that the maximum permissible long-term mean controlled slip rate is no more than one slip in 70 days in a node based on the 2048 kbit/s hierarchy which connects to an independently-synchronized network.

As suggested in [3], clock skew compensation algorithms can be divided into two groups: the clock adjustment group and the signal processing group.

In the clock adjustment group, the most popular approach is the use of the global positioning system (GPS) to make all relevant clocks precise enough [4]. Some researchers (e.g. [5]) propose algorithms that are more efficient than the traditional Phase Lock Loop for adjusting the receiver’s clock. The underlying assumption made in this approach is that adjusting the clock is possible. Even though this is true in many cases, in others it is not feasible. For example, it may be difficult in the case of software IP telephony clients running on PCs, as the equipment may use an independent clock on the audio card to control the sampling rate [3]. Another complication occurs if one of the parties is a gateway interconnecting an IP network and the Public Switched Telephone Network (PSTN). The playout rate in such a gateway is typically synchronized to the PSTN clock and cannot be arbitrarily changed for individual calls. In these cases a signal processing algorithm must be used for clock skew compensation.

Paper [6] proposes one signal processing algorithm for clock skew compensation. It uses the temporal redundancy inherent in audio signals to identify segments of audio that may be repeated or cut from the stream (depending on the sign of the skew). The weakness of this algorithm is its dependency on the existence of such suitable segments, so it may have problems with rich content.

A more general approach, using resampling techniques, is proposed in [3]. That paper argues that the classic resampling technique becomes impractical for rational factors close to one due to its computational complexity, and proposes a lower complexity alternative: spline interpolation. This paper continues the work on the clock skew compensator started in [3]. Instead of applying spline interpolation to a complete test vector at once, we propose a clock skew compensation scheme that can be applied on a packet-by-packet basis and is therefore possible to implement in practical systems.

The paper is organized as follows. In Section II we introduce the problem and give its mathematical formalization. Section III describes the proposed clock skew compensator. Some important properties of the proposed algorithm are further discussed in Section IV. Section V provides our simulation results. Finally, Section VI presents the conclusions.

The authors are with Ericsson AB, Armborstvägen 14, Box 1505,
Stockholm, Sweden. They can be reached by email at
Dejan.Miljkovic@ericsson.com, Tonu.Trump@ericsson.com.

II. PROBLEM DESCRIPTION

Fig. 1 shows two terminals that communicate over a packet switched network. Since these two terminals are not synchronized, a signal that is sampled in the transmitter with sampling frequency $f_{Tx}$ is replayed with sampling frequency $f_{Rx}$ at the receiver.

[Figure: Terminal Tx, packet switched network, Terminal Rx]
Fig. 1. Communication between two terminals

Let us assume that the transmitter sends packets containing $M$ samples at time instances:

$$t_{Tx}(t) = \frac{M}{f_{Tx}}\, t, \qquad t = 1, 2, \ldots \tag{1}$$

The receiver expects them at time instances:

$$t_{Rx}(t) = \frac{M}{f_{Rx}}\, t + d_{Rx}, \qquad t = 1, 2, \ldots \tag{2}$$

where $d_{Rx}$ is some transmission delay expected by the receiver. The actual arrival times are given by:

$$t_{R}(t) = \frac{M}{f_{Tx}}\, t + d_{R} + v(t), \qquad t = 1, 2, \ldots \tag{3}$$

where $d_R$ is the minimum transmission delay from the transmitter to the receiver over the network, and $v(t)$ is a random variable characterizing the extra delay added by the network (delay jitter) at time $t$. The jitter can be modeled as an independent and identically distributed random process with an exponential probability density function.

Now the difference between the actual and expected arrival times of the packets can be computed as:

$$y(t) = t_{R}(t) - t_{Rx}(t) = M\left(\frac{1}{f_{Tx}} - \frac{1}{f_{Rx}}\right) t + (d_{R} - d_{Rx}) + v(t) = \theta\, t + a + v(t). \tag{4}$$

Apart from the jitter term, this difference is a linear function of time. The parameter $a$ can be interpreted as a correction to the initial guess about the minimum transportation delay. The parameter $\theta = (M/f_{Tx}) - (M/f_{Rx})$ is the difference between the packet periods in the transmitter and the receiver. $\theta$ will also be referred to as the clock skew parameter.

Let us assume that the receiver has an input buffer that can accommodate just one packet, and that data is read from the buffer when it is required for play-out. Then, if a new packet arrives before the previous one is read out, the data is overwritten and a slip occurs. Likewise, if the system attempts to read from the input buffer before a packet has arrived, a slip occurs as there is no data to read.

Let us assume for a moment that there is no jitter present. Then, in the presence of clock skew, packet slips happen when the accumulated error of the expected packet arrival time reaches the packet period. It can be shown that the slip period is given by:

$$t_{PER} = \frac{M}{|f_{Tx} - f_{Rx}|} = \frac{1}{|A_{Tx}|} \cdot \frac{M}{f_{Rx}}. \tag{5}$$

$A_{Tx}$ represents the accuracy of the transmitter’s clock when the receiver’s clock is considered as a reference, i.e. $A_{Tx} = (f_{Tx} - f_{Rx})/f_{Rx}$. The following table shows some values of $f_{Tx}$, $\theta$ and $t_{PER}$ for different accuracies of the transmitter’s clock $A_{Tx}$ in a system where $f_{Rx}$ = 8000 Hz and $M$ = 40.

TABLE 1: SOME CLOCK SKEW EXAMPLES
A_Tx [ppm]    f_Tx [Hz]    θ [s]           t_PER [s]
60            8 000.48     -0.30·10^-6     83.333
300           8 002.40     -1.50·10^-6     16.667
142 857       9 142.86     -625·10^-6      0.035

III. CLOCK SKEW COMPENSATOR

A block diagram of the proposed algorithm is shown in Fig. 2.

[Figure: block diagram of the clock skew compensator. Blocks: Clock skew estimator, Converter to N, Limiter, FIFO, Interpolator. Signals: $y_i$, $\hat{\theta}_i$, $N_j$, $\bar{N}_j$, $b$, $t_R$, $t_{Rx}$. Data path: Packet in [M], FIFO, [N], Interpolator, [M], Packet out.]
Fig. 2. Clock skew compensator

The clock skew compensator is located in the receiver. At moments $t_R$, packets of $M$ samples of data are received and written to a FIFO buffer. Then the difference between the actual and expected arrival time ($y_i$) is calculated according to (4).

The clock skew estimator algorithm is then used to update the estimate of the clock skew value from each obtained $y_i$.

Converter to N calculates $N_j$, the number of samples to be read from the FIFO buffer at moments $t_{Rx}$. This number is then checked by the Limiter, which monitors the number of samples in the FIFO buffer ($b$) in a closed loop. This is the main difference compared to the solution with a de-jittering buffer. Instead of reading the same number ($M$) of data samples from the buffer each time, an adjustable number of samples ($\bar{N}_j$) is read to compensate for the clock skew.

Since the receiver needs to provide $M$ samples of data each packet period $M/f_{Rx}$, the Interpolator is needed to resample the signal from $N$ to $M$ samples.

In the remainder of this section each building block in Fig. 2 is explained in more detail (except the FIFO queue
which is a standard component).

A. Interpolator

The Interpolator’s task is to provide a packet of $M$ samples on the output by using $N$ samples from the input by means of interpolation. This is equivalent to changing the sampling rate by the factor $M/N$. The classical digital signal processing method for resampling with ratio $M/N$ is impractical. This is because the clock skew is small in practice, hence the ratio $M/N$ is close to one, causing the complexity to become significant.

As suggested in [3], spline interpolation has lower complexity. Splines are piecewise polynomials with pieces that connect together smoothly. A zero order B-spline, $\beta^0(t)$, is a rectangular pulse, and the $n$-th order B-spline is the $(n+1)$-fold convolution of $\beta^0(t)$ with itself. Following the results of [3], we use a cubic spline interpolator in this paper.

B. Clock skew estimator

The recursive clock skew estimator proposed in [7] was used. The algorithm is able to provide rather accurate clock skew estimates in most network scenarios. The algorithm has a reasonably low computational complexity and memory consumption.

C. Converter to N

This block implements the control logic of the compensation algorithm, which needs to decide the number of samples to be read from the buffer at moments $t_{Rx}$ so that no buffer over/under-runs happen.

To make this possible, the ratio of the number of samples to be read to the packet size needs to be equal to the frequency ratio $f_{Tx}/f_{Rx}$. Hence, we need to read:

$$n = \frac{f_{Tx}}{f_{Rx}}\, M = (1 + A_{Tx})\, M \tag{6}$$

samples from the FIFO buffer. $f_{Tx}$ is not known. The estimated value of the clock skew parameter, $\hat{\theta}$, is available as input and has to be used instead. $A_{Tx}$ can be expressed as:

$$A_{Tx} = -\frac{\theta f_{Rx}}{M + \theta f_{Rx}}. \tag{7}$$

Substitution of (7) into (6) gives:

$$n = \left(1 - \frac{\theta f_{Rx}}{M + \theta f_{Rx}}\right) M = \frac{M^2}{M + \theta f_{Rx}}. \tag{8}$$

Since $n$ is in general not an integer, it needs to be rounded to obtain $N$. The rounding error, $e$, must be taken into account the next time $N$ is calculated. One iteration of the algorithm is given by the following equations:

$$\begin{aligned}
n_j &= \frac{M^2}{M + \hat{\theta}_i f_{Rx}} \\
N_j &= \operatorname{round}(n_j + e_{j-1}) \\
e_j &= n_j + e_{j-1} - N_j.
\end{aligned} \tag{9}$$

D. Limiter

The role of the Limiter is to keep the number of samples in the FIFO buffer ($b$) between 0 and $2M$. The reason why it is needed is explained in the next section.

If more samples are requested than there are in the buffer ($N_j > b$), the Limiter will read all remaining samples ($\bar{N}_j = b$), thus preventing buffer underflow.

If $b > 2M$, the Limiter will override the value $N_j$ recommended by Converter to N and read all remaining samples above $M$ ($\bar{N}_j = b - M$).

IV. PROPERTIES OF PROPOSED ALGORITHM

A. Processing delay

The clock skew compensator introduces a minimal constant delay of one packet period $M/f_{Rx}$. If the processing delay is smaller than $M/f_{Rx}$, slips will occur (e.g. in the situation when $f_{Tx} < f_{Rx}$, because two frames of $M$ samples have to be sent one after another).

This means that the average number of samples in the buffer is $M$, and the minimal size of the FIFO buffer is $2M$.

B. Resilience to clock skew estimation error

A statistical analysis of the clock skew estimation algorithm can be found in [6], where it is shown that the estimation error variance of the clock skew parameter is inversely proportional to the fourth power of the number of received packets. At the beginning of operation the estimation error ($\theta_{error}$) can be significant because too little data is available to get a reliable estimate.

It can be shown that in an open loop configuration, when there is no Limiter (i.e. $\bar{N}_j = N_j$), the estimation error causes packet slips with period:

$$t_{CSC\_PER} = \frac{\theta}{\theta_{error}}\, t_{PER}. \tag{10}$$

Thus, slips would still occur, but with a far longer period, since $\theta_{error}$ is much smaller than $\theta$.

The Limiter closes the loop by monitoring the number of samples in the FIFO buffer. It not only eliminates any remaining packet slips, but also keeps the mean processing delay at the minimal $M/f_{Rx}$.

C. Resilience to jitter

A de-jittering buffer is traditionally used to remove variations in packet delay. All packets are intentionally delayed for time $T_{Jitter}$. Any packet with a delay greater than $T_{Jitter}$ will be discarded, as it comes too late to be processed. The value of $T_{Jitter}$ (either fixed or adaptive) is selected as a compromise between the probability of packet loss and the introduced delay.

As the clock skew compensator already uses a FIFO buffer, it can be used for de-jittering as well. The delay of $T_{Jitter}$ is achieved by keeping $B_0$ samples in the buffer:

$$B_0 = f_{Rx}\, T_{Jitter}. \tag{11}$$
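
The Converter to N recursion (9), the Limiter rules of Section III.D and the de-jitter offset (11) can be put together in a few lines. The following Python sketch is illustrative only and is not the authors' implementation: the class name `ReadPlanner` and its members are hypothetical, rounding is done half-up, and the way the upper Limiter rule is shifted by $B_0$ (reading down to $B_0 + M$ when the buffer exceeds $B_0 + 2M$) is an assumption generalizing the $B_0 = 0$ rule $\bar{N}_j = b - M$.

```python
import math
from dataclasses import dataclass


@dataclass
class ReadPlanner:
    """Hypothetical helper: Converter to N (eq. 9) plus Limiter (Sec. III.D)."""
    M: int                  # samples per packet
    f_rx: float             # receiver sampling frequency [Hz]
    t_jitter: float = 0.0   # de-jitter delay T_Jitter [s]
    carry: float = 0.0      # rounding error e_{j-1} carried between iterations

    @property
    def b0(self) -> int:
        # Eq. (11): samples kept in the buffer to absorb jitter.
        return int(round(self.f_rx * self.t_jitter))

    def converter_to_n(self, theta_hat: float) -> int:
        # Eq. (9): n_j = M^2 / (M + theta_hat * f_Rx), then round with carry.
        n = self.M ** 2 / (self.M + theta_hat * self.f_rx)
        x = n + self.carry
        N = math.floor(x + 0.5)      # round half up (an assumption)
        self.carry = x - N           # e_j = n_j + e_{j-1} - N_j
        return N

    def limiter(self, N: int, b: int) -> int:
        # b is the current number of samples in the FIFO buffer.
        if b > self.b0 + 2 * self.M:
            # Too full: read everything above M (shifted by B0, assumption).
            return b - self.b0 - self.M
        if N > b:
            # Not enough data: read all remaining samples to avoid underflow.
            return b
        return N


if __name__ == "__main__":
    planner = ReadPlanner(M=40, f_rx=8000.0, t_jitter=0.030)
    print("B0 =", planner.b0)
    for j in range(7):
        N = planner.converter_to_n(-625e-6)
        print(j, N, planner.limiter(N, b=300))
```

With $\hat{\theta} = -625\,\mu s$, $f_{Rx}$ = 8000 Hz and $M$ = 40, (9) gives $n_j = 1600/35 \approx 45.71$, so the carried rounding error makes the integer read counts alternate between 45 and 46 while their running sum tracks the ideal value.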

The Limiter is initialized with the value $B_0$ to set the upper limit to $B_0 + 2M$. The mean introduced delay in this configuration is $(B_0 + M)/f_{Rx}$. Sample numbers 0 to $B_0$ in the buffer are used to protect against jitter. Samples $B_0$ to $B_0 + 2M$ are used to compensate for clock skew.

The clock skew compensator can be configured for any value of $T_{Jitter}$ and will not introduce any distortions as long as packets arrive in order. The compensator can be extended with packet sorting functionality if there is a need for it, but it would then become application dependent.

V. SIMULATION RESULTS

In our simulation study we used speech sampled at 8 kHz, spoken by different speakers in different ambient conditions, and different types of music sampled at 44.1 and 8 kHz. The packets always contained 5 ms of data, independently of the sampling frequency.

In the first set of simulations we investigated a scenario where clock skew was present but no jitter. Different values of clock skew were applied in the range $-625$ to $625\ \mu s$.

We conclude that the proposed algorithm works well in this situation, as no distortions caused by clock skew could be heard in the speech test vectors. The result was the same for music vectors sampled at 44.1 kHz. However, distortions were noticed for music vectors sampled at 8 kHz for $|\theta| > 25\ \mu s$. These distortions were similar to the aliasing distortions reported in [3].

In the second set of simulations jitter was added on top of two values of clock skew: $\theta = \pm 625\ \mu s$.

Two different sources of jitter were used. The first was taken from real network measurements. The second source was artificially created so that the jitter values have an exponential distribution. In both cases lost packets were excluded from the experiment (the influence of packet loss on quality is not in the scope of this paper).

Fig. 3 shows the number of samples in the buffer over time for one of the experiments. (The number of processed frames, $i$, is on the x axis.) In this simulation the second jitter source is used and the system parameters had the following values: $f_{Rx}$ = 8000 Hz, $M$ = 40, $\theta = 25.1\ \mu s$ and $T_{Jitter}$ = 30 ms. This figure best illustrates the operation of the clock skew compensation algorithm. $B_0$ samples (240) are used for de-jittering. The algorithm accumulates samples in the buffer so that there are $B_0 + 2M$ (320) samples available when the next slip would have happened. Two packets of $M$ samples can be sent at that moment without causing a buffer under-run.

Fig. 3. Number of samples in the FIFO buffer

The first period in Fig. 3 shows that the Limiter successfully kept the number of samples in the predefined range even though the estimation algorithm provided an inaccurate estimate due to too little data.

Listening tests showed no difference when compared to experiments done under the same conditions but with no jitter.

VI. CONCLUSION

This paper proposes an algorithm that successfully repairs any packet slips caused by clock skew. This is done by changing the number of samples in the receiver’s play-out buffer using digital interpolation techniques. The proposed algorithm is able to work on a packet-by-packet basis, which makes it suitable for implementation in real-time systems.

The performance of the proposed algorithm was investigated in a simulation study. The algorithm was found to have good performance with both speech and music signals over a range of skew and jitter conditions.

REFERENCES

[1] W. Jiang, K. Koguchi, H. Schulzrinne, “QoS Evaluation of VoIP End-points”, Proc. IEEE International Conference on Communications, Anchorage, Alaska, May 2003, pp. 1917–1921.
[2] ITU-T Recommendation G.823, “The control of jitter and wander within digital networks which are based on the 2048 kbit/s hierarchy”, March 2000.
[3] Tõnu Trump, “Compensator for Clock Skew in Voice Over Packet Networks by Speech Interpolation”, Proc. IEEE International Symposium on Circuits and Systems, Vancouver, Canada, May 2004, vol. 5, pp. V-608–V-611.
[4] A. Pasztor, D. Veitch, “A Precision Infrastructure for Active Probing”, PAM2001 - A workshop on Passive and Active Measurement, April 2001.
[5] Raffaele Noro, “Synchronization over Packet-Switching Networks: Theory and Applications”, Thèse No 2178 (2000), Lausanne, EPFL.
[6] O. Hodson, C. Perkins and V. Hardman, “Skew detection and compensation for internet audio applications”, Proc. IEEE International Conf. on Multimedia and Expo, New York, July 2000, vol. 3, pp. 1687–1690.
[7] Tõnu Trump, “Maximum Likelihood Trend Estimation in Exponential Noise”, IEEE Transactions on Signal Processing, vol. 49, no. 9, September 2001, pp. 2087–2095.