

14th Telecommunications forum TELFOR 2006, Serbia, Belgrade, November 21-23, 2006

Clock Skew Compensation by Speech Interpolation

Dejan Miljković, Tõnu Trump

Abstract: A common synchronization approach used in telecommunication networks is clock adjustment via a synchronous digital hierarchy. In packet-based networks this is, however, not always feasible, and differences in local clock frequencies (clock skew) need to be handled by proper signal processing. This paper proposes a clock skew compensation algorithm that effectively changes the number of samples in the receiver's play-out buffer, dependent on the estimated clock skew, thereby avoiding systematic over/under-runs of the buffer.

Keywords: synchronization, clock skew, jitter.

Authors are with Ericsson AB, Armborstvägen 14, Box 1505, Stockholm, Sweden. They can be reached by email.

I. INTRODUCTION

In Internet telephony, packets containing voice samples are forwarded from router to router on their way from transmitter to receiver. The packets are temporarily stored in the buffers of these routers. The exact storage time of each voice packet in a router's buffer depends on the network load. The accumulated storage times contribute to the variable component of the transport delay through the network. This phenomenon is known as delay jitter and it can be modeled as a random process.

Even when initially set accurately, non-synchronized real clocks will differ after some time, as the clocks count time at slightly different rates. This difference accumulates over time and causes inconsistencies in operation in a network consisting of many end-points, each having its own clock. We shall call the difference in frequencies of two free-running clocks clock skew. Because of its origin, clock skew can be considered a deterministic phenomenon.

Reference [1] evaluates several hardware and software implementations of IP phones. The reported results show that in most cases the clock skew is within 60 ppm, but it can be as high as 300 ppm. For comparison we note that ITU-T Rec. G.823 [2] requires that the maximum permissible long-term mean controlled slip rate be no more than one slip in 70 days in a node based on the 2048 kbit/s hierarchy which connects to an independently synchronized network.

As suggested in [3], clock skew compensation algorithms can be divided into two groups: the clock adjustment group and the signal processing group.

In the clock adjustment group the most popular approach is the use of the global positioning system (GPS) to make all relevant clocks precise enough [4]. Some researchers (e.g. [5]) propose more efficient algorithms than the traditional phase-locked loop for adjusting the receiver's clock. The underlying assumption of this approach is that adjusting the clock is possible. Even though this is true in many cases, in others it is not feasible.

For example, it may be difficult in the case of software IP telephony clients running on PCs, as the equipment may use an independent clock on the audio card to control the sampling rate [3]. Another complication occurs if one of the parties is a gateway interconnecting an IP network and the Public Switched Telephone Network (PSTN). The playout rate in such a gateway is typically synchronized to the PSTN clock and cannot be arbitrarily changed for individual calls. In these cases a signal processing algorithm must be used for clock skew compensation.

Paper [6] proposes one signal processing algorithm for clock skew compensation. It uses the temporal redundancy inherent in audio signals to identify segments of audio that may be repeated or cut from the stream (depending on the sign of the skew). The weakness of this algorithm is its dependency on the existence of such suitable segments; it may have problems with rich content.

A more general approach based on resampling techniques is proposed in [3]. That paper argues that the classic resampling technique becomes impractical for rational factors close to one due to its computational complexity, and proposes a lower complexity alternative: spline interpolation. This paper continues the work on the clock skew compensator started in [3]. Instead of applying spline interpolation to a complete test vector at once, we propose a clock skew compensation scheme that can be applied on a packet-by-packet basis and is therefore possible to implement in practical systems.

The paper is organized as follows. In Section II we introduce the problem and give a mathematical formalization of it. Section III describes the proposed clock skew compensator. Some important properties of the proposed algorithm are further discussed in Section IV. Section V provides our simulation results. Finally, Section VI presents the conclusions.

II. PROBLEM DESCRIPTION

Fig. 1 shows two terminals that communicate over a packet switched network. Since these two terminals are not synchronized, a signal that is sampled in the transmitter with sampling frequency $f_{Tx}$ is replayed with sampling frequency $f_{Rx}$ at the receiver.

[Diagram: Terminal Tx, packet switched network, Terminal Rx]
Fig. 1. Communication between two terminals

Let us assume that the transmitter sends packets containing M samples at time instances:

$$t_{Tx} = \frac{M}{f_{Tx}}\, t, \qquad t = 1, 2, \ldots \tag{1}$$

The receiver expects them at time instances:

$$t_{Rx} = \frac{M}{f_{Rx}}\, t + d_{Rx}, \qquad t = 1, 2, \ldots \tag{2}$$

where $d_{Rx}$ is some transmission delay expected by the receiver. The actual arrival times are given by:

$$t_{R} = \frac{M}{f_{Tx}}\, t + d_{R} + v(t), \qquad t = 1, 2, \ldots \tag{3}$$

where $d_R$ is the minimum transmission delay from the transmitter to the receiver over the network, and $v(t)$ is a random variable characterizing the extra delay added by the network (delay jitter) at time t. The jitter can be modeled as an independent and identically distributed random process with exponential probability density.

Now the difference between the actual and expected arrival times of the packets can be computed as:

$$y(t) = t_{R} - t_{Rx} = M\left(\frac{1}{f_{Tx}} - \frac{1}{f_{Rx}}\right) t + d_{R} - d_{Rx} + v(t) = \theta\, t + a + v(t). \tag{4}$$

Apart from the jitter term, this difference is a linear function of time. The parameter $a$ can be interpreted as a correction to the initial guess about the minimum transportation delay. The parameter $\theta = (M/f_{Tx}) - (M/f_{Rx})$ is the difference of the packet periods in the transmitter and the receiver. $\theta$ will also be referred to as the clock skew parameter.

Let us assume that the receiver has an input buffer that can accommodate just one packet, and that data is read from the buffer when it is required for play-out. Then, if a new packet arrives before the previous one has been read out, the data is overwritten and a slip occurs. Likewise, if the system attempts to read from the input buffer before a packet has arrived, a slip occurs as there is no data to read.

Let us assume for a moment that there is no jitter present. Then, in the presence of clock skew, packet slips happen when the accumulated error of the expected packet arrival time reaches the packet period. It can be shown that the slip period is given by:

$$t_{PER} = \frac{M}{f_{Tx} - f_{Rx}} = \frac{1}{A_{Tx}} \cdot \frac{M}{f_{Rx}}. \tag{5}$$

$A_{Tx}$ represents the accuracy of the transmitter's clock when the receiver's clock is considered as a reference, i.e. $A_{Tx} = (f_{Tx} - f_{Rx})/f_{Rx}$. The following table shows some values of $f_{Tx}$, $\theta$ and $t_{PER}$ for different accuracies of the transmitter's clock $A_{Tx}$ in a system where $f_{Rx}$ = 8000 Hz and M = 40.

TABLE 1: SOME CLOCK SKEW EXAMPLES

A_Tx [ppm]    f_Tx [Hz]    theta [s]         t_PER [s]
        60    8 000.48     -0.30 · 10^-6       83.333
       300    8 002.40     -1.50 · 10^-6       16.667
   142 857    9 142.86     -625 · 10^-6         0.035

III. CLOCK SKEW COMPENSATOR

A block diagram of the proposed algorithm is shown in Fig. 2.

[Block diagram: packets i of M samples enter the FIFO; the Interpolator reads N samples and outputs packets j of M samples; $y_i$ feeds the Clock skew estimator, whose estimate $\hat{\theta}_i$ feeds the Converter to N, which produces $N_j$.]
Fig. 2. Clock skew compensator

The clock skew compensator is located in the receiver. At moments $t_R$ packets of M samples of data are received and written into a FIFO buffer. Then the difference between the actual and expected arrival time ($y_i$) is calculated according to (4).

The clock skew estimator algorithm is then used to update the estimate of the clock skew value from each obtained $y_i$.

The Converter to N calculates $N_j$, the number of samples to be read from the FIFO buffer at moments $t_{Rx}$. This number is then checked by the Limiter, which in a closed loop monitors the number of samples in the FIFO buffer (b). This is the main difference compared to the solution with a de-jittering buffer. Instead of reading the same number (M) of data samples from the buffer each time, an adjustable number of samples ($\tilde{N}_j$) is read to compensate for the clock skew.

Since the receiver needs to provide M samples of data each packet period $M/f_{Rx}$, the Interpolator is needed to resample the signal from N to M samples.

In the remainder of this section each building block in Fig. 2 is explained in more detail (except the FIFO queue,
which is a standard component).

A. Interpolator

The Interpolator's task is to provide a packet of M samples on the output by using N samples from the input by means of interpolation. This is equivalent to changing the sampling rate by the factor M/N. The classical digital signal processing method for resampling with ratio M/N is impractical here: the clock skew is small in practice, hence the ratio M/N is close to one, which makes the complexity significant.

As suggested in [3], spline interpolation has lower complexity. Splines are piecewise polynomials with pieces that smoothly connect together. A zero order B-spline, $\beta^0(t)$, is a rectangular pulse, and the n-th order B-spline is the (n + 1)-fold convolution of $\beta^0(t)$ with itself. Following the results of [3], we use a cubic spline interpolator in this paper.

B. Clock skew estimator

The recursive clock skew estimator proposed in [7] was used. The algorithm is able to provide rather accurate clock skew estimates in most network scenarios. It has a reasonably low computational complexity and memory consumption.

C. Converter to N

This block implements the control logic of the compensation algorithm, which needs to decide the number of samples to be read from the buffer at moments $t_{Rx}$ so that no buffer over/under-runs happen.

To make this possible, the ratio of the number of samples to be read to the packet size needs to be equal to the frequency ratio $f_{Tx}/f_{Rx}$. Hence, we need to read:

$$n = \frac{f_{Tx}}{f_{Rx}}\, M = (1 + A_{Tx})\, M \tag{6}$$

samples from the FIFO buffer. $f_{Tx}$ is not known. The estimated value of the clock skew parameter, $\hat{\theta}$, is available as input and has to be used instead. $A_{Tx}$ can be expressed as:

$$A_{Tx} = -\frac{\theta f_{Rx}}{M + \theta f_{Rx}}. \tag{7}$$

Substitution of (7) into (6) gives:

$$n = \left(1 - \frac{\theta f_{Rx}}{M + \theta f_{Rx}}\right) M = \frac{M^2}{M + \theta f_{Rx}}. \tag{8}$$

Since n is in general not an integer, it needs to be rounded to obtain N. The rounding error, e, must be taken into account the next time N is calculated. One iteration of the algorithm is given by the following equations:

$$\begin{aligned}
n_j &= \frac{M^2}{M + f_{Rx}\, \hat{\theta}_i} \\
N_j &= \operatorname{round}(n_j + e_{j-1}) \\
e_j &= n_j + e_{j-1} - N_j.
\end{aligned} \tag{9}$$

D. Limiter

The role of the Limiter is to keep the number of samples in the FIFO buffer (b) between 0 and 2M. The reason why it is needed will be explained in the next section.

If more samples are requested than there are in the buffer ($N_j > b$), the Limiter will read all remaining samples ($\tilde{N}_j = b$), thus preventing buffer underflow.

If $b > 2M$, the Limiter will override the value $N_j$ recommended by the "Converter to N" and read all remaining samples above M ($\tilde{N}_j = b - M$).

IV. PROPERTIES OF PROPOSED ALGORITHM

A. Processing delay

The clock skew compensator introduces a minimal constant delay of one packet period $M/f_{Rx}$. If the processing delay is smaller than $M/f_{Rx}$, slips will occur (e.g. in the situation when $f_{Tx} < f_{Rx}$, because two frames of M samples have to be sent one after another).

This means that the average number of samples in the buffer is M, and the minimal size of the FIFO buffer is 2M.

B. Resilience to clock skew estimation error

A statistical analysis of the clock skew estimation algorithm can be found in [6], where it is shown that the estimation error variance of the clock skew parameter is inversely proportional to the fourth power of the number of received packets. At the beginning of operation the estimation error ($\theta_{error}$) can be significant because too little data is available to get a reliable estimate.

It can be shown that in an open loop configuration, when there is no Limiter (i.e. $\tilde{N}_j = N_j$), the estimation error causes packet slips with period:

$$t_{CSC\_PER} = \frac{\theta}{\theta_{error}}\, t_{PER}. \tag{10}$$

Thus, slips would still occur, but with a far longer period, since $\theta_{error}$ is much smaller than $\theta$.

The Limiter closes the loop by monitoring the number of samples in the FIFO buffer. It not only eliminates any remaining packet slips, but also keeps the mean processing delay at the minimal value $M/f_{Rx}$.

C. Resilience to jitter

A de-jittering buffer is traditionally used to remove variations in packet delay. All packets are intentionally delayed for a time $T_{Jitter}$. Any packet with a delay greater than $T_{Jitter}$ will be discarded, as it comes too late to be processed. The value of $T_{Jitter}$ (either fixed or adaptive) is selected as a compromise between the probability of packet loss and the introduced delay.

As the clock skew compensator already uses a FIFO buffer, it can be used for de-jittering as well. The delay of $T_{Jitter}$ is achieved by keeping $B_0$ samples in the buffer:

$$B_0 = f_{Rx} \cdot T_{Jitter}. \tag{11}$$
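
As an illustration only, the control logic of Section III (one iteration of (9) followed by the Limiter rules) and the de-jittering offset of (11) can be sketched in Python as follows. The function and variable names (`samples_to_read`, `limit`, `dejitter_samples`, `theta_hat`, `f_rx`, `e_prev`) are hypothetical names chosen for this sketch. Rounding half up and the order in which the two Limiter conditions are tested are assumptions, since the text does not specify them; the clock skew estimator and the interpolator are not modeled.

```python
import math


def samples_to_read(theta_hat, M, f_rx, e_prev):
    """One iteration of (9): returns (N_j, e_j)."""
    n_j = M * M / (M + f_rx * theta_hat)  # ideal, generally non-integer count
    x = n_j + e_prev                      # add the rounding error carried over
    N_j = math.floor(x + 0.5)             # assumption: round half up
    e_j = x - N_j                         # error carried into the next packet
    return N_j, e_j


def limit(N_j, b, M):
    """Limiter rules of Section III.D for a FIFO currently holding b samples."""
    if N_j > b:        # more requested than available: read what is left
        return b
    if b > 2 * M:      # buffer too full: read everything above M
        return b - M
    return N_j         # otherwise accept the Converter's recommendation


def dejitter_samples(f_rx, t_jitter):
    """B0 from (11), rounded to a whole number of samples."""
    return round(f_rx * t_jitter)


if __name__ == "__main__":
    M, f_rx = 40, 8000.0
    # Skew parameter of the 60 ppm row of Table 1 (f_Tx = 8000.48 Hz),
    # used here as if it were a perfect estimate.
    theta_hat = M / 8000.48 - M / f_rx
    e, total = 0.0, 0
    for _ in range(1000):
        N, e = samples_to_read(theta_hat, M, f_rx, e)
        total += N
    print(total, dejitter_samples(f_rx, 0.030))
```

With these inputs the loop reads 40 002 samples over 1000 packets, which matches 1000 · M · f_Tx/f_Rx = 40 002.4 up to the carried rounding error, and (11) gives 240 samples for a 30 ms jitter allowance at 8000 Hz. Carrying the error e forward is what keeps the long-term read rate equal to the transmitter rate even though every individual read is an integer.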

[Plot: number of samples in the FIFO buffer versus the number of processed frames, i]
Fig. 3. Number of samples in the FIFO buffer

The Limiter is initialized with value $B_0$ to set the upper limit to $B_0 + 2M$. The mean introduced delay in this configuration is $(B_0 + M)/f_{Rx}$. Sample numbers 0 to $B_0$ in the buffer are used to protect against jitter. Samples $B_0$ to $B_0 + 2M$ are used to compensate for clock skew.

The clock skew compensator can be configured for any value of $T_{Jitter}$ and will not introduce any distortions as long as packets arrive in order. The compensator can be improved with packet sorting functionality if there is a need for it, but it will then become application dependent.

V. SIMULATION RESULTS

In our simulation study we used speech sampled at 8 kHz, spoken by different speakers in different ambient conditions, and different types of music sampled at 44.1 and 8 kHz. The packets always contained 5 ms of data independently of the sampling frequency.

In the first set of simulations we investigated a scenario where clock skew was present but no jitter. Different values of clock skew were applied in the range -625 to 625 µs.

We conclude that the proposed algorithm works well in this situation, as no distortions caused by clock skew could be heard in the speech test vectors. The result was the same for music vectors sampled at 44.1 kHz. However, distortions were noticed for music vectors sampled at 8 kHz for $\theta$ > 25 µs. These distortions were similar to the aliasing distortions reported in [3].

In the second set of simulations jitter was added on top of two values of clock skew: $\theta$ = ±625 µs.

Two different sources of jitter were used. The first was taken from real network measurements. The second source was artificially created so that the jitter values have an exponential distribution. In both cases lost packets were excluded from the experiment (the influence of packet loss on quality is not in the scope of this paper).

Fig. 3 shows the number of samples in the buffer over time for one of the experiments. (The number of processed frames, i, is on the x axis.) In this simulation the second jitter source is used and the system parameters had the following values: a sampling frequency of 8000 Hz, M = 40, $\theta$ = 25.1 µs and $T_{Jitter}$ = 30 ms. This figure best illustrates the operation of the clock skew compensation algorithm. $B_0$ samples (240) are used for de-jittering. The algorithm accumulates samples in the buffer so that there are $B_0 + 2M$ (320) samples available when the next slip would have happened. Two packets of M samples can be sent at that moment without causing a buffer under-run.

The first period in Fig. 3 shows that the Limiter successfully kept the number of samples in the predefined range even though the estimation algorithm provided an inaccurate estimate due to too little data.

Listening tests showed no difference when compared to experiments done under the same conditions but with no [...]

VI. CONCLUSION

This paper proposes an algorithm that successfully repairs any packet slips caused by clock skew. This is done by changing the number of samples in the receiver's play-out buffer using digital interpolation techniques. The proposed algorithm is able to work on a packet-by-packet basis, which makes it suitable for implementation in real-time systems.

The performance of the proposed algorithm was investigated in the simulation study. The algorithm was found to have good performance with both speech and music signals over a range of skew and jitter conditions.

REFERENCES

[1] W. Jiang, K. Koguchi, H. Schulzrinne, "QoS Evaluation of VoIP End-points", Proc. IEEE International Conference on Communications, Anchorage, Alaska, May 2003, pp. 1917–1921.
[2] ITU-T Recommendation G.823, "The control of jitter and wander within digital networks which are based on the 2048 kbit/s hierarchy", March 2000.
[3] Tõnu Trump, "Compensator for Clock Skew in Voice Over Packet Networks by Speech Interpolation", Proc. IEEE International Symposium on Circuits and Systems, vol. 5, pp. V-608–V-611, Vancouver, Canada, May 2004.
[4] A. Pasztor, D. Veitch, "A Precision Infrastructure for Active Probing", PAM2001 - A workshop on Passive and Active Measurement, April 2001.
[5] Raffaele Noro, "Synchronization over Packet-Switching Networks: Theory and Applications", Thèse No 2178 (2000), Lausanne, EPFL.
[6] O. Hodson, C. Perkins, V. Hardman, "Skew detection and compensation for internet audio applications", Proc. IEEE International Conf. on Multimedia and Expo, New York, July 2000, vol. 3, pp. 1687–1690.
[7] Tõnu Trump, "Maximum Likelihood Trend Estimation in Exponential Noise", IEEE Transactions on Signal Processing, vol. 49, no. 9, September 2001, pp. 2087–2095.

