ADAPTIVE FILTERS
Now consider the expectation of Eq. (37):

E[w(n+1)] = E[w(n)] + 2µE[d(n)x(n)] − 2µE[x(n)x^T(n)]E[w(n)]    (38)

We have assumed that the filter weights are uncorrelated with the input signal. This is not strictly satisfied, because the weights depend on x(n); but the approximation is reasonable for small values of µ, since the weight trajectory is then slow compared with the input fluctuations. So, subtracting the optimum solution from both sides of Eq. (38), and substituting the autocorrelation matrix R and the cross-correlation vector p, we get

E[w(n+1)] − R^{-1}p = E[w(n)] − R^{-1}p + 2µR{R^{-1}p − E[w(n)]}    (39)

Next, defining

ξ(n+1) = E[w(n+1)] − R^{-1}p    (40)

from Eq. (39) we obtain

ξ(n+1) = (I − 2µR)ξ(n)    (41)

This process is equivalent to a translation of coordinates. Next, we define R in terms of an orthogonal transformation (7):

R = K^T Q K    (42)

where Q is a diagonal matrix consisting of the eigenvalues (λ_0, λ_1, ..., λ_{N−1}) of the correlation matrix R, and K is the unitary matrix consisting of the eigenvectors associated with these eigenvalues. Substituting Eq. (42) in Eq. (41), we have

ξ(n+1) = (I − 2µK^T Q K)ξ(n) = K^T(I − 2µQ)Kξ(n)    (43)

Multiplying both sides of Eq. (43) by K and defining

v(n+1) = Kξ(n+1) = (I − 2µQ)v(n)    (44)

we may rewrite Eq. (44) in matrix form as

[v_0(n), v_1(n), ..., v_{N−1}(n)]^T = diag[(1 − 2µλ_0)^n, (1 − 2µλ_1)^n, ..., (1 − 2µλ_{N−1})^n] · [v_0(0), v_1(0), ..., v_{N−1}(0)]^T    (45)

For stable convergence each term in Eq. (45) must be less than one in magnitude, so we must have

0 < µ < 1/λ_max    (46)

where λ_max is the largest eigenvalue of the correlation matrix R, though this is not a sufficient condition for stability under all signal conditions. The final convergence rate of the algorithm is determined by the value of the smallest eigenvalue. An important characteristic of the input signal is therefore the eigenvalue spread, or disparity, defined as

λ_max/λ_min    (47)

From the point of view of convergence speed, the ideal value of the eigenvalue spread is unity; the larger the value, the slower the final convergence. It can be shown (3) that the eigenvalues of the autocorrelation matrix are bounded by the maximum and minimum values of the power spectral density of the input. It is therefore concluded that the optimum signal for fastest convergence of the LMS algorithm is white noise, and that any form of coloring in the signal will increase the convergence time. This dependence of convergence on the spectral characteristics of the input signal is a major problem with the LMS algorithm, as discussed in Ref. 6.

LMS-Based Algorithms

The Normalized LMS Algorithm. The normalized LMS (NLMS) algorithm is a variation of the ordinary LMS algorithm. Its objective is to overcome the gradient noise amplification problem, which arises because in the standard LMS the correction e(n)x(n) is directly proportional to the input vector x(n); when x(n) is large, the LMS algorithm amplifies the noise.

Consider the LMS algorithm defined by

w(n+1) = w(n) + 2µe(n)x(n)    (48)

Now consider the difference between the optimum vector w* and the current weight vector w(n):

v(n) = w* − w(n)    (49)

Assume that the reference signal and the error signal are

d(n) = w*^T x(n)    (50)

e(n) = d(n) − w^T(n)x(n)    (51)

Substituting Eq. (50) in Eq. (51), we obtain

e(n) = w*^T x(n) − w^T(n)x(n) = [w*^T − w^T(n)]x(n) = v^T(n)x(n)    (52)

We decompose v(n) into its rectangular components:

v(n) = v_o(n) + v_p(n)    (53)
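As a concrete illustration (not part of the original text), the plain LMS recursion of Eq. (48), with the step size chosen from the bound of Eq. (46), can be sketched as follows. The unknown system h_true, the filter order M, the signal length, and the 0.1 safety factor on the step size are assumptions made for this sketch:

```python
import numpy as np

# Identify an unknown FIR system with plain LMS, Eq. (48).
# h_true, M, N, and the 0.1 safety factor are assumptions for this sketch.
rng = np.random.default_rng(0)
M = 4                                        # filter order
h_true = np.array([1.0, 0.5, -0.3, 0.1])     # unknown system w*
N = 5000
x = rng.standard_normal(N)                   # white input
d = np.convolve(x, h_true)[:N]               # reference d(n) = w*^T x(n), Eq. (50)

# Estimate the autocorrelation matrix R and take its largest eigenvalue
# to choose a step size satisfying 0 < mu < 1/lambda_max, Eq. (46).
r = np.array([x[:N - k] @ x[k:] / (N - k) for k in range(M)])
R = np.array([[r[abs(i - j)] for j in range(M)] for i in range(M)])
lam_max = np.linalg.eigvalsh(R).max()
mu = 0.1 / lam_max

w = np.zeros(M)
for n in range(M - 1, N):
    xv = x[n - M + 1:n + 1][::-1]            # x(n) = [x(n), ..., x(n-M+1)]^T
    e = d[n] - w @ xv                        # e(n) = d(n) - w^T(n) x(n), Eq. (51)
    w = w + 2 * mu * e * xv                  # Eq. (48)
```

Because the input here is white, the eigenvalue spread of Eq. (47) is close to unity, which is the most favorable case for LMS convergence; a colored input would slow the final convergence as described above.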

[Figure 13 omitted: it shows the vectors x(n), v_p(n−1), and v_p(n).]
Figure 13. Geometric interpretation of the NLMS algorithm.

where v_o(n) and v_p(n) are the orthogonal component and the parallel component of v(n) with respect to the input vector. This implies

v_p(n) = Cx(n)    (54)

where C is a constant. Then, substituting Eq. (53) and Eq. (54) in Eq. (52), we get

e(n) = [v_o(n) + v_p(n)]^T x(n)    (55)

e(n) = [v_o(n) + Cx(n)]^T x(n)    (56)

Because v_o(n) is orthogonal to x(n), the scalar product is

v_o^T x(n) = 0    (57)

Then solving for C from Eqs. (56) and (57) yields

C = e(n) / [x^T(n)x(n)]    (58)

and

v_p(n) = e(n)x(n) / [x^T(n)x(n)]    (59)

The target now is to make v(n) as orthogonal as possible to x(n) in each iteration, as shown in Fig. 13. This can be done by setting

v(n+1) = v(n) − αv_p(n)    (60)

Finally, substituting Eq. (49) and Eq. (59), we get

w* − w(n+1) = w* − w(n) − α e(n)x(n) / [x^T(n)x(n)]    (61)

w(n+1) = w(n) + α e(n)x(n) / [x^T(n)x(n)]    (62)

where, in order to reach the target, α must satisfy (9)

0 < α < 2    (63)

In this way

w(n+1) = w(n) + βe(n)x(n)    (64)

where

β = α / [x^T(n)x(n)]    (65)

Therefore, the NLMS algorithm given by Eq. (64) is equivalent to the LMS algorithm if

2µ = α / [x^T(n)x(n)]    (66)

NLMS Algorithm
Parameters:      M = filter order
                 α = step size
Initialization:  Set w(0) = 0
Computation:     For n = 0, 1, 2, ..., compute
                 y(n) = w^T(n)x(n)
                 e(n) = d(n) − y(n)
                 β = α / [x^T(n)x(n)]
                 w(n+1) = w(n) + βe(n)x(n)

Time-Variant LMS Algorithms. In the classical LMS algorithm there is a tradeoff between the accuracy of the final solution and the convergence speed: a small error in the coefficient vector requires a small convergence factor, whereas a high convergence rate requires a large convergence factor. This limits its use in several practical applications. The search for an optimal solution to the problem of obtaining both a high convergence rate and a small error in the final solution has been an arduous one in recent years. Various algorithms have been reported in which time-variable convergence coefficients are used, chosen so as to meet both requirements: high convergence rate and low MSE. Interested readers may refer to Refs. 9–14.

Recursive Least-Squares Algorithm

The recursive least-squares (RLS) algorithm is required for rapidly tracking adaptive filters when neither the reference-signal nor the input-signal characteristics can be controlled. An important feature of the RLS algorithm is that it utilizes information contained in the input data extending back to the instant of time when the algorithm was initiated. The resulting convergence is therefore typically an order of magnitude faster than for the ordinary LMS algorithm.

In this algorithm the mean squared value of the error signal is directly minimized by a matrix inversion. Consider the FIR filter output

y(n) = w^T x(n)    (67)

where x(n) is the input vector given by x(n) = [x(n), x(n−1), ..., x(n−M+1)]^T and w is the weight vector. The optimum weight vector is computed in such a way that the mean squared error E[e²(n)] is minimized, where

e(n) = d(n) − y(n) = d(n) − w^T x(n)    (68)

E[e²(n)] = E[{d(n) − w^T x(n)}²]    (69)

To minimize E[e²(n)], we can use the orthogonality principle in the estimation of the minimum. That is, we select the weight vector in such a way that the output error is orthogonal to the input vector. Then from Eqs. (67) and (68), we obtain

E[x(n){d(n) − x^T(n)w}] = 0    (70)

Then

E[x(n)x^T(n)w] = E[d(n)x(n)]    (71)

Assuming that the weight vector is not correlated with the input vector, we obtain

E[x(n)x^T(n)]w = E[d(n)x(n)]    (72)

which can be rewritten as

Rw = p    (73)

where R and p are the autocorrelation matrix of the input signal and the correlation vector between the reference signal d(n) and the input signal x(n), respectively. Next, assuming ergodicity, p can be estimated in real time as

p(n) = Σ_{k=0}^{n} λ^{n−k} d(k)x(k)    (74)

p(n) = Σ_{k=0}^{n−1} λ^{n−k} d(k)x(k) + d(n)x(n) = λ Σ_{k=0}^{n−1} λ^{n−k−1} d(k)x(k) + d(n)x(n)    (75)

p(n) = λp(n−1) + d(n)x(n)    (76)

where λ is the forgetting factor. In a similar way, we can obtain

R(n) = λR(n−1) + x(n)x^T(n)    (77)

Then, multiplying Eq. (73) by R^{-1} and substituting Eq. (76) and Eq. (77), we get

w = [λR(n−1) + x(n)x^T(n)]^{-1} [λp(n−1) + d(n)x(n)]    (78)

Next, applying the matrix inversion lemma

(A + BCD)^{-1} = A^{-1} − A^{-1}B(DA^{-1}B + C^{-1})^{-1}DA^{-1}    (79)

with A = λR(n−1), B = x(n), C = 1, and D = x^T(n), we obtain

w(n) = [(1/λ)R^{-1}(n−1) − (1/λ)R^{-1}(n−1)x(n) {(1/λ)x^T(n)R^{-1}(n−1)x(n) + 1}^{-1} (1/λ)x^T(n)R^{-1}(n−1)] [λp(n−1) + d(n)x(n)]    (80)

w(n) = (1/λ)[R^{-1}(n−1) − R^{-1}(n−1)x(n)x^T(n)R^{-1}(n−1) / {λ + x^T(n)R^{-1}(n−1)x(n)}] [λp(n−1) + d(n)x(n)]    (81)

Next, for convenience of computation, let

Q(n) = R^{-1}(n)    (82)

and

K(n) = R^{-1}(n−1)x(n) / [λ + x^T(n)R^{-1}(n−1)x(n)]    (83)

Then from Eq. (81) we have

w(n) = (1/λ)[Q(n−1) − K(n)x^T(n)Q(n−1)] [λp(n−1) + d(n)x(n)]    (84)

w(n) = Q(n−1)p(n−1) + (1/λ)d(n)Q(n−1)x(n) − K(n)x^T(n)Q(n−1)p(n−1) − (1/λ)d(n)K(n)x^T(n)Q(n−1)x(n)    (85)

w(n) = w(n−1) + (1/λ)d(n)Q(n−1)x(n) − Q(n−1)x(n)x^T(n)w(n−1) / [λ + x^T(n)Q(n−1)x(n)] − (1/λ) d(n)Q(n−1)x(n)x^T(n)Q(n−1)x(n) / [λ + x^T(n)Q(n−1)x(n)]    (86)

w(n) = w(n−1) + (1/λ) {Q(n−1)x(n) / [λ + x^T(n)Q(n−1)x(n)]} × [λd(n) + d(n)x^T(n)Q(n−1)x(n) − λx^T(n)w(n−1) − d(n)x^T(n)Q(n−1)x(n)]    (87)

w(n) = w(n−1) + (1/λ) {Q(n−1)x(n) / [λ + x^T(n)Q(n−1)x(n)]} × λ[d(n) − x^T(n)w(n−1)]    (88)

Finally, we have

w(n) = w(n−1) + K(n)ε(n)    (89)

where

K(n) = Q(n−1)x(n) / [λ + x^T(n)Q(n−1)x(n)]    (90)

and ε(n) is the a priori estimation error, based on the old least-squares estimate of the weight vector made at time n−1, and defined by

ε(n) = d(n) − w^T(n−1)x(n)    (91)

Then Eq. (89) can be written as

w(n) = w(n−1) + Q(n)ε(n)x(n)    (92)

where Q(n) is given by

Q(n) = (1/λ)[Q(n−1) − Q(n−1)x(n)x^T(n)Q(n−1) / {λ + x^T(n)Q(n−1)x(n)}]    (93)

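As a concrete illustration (not part of the original text), the recursion of Eqs. (89)–(93) can be sketched as follows. The unknown system h_true, the forgetting factor, and the Q(0) = δI initialization (with δ chosen large so that the estimated correlation matrix is effectively nonsingular from the start) are assumptions for this sketch:

```python
import numpy as np

# Identify an unknown FIR system with RLS, Eqs. (89)-(93).
# h_true, M, N, lam, and delta are assumptions for this sketch.
rng = np.random.default_rng(1)
M = 4
h_true = np.array([1.0, 0.5, -0.3, 0.1])     # unknown system w*
N = 500
x = rng.standard_normal(N)
d = np.convolve(x, h_true)[:N]               # reference signal d(n)

lam = 0.99                                   # forgetting factor lambda
delta = 1e3
Q = delta * np.eye(M)                        # Q(0): large, keeps R(n) effectively nonsingular
w = np.zeros(M)                              # w(0) = 0

for n in range(M - 1, N):
    xv = x[n - M + 1:n + 1][::-1]            # x(n) = [x(n), ..., x(n-M+1)]^T
    K = Q @ xv / (lam + xv @ Q @ xv)         # gain K(n), Eq. (90)
    eps = d[n] - w @ xv                      # a priori error, Eq. (91)
    w = w + K * eps                          # Eq. (89)
    Q = (Q - np.outer(K, xv) @ Q) / lam      # Eq. (93)
```

In this noiseless setting the estimate typically reaches the exact solution within a few hundred samples, illustrating the order-of-magnitude convergence advantage over the ordinary LMS algorithm noted above.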
Applying the RLS algorithm requires initializing the recursion for Q(n) with a starting value Q(0) chosen to ensure the nonsingularity of the correlation matrix R(n) (3).

RLS Algorithm
Initialization:  Set Q(0)
                 w(0) = 0
Computation:     For n = 1, 2, ..., compute
                 K(n) = Q(n−1)x(n) / [λ + x^T(n)Q(n−1)x(n)]
                 ε(n) = d(n) − w^T(n−1)x(n)
                 w(n) = w(n−1) + K(n)ε(n)
                 Q(n) = (1/λ)[Q(n−1) − K(n)x^T(n)Q(n−1)]

IMPLEMENTATIONS OF ADAPTIVE FILTERS

In the last few years many adaptive filter architectures have been proposed that improve the convergence rate without significantly increasing the computational cost. Digital implementations of adaptive filters are the most widely used. They yield good performance in terms of adaptivity, but consume considerable area and power. Several implementations achieve power reduction by dynamically minimizing the order of the digital filter (15) or by employing parallelism and pipelining (16). On the other hand, high-speed and low-power applications require both parallelism and reduced complexity (17).

It is well known that analog filters offer advantages of small area, low power, and higher-frequency operation over their digital counterparts, because analog signal-processing operations are normally much more efficient than digital ones. Moreover, since continuous-time adaptive filters do not need analog-to-digital conversion, it is possible to prevent quantization-related problems. Gradient-based algorithms are the most commonly used for analog adaptive learning circuits because of their simplicity of implementation; in particular, the LMS algorithm is often used to implement adaptive circuits. The basic elements used for implementing the LMS algorithm are delay elements (which are implemented with all-pass first-order sections), multipliers (based on a square law), and integrators. The techniques utilized to implement these circuits are discrete-time approaches, as discussed in Refs. 18 to 21, and continuous-time implementations (22,23,24).

Several proposed techniques involve the implementation of the RLS algorithm, which is known to have very low sensitivity to additive noise. However, a direct analog implementation of the RLS algorithm would require considerable effort. To overcome this problem, several techniques have been proposed, such as structures based on Hopfield neural networks (23,25,26,27).

BIBLIOGRAPHY

1. S. U. H. Qureshi, Adaptive equalization, Proc. IEEE, 73: 1349–1387, 1985.
2. J. Makhoul, Linear prediction: A tutorial review, Proc. IEEE, 63: 561–580, 1975.
3. S. Haykin, Adaptive Filter Theory, 3rd ed., Upper Saddle River, NJ: Prentice-Hall, 1996.
4. B. Friedlander, Lattice filters for adaptive processing, Proc. IEEE, 70: 829–867, 1982.
5. J. J. Shynk, Adaptive IIR filtering, IEEE ASSP Mag., 6 (2): 4–21, 1989.
6. P. Hughes, S. F. A. Ip, and J. Cook, Adaptive filters: a review of techniques, BT Technol. J., 10 (1): 28–48, 1992.
7. B. Widrow and S. D. Stearns, Adaptive Signal Processing, Englewood Cliffs, NJ: Prentice-Hall, 1985.
8. B. Widrow and M. E. Hoff, Jr., Adaptive switching circuits, IRE WESCON Conv. Rec., part 4, 1960, pp. 96–104.
9. J. Nagumo and A. Noda, A learning method for system identification, IEEE Trans. Autom. Control, AC-12: 282–287, 1967.
10. R. H. Kwong and E. W. Johnston, A variable step size LMS algorithm, IEEE Trans. Signal Process., 40: 1633–1642, 1992.
11. I. Nakanishi and Y. Fukui, A new adaptive convergence factor with constant damping parameter, IEICE Trans. Fundam. Electron. Commun. Comput. Sci., E78-A (6): 649–655, 1995.
12. T. Aboulnasr and K. Mayyas, A robust variable step size LMS-type algorithm: Analysis and simulations, IEEE Trans. Signal Process., 45: 631–639, 1997.
13. F. Casco et al., A variable step size (VSS-CC) NLMS algorithm, IEICE Trans. Fundam., E78-A (8): 1004–1009, 1995.
14. M. Nakano et al., A time varying step size normalized LMS algorithm for adaptive echo canceler structures, IEICE Trans. Fundam., E78-A (2): 254–258, 1995.
15. J. T. Ludwig, S. H. Nawab, and A. P. Chandrakasan, Low-power digital filtering using approximate processing, IEEE J. Solid State Circuits, 31: 395–400, 1996.
16. C. S. H. Wong et al., A 50 MHz eight-tap adaptive equalizer for partial-response channels, IEEE J. Solid State Circuits, 30: 228–234, 1995.
17. R. A. Hawley et al., Design techniques for silicon compiler implementations of high-speed FIR digital filters, IEEE J. Solid State Circuits, 31: 656–667, 1996.
18. M. H. White et al., Charge-coupled device (CCD) adaptive discrete analog signal processing, IEEE J. Solid State Circuits, 14:
19. T. Enomoto et al., Monolithic analog adaptive equalizer integrated circuit for wide-band digital communications networks, IEEE J. Solid State Circuits, 17: 1045–1054, 1982.
20. F. J. Kub and E. W. Justh, Analog CMOS implementation of high frequency least-mean square error learning circuit, IEEE J. Solid State Circuits, 30: 1391–1398, 1995.
21. Y. L. Cheung and A. Buchwald, A sampled-data switched-current analog 16-tap FIR filter with digitally programmable coefficients in 0.8 µm CMOS, Int. Solid-State Circuits Conf., February 1997.
22. J. Ramírez-Angulo and A. Díaz-Sánchez, Low voltage programmable FIR filters using voltage follower and analog multipliers, Proc. IEEE Int. Symp. Circuits Syst., Chicago, May 1993.
23. G. Espinosa F.-V. et al., Ecualizador adaptivo BiCMOS de tiempo continuo, utilizando una red neuronal de Hopfield [Continuous-time BiCMOS adaptive equalizer using a Hopfield neural network], CONIELECOMP'97, UDLA, Puebla, Mexico, 1997.
24. L. Ortíz-Balbuena et al., A continuous time adaptive filter structure, IEEE Int. Conf. Acoust., Speech Signal Process., Detroit, 1995, pp. 1061–1064.
25. M. Nakano et al., A continuous time equalizer structure using Hopfield neural networks, Proc. IASTED Int. Conf. Signal Image Process., Orlando, FL, November 1996, pp. 168–172.
26. G. Espinosa F.-V., A. Díaz-Méndez, and F. Maloberti, A 3.3 V CMOS equalizer using Hopfield neural network, 4th IEEE Int. Conf. Electron., Circuits, Syst., ICECS97, Cairo, 1997.
27. M. Nakano-Miyatake and H. Perez-Meana, Analog adaptive filtering based on a modified Hopfield network, IEICE Trans. Fundam., E80-A: 2245–2252, 1997.

M. L. Honig and D. G. Messerschmitt, Adaptive Filters: Structures, Algorithms, and Applications, Norwell, MA: Kluwer, 1988.
B. Mulgrew and C. F. N. Cowan, Adaptive Filters and Equalisers, Norwell, MA: Kluwer, 1988.
S. Proakis et al., Advanced Signal Processing, Singapore: Macmillan.

GUILLERMO ESPINOSA FLORES
JOSÉ ALEJANDRO DÍAZ MÉNDEZ
National Institute for Research in Astrophysics, Optics and Electronics
