# 8. Linear least-squares problems

EE103 (Spring 2004-05)

8. Linear least-squares problems

• overdetermined sets of linear equations

• least-squares solution

• examples and applications

8–1

Overdetermined sets of linear equations

m equations in n variables

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned}
\]

in matrix form: Ax = b with $A \in \mathbf{R}^{m \times n}$, $b \in \mathbf{R}^m$

• A is skinny (m > n); more equations than unknowns

• for most b, cannot solve for x

Linear least-squares problems                                                           8–2
Least-squares solution

one approach to approximately solve Ax = b:

\[
\text{minimize} \quad \|Ax - b\|
\]

• r = Ax − b is called the residual or error
• x with smallest residual (smallest value of $\|r\| = \|Ax - b\|$) is called the least-squares solution
• in Matlab: x = A\b

equivalent formulation:

\[
\text{minimize} \quad \|Ax - b\|^2 = \sum_{i=1}^{m} (a_i^T x - b_i)^2
\]

where $a_i^T$ is the $i$th row of $A$

Linear least-squares problems                                                    8–3

example: three equations in two variables x1, x2

2x1 = 1,      −x1 + x2 = 0,          2x2 = −1

least-squares solution:

\[
\text{minimize} \quad (2x_1 - 1)^2 + (-x_1 + x_2)^2 + (2x_2 + 1)^2
\]

to find optimal x1, x2, set derivatives w.r.t. x1 and x2 equal to zero:

\[
\begin{aligned}
10x_1 - 2x_2 - 4 &= 0 \\
-2x_1 + 10x_2 + 4 &= 0
\end{aligned}
\]

solution x1 = 1/3, x2 = −1/3

(much more on solving LS problems later)

Linear least-squares problems                                                    8–4
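
As a quick check, the same answer drops out of the Matlab backslash operator from page 8–3; a minimal sketch:

```
% the 3x2 example: rows of A and b read off from the three equations
A = [2 0; -1 1; 0 2];
b = [1; 0; -1];
x = A \ b              % least-squares solution: [1/3; -1/3]
```
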
[four surface plots over $(x_1, x_2) \in [-2, 2]^2$: $r_1^2 = (2x_1 - 1)^2$, $r_2^2 = (-x_1 + x_2)^2$, $r_3^2 = (2x_2 + 1)^2$, and their sum $r_1^2 + r_2^2 + r_3^2$]

Linear least-squares problems                                                                         8–5

Least-squares data fitting

fit a function

\[
g(t) = x_1 g_1(t) + x_2 g_2(t) + \cdots + x_n g_n(t)
\]

to data (t1, y1), . . . , (tm, ym), i.e., we would like to have

g(t1) = y1,                g(t2) = y2,         ...,           g(tm) = ym

• $g_i(t): \mathbf{R} \to \mathbf{R}$ are given functions (basis functions)
• problem variables: the coefficients x1, x2, . . . , xn
• usually m ≫ n, hence no exact solution
• applications:
– extrapolation, smoothing of data
– developing simple, approximate model of observed data

Linear least-squares problems                                                                         8–6
least-squares fit: minimize the function

\[
\sum_{i=1}^{m} (g(t_i) - y_i)^2 = \sum_{i=1}^{m} (x_1 g_1(t_i) + x_2 g_2(t_i) + \cdots + x_n g_n(t_i) - y_i)^2
\]

in matrix notation: minimize $\|Ax - b\|^2$ where

\[
A = \begin{bmatrix}
g_1(t_1) & g_2(t_1) & g_3(t_1) & \cdots & g_n(t_1) \\
g_1(t_2) & g_2(t_2) & g_3(t_2) & \cdots & g_n(t_2) \\
\vdots & \vdots & \vdots & & \vdots \\
g_1(t_m) & g_2(t_m) & g_3(t_m) & \cdots & g_n(t_m)
\end{bmatrix},
\qquad
b = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix}
\]

Linear least-squares problems                                                                  8–7
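
A minimal Matlab sketch of this construction; the basis functions and data below are illustrative assumptions, not from the slides:

```
% build A column by column from basis-function handles, then solve
g = {@(t) ones(size(t)), @(t) t, @(t) sin(t)};   % assumed example basis
t = linspace(0, 10, 100)';                       % assumed sample points ti
y = 2 + 0.5*t + 3*sin(t) + 0.1*randn(size(t));   % synthetic noisy data yi
A = zeros(numel(t), numel(g));
for k = 1:numel(g)
    A(:,k) = g{k}(t);                            % column k holds gk(ti)
end
x = A \ y;                                       % least-squares coefficients
```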

Example: data fitting with polynomials

fit a polynomial

\[
g(t) = x_1 + x_2 t + x_3 t^2 + \cdots + x_n t^{n-1}
\]

to data (t1, y1), . . . , (tm, ym) (m ≥ n), i.e., we would like to have

g(t1) = y1,              g(t2) = y2,         ...,   g(tm) = ym

a set of m equations in n variables

\[
\begin{bmatrix}
1 & t_1 & t_1^2 & \cdots & t_1^{n-1} \\
1 & t_2 & t_2^2 & \cdots & t_2^{n-1} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & t_m & t_m^2 & \cdots & t_m^{n-1}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
=
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix}
\]

Linear least-squares problems                                                                  8–8
polynomial interpolation: m = n
if m = n, we can satisfy the equations g(ti) = yi exactly by solving a set
of n linear equations in n variables (see page 3–3)
example: fit a polynomial to f(t) = 1/(1 + 25t²) on [−1, 1]

[plots: the interpolating polynomial for n = 5 (left) and n = 15 (right) on [−1, 1]]

(dashed line: f; solid line: polynomial g; circles: the points (ti, yi))

increasing n does not improve the overall quality of the fit

Linear least-squares problems                                                                           8–9
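
A minimal Matlab sketch of the interpolation case (m = n), assuming equally spaced interpolation points:

```
% interpolate f(t) = 1/(1+25t^2) with a degree n-1 polynomial
n = 5;
t = linspace(-1, 1, n)';          % m = n points (assumed equally spaced)
y = 1 ./ (1 + 25*t.^2);
A = ones(n, n);
for k = 2:n
    A(:,k) = A(:,k-1) .* t;       % columns 1, t, t^2, ..., t^(n-1)
end
x = A \ y;                        % square system: interpolates exactly
```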

polynomial approximation: m > n

if m > n, we have more equations than variables, and (in general) it is not
possible to satisfy the conditions g(ti) = yi exactly

least-squares solution:
\[
\text{minimize} \quad \sum_{i=1}^{m} (g(t_i) - y_i)^2 = \sum_{i=1}^{m} (x_1 + x_2 t_i + x_3 t_i^2 + \cdots + x_n t_i^{n-1} - y_i)^2
\]

in matrix notation: minimize $\|Ax - b\|^2$ where

\[
A = \begin{bmatrix}
1 & t_1 & t_1^2 & \cdots & t_1^{n-1} \\
1 & t_2 & t_2^2 & \cdots & t_2^{n-1} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & t_m & t_m^2 & \cdots & t_m^{n-1}
\end{bmatrix},
\qquad
x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix},
\qquad
b = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix}
\]

Linear least-squares problems                                                                        8–10
example: fit a polynomial to f(t) = 1/(1 + 25t²) on [−1, 1]

m = 50; ti: m equally spaced points in [−1, 1]

[plots: the least-squares polynomial fit for n = 5 (left) and n = 15 (right) on [−1, 1]]

(dashed line: f; solid line: polynomial g; circles: the points (ti, yi))

much better fit overall

Linear least-squares problems                                                       8–11
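
The fit on this slide can be reproduced with a minimal Matlab sketch:

```
% least-squares polynomial fit to f(t) = 1/(1+25t^2), m > n
m = 50; n = 15;                   % n = 15 panel; set n = 5 for the other
t = linspace(-1, 1, m)';          % m equally spaced points in [-1, 1]
y = 1 ./ (1 + 25*t.^2);
A = ones(m, n);
for k = 2:n
    A(:,k) = A(:,k-1) .* t;       % columns 1, t, ..., t^(n-1)
end
x = A \ y;                        % least-squares coefficients
```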

Least-squares estimation

y = Ax + w

• x is what we want to estimate or reconstruct
• y is our measurement(s)
• w is an unknown noise or measurement error (assumed small)
• ith row of A characterizes ith sensor or ith measurement

least-squares estimation: choose as estimate the vector $\hat{x}$ that minimizes

\[
\|A\hat{x} - y\|
\]

i.e., minimize the deviation between what we actually observed ($y$), and what we would observe if $x = \hat{x}$ and there were no noise ($w = 0$)

Linear least-squares problems                                                       8–12
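
A minimal Matlab sketch of the idea on synthetic data; the problem sizes, measurement matrix, and noise level below are assumptions for illustration:

```
% least-squares estimate of x from noisy measurements y = A*x + w
m = 30; n = 5;                     % assumed sizes: 30 measurements, 5 unknowns
A = randn(m, n);                   % assumed measurement matrix
xtrue = randn(n, 1);               % the vector we want to reconstruct
y = A*xtrue + 0.01*randn(m, 1);    % measurements corrupted by small noise w
xhat = A \ y;                      % least-squares estimate; close to xtrue
```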

determine position (u, v) in a plane by measuring distances to four beacons at known positions (pi, qi)

[diagram: beacons (p1, q1), (p2, q2), (p3, q3), (p4, q4) surrounding the unknown position (u, v); the vectors −a1, . . . , −a4 point from the origin toward the beacons]

we assume that the beacons are far from the unknown position (u, v), so
linearization around (u0, v0) = 0 (say) is nearly exact

Linear least-squares problems                                                           8–13

linearized equations (page 3–20):

\[
a_{i1} u + a_{i2} v + w_i = \rho_i - \sqrt{p_i^2 + q_i^2}, \qquad i = 1, 2, 3, 4
\]

• −(ai1, ai2): unit vector from 0 to beacon i
• ρi: measured distance to beacon i
• wi: measurement error in ρi plus small error due to linearization

problem: estimate u, v, given ρ, (p1, q1), (p2, q2), (p3, q3), (p4, q4)

example
• beacon positions (p1, q1) = (10, 0), (p2, q2) = (−10, 2),
(p3, q3) = (3, 9), (p4, q4) = (10, 10)
• actual position is (2, 2)
• measured distances ρ = (8.22, 11.9, 7.08, 11.33)

Linear least-squares problems                                                           8–14
approximate solutions:

• ‘just enough measurements method’: two measurements suffice to find (u, v) (when the error w = 0)

e.g., can compute $\hat{u}$, $\hat{v}$ from

\[
\begin{bmatrix} \rho_1 - \sqrt{p_1^2 + q_1^2} \\ \rho_2 - \sqrt{p_2^2 + q_2^2} \end{bmatrix}
=
\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
\begin{bmatrix} \hat{u} \\ \hat{v} \end{bmatrix}
\]

example (of previous page):

\[
\begin{bmatrix} -1.78 \\ 1.72 \end{bmatrix}
=
\begin{bmatrix} -1.00 & 0.00 \\ 0.98 & -0.20 \end{bmatrix}
\begin{bmatrix} \hat{u} \\ \hat{v} \end{bmatrix}
\]

solution (via Matlab): $(\hat{u}, \hat{v}) = (1.78, 0.11)$ (norm of error: 1.90)

Linear least-squares problems                                                                 8–15

• least-squares method: compute $\hat{u}$, $\hat{v}$ by minimizing

\[
\sum_{i=1}^{4} \Bigl( a_{i1}\hat{u} + a_{i2}\hat{v} - \rho_i + \sqrt{p_i^2 + q_i^2} \Bigr)^2
\]

example (of page 8–14):

\[
A = \begin{bmatrix}
-1.00 & 0.00 \\
0.98 & -0.20 \\
-0.32 & -0.95 \\
-0.71 & -0.71
\end{bmatrix},
\qquad
\begin{bmatrix}
\rho_1 - \sqrt{p_1^2 + q_1^2} \\
\rho_2 - \sqrt{p_2^2 + q_2^2} \\
\rho_3 - \sqrt{p_3^2 + q_3^2} \\
\rho_4 - \sqrt{p_4^2 + q_4^2}
\end{bmatrix}
=
\begin{bmatrix} -1.78 \\ 1.72 \\ -2.41 \\ -2.81 \end{bmatrix}
\]

solution: $(\hat{u}, \hat{v}) = (1.97, 1.90)$ (norm of error: 0.10)

Linear least-squares problems                                                                 8–16
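
Both estimates can be reproduced with a minimal Matlab sketch from the data on page 8–14:

```
% beacon data from page 8-14
P   = [10 0; -10 2; 3 9; 10 10];    % beacon positions (pi, qi)
rho = [8.22; 11.9; 7.08; 11.33];    % measured distances
d   = sqrt(sum(P.^2, 2));           % distances from the origin to the beacons
A   = -P ./ [d d];                  % rows (ai1, ai2); -(ai1, ai2) points at beacon i
b   = rho - d;                      % entries rho_i - sqrt(pi^2 + qi^2)

uv2  = A(1:2,:) \ b(1:2);           % 'just enough measurements': approx (1.78, 0.11)
uvls = A \ b;                       % least-squares estimate: approx (1.97, 1.90)
```
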
Least-squares system identification

measure input u(t) and output y(t) for t = 0, . . . , N of an unknown system

[block diagram: u(t) → unknown system → y(t)]

example (N = 70):

[plots: measured input u(t) and output y(t) for t = 0, . . . , 70]
system identification problem: find a reasonable model for the system based on measured I/O data u, y

Linear least-squares problems                                                             8–17

a simple and widely used model:

\[
\hat{y}(t) = h_0 u(t) + h_1 u(t-1) + h_2 u(t-2) + \cdots + h_n u(t-n)
\]

where $\hat{y}(t)$ is the predicted output (or model output)

• called a moving average (MA) model with n delays
• predicted output is a linear combination of current and n previous inputs
• h0, . . . , hn are parameters of the model

least-squares identification: choose the model (i.e., h0, . . . , hn) that minimizes the prediction error

\[
E = \left( \sum_{t=n}^{N} (\hat{y}(t) - y(t))^2 \right)^{1/2}
\]

Linear least-squares problems                                                             8–18
formulation as a linear least-squares problem:

\[
E = \left( \sum_{t=n}^{N} \bigl(h_0 u(t) + h_1 u(t-1) + \cdots + h_n u(t-n) - y(t)\bigr)^2 \right)^{1/2} = \|Ax - b\|
\]

where

\[
A = \begin{bmatrix}
u(n) & u(n-1) & u(n-2) & \cdots & u(0) \\
u(n+1) & u(n) & u(n-1) & \cdots & u(1) \\
u(n+2) & u(n+1) & u(n) & \cdots & u(2) \\
\vdots & \vdots & \vdots & & \vdots \\
u(N) & u(N-1) & u(N-2) & \cdots & u(N-n)
\end{bmatrix},
\qquad
x = \begin{bmatrix} h_0 \\ h_1 \\ h_2 \\ \vdots \\ h_n \end{bmatrix},
\qquad
b = \begin{bmatrix} y(n) \\ y(n+1) \\ y(n+2) \\ \vdots \\ y(N) \end{bmatrix}
\]

Linear least-squares problems                                                                      8–19
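
A minimal Matlab sketch of this construction on synthetic I/O data; the example system and data below are assumptions for illustration:

```
% synthetic data: N+1 input samples, output from an assumed FIR system
N = 70; n = 7;
u = randn(N+1, 1);                        % u(k+1) holds the sample u(k)
y = filter([0.1 0.3 0.4 0.2], 1, u);      % assumed example system

% row for time t is u(t), u(t-1), ..., u(t-n); toeplitz builds all rows
A = toeplitz(u(n+1:N+1), u(n+1:-1:1));
b = y(n+1:N+1);                           % y(n), ..., y(N)
h = A \ b;                                % least-squares parameters h0..hn
E = norm(A*h - b);                        % prediction error (tiny here)
```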

example (I/O data of page 8–17) with n = 7: least-squares solution is

h0 = 0.0240,                    h1 = 0.2819,            h2 = 0.4176,         h3 = 0.3536,
h4 = 0.2425,                    h5 = 0.4873,            h6 = 0.2084,         h7 = 0.4412

[plot for t = 0, . . . , 70: solid line, the actual output y(t); dashed line, the model prediction ŷ(t)]

Linear least-squares problems                                                                      8–20
model order selection: how large should n be?

obviously the larger n, the smaller the prediction error on the data used to
form the model
[plot: relative prediction error E/‖y‖ versus model order n, on the data used to form the model; the error decreases as n grows]
• suggests using largest possible n for smallest prediction error
• a much more important question is: how good is the model at
predicting new data (i.e., not used to calculate the model)?

Linear least-squares problems                                                                                            8–21
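
A Matlab sketch of the order sweep on synthetic data (the system, noise level, and data sets below are illustrative assumptions); it reproduces the qualitative behavior discussed on the next page:

```
% prediction error vs. model order n, on modeling and validation data
N = 70;
u    = randn(N+1, 1);                     % modeling input
ubar = randn(N+1, 1);                     % validation input (same system)
sys  = [0.1 0.3 0.4 0.2];                 % assumed example system
y    = filter(sys, 1, u)    + 0.1*randn(N+1, 1);
ybar = filter(sys, 1, ubar) + 0.1*randn(N+1, 1);
nmax = 35;
e_fit = zeros(nmax, 1); e_val = zeros(nmax, 1);
for n = 1:nmax
    A    = toeplitz(u(n+1:N+1), u(n+1:-1:1));
    h    = A \ y(n+1:N+1);                % fit the model on u, y only
    Abar = toeplitz(ubar(n+1:N+1), ubar(n+1:-1:1));
    e_fit(n) = norm(A*h    - y(n+1:N+1))    / norm(y(n+1:N+1));
    e_val(n) = norm(Abar*h - ybar(n+1:N+1)) / norm(ybar(n+1:N+1));
end
plot(1:nmax, e_fit, 1:nmax, e_val)        % e_fit falls; e_val eventually rises
```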

model validation: test model on a new data set (from the same system)
[plots: validation input ū(t) and output ȳ(t) for t = 0, . . . , 70]
[plot: relative prediction error versus n, for the modeling data and for the validation data]

• for n too large, the predictive ability of the model becomes worse!
• the plot suggests n = 10 is a good choice

Linear least-squares problems                                                                                            8–22
for n = 50 the actual and predicted outputs on system identification and model validation data are:

[two plots for t = 0, . . . , 70: left, the model identification I/O set (solid: y(t), dashed: predicted y(t)); right, the model validation I/O set (solid: ȳ(t), dashed: predicted ȳ(t))]

loss of predictive ability when n is too large is called model overfit or overmodeling

Linear least-squares problems                                                               8–23
