
# Basis Expansion and Regularization

```									  Basis Expansion and
Regularization

Prof. Liqing Zhang
Dept. Computer Science & Engineering,
Shanghai Jiaotong University
Outline
• Piece-wise Polynomials and Splines
• Wavelet Smoothing
• Smoothing Splines
• Automatic Selection of the Smoothing Parameters
• Nonparametric Logistic Regression
• Multidimensional Splines
• Regularization and Reproducing Kernel Hilbert
Spaces

2012/11/6       Basis Expansion and Regularization   2
Piece-wise Polynomials and Splines
• Linear basis expansion: f(x) = \sum_{m=1}^{N} \beta_m h_m(x)

• Some basis functions that are widely used:
  h_m(x) = x
  h_m(x) = x^p
  h_m(x) = \log(x)
  h_m(x) = \sin(mx), \cos(mx)

Regularization
• Three approaches for controlling the complexity of the model:
  – Restriction: f(x) = \sum_{k=1}^{N} \beta_k h_k(x)
  – Selection: y = \sum_{k=1}^{m} \beta_k h_k(x) + \epsilon
  – Regularization:
      \min_\beta \sum_{i=1}^{M} \epsilon(i)^2 = \min_\beta \sum_{i=1}^{M} ( y_i - \sum_{k=1}^{m} \beta_k h_k(x_i) )^2
      \min_\beta \sum_{i=1}^{M} ( y_i - \sum_{k=1}^{m} \beta_k h_k(x_i) )^2 + \lambda J(\beta)
      J(\beta) = \|\beta\|^2 ?

Piecewise Polynomials and Splines
• Piecewise constant basis:
  h_1(X) = I(X < \xi_1), h_2(X) = I(\xi_1 \le X < \xi_2), h_3(X) = I(\xi_2 \le X)
• Piecewise linear basis: the three functions above plus
  h_{m+3}(X) = h_m(X) X, m = 1, 2, 3
• Continuous piecewise linear basis:
  h_1(X) = 1, h_2(X) = X, h_3(X) = (X - \xi_1)_+, h_4(X) = (X - \xi_2)_+

Piecewise Cubic Polynomials
• Increasing orders of continuity at the knots.
• A cubic spline with knots at \xi_1 and \xi_2 has the truncated power basis:
  1, X, X^2, X^3, (X - \xi_1)^3_+, (X - \xi_2)^3_+

Piecewise Cubic Polynomials
• An order-M spline with knots \xi_j, j = 1, …, K, is a piecewise polynomial of order M that has continuous derivatives up to order M-2.
• A cubic spline has M = 4.
• Truncated power basis set:
  h_j(X) = X^{j-1}, j = 1, …, M
  h_{M+l}(X) = (X - \xi_l)^{M-1}_+, l = 1, …, K

Natural cubic spline
• A natural cubic spline adds natural boundary constraints, namely that the function is linear beyond the boundary knots.
[Figure: natural boundary constraints with knots \xi_1, \xi_2, \xi_3; the fit is linear beyond the boundary knots and cubic between them.]

B-spline
• The augmented knot sequence \tau:
  \tau_1 \le \tau_2 \le … \le \tau_M \le \xi_0;
  \tau_{j+M} = \xi_j, j = 1, …, K;
  \xi_{K+1} \le \tau_{K+M+1} \le \tau_{K+M+2} \le … \le \tau_{K+2M}
• B_{i,m}(x) is the i-th B-spline basis function of order m for the knot sequence \tau, m \le M:
  B_{i,1}(x) = 1 if \tau_i \le x < \tau_{i+1}, and 0 otherwise
  B_{i,m}(x) = \frac{x - \tau_i}{\tau_{i+m-1} - \tau_i} B_{i,m-1}(x) + \frac{\tau_{i+m} - x}{\tau_{i+m} - \tau_{i+1}} B_{i+1,m-1}(x)
  for i = 1, …, K + 2M - m

B-spline
• The sequence of B-splines up to order 4, with ten knots evenly spaced from 0 to 1.
• The B-splines have local support; they are nonzero on an interval spanned by M+1 knots.

Smoothing Splines
• Based on the spline basis method:
  f(x) = \sum_{k=1}^{N} \beta_k h_k(x)
• So y = \sum_{k=1}^{m} \beta_k h_k(x) + \epsilon, where \epsilon is the noise.
• Minimize the penalized residual sum of squares:
  RSS(f, \lambda) = \sum_{i=1}^{N} \{ y_i - f(x_i) \}^2 + \lambda \int \{ f''(t) \}^2 dt
  – \lambda is a fixed smoothing parameter
  – \lambda = 0: f can be any function that interpolates the data
  – \lambda = \infty: the simple least squares line fit

Smoothing Splines
• The solution is a natural spline: f(x) = \sum_{j=1}^{N} N_j(x) \theta_j
• Then the criterion reduces to:
  RSS(\theta, \lambda) = (y - N\theta)^T (y - N\theta) + \lambda \theta^T \Omega_N \theta
  – where \{N\}_{ij} = N_j(x_i) and \{\Omega_N\}_{ij} = \int N_i''(t) N_j''(t) dt
• So the solution:
  \hat\theta = (N^T N + \lambda \Omega_N)^{-1} N^T y
• The fitted smoothing spline:
  \hat f(x) = \sum_{j=1}^{N} N_j(x) \hat\theta_j

Smoothing Splines
• The response is the relative change in bone mineral density measured at the spine, shown as a function of age.
• Separate smoothing splines were fit to the males and females, with \lambda \approx 0.00022 (… degrees of freedom).

Smoothing Matrix
• \hat f: the N-vector of fitted values
  \hat f = N (N^T N + \lambda \Omega_N)^{-1} N^T y = S_\lambda y
• The finite linear operator S_\lambda is the smoother matrix.
• Compare with the linear operator in least squares fitting with M cubic-spline basis functions and knot sequence \xi:
  \hat f = B_\xi (B_\xi^T B_\xi)^{-1} B_\xi^T y = H_\xi y, where B_\xi is an N \times M matrix
• Similarities and differences:
  – Both are symmetric, positive semidefinite matrices
  – H_\xi H_\xi = H_\xi (idempotent); S_\lambda S_\lambda \preceq S_\lambda (shrinking)
  – rank: r(S_\lambda) = N, r(H_\xi) = M

Smoothing Matrix
• Effective degrees of freedom of a smoothing spline:
  df_\lambda = trace(S_\lambda)
• S_\lambda in the Reinsch form: S_\lambda = (I + \lambda K)^{-1}
• Since \hat f = S_\lambda y, \hat f is the solution of \min_f \|y - f\|^2 + \lambda f^T K f
• S_\lambda is symmetric and has a real eigen-decomposition:
  S_\lambda = \sum_{k=1}^{N} \rho_k(\lambda) u_k u_k^T
  \rho_k(\lambda) = \frac{1}{1 + \lambda d_k}
  – d_k is the corresponding eigenvalue of K

• Smoothing spline fit of ozone concentration versus Daggot.
• Smoothing parameters: df = 5 and df = 10.
• The 3rd to 6th eigenvectors of the spline smoothing matrices.

• The smoother matrix for a smoothing spline is nearly banded, indicating an equivalent kernel with local support.

• Example: Y = f(X) + \epsilon, with
  f(X) = \frac{\sin(12(X + 0.2))}{X + 0.2}
• For \hat f = S_\lambda y: Cov(\hat f) = S_\lambda Cov(y) S_\lambda^T = S_\lambda S_\lambda^T (the last step takes Cov(y) = I, i.e. unit noise variance)
• The diagonal contains the pointwise variances at the training x_i
• The bias is given by Bias(\hat f) = f - E(\hat f) = f - S_\lambda f
• f is the (unknown) vector of evaluations of the true f

• df = 5: the bias is high and the standard error band is narrow
• df = 9: the bias is slight and the variance has not increased appreciably
• df = 15: overfitting; the standard error band widens

• The integrated squared prediction error (EPE) combines both bias and variance in a single summary:
  EPE(\hat f_\lambda) = E( Y - \hat f_\lambda(X) )^2
                      = Var(Y) + E[ Bias^2(\hat f_\lambda(X)) + Var(\hat f_\lambda(X)) ]
                      = \sigma^2 + MSE(\hat f_\lambda)
• N-fold (leave-one-out) cross-validation:
  CV(\hat f_\lambda) = \sum_{i=1}^{N} ( y_i - \hat f_\lambda^{(-i)}(x_i) )^2
                     = \sum_{i=1}^{N} ( \frac{y_i - \hat f_\lambda(x_i)}{1 - S_\lambda(i,i)} )^2
• The EPE and CV curves have a similar shape, and overall the CV curve is approximately unbiased as an estimate of the EPE curve.

Logistic Regression
• Logistic regression with a single quantitative input X:
  \log \frac{Pr(Y = 1 | X = x)}{Pr(Y = 0 | X = x)} = f(x)
  Pr(Y = 1 | X = x) = \frac{e^{f(x)}}{1 + e^{f(x)}}
• The penalized log-likelihood criterion:
  l(f; \lambda) = \sum_{i=1}^{N} [ y_i \log p(x_i) + (1 - y_i) \log(1 - p(x_i)) ] - \frac{1}{2} \lambda \int \{ f''(t) \}^2 dt
                = \sum_{i=1}^{N} [ y_i f(x_i) - \log(1 + e^{f(x_i)}) ] - \frac{1}{2} \lambda \int \{ f''(t) \}^2 dt

Multidimensional Splines
• Tensor product basis
  – The M_1 \times M_2 dimensional tensor product basis:
    g_{jk}(X) = h_{1j}(X_1) h_{2k}(X_2), j = 1, …, M_1, k = 1, …, M_2
  – h_{1j}(X_1): basis functions for coordinate X_1
  – h_{2k}(X_2): basis functions for coordinate X_2
    g(X) = \sum_{j=1}^{M_1} \sum_{k=1}^{M_2} \theta_{jk} g_{jk}(X)

Tensor product basis of B-splines, some selected pairs

Multidimensional Splines
• High-dimensional smoothing splines:
  \min_f \sum_{i=1}^{N} \{ y_i - f(x_i) \}^2 + \lambda J[f], x_i \in \mathbb{R}^d
  – J is an appropriate penalty functional, for example
    J[f] = \int\int_{\mathbb{R}^2} [ (\partial^2 f / \partial x_1^2)^2 + 2 (\partial^2 f / \partial x_1 \partial x_2)^2 + (\partial^2 f / \partial x_2^2)^2 ] dx_1 dx_2
  – With this penalty the solution is a smooth two-dimensional surface, a thin-plate spline.
• The solution has the form:
  f(x) = \beta_0 + \beta^T x + \sum_{j=1}^{N} \alpha_j h_j(x)

Multidimensional Splines
• The decision boundary of an additive logistic regression model, using natural splines in each of the two coordinates.
• df = 1 + (4-1) + (4-1) = 7

Multidimensional Splines
• The results of using a tensor product of natural spline bases in each coordinate.
• df = 4 x 4 = 16

Multidimensional Splines
• A thin-plate spline fit to the heart disease data.
• The data points are indicated, as well as the lattice of points used as knots.

Reproducing Kernel Hilbert space
• A regularization problem has the form:
  \min_{f \in H} [ \sum_{i=1}^{N} L(y_i, f(x_i)) + \lambda J(f) ],   J(f) = \int \frac{|\tilde f(s)|^2}{\tilde G(s)} ds
  – L(y, f(x)) is a loss function.
  – J(f) is a penalty functional, and H is a space of functions on which J(f) is defined.
  – \tilde f denotes the Fourier transform of f.
• The solution:
  f(x) = \sum_{k=1}^{K} \alpha_k \phi_k(x) + \sum_{i=1}^{N} \theta_i G(x - x_i)
  – The \phi_k span the null space of the penalty functional J.

Spaces of Functions Generated by Kernel
• An important subclass is generated by a positive definite kernel K(x, y).
• The corresponding space of functions H_K is called a reproducing kernel Hilbert space.
• Suppose that K has an eigen-expansion:
  K(x, y) = \sum_{i=1}^{\infty} \gamma_i \phi_i(x) \phi_i(y), \gamma_i \ge 0, \sum_{i=1}^{\infty} \gamma_i^2 < \infty
• Elements of H_K have an expansion:
  f(x) = \sum_{i=1}^{\infty} c_i \phi_i(x), with \|f\|^2_{H_K} = \sum_{i=1}^{\infty} c_i^2 / \gamma_i < \infty

Spaces of Functions Generated by Kernel
• The regularization problem becomes:
  \min_{f \in H_K} [ \sum_{i=1}^{N} L(y_i, f(x_i)) + \lambda \|f\|^2_{H_K} ]
  \min_{\{c_j\}_1^\infty} [ \sum_{i=1}^{N} L(y_i, \sum_{j=1}^{\infty} c_j \phi_j(x_i)) + \lambda \sum_{j=1}^{\infty} c_j^2 / \gamma_j ]
• The finite-dimensional solution (Wahba, 1990):
  f(x) = \sum_{i=1}^{N} \alpha_i K(x, x_i)
• Reproducing properties of the kernel function:
  \langle K(\cdot, x_i), f \rangle_{H_K} = f(x_i),   \langle K(\cdot, x_i), K(\cdot, x_j) \rangle_{H_K} = K(x_i, x_j)

Spaces of Functions Generated by Kernel
• The solution: f(x) = \sum_{i=1}^{N} \alpha_i K(x, x_i)
• The penalty functional:
  J(f) = \sum_{i=1}^{N} \sum_{j=1}^{N} K(x_i, x_j) \alpha_i \alpha_j
• The regularization problem reduces to a finite-dimensional criterion:
  \min_\alpha L(y, K\alpha) + \lambda \alpha^T K \alpha
  – K is the N \times N matrix whose ij-th entry is K(x_i, x_j)

RKHS
• Penalized least squares:
  \min_\alpha (y - K\alpha)^T (y - K\alpha) + \lambda \alpha^T K \alpha
• The solution for \alpha: \hat\alpha = (K + \lambda I)^{-1} y
• The fitted values:
  \hat f(x) = \sum_{k=1}^{N} \hat\alpha_k K(x, x_k)
• The vector of N fitted values is given by:
  \hat f = K \hat\alpha = K (K + \lambda I)^{-1} y = (I + \lambda K^{-1})^{-1} y

Example of RKHS
• Polynomial regression
  – Suppose h(x): \mathbb{R}^p \to \mathbb{R}^M, with M huge
  – Given x_1, x_2, …, x_N, with M > N, let H = \{h_j(x_i)\} (an N \times M matrix)
  – Loss function: R(\beta) = (y - H\beta)^T (y - H\beta) + \lambda \beta^T \beta
  – The penalty polynomial regression:
      \min_{\{\beta_m\}_1^M} \sum_{i=1}^{N} ( y_i - \sum_{m=1}^{M} \beta_m h_m(x_i) )^2 + \lambda \sum_{m=1}^{M} \beta_m^2
  – Setting \partial R(\beta) / \partial \beta = 0 at \hat\beta:
      -H^T (y - H\hat\beta) + \lambda \hat\beta = 0
      -H H^T (y - H\hat\beta) + \lambda H \hat\beta = 0
  – The solution: H\hat\beta = (H H^T + \lambda I)^{-1} H H^T y
  – \{H H^T\}_{ij} = \langle h(x_i), h(x_j) \rangle = K(x_i, x_j)
  – \hat f(x) = h(x)^T \hat\beta = \sum_{i=1}^{N} \hat\alpha_i K(x, x_i), \hat\alpha = (K + \lambda I)^{-1} y

Penalized Polynomial Regression
• Kernel: K(x, x') = (1 + \langle x, x' \rangle)^d has M = \binom{p+d}{d} eigenfunctions
• E.g. d = 2, p = 2:
  K(x, x') = (1 + x_1 x_1' + x_2 x_2')^2
           = 1 + 2 x_1 x_1' + 2 x_2 x_2' + (x_1 x_1')^2 + (x_2 x_2')^2 + 2 x_1 x_1' x_2 x_2'
  h(x) = (1, \sqrt{2} x_1, \sqrt{2} x_2, x_1^2, x_2^2, \sqrt{2} x_1 x_2)
• The penalty polynomial regression:
  \min_{\{\beta_m\}_1^M} \sum_{i=1}^{N} ( y_i - \sum_{m=1}^{M} \beta_m h_m(x_i) )^2 + \lambda \sum_{m=1}^{M} \beta_m^2

RBF kernel & SVM kernel
• RBF kernel:
  K(x, y) = e^{-\|x - y\|^2 / 2\sigma^2}
  h_j(x) = K(x, x_j), j = 1, …, M
• Support Vector Machines:
  f(x) = \alpha_0 + \sum_{j=1}^{N} \alpha_j K(x, x_j)
  \min_{\alpha_0, \alpha} \{ \sum_{i=1}^{N} [1 - y_i f(x_i)]_+ + \lambda \alpha^T K \alpha \}

Wavelet smoothing
• Another type of basis: wavelet bases.
• Wavelet bases are generated by translations and dilations of a single scaling function \phi(x).
• If \phi(x) = I(x \in [0, 1]), then \phi_{0,k}(x) = \phi(x - k) generates an orthonormal basis for functions with jumps at the integers.
• The \phi_{0,k}(x) span a space called the reference space V_0.
• The dilations \phi_{1,k}(x) = \sqrt{2} \phi(2x - k) form an orthonormal basis for a space V_1 \supset V_0.
• Generally, we have … \supset V_1 \supset V_0 \supset V_{-1} \supset …

• \phi_{j,k}(x) = 2^{j/2} \phi(2^j x - k)
• V_0 \subset V_1 \subset V_2 \subset …
• V_{j+1} = V_j \oplus W_j, where W_j holds the detail of the signal and is orthogonal to V_j
• \psi(x) = \phi(2x) - \phi(2x - 1): the wavelet basis on W_0
• \psi_{j,k}(x) = 2^{j/2} \psi(2^j x - k): the wavelet basis on W_j

Wavelet smoothing
• Decomposition of the L2 space:
  V_{j+1} = V_j \oplus W_j = V_{j-1} \oplus W_{j-1} \oplus W_j
          = V_0 \oplus W_0 \oplus W_1 \oplus … \oplus W_j
[Figure: nested spaces … V_{-2}, V_{-1}, V_0, V_1, V_2 … with detail spaces … W_{-3}, W_{-2}, W_{-1}, W_0 = V_1 / V_0 …]
• The mother wavelet \psi(x) = \phi(2x) - \phi(2x - 1) generates functions \psi_{0,k}(x) = \psi(x - k) that form an orthonormal basis for W_0. Likewise, the \psi_{j,k}(x) = 2^{j/2} \psi(2^j x - k) form a basis for W_j.

Wavelet smoothing
• Wavelet basis on W_0:
  \psi(x) = \phi(2x) - \phi(2x - 1)
• Wavelet basis on W_j:
  \psi_{j,k}(x) = 2^{j/2} \psi(2^j x - k)
• The symmlet-p wavelet:
  – A support of 2p-1 consecutive intervals.
  – p vanishing moments:
    \int \psi(x) x^j dx = 0, j = 0, …, p-1

• Wavelet transform: y^* = W^T y
  – y: response vector; W: N \times N orthonormal wavelet basis matrix
• Stein Unbiased Risk Estimation (SURE):
  \min_\theta \|y - W\theta\|_2^2 + 2\lambda \|\theta\|_1
  – The solution: \hat\theta_j = sign(y^*_j) (|y^*_j| - \lambda)_+
  – The fitted function is given by the inverse wavelet transform \hat f = W \hat\theta, with the LS coefficients truncated to 0
  – Simple choice for \lambda: \lambda = \sigma \sqrt{2 \log N}

• The Reinsch form and the basis form of the smoother matrix agree:
  S_\lambda = (I + \lambda K)^{-1} = N (N^T N + \lambda \Omega_N)^{-1} N^T

```
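The truncated power basis from the spline slides can be tried out numerically. The following is a minimal sketch, assuming NumPy is available; the function name `truncated_power_basis`, the simulated data, and the knot locations are illustrative choices and do not come from the slides.

```python
import numpy as np

def truncated_power_basis(x, knots, order=4):
    """Build the N x (order + K) truncated power basis matrix.

    The first `order` columns are x**0 .. x**(order-1); the remaining K
    columns are (x - xi_l)_+^(order-1), one per knot xi_l.
    """
    x = np.asarray(x, dtype=float)
    poly = [x ** j for j in range(order)]
    trunc = [np.maximum(x - k, 0.0) ** (order - 1) for k in knots]
    return np.column_stack(poly + trunc)

# Simulated data (an assumption for illustration only).
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 1.0, 80))
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(80)

knots = [0.25, 0.5, 0.75]                # hypothetical interior knots
H = truncated_power_basis(x, knots)      # cubic spline: order M = 4
beta, *_ = np.linalg.lstsq(H, y, rcond=None)  # ordinary least squares fit
fitted = H @ beta
```

With K = 3 knots and M = 4 the basis has M + K = 7 columns. In practice B-splines are usually preferred to this basis because powers of large numbers can make the truncated power basis numerically ill-conditioned.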
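The penalized least squares solution in the RKHS slides, \hat\alpha = (K + \lambda I)^{-1} y with fitted values \hat f = K \hat\alpha, can be sketched in a few lines. This example assumes NumPy, one-dimensional inputs, and a Gaussian (RBF) kernel; the values of `sigma` and `lam` and the simulated data are arbitrary illustrative choices. It also computes the effective degrees of freedom as the trace of the resulting smoother matrix, by analogy with df_\lambda = trace(S_\lambda) for smoothing splines.

```python
import numpy as np

def rbf_kernel(a, b, sigma=0.2):
    """Gaussian kernel matrix with entries exp(-(a_i - b_j)^2 / (2 sigma^2))."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

# Simulated data (an assumption for illustration only).
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 50)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(50)

lam = 0.1                                   # hypothetical smoothing parameter
n = len(x)
K = rbf_kernel(x, x)

# alpha_hat = (K + lam I)^{-1} y, solved without forming the inverse.
alpha = np.linalg.solve(K + lam * np.eye(n), y)
fitted = K @ alpha                          # f_hat = K alpha_hat

# Smoother matrix S = K (K + lam I)^{-1}; its trace is the effective df.
S = K @ np.linalg.inv(K + lam * np.eye(n))
df = np.trace(S)

def predict(x_new):
    """Evaluate f_hat(x) = sum_i alpha_i K(x, x_i) at new points."""
    x_new = np.atleast_1d(np.asarray(x_new, dtype=float))
    return rbf_kernel(x_new, x) @ alpha
```

Increasing `lam` shrinks the eigenvalues of S toward zero and lowers `df`; as `lam` approaches zero the fit approaches interpolation of the data.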
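The wavelet shrinkage rule \hat\theta_j = sign(y^*_j)(|y^*_j| - \lambda)_+ can be illustrated with the Haar basis built from \phi(x) = I(x \in [0, 1]). The sketch below assumes NumPy, a signal length that is a power of two, and a known noise level `sigma`; the helper names `haar_matrix` and `soft_threshold` and the test signal are hypothetical, and the slides' symmlet basis is replaced by the simpler Haar basis.

```python
import numpy as np

def haar_matrix(n):
    """Orthonormal Haar basis matrix W (n x n, n a power of two).

    Columns of W are the basis vectors, so W.T @ y gives the coefficients.
    """
    if n == 1:
        return np.array([[1.0]])
    h = haar_matrix(n // 2).T                       # rows are basis vectors
    coarse = np.kron(h, [1.0, 1.0])                 # scaling (average) part
    detail = np.kron(np.eye(n // 2), [1.0, -1.0])   # wavelet (difference) part
    return (np.vstack([coarse, detail]) / np.sqrt(2.0)).T

def soft_threshold(z, lam):
    """sign(z) * (|z| - lam)_+ applied elementwise."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

# Simulated piecewise-constant signal plus noise (illustrative assumption).
rng = np.random.default_rng(2)
n = 64
t = np.arange(n) / n
signal = np.where(t < 0.5, 1.0, -1.0)
sigma = 0.3                                         # noise level, assumed known
y = signal + sigma * rng.standard_normal(n)

W = haar_matrix(n)
y_star = W.T @ y                                    # wavelet transform
lam = sigma * np.sqrt(2.0 * np.log(n))              # simple choice of lambda
theta_hat = soft_threshold(y_star, lam)
f_hat = W @ theta_hat                               # inverse wavelet transform
```

Because W is orthonormal, the penalized criterion separates across coefficients, which is why thresholding each y^*_j independently gives the minimizer.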