Face recognition

KH Wong

Overview

- PCA: principal component analysis
- Application to face recognition: eigenfaces
- Reference: [Bebis]

PCA: principal component analysis [1]

- A method of data compression.
- Use less data in y to represent x, while retaining the important information.
- E.g. N = 10: 10 dimensions are reduced to K = 5 dimensions.
- So recognition is easier.

$$x = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_N \end{bmatrix} \rightarrow y = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_K \end{bmatrix}, \quad \text{where } N > K$$

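Data reduction: N data -> K data (N > K), how?

$$y = Tx, \quad x = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_N \end{bmatrix}, \quad y = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_K \end{bmatrix}, \quad T = \begin{bmatrix} t_{11} & t_{12} & \dots & t_{1N} \\ t_{21} & t_{22} & \dots & t_{2N} \\ \vdots & \vdots & & \vdots \\ t_{K1} & t_{K2} & \dots & t_{KN} \end{bmatrix}, \quad \text{where } N > K$$

Written out row by row:

$$b_1 = t_{11}a_1 + t_{12}a_2 + \dots + t_{1N}a_N$$
$$b_2 = t_{21}a_1 + t_{22}a_2 + \dots + t_{2N}a_N$$
$$\vdots$$
$$b_K = t_{K1}a_1 + t_{K2}a_2 + \dots + t_{KN}a_N$$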
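As a small illustration of y = Tx, the following MATLAB sketch maps a 3-dimensional x to a 2-dimensional y. The entries of T and x are arbitrary made-up numbers chosen only to show the shapes involved; they are not a PCA result.

```matlab
% Sketch only: T and x are arbitrary illustrative values, not from the slides.
N = 3; K = 2;
x = [1; 2; 3];      % N-by-1 input vector (a_1 .. a_N)
T = [1 0 1;         % K-by-N transformation matrix
     0 2 0];
y = T * x;          % K-by-1 reduced vector (b_1 .. b_K)
% b_1 = 1*1 + 0*2 + 1*3 = 4,  b_2 = 0*1 + 2*2 + 0*3 = 4
disp(y')
```

Any K-by-N matrix reduces the dimension in this way; PCA is one particular rule for choosing T so that little information is lost.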

Dimensionality basis

- The higher-dimensional space has basis v_i, i = 1, 2, ..., N:

$$x_j = a_1 v_1 + a_2 v_2 + \dots + a_N v_N$$

  where v_1, v_2, ..., v_N are the basis of the N-dimensional space.

- The lower-dimensional space has basis u_i, i = 1, 2, ..., K:

$$\hat{x}_j = b_1 u_1 + b_2 u_2 + \dots + b_K u_K$$

  where u_1, u_2, ..., u_K are the basis of the K-dimensional space, and N >= K.

- We can make the vector $\hat{x}_j \approx x_j$.
- Note: if N = K then $\hat{x}_j = x_j$.

Redundancy concept in space

[Figure: two scatter plots on axes x1 and x2; the second plot also shows an axis u1 drawn along the spread of the data.]

- First plot: the data are spread all over the 2D space, so the redundancy of using 2 axes (x1, x2) is low.
- Second plot: the data are spread along one line, so the redundancy of using 2 axes is high. We can consider using one axis (u1) along the spread of the data to represent it, although some error may be introduced.

The concept

- In this diagram the data is not entirely random, so we can use a 1-dimensional space (basis u1) to represent (approximate) the data in the 2-dimensional space (basis v1, v2).
- A point is $x_j$ according to the space with basis {v1, v2}, or $\hat{x}_j$ according to the space with basis {u1}, with $\hat{x}_j \approx x_j$.
- But some information is lost, because $x_j = \hat{x}_j + \text{error}$.

[Figure: data points in the (v1, v2) plane with the axis u1 drawn along their spread.]
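The idea of representing a 2-D point by a single coordinate along u1 can be sketched in MATLAB as follows. The point x and the direction u1 are made-up values for illustration; the slides do not give numbers for this diagram.

```matlab
% Sketch only: x and u1 are hypothetical values, not taken from the slides.
x  = [2.0; 2.2];           % a point in the basis {v1, v2}
u1 = [1; 1] / sqrt(2);     % unit vector along the spread of the data
b1 = u1' * x;              % coordinate of x along u1 (a single number)
x_hat = b1 * u1;           % approximation of x, expressed back in {v1, v2}
err   = x - x_hat;         % the part of x that is lost
fprintf('b1 = %.4f, length of error = %.4f\n', b1, norm(err));
```

The error vector is perpendicular to u1, which is why projecting onto the direction of largest spread keeps the error small.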

PCA enables the information loss to be minimized

- Use the eigenvalue method to find the best solution, i.e. the one that minimizes the error:

$$\text{minimize the error } \lVert x_j - \hat{x}_j \rVert$$

PCA algorithm [smith 2002]

(The proof is in the Appendix.)

- Step 1: get data
- Step 2: subtract the mean
- Step 3: find the covariance matrix C
- Step 4: find the eigenvectors and eigenvalues of C
- Step 5: choose the large feature components (the eigenvectors with the largest eigenvalues)

Some math background

- Mean
- Variance / standard deviation
- Covariance
- Covariance matrix

Mean, variance (var) and standard deviation (std)

$$\text{mean} = \mu = \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

$$\text{var}(x) = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \mu)^2$$

$$\text{std}(x) = \sqrt{\text{var}(x)}$$

MATLAB's var and std divide by n-1 by default, which gives the values below.

% MATLAB code
x=[2.5 0.5 2.2 1.9 3.1 2.3 2 1 1.5 1.1]'   % column vector of 10 samples
mean_x=mean(x)   % mean_x = 1.8100
var_x=var(x)     % var_x  = 0.6166
std_x=std(x)     % std_x  = 0.7852

Covariance [see Wolfram MathWorld]

- "Covariance is a measure of the extent to which corresponding elements from two sets of ordered data move in the same direction."
- http://stattrek.com/matrix-algebra/variance.aspx

$$X = \begin{bmatrix} x_1 \\ \vdots \\ x_N \end{bmatrix}, \quad Y = \begin{bmatrix} y_1 \\ \vdots \\ y_N \end{bmatrix}$$

$$\text{covariance}(X, Y) = \sum_{i=1}^{N} \frac{(x_i - \bar{x})(y_i - \bar{y})}{N-1}$$

Covariance (variance-covariance) matrix

"Variance-Covariance Matrix: Variance and covariance are often displayed together in a variance-covariance matrix. The variances appear along the diagonal and covariances appear in the off-diagonal elements", http://stattrek.com/matrix-algebra/variance.aspx

Assume you have C sets of data X_c, c = 1, 2, ..., C. Each has N entries.

$$X_c = \begin{bmatrix} x_{c,1} \\ \vdots \\ x_{c,N} \end{bmatrix}, \quad \bar{X}_c = \text{mean}(X_c)$$

$$\text{covariance\_matrix} = \frac{1}{N-1}\begin{bmatrix}
\sum_{i=1}^{N}(x_{1,i}-\bar{X}_1)(x_{1,i}-\bar{X}_1) & \sum_{i=1}^{N}(x_{1,i}-\bar{X}_1)(x_{2,i}-\bar{X}_2) & \dots & \sum_{i=1}^{N}(x_{1,i}-\bar{X}_1)(x_{C,i}-\bar{X}_C) \\
\sum_{i=1}^{N}(x_{2,i}-\bar{X}_2)(x_{1,i}-\bar{X}_1) & \sum_{i=1}^{N}(x_{2,i}-\bar{X}_2)(x_{2,i}-\bar{X}_2) & \dots & \sum_{i=1}^{N}(x_{2,i}-\bar{X}_2)(x_{C,i}-\bar{X}_C) \\
\vdots & \vdots & & \vdots \\
\sum_{i=1}^{N}(x_{C,i}-\bar{X}_C)(x_{1,i}-\bar{X}_1) & \sum_{i=1}^{N}(x_{C,i}-\bar{X}_C)(x_{2,i}-\bar{X}_2) & \dots & \sum_{i=1}^{N}(x_{C,i}-\bar{X}_C)(x_{C,i}-\bar{X}_C)
\end{bmatrix}$$

N or N-1 as denominator?

See http://stackoverflow.com/questions/3256798/why-does-matlab-native-function-cov-covariance-matrix-computation-use-a-differe

- "n-1 is the correct denominator to use in computation of variance. It is what's known as Bessel's correction" (http://en.wikipedia.org/wiki/Bessel%27s_correction). Simply put, when the mean is itself estimated from the same sample, dividing by n-1 gives an unbiased estimate of the variance, whereas dividing by n underestimates it on average.
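The difference between the two denominators can be checked directly in MATLAB. This is a minimal sketch using the sample vector x from the mean/variance slide; var(x) uses n-1 and var(x,1) uses n.

```matlab
x = [2.5 0.5 2.2 1.9 3.1 2.3 2 1 1.5 1.1]';
n = length(x);
var_n_minus_1 = var(x);      % default: divides by n-1 (Bessel's correction)
var_n         = var(x, 1);   % second argument 1: divides by n
% The two results differ only by the factor (n-1)/n.
fprintf('var with n-1: %.4f, var with n: %.4f\n', var_n_minus_1, var_n);
```

For n = 10 the two values differ by 10%; the gap shrinks as the number of samples grows.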

Example: entry (1,1) of the covariance matrix

For two data columns X_1 and X_2 (N = 10 entries each):

$$\text{covariance\_matrix} = \begin{bmatrix}
\sum_{i=1}^{N}(x_{1,i}-\bar{X}_1)(x_{1,i}-\bar{X}_1)/(N-1) & \sum_{i=1}^{N}(x_{1,i}-\bar{X}_1)(x_{2,i}-\bar{X}_2)/(N-1) \\
\sum_{i=1}^{N}(x_{2,i}-\bar{X}_2)(x_{1,i}-\bar{X}_1)/(N-1) & \sum_{i=1}^{N}(x_{2,i}-\bar{X}_2)(x_{2,i}-\bar{X}_2)/(N-1)
\end{bmatrix}$$

Step 2: X_data_adj = X = Xo - mean(Xo) =

      X_1       X_2
  [  0.6900    0.4900
    -1.3100   -1.2100
     0.3900    0.9900
     0.0900    0.2900
     1.2900    1.0900
     0.4900    0.7900
     0.1900   -0.3100
    -0.8100   -0.8100
    -0.3100   -0.3100
    -0.7100   -1.0100 ]

Entry (1,1):

$$\sum_{i=1}^{N}(x_{1,i}-\bar{X}_1)(x_{1,i}-\bar{X}_1)/(N-1)$$

= (0.69*0.69 + (-1.31)*(-1.31) + 0.39*0.39 + 0.09*0.09 + 1.29*1.29 + 0.49*0.49 + 0.19*0.19 + (-0.81)*(-0.81) + (-0.31)*(-0.31) + (-0.71)*(-0.71)) / (10-1)
= 0.6166

Example: entry (1,2) of the covariance matrix

Using the same mean-adjusted data X = Xo - mean(Xo), with columns X_1 and X_2, the off-diagonal entry is

$$\sum_{i=1}^{N}(x_{1,i}-\bar{X}_1)(x_{2,i}-\bar{X}_2)/(N-1)$$

= (0.69*0.49 + (-1.31)*(-1.21) + 0.39*0.99 + 0.09*0.29 + 1.29*1.09 + 0.49*0.79 + 0.19*(-0.31) + (-0.81)*(-0.81) + (-0.31)*(-0.31) + (-0.71)*(-1.01)) / (10-1)
= 0.6154

Example: entry (2,2) of the covariance matrix

Using the same mean-adjusted data X = Xo - mean(Xo), with columns X_1 and X_2, the second diagonal entry is

$$\sum_{i=1}^{N}(x_{2,i}-\bar{X}_2)(x_{2,i}-\bar{X}_2)/(N-1)$$

= (0.49*0.49 + (-1.21)*(-1.21) + 0.99*0.99 + 0.29*0.29 + 1.09*1.09 + 0.79*0.79 + (-0.31)*(-0.31) + (-0.81)*(-0.81) + (-0.31)*(-0.31) + (-1.01)*(-1.01)) / (10-1)
= 0.7166

Hence the covariance matrix of X is

cov_x =
  [ 0.6166   0.6154
    0.6154   0.7166 ]

Eigenvectors of a square matrix

Skip the following 3 slides if you are familiar with eigenvalues and eigenvectors.

- AX = λX.
- Because A is 2x2 and has rank 2, A has 2 eigenvalues and 2 eigenvectors.
- In MATLAB: [eigvec,eigval]=eig(A)

covariance_matrix of X:
cov_x =
  [ 0.6166   0.6154
    0.6154   0.7166 ]

eigvect of cov_x =
  [ -0.7352   0.6779
     0.6779   0.7352 ]

eigval of cov_x =
  [ 0.0492   0
    0        1.2840 ]

So
- eigenvalue 1 = 0.0492, and its eigenvector is [-0.7352 0.6779]^T (column 1 of eigvect);
- eigenvalue 2 = 1.2840, and its eigenvector is [0.6779 0.7352]^T (column 2 of eigvect).

To find eigenvalues

$$A\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \lambda \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \quad A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

Row by row:

$$a x_1 + b x_2 = \lambda x_1 \;\Rightarrow\; (a - \lambda)x_1 = -b x_2 \quad \text{(i)}$$
$$c x_1 + d x_2 = \lambda x_2 \;\Rightarrow\; c x_1 = (\lambda - d)x_2 \quad \text{(ii)}$$

Dividing (i) by (ii):

$$\frac{a - \lambda}{c} = \frac{-b}{\lambda - d}$$
$$a\lambda - ad - \lambda^2 + \lambda d = -bc$$
$$\lambda^2 - (d + a)\lambda + (ad - bc) = 0$$

The solution to this quadratic equation is

$$\lambda = \frac{(d + a) \pm \sqrt{(d + a)^2 - 4(ad - bc)}}{2}$$

So if

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} 0.6166 & 0.6154 \\ 0.6154 & 0.7166 \end{bmatrix}$$

then

$$\lambda = \frac{0.7166 + 0.6166}{2} \pm \frac{\sqrt{(0.7166 + 0.6166)^2 - 4(0.6166 \times 0.7166 - 0.6154 \times 0.6154)}}{2}$$

and the eigenvalues are $\lambda_1 = 0.0492$, $\lambda_2 = 1.2840$.
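The quadratic formula for the eigenvalues of a 2x2 matrix can be checked against MATLAB's eig. This is a minimal sketch using the covariance matrix from the example; the variable names are chosen here for illustration.

```matlab
A = [0.6166 0.6154;
     0.6154 0.7166];
a = A(1,1); b = A(1,2); c = A(2,1); d = A(2,2);
disc = sqrt((a + d)^2 - 4*(a*d - b*c));   % square root of the discriminant
lambda_small = ((a + d) - disc) / 2;
lambda_large = ((a + d) + disc) / 2;
lambda_eig = sort(eig(A));                % same eigenvalues from eig, ascending
fprintf('%.4f %.4f\n', lambda_small, lambda_large);   % about 0.0492 and 1.2840
```

The closed form only exists for small matrices; for larger ones eig is the practical route.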

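Find eigenvectors from eigenvalues

The eigenvalues are $\lambda_1 = 0.0492$, $\lambda_2 = 1.2840$.

The eigenvector for $\lambda_1$ is $[x_1 \; x_2]^T$, where

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \lambda_1 \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$

$$\begin{bmatrix} 0.6166 & 0.6154 \\ 0.6154 & 0.7166 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 0.0492 \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$

Solve the above equation for the ratio $x_1 : x_2$ and scale the result to unit length:

$$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} -0.7352 \\ 0.6779 \end{bmatrix}$$

Similarly, the eigenvector for $\lambda_2$ is

$$\begin{bmatrix} x'_1 \\ x'_2 \end{bmatrix} = \begin{bmatrix} 0.6779 \\ 0.7352 \end{bmatrix}$$

So
- eigenvalue 1 = 0.0492, and its eigenvector is [-0.7352 0.6779]^T;
- eigenvalue 2 = 1.2840, and its eigenvector is [0.6779 0.7352]^T.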
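One way to solve the eigenvector equation by hand is to note that (a - λ)x1 + b x2 = 0 is satisfied by [x1; x2] = [b; λ - a], and then normalize. The MATLAB sketch below does this for λ1 with the rounded values from the slide. An eigenvector is only defined up to sign, so the result may be the negative of the vector listed above.

```matlab
A = [0.6166 0.6154;
     0.6154 0.7166];
lambda1 = 0.0492;                    % small eigenvalue (rounded)
% (a - lambda) x1 + b x2 = 0 holds for [x1; x2] = [b; lambda - a]
v = [A(1,2); lambda1 - A(1,1)];
v = v / norm(v);                     % scale to unit length
residual = norm(A*v - lambda1*v);    % small, limited by the rounding of lambda1
disp(v')
```

The residual is not exactly zero because 0.0492 is a 4-digit rounding of the true eigenvalue.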

PCA example

Step 1: original data

Xo =
  [ 2.5000   2.4000
    0.5000   0.7000
    2.2000   2.9000
    1.9000   2.2000
    3.1000   3.0000
    2.3000   2.7000
    2.0000   1.6000
    1.0000   1.1000
    1.5000   1.6000
    1.1000   0.9000 ]

Mean = [ 1.81   1.91 ]

Step 2: X_data_adj = X = Xo - mean(Xo) =

  [  0.6900    0.4900
    -1.3100   -1.2100
     0.3900    0.9900
     0.0900    0.2900
     1.2900    1.0900
     0.4900    0.7900
     0.1900   -0.3100
    -0.8100   -0.8100
    -0.3100   -0.3100
    -0.7100   -1.0100 ]

The data is biased in this 2D space (not random), so PCA for data reduction will work. We will show that X can be approximated in a 1-D space with small data loss.

Step 3: covariance matrix of X

cov_x =
  [ 0.6166   0.6154
    0.6154   0.7166 ]

Step 4: eigenvectors and eigenvalues of cov_x

eigvect of cov_x =
  [ -0.7352   0.6779
     0.6779   0.7352 ]
  (column 1: eigenvector with the small eigenvalue; column 2: eigenvector with the large eigenvalue)

eigval of cov_x =
  [ 0.0492   0
    0        1.2840 ]
  (0.0492: small eigenvalue; 1.2840: large eigenvalue)

Step 5: choose the eigenvector (large feature component) with the large eigenvalue for the transformation to reduce data

Covariance matrix of X:

cov_x =
  [ 0.6166   0.6154
    0.6154   0.7166 ]

eigvect of cov_x =
  [ -0.7352   0.6779
     0.6779   0.7352 ]
  (column 1: eigenvector with the small eigenvalue; column 2: eigenvector with the large eigenvalue)

eigval of cov_x =
  [ 0.0492   0
    0        1.2840 ]

Y = PX, where
- X = the mean-adjusted original data, transposed so that each column is one data point;
- Y = the new (transformed) data;
- P = a matrix in which each row p_i is an eigenvector of cov(X).

You have two choices.

1. Full reconstruction case (for comparison only, no data lost):

$$P\_fully\_rec = \begin{bmatrix} (\text{eigenvector with the biggest eigenvalue of covariance}(X))^T \\ (\text{eigenvector with the second biggest eigenvalue of covariance}(X))^T \end{bmatrix} = \begin{bmatrix} 0.6779 & 0.7352 \\ -0.7352 & 0.6779 \end{bmatrix}$$

2. Approximate transform (for data reduction; this is the one the PCA algorithm selects):

$$P\_approx\_rec = \begin{bmatrix} (\text{eigenvector with the biggest eigenvalue of covariance}(X))^T \\ 0 \end{bmatrix} = \begin{bmatrix} 0.6779 & 0.7352 \\ 0 & 0 \end{bmatrix}$$

Y = PX, with P_fully_rec = [0.6779 0.7352; -0.7352 0.6779] and P_approx_rec = [0.6779 0.7352; 0 0]

  Y_full = P_fully_rec * X             Y_approx = P_approx_rec * X
  (uses 2 eigenvectors,                (uses 1 eigenvector,
   both columns are filled)             the second column is 0)

     0.8280   -0.1751                     0.8280   0
    -1.7776    0.1429                    -1.7776   0
     0.9922    0.3844                     0.9922   0
     0.2742    0.1304                     0.2742   0
     1.6758   -0.2095                     1.6758   0
     0.9129    0.1753                     0.9129   0
    -0.0991   -0.3498                    -0.0991   0
    -1.1446    0.0464                    -1.1446   0
    -0.4380    0.0178                    -0.4380   0
    -1.2238   -0.1627                    -1.2238   0

  {No data lost, for comparison only}  {Data reduction 2D -> 1D, data loss exists}

Reconstruction

Y = PX

$X = P^{-1}Y = P^T Y$, because $P^{-1} = P^T$ when the rows of P are orthonormal eigenvectors (the full P).

- reconstructed_x_full = P_fully_rec^T * Y_full. This is the same as the original, so there is no loss of information.
- reconstructed_x_approx = P_approx_rec^T * Y_approx. There is some loss of information.

[Figure: plot of the original and recovered data. 'O' = original data; '+' = recovered using all eigenvectors; squares = recovered using the one eigenvector that has the biggest eigenvalue (the principal component). The eigenvector with the large eigenvalue is drawn in red and the eigenvector with the small eigenvalue in blue.]
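The two reconstructions can be compared numerically. The MATLAB sketch below rebuilds both from the example data and also checks a known property of PCA: the squared error of the 1-D approximation, divided by N-1, equals the discarded (small) eigenvalue. Variable names other than P_fully_rec and P_approx_rec are chosen here for illustration.

```matlab
Xo = [2.5 2.4; 0.5 0.7; 2.2 2.9; 1.9 2.2; 3.1 3.0; ...
      2.3 2.7; 2.0 1.6; 1.0 1.1; 1.5 1.6; 1.1 0.9];
N  = size(Xo, 1);
X  = (Xo - repmat(mean(Xo), N, 1))';          % 2-by-N, one data point per column
[V, D] = eig(cov(Xo));                        % columns of V are eigenvectors
[lambda, order] = sort(diag(D), 'descend');   % largest eigenvalue first
V = V(:, order);
P_fully_rec  = V';                            % rows are eigenvectors
P_approx_rec = [V(:,1)'; 0 0];                % keep the principal component only
X_full   = P_fully_rec'  * (P_fully_rec  * X);   % equals X up to rounding
X_approx = P_approx_rec' * (P_approx_rec * X);   % points projected onto one axis
sq_err = sum(sum((X - X_approx).^2)) / (N - 1);
fprintf('error: %.4f, discarded eigenvalue: %.4f\n', sq_err, lambda(2));
```

Adding the mean back to X_full or X_approx gives the recovered points in the original coordinates.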

Exercise

- ?

PCA algorithm

Input data: $x_1, x_2, \dots, x_M$ are N x 1 vectors.

Step 1: $\bar{x} = \frac{1}{M}\sum_{i=1}^{M} x_i$

Step 2: subtract the mean: $\Phi_i = x_i - \bar{x}$ (N x 1)

Step 3: form the matrix $A = [\Phi_1 \; \Phi_2 \; \dots \; \Phi_M]$ (N x M) and find the covariance matrix

$$C = \frac{1}{M}\sum_{j=1}^{M} \Phi_j \Phi_j^T = \frac{1}{M} A A^T \quad (N \times N)$$

Step 4: find the eigenvalues of C: $\lambda_1 > \lambda_2 > \dots > \lambda_N$

Step 5: find the eigenvectors of C: $u_1, u_2, \dots, u_N$

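PCA algorithm (continued)

- Since C is symmetric, $u_1, u_2, \dots, u_N$ form a basis, i.e. any vector x, or actually $(x - \bar{x})$, can be written as a linear combination of the eigenvectors:

$$x - \bar{x} = b_1 u_1 + b_2 u_2 + \dots + b_N u_N = \sum_{i=1}^{N} b_i u_i$$

Step 6 (dimensionality reduction step): keep only the terms corresponding to the K largest eigenvalues:

$$\hat{x} - \bar{x} = \sum_{i=1}^{K} b_i u_i, \quad \text{where } K \ll N$$

The representation of $\hat{x} - \bar{x}$ in the basis $u_1, u_2, \dots, u_K$ is thus

$$\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_K \end{bmatrix}$$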

PCA

- The space of dimension N is reduced to dimension K.
- The $u_i$ are normalized unit vectors.

$$\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_K \end{bmatrix}_{(K \times 1)} = \begin{bmatrix} u_1^T \\ u_2^T \\ \vdots \\ u_K^T \end{bmatrix}_{(K \times N)} (x - \bar{x})_{(N \times 1)} = U^T (x - \bar{x})$$
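Steps 1 to 6 and the projection b = U^T (x - x_bar) can be sketched in MATLAB as below. The data matrix is made up for illustration (M = 5 samples of dimension N = 3), and the variable names are not from the slides. The covariance uses the 1/M factor of this algorithm rather than the 1/(N-1) used by cov.

```matlab
% Sketch only: Xdata is a made-up N-by-M matrix, one sample per column.
Xdata = [2 4 1; 3 6 1; 4 8 2; 5 10 2; 6 12 3]';
[N, M] = size(Xdata);
K = 2;                                    % number of components to keep
x_bar = mean(Xdata, 2);                   % Step 1: mean vector (N-by-1)
A = Xdata - repmat(x_bar, 1, M);          % Steps 2-3: mean-subtracted columns
C = (A * A') / M;                         % Step 3: covariance matrix (N-by-N)
C = (C + C') / 2;                         % remove rounding asymmetry before eig
[U, D] = eig(C);                          % Steps 4-5: eigenvalues and eigenvectors
[lambda, order] = sort(diag(D), 'descend');
U  = U(:, order);
Uk = U(:, 1:K);                           % Step 6: keep the K largest
B  = Uk' * A;                             % K-by-M coefficients b = U^T (x - x_bar)
X_hat = Uk * B + repmat(x_bar, 1, M);     % approximation of the original data
disp(lambda')
```

Each column of B is the K-dimensional representation of the corresponding sample, and X_hat shows what is recovered from it.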

Geometric interpretation

- PCA transforms the coordinates so that the new axes lie along the spread of the data. Here the axes are u1 and u2.
- The coordinates are determined by the eigenvectors of the covariance matrix corresponding to the largest eigenvalues.
- The magnitude of each eigenvalue corresponds to the variance of the data along the direction of its eigenvector.

[Figure: data in the (x1, x2) plane centered at the mean $\bar{x}$, with the axes u1 and u2 drawn along the spread of the data.]

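Choose K

- Choose K so that

$$\frac{\sum_{i=1}^{K} \lambda_i}{\sum_{i=1}^{N} \lambda_i} > \text{Threshold (e.g. 0.95)}$$

- A threshold of 0.95 will preserve 95% of the information.
- If K = N, 100% will be preserved (no data reduction).
- The error is

$$e = \frac{1}{2}\sum_{i=K+1}^{N} \lambda_i$$

- Data standardization is needed. Before you use $x_i$, replace it by

$$x_i \leftarrow \frac{x_i - \text{mean}(x_i)}{\text{std}(x_i)}, \quad \text{std} = \text{standard deviation}$$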
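The threshold rule can be sketched in MATLAB with a cumulative sum. The eigenvalues below are the two from the 2-D example; the variable names are chosen here for illustration.

```matlab
lambda = [1.2840 0.0492];               % eigenvalues from the 2-D example
threshold = 0.95;
lambda = sort(lambda, 'descend');       % largest first
ratio = cumsum(lambda) / sum(lambda);   % fraction preserved for K = 1, 2, ...
K = find(ratio > threshold, 1);         % smallest K that passes the threshold
fprintf('K = %d preserves %.1f%% of the variance\n', K, 100 * ratio(K));
```

For these eigenvalues a single component already preserves about 96% of the variance, which is why the 1-D approximation in the example works well.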

Application to face recognition

- Step 1: obtain the training face images I1, I2, ..., IM (centered and of the same size).
- Each image is represented by a vector $\Gamma$.
- Each image (an N x N matrix) becomes an (N^2 x 1) vector.

E.g. for N = 100, an image of 100 x 100 pixels becomes a vector of size N^2 x 1 = 10000 x 1.
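Turning an image matrix into a column vector is a single reshape in MATLAB. This sketch uses a small made-up 4x4 "image" instead of a real 100x100 face.

```matlab
% Sketch only: I is a made-up N-by-N "image", with N = 4 standing in for N = 100.
N = 4;
I = reshape(1:N*N, N, N);        % N-by-N matrix of pixel values
Gamma = I(:);                    % stack the columns into an N^2-by-1 vector
I_back = reshape(Gamma, N, N);   % the image can be recovered from the vector
disp(size(Gamma))
```

All training images must be vectorized in the same way so that each row of the data matrix always refers to the same pixel.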

Application to face recognition (continued)

- Note: C (size N^2 x N^2) is too large to be calculated; if N = 100, C is 10000 x 10000.

Input data: $\Gamma_1, \Gamma_2, \dots, \Gamma_M$ are N^2 x 1 vectors.

Step 3: $\Psi = \frac{1}{M}\sum_{i=1}^{M} \Gamma_i$ (N^2 x 1)

Step 4: subtract the mean: $\Phi_i = \Gamma_i - \Psi$ (N^2 x 1)

Step 5: form the matrix $A = [\Phi_1 \; \Phi_2 \; \dots \; \Phi_M]$ (N^2 x M) and find the covariance matrix C of A:

$$C = \frac{1}{M}\sum_{j=1}^{M} \Phi_j \Phi_j^T = \frac{1}{M} A A^T \quad (N^2 \times N^2)$$

Application to face recognition (continued)

Step 6: find the eigenvectors $u_i$ of $AA^T$ (N^2 x N^2), e.g. size(A) is N^2 x M = 10000 x 300:

$$AA^T u_i = \lambda_i u_i$$

$C = \frac{1}{M}AA^T$ is too large to be calculated in limited time: the size of C is N^2 x N^2 = 10000 x 10000 if N = 100.

The trick is as follows:

Step 6.1: find $A^T A$ (size M x M) instead of $C = AA^T$ (size N^2 x N^2). E.g. with M = 300 training images, $A^T A$ has size 300 x 300.

Step 6.2: find $v_i$, the eigenvectors of $A^T A$.

What is the relation between
- $u_i$ = eigenvector of $AA^T$ (N^2 x N^2 = 10000 x 10000), and
- $v_i$ = eigenvector of $A^T A$ (M x M = 300 x 300)?

Application to face recognition (continued)

What is the relation between
- $u_i$ = eigenvector of $AA^T$ (N^2 x N^2 = 10000 x 10000), and
- $v_i$ = eigenvector of $A^T A$ (M x M = 300 x 300)?

Answer (relation of $u_i$ and $v_i$):

$$AA^T u_i = \lambda_i u_i \quad \text{(i)}$$
$$A^T A v_i = \mu_i v_i \quad \text{(ii)}$$

From (ii): $\mu_i$ is an eigenvalue (scalar) of $A^T A$. Multiply each side of (ii) by A:

$$AA^T A v_i = A \mu_i v_i = \mu_i A v_i, \quad \text{since } \mu_i \text{ is a scalar}$$
$$AA^T (A v_i) = \mu_i (A v_i) \quad \text{(iii)}$$

Compare (iii) with (i):

$$u_i = A v_i \quad \text{and} \quad \lambda_i = \mu_i$$

So $AA^T$ and $A^T A$ have the same (nonzero) eigenvalues, and their eigenvectors are related by $u_i = A v_i$.

Important result: $\lambda_i = \mu_i$, $u_i = A v_i$.

Important results

(e.g. N = 100, M = 300)

- $AA^T$ (size 10000 x 10000) has N^2 = 10000 eigenvectors and eigenvalues.
- $A^T A$ (size 300 x 300) has M = 300 eigenvectors and eigenvalues.
- The M eigenvalues of $A^T A$ are the same as the M largest eigenvalues of $AA^T$.

Important result: $\lambda_i = \mu_i$, $u_i = A v_i$.

Application to face recognition (continued)

Step 6.3: find the best M = 300 eigenvectors $u_i$, i = 1, 2, ..., 300, and the M = 300 eigenvalues $\mu_i$, i = 1, 2, ..., M, of $A^T A$ (M x M = 300 x 300).

- The first M eigenvalues $\lambda_i$ of $AA^T$ and the eigenvalues $\mu_i$ of $A^T A$ are the same:

$$\mu_i = \lambda_i, \quad i = 1, 2, \dots, M$$

- Also $u_i = A v_i$.
- Note: normalize $u_i$, so that $\lVert u_i \rVert = 1$.

Step 7: only the first K (e.g. K = 5) eigenvectors, those with the largest eigenvalues, are useful in most cases.
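The trick of steps 6.1 to 6.3 can be checked on a small case. In the MATLAB sketch below, A is a made-up 6x3 matrix (6 "pixels", M = 3 "images"), so both the small problem A'A (3x3) and the large problem AA' (6x6) can be solved and compared. In a real application only the small problem would be computed.

```matlab
% Sketch only: A is a made-up matrix, not mean-subtracted face data.
A = [1 2 0; 0 1 3; 2 0 1; 1 1 1; 0 2 2; 3 1 0];
M = size(A, 2);
[V, Dsmall] = eig(A' * A);                    % small M-by-M problem
[mu, order] = sort(diag(Dsmall), 'descend');
V = V(:, order);
U = A * V;                                    % u_i = A v_i
U = U ./ repmat(sqrt(sum(U.^2, 1)), size(U, 1), 1);   % normalize so ||u_i|| = 1
lambda = sort(eig(A * A'), 'descend');        % large problem, for comparison only
disp([mu, lambda(1:M)])                       % the two columns agree
residual = norm(A * A' * U - U * diag(mu));   % checks A A' u_i = mu_i u_i
```

The remaining eigenvalues of AA' are zero up to rounding, which is why nothing is lost by solving only the M x M problem.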

Training faces

[Figure: the training face images.]

Find the eigenvectors u1, u2, ..., uK with the largest eigenvalues

[Figure: the eigenvectors displayed as images.]

Eigenfaces

- For each face, find the K (e.g. K = 5) face images (called eigenfaces) corresponding to the first K eigenvectors with the largest eigenvalues.
- Each face (with the mean subtracted) can be represented by a combination of eigenfaces:

$$\hat{\Phi}_i = \sum_{j=1}^{K} w_j u_j, \quad w_j = u_j^T \Phi_i, \quad u_j = \text{eigenfaces}$$

  where $\Phi_i = \Gamma_i - \Psi$ is the face with the mean subtracted.

Figure source: http://onionesquereality.files.wordpress.com/2009/02/eigenfaces-reconstruction.jpg
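The weights w_j and the reconstruction of a face from eigenfaces can be sketched in MATLAB as below. The four 3x3 "faces" are made-up numbers, not real images, and the variable names are chosen for illustration. With M = 4 mean-subtracted faces there are at most M-1 = 3 nonzero eigenvalues, so K = 3 reconstructs a training face exactly; a smaller K gives an approximation.

```matlab
% Sketch only: four made-up 3-by-3 "face images", one per column of G.
G = [8 1 6 3 5 7 4 9 2;
     1 2 3 4 5 6 7 8 9;
     9 7 5 3 1 2 4 6 8;
     2 2 2 5 5 5 9 9 1]';                     % N^2-by-M = 9-by-4
[N2, M] = size(G);
Psi = mean(G, 2);                             % mean face
A = G - repmat(Psi, 1, M);                    % mean-subtracted faces Phi_i
[V, D] = eig(A' * A);                         % small M-by-M problem
[mu, order] = sort(diag(D), 'descend');
K = 3;                                        % number of eigenfaces kept
U = A * V(:, order(1:K));                     % eigenfaces u_j = A v_j
U = U ./ repmat(sqrt(sum(U.^2, 1)), N2, 1);   % unit length
w = U' * A(:, 1);                             % weights w_j = u_j' * Phi_1
face_hat = Psi + U * w;                       % reconstruction of the first face
fprintf('reconstruction error: %.2e\n', norm(face_hat - G(:, 1)));
```

The K weights in w are the compact description of the face that a recognizer would compare between images.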

Reference

- [Bebis] "Face Recognition Using Eigenfaces", www.cse.unr.edu/~bebis/CS485/Lectures/Eigenfaces.ppt
- [Turk 91] Turk and Pentland, "Eigenfaces for recognition", Journal of Cognitive Neuroscience 3(1), pp. 71-86, 1991.
- [smith 2002] L I Smith, "A tutorial on Principal Components Analysis", http://www.cs.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf
- [Shlens 2005] Jonathon Shlens, "A tutorial on Principal Component Analysis", http://www.cs.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf
- [AI Access] http://www.aiaccess.net/English/Glossaries/GlosMod/e_gm_covariance_matrix.htm

Appendix: pca_test1.m

% pca_test1.m: example using the data in [smith 2002], L I Smith,
% "A tutorial on Principal Components Analysis",
% www.cs.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf
% MATLAB code by khwong
function pca_test1
%---------Step 1: get some data------------------
x=[2.5 0.5 2.2 1.9 3.1 2.3 2 1 1.5 1.1]'   % column vector
y=[2.4 0.7 2.9 2.2 3.0 2.7 1.6 1.1 1.6 0.9]'   % column vector
N=length(x)
%---------Step 2: subtract the mean------------------
mean_x=mean(x), mean_y=mean(y)
x_adj=x-mean_x, y_adj=y-mean_y   % mean-adjusted x, y
%---------Step 3: calculate the covariance matrix----------------
data_adj=[x_adj,y_adj]
cov_x=cov(data_adj)
%---------Step 4: calculate eigenvectors and eigenvalues of cov_x-------------
[eigvect,eigval]=eig(cov_x)
eigval_1=eigval(1,1), eigval_2=eigval(2,2)
eigvect_1=eigvect(:,1), eigvect_2=eigvect(:,2)
% both lengths are 1, so the eigenvectors are unit vectors
eigvector1_length=sqrt(eigvect_1(1)^2+eigvect_1(2)^2)
eigvector2_length=sqrt(eigvect_2(1)^2+eigvect_2(2)^2)
% sorted: the eigenvector with the big eigenvalue goes first
% (for this data eig returns the small eigenvalue 1st and the large one 2nd)
P_full=[eigvect_2';eigvect_1']
P_approx=[eigvect_2';[0,0]]   % keep the (2nd) big eigenvector only, small one gone
%---------Plot: same diagram as in fig. 3.1 of [smith 2002]------------
figure(1)
clf
hold on
plot(-1,-1)
plot(4,4), plot([-1,4],[0,0],'-'), plot([0,0],[-1,4],'-')
%---------Step 5: select feature------------------
% eigenvectors, drawn with length proportional to their eigenvalues
plot([0,eigvect(1,1)*eigval_1],[0,eigvect(2,1)*eigval_1],'b-')   % 1st vector
plot([0,eigvect(1,2)*eigval_2],[0,eigvect(2,2)*eigval_2],'r-')   % 2nd vector
title('eigenvector 2 (red) is much longer (bigger eigenvalue), so keep it')
plot(x,y,'bo')   % original data
%---------full reconstruction (all eigenvectors)------------------
final_data_full=P_full*data_adj'
recovered_data_full=P_full'*final_data_full+repmat([mean_x;mean_y],1,N)
plot(recovered_data_full(1,:),recovered_data_full(2,:),'r+')
%---------approximate reconstruction (one eigenvector)------------
final_data_approx=P_approx*data_adj'
recovered_data_approx=P_approx'*final_data_approx+repmat([mean_x;mean_y],1,N)
plot(recovered_data_approx(1,:),recovered_data_approx(2,:),'gs')

A short proof of PCA (principal component analysis)

- This proof is not rigorous; the detailed proof can be found in [Shlens 2005].
- Objective:
  - We have an input data set X with zero mean and would like to transform X to Y (Y = PX, where P is the transformation) in a coordinate system in which Y varies more along the principal (or major) components than along the other components.
  - E.g. X is in a 2-dimensional space (coordinates x1, x2); after transforming X into Y (coordinates y1, y2), the data in Y vary mainly along the y1-axis and little along the y2-axis.
  - That is to say, we want to find P so that the covariance of Y (cov_Y = [1/(n-1)] Y Y^T) is a diagonal matrix, because a diagonal matrix has nonzero elements only on its diagonal, which shows that the coordinates of Y have no correlation. n is used for normalization.

A short proof of PCA (continued)

- Given X (with zero mean), we want to show that for Y = PX, if each row p_i of P is an eigenvector of X X^T, then the covariance of Y is a diagonal matrix.
- Proof:
  - For the covariance matrix (cov_Y) of Y:
  - cov_Y = [1/(n-1)] Y Y^T (n is a normalization factor = the number of samples, i.e. the number of columns of X or Y); put Y = PX
  - cov_Y = [1/(n-1)] (PX)(PX)^T
  - cov_Y = [1/(n-1)] (PX) X^T P^T
  - cov_Y = [1/(n-1)] P (X X^T) P^T ------(1)
  - From theorems 3 and 4 in the appendix of [Shlens 2005]: X X^T = P^T D P, where each row p_i of P is an (orthonormal) eigenvector of X X^T and D is a diagonal matrix. Put this in (1):
  - cov_Y = [1/(n-1)] P (P^T D P) P^T = [1/(n-1)] (P P^T) D (P P^T)
  - cov_Y = [1/(n-1)] D, because P P^T = I. So the covariance of Y (cov_Y) is a diagonal matrix, meaning that the coordinates in Y have no correlation.
- We showed that if each row p_i of P is an eigenvector of X X^T, then the covariance of Y is a diagonal matrix.
- (Done)
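The result of the proof can be checked numerically with the example data. In the MATLAB sketch below, P is built from the eigenvectors of X X^T, and the off-diagonal entries of cov_Y come out as zero up to rounding. Variable names are chosen here for illustration.

```matlab
Xo = [2.5 2.4; 0.5 0.7; 2.2 2.9; 1.9 2.2; 3.1 3.0; ...
      2.3 2.7; 2.0 1.6; 1.0 1.1; 1.5 1.6; 1.1 0.9];
n = size(Xo, 1);
X = (Xo - repmat(mean(Xo), n, 1))';   % 2-by-n data with zero-mean rows
S = X * X';
S = (S + S') / 2;                     % remove rounding asymmetry before eig
[V, D] = eig(S);                      % columns of V: eigenvectors of X X^T
P = V';                               % rows of P are the eigenvectors
Y = P * X;
cov_Y = (Y * Y') / (n - 1);           % diagonal, up to rounding
disp(cov_Y)
```

The diagonal entries of cov_Y are the eigenvalues of the covariance matrix of X, i.e. the variances along the new axes.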

Appendix

Covariance $\sigma_{ij}$ and covariance matrix $\Sigma$
(reference: http://en.wikipedia.org/wiki/Covariance_matrix)

$$X = \begin{bmatrix} X_1 \\ \vdots \\ X_n \end{bmatrix}$$

$$\sigma_{ij} = \text{cov}(X_i, X_j) = E[(X_i - \mu_i)(X_j - \mu_j)]$$

where $\mu_i = E(X_i)$ = expected value = mean.

$$\Sigma = \begin{bmatrix}
E[(X_1-\mu_1)(X_1-\mu_1)] & E[(X_1-\mu_1)(X_2-\mu_2)] & \dots & E[(X_1-\mu_1)(X_n-\mu_n)] \\
E[(X_2-\mu_2)(X_1-\mu_1)] & E[(X_2-\mu_2)(X_2-\mu_2)] & \dots & E[(X_2-\mu_2)(X_n-\mu_n)] \\
\vdots & \vdots & & \vdots \\
E[(X_n-\mu_n)(X_1-\mu_1)] & E[(X_n-\mu_n)(X_2-\mu_2)] & \dots & E[(X_n-\mu_n)(X_n-\mu_n)]
\end{bmatrix}$$

Appendix: test_cov.m

% test_cov.m: compare a hand-computed covariance matrix with MATLAB's cov
clear
x=[2.5 0.5 2.2 1.9 3.1 2.3 2 1 1.5 1.1]'
y=[2.4 0.7 2.9 2.2 3.0 2.7 1.6 1.1 1.6 0.9]'
x_adj=x-mean(x), y_adj=y-mean(y)   % subtract the means
cov_xy=0, cov_xx=0, cov_yy=0, N=length(x)
% accumulate the sums of products of the mean-adjusted data
for i=1:N
    cov_xx=x_adj(i)*x_adj(i)+cov_xx;
    cov_xy=x_adj(i)*y_adj(i)+cov_xy;
    cov_yy=y_adj(i)*y_adj(i)+cov_yy;
end
'self cal'
% denominator N-1 (Bessel's correction)
c_N_minus_1_as_denon=[cov_xx/(N-1) cov_xy/(N-1) ; cov_xy/(N-1) cov_yy/(N-1)]
% denominator N
c_N_as_denon=[cov_xx/(N) cov_xy/(N) ; cov_xy/(N) cov_yy/(N)]
'using matlab cov function'
matlab_cov_xy=cov(x,y)      % default: denominator N-1
matlab_cov_xy1=cov(x,y,1)   % third argument 1: denominator N

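Appendix: eigenvalue and eigenvector

- $A_{(N \times N)} X_{(N \times 1)} = \lambda_{(1 \times 1)} X_{(N \times 1)}$
- A is a square matrix, X is a vector, and $\lambda$ is a scalar.
- Applied to its eigenvector X, A is a transformation that does not change the direction of X but only its length (X is scaled by $\lambda$).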


