A Unified Framework for Subspace Based Face Recognition

M. Phil. Candidate: Wang Xiaogang
Supervisor: Prof. Tang Xiaoou
July 2003
    Abstract
   PCA, LDA and Bayesian analysis are three of
    the most representative subspace based face
    recognition approaches. We show that they
    can be unified under the same framework.
    Starting from this framework, a unified
    subspace analysis is developed using PCA,
    Bayes, and LDA as three steps. It achieves
    better performance than the standard
    subspace methods.
Content
   Introduction to face recognition
   Review of PCA, Bayes, and LDA
   A unified framework for subspace based face
    recognition
   Unified subspace analysis for face recognition
   Experiments
   Several novel subspace methods developed from
    the framework
   Conclusion
Introduction
   Face recognition procedure

   Face detection (determine whether or not there are any faces;
   return the location and extent of the face)
   → Feature extraction → Face feature
   → Recognition (against the gallery) → Output (face identity)
Introduction
   Two kinds of variation

   (Figure: example face pairs showing extrapersonal variation
   and intrapersonal variation.)
Introduction
   Factors affecting face recognition

    • Pose          • Session
    • Lighting      • Decoration
    • Expression    • Occlusion
Introduction
   Difficulties in face recognition
       There are too few (sometimes only one)
        training samples for each face class
       The number of classes to be classified is large
Introduction
   Face recognition methods
       Feature based face recognition
       Appearance based face recognition
Introduction
   Feature based face recognition
       Extract the geometrical relationship and other
        parameters of face features, such as eyes, nose, mouth
        and chin, for matching.
Introduction
   Appearance based face recognition
       A 2D face image is viewed as a vector in the high
       dimensional image space. A suitable metric is then used
       for face matching in the image space or its subspace.

   Face image (M × N pixel array):

       [ I_{1,1}  …  I_{1,N} ]
       [   ⋮            ⋮    ]
       [ I_{M,1}  …  I_{M,N} ]

   Image vector: V = ( I_{1,1}, …, I_{M,1}, …, I_{M,N} )
Introduction
   Appearance based face recognition pipeline

   Image in gallery / probe face image
   → Geometric normalization (alignment, background removal, etc.)
   → Photometric normalization (histogram equalization, or
     normalization to a zero-mean, unit-variance vector)
   → Projection to subspace (extract a low dimensional
     discriminant feature: PCA, Bayes, or LDA)
   → Matching
Review of Subspace Methods
   Notation
       Face data vector length: N
       Training face images: X = [x_1, …, x_M]
       Number of training samples: M
       Face classes: {X_1, …, X_L}
       Number of face classes: L
       Class label of sample x_i: ℓ(x_i)
Principal Component Analysis (PCA)
   A face image is projected onto several face templates,
   called eigenfaces, which capture most of the face
   variation. The projection weights form the vector
   for face representation and recognition.

   x ≈ m + a_1 u_1 + a_2 u_2 + … + a_K u_K

   The feature vector is formed by the K weights:
   y = ( a_1, …, a_K )
Principal Component Analysis (PCA)
   K eigenfaces U = [u_1, …, u_K] are computed from the
   eigenvectors of the covariance matrix C with the largest
   eigenvalues:

   C = Σ_{i=1}^{M} (x_i − m)(x_i − m)^T,   m = (1/M) Σ_{i=1}^{M} x_i

   λ_i u_i = C u_i
                      
The reference image P for each face class in the gallery
database and the probe image T are projected to the PCA
subspace to get

   w_P = U^T (P − m),   w_T = U^T (T − m)

The face class is found by minimizing the distance ‖w_P − w_T‖.
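The PCA matching step above can be sketched in a few lines of NumPy (a minimal illustration on synthetic vectors, not the experimental code; the function names are ours):

```python
import numpy as np

def pca_eigenfaces(X, K):
    """Mean face m and top-K eigenfaces U from training images.

    X: (M, N) array, one flattened face image per row.
    Returns (m, U) with U of shape (N, K), unit-norm columns.
    """
    m = X.mean(axis=0)
    C = (X - m).T @ (X - m)          # covariance matrix (unnormalized, as above)
    eigvals, eigvecs = np.linalg.eigh(C)
    # keep the eigenvectors with the largest eigenvalues
    U = eigvecs[:, np.argsort(eigvals)[::-1][:K]]
    return m, U

def pca_match(T, gallery, m, U):
    """Index of the gallery face whose PCA weights w_P are nearest to w_T."""
    w_T = U.T @ (T - m)
    dists = [np.linalg.norm(U.T @ (P - m) - w_T) for P in gallery]
    return int(np.argmin(dists))
```

For realistic image sizes one would diagonalize the M × M Gram matrix instead of the N × N covariance matrix; the sketch keeps the direct form for clarity.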
Linear Discriminant Analysis (LDA)
   LDA seeks the subspace best discriminating different
   classes. The projection vectors W maximize the ratio
   between the between-class scatter matrix S_B and the
   within-class scatter matrix S_W:

   S_W = Σ_{i=1}^{L} Σ_{x_k ∈ X_i} (x_k − m_i)(x_k − m_i)^T

   S_B = Σ_{i=1}^{L} (m_i − m)(m_i − m)^T

   W = arg max_W |W^T S_B W| / |W^T S_W W|

   Matching uses the distance ‖W^T (T − P)‖.

   (Figure: two classes that overlap under the PCA projection
   are separated by the LDA projection.)
Linear Discriminant Analysis (LDA)
   W can be computed from the eigenvectors of S_W^{-1} S_B
   In face recognition, the number of training samples is
    small (M << N). The rank of S_W is at most M − L, so
    the N × N matrix S_W may become singular
   Usually, the dimensionality of the face data is first
    reduced to M − L using PCA, and LDA is then applied
    in the reduced PCA subspace.
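This PCA-then-LDA recipe can be sketched as follows (a compact illustration, often called Fisherfaces; the function below is our own sketch and assumes n_pca ≤ M − L so that S_W is nonsingular):

```python
import numpy as np

def fisherfaces(X, labels, n_pca, n_lda):
    """PCA-then-LDA sketch: reduce to n_pca dims, then keep the
    n_lda most discriminant directions.

    X: (M, N) training images; labels: length-M class labels.
    Returns (m, W): a raw image x maps to the feature W.T @ (x - m).
    """
    labels = np.asarray(labels)
    m = X.mean(axis=0)
    Xc = X - m
    # PCA step: top n_pca eigenvectors of the covariance matrix (via SVD)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    U = Vt[:n_pca].T                      # (N, n_pca)
    Y = Xc @ U                            # reduced training data, zero mean
    # scatter matrices in the reduced space
    Sw = np.zeros((n_pca, n_pca))
    Sb = np.zeros((n_pca, n_pca))
    for c in np.unique(labels):
        Yc = Y[labels == c]
        mc = Yc.mean(axis=0)
        Sw += (Yc - mc).T @ (Yc - mc)
        Sb += len(Yc) * np.outer(mc, mc)  # global mean of Y is zero
    # eigenvectors of Sw^{-1} Sb with the largest eigenvalues
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(evals.real)[::-1][:n_lda]
    return m, U @ evecs[:, order].real    # compose PCA and LDA projections
```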
Bayesian Face Recognition
   Both PCA and LDA are initially developed considering
    class variation. They classify the input probe face into L
    classes for L individuals.
   The Bayesian algorithm classifies the face difference Δ into
    intrapersonal variation ΩI for the same individual and
    extrapersonal variation ΩE for different individuals.

   (Diagram: L-ary classification [PCA, LDA] matches the probe T
   against classes 1 … L in the gallery by minimum distance;
   binary classification [Bayes] forms Δ = T − P against a
   gallery class and labels it ΩI or ΩE by maximum likelihood.)
Bayesian Face Recognition
   The similarity between two face images is based
    on the intrapersonal likelihood P(Δ|ΩI)
   ΩI is modeled as a Gaussian distribution, and
    P(Δ|ΩI) is estimated as

   P(Δ|ΩI) = 1 / ( (2π)^{N/2} |Σ_I|^{1/2} )
             · exp( −(1/2) (Δ − m_I)^T Σ_I^{−1} (Δ − m_I) )
Bayesian Face Recognition
   Estimate P(Δ|ΩI) in the high dimensional image space
       Apply PCA on the intrapersonal difference set
        {Δ | Δ ∈ ΩI}. The image space is decomposed into the
        principal intrapersonal subspace F and its complementary
        subspace F̄.

   DIFS:  d_F(Δ) = Σ_{i=1}^{K} y_i² / λ_i
   DFFS:  ε²(Δ) = ‖Δ‖² − Σ_{i=1}^{K} y_i²

   y_i is the projection weight of Δ on the i-th intrapersonal
   eigenvector, and λ_i is the corresponding intrapersonal
   eigenvalue.
Bayesian Face Recognition
   P(Δ|ΩI) is computed as

   P(Δ|ΩI) = [ exp( −d_F(Δ)/2 ) / ( (2π)^{K/2} Π_{i=1}^{K} λ_i^{1/2} ) ]
             · [ exp( −ε²(Δ)/(2ρ) ) / ( 2πρ )^{(N−K)/2} ]

   ρ is the average eigenvalue in the complementary subspace F̄.

   All the parameters are fixed in the recognition procedure, so
   it is equivalent to evaluating the distance
   D_I = d_F(Δ) + ε²(Δ)/ρ.
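The distance D_I is straightforward to evaluate once the intrapersonal eigenvectors and eigenvalues are in hand (a minimal NumPy sketch; the argument names are ours):

```python
import numpy as np

def bayes_distance(delta, U_I, lam, rho):
    """Evaluate D_I = d_F(delta) + eps^2(delta)/rho, equivalent to
    ranking by the intrapersonal likelihood P(delta | Omega_I).

    delta: (N,) face difference; U_I: (N, K) intrapersonal eigenvectors
    (orthonormal columns); lam: (K,) intrapersonal eigenvalues;
    rho: average eigenvalue of the complementary subspace.
    """
    y = U_I.T @ delta                    # projection weights in F
    d_F = np.sum(y**2 / lam)             # distance-in-feature-space (DIFS)
    eps2 = delta @ delta - np.sum(y**2)  # distance-from-feature-space (DFFS)
    return d_F + eps2 / rho
```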
Diagram of the Unified Framework for
Subspace Based Face Recognition
   All three methods evaluate the face difference Δ = T − P
   between the probe face T and a reference face P from the
   gallery database (classes 1 … L), projected to a subspace:

   PCA:    ‖w_T − w_P‖ = ‖U^T T − U^T P‖ = ‖U^T Δ‖
           (PCA subspace)
   LDA:    ‖W^T (T − P)‖ = ‖W^T Δ‖
           (LDA subspace)
   Bayes:  D_I = d_F(Δ) + ε²(Δ)/ρ
           (intrapersonal subspace)

   The projected difference is then classified as intrapersonal
   or extrapersonal variation.
Face Difference Model

   Δ = Ĩ + T̃ + Ñ

   Ĩ: intrinsic difference, discriminating identity
   T̃: transformation difference arising from all kinds of
      transformations, such as lighting and expression changes
      (deteriorates recognition)
   Ñ: noise, randomly distributed in face images

   Intrapersonal variation:   ΔI = T̃ + Ñ
   Extrapersonal variation:   ΔE = Ĩ + T̃ + Ñ
PCA Subspace

   C = Σ_{i=1}^{M} (x_i − m)(x_i − m)^T
     = (1/2M) Σ_{i=1}^{M} Σ_{j=1}^{M} (x_i − x_j)(x_i − x_j)^T

   (Diagram: the principal subspace captures T̃ & Ĩ, the
   complementary subspace captures Ñ; the eigenvectors span
   the difference Δ = Ĩ + T̃ + Ñ.)

   Theorem 1: The PCA subspace characterizes the difference
   between any two face images (x_i − x_j), which may belong to
   the same individual or to different individuals.
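The identity behind Theorem 1, that the covariance matrix equals the (1/2M)-scaled sum of all pairwise difference outer products, can be checked numerically (a small sketch on synthetic data):

```python
import numpy as np

# Check: sum_i (x_i - m)(x_i - m)^T
#      = (1/2M) sum_i sum_j (x_i - x_j)(x_i - x_j)^T,
# so PCA indeed models the difference between any two faces.
rng = np.random.default_rng(3)
M, N = 12, 5
X = rng.normal(size=(M, N))
m = X.mean(axis=0)

C = (X - m).T @ (X - m)          # covariance form

D = np.zeros((N, N))             # pairwise-difference form
for i in range(M):
    for j in range(M):
        d = X[i] - X[j]
        D += np.outer(d, d)
D /= 2 * M

assert np.allclose(C, D)
```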
Intrapersonal Subspace (Bayes)

   C_I = Σ_{ℓ(x_i) = ℓ(x_j)} (x_i − x_j)(x_i − x_j)^T

   (Diagram: the eigenvectors of C_I span the intrapersonal
   variation ΔI = T̃ + Ñ; the principal subspace captures T̃,
   while Ĩ and Ñ fall mainly in the complementary subspace.
   The distance combines DIFS and DFFS:
   d = Σ_{i=1}^{K} y_i²/λ_i + ε²/ρ.)
LDA Subspace

   S_W = C_I
   S_B = (1/2M) Σ_{i=1}^{L} Σ_{j=1}^{L} (m_i − m_j)(m_i − m_j)^T

   Theorem 2: The within-class scatter matrix is identical to C_I,
   the covariance matrix used to compute the intrapersonal
   subspace, which characterizes the distribution of face
   variation for the same individual. Using the mean face image
   to describe each individual class, the between-class scatter
   matrix characterizes the variation between any two mean face
   images.
Compare Different Subspaces
   PCA and Bayes can be viewed as intermediate steps of LDA:

   Data → PCA → Whiten → PCA on class centers → LDA subspace
          (the PCA step yields the PCA subspace; whitening yields
          the intrapersonal subspace)

   (Diagram: PCA subspace — principal: T̃ & Ĩ, complementary: Ñ.
   Intrapersonal subspace — principal: T̃, complementary: Ĩ, Ñ.
   LDA subspace — principal: Ĩ, complementary: Ñ, T̃.)
Compare Different Subspaces
   How each algorithm decomposes the face image difference:

   Algorithm   Subspace                 Principal subspace   Complementary subspace
   PCA         PCA subspace             T̃, Ĩ                 Ñ
   Bayes       Intrapersonal subspace   T̃                    Ĩ, Ñ
   LDA         LDA subspace             Ĩ                    T̃, Ñ

   •The subspace dimension of each method can affect the
   recognition performance.
   •Conventional LDA cannot attain the best performance because
   it cannot adjust each individual step: it is computed directly
   from the eigenvectors of S_W^{-1} S_B, which in effect fixes
   the PCA and intrapersonal subspaces at M − L dimensions, and
   the LDA subspace at L − 1 dimensions.
L-ary versus Binary Classification
   There are few samples for each face class, so it is hard to
    estimate the distribution for each class.
   However, human faces share similar intrapersonal
    variation, so we can use the average intrapersonal variation
    to approximate that of each individual class.

   (Figure: the per-class intrapersonal variations of classes
   1 … i … L are each approximated by the average intrapersonal
   variation.)
Unified Subspace Analysis
   Project the face data to the PCA subspace and adjust the PCA
    dimension (dp) to reduce the noise.
   Apply Bayesian analysis in the PCA subspace and adjust
    the dimension (di) of the intrapersonal subspace. The PCA
    subspace and intrapersonal subspace may be computed
    from an enlarged training set containing extra samples
    not in the classes to be recognized.
   Compute the class centers of the L individuals in the
    gallery, and project them to the intrapersonal subspace,
    whitened by the intrapersonal eigenvalues.
   Apply PCA on the L whitened class centers to compute the
    discriminant feature vector of dimension dl.
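The three steps above can be sketched end to end (an illustrative NumPy sketch, not the authors' implementation; it assumes dp, di, and dl are chosen within the ranks of the respective matrices):

```python
import numpy as np

def unified_subspace(X, labels, dp, di, dl):
    """Three-step unified subspace analysis sketch with tunable
    dimensions dp, di, dl. Returns (m, W): project x as W.T @ (x - m)."""
    labels = np.asarray(labels)
    m = X.mean(axis=0)
    Xc = X - m
    # Step 1: PCA to dp dimensions to reduce noise
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    U = Vt[:dp].T
    Y = Xc @ U
    # Step 2: intrapersonal subspace from same-class differences, keep di dims
    diffs = []
    for c in np.unique(labels):
        Yc = Y[labels == c]
        for i in range(len(Yc)):
            for j in range(i + 1, len(Yc)):
                diffs.append(Yc[i] - Yc[j])
    D = np.asarray(diffs)
    evals, evecs = np.linalg.eigh(D.T @ D)       # C_I in the PCA subspace
    V = evecs[:, ::-1][:, :di]                   # largest-eigenvalue directions
    lam = evals[::-1][:di]
    # Step 3: whiten the class centers by the intrapersonal eigenvalues,
    # then PCA on the whitened centers gives dl discriminant directions
    centers = np.stack([Y[labels == c].mean(axis=0) for c in np.unique(labels)])
    Zc = (centers @ V) / np.sqrt(lam)            # whitened class centers
    Zc = Zc - Zc.mean(axis=0)
    _, _, Wt = np.linalg.svd(Zc, full_matrices=False)
    A = Wt[:dl].T
    # compose: raw image -> PCA -> whitened intrapersonal -> discriminant feature
    return m, U @ (V / np.sqrt(lam)) @ A
```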
Unified Subspace Analysis
   Advantages
       It provides a new 3D parameter space to improve the
        recognition performance.

   (Figure: the 3D parameter space with axes dp, di, and dl;
   standard PCA, Bayes [DIFS, DFFS], and LDA correspond to
   particular points and lines within this space.)
Unified Subspace Analysis
   It adopts different training data at different training steps
    according to the special requirement of each step.

   (Diagram: training samples outside the gallery, covering
   lighting changes, train the PCA and intrapersonal subspaces;
   training samples in the gallery, covering expression changes,
   train the LDA subspace; testing involves both expression and
   lighting changes.)
Experiments
   Data set from the FERET face database
       There are two face images (FA/FB) for each individual
       990 face images of 495 people for training
       Another 700 people for testing
           700 face images in the gallery as reference
           700 face images as probes

   (Figure: examples of FA/FB pairs and a normalized face image.)
Experiments
   PCA

   (Plot: recognition accuracy vs. number of eigenvectors
   [100–900] for PCA with Euclidean distance, compared with a
   direct correlation baseline.)
Experiments
   Bayes

   DIFS:  d = Σ_{i=1}^{K} y_i² / λ_i
   DFFS:  ε² = ‖Δ‖² − Σ_{i=1}^{K} y_i²
   ML:    DIFS + DFFS

   (Plot: recognition accuracy vs. number of eigenvectors
   [20–140] for ML and DIFS, compared with direct correlation.)
Experiments
   Bayesian analysis in the reduced PCA space

   (Plots: recognition accuracy as a function of the PCA
   dimension dp and the intrapersonal dimension di. The accuracy
   surface has a low-accuracy region and a maximum point that
   exceeds both the PCA benchmark and Bayes on the raw face
   data.)
Experiments
   Extract discriminant features from intrapersonal
    subspace
Experiments
   Subspace analysis using different training sets
       Select 100 people from the FERET database, with 4 face
        images taken in different sessions for each person
       The gallery contains 2 face images of each person,
        200 face images in total
       The remaining 200 face images are used as probes

       An extra training set contains 1204 face images of 400
        people outside the gallery
Experiments
   Subspace analysis using different training sets

                         Case I: All the three steps use the 200
                         samples of 100 people in the gallery to
                         compute the subspaces

                         Case II: All the three steps use the 1404
                         samples of 500 people including 400
                         people outside the gallery to compute the
                         subspaces
                         Case III: PCA and intrapersonal
                         subspaces are computed from the 1404
                         samples, and the LDA subspace is computed
                         from only the 200 samples in the gallery
Several Novel Methods Developed from this
Framework
   Discriminant analysis in dual intrapersonal subspaces

   (Figure: the face difference Δ is decomposed into the DIFS and
   DFFS components of the intrapersonal subspace F, and
   discriminant analysis is applied in both.)
Several Novel Methods Developed from this
Framework
   Random subspace based LDA face recognition

   (Diagram: multiple random subspaces are sampled from the face
   data; an LDA classifier is trained on each, and the classifier
   outputs are fused.)
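The random-subspace idea can be sketched as follows (a toy illustration only; the majority-vote fusion rule and all parameters here are our assumptions, not the method's exact design):

```python
import numpy as np

def lda_directions(Y, labels, k):
    """Top-k eigenvectors of Sw^{-1} Sb (small ridge keeps Sw invertible)."""
    labels = np.asarray(labels)
    d = Y.shape[1]
    Sw, Sb = np.zeros((d, d)), np.zeros((d, d))
    mg = Y.mean(axis=0)
    for c in np.unique(labels):
        Yc = Y[labels == c]
        mc = Yc.mean(axis=0)
        Sw += (Yc - mc).T @ (Yc - mc)
        Sb += len(Yc) * np.outer(mc - mg, mc - mg)
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw + 1e-6 * np.eye(d), Sb))
    return evecs[:, np.argsort(evals.real)[::-1][:k]].real

def random_subspace_lda(X, labels, probe, n_classifiers=11, sub_dim=5, seed=0):
    """Each classifier runs LDA on a random subset of feature dimensions and
    votes for the nearest class center; the fused answer is the majority vote."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    votes = np.zeros(len(classes), dtype=int)
    for _ in range(n_classifiers):
        idx = rng.choice(X.shape[1], size=sub_dim, replace=False)
        W = lda_directions(X[:, idx], labels, k=len(classes) - 1)
        Z = X[:, idx] @ W
        centers = np.stack([Z[labels == c].mean(axis=0) for c in classes])
        votes[np.argmin(np.linalg.norm(centers - probe[idx] @ W, axis=1))] += 1
    return classes[np.argmax(votes)]
```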

   The eigentransformation algorithm can reduce the large
    transformation difference. It is applied to face sketch
    recognition and face hallucination.
Several Novel Methods Developed from this
Framework

   (Figure: photo / sketch / synthesized sketch examples, and
   high-resolution / low-resolution / hallucinated face image
   examples.)
Conclusion
   Using a face difference model, we discover how each of
    the subspace methods contributes to extracting the
    discriminating information from the face difference.
   This leads to a 3D parameter space using the three subspace
    dimensions as axes. Searching through this space, better
    performance is achieved than with the standard subspace
    methods.
   To find the optimal parameters, first observe the dp–di
    accuracy surface to decide dp and di, then choose dl
    according to the accuracy curve in the LDA subspace.
Conclusion
   The framework provides a better understanding of how to
    select the training sets. Since the subspace analysis can be
    divided into three steps, a different training set can be
    adopted according to the special requirement of each step.
Publications
• X. Wang and X. Tang, “Unified Subspace Analysis for Face
Recognition,” Proceedings of ICCV, 2003.
• X. Wang and X. Tang, “Face Sketch Synthesis and Recognition,”
Proceedings of ICCV, 2003.
• X. Wang and X. Tang, “An Improved Bayesian Algorithm in
Reduced PCA Space,” Proceedings of ICASSP, 2003.
• X. Wang and X. Tang, “Face Hallucination and Recognition,”
Proceedings of the 4th International Conference on Audio- and
Video-Based Person Authentication, 2003.
• X. Tang and X. Wang, “Face Photo Recognition Using Sketch,”
Proceedings of ICIP, pp. I-257–I-260, 2002.
Thank you!