face-recognition-int.. - Face Recognition


                                                       Face Detection Problem
       Face Recognition                      • Scan window over image

                                             • Classify window as either:
                                                 – Face
                                                 – Non-face

                                             Window → Classifier

                                                Face Detection: Experimental Results
Face Detection now in many Digital Cameras
                                                       Test set 1: 125 images with 483 faces
                                                       Test set 2: 20 images with 136 faces

            Canon Powershot


Face Recognition Problem                                   Face Verification Problem
                                                      • Face Authentication/Verification (1:1 matching)

                                     query image
                                     Query face
                                                      • Face Identification/Recognition (1:N matching)


Application: Access Control                                 Biometric Authentication
www.viisage.com



                                            Application: Autotagging Photos in
Application: Video Surveillance             Facebook, Flickr, Picasa, iPhoto, …

Face Scan at Airports


                                                               iPhoto 2009
                                          • Can be trained to recognize pets!



                iPhoto 2009           Why is Face Recognition Hard?
• Things iPhoto thinks are faces      The many faces of Madonna

                                              Intra-class Variability
Recognition should be Invariant to
                                     • Faces with intra-subject variations in pose, illumination,
                                       expression, accessories, color, occlusions, and brightness

        •   Lighting variation
        •   Head pose variation
        •   Different expressions
        •   Beards, disguises
        •   Glasses, occlusion
        •   Aging, weight gain
        •   …


        Inter-class Similarity                                                                      Face Detection in Humans
• Different people may have very similar appearance
                                                                                                There are cells that detect faces in the
                                                                                                  “Fusiform Face Area” of brain


              Twins                          Father and son

Blurred Faces are Recognizable                                                                  Blurred Faces are Recognizable

                                                                                                  Michael Jordan, Woody Allen, Goldie Hawn, Bill Clinton, Tom Hanks,
                                                                                                  Saddam Hussein, Elvis Presley, Jay Leno, Dustin Hoffman, Prince
                                                                                                  Charles, Cher, and Richard Nixon. The average recognition rate at this
                                                                                                  resolution is one-half.


by L. Harmon and B. Julesz, Scientific American, 1973       S. Dali, Gala contemplating the Mediterranean Sea, 1976

                                                       Eyebrows Aid Recognition

                                                          Richard Nixon and Winona Ryder


       Saccadic Eye Movements                                                                   Field of View
                                                                               • Human vision system uses narrow-field-of-view
                                                                                 and wide-field-of-view naturally and intelligently
                                                                                   2°, high-acuity fovea window of the world
                                                                                   3 saccades per second and gaze moves
                                                                                   Human vision can integrate information seamlessly

    Work by Russian psychophysicist Yarbus who traced saccadic eye movements

                         Challenges

•     Sinha et al. [2005] use this example to illustrate the difficulty of
      finding a suitable “similarity” measure to gauge similarity between a
      pair of faces.
•     In this example, the outer two faces actually belong to the same
      person while the middle one does not. But conventional pixel-based
      measures would say otherwise.
•     Common variations in pose (this case), lighting, expression,
      distance, and aging remain challenges to face recognition.

                 Problems of Recognition:
            Recognition is Easier than Synthesis

Pawan Sinha gave an Identikit operator photographs of celebrities and asked him to
create the best likenesses he could. He thought he did very well.

Who are these people?


                                                                          Illumination and Shading Affect
            Problems of Recognition

 Bill Cosby, Tom Cruise, Ronald Reagan, Michael Jordan

       Vision is Inferential: Illumination                                    Vision is Inferential: Illumination

                                                    Which square is
                                                    darker, A or B?


             Context is Important

P. Sinha and T. Poggio, I think I know that face, Nature 384, 1996, 404.
                                                                           P. Sinha and T. Poggio, Last but not least, Perception 31, 2002, 133.

             Holistic Processing                                                   Holistic Processing

                                   Who is/are this person?

                                                                                                             Woody Allen and Oprah Winfrey


     NIST’s Face Recognition
        Grand Challenges

• Goal: Advance performance of face recognition
  10-fold (20% → 2% verification rate @ 0.1% false
  alarm rate)
• Focus on different scenarios

                                                        Face Recognition Architecture

                                                     Image (window) → Feature Extraction → Face Identity


    Image as a Feature Vector

    • Consider an n-pixel image to be a point in
      an n-dimensional “image space,” $x \in \mathbb{R}^n$
    • Each pixel value is a coordinate of x

    [Figure: an image plotted as a point x in a space with axes x1, x2, x3]

    Nearest Neighbor Classifier

    $\{R_j\}$ is the set of training images
    $\mathrm{ID} = \arg\min_j \, \mathrm{dist}(R_j, I)$

    [Figure: points in image space; recoverable labels: x1, x3, R2]
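
The nearest-neighbor rule above can be sketched in a few lines. This is a minimal illustration, assuming Euclidean distance for dist (the slide does not say which distance is used); the 4-pixel "images" and the labels are made-up values.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length pixel vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbor_id(training, query, dist=euclidean):
    """Return the label of the training image closest to `query`.

    `training` is a list of (label, vector) pairs, i.e. the set {R_j};
    `query` is the image I, flattened to a vector of pixel values.
    """
    label, _ = min(training, key=lambda pair: dist(pair[1], query))
    return label

# Hypothetical 4-pixel training images, one per person.
training = [
    ("alice", [10, 10, 200, 200]),
    ("bob", [200, 200, 10, 10]),
]
result = nearest_neighbor_id(training, [12, 9, 190, 205])
print(result)
```

Every query costs one distance computation per training image, each over all n pixels, which is the expense that motivates the dimensionality reduction that follows.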

                  Key Idea

• One or more images for each person (class)
• Expensive to compute k distances, especially
  when each image is big (n-dimensional)
• Not all images are very likely, especially when
  we know that every image contains a face. That is,
  images of faces are highly correlated, so
  compress them into a low-dimensional subspace
  that retains the key appearance characteristics

                  Eigenfaces (Turk and Pentland, 91)

• Use Principal Component Analysis
  (PCA) to reduce the dimensionality


       Eigenface Representation                                              Eigenface Representation
 Each face image is represented by a weighted combination
 of a small number of “component” or “basis” faces

   Principal Component Analysis (PCA)

• Pattern recognition in high-dimensional spaces
   − Problems arise when performing recognition in a high-dimensional
     space (“curse of dimensionality”)
   − Significant improvements can be achieved by first mapping the data
     into a lower-dimensional subspace
   − The goal of PCA is to reduce the dimensionality of the data
     while retaining the important variations present in the original
     data

   Principal Component Analysis (PCA)

   − Dimensionality reduction implies information
     loss
   − How to determine the best lower-dimensional subspace?
   − Maximize information content in the
     compressed data by finding a set of k
     orthogonal vectors that account for as much of
     the data’s variance as possible
   − Best dimension = direction in n-D with max variance
   − 2nd best dimension = direction orthogonal to the first with
     max remaining variance


 Principal Component Analysis (PCA)

 • Geometric interpretation
   − PCA projects the data along the directions where the data
     varies the most
   − These directions are determined by the eigenvectors of the
     covariance matrix corresponding to the largest eigenvalues
   − The magnitude of the eigenvalues corresponds to the variance
     of the data along the eigenvector directions

 Principal Component Analysis (PCA)

   − The best low-dimensional space can be
     determined by the “best” eigenvectors of the
     covariance matrix of the data, i.e., the
     eigenvectors corresponding to the largest
     eigenvalues, also called “principal
     components”
   − Can be efficiently computed using SVD
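
The geometric interpretation can be checked on a small 2D example. The sketch below is restricted to two dimensions so that the eigenvalues and eigenvectors of the 2 x 2 covariance matrix have a closed form; that restriction and the sample points are assumptions made for illustration. Real face data would need a numerical eigen-solver or SVD.

```python
import math

def pca_2d(points):
    """Eigenvalues and unit eigenvectors of the 2x2 covariance of `points`,
    ordered by decreasing eigenvalue (closed form for a symmetric 2x2 matrix)."""
    m = len(points)
    mx = sum(p[0] for p in points) / m
    my = sum(p[1] for p in points) / m
    cxx = sum((p[0] - mx) ** 2 for p in points) / m
    cyy = sum((p[1] - my) ** 2 for p in points) / m
    cxy = sum((p[0] - mx) * (p[1] - my) for p in points) / m
    # Eigenvalues of [[cxx, cxy], [cxy, cyy]].
    half_trace = (cxx + cyy) / 2
    root = math.sqrt(((cxx - cyy) / 2) ** 2 + cxy ** 2)
    lam1, lam2 = half_trace + root, half_trace - root
    # Angle of the major axis; the minor axis is perpendicular to it.
    theta = 0.5 * math.atan2(2 * cxy, cxx - cyy)
    u1 = (math.cos(theta), math.sin(theta))
    u2 = (-math.sin(theta), math.cos(theta))
    return (lam1, lam2), (u1, u2)

# Hypothetical points scattered around the diagonal y = x.
points = [(0, 0), (1, 1), (2, 2), (3, 3), (1, 2), (2, 1)]
(lam1, lam2), (u1, u2) = pca_2d(points)
print(lam1, lam2, u1)
```

The larger eigenvalue is the variance along the diagonal direction u1, and the two eigenvalues sum to the total variance of the data.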


                    Subspaces

• Suppose we have points in 2D and we take a
  line through that space
• We can project each point onto that 1D line

                    Subspaces

• Some lines will represent the data well and
  others not, depending on how well the
  projection separates the data points
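
A small sketch of projecting 2D points onto a line through the origin, using the variance of the projected coordinates as a stand-in for "how well the projection separates the data points". The points and the two candidate directions are illustrative assumptions.

```python
import math

def project_onto_line(points, direction):
    """1D coordinate of each 2D point along the line spanned by `direction`."""
    norm = math.hypot(direction[0], direction[1])
    ux, uy = direction[0] / norm, direction[1] / norm
    return [x * ux + y * uy for x, y in points]

def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

# Hypothetical points lying on the diagonal y = x.
points = [(1, 1), (2, 2), (3, 3), (4, 4)]
var_along = variance(project_onto_line(points, (1, 1)))    # line along the data
var_across = variance(project_onto_line(points, (1, -1)))  # line across the data
print(var_along, var_across)
```

Projecting onto the diagonal keeps the points spread out, while projecting onto the perpendicular line collapses them onto a single coordinate.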


                 Subspaces

• Rather than using a line, we can do a similar
  projection onto a vector, v
• Scale the vector to obtain any point on the line

                 Eigenvectors

• An eigenvector is a vector, u, that obeys the
  following rule:
      $\lambda_i u_i = C u_i$
  where C is a matrix and $\lambda_i$ is a scalar called the
  eigenvalue
• Example:
      $C = \begin{pmatrix} 2 & 3 \\ 2 & 1 \end{pmatrix}, \quad u = \begin{pmatrix} 3 \\ 2 \end{pmatrix}, \quad 4 \begin{pmatrix} 3 \\ 2 \end{pmatrix} = \begin{pmatrix} 2 & 3 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} 3 \\ 2 \end{pmatrix}$
• So eigenvalue = 4 for this eigenvector
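
The eigenvector example can be verified by carrying out the matrix-vector product. This short check uses the matrix C = [[2, 3], [2, 1]] and vector u = (3, 2) from the slide; the helper name matvec is just for this illustration.

```python
def matvec(C, u):
    """Matrix-vector product C u for a list-of-rows matrix."""
    return [sum(c * x for c, x in zip(row, u)) for row in C]

C = [[2, 3], [2, 1]]
u = [3, 2]

Cu = matvec(C, u)      # 2*3 + 3*2 = 12 and 2*3 + 1*2 = 8
lam = Cu[0] / u[0]     # ratio of the first components

# u is an eigenvector only if every component is scaled by the same factor.
is_eigenvector = all(abs(c - lam * x) < 1e-12 for c, x in zip(Cu, u))
print(Cu, lam, is_eigenvector)
```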

                    Method

• Each input image, $X_i$, is an n-D column
  vector of all pixel values (in raster order)
• Compute “average face image” from all M
  training images of all people:
      $A = \frac{1}{M} \sum_{i=1}^{M} X_i$
• Normalize each training image, $X_i$, by
  subtracting the average face:
      $Y_i = X_i - A$

                    Method

• Stack all training images together into an n x M matrix:
      $Y = [Y_1 \; Y_2 \; \dots \; Y_M]$
• Compute n x n Covariance Matrix
      $C = \frac{1}{M} Y Y^T = \frac{1}{M} \sum_{i} Y_i Y_i^T$
• Compute eigenvalues and eigenvectors of
  C by solving
      $\lambda_i u_i = C u_i$
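
The average face, the mean-subtracted images and the covariance matrix can be sketched as follows. This is a plain-Python illustration on three made-up 3-pixel images, with the covariance normalized by 1/M; the function names are not from the slides.

```python
def average_face(images):
    """A = (1/M) * sum_i X_i, computed pixel by pixel."""
    M = len(images)
    return [sum(img[p] for img in images) / M for p in range(len(images[0]))]

def center(images, A):
    """Y_i = X_i - A for every training image."""
    return [[x - a for x, a in zip(img, A)] for img in images]

def covariance(Y):
    """C = (1/M) * sum_i Y_i Y_i^T, an n x n matrix stored as a list of rows."""
    M, n = len(Y), len(Y[0])
    return [[sum(y[r] * y[c] for y in Y) / M for c in range(n)] for r in range(n)]

# Hypothetical training set: M = 3 images with n = 3 pixels each.
X = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [2.0, 5.0, 2.0]]
A = average_face(X)
Y = center(X, A)
C = covariance(Y)
print(A)
print(C)
```

For real images n is the number of pixels, so C has n x n entries and is rarely formed explicitly.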


                 Method

• Compute eigenvalues and eigenvectors of
  C by solving
      $\lambda_i u_i = C u_i$
  where the eigenvalues are
      $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n$
  and the corresponding eigenvectors are
      $u_1, u_2, \dots, u_n$

                 Method

• Each $u_i$ is an n x 1 eigenvector called an
  “eigenface” (to be cute!)
• The eigenfaces form a “basis,” meaning
      $Y_i = w_{i1} u_1 + w_{i2} u_2 + \dots + w_{in} u_n$
      $X_i = \sum_{j=1}^{n} w_{ij} u_j + A$
• Image is exactly reconstructed by a linear
  combination of all eigenvectors

                 Method

• Reduce dimensionality by using only the
  best k << n eigenvectors (i.e., the ones
  corresponding to the largest k eigenvalues):
      $X_i \approx \sum_{j=1}^{k} w_{ij} u_j + A$
• Each image $X_i$ is approximated by a set of
  k weights $[w_{i1}, w_{i2}, \dots, w_{ik}] = W_i$ where
      $w_{ij} = u_j^T (X_i - A)$

                 How do you Construct Face Space?

[Figure: data matrix [ X1 X2 X3 X4 X5 ] and eigenface matrix [ u1 u2 u3 ]]

Construct data matrix by stacking vectorized
images and then using Singular Value
Decomposition (SVD) to compute eigenfaces


    Eigenspace Representation

• Key property: Given 2 images, $X_1$ and $X_2$,
  and their projections into eigenspace, $Z_1$
  and $Z_2$, then
      $\| X_1 - X_2 \| \approx \| Z_1 - Z_2 \|$
• That is, distance in eigenspace approximates
  the distance between the 2 images in image space,
  which for normalized images is directly related
  to their correlation

    Face Image Reconstruction

• Face X in “face space” coordinates:
      [Figure: X expressed by its eigenface weights]
• Reconstruction:
      $X = A + w_1 u_1 + w_2 u_2 + w_3 u_3 + w_4 u_4 + \dots$
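
Projection into face space and reconstruction from the weights can be sketched as below. The three orthonormal basis vectors stand in for eigenfaces and, like the mean A and the image X, are hypothetical values chosen so the arithmetic is easy to follow; they were not computed from real faces.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def project(X, A, basis):
    """Weights w_j = u_j^T (X - A), one per basis vector u_j."""
    Y = [x - a for x, a in zip(X, A)]
    return [dot(u, Y) for u in basis]

def reconstruct(W, A, basis):
    """X_hat = A + sum_j w_j u_j."""
    X_hat = list(A)
    for w, u in zip(W, basis):
        X_hat = [xh + w * ui for xh, ui in zip(X_hat, u)]
    return X_hat

s = 1 / math.sqrt(2)
# Hypothetical orthonormal "eigenfaces" for 3-pixel images.
basis = [[s, s, 0.0], [s, -s, 0.0], [0.0, 0.0, 1.0]]
A = [2.0, 3.0, 2.0]   # assumed average face
X = [4.0, 3.0, 2.5]   # assumed input image

# Reconstruction error when keeping the first k basis vectors.
errors = []
for k in (1, 2, 3):
    W = project(X, A, basis[:k])
    errors.append(math.dist(X, reconstruct(W, A, basis[:k])))
print(errors)
```

The error shrinks as k grows and reaches zero once all basis vectors are used, which mirrors the statement that the image is exactly reconstructed by a linear combination of all eigenvectors.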

               Reconstruction

The more eigenfaces you use, the better the reconstruction,
but even a small number gives good quality for matching

               Method

• So, image $X_i$ is a point in n-D “image
  space” that is projected into a point $W_i$
  in the k-D subspace called “face space”
  defined by the “eigenfaces” (i.e., basis
  vectors) $u_1, u_2, \dots, u_k$


         Eigenfaces Algorithm

•    Modeling (Training)
    1. Given a collection of M labeled training images
    2. Compute mean image and covariance matrix
    3. Compute k eigenvectors (note that these are
       images) of covariance matrix corresponding to k
       largest eigenvalues
    4. Project the training images to the k-dimensional
       face space
•    Recognition (Testing)
    1. Given a test image, project it into face space
    2. Classify it as the class (person) that is closest to
       it (as long as its distance to the closest person is
       “close enough”)

         Example: Training Images

Note: Faces must be approximately
registered (translation, rotation, size, pose)

[ Turk & Pentland, 2001]
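
The training and recognition steps of the Eigenfaces Algorithm can be put together in one small sketch. Several choices here are assumptions rather than anything the slides specify: eigenvectors are found by power iteration with deflation (to stay within the standard library; the slides suggest SVD), the images are made-up 4-pixel vectors, and the names train, recognize and threshold are illustrative.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def top_eigenvectors(C, k, iters=200):
    """Top-k eigenvectors of a symmetric matrix by power iteration with deflation."""
    n = len(C)
    C = [row[:] for row in C]  # work on a copy
    vectors = []
    for j in range(k):
        u = normalize([1.0 + 0.1 * (i + j) for i in range(n)])  # arbitrary start
        for _ in range(iters):
            v = [dot(row, u) for row in C]
            if dot(v, v) < 1e-24:  # nothing left to extract
                break
            u = normalize(v)
        lam = dot(u, [dot(row, u) for row in C])  # Rayleigh quotient
        vectors.append(u)
        # Deflate: remove the component just found.
        C = [[C[r][c] - lam * u[r] * u[c] for c in range(n)] for r in range(n)]
    return vectors

def train(images, labels, k):
    """Return (average face A, eigenfaces U, list of (label, weights))."""
    M, n = len(images), len(images[0])
    A = [sum(img[p] for img in images) / M for p in range(n)]
    Y = [[x - a for x, a in zip(img, A)] for img in images]
    C = [[sum(y[r] * y[c] for y in Y) / M for c in range(n)] for r in range(n)]
    U = top_eigenvectors(C, k)
    gallery = [(lab, [dot(u, y) for u in U]) for lab, y in zip(labels, Y)]
    return A, U, gallery

def recognize(image, model, threshold):
    """Label of the closest training image in face space, or None if too far."""
    A, U, gallery = model
    y = [x - a for x, a in zip(image, A)]
    w = [dot(u, y) for u in U]
    label, d = min(((lab, math.dist(w, wt)) for lab, wt in gallery),
                   key=lambda t: t[1])
    return label if d <= threshold else None

# Hypothetical training set: two people, two 4-pixel images each.
images = [[200, 190, 20, 30], [210, 200, 25, 20],
          [30, 20, 200, 210], [20, 25, 190, 200]]
labels = ["alice", "alice", "bob", "bob"]
model = train(images, labels, k=2)

print(recognize([205, 195, 22, 28], model, threshold=50))
print(recognize([100, 100, 100, 100], model, threshold=50))
```

The threshold plays the role of “close enough”: a query that lands far from every training image in face space is rejected rather than assigned to the nearest person.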


                                  m=5 eigenface images
      Average Image, A


                  Example                                        Experimental Results
                         Top eigenvectors: u1,…uk
                                                         • Training set: 7,562 images of approximately
                                                           3,000 people
    Average: A                                           • k=20 eigenfaces computed from a sample of
                                                           128 images
                                                         • Test set accuracy on 200 faces was 95%


        Difficulties with PCA

• Projection may suppress important detail
  – smallest variance directions may not be unimportant
• Method does not take discriminative task
  into account
  – typically, we wish to compute features that
    allow good discrimination
  – not the same as largest variance
• The direction of maximum variance is
  not always good for classification


                          Limitations

• PCA assumes that the data has a Gaussian
  distribution (mean µ, covariance matrix C)

[Figure: The shape of this dataset is not well described by its principal components]

                          Limitations

   − Background (de-emphasize the outside of the face, e.g.,
     by multiplying the input image by a 2D Gaussian window
     centered on the face)
   − Lighting conditions (performance degrades with light
     changes)
   − Scale (performance decreases quickly with changes to
     head size); possible solutions:
      − multi-scale eigenspaces
      − scale input image to multiple sizes
   − Orientation (performance decreases but not as fast as
     with scale changes)
      − plane rotations can be handled
      − out-of-plane rotations are more difficult to handle
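
The background de-emphasis mentioned under Limitations can be sketched as an element-wise multiplication by a 2D Gaussian window. This is only an illustration, assuming the face is centered in the image; the 5 x 5 uniform image and the value of sigma are made-up parameters.

```python
import math

def gaussian_window(height, width, sigma):
    """2D Gaussian weights centered on the image, with value 1.0 at the center."""
    cy, cx = (height - 1) / 2, (width - 1) / 2
    return [[math.exp(-((r - cy) ** 2 + (c - cx) ** 2) / (2 * sigma ** 2))
             for c in range(width)]
            for r in range(height)]

def apply_window(image, window):
    """Multiply each pixel by the corresponding window weight."""
    return [[p * w for p, w in zip(img_row, win_row)]
            for img_row, win_row in zip(image, window)]

# Hypothetical uniform 5 x 5 grayscale image.
image = [[100.0] * 5 for _ in range(5)]
windowed = apply_window(image, gaussian_window(5, 5, sigma=1.5))
print(windowed[2][2], windowed[0][0])
```

Pixels near the center keep their values while pixels toward the border are attenuated, so background clutter contributes less to the projection.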

                          Limitations
 • Not robust to misalignment

                          Extension: Eigenfeatures
 • Describe and encode a set of facial features:
 • Use for detecting facial features