Face Detection: A Survey

                  Digital Image Processing
              Lecture 8: Face Detection




Prepared by: Eng. Mohamed Hassan
Supervised by: Dr. Ashraf Aboshosha

 http://www.icgst.com/A_Aboshosha.html

                 editor@icgst.com
               Tel.: 0020-122-1804952
               Fax.: 0020-2-24115475
  Ashraf Aboshosha, www.icgst.com, editor@icgst.com
   The face detection techniques

 Feature-Based Approach
 o Skin color and face geometry
 o Detection is accomplished using
    distances, angles, and areas of visual features

 Image-Based Approach
 o Treats detection as a general recognition problem




   The face detection techniques

 Feature-Based Approach
 o Low-Level Analysis
    Segmentation of visual features
 o Feature Analysis
    Organize the features into
        1. Global concept
        2. Facial features
 o Active Shape Models
    Extract complex & non-rigid features
    Ex: eye pupil, lip tracking


                  Low-Level Analysis:
          Segmentation of visual features

 Edges (the most primitive feature)
  o Trace the human head outline
  o Provide information about the
     shape & position of the face
  o Edge operators
     Sobel
     Marr-Hildreth
     first and second derivatives of Gaussians


             Low-Level Analysis:
     Segmentation of visual features

o Steerable filtering
  1. Detect edges
  2. Determine the orientation
  3. Track the neighboring edges
o Edge-detection system
  1. Label the edges
  2. Match them to a face model
  3. Golden ratio constraint:

       height / width = (1 + √5) / 2


                 Low-Level Analysis:
          Segmentation of visual features

 Gray-level information
 o Facial features (eyebrows, pupils, …) appear darker
   than the surrounding regions
 o Applications
    Search for an eye pair
    Find the brightest pixels (nose tip)
    Mosaic (pyramid) images




      Segmentation of visual features:
          Color Based Segmentation

 Color information
 o Different races?
    Skin colors of different races form a tight cluster
     in (normalized) color space.
 o Color models
    Normalized RGB colors:

       r = R / (R + G + B)
       g = G / (R + G + B)
       b = B / (R + G + B)

    A color histogram for faces is built
    Each pixel is classified by comparing its r and g
     values against the histogram.
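A sketch of this normalized-color classification in Python (the (r, g) box bounds below are illustrative assumptions, not values from these slides):

```python
# Skin segmentation via normalized RGB: convert each pixel to
# chromaticity (r, g), then accept pixels whose (r, g) falls in
# a "skin" box. The box bounds are assumed for illustration.
def normalized_rg(R, G, B):
    s = R + G + B
    if s == 0:
        return 0.0, 0.0
    return R / s, G / s

def is_skin(R, G, B, r_range=(0.36, 0.47), g_range=(0.28, 0.36)):
    r, g = normalized_rg(R, G, B)
    return r_range[0] <= r <= r_range[1] and g_range[0] <= g <= g_range[1]

print(is_skin(200, 140, 110))  # a warm, skin-like pixel
print(is_skin(40, 80, 200))    # a blue pixel
```

Note that scaling (R, G, B) by a constant leaves (r, g) unchanged, which is exactly the brightness invariance the next slide asks about.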


   Why normalized? Invariance to brightness changes.
             Low-Level Analysis:
      Segmentation of visual features

o HSI color model
   Large variance among facial feature clusters [106]
   Used to extract lips, eyes, and eyebrows
   Also used in face segmentation
o YIQ
   Colors ranging from orange to cyan enhance the
    skin region of Asians [29]
o Other color models
   HSV, YES, CIE-xyz, …
o Comparative study of color spaces [Terrillon 188]


               Low-Level Analysis:
          Segmentation of visual features

 Color segmentation by color thresholds
 o Skin color is modeled through
    Histograms or charts (simple)
    Statistical measures (complex)
    Ex: the skin color cluster can be represented as a
     Gaussian distribution [215]
 o Advantages of a statistical color model
    The model can be updated
    More robust against changes in the environment
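A sketch of such a statistical (Gaussian) skin model in Python, using one chromaticity channel and made-up training values for illustration:

```python
import math

# Fit a 1-D Gaussian to training skin chromaticity values
# (data invented for illustration), then classify new values
# by their distance from the mean in standard deviations.
train_r = [0.41, 0.43, 0.44, 0.42, 0.45, 0.43, 0.44]   # skin r values
mu = sum(train_r) / len(train_r)
var = sum((v - mu) ** 2 for v in train_r) / len(train_r)
sigma = math.sqrt(var)

def is_skin(r, k=2.5):
    """Accept r if it lies within k standard deviations of the mean."""
    return abs(r - mu) <= k * sigma

print(is_skin(0.44))   # near the cluster
print(is_skin(0.20))   # far from the cluster
```

The "updatable" advantage is visible here: mu and sigma can be recomputed (or updated incrementally) as new skin samples arrive under new lighting.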

                Low-Level Analysis:
         Segmentation of visual features


 The disadvantage:
 o Not robust under varying lighting conditions




      Color based segmentation:
    Skin model construction (Example)




The original image was taken from http://nn.csie.nctu.edu.tw/face-detection/ppframe.htm




       Color based segmentation:
     Skin model construction (Example)




The original image was taken from http://nn.csie.nctu.edu.tw/face-detection/ppframe.htm


                     Low-Level Analysis:
             Segmentation of visual features

    Motion information
    o A face is almost always moving
    o Disadvantage:
      • What if other objects are moving in the
        background?

    Four steps for detection
    1. Frame differencing
    2. Thresholding
    3. Noise removal
    4. Locate the face
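The four steps above can be sketched in Python on toy 1-D "frames" (lists of gray values); real frames are 2-D, and the noise-removal rule here is a deliberately simple assumption:

```python
def detect_motion(prev, curr, thresh=20):
    diff = [abs(a - b) for a, b in zip(prev, curr)]   # 1. frame differencing
    mask = [1 if d > thresh else 0 for d in diff]     # 2. thresholding
    clean = []                                        # 3. noise removal: keep a
    for i, m in enumerate(mask):                      #    pixel only if a true
        left = mask[i - 1] if i > 0 else 0            #    neighbor also moved
        right = mask[i + 1] if i < len(mask) - 1 else 0
        clean.append(m if (left or right) else 0)
    moving = [i for i, m in enumerate(clean) if m]    # 4. locate the region
    return (moving[0], moving[-1]) if moving else None

prev = [10, 10, 10, 10, 10, 10, 10, 10]
curr = [10, 10, 90, 95, 92, 10, 10, 50]   # pixels 2-4 moved; pixel 7 is noise
print(detect_motion(prev, curr))          # → (2, 4)
```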
       Motion-Based segmentation:



 Motion estimation
 o People are almost always moving
 o Used for focusing attention
    discard the cluttered, static background
 o A spatio-temporal Gaussian filter can be
   used to detect the moving boundaries of faces




 Motion-Based segmentation:




   G(x, y, t) = u · (a/π)^(3/2) · e^(−a(x² + y² + u²t²))

   m(x, y, t) = [∇² + (1/u²) ∂²/∂t²] G(x, y, t)




  The face detection techniques

 Image-Based Approach
 o Linear Subspace Methods
 o Neural Networks
 o Statistical Approaches




    The face detection techniques

 Face interface
 o Face detection
 o Face recognition

 Pipeline: input image → Face detection → Face recognition
 (against a face database) → Output: e.g. "Mr. Chan", "Prof. Chen"




                      Face detection

 To detect faces in an image (not recognize them yet)
 Challenges
  o A picture may have 0, 1, or many faces
  o Faces are not the same: with spectacles, mustache, etc.
  o Sizes of faces vary
 Available in most digital cameras nowadays
 The simple method
  o Slide a window across the image and detect faces
     Too slow: pictures have too many pixels
      (1280x1024 = 1.3M pixels)


           Evaluation of face detection

 Detection rate
  o Gives positive results in locations where faces exist
  o Should be high, > 95%
 False positive rate
  o The detector output is positive but it is false (there is
    actually no face)
     Definition of false positive: a result that is erroneously
      positive when the situation is normal; e.g., a test for
      toenail cancer is positive although the person does not
      have toenail cancer.
      (http://www.medterms.com/script/main/art.asp?articlekey=3377)
  o Should be low, < 10⁻⁶
 A good system has
  o High detection rate
  o Low false positive rate



                                Exercise

 What are the detection rate and false
 detection rate here?
   9 faces are in the picture; 8 are correctly detected;
   1 window reported to contain a face is in fact not
   a face (a false positive result).
 o Answer
    detection rate = (8/9)*100%
    false detection rate = (1/9)*100%

   The Viola and Jones method

 The most famous method
 Training may take weeks
 Detection is very fast, e.g. real-time in
  digital cameras
 Techniques
  o Integral image for feature extraction
  o Ada-Boost for feature selection
  o Attentional cascade for fast rejection of non-face
    sub-windows


             Image Features ref[3]

“Rectangle filters”




Rectangle_Feature_value f=
∑ (pixels in white area) –
∑ (pixels in shaded area)

                             Exercise

 Find the Rectangle_Feature_value (f) of
  the box enclosed by the dotted line

        1   2   3   3
        3   0   1   3
        5   8   7   1
        0   2   3   6

 Rectangle_Feature_value f =
  ∑ (pixels in white area) − ∑ (pixels in shaded area)
 f = (8+7) − (0+1)
   = 15 − 1 = 14



     Example: A simple face detection
        method using one feature
Rectangle_Feature_value f
f = ∑ (pixels in white area) − ∑ (pixels in shaded area)

If (f) is large, it is a face, i.e.
if (f) > threshold, then
  face
else
  non-face

Result:
 This is a face: the eye part is dark and the nose part
 is bright, so f is large; hence it is a face.
 This is not a face, because f is small.
          How to find features faster
   Integral images fast calculation method
                [Lazebnik09 ]


 The integral image at (x,y) = sum
  of all pixel values above
  and to the left of (x,y)
 Can be computed very quickly




Computing the integral image
      [Lazebnik09 ]




          Computing the integral image
                [Lazebnik09 ]






   Cumulative row sum: s(x, y) = s(x–1, y) + i(x, y)
   Integral image: ii(x, y) = ii(x, y−1) + s(x, y)
   MATLAB: ii = cumsum(cumsum(double(i)), 2);
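The same two recurrences in Python (a sketch; the MATLAB one-liner above does the equivalent with two cumulative sums):

```python
# ii[y][x] holds the sum of all pixels above and to the
# left of (x, y), inclusive, built in a single pass.
def integral_image(img):
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for y in range(h):
        s = 0                       # cumulative row sum s(x, y)
        for x in range(w):
            s += img[y][x]          # s(x,y) = s(x-1,y) + i(x,y)
            ii[y][x] = s + (ii[y - 1][x] if y > 0 else 0)
    return ii                       # ii(x,y) = ii(x,y-1) + s(x,y)

print(integral_image([[1, 2], [3, 4]]))  # → [[1, 3], [4, 10]]
```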


        Calculate sum within a rectangle


 A, B, C, D are the values of the
  integral image at the corners of
  the rectangle R
 The sum of image values inside R is:
    Area_R = A − B − C + D
 If A, B, C, D are known, only 3
  additions are needed to find Area_R
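A sketch of the corner lookup in Python, assuming an inclusive integral image as built on the previous slide (A is the bottom-right corner, B above the rectangle, C to its left, D the diagonal corner):

```python
# Sum inside the rectangle with top-left (x0, y0) and
# bottom-right (x1, y1), using Area_R = A - B - C + D.
def rect_sum(ii, x0, y0, x1, y1):
    A = ii[y1][x1]
    B = ii[y0 - 1][x1] if y0 > 0 else 0                     # above
    C = ii[y1][x0 - 1] if x0 > 0 else 0                     # left
    D = ii[y0 - 1][x0 - 1] if (x0 > 0 and y0 > 0) else 0    # diagonal
    return A - B - C + D

# Usage with the 2x2 image [[1, 2], [3, 4]], whose integral
# image is [[1, 3], [4, 10]]:
ii = [[1, 3], [4, 10]]
print(rect_sum(ii, 0, 0, 1, 1))   # whole image → 10
print(rect_sum(ii, 1, 1, 1, 1))   # bottom-right pixel → 4
```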




Why do we need to find pixel sum of rectangles?
     Answer: We want to get face features

 You may consider these
  features as face features
  o Two eyes = (Area_A − Area_B)
  o Nose = (Area_C + Area_E − Area_D)
  o Mouth = (Area_F + Area_H − Area_G)
 They can have different sizes,
  polarities, and aspect ratios




         Face feature and example
Pixel values inside the areas:

   Shaded area (weight −1):    10    20     4
                                7    45     7

   White area (weight +2):    216   102    78
                              129   210   111

F = Feat_val =
    pixel sum in white area − pixel sum in shaded area

Example
• Pixel sum in white area = 216+102+78+129+210+111 = 846
• Pixel sum in shaded area = 10+20+4+7+45+7 = 93
Feat_val = F = 846 − 93 = 753
If F > threshold,
    feature = +1
Else
    feature = −1
End if;
Choosing threshold = 700, F > threshold, so the feature is +1.
A face
                      Exercise 1

Definition: Area at X = pixel sum of the area from the
top-left corner to X = Area_X

 Find the feature output of this image.

        1   2   3   3
        3   4   6   3
        5   2   4   1
        0   2   3   6

 Area_D = 1
 Area_B = 1+2+3 = 6
 Area_C = 1+3 = 4
 Area_A = 1+2+3+3+4+6 = 19
 Area_E = ? 1+3+5 = 9
 Area_F = ? 1+2+3+3+4+6+5+2+4 = 30
 Pixel sum of the area inside the box ABDC =
  Area_A − Area_B − Area_C + Area_D = ? 19−6−4+1 = 10
 Pixel sum of the area inside the box EFAC =
  Area_F − Area_A − Area_E + Area_C = ? 30−19−9+4 = 6
 Feature of EFBD =
  (white area − shaded area) = ?
              5 basic types of features
              for white_area − gray_area

 Type) Rows x Columns                  Each basic type can
 Type 1) 1x2                            have different sizes
 Type 2) 2x1                            and aspect ratios.
 Type 3) 1x3
 Type 4) 3x1
 Type 5) 2x2

     Feature selection [Lazebnik09 ]
 Some examples and their types (answers for the
 first row: 2, 3, 5, 1, 4)

 For a 24x24 detection region, the number of
  possible rectangle features is ~160,000!

 Exercise
 Name the types (type 1, 2, 3, 4, 5) of the
  features in the 2nd and 3rd rows of the figure.
                      Exercise 2

 Still keeping the 5 basic feature types (1, 2, 3, 4, 5)
  o Find the number of features for a resolution of
    36x36 windows
  o Answer: 704004; explain your answer.
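The 24x24 count can be checked mechanically. A sketch in Python, assuming one common counting convention (each base shape scaled independently in width and height, at every position); this convention reproduces the 162,336 figure quoted later in these slides, while the exercise's stated 36x36 answer may assume a different convention:

```python
# Count rectangle features in an n x n window for the 5 base
# shapes (1x2, 2x1, 1x3, 3x1, 2x2), scaling width and height
# independently by integer factors and sliding over all positions.
def count_features(n):
    total = 0
    for (w, h) in [(2, 1), (1, 2), (3, 1), (1, 3), (2, 2)]:
        for W in range(w, n + 1, w):          # scaled widths
            for H in range(h, n + 1, h):      # scaled heights
                total += (n - W + 1) * (n - H + 1)   # positions
    return total

print(count_features(24))  # → 162336
```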




          The detection challenge

 Use a 24x24 base sub-window, scanned over a
  1280x1024 image (x-axis: 1..1280, y-axis: 1..1024)

 For y = 1; y <= 1024; y++
   For x = 1; x <= 1280; x++ {
     Set (x,y) = the top-left corner
      of the 24x24 sub-window
     For the 24x24 sub-window,
      extract 162,336 features and
      see if they combine to form
      a face or not.
   }

 Conclusion: too slow
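To see the scale of the problem, count the sub-windows with a quick Python check (162,336 is the per-window feature count quoted above):

```python
# Number of 24x24 sub-windows in a naive single-scale scan
# of a 1280x1024 image, and the implied feature evaluations.
W, H, S = 1280, 1024, 24
windows = (W - S + 1) * (H - S + 1)
print(windows)              # 1,258,257 sub-windows at one scale
print(windows * 162_336)    # over 2 * 10**11 raw feature evaluations
```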
        Solution to make it efficient

 The whole 162,336-feature set is too large
 o Solution: select good features to make detection
   more efficient
 o Tool: "Boosting"
 Boosting
 o Combines many small weak classifiers into a
   strong classifier
 o Training is needed




     Boosting for face detection

 Define weak learners based on rectangle features:

    h_t(x) = 1  if p_t f_t(x) < p_t θ_t
             0  otherwise

 where x is a 24x24 window, f_t(x) is the value of the
 rectangle feature, θ_t is the threshold, and
 p_t ∈ {+1, −1} is the polarity.
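This weak learner is a decision stump; a minimal Python sketch (names assumed):

```python
# Decision stump on a single rectangle-feature value, with
# polarity p in {+1, -1} and threshold theta.
def weak_classifier(f_value, p, theta):
    """Return 1 ("face-like") if p * f_value < p * theta, else 0."""
    return 1 if p * f_value < p * theta else 0

# With p = +1 the stump fires when the feature is below the
# threshold; with p = -1, when it is above.
print(weak_classifier(5.0, +1, 10.0))   # → 1
print(weak_classifier(5.0, -1, 10.0))   # → 0
```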



   Face detection using Adaboost
 AdaBoost training
 o E.g. collect 5000 faces and 9400 non-faces, at
   different scales.
 o Use AdaBoost to train a strong classifier.
 o From features of different scales and positions,
   pick the best few. (This can take months; details
   are in the [Viola 2004] paper.)
 Testing
 o Scan through the image, pick a window, and
   rescale it to 24x24.
 o Pass it to the strong classifier for detection.
 o Report a face if the output is positive.
              To improve false positive rate:
                   Attentional cascade
 Cascade of many AdaBoost strong classifiers
 Begin with simple classifiers to reject many
  negative sub-windows
 Many non-faces are rejected at the first few
  stages
 Hence the system is efficient enough for
  real-time processing

Input image → AdaBoost Classifier 1 →(True)→ AdaBoost
Classifier 2 →(True)→ AdaBoost Classifier 3 →(True)→ Face found
(False at any stage → Non-face; the window is rejected)
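The cascade logic can be sketched in a few lines of Python (the stage functions here are toy assumptions, not trained classifiers):

```python
# Each stage returns True (maybe face) or False (reject).
# A window survives only if every stage accepts it, so most
# non-faces exit after the first stage or two.
def cascade_detect(window, stages):
    for stage in stages:
        if not stage(window):
            return False          # rejected early: non-face
    return True                   # passed all stages: face found

# Toy stages: accept if a (hypothetical) score exceeds each
# stage's increasingly strict threshold.
stages = [lambda w, t=t: w["score"] > t for t in (0.1, 0.5, 0.9)]
print(cascade_detect({"score": 0.95}, stages))  # → True
print(cascade_detect({"score": 0.3}, stages))   # → False
```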

                    An example
 More features are used in later stages of the
  cascade [viola2004]
 (The first-stage features shown are of type 2 and type 3.)

 Stage 1: 2 features → Stage 2: 10 features →
 Stage 3: 25 features → 50 features …
 (Each True passes the window on; a False rejects
  it as non-face.)

                Attentional cascade

 Chain classifiers that are progressively more
  complex and have lower false positive rates.

 [Figure: receiver operating characteristic (ROC)
  curve for one stage: % detection (0–100) vs
  % false positives (0–50); the trade-off between
  false positives and false negatives is determined
  by the stage threshold.]

         Attentional cascade [viola2004]

 Detection rate for each stage is 0.99; for 10 stages,
 o overall detection rate is 0.99^10 ≈ 0.9

 False positive rate at each stage is 0.3; for 10 stages,
 o overall false positive rate is 0.3^10 ≈ 6×10⁻⁶
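A quick check of the compounding arithmetic above:

```python
# Per-stage detection rate d and false positive rate f,
# compounded over 10 independent cascade stages.
d, f, stages = 0.99, 0.3, 10
print(round(d ** stages, 3))   # 0.904 -> overall detection ~0.9
print(f ** stages)             # ~5.9e-06 -> overall FP rate ~6e-6
```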

        Detection process in practice
                [smyth2007]
 Use a 24x24 sub-window
 Scaling
  o Scale the detector (not the input image)
  o Features are evaluated at scales spaced by
    factors of 1.25
  o Location: move the detector around the image
    (1-pixel increments)
 Final detections
  o A real face may result in multiple nearby
    detections (merge them to form the final
    result)


                         Skin detection







 Skin   pixels have a distinctive range of colors
  o Corresponds to region(s) in RGB color space
 Skin   classifier
  o A pixel X = (R,G,B) is skin if it is in the skin (color) region
  o How to find this region?

                         Skin detection




 Learn the skin region from examples
  o Manually label skin/non-skin pixels in one or more "training images"
  o Plot the training data in RGB space
     skin pixels shown in orange, non-skin pixels shown in gray
     some skin pixels may fall outside the region, and some
      non-skin pixels inside

                          Skin classifier




 Given   X = (R,G,B): how to determine if it is skin or not?
  o Nearest neighbor
     find labeled pixel closest to X
  o Find plane/curve that separates the two classes
     popular approach: Support Vector Machines (SVM)
  o Data modeling
     fit a probability density/distribution model to each class
                         Probability

o X is a random variable
o P(X) is the probability that X achieves a certain
  value

  P(X) is called a PDF
   - probability distribution/density function
   - a 2D PDF is a surface
   - a 3D PDF is a volume

  [Figure: example PDFs for continuous X and discrete X]

    Probabilistic skin classification




 Model PDF / uncertainty
  o Each pixel has a probability of being skin or not skin

 Skin classifier
  o Given X = (R,G,B): how to determine if it is skin or not?
  o Choose the interpretation of highest probability

 Where do we get P(R | skin) and P(R | ~skin)?
          Learning conditional PDF’s




 We can calculate P(R | skin) from a set of training
  images
 • It is simply a histogram over the pixels in the training images
   o each bin Ri contains the proportion of skin pixels with color Ri
 • This doesn't work as well in higher-dimensional spaces. Why not?

   Approach: fit parametric PDF functions
    • a common choice is a rotated Gaussian
       – center
       – covariance
     Learning conditional PDF’s




 We  can calculate P(R | skin) from a set of training
  images
 But this isn’t quite what we want
  o Why not? How to determine if a pixel is skin?
  o We want P(skin | R) not P(R | skin)
  o How can we get it?


                         Bayes rule

   P(skin | R) = P(R | skin) · P(skin) / P(R)

 In terms of our problem:
  o P(skin | R): what we want (posterior)
  o P(R | skin): what we measure (likelihood)
  o P(skin): domain knowledge (prior)
  o P(R): normalization term

 What can we use for the prior P(skin)?
  • Domain knowledge:
     – P(skin) may be larger if we know the image contains a person
     – For a portrait, P(skin) may be higher for pixels in the center
  • Learn the prior from the training set. How?
     – P(skin) is the proportion of skin pixels in the training set
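A sketch of this histogram-plus-prior classifier in Python (all training data made up; one color channel instead of full RGB for brevity):

```python
# Build P(R|skin) and P(R|~skin) as histograms over a single
# color value, set the prior P(skin) to the fraction of skin
# pixels in the training set, then classify by the larger
# unnormalized posterior (P(R) cancels out).
def train(skin_vals, nonskin_vals, bins=8, vmax=256):
    def hist(vals):
        h = [0] * bins
        for v in vals:
            h[v * bins // vmax] += 1
        return [c / len(vals) for c in h]
    prior = len(skin_vals) / (len(skin_vals) + len(nonskin_vals))
    return hist(skin_vals), hist(nonskin_vals), prior, bins, vmax

def is_skin(v, model):
    p_skin, p_not, prior, bins, vmax = model
    b = v * bins // vmax
    # compare unnormalized posteriors: likelihood x prior
    return p_skin[b] * prior > p_not[b] * (1 - prior)

# toy training data: skin values cluster high, non-skin low
model = train([200, 210, 220, 230], [10, 20, 30, 40, 50, 60])
print(is_skin(215, model))  # → True
print(is_skin(25, model))   # → False
```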

              Bayesian estimation




  likelihood: P(R | skin)
  posterior (unnormalized): P(R | skin) · P(skin)

 Bayesian estimation
  o Goal is to choose the label (skin or ~skin) that maximizes the
    posterior ↔ minimizes the probability of misclassification
     this is called Maximum A Posteriori (MAP) estimation



     Skin detection results




               General classification
 This same procedure applies in more general
  circumstances
  o More than two classes
  o More than one dimension




Example: face detection
   • Here, X is an image region
         – dimension = # pixels
         – each face can be thought of as a
           point in a high dimensional space

                                              H. Schneiderman, T. Kanade. "A Statistical Method for 3D
                                              Object Detection Applied to Faces and Cars". CVPR 2000
  Recognition problems
 What is it?
  o Object and scene recognition
 Who is it?
  o Identity recognition
 Where is it?
  o Object detection
 What are they doing?
  o Activities
 All of these are classification problems
  o Choose one class from a list of possible candidates

             What is recognition?

A  different taxonomy from [Csurka et al.
  2006]:
• Recognition
    o Where is this particular object?
•   Categorization
    o What kind of object(s) is(are) present?
•   Content-based image retrieval
    o Find me something that looks similar
•   Detection
    o Locate all instances of a given class

Eigenfaces for recognition




                     Linear subspaces

                                         convert x into v1, v2 coordinates


                                         What does the v2 coordinate measure?
                                              - distance to line
                                              - use it for classification—near 0 for orange pts
                                         What does the v1 coordinate measure?
                                              - position along line
                                              - use it to specify which orange point it is




   Classification can be expensive:
     o Big search problem (e.g., nearest neighbors) or storing large PDFs
   Suppose the data points are arranged as above
     o Idea: fit a line, and let the classifier measure distance to the line

          Dimensionality reduction




   • We can represent the orange points with only their v1 coordinates
     (since v2 coordinates are all essentially 0)
   • This makes it much cheaper to store and compare points
   • A bigger deal for higher dimensional problems

                               Linear subspaces

                                                   Consider the variation along direction v
                                                   among all of the orange points:




                                                   What unit vector v minimizes var?


                                                   What unit vector v maximizes var?




Solution: v1 is eigenvector of A with largest eigenvalue
          v2 is eigenvector of A with smallest eigenvalue
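
The variance expressions dropped from this slide are the standard PCA ones; a reconstruction (my notation, with x̄ the mean of the points):

```latex
\operatorname{var}(v)=\sum_i \left((x_i-\bar{x})^{\top}v\right)^{2}=v^{\top}Av,
\qquad A=\sum_i (x_i-\bar{x})(x_i-\bar{x})^{\top},\qquad \lVert v\rVert = 1
```

Maximizing or minimizing v^T A v over unit vectors v gives the eigenvectors of A with the largest and smallest eigenvalues, which is the solution stated on the slide.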



   Principal component analysis


 Suppose each data point is N-dimensional
  o Same procedure applies:



 o The eigenvectors of A define a new coordinate system
    eigenvector with largest eigenvalue captures the most variation
     among training vectors x
    eigenvector with smallest eigenvalue has least variation
 o We can compress the data using the top few eigenvectors
    corresponds to choosing a “linear subspace”
      represent points on a line, plane, or “hyper-plane”
    these eigenvectors are known as the principal components
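
The procedure above can be sketched in a few lines of numpy (a minimal sketch; the function name `pca` and the toy data are my own, and the scatter matrix A matches the definition on the previous slide):

```python
import numpy as np

def pca(X, k):
    """PCA via eigendecomposition of the scatter matrix A.

    X: (n_samples, n_features) data matrix.
    Returns the mean and the top-k principal components (as rows).
    """
    mean = X.mean(axis=0)
    Xc = X - mean                       # center the data
    A = Xc.T @ Xc                       # scatter matrix: sum of outer products
    vals, vecs = np.linalg.eigh(A)      # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1]      # largest eigenvalue first
    return mean, vecs[:, order[:k]].T   # each row is a principal component

# toy data: 2D points scattered tightly along the direction (1, 1)
rng = np.random.default_rng(0)
t = rng.normal(size=100)
X = np.stack([t, t + 0.01 * rng.normal(size=100)], axis=1)
mean, comps = pca(X, 1)
# comps[0] should be close to (1, 1)/sqrt(2), up to sign
```

Keeping only the top few rows of `comps` is exactly the "linear subspace" compression described above.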


                The space of faces




                                     =               +



 An image is a point in a high-dimensional space
  o An N x M image is a point in R^(NM)
 o We can define vectors in this space as we did in the 2D case
            Dimensionality reduction




 The set of faces is a “subspace” of the set of all images
  o We can find the best subspace using PCA
  o This is like fitting a “hyper-plane” to the set of faces
      spanned by vectors v1, v2, ..., vK
      any face can then be written as a combination of these vectors
                          Eigenfaces

 PCA   extracts the eigenvectors of A
 o Gives a set of vectors v1, v2, v3, ...
 o Each vector is a direction in face space
    what do these look like?




   Projecting onto the eigenfaces
 The   eigenfaces v1, ..., vK span the space of faces
 o A face is converted to eigenface coordinates by




       Recognition with eigenfaces

   Algorithm
    1. Process the image database (set of images with labels)
        •   Run PCA—compute eigenfaces
        •   Calculate the K coefficients for each image
    2. Given a new image (to be recognized) x, calculate K
       coefficients

    3. Detect if x is a face


    4. If it is a face, who is it?
        • Find closest labeled face in database
        • nearest-neighbor in K-dimensional space
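
The recognition steps above can be sketched as follows (a toy sketch; the function names, the reconstruction-error test for step 3, and the threshold are my own choices, not the slide's exact formulas):

```python
import numpy as np

def project(x, mean, eigenfaces):
    """Step 2: eigenface coefficients w_i = v_i . (x - mean)."""
    return eigenfaces @ (x - mean)      # eigenfaces: (K, N), one v_i per row

def recognize(x, mean, eigenfaces, db_coeffs, db_labels, face_thresh):
    """Steps 2-4: project, test face-ness, nearest neighbor."""
    w = project(x, mean, eigenfaces)
    # step 3: distance to the face subspace via reconstruction error
    recon = mean + eigenfaces.T @ w
    if np.linalg.norm(x - recon) > face_thresh:
        return None                     # far from the subspace: not a face
    # step 4: nearest labeled face in K-dimensional coefficient space
    d = np.linalg.norm(db_coeffs - w, axis=1)
    return db_labels[int(np.argmin(d))]
```

Here `db_coeffs` holds the K coefficients precomputed for each database image in step 1.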


        Choosing the dimension K


  [Figure: plot of the eigenvalues against index i, decaying; i = K and i = NM marked on the axis]


 How many eigenfaces to use?
 Look at the decay of the eigenvalues
  o the eigenvalue tells you the amount of variance “in the
    direction” of that eigenface
  o ignore eigenfaces with low variance
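
One common way to pick K from the decaying spectrum is to keep enough eigenvalues to capture a fixed fraction of the total variance (a sketch; the 95% default and the function name are my own):

```python
import numpy as np

def choose_k(eigenvalues, keep=0.95):
    """Smallest K whose top eigenvalues capture `keep` of the total variance."""
    vals = np.sort(eigenvalues)[::-1]           # decaying spectrum
    frac = np.cumsum(vals) / np.sum(vals)       # cumulative variance fraction
    return int(np.searchsorted(frac, keep) + 1) # first index reaching `keep`
```

Eigenfaces beyond this K carry little variance and are ignored, as the slide suggests.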

    View-Based and Modular
Eigenspaces for Face Recognition




          Part-based eigenfeatures

 Learn a separate eigenspace for each face feature
 Boosts performance of regular eigenfaces




Bayesian Face Recognition




Morphable Face Models




             Morphable Face Model

 Use subspace to model elastic 2D or 3D
 shape variation (vertex positions), in
 addition to appearance variation


                                            Shape S



                                          Appearance T



        Morphable Face Model

    S_model = Σ_{i=1}^{m} a_i S_i          T_model = Σ_{i=1}^{m} b_i T_i




 3D   models from Blanz and Vetter ‘99
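
The two linear combinations above can be sketched directly (a minimal sketch; `morph` and the toy vectors are my own, standing in for real shape/texture vectors of vertex positions and colors):

```python
import numpy as np

def morph(shapes, textures, a, b):
    """Linear morphable model: S = sum_i a_i S_i, T = sum_i b_i T_i."""
    S = sum(ai * Si for ai, Si in zip(a, shapes))     # blended shape vector
    T = sum(bi * Ti for bi, Ti in zip(b, textures))   # blended appearance vector
    return S, T
```

Fitting a new face then amounts to searching for the coefficients a_i, b_i that best reproduce it.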


          Face Recognition Resources

 Face     Recognition Home Page:
•   http://www.cs.rug.nl/~peterkr/FACE/face.html
 PAMI     Special Issue on Face & Gesture (July ‘97)
 FERET
•   http://www.dodcounterdrug.com/facialrecognition/Feret/feret.htm
 Face-Recognition            Vendor Test (FRVT 2000)
•   http://www.dodcounterdrug.com/facialrecognition/FRVT2000/frvt2000.htm
 Biometrics       Consortium
•   http://www.biometrics.org



Robust real-time face detection




Scan classifier over locations & scales




       “Learn” classifier from data
 Training Data
• 5000 faces (frontal)
• 10^8 non-faces
• Faces are normalized
  o Scale, translation
 Many   variations
• Across individuals
• Illumination
• Pose (rotation both in plane and out)


       Characteristics of algorithm

• Feature set is huge (about 16M features)
• Efficient feature selection using AdaBoost
• New image representation: Integral Image
• Cascaded Classifier for rapid detection


 Fastest known face detector for grayscale images




                      Image features

•   “Rectangle filters”
    o Similar to Haar wavelets
•   Differences between
    sums of pixels in
    adjacent rectangles




                         Integral Image
 Partial    sum

 Any rectangle is
    D = 1+4-(2+3)

 Also known as:
• summed area tables
• boxlets [Simard98]
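
The integral image and the four-corner rectangle sum D = 1+4-(2+3) can be sketched as follows (my function names; the table is zero-padded so corner lookups need no bounds checks):

```python
import numpy as np

def integral_image(img):
    """Summed area table: ii[y, x] = sum of img[:y, :x] (zero-padded border)."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, y0, x0, y1, x1):
    """Sum of img[y0:y1, x0:x1] in four lookups: D = 1 + 4 - (2 + 3)."""
    return ii[y1, x1] + ii[y0, x0] - ii[y0, x1] - ii[y1, x0]
```

With this table, every rectangle filter costs a constant number of lookups regardless of its size, which is what makes the huge filter library affordable.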




       Huge library of filters




         Constructing the classifier

 Perceptron yields a sufficiently powerful
  classifier




 Use AdaBoost to efficiently choose the best
  features
 • add a new h_i(x) at each round
 • each h_i(x_k) is a “decision stump”
   [Figure: decision stump on filter value x with threshold q,
    outputting a = E_w(y [x < q]) below q and b = E_w(y [x > q]) above]
          Constructing the classifier

 For each round of boosting:
• Evaluate each rectangle filter on each example
• Sort examples by filter values
• Select best threshold for each filter (min error)
    o Use sorting to quickly scan for optimal threshold
• Select best filter/threshold combination
• Weight is a simple function of error rate
• Reweight examples
    o (There are many tricks to make this more efficient.)
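
The sort-then-scan threshold selection for one filter can be sketched as follows (a simplified sketch, not Viola-Jones code; `best_stump` and its argument names are my own, labels are ±1, and polarity -1 means "predict +1 below the threshold"):

```python
import numpy as np

def best_stump(values, labels, weights):
    """Best threshold for one filter, found in a single scan of sorted values.

    values:  this filter's responses on all examples
    labels:  +1 (face) / -1 (non-face)
    weights: AdaBoost example weights (sum to 1)
    Returns (threshold, polarity, weighted_error).
    """
    order = np.argsort(values)
    v, y, w = values[order], labels[order], weights[order]
    # start with the threshold below every value: everything predicted +1,
    # so the weighted error is the total weight of the negatives
    err = np.sum(w[y == -1])
    best = (v[0] - 1.0, +1, err)
    for i in range(len(v)):
        # moving the threshold just past v[i] flips example i to "predict -1"
        err += w[i] if y[i] == +1 else -w[i]
        thresh = v[i] + 1e-9 if i == len(v) - 1 else (v[i] + v[i + 1]) / 2
        for e, pol in ((err, +1), (1.0 - err, -1)):  # also try flipped polarity
            if e < best[2]:
                best = (thresh, pol, e)
    return best
```

One round of boosting runs this over every filter, keeps the best filter/threshold pair, and then reweights the examples.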

     Good reference on boosting

 Friedman, J., Hastie, T. and Tibshirani, R.
 Additive Logistic Regression: a Statistical
 View of Boosting
   http://www-stat.stanford.edu/~hastie/Papers/boost.ps
 “We  show that boosting fits an additive logistic
 regression model by stagewise optimization of a
 criterion very similar to the log-likelihood, and
 present likelihood based alternatives. We also
 propose a multi-logit boosting procedure which
 appears to have advantages over other methods
 proposed so far.”


     Trading speed for accuracy

 Given a nested set of classifier hypothesis
  classes




 Computational          Risk Minimization
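
The speed/accuracy trade is realized as an attentional cascade; a sketch (hypothetical stage functions; the real stages are boosted sums of rectangle features, ordered cheapest first):

```python
def cascade_detect(window, stages):
    """Attentional cascade: every stage must accept, or the window is rejected.

    stages: list of (classify, threshold) pairs, where classify maps a
    window to a score. Early stages are cheap and reject most non-faces,
    so the expensive later stages run on very few windows.
    """
    for classify, threshold in stages:
        if classify(window) < threshold:
            return False     # rejected: later stages never run
    return True              # survived every stage: report a detection
```

Most scanned sub-windows are rejected by the first couple of stages, which is why the average cost per window is so low.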




     Speed of face detector (2001)

 Speed  is proportional to the average number of
  features computed per sub-window.
 On the MIT+CMU test set, an average of 9
  features (out of 6061) are computed per sub-window.
 On a 700 MHz Pentium III, a 384x288 pixel
  image takes about 0.067 seconds to process
  (15 fps).
 Roughly 15 times faster than Rowley-Baluja-
  Kanade and 600 times faster than
  Schneiderman-Kanade.
               Sample results




