									A Robust Face Recognition Method Using Distance Metric Project Report with Annotated Bibliography
Ashwini Shukla (Y4112) (ashukla@iitk.ac.in) CS - 676 Term Project IIT Kanpur November 17, 2007

Contents

1 Introduction
2 Image Transformation
  2.1 Compared against Edge-Maps
  2.2 Transformation Procedure
  2.3 Robustness against pose/illumination variation
3 Partial Hausdorff Distance
  3.1 Definition
  3.2 Computation Algorithm
  3.3 Time Complexity Analysis
4 Performance Tests
  4.1 Face Databases Used
5 Results & Comparisons
  5.1 Tabular Results
  5.2 Time Taken as a Function of Image Size
6 Discussion




Introduction

In this work, we have implemented a new Partial Hausdorff Distance-based measure to compare the appearance of faces. The measure is applied to face recognition based on gray images of faces instead of edge maps, and is robust to variations in a face due to expression, illumination and slight pose differences. The partialness in the measure tolerates expression and slight pose changes, while the transformed face images are less sensitive to illumination changes and still preserve the appearance information of faces. The new measure is defined on these transformed faces.

The organization of the report is as follows. The next section describes the image transformation and shows that it is robust to illumination variations. Section 3 defines the new Hausdorff distance-based measure, Hpv, used in this work, describes the procedure to calculate it for face recognition, and lists the properties of the measure. A time- and space-efficient algorithm for computing Hpv is also explained. In Section 4, the performance of Hpv for face recognition is evaluated using benchmark face databases, viz. ORL (AT&T Laboratories, Cambridge) [1] and YALE (Yale University, New Haven) [2]. This work is based on Vivek and Sudha (2006) [6].


Image Transformation
Compared against Edge-Maps

Existing Hausdorff distance-based measures for face recognition are defined between edge images of faces. Edge images are less affected by illumination variations; however, they do not carry the overall facial appearance. When gray images, which do carry appearance information, are compared directly, performance suffers under illumination variations. The effect of illumination can be reduced to a large extent by representing a pixel in terms of the relative intensities of the pixels in its neighborhood. Thus, a pixel is represented by a vector rather than a single gray value.


Transformation Procedure

The transformation is carried out by considering the 8-neighborhood of a pixel, which together with the pixel forms a 3 x 3 window. The signs of the first derivatives taken along the directions of the neighbors are expected to remain the same under uniform illumination changes over the window.


The advantage of representing a pixel in terms of its neighborhood is that it captures the distribution of the intensities in the neighborhood.

Figure 1: Transformation to Vector Image

Each element of the vector corresponds to the first derivative along the direction of one neighbour of the pixel and takes one of the values 1, 0 and -1. Let g(p) and g(pn) be the gray values of pixel p and its neighbor pn. Then the element of the vector assigned to p corresponding to pn takes the value 1 if g(p) > g(pn), 0 if g(p) = g(pn), and -1 if g(p) < g(pn). Fig. 1 shows a 3 x 3 window in an image and the vector corresponding to the pixel p at its center.
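The transformation above can be sketched in a few lines of NumPy. This is a minimal illustrative version, not the report's code; the neighbour ordering is an arbitrary choice.

```python
import numpy as np

# Offsets of the 8 neighbours of a pixel, in a fixed (arbitrary) order.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1),
              ( 0, -1),          ( 0, 1),
              ( 1, -1), ( 1, 0), ( 1, 1)]

def vectorize(img):
    """Map a gray image (r x c) to a vector image of shape (r-2, c-2, 8).

    Each element is the sign of the first derivative toward one neighbour:
    +1 if the centre pixel is brighter, -1 if darker, 0 if equal.
    Border pixels are dropped because they lack all eight neighbours.
    """
    img = np.asarray(img, dtype=np.int32)
    centre = img[1:-1, 1:-1]
    vec = np.empty(centre.shape + (8,), dtype=np.int8)
    for k, (dy, dx) in enumerate(NEIGHBOURS):
        neighbour = img[1 + dy: img.shape[0] - 1 + dy,
                        1 + dx: img.shape[1] - 1 + dx]
        vec[:, :, k] = np.sign(centre - neighbour)
    return vec
```

Because only the signs of the differences are kept, any illumination change that preserves the intensity ordering within a window (e.g. adding a constant, or scaling by a positive factor) leaves the vector image unchanged.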


Robustness against pose/illumination variation

The robustness to expression and slight pose variations is achieved by incorporating partialness in the measure. The robustness of the proposed transformation to illumination variation is illustrated using images from one of the testing databases. Fig. 2(a) and 2(b) are images of the same person under different illumination, and 2(c) is an image of a different person. The corresponding transformed images are shown in 3(a), 3(b) and 3(c). The error image for 2(a) and 2(b) is almost dark, which shows that the transformed images are little affected by illumination variations. The error image between the transformed images of different persons has larger values over the nose, mouth and eye regions, and hence discriminates between different faces.


Figure 2: Original Images

Figure 3: Transformed Images

Figure 4: Error Images


Partial Hausdorff Distance

Definition

Let A and B be the face images of size r x c to be compared, and let Av and Bv be the corresponding transformed images. The border pixels of A and B are ignored in the transformed images, as they do not have all eight neighbors; the size of Av and Bv is therefore (r-2) x (c-2). Let v(p) denote the vector at pixel p. The new Hausdorff distance-based measure between A and B is defined as

    Hv(A, B) = max( hv(A, B), hv(B, A) )

where hv is the directed version of Hv, given by

    hv(A, B) = max over a in Av of [ min over b in Bv of d(a, b) ]

where

    d(a, b) = ||a - b||   if v(a) = v(b)
    d(a, b) = L           if v(a) != v(b)

Here ||a - b|| is the Euclidean distance between the pixel locations and L is a large constant that penalizes pixels whose vectors do not match.

The proposed measure can be improved for image comparison applications by introducing partial matching, obtained by taking the Kth ranked distance instead of the maximum in hv:

    hpv(A, B) = Kth ranked value over a in Av of [ min over b in Bv of d(a, b) ]

The value of K chosen for this work is 0.5*size(A), i.e., half the number of pixels.
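The measure can be written out directly as a brute-force sketch. This is an illustrative implementation under the definitions above, not the report's optimized code; the penalty value is an arbitrary stand-in for the constant L.

```python
import numpy as np

def directed_hpv(Av, Bv, frac=0.5, penalty=1e6):
    """Directed partial distance h_pv(A, B) between two vector images
    of shape (rows, cols, 8). For each pixel a in Av, find the nearest
    pixel b in Bv with an identical sign vector (spatial Euclidean
    distance); a pixel with no match anywhere keeps the large penalty L.
    Return the K-th ranked of these minima, K = frac * number of pixels.
    Brute force, quadratic in the pixel count -- a readability sketch.
    """
    dists = []
    for ya in range(Av.shape[0]):
        for xa in range(Av.shape[1]):
            best = penalty
            for yb in range(Bv.shape[0]):
                for xb in range(Bv.shape[1]):
                    if np.array_equal(Av[ya, xa], Bv[yb, xb]):
                        best = min(best, np.hypot(ya - yb, xa - xb))
            dists.append(best)
    k = max(1, int(frac * len(dists)))      # K-th ranked, 1-based
    return float(np.sort(dists)[k - 1])

def hpv(Av, Bv, frac=0.5):
    """Symmetric measure: the maximum of the two directed distances."""
    return max(directed_hpv(Av, Bv, frac), directed_hpv(Bv, Av, frac))
```

Identical vector images give a distance of 0, and displacing matching vectors by one pixel gives a distance of 1, as expected from the definition.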


Computation Algorithm

The computation of hpv(A, B) involves finding, for each pixel a in Av, the nearest pixel b in Bv with v(a) = v(b). Instead of scanning the entire image, if a linked list of the pixels in Bv is constructed for every possible vector, the search for the nearest pixel can be restricted to a single list. There are 3^8 = 6561 possible vectors, and hence an array of 3^8 linked lists is constructed; the elements of the array are pointers to the linked lists. The array index i is computed from a vector [n0, ..., n7] as i = sum over k = 0 to 7 of (nk + 1) * 3^k. A pictorial representation of the data structure is shown in Fig. 5. Once the data structure is created for Bv, the computation of hpv(A, B) proceeds as follows. For each pixel in Av, the linked list corresponding to its vector value is searched linearly for the nearest pixel, and the distance to it is assigned to the pixel. This assigns a distance value to every pixel in Av, after which the Kth ranked distance is computed, giving hpv(A, B). Similarly, hpv(B, A) is computed by creating the data structure for Av, and Hpv(A, B) is the maximum of the two directed distances.


Time Complexity Analysis

The vectorization of an image of size r x c can be done in a single scan, hence the time complexity of generating the vector image is O(rc). The index into the linked-list array can be computed in constant time, and inserting all pixels of the images into the corresponding lists takes O(rc) time. Finding the nearest pixel involves a linear search of a list of pixels, so the time taken by this operation depends on the length of the list, and it has to be executed 2rc times. Hence, the time required to compute the distances is O(drc), where d is the length of the longest list. Sorting the distance values to obtain the Kth ranked one is done using quicksort, which takes O(rc log(rc)) time on average. Since d ranges from 1 in the best case to rc in the worst case, the overall complexity is O(drc), i.e., O(r²c²) in the worst case.
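As a side note, the Kth ranked distance does not actually require a full sort: a selection algorithm finds it in O(rc) average time. The sketch below uses NumPy's partial sort for this; it is an alternative to the quicksort step described above, not what the report implemented.

```python
import numpy as np

def kth_ranked(dists, k):
    """Return the k-th smallest value (1-based) of an array of pixel
    distances without fully sorting it. np.partition places the k-th
    order statistic at index k-1 in O(n) average time, versus
    O(n log n) for a full quicksort."""
    dists = np.asarray(dists)
    return dists[np.argpartition(dists, k - 1)[k - 1]]
```

For the median (K = 0.5 * rc) this shaves the log factor off the ranking step, though the O(drc) search still dominates.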


Performance Tests
Face Databases Used

The ORL face database [1] from AT&T Laboratories, Cambridge was one of the databases used. It consists of images of 40 subjects (36 men and 4 women); for each person, a set of images was taken over a period of time in two sessions. Figure 5 shows sample images of some subjects.

Figure 5: Sample ORL Face Database

The other database was the Yale face database [2], consisting of 15 subjects with 11 images each. The images have expression variations as well as illumination variations due to a light source positioned at the right, left and center with respect to the face. Some images of one person are shown in Figure 6.


Figure 6: Sample YALE Face Database


Results & Comparisons
Tabular Results

When compared with existing methods under expression, pose and illumination variation, the presented method achieves substantially higher recognition rates.

Expression    PHD       LEM       M2HD      Hpv
Smiling       88.88%    78.57%    52.68%    100%
Angry         77.77%    92.86%    81.25%    96.82%
Screaming     46.82%    31.25%    20.54%    81.74%

Table 1: Comparison based on Expression Variation

Further work is also possible to improve the running time of the algorithm, which is currently too high for practical applications.


Pose          PHD       LEM       M2HD      Hpv
Right/Left    65%       74.17%    50%       86.66%
Up            43.33%    70%       65%       85%
Down          61.66%    70%       67.67%    95%
Average       58.75%    72.09%    58.17%    88.33%

Table 2: Comparison based on Pose Variation

No. of Best Matches    1       2       3       4       5
YALE                   100%    100%    100%    100%    98.66%
ORL(1-20)              100%    98%     96%     89%     81%
ORL(21-40)             100%    100%    99%     93%     88%

Table 3: Results for first 5 best matches


Time Taken as a Function of Image Size

Testing was done on a Linux-based server: a single-core, 64-bit, dual-CPU machine (AMD Opteron 250 processors) with 8 GB RAM and 3 x 140 GB hard disks. The tests revealed the following: the algorithm takes around 0.7-0.9 seconds per image for the databases used, where the images are of size 92 x 112 pixels in ORL and 75 x 105 pixels in YALE.



Discussion

The results tabulated for this work appear to be better than those reported in Vivek and Sudha (2006) [6]. One reason could be the number of databases used, which is smaller in this work (two as against four). Secondly, care has been taken to keep the images of roughly the same size for the YALE database used in this work, and the ORL database provides nicely cropped images of exactly the same size, whereas the Vivek and Sudha work does not mention the cropping or size of the images. One way of removing the image-size dependence would be to scale the images being compared up or down so that the distances are calculated on the same scale when the images are of different sizes. Thirdly, the background in both databases used here is constant, but again there is no mention of the background in their work.

It is seen that the time taken is a function of the number of pixels and increases linearly with the number of images being compared. Further work can be done to improve the time taken by the algorithm, by pre-transforming the database images and by using some criterion to hash them into groups; this would significantly reduce the match-finding time. One such criterion could be the distance between the eyes, or the relative position of the mouth with respect to the eyes, etc.


References

[1] "The ORL database of faces." [Online]. Available: http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html

[2] "Yale university face database." [Online]. Available: http://cvc.yale.edu/projects/yalefaces/yalefaces.html

[3] K. M. Lam, K. H. Lin, and W. Chi, "Spatially eigenweighted Hausdorff distances for human face recognition," Pattern Recognition Letters, vol. 36, pp. 1827–1834, December 2003.
This paper is an extension of the weighting approach of [4] by the same authors, with a very similar structure and the same definition of the Hausdorff distance. The weighting function used, however, is much better, as it is statistically calculated from a set of training images: regions of maximum variation in the training images are given the maximum weight, to emphasise the critical pixels, which may or may not belong to the eyes, mouth, etc. A very good collection of experimental data is shown, and a wide variety of databases is used to validate the theory, viz. YALE, MIT, ORL and mixed databases. The paper shows slightly better results than the authors' previous work.

[4] K. M. Lam, K. H. Lin, and W. C. Siu, "Human face recognition based on spatially weighted Hausdorff distance," Pattern Recognition Letters, vol. 24, pp. 499–507, August 2003.
This is a very straightforward paper with a new definition of the Hausdorff distance as an average-min distance. The authors clearly show the relevance of this modified version. A logical and simple weighting strategy is followed to emphasise the highly discriminating features of the human face, such as the eyes and mouth. A reference to another paper is given for the methodology to locate the eyes and mouth, with no further explanation. These updates resulted in slightly improved recognition results, but the overall approach is based on edge maps, so many features are lost when they do not form good edges.


[5] K. M. Lam and H. Yan, "Locating and extracting the eye in human face images," The Journal of Pattern Recognition Society, vol. 29, pp. 771–779, August 1995.
Little information is given about the snake method used to find the head contour, which in turn determines the boxed region for the eyes; a rather vaguely defined energy-minimisation method is suggested for drawing the head contour, though it seems to work well. A very generic model of the human eye is assumed, and corners are found at the expected places according to the model. A nice series of experiments yields knowledge of the feature-vector values for different regions of the eye itself. Very impressive results are shown, in which the algorithm identifies the contours of the eyelids and even the eyeball.

[6] E. Vivek and N. Sudha, "Robust Hausdorff distance measure for face recognition," The Journal of Pattern Recognition Society, vol. 40, pp. 431–442, April 2006.
Very nicely written, in easy language; one of the most recent and best methods in face recognition as of today. The credit goes to the transformation method used to obtain a false-coloured image from the input image that is less susceptible to changes in illumination. Using this transformation, which is based on the gradients in all 8 directions at every pixel, instead of an edge map reduces the loss of features from the non-eye and non-mouth regions that was evident in the earlier approaches. A very nice tabulation of data shows the effectiveness of the approach irrespective of changes in illumination, pose and even expression.

