A Robust Face Recognition Method Using Distance Metric
Project Report with Annotated Bibliography

Ashwini Shukla (Y4112) (ashukla@iitk.ac.in)
CS-676 Term Project, IIT Kanpur
November 17, 2007

Contents

1 Introduction
2 Image Transformation
  2.1 Comparison against Edge Maps
  2.2 Transformation Procedure
  2.3 Robustness against Pose/Illumination Variation
3 Partial Hausdorff Distance
  3.1 Definition
  3.2 Computation Algorithm
  3.3 Time Complexity Analysis
4 Performance Tests
  4.1 Face Databases Used
5 Results & Comparisons
  5.1 Tabular Results
  5.2 Time Taken as a Function of Image Size
6 Discussion

1 Introduction

In this work we have implemented a new partial Hausdorff distance based measure to compare the appearance of faces. The measure is applied to face recognition on gray images of faces instead of edge maps, and it is robust to variations in a face due to expression, illumination and slight pose differences. The partialness in the measure tolerates expression and slight pose changes, while the transformed face images are less sensitive to illumination changes and still preserve the appearance information of the faces. The new measure is defined on these transformed faces.

The report is organized as follows. The next section describes the image transformation and shows that it is robust to illumination variations. Section 3 defines the new Hausdorff distance based measure, Hpv, used in this work, describes the procedure for computing it for face recognition, and lists the properties of the measure.
A time- and space-efficient algorithm for the computation of Hpv is also explained. In Section 4, the performance of Hpv for face recognition is evaluated using benchmark face databases, viz. ORL (AT&T Laboratories, Cambridge) [1] and YALE (Yale University, New Haven) [2]. This work is based on Vivek and Sudha (2006) [6].

2 Image Transformation

2.1 Comparison against Edge Maps

Existing Hausdorff distance based measures for face recognition are defined between edge images of faces. Edge images are less affected by illumination variations, but they do not carry the overall facial appearance. When the gray images, which do carry appearance information, are compared directly, performance is affected by illumination variations. The effect of illumination can be reduced to a large extent by representing a pixel in terms of the relative intensities of the pixels in its neighborhood. Thus a pixel is represented by a vector rather than a single gray value.

2.2 Transformation Procedure

The transformation is carried out by considering the 8-neighborhood of each pixel, which together with the pixel forms a 3 x 3 window. The signs of the first derivatives taken along the directions of the neighbors are expected to remain the same under uniform illumination changes over the window. The advantage of representing a pixel in terms of its neighborhood is that it captures the distribution of the intensities in the neighborhood.

Figure 1: Transformation to Vector Image

Each element of the vector corresponds to the first derivative along the direction of one neighbor of the pixel, and takes one of the values 1, 0 and -1. Let g(p) and g(pn) be the gray values of a pixel p and its neighbor pn. Then the element of the vector assigned to p corresponding to pn takes the value 1 if g(p) > g(pn), 0 if g(p) = g(pn), and -1 if g(p) < g(pn). Fig. 1 shows a 3 x 3 window in an image and the vector corresponding to the pixel p at the center.
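As a concrete sketch of the procedure above (an illustration, not the author's original code; the ordering of the eight neighbors is an arbitrary but fixed choice):

```python
import numpy as np

def to_vector_image(img):
    """Map each interior pixel of a gray image to its 8-element sign vector.

    Element k is the sign of the first derivative toward the k-th neighbor:
    1 if the center pixel is brighter than that neighbor, -1 if darker,
    0 if equal.  Border pixels are dropped since they lack all 8 neighbors,
    so an r x c image yields an (r-2) x (c-2) vector image.
    """
    img = np.asarray(img, dtype=np.int32)
    r, c = img.shape
    # Offsets of the 8 neighbors, in a fixed (arbitrary) order.
    offsets = [(-1, -1), (-1, 0), (-1, 1),
               (0, -1),           (0, 1),
               (1, -1),  (1, 0),  (1, 1)]
    center = img[1:r - 1, 1:c - 1]
    vec = np.zeros((r - 2, c - 2, 8), dtype=np.int8)
    for k, (dy, dx) in enumerate(offsets):
        neighbor = img[1 + dy:r - 1 + dy, 1 + dx:c - 1 + dx]
        vec[:, :, k] = np.sign(center - neighbor)
    return vec
```

Adding a constant brightness offset (a uniform illumination change) leaves the vector image unchanged, which is exactly the robustness property being claimed.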
2.3 Robustness against Pose/Illumination Variation

Robustness to expression and slight pose variations is achieved by incorporating partialness in the measure. The robustness of the proposed transformation to illumination variation is illustrated using images from one of the test databases. Figs. 2(a) and 2(b) are images of the same person under different illumination, and 2(c) is the image of a different person. The corresponding transformed images are 3(a), 3(b) and 3(c). The error image for 2(a) and 2(b) is almost dark, which shows that the transformed images are little affected by illumination variations. For different persons the error image has larger values over the nose, mouth and eye regions, and hence discriminates the transformed images of different faces.

Figure 2: Original Images
Figure 3: Transformed Images
Figure 4: Error Images

3 Partial Hausdorff Distance

3.1 Definition

Let A and B be the face images of size r x c to be compared, and let Av and Bv be the transformed images. The border pixels of A and B are ignored in the transformed images as they do not have all eight neighbors; the size of Av and Bv is therefore (r-2) x (c-2). Let v(p) represent the vector at pixel p. Then the new Hausdorff distance based measure between A and B is defined as

    Hv(A, B) = max( hv(A, B), hv(B, A) )

where hv is the directed version of Hv, given by

    hv(A, B) = max_{a in Av} min_{b in Bv} d(a, b)

where

    d(a, b) = ||a - b||   if v(a) = v(b)
            = L           if v(a) != v(b)

with ||a - b|| the distance between the pixel positions and L a large penalty value. The proposed measure can be improved for image comparison applications by introducing partial matching. Partial matching is obtained by taking the K-th ranked distance instead of the maximum in hv:

    hpv(A, B) = K-th ranked_{a in Av} min_{b in Bv} d(a, b)

The value of K chosen for this work is 0.5 * size(A).

3.2 Computation Algorithm

The computation of hpv(A, B) involves finding, for each pixel a in Av, the nearest pixel b in Bv with v(a) = v(b).
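As an illustrative sketch of the measure and its computation (not the author's code: Python lists stand in for the linked lists, and Euclidean distance between pixel positions together with the penalty value L are assumptions consistent with the definition above):

```python
import numpy as np

def vector_index(v):
    """Base-3 index i = sum_k (n_k + 1) * 3^k of an 8-element sign
    vector in {-1, 0, 1}^8: a unique slot among 3^8 = 6561."""
    i = 0
    for k, n in enumerate(v):
        i += (int(n) + 1) * 3 ** k
    return i

def build_buckets(Bv):
    """Array of 6561 pixel lists: bucket i holds the coordinates of
    every pixel of Bv whose sign vector has index i, so the nearest
    matching pixel is searched within one short list."""
    buckets = [[] for _ in range(3 ** 8)]
    h, w, _ = Bv.shape
    for y in range(h):
        for x in range(w):
            buckets[vector_index(Bv[y, x])].append((y, x))
    return buckets

def directed_hpv(Av, Bv, frac=0.5, L=1e9):
    """Directed partial measure h_pv(A, B).

    Each pixel a of Av is assigned the distance to the nearest pixel
    of Bv with an identical sign vector (penalty L if none exists);
    the K-th ranked of these distances, K = frac * number of pixels,
    replaces the maximum and gives the partial matching."""
    buckets = build_buckets(Bv)
    h, w, _ = Av.shape
    dists = []
    for y in range(h):
        for x in range(w):
            cands = buckets[vector_index(Av[y, x])]
            if cands:
                best = min(((y - yb) ** 2 + (x - xb) ** 2) ** 0.5
                           for yb, xb in cands)
            else:
                best = L
            dists.append(best)
    dists.sort()
    return dists[max(int(frac * len(dists)) - 1, 0)]

def Hpv(Av, Bv, frac=0.5):
    """Symmetric measure: maximum of the two directed distances."""
    return max(directed_hpv(Av, Bv, frac), directed_hpv(Bv, Av, frac))
```

Comparing an image with itself gives Hpv = 0, and two images with no matching vectors anywhere give Hpv = L.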
Instead of scanning the entire image, if a linked list of the pixels in Bv is constructed for every possible vector, the search for the nearest pixel can be limited. There are 3^8 = 6561 possible vectors, and hence an array of 6561 linked lists is constructed; the elements of the array are pointers to the linked lists. The array index i is computed from a vector [n0, ..., n7] as

    i = sum_{k=0}^{7} (nk + 1) * 3^k

The pictorial representation of the data structure is shown in Fig. 5. Once the data structure is created for Bv, the computation of hpv(A, B) proceeds as follows. For each pixel in Av, the linked list corresponding to its vector value is searched linearly for the nearest pixel, and the distance to it is recorded. This assigns a distance value to every pixel in Av. The K-th ranked distance is then computed, which gives hpv(A, B). Similarly, hpv(B, A) is computed by creating the data structure for Av, and Hpv(A, B) is the maximum of the two directed distances.

3.3 Time Complexity Analysis

The vectorization of an image of size r x c can be done in a single scan, so the time complexity of generating the vector image is O(rc). The index into the linked-list array can be computed in constant time, and inserting all pixels of the images into their lists takes O(rc) time. Since finding the nearest pixel involves a linear search of a list, the time taken depends on the length of the list, and this search is executed about 2rc times (once per pixel in each direction). Hence the time required to find the distances is O(drc), where d is the length of the largest list; d ranges from 1 in the best case to rc in the worst case. Selecting the K-th ranked distance by sorting with quicksort takes O(rc log(rc)) time on average. The overall worst-case complexity is therefore O(drc).

4 Performance Tests

4.1 Face Databases Used

The ORL face database [1] from AT&T Laboratories, Cambridge, was one of the databases used. It consists of images of 40 subjects (36 men and 4 women).
The images were taken over a varied period of time, in two sessions for each person. Figure 6 shows sample images of some subjects.

Figure 6: Sample ORL Face Database

The other was the Yale face database [2], consisting of 15 subjects with 11 images each. The images have expression variations as well as illumination variations due to a light source positioned at the right, left and center with respect to the face. Some of the images of one person are shown in Figure 7.

Figure 7: Sample YALE Face Database

5 Results & Comparisons

5.1 Tabular Results

Comparing the presented method with other existing methods under expression and pose variation shows that it achieves a substantially higher recognition rate than each of them.

    Expression    PHD       LEM       M2HD      Hpv
    Smiling       88.88%    78.57%    52.68%    100%
    Angry         77.77%    92.86%    81.25%    96.82%
    Screaming     46.82%    31.25%    20.54%    81.74%

    Table 1: Comparison based on Expression Variation

    Pose          PHD       LEM       M2HD      Hpv
    Right/Left    65%       74.17%    50%       86.66%
    Up            43.33%    70%       65%       85%
    Down          61.66%    70%       67.67%    95%
    Average       58.75%    72.09%    58.17%    88.33%

    Table 2: Comparison based on Pose Variation

    No. of Best Matches    1       2       3       4       5
    YALE                   100%    100%    100%    100%    98.66%
    ORL(1-20)              100%    98%     96%     89%     81%
    ORL(21-40)             100%    100%    99%     93%     88%

    Table 3: Results for the first 5 best matches

5.2 Time Taken as a Function of Image Size

Testing was done on a Linux server (172.31.1.6): a 64-bit machine with two single-core AMD Opteron 250 CPUs, 8 GB RAM and 3 x 140 GB hard disks. The algorithm takes around 0.7-0.9 seconds per image for the databases used, where the images are 92 x 112 pixels in ORL and 75 x 105 pixels in YALE. Further work is possible to improve the running time of the algorithm, which at present is not suitable for very practical applications.

6 Discussion

The results tabulated for this work appear better than those reported in Vivek and Sudha (2006) [6].
One reason for this could be the number of databases used, which is smaller in this work (two as against four). Secondly, care has been taken to keep the images of roughly the same size in the YALE database used in this work, and the ORL database provides nicely cropped images of exactly the same size, whereas the Vivek and Sudha work does not mention the cropping or size of its images. One way of removing the dependence on image size would be to scale the images being compared up or down, so that the distances are calculated on the same scale when the images happen to be of different sizes. Thirdly, the background in both the databases used here is constant, but again there is no mention of the background in their work.

It is seen that the time taken is a function of the number of pixels and increases only linearly with the number of images being compared. Further work can be done to improve the time taken by the algorithm, by pre-transforming the database images and by using some criterion to hash them into groups; this would significantly reduce the match-finding time. One such criterion could be the distance between the eyes, or the relative position of the mouth with respect to the eyes, etc.

References

[1] "The ORL database of faces." [Online]. Available: http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html

[2] "Yale University face database." [Online]. Available: http://cvc.yale.edu/projects/yalefaces/yalefaces.html

[3] K. M. Lam, K. H. Lin, and W. Chi, "Spatially eigenweighted Hausdorff distances for human face recognition," Pattern Recognition, vol. 36, pp. 1827-1834, December 2003.
This paper is an extension of the weighting approach of [4] by the same authors, with a very similar structure. The same definition of the Hausdorff distance is used, but the weighting function is much better, as it is statistically calculated from a set of training images: regions of maximum variation in the training images are given the maximum weight, to emphasise the critical pixels, which may or may not belong to the eyes, mouth, etc. A very good collection of experimental data is shown, and a wide variety of databases is used to validate the theory, viz. YALE, MIT, ORL and mixed databases. It shows slightly better results than the authors' previous work.

[4] K. M. Lam, K. H. Lin, and W. C. Siu, "Human face recognition based on spatially weighted Hausdorff distance," Pattern Recognition Letters, vol. 24, pp. 499-507, August 2003.

This is a very straightforward paper with a new definition of the Hausdorff distance as an average-min distance. The authors clearly show the relevance of this modified version. A logical and simple weighting strategy is followed to emphasize the highly discriminating features of the human face, such as the eyes and mouth; a reference to another paper is given for the methodology to find the locations of the eyes and mouth, with no further explanation. These updates in the field of human face recognition resulted in slightly improved results, but the overall approach is still based on edge maps, so many features are lost where good edges cannot be formed.

[5] K. M. Lam and H. Yan, "Locating and extracting the eye in human face images," Pattern Recognition, vol. 29, pp. 771-779, August 1995.

Little information is given about the snake method used to find the head contour, which in turn decides the boxed region for the eyes; a rather vaguely defined energy minimization method is suggested for drawing the head contour, though it seems to work well.
A very generic model of the human eye is assumed, and corners are found at the expected places according to the model. A nice series of experiments yields knowledge of the values of the feature vectors for the different regions of the eye itself. Very impressive results are shown, wherein the algorithm identifies the contour of the eyelids and even of the eyeball.

[6] E. Vivek and N. Sudha, "Robust Hausdorff distance measure for face recognition," Pattern Recognition, vol. 40, pp. 431-442, April 2006.

Very nicely written, in easy language; one of the most recent and best methods in face recognition as of today. The credit goes to the transformation used to obtain, from the input image, a false-coloured image that is less susceptible to changes in illumination. Using this transformation, which is based on the gradients in all 8 directions at every pixel, instead of an edge map reduces the loss of features from the non-eye and non-mouth regions that was evident in the above approaches. A very nice tabulation of data shows the effectiveness of the approach irrespective of changes in illumination, pose and even expression.