FAST FACE DETECTION BY LIFTING DYADIC WAVELET FILTERS
Shigeru Takano, Koichi Niijima and Turghunjan Abdukirim
Department of Informatics, Kyushu University, 6-1 Kasuga-koen, Kasuga, Fukuoka 816-8580, Japan
{takano, niijima, t-abdu}@i.kyushu-u.ac.jp

ABSTRACT

This paper presents a fast algorithm for detecting facial parts such as the nose, eyes and lips in an image by using lifting dyadic wavelet filters. The free parameters in the lifting filters are learned so as to maximize the cosine of the angle between a vector whose components are the lifting filters and a vector of pixels in the facial part. Applying the learned filter to a test image then detects the facial parts in that image. Simulations show that our algorithm detects facial parts quickly and robustly.

1. INTRODUCTION

Face detection in images has attracted considerable attention in computer vision. Research in computer vision includes face recognition, face tracking, pose estimation and facial expression recognition, and face detection plays an essential role in all of these topics. It is therefore important to develop fast and robust face detection algorithms.

There are several studies on face detection by template matching. The template matching process, however, requires a huge amount of computation to extract faces from an image. Furthermore, the technique is not robust against variations in the size, shape, color and texture of the face. Fast and robust face detection algorithms are therefore needed. Recently, wavelet theory has often been applied to feature extraction. Oren et al. [2] developed a pedestrian detection system using wavelet templates. They used low-level components obtained by applying the Haar wavelet filter to a static image, and raised the detection rate by using a support vector machine. However, since the Haar wavelet transform is of down-sampling type, the lowpass and highpass components obtained by the transform are not shift-invariant, which causes failures in facial part detection. Wiskott et al. [7] proposed the elastic bunch graph matching method based on Gabor wavelet filters. The method uses a set of components, called a jet, obtained by applying the Gabor wavelet to hand-selected points on a face. The jet is robust against the size and orientation of the face. However, since the Gabor wavelet is a complex-valued continuous wavelet, computing the jet is time-consuming.

In our recent papers [4, 5, 6], we presented several image extraction algorithms using lifting wavelet transforms of down-sampling type. The method is based on learning the free parameters in the lifting scheme so that the highpass components of a target image vanish. When the learned highpass filter is applied to a test image, however, parts different from the target part are also extracted. The reason is that the wavelet used is not shift-invariant and that the free parameters are learned only to annihilate highpass components.

In this paper, we propose a facial part detection algorithm using lifting dyadic wavelet filters. Since the lowpass and highpass components obtained by the dyadic wavelet transform are not down-sampled, they are shift-invariant. Our algorithm consists of two processes, a learning process and a detection process. In the learning process, the free parameters in the dyadic wavelet filters are determined so as to maximize the cosine of the angle between a vector whose components are the lifting highpass filters and a vector consisting of pixels in the facial part. In the detection process, we compute the angle between the vector of learned filters and a vector of pixels in a test image, and compare the cosine of this angle with the maximized cosine value. In both processes, the cosine is evaluated by dividing the inner product of the two vectors by their norms. The essential part of the inner product is the computation of lowpass and highpass components by the old filters, which contain no free parameters and can therefore be computed in advance. Our detection algorithm is thus fast enough for real-time processing.

The outline of the paper is as follows. In Section 2, we introduce lifting dyadic wavelet filters. Section 3 describes a learning algorithm for the lifting dyadic wavelet filters. In Section 4, we present an algorithm for detecting facial parts in an image. Section 5 contains experimental results. Concluding remarks and future work can be found in Section 6.

2. LIFTING DYADIC WAVELET FILTERS

We denote the lowpass and highpass analysis filters by $h^o_k$ and $g^o_k$, respectively, and the lowpass and highpass synthesis filters by $\tilde h^o_k$ and $\tilde g^o_k$, respectively. We also denote their Fourier transforms by $\hat h^o(\omega)$, $\hat g^o(\omega)$, $\hat{\tilde h}^o(\omega)$ and $\hat{\tilde g}^o(\omega)$, respectively. We call these filters dyadic wavelet filters when they satisfy the reconstruction condition

    \hat h^o(\omega) \hat{\tilde h}^{o*}(\omega) + \hat g^o(\omega) \hat{\tilde g}^{o*}(\omega) = 2,    (1)

where the symbol $*$ denotes complex conjugation.

Let us denote a signal by $c^0_k$. Using the filters $h^o$ and $g^o$, we can compute its lowpass and highpass components $c^1_k$ and $d^1_k$ as follows:

    c^1_k = \sum_l h^o_l c^0_{k+l},    (2)

    d^1_k = \sum_l g^o_l c^0_{k+l}.    (3)

Conversely, by virtue of the reconstruction condition (1), the original signal $c^0$ can be reconstructed from $c^1$ and $d^1$ as

    c^0_k = \frac{1}{2} \sum_l \tilde h^o_l c^1_{k-l} + \frac{1}{2} \sum_l \tilde g^o_l d^1_{k-l}.    (4)

The formulae (2), (3) and (4) imply that $c^0$ is equivalent to the pair $\{c^1, d^1\}$.

Furthermore, we define the lifting dyadic wavelet filters $h_k$, $g_k$, $\tilde h_k$ and $\tilde g_k$ by

    h_k = h^o_k,    g_k = g^o_k + \sum_m s_m h^o_{k-m},
    \tilde g_k = \tilde g^o_k,    \tilde h_k = \tilde h^o_k - \sum_m s_m \tilde g^o_{k-m}.    (5)

Here the $s_m$ denote free parameters. In the paper [1], we proved that the Fourier transforms of the filters defined by (5) satisfy the reconstruction condition (1). Therefore, (2), (3) and (4) hold for the new filters as well.

3. LEARNING PROCESS

Let us denote an image by $c^0_{ij}$. Applying formula (2) to the image $c^0$ in the vertical direction, we get

    c^{row}_{ij} = \sum_k h^o_k c^0_{i+k,j}.    (6)

Next, applying (3) to (6) in the horizontal direction yields

    d^{row}_{1,ij} = \sum_{l=N_1}^{N_2} g_l c^{row}_{i,j+l},    (7)

where

    g_l = g^o_l + \sum_m s_m h^o_{l-m}.    (8)

Similarly, we obtain the highpass components

    d^{col}_{1,ij} = \sum_{l=N_1}^{N_2} \tilde g_l c^{col}_{i+l,j},    (9)

where

    c^{col}_{ij} = \sum_k h^o_k c^0_{i,j+k},    (10)

    \tilde g_l = g^o_l + \sum_m \tilde s_m h^o_{l-m}.    (11)

The filter (11) is lifted from $g^o$ in the same way as (8), but carries its own free parameters $\tilde s_m$.

Here we define the two vectors

    \mathbf g = (g_{N_1}, \cdots, g_{N_2}),
    \mathbf c^{row}_{ij} = (c^{row}_{i,j+N_1}, \cdots, c^{row}_{i,j+N_2}).

Using the inner product symbol $\cdot$, $d^{row}_{1,ij}$ can be written as

    d^{row}_{1,ij} = \mathbf g \cdot \mathbf c^{row}_{ij}.

Similarly, $d^{col}_{1,ij}$ can be expressed as

    d^{col}_{1,ij} = \tilde{\mathbf g} \cdot \mathbf c^{col}_{ij},

where

    \tilde{\mathbf g} = (\tilde g_{N_1}, \cdots, \tilde g_{N_2}),
    \mathbf c^{col}_{ij} = (c^{col}_{i+N_1,j}, \cdots, c^{col}_{i+N_2,j}).

Therefore, we define the cosine for each of $d^{row}_{1,ij}$ and $d^{col}_{1,ij}$ as follows:

    \cos\theta_{ij} = \frac{d^{row}_{1,ij}}{\|\mathbf g\| \|\mathbf c^{row}_{ij}\|},    (12)

    \cos\tilde\theta_{ij} = \frac{d^{col}_{1,ij}}{\|\tilde{\mathbf g}\| \|\mathbf c^{col}_{ij}\|}.    (13)

Here $\theta_{ij}$ and $\tilde\theta_{ij}$ indicate the angles between $\mathbf g$ and $\mathbf c^{row}_{ij}$, and between $\tilde{\mathbf g}$ and $\mathbf c^{col}_{ij}$, respectively, and $\|\cdot\|$ denotes the norm of a vector.

Our criterion for learning the free parameters $s_m$ and $\tilde s_m$ is to maximize (12) and (13) for the facial part $c^0_{ij}$ in a training image. This criterion is equivalent to

    (1 - \cos\theta_{ij})^2 \rightarrow \min,    (14)

    (1 - \cos\tilde\theta_{ij})^2 \rightarrow \min.    (15)

The problems (14) and (15) can be solved by the steepest descent method.

Choosing points $(i,j)$ in the facial parts, we learn the free parameters $s^k_m$ and $\tilde s^k_m$ under the criteria (14) and (15), where $-2 \le m \le 2$ and $k = 1, \cdots, 8$. Using learned parameter sets of different lengths, we are able to capture the features of the facial parts robustly.

Our learning algorithm for the parameters $s_m$ and $\tilde s_m$ is as follows:

1. Prepare facial parts $c^0_{ij}$ for learning $s_m$ and $\tilde s_m$.
2. Compute the lowpass and highpass components $c^{row}_{ij}$, $d^{row}_{1,ij}$, $c^{col}_{ij}$ and $d^{col}_{1,ij}$ by using (6), (7), (10) and (9).
3. Determine $s_m$ and $\tilde s_m$ by solving the minimization problems (14) and (15) at the points $(i,j)$ of the nose, eyes and lips in $c^0$.

4. DETECTION PROCESS

In Section 3, the free parameters $s_m$ and $\tilde s_m$ in (8) and (11) were learned for a facial part in a training image. In this section, we detect facial parts in a test image using the learned parameters. We denote the test image again by $c^0_{ij}$. The components $d^{row}_{1,ij}$ and $d^{col}_{1,ij}$ for the test image must be computed fast. In the present case they can be written as

    d^{row}_{1,ij} = d^{o,row}_{1,ij} + \sum_{m=-2}^{2} s_m e^{o,row}_{1,i,j+m},    (16)

    d^{col}_{1,ij} = d^{o,col}_{1,ij} + \sum_{m=-2}^{2} \tilde s_m e^{o,col}_{1,i+m,j}.    (17)

Here $d^{o,row}_{1,ij}$ and $d^{o,col}_{1,ij}$ are the highpass components obtained by replacing $g_l$ in (7) and $\tilde g_l$ in (9) by $g^o_l$, and $e^{o,row}_{1,ij}$ is given by

    e^{o,row}_{1,ij} = \sum_l h^o_l c^{row}_{i,j+l},

with $e^{o,col}_{1,ij}$ defined analogously from $c^{col}$.

We see from (16) and (17) that $d^{o,row}_{1,ij}$, $d^{o,col}_{1,ij}$, $e^{o,row}_{1,ij}$ and $e^{o,col}_{1,ij}$ are independent of the free parameters and can be computed in advance. Therefore, the computation of (16) and (17) involves only a short running time.

To find the facial parts in $c^0$, we introduce the quantity

    R_{ij} = (Q^{row}_{ij} - 1)^2 + (Q^{col}_{ij} - 1)^2,    (18)

where

    Q^{row}_{ij} = \frac{\cos\theta_{ij}}{\cos\theta},    (19)

    Q^{col}_{ij} = \frac{\cos\tilde\theta_{ij}}{\cos\tilde\theta},    (20)

and $\cos\theta$ and $\cos\tilde\theta$ are the maximized cosine values obtained in the learning process. Choosing the point $(i_0, j_0)$ that minimizes (18) gives a facial part.

To realize fast face detection, we first search $c^0$ for the nose. After the nose has been found, the eyes and lips are searched for around their locations, which have already been predicted in the learning process.

We summarize our face detection algorithm as follows:

1. Compute the lowpass and highpass components $d^{o,row}_{1,ij}$, $d^{o,col}_{1,ij}$, $e^{o,row}_{1,ij}$ and $e^{o,col}_{1,ij}$ for a test image.
2. Compute $d^{row}_{1,ij}$ and $d^{col}_{1,ij}$ defined by (16) and (17) using the components computed in Step 1 and the learned parameters.
3. Find the position $(i,j)$ of the nose by minimizing $R_{ij}$.
4. Search the areas predicted in the learning process for the eyes and lips.

5. EXPERIMENTAL RESULTS

In the experiment, we detect facial parts in an 8-bit facial image of the first author. As the training image, we use another facial image of the first author with the same format as the test image. Figure 1 illustrates the training image, and Fig. 2 the test image.

Fig. 1. Training image.

Fig. 2. Test image.

Our face detection system can use any dyadic wavelet filters as initial filters. In the experiment, we choose the B-spline dyadic wavelet filters. Applying the filters to the training image, we learned $s^k_m$ and $\tilde s^k_m$ following the learning algorithm described in Section 3.
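As a concrete illustration of the learning and detection processes of Sections 3 and 4, the following one-dimensional sketch builds the lifted highpass filter of (8), maximizes the cosine (12) at a training position, and locates the trained part in a test signal by minimizing the quantity R of (18). It is only a sketch under stated assumptions, not the paper's implementation: the initial pair (H0, G0) is a Haar-like stand-in for the B-spline dyadic wavelet filters, a numeric-gradient hill climb replaces the steepest descent method, and all names are illustrative.

```python
import numpy as np

# Illustrative initial dyadic filter pair: a Haar-like stand-in for the
# B-spline dyadic wavelet filters used in the paper.
H0 = np.array([0.5, 0.5])    # lowpass  h^o
G0 = np.array([0.5, -0.5])   # highpass g^o

def lifted_filter(s):
    """Lifted highpass taps g_l = g^o_l + sum_m s_m h^o_{l-m}, cf. (8)."""
    g = np.convolve(s, H0)       # sum_m s_m h^o_{l-m}
    g[:len(G0)] += G0            # add g^o (both supported on l >= 0 here)
    return g

def cosines(signal, g):
    """cos(theta_k) between the filter vector g and every window, cf. (12)."""
    n = len(g)
    wins = np.array([signal[k:k + n] for k in range(len(signal) - n + 1)])
    num = wins @ g                                   # inner products d_k
    den = np.linalg.norm(g) * np.linalg.norm(wins, axis=1)
    return num / np.where(den == 0.0, 1.0, den)

def learn(signal, pos, n_s=3, iters=300, lr=0.05, eps=1e-6):
    """Maximize cos(theta) at the training position `pos` over the free
    parameters s (a hill climb standing in for steepest descent)."""
    s = np.zeros(n_s)
    best = cosines(signal, lifted_filter(s))[pos]
    for _ in range(iters):
        grad = np.zeros(n_s)
        for i in range(n_s):                         # numeric gradient
            sp = s.copy()
            sp[i] += eps
            grad[i] = (cosines(signal, lifted_filter(sp))[pos] - best) / eps
        cand = s + lr * grad
        val = cosines(signal, lifted_filter(cand))[pos]
        if val > best:                               # accept improving steps only
            s, best = cand, val
        else:
            lr *= 0.5
    return s, best

def detect(signal, s, cos_trained):
    """R_k = (cos(theta_k)/cos(theta_trained) - 1)^2, cf. (18); argmin = match."""
    R = (cosines(signal, lifted_filter(s)) / cos_trained - 1.0) ** 2
    return int(np.argmin(R)), R

# Example: learn a pattern at position 10 of a training signal, then find
# the same pattern embedded at position 7 of a test signal.
rng = np.random.default_rng(0)
part = np.array([1.0, 3.0, -2.0, 0.5])               # a toy "facial part"
train = np.concatenate([rng.normal(size=10), part, rng.normal(size=10)])
s, c = learn(train, pos=10)
test_signal = np.concatenate([rng.normal(size=7), part, rng.normal(size=12)])
k, R = detect(test_signal, s, c)                     # k: detected position
```

On an exact occurrence of the trained pattern, the test cosine equals the learned maximum, so R vanishes at the true position; the 2-D algorithm runs the same scan separably over rows and columns with the two parameter sets s_m and \tilde s_m.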
In Table 1, we list the values of $\cos\theta$ and $\cos\tilde\theta$ for the parameters $s^k_m$ and $\tilde s^k_m$, $k = 1, \cdots, 8$, learned for the nose, eyes and lips.

Table 1. The values of cos θ (upper row of each pair) and cos θ̃ (lower row).

  k | nose    | left eye | right eye | lips
  1 | 0.99016 | 0.86358  | 0.76062   | 0.98608
    | 0.99249 | 0.94574  | 0.93858   | 0.98168
  2 | 0.97906 | 0.86079  | 0.66756   | 0.97437
    | 0.97991 | 0.91778  | 0.82076   | 0.97009
  3 | 0.97576 | 0.88259  | 0.73522   | 0.97326
    | 0.96734 | 0.90104  | 0.79914   | 0.96865
  4 | 0.98050 | 0.88503  | 0.67083   | 0.97524
    | 0.96674 | 0.92188  | 0.82044   | 0.97547
  5 | 0.98355 | 0.89683  | 0.71575   | 0.97343
    | 0.97319 | 0.94218  | 0.81051   | 0.97809
  6 | 0.98297 | 0.90744  | 0.74607   | 0.96893
    | 0.97638 | 0.94449  | 0.80767   | 0.98001
  7 | 0.98027 | 0.91253  | 0.74379   | 0.96489
    | 0.97846 | 0.94255  | 0.81388   | 0.98199
  8 | 0.96677 | 0.91204  | 0.73393   | 0.96164
    | 0.97991 | 0.93926  | 0.80921   | 0.98324

Detection of facial parts was carried out following the detection algorithm described in Section 4. Figure 3 shows the result of facial part detection. For the eyes and lips, we searched a block of 10 x 10 size around the points predicted in the learning process. The detection time was approximately 2.6 sec on a Pentium 4 (2.26 GHz).

Fig. 3. Extracted facial parts.

In Fig. 4, we plot $R_{ij}$ around the detected nose, eyes and lips. Figure 4 shows that the values of $R_{ij}$ at the detected points are minimal.

Fig. 4. Plots of R_ij.

6. CONCLUSION

We proposed a facial part detection algorithm using lifting dyadic wavelet filters, based on learning the free parameters in the filters. Since we learned 8 sets of free parameters, our detection method has high accuracy. The detection time of the proposed algorithm is satisfactory, but not yet short enough to realize an on-line face detection system. As future work, we will develop an on-line detection system using an FPGA.

7. REFERENCES

[1] T. Abdukirim, K. Niijima and S. Takano, "Lifting dyadic wavelets," DOI Technical Report, No. 212, 2002.
[2] M. Oren, C. Papageorgiou, P. Sinha, E. Osuna and T. Poggio, "Pedestrian detection using wavelet templates," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 193-199, 1997.
[3] W. Sweldens, "The lifting scheme: A custom-design construction of biorthogonal wavelets," Appl. Comput. Harmon. Anal., Vol. 3, No. 2, pp. 186-200, 1996.
[4] S. Takano and K. Niijima, "Robust lifting wavelet transform for subimage extraction," Proceedings of the SPIE: Wavelet Applications in Signal and Image Processing VIII, Vol. 4119, pp. 902-910, 2000.
[5] S. Takano and K. Niijima, "Extraction of subimage by lifting wavelet filters," IEICE Transactions on Fundamentals, Vol. E83-A, No. 8, pp. 1559-1565, 2000.
[6] S. Takano and K. Niijima, "Subimage extraction by integer-type lifting wavelet transforms," Proceedings of the IEEE International Conference on Image Processing, Vol. 2, pp. 403-406, 2000.
[7] L. Wiskott, J.-M. Fellous, N. Krüger and C. von der Malsburg, "Face recognition by elastic bunch graph matching," IEEE Transactions on PAMI, Vol. 19, No. 7, pp. 775-779, 1997.