FAST FACE DETECTION BY LIFTING DYADIC WAVELET FILTERS


                            Shigeru Takano, Koichi Niijima and Turghunjan Abdukirim

                                   Department of Informatics, Kyushu University,
                              6-1, Kasuga-koen, Kasuga, Fukuoka, 816-8580, JAPAN,
                                     takano, niijima, t-abdu @i.kyushu-u.ac.jp


                         ABSTRACT

This paper presents a fast algorithm for detecting facial parts such as the nose, eyes and lips in an image by using lifting dyadic wavelet filters. Free parameters in the lifting filters are learned so as to maximize the cosine of the angle between a vector whose components are the lifting filters and a vector of pixels in the facial part. By applying the learned filters to a test image, facial parts in the image can be detected. Simulations show that the algorithm is fast and robust for detecting facial parts in an image.


                   1. INTRODUCTION

Face detection from an image has attracted considerable attention in computer vision. Research in computer vision includes face recognition, face tracking, pose estimation and facial expression recognition. Face detection plays an essential role in these research topics. Therefore, it is important to develop a fast and robust face detection algorithm.

There are several studies on face detection by template matching methods. In the template matching process, however, a huge amount of computation is required to extract faces from an image. Furthermore, this technique is not robust against variations in the size, shape, color and texture of the face. Thus, fast and robust face detection algorithms have been sought. Recently, wavelet theory has often been used for feature extraction. Oren et al. [2] developed a pedestrian detection system using wavelet templates. They used low-level components obtained by applying the Haar wavelet filter to a static image, and raised the detection rate by using a support vector machine. However, since the Haar wavelet transform is of down-sampling type, the lowpass and highpass components obtained by the transform are not shift-invariant. This causes failures in facial parts detection. Wiskott et al. [7] proposed the elastic bunch graph matching method based on Gabor wavelet filters. The method uses a set of components, called a jet, obtained by applying Gabor wavelets to hand-selected points on a face. The jet is robust against the size and orientation of the face. However, since the Gabor wavelet is a complex-valued continuous wavelet, it is time-consuming to compute the jet. In our recent papers [4, 5, 6], we presented several image extraction algorithms using lifting wavelet transforms of down-sampling type. The method is based on learning the free parameters in the lifting scheme so that the highpass components of a target image vanish. When the learned highpass filter is applied to a test image, however, parts different from the target part are also extracted. This is because the wavelet used is not shift-invariant and because the free parameters are learned only so as to make the highpass components vanish.

In this paper, we propose a facial parts detection algorithm using lifting dyadic wavelet filters. Since the lowpass and highpass components obtained by the dyadic wavelet transform are not down-sampled, they are shift-invariant. Our facial parts detection algorithm consists of two processes, a learning process and a detection process. In the learning process, free parameters in the dyadic wavelet filters are determined so as to maximize the cosine of the angle between a vector whose components are the lifting highpass filters and a vector consisting of pixels in the facial part. In the detection process, we compute the angle between the vector of the learned filters and a vector of pixels in a test image, and compare the cosine of the angle with the maximized cosine value. The cosine value in both processes is evaluated by dividing the inner product of the two vectors by the product of their norms. An essential part of the inner product is the computation of lowpass and highpass components by the old filters, which do not include free parameters, so this computation can be done in advance. This keeps the detection algorithm fast, which is a prerequisite for real-time processing.

The outline of the paper is as follows. In Section 2, we introduce lifting dyadic wavelet filters. Section 3 describes a learning algorithm for the lifting dyadic wavelet filters. In Section 4, we present an algorithm for detecting facial parts in an image. Section 5 contains experimental results. Concluding remarks and future work can be found in Section 6.

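
The shift-invariance argument of the introduction can be checked numerically. The following is a minimal sketch, not the filters used in this paper: it assumes a Haar-like difference filter, a periodic one-dimensional signal and hypothetical function names, and compares the highpass output with and without down-sampling when the input is moved by one sample.

```python
import numpy as np

def highpass_undecimated(x):
    # Haar-like difference evaluated at every position (no down-sampling),
    # with a periodic boundary: y[k] = (x[k+1] - x[k]) / sqrt(2).
    return (np.roll(x, -1) - x) / np.sqrt(2.0)

def highpass_decimated(x):
    # Same difference, but only the even positions are kept.
    return highpass_undecimated(x)[::2]

rng = np.random.default_rng(0)
x = rng.standard_normal(16)      # synthetic signal (assumption)
x_shift = np.roll(x, 1)          # the same signal moved by one sample

u, u_shift = highpass_undecimated(x), highpass_undecimated(x_shift)
d, d_shift = highpass_decimated(x), highpass_decimated(x_shift)

# Without down-sampling, shifting the input simply shifts the output.
undecimated_commutes = bool(np.allclose(u_shift, np.roll(u, 1)))

# With down-sampling, the coefficients of the shifted signal are the ones
# that were discarded before; they are not a shifted copy of d.
decimated_matches_some_shift = any(
    np.allclose(d_shift, np.roll(d, k)) for k in range(d.size)
)
```

In this sketch the undecimated output follows the input shift exactly, while the decimated coefficients of the shifted signal come from the odd positions that the original transform dropped.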
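                   2. LIFTING DYADIC WAVELET FILTERS

We denote lowpass and highpass analysis filters by $h^o_k$ and $g^o_k$, respectively, and lowpass and highpass synthesis filters by $\tilde{h}^o_k$ and $\tilde{g}^o_k$, respectively. We also denote the Fourier transforms of $h^o_k$, $g^o_k$, $\tilde{h}^o_k$ and $\tilde{g}^o_k$ by $\hat{h}^o(\omega)$, $\hat{g}^o(\omega)$, $\hat{\tilde{h}}^o(\omega)$ and $\hat{\tilde{g}}^o(\omega)$, respectively. We call these filters dyadic wavelet filters when they satisfy the reconstruction condition

\[ \hat{h}^o(\omega)\,\hat{\tilde{h}}^{o*}(\omega) + \hat{g}^o(\omega)\,\hat{\tilde{g}}^{o*}(\omega) = 2, \qquad \omega \in \mathbb{R}, \tag{1} \]

where the symbol $*$ denotes complex conjugation.

Let us denote a signal by $c_0[i]$. By using the filters $h^o_k$ and $g^o_k$, we can compute lowpass and highpass components $c_1[i]$ and $d_1[i]$ as follows:

\[ c_1[i] = \sum_k h^o_k\, c_0[i+k], \tag{2} \]

\[ d_1[i] = \sum_k g^o_k\, c_0[i+k]. \tag{3} \]

Conversely, by virtue of the reconstruction condition (1), we can reconstruct the original signal $c_0[i]$ from $c_1[i]$ and $d_1[i]$ as

\[ c_0[i] = \frac{1}{2} \sum_k \tilde{h}^o_k\, c_1[i-k] + \frac{1}{2} \sum_k \tilde{g}^o_k\, d_1[i-k]. \tag{4} \]

The formulae (2), (3) and (4) imply that $c_0[i]$ is equivalent to $(c_1[i], d_1[i])$.

Furthermore, we define the lifting dyadic wavelet filters $h_k$, $\tilde{h}_k$, $g_k$ and $\tilde{g}_k$, respectively, by

\[
\begin{aligned}
h_k &= h^o_k, \\
\tilde{h}_k &= \tilde{h}^o_k + \sum_m s_m\, \tilde{g}^o_{k-m}, \\
g_k &= g^o_k - \sum_m s_{-m}\, h^o_{k-m}, \\
\tilde{g}_k &= \tilde{g}^o_k.
\end{aligned}
\tag{5}
\]

Here $s_m$ denote free parameters. In the paper [1], we proved that the Fourier transforms of the filters defined by (5) satisfy the reconstruction condition (1). Therefore, (2), (3) and (4) hold for the new filters.


                   3. LEARNING PROCESS

Let us denote an image by $c_0[i,j]$. Applying the formula (2) to the image $c_0[i,j]$ in the vertical direction, we get

\[ c^{row}_0[i,j] = \sum_k h^o_k\, c_0[i+k, j]. \tag{6} \]

Next, applying (3) to (6) in the horizontal direction yields

\[ d_1[i,j] = \sum_{l=N_1}^{N_2} g^d_l\, c^{row}_0[i, j+l], \tag{7} \]

where

\[ g^d_k = g^o_k - \sum_m s^d_{-m}\, h^o_{k-m}. \tag{8} \]

Similarly, we obtain highpass components

\[ e_1[i,j] = \sum_{l=N_1}^{N_2} g^e_l\, c^{col}_0[i+l, j], \tag{9} \]

where

\[ c^{col}_0[i,j] = \sum_k h^o_k\, c_0[i, j+k], \tag{10} \]

\[ g^e_k = g^o_k - \sum_m s^e_{-m}\, h^o_{k-m}. \tag{11} \]

Here we define two vectors

\[ \mathbf{g}^d = (g^d_{N_1}, \cdots, g^d_{N_2}), \qquad \mathbf{c}^{row}[i,j] = (c^{row}_0[i, j+N_1], \cdots, c^{row}_0[i, j+N_2]). \]

Using the inner product symbol $\cdot$, $d_1[i,j]$ can be written as

\[ d_1[i,j] = \mathbf{g}^d \cdot \mathbf{c}^{row}[i,j]. \]

Similarly, $e_1[i,j]$ can be expressed as

\[ e_1[i,j] = \mathbf{g}^e \cdot \mathbf{c}^{col}[i,j], \]

where

\[ \mathbf{g}^e = (g^e_{N_1}, \cdots, g^e_{N_2}), \qquad \mathbf{c}^{col}[i,j] = (c^{col}_0[i+N_1, j], \cdots, c^{col}_0[i+N_2, j]). \]

Therefore, we define the cosine for each of $d_1[i,j]$ and $e_1[i,j]$ as follows:

\[ \cos\theta^d = \frac{d_1[i,j]}{\|\mathbf{g}^d\|\,\|\mathbf{c}^{row}[i,j]\|}, \tag{12} \]

\[ \cos\theta^e = \frac{e_1[i,j]}{\|\mathbf{g}^e\|\,\|\mathbf{c}^{col}[i,j]\|}. \tag{13} \]

Here $\theta^d$ and $\theta^e$ indicate the angles between $\mathbf{g}^d$ and $\mathbf{c}^{row}[i,j]$, and between $\mathbf{g}^e$ and $\mathbf{c}^{col}[i,j]$, respectively, and the symbol $\|\cdot\|$ denotes the norm of the vectors.

Our criterion for learning the free parameters $s^d_m$ and $s^e_m$ is to maximize (12) and (13) for the facial part $c_0[i,j]$ in a training image. This criterion is equivalent to

\[ \left( 1 - \frac{\mathbf{g}^d \cdot \mathbf{c}^{row}[i,j]}{\|\mathbf{g}^d\|\,\|\mathbf{c}^{row}[i,j]\|} \right)^{2} \to \min, \tag{14} \]

\[ \left( 1 - \frac{\mathbf{g}^e \cdot \mathbf{c}^{col}[i,j]}{\|\mathbf{g}^e\|\,\|\mathbf{c}^{col}[i,j]\|} \right)^{2} \to \min. \tag{15} \]

The problems (14) and (15) can be solved by the steepest descent method.
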
Choosing the point $(i,j)$ of facial parts and under the conditions (14) and (15), we learn free parameters $s^{d,n}_m$ and $s^{e,n}_m$, where $-2n \le m \le 2n$ and $n = 1, \cdots, 8$. Using the learned parameters with different lengths, we are able to robustly capture the features of the facial parts.

Our learning algorithm for the parameters $s^{d,n}_m$ and $s^{e,n}_m$ is as follows:

   1. Prepare facial parts $c_0[i,j]$ to learn $s^{d,n}_m$ and $s^{e,n}_m$.

   2. Compute the lowpass and highpass components $c^{row}_0[i,j]$, $d_1[i,j]$, $c^{col}_0[i,j]$ and $e_1[i,j]$ by using (6), (7), (10) and (9).

   3. Determine $s^{d,n}_m$ and $s^{e,n}_m$ by solving the minimization problems (14) and (15) at the points $(i,j)$ of the nose, eyes and lips in $c_0[i,j]$.


                   4. DETECTION PROCESS

In Section 3, free parameters $s^{d,n}_m$ and $s^{e,n}_m$ in (8) and (11) were learned for a facial part in a training image. In this section, we detect facial parts in a test image using the learned parameters. Now, we denote the test image again by $c_0[i,j]$. The components $d^n_1[i,j]$ and $e^n_1[i,j]$ for the test image $c_0[i,j]$ must be computed fast. They can be written in the present case as

\[ d^n_1[i,j] = d^o_1[i,j] - \sum_{m=-2n}^{2n} s^{d,n}_m\, c^o_1[i, j-m], \tag{16} \]

\[ e^n_1[i,j] = e^o_1[i,j] - \sum_{m=-2n}^{2n} s^{e,n}_m\, c^o_1[i-m, j]. \tag{17} \]

Here $d^o_1[i,j]$ and $e^o_1[i,j]$ are highpass components obtained by replacing $g^d_l$ in (7) by $g^o_l$, and $g^e_l$ in (9) by $g^o_l$, and $c^o_1[i,j]$ is given by

\[ c^o_1[i,j] = \sum_k \sum_l h^o_k\, h^o_l\, c_0[i+k, j+l]. \]

We see from (16) and (17) that $c^o_1[i,j]$, $d^o_1[i,j]$ and $e^o_1[i,j]$ are independent of $s^{d,n}_m$ and $s^{e,n}_m$ and can be computed in advance. Therefore, the computation of (16) and (17) involves a short running time.

To find the facial parts in $c_0[i,j]$, we introduce the quantity

\[ R[i,j] = \sum_{n=1}^{8} \left( Q^d_n[i,j] - 1 \right)^2 + \sum_{n=1}^{8} \left( Q^e_n[i,j] - 1 \right)^2, \tag{18} \]

where

\[ Q^d_n[i,j] = \frac{d^n_1[i,j]}{\|\mathbf{g}^{d,n}\|\,\|\mathbf{c}^{row}[i,j]\|}, \tag{19} \]

\[ Q^e_n[i,j] = \frac{e^n_1[i,j]}{\|\mathbf{g}^{e,n}\|\,\|\mathbf{c}^{col}[i,j]\|}. \tag{20} \]

Choosing the point $(i_0, j_0)$ that minimizes (18), $c_0[i_0, j_0]$ gives a facial part.

To realize fast face detection, we first search $c_0[i,j]$ for the nose. After the nose has been found, the eyes and lips are searched for around their locations, which have already been predicted in the learning process.

We summarize our face detection algorithm as follows.

   1. Compute lowpass and highpass components $c^o_1[i,j]$, $d^o_1[i,j]$ and $e^o_1[i,j]$ for a test image.

   2. Compute $d^n_1[i,j]$ and $e^n_1[i,j]$ defined by (16) and (17) using the components computed in Step 1 and the learned parameters.

   3. Find the position $(i_0, j_0)$ of the nose by minimizing $R[i,j]$.

   4. Search an area predicted in the learning process for the eyes and lips.


                   5. EXPERIMENTAL RESULTS

In an experiment, we detect facial parts in a facial image of the first author, which is an 8-bit image [dimensions illegible]. As a training image, we use another facial image of the first author with the same format as the test image. Figure 1 illustrates the training image, and Fig. 2 the test image.

                   Fig. 1. Training image.

                   Fig. 2. Test image.

Our face detection system can use any dyadic wavelet filters as initial filters. In the experiment, we choose the B-spline dyadic wavelet filters. Applying the filters to the training image, we learned $s^{d,n}_m$ and $s^{e,n}_m$ following our learning algorithm described in Section 3. In Table 1, we list the values of $\cos\theta^d$ and $\cos\theta^e$ for the parameters $s^{d,n}_m$ and $s^{e,n}_m$, $n = 1, \cdots, 8$, learned for the nose, eyes and lips.

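
The short running time of the detection step rests on the identity behind (16): the outputs of the initial filters are computed once, and the lifted highpass output is then a short weighted sum of shifted lowpass outputs. The sketch below checks this identity numerically on a random array. It is only an illustration under stated assumptions: the quadratic spline taps stand in for the B-spline initial filters (whose coefficients are not listed here), the lifted filter is taken as g_k = g^o_k - sum_m s_{-m} h^o_{k-m}, boundaries are periodic, and the image, parameter values and function names are hypothetical.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)
# Assumed initial filters (quadratic spline), each with the index of its first tap.
H_OLD, H_FIRST = SQRT2 * np.array([0.125, 0.375, 0.375, 0.125]), -1
G_OLD, G_FIRST = SQRT2 * np.array([-0.5, 0.5]), 0

def correlate_axis(x, taps, first, axis):
    """y[n] = sum_k taps_k * x[n + k] along `axis`, with periodic boundaries."""
    y = np.zeros(x.shape, dtype=float)
    for offset, w in enumerate(taps):
        y += w * np.roll(x, -(first + offset), axis=axis)
    return y

def lifted_highpass(s):
    """g_k = g^o_k - sum_m s_{-m} h^o_{k-m}; the first tap has index -1 - M."""
    M = (len(s) - 1) // 2
    g = -np.convolve(s[::-1], H_OLD)
    g[M + 1:M + 3] += G_OLD
    return g, -1 - M

rng = np.random.default_rng(0)
c0 = rng.standard_normal((32, 32))    # stand-in for a test image
M = 2
s = rng.standard_normal(2 * M + 1)    # stand-in for learned s_m, m = -M..M

# Step 1: components that do not depend on the free parameters.
c_row = correlate_axis(c0, H_OLD, H_FIRST, axis=0)        # as in (6)
d1_old = correlate_axis(c_row, G_OLD, G_FIRST, axis=1)    # (7) with g^o
c1_old = correlate_axis(c_row, H_OLD, H_FIRST, axis=1)    # lowpass in both directions

# Step 2: lifted highpass component from the precomputed ones, as in (16).
d1_fast = d1_old.copy()
for idx, m in enumerate(range(-M, M + 1)):
    d1_fast -= s[idx] * np.roll(c1_old, m, axis=1)        # c1_old[i, j - m]

# Reference: filter directly with the lifted highpass filter.
g, g_first = lifted_highpass(s)
d1_direct = correlate_axis(c_row, g, g_first, axis=1)

max_abs_diff = float(np.max(np.abs(d1_fast - d1_direct)))
```

With the parameter-free arrays in hand, each additional parameter set costs only 2M + 1 shifted additions per pixel, which is why several sets of different lengths can be evaluated cheaply on the same test image.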
Table 1. The values of $\cos\theta^d$ (above) and $\cos\theta^e$ (below).

  n       nose        left eye     right eye      lips
  1     0.99016       0.86358       0.76062      0.98608
        0.99249       0.94574       0.93858      0.98168
  2     0.97906       0.86079       0.66756      0.97437
        0.97991       0.91778       0.82076      0.97009
  3     0.97576       0.88259       0.73522      0.97326
        0.96734       0.90104       0.79914      0.96865
  4     0.98050       0.88503       0.67083      0.97524
        0.96674       0.92188       0.82044      0.97547
  5     0.98355       0.89683       0.71575      0.97343
        0.97319       0.94218       0.81051      0.97809
  6     0.98297       0.90744       0.74607      0.96893
        0.97638       0.94449       0.80767      0.98001
  7     0.98027       0.91253       0.74379      0.96489
        0.97846       0.94255       0.81388      0.98199
  8     0.96677       0.91204       0.73393      0.96164
        0.97991       0.93926       0.80921      0.98324

Detection of facial parts was done following our detection algorithm described in Section 4. Figure 3 shows the result of facial parts detection. We searched a block of $10 \times 10$ size around the point predicted in the learning process for the eyes and lips. The detection time was approximately 2.6 sec on a Pentium 4 (2.26GHz).

                   Fig. 3. Extracted facial parts.

In Fig. 4, we plot $R[i,j]$ around the detected nose, eyes and lips. Figure 4 shows that the values of $R[i,j]$ at the detected points are minimal.

                   Fig. 4. Plots of $R[i,j]$ (curves: nose, left eye, right eye, lips).


                   6. CONCLUSION

We proposed a facial parts detection algorithm using lifting dyadic wavelet filters, which is based on the learning of free parameters in the filters. Since we learned 8 sets of free parameters, our detection method has high accuracy. The detection time of the proposed algorithm is satisfactory, but not short enough to realize an on-line face detection system. Developing an on-line detection system by using an FPGA is left as future work.


                   7. REFERENCES

[1] T. Abdukirim, K. Niijima and S. Takano, Lifting dyadic wavelets, DOI Technical report, No.212, 2002.

[2] M. Oren, C. Papageorgiou, P. Sinha, E. Osuna and T. Poggio, Pedestrian detection using wavelet templates, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.193-199, 1997.

[3] W. Sweldens, The lifting scheme: A custom-design construction of bi-orthogonal wavelets, Appl. Comput. Harmon. Anal., Vol.3, No.2, pp.186-200, 1996.

[4] S. Takano and K. Niijima, Robust lifting wavelet transform for subimage extraction, Proceedings of the SPIE: Wavelet Applications in Signal and Image Processing VIII, Vol.4119, pp.902-910, 2000.

[5] S. Takano and K. Niijima, Extraction of subimage by lifting wavelet filters, IEICE Transaction on Fundamentals, Vol.E83-A, No.8, pp.1559-1565, 2000.

[6] S. Takano and K. Niijima, Subimage extraction by integer-type lifting wavelet transforms, Proceedings of the IEEE International Conference on Image Processing, Vol.2, pp.403-406, 2000.

[7] L. Wiskott, J.M. Fellous, N. Krüger and C. von der Malsburg, Face recognition by elastic bunch graph matching, IEEE Transaction on PAMI, Vol.19, No.7, pp.775-779, 1997.

						