Facial Expression Recognition Using HLAC Features and WPCA


                   Liu Fang, Wang Zhi-liang, Wang Li, Meng Xiu-yan

                              School of Information Engineering,
                    University of Science & Technology Beijing, China

       Abstract. This paper proposes a new facial expression recognition method
       that combines Higher-order Local Autocorrelation (HLAC) features with
       Weighted PCA. HLAC features are computed at each pixel of the face
       image and then integrated with a weight map to obtain a feature
       vector. We select the weights by combining a statistical method with
       psychological theory. Experiments on the “CMU-Pittsburgh AU-Coded Face
       Expression Image Database” show that, compared with PCA, our Weighted PCA
       method improves the recognition rate significantly without increasing
       the computation.

1 Introduction

Facial expression is a visible manifestation of the affective state, cognitive activity,
intention, personality, and psychopathology of a person; it not only expresses our
emotions, but also provides important communicative cues during social interaction.
Psychologists report that facial expression constitutes 55 percent of the effect of a
communicated message, while language and voice constitute 7 percent and 38 percent
respectively [1]. Automatic analysis and recognition of facial expression can
therefore improve human-computer interaction, or even social interaction.
   In this paper, FER is short for automatic recognition of human facial expression
by computer: using a computer to extract facial features automatically and classify
them into six basic expressions [2]. As a principal component of artificial psychology
and affective computing, facial expression recognition has a wide range of
application domains, such as HCI, tele-teaching, tele-medical treatment,
car surveillance, PC games, and affective robots [3].

2 Common steps and content of FER

Automatic systems for facial expression recognition usually take the form of a
sequential configuration of processing blocks, which adheres to a classical pattern
recognition model (see Figure 1). The main blocks are: image acquisition, pre-
processing, feature extraction, classification, and post-processing.
   [Figure 1: Raw data -> Pre-processing -> Feature extraction -> Classification ->
    Post-processing -> Class label]

                  Fig. 1 Flowchart of facial expression recognition system

   The main steps of an automatic expression recognition system are feature
extraction and classification. Feature extraction converts pixel data into a
higher-level representation of the shape, motion, color, texture, and spatial
configuration of the face or its components. Geometric, kinetic, statistical, or
spectral-transform-based features are often used as alternative representations of
the facial expression. A wide range of classifiers, covering parametric as well as
non-parametric techniques, has then been applied to the automatic expression
recognition problem, for example LDA classifiers, ANN classifiers [4],
SVM classifiers, HMM classifiers, and Bayesian network classifiers [5].

3 Higher-order Local Auto-Correlation Features

Higher-order Local Auto-Correlation (HLAC) features, developed by Otsu, are widely
used in many computer vision applications, such as character recognition and face
recognition [6]. They are derived from the higher-order autocorrelation function,
which is defined by the following equation:

$$x(a_1, \ldots, a_N) = \int I(r)\, I(r + a_1) \cdots I(r + a_N)\, dr = \int h(r)\, dr \qquad (1)$$

where $I(r)$ represents an image, $r$ is a reference point in the image, $h(r)$ represents
a local feature defined at pixel $r$, $(a_1, \ldots, a_N)$ is a set of displacements, $N$ is the
order of the autocorrelation function, and $x$ is a primary statistical feature of the image.
We can define an infinite number of higher-order autocorrelation functions by
changing the set of displacements $(a_1, \ldots, a_N)$ and the order $N$. To be useful in
practical applications, the number of functions must be reduced to a reasonably
small number.
   Limiting the order $N$ of the autocorrelation function to at most two ($N = 0, 1, 2$),
restricting the displacements $a_i$ to within one pixel (a 3 × 3 window around $r$), and
taking into account the equivalence of patterns duplicated by translation, the number
of displacement patterns reduces to 35. Fig. 2 shows three types of local displacement
patterns (each is called a mask). Using all the masks, we can compute 35 types of
primary statistical features of an image. These features are called HLAC features.
Fig. 3 shows (a) the original gray image, (b) the binary image computed by
thresholding, and (c) the processed image, which is obtained by adding the three
matrices computed by moving the three masks (shown in Fig. 2) over the binary image.

              (a) horizontal line                (b) curve                   (c) corner
                  Fig. 2 Types of local displacement patterns for HLAC functions

                       (a) gray image (b) binary image (c) processed image
                                       Fig. 3 Types of face image
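
The computation in Eq. (1), restricted to displacements within one pixel, can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in the experiments: the three displacement sets in `MASKS` are assumed shapes for the horizontal line, curve, and corner masks, and the function names are hypothetical.

```python
import numpy as np

# Three illustrative second-order masks, written as displacements (dy, dx)
# relative to the reference pixel r. Their exact shapes are assumptions made
# for this sketch; the full HLAC set contains 35 such patterns.
MASKS = {
    "horizontal_line": [(0, -1), (0, 1)],
    "curve": [(0, -1), (1, 1)],
    "corner": [(0, -1), (-1, 0)],
}


def hlac_map(image, displacements):
    """Local feature h(r) = I(r) * prod_i I(r + a_i) at every interior pixel."""
    img = np.asarray(image, dtype=np.float64)
    rows, cols = img.shape
    # Skip the one-pixel border so that r + a_i never leaves the image.
    h = img[1:rows - 1, 1:cols - 1].copy()
    for dy, dx in displacements:
        h *= img[1 + dy:rows - 1 + dy, 1 + dx:cols - 1 + dx]
    return h


def hlac_feature(image, displacements):
    """Primary statistical feature x: the sum of h(r) over the image, as in Eq. (1)."""
    return float(hlac_map(image, displacements).sum())


def processed_image(binary_image):
    """Add the per-mask maps of the three masks (cf. the processed image in Fig. 3)."""
    return sum(hlac_map(binary_image, d) for d in MASKS.values())
```

For a binary image that contains a single horizontal stroke, only the horizontal-line mask responds, which is the behaviour one would expect from a local shape detector.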

4 Weighted PCA

Since some areas (e.g. the area around the eyes) may contain important features for
the classification of expressions while other areas (e.g. the area around the
forehead) may not, we use the Weighted PCA approach to find the optimal weight map
vector [7].
   Let the training set of feature images be $h_1, h_2, \ldots, h_k$. The average feature
image of the set is defined by

$$\bar{h} = \frac{1}{k} \sum_{i=1}^{k} h_i,$$

the covariance (scatter) matrix of our sample set $H = [h_1, h_2, \ldots, h_k]$ is

$$S = \sum_{i=1}^{k} (h_i - \bar{h})(h_i - \bar{h})^T,$$

and $U = [u_1, u_2, \ldots, u_q]$ contains the eigenvectors corresponding to the $q$ largest
eigenvalues of the scatter matrix. A feature image $x$ can then be transformed into its
eigenface components by a simple operation:

$$y = U^T (x - \bar{h})$$

   Principal Component Analysis (PCA) seeks a projection that best represents the
data in a least-squares sense [8]. It uses the variance criterion $J_1$ to optimize the
matrix $U$, as follows:

$$J_1(U) = \sum_{k=1}^{p} \left\| (\bar{h} + U y_k) - h_k \right\|^2 = \sum_{k=1}^{p} \big((\bar{h} + U y_k) - h_k\big)^T \big((\bar{h} + U y_k) - h_k\big) \qquad (2)$$

where $h_k$ represents the training samples and $y_k$ represents their projections.
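
The PCA step and the criterion of Eq. (2) can be sketched as follows. This is a minimal illustration under stated assumptions: samples are stored as the columns of `H`, the eigenvectors are stored as the columns of `U`, and the function names are hypothetical. For large images the scatter matrix becomes very big, so a practical system would use a more economical decomposition.

```python
import numpy as np


def fit_pca(H, q):
    """H holds one training sample per column (n x p). Returns the mean and U (n x q)."""
    h_bar = H.mean(axis=1, keepdims=True)
    D = H - h_bar
    S = D @ D.T                               # scatter matrix, n x n
    eigvals, eigvecs = np.linalg.eigh(S)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:q]     # indices of the q largest eigenvalues
    return h_bar, eigvecs[:, order]


def project(U, h_bar, x):
    """Eigen-components y = U^T (x - h_bar); x may be one column or many."""
    return U.T @ (x - h_bar)


def j1(U, h_bar, H):
    """Sum of squared reconstruction errors over all training samples, Eq. (2)."""
    Y = project(U, h_bar, H)
    R = (h_bar + U @ Y) - H
    return float(np.sum(R * R))
```

Keeping more eigenvectors can only lower $J_1$, and with $q = n$ the reconstruction is exact up to rounding.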
   In facial expression recognition, it is natural to think that some areas (e.g. the
area around the eyes) may contain important features for the classification of
expressions while other areas (e.g. the area around the forehead) may not. In line
with this idea, we propose a new feature projection method that uses weighted
coefficients in principal component analysis to distinguish the contributions of
different areas to facial expression recognition. We construct the weighted
reconstruction error function of a sample $x_m$:
$$J_2(U', x_m) = \sum_j w_j \big((x_{0j} + (U' y_m)_j) - x_{mj}\big)^2 = \big((x_0 + U' y_m) - x_m\big)^T W \big((x_0 + U' y_m) - x_m\big) \qquad (3)$$

where $y_m$ is the projection of $x_m$, and the weighted coefficient matrix is the diagonal
matrix $W = \mathrm{diag}(w_1, w_2, w_3, \ldots, w_n)$ with $w_1 + w_2 + w_3 + \cdots + w_n = n$. The aim
is then to find the matrix $U'$ that minimizes the variance criterion $J_2(U')$:

$$J_2(U') = \sum_m J_2(U', x_m) = \sum_m (x_m - x_0)^T (U' U'^T - I)\, W\, (U' U'^T - I)(x_m - x_0) \qquad (4)$$

In this way, the basis of weighted PCA is given by the column vectors of the matrix $U'$.
   Of course, we can obtain the matrix $U'$ by an optimization method. In practice,
however, especially in real-time recognition, an approximate method can be used.
Given the weighted coefficient diagonal matrix $W$, the covariance matrix is defined as follows:
$$S = \sum_{k=1}^{p} (x_k - x_0)\, W\, (x_k - x_0)^T \qquad (5)$$

Then we can use the eigenvectors of the matrix $S$ in place of $U'$. The remaining
question is how to decide the weights.
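
The approximate method can be sketched as below. Eq. (5) does not fix the vector and matrix shapes unambiguously, so this sketch assumes one dimensionally consistent reading: each centered sample is scaled by $W^{1/2}$ before the scatter matrix is formed, i.e. $S = \sum_k W^{1/2}(x_k - x_0)(x_k - x_0)^T W^{1/2}$. That reading, the column-wise sample layout, and the function names are assumptions of this illustration, not details given above.

```python
import numpy as np


def weighted_scatter(X, w):
    """X: n x p, one sample per column; w: length-n diagonal of W.

    Assumed reading of Eq. (5): S = sum_k W^(1/2) (x_k - x0)(x_k - x0)^T W^(1/2).
    """
    x0 = X.mean(axis=1, keepdims=True)
    Z = np.sqrt(w)[:, None] * (X - x0)        # W^(1/2) (x_k - x0) for every k
    return x0, Z @ Z.T


def fit_wpca(X, w, q):
    """Eigenvectors of the weighted scatter matrix for the q largest eigenvalues."""
    x0, S = weighted_scatter(X, w)
    eigvals, eigvecs = np.linalg.eigh(S)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:q]
    return x0, eigvecs[:, order]


def j2(U, x0, X, w):
    """Weighted reconstruction error summed over all samples, as in Eq. (4)."""
    n = X.shape[0]
    R = (U @ U.T - np.eye(n)) @ (X - x0)      # residual of every sample
    return float(np.sum(w[:, None] * R * R))
```

With all weights equal to one, the weighted scatter matrix is the ordinary scatter matrix and the procedure reduces to PCA. The function `j2` makes it possible to compare the approximate basis against any other candidate basis under the criterion of Eq. (4).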
   Presently, most computer facial analysis and recognition methods are based on
Ekman’s Facial Action Coding System (FACS) [9], which was proposed in 1978 and is
widely accepted in psychology. It is the most comprehensive method of
coding facial displays. FACS action units are the smallest visibly discriminable
changes in facial display, and combinations of FACS action units can be used to
describe emotion expressions and global distinctions between positive and negative
   In this paper, to analyze and recognize facial expressions, we use the FACS
interpretation and the FACS action units (AUs) to model six basic types of facial
expression: Anger, Disgust, Fear, Happiness, Sadness, and Surprise. In other words,
each facial expression corresponds to a weighted combination of AUs (see Table 1).
   According to Table 1, we selected different weights for the areas around the eyes,
brows, and lips, by the rule of making them correspond to the AU contributions. For
example, for the “Fear” expression, we set the weights of the eye area, the mouth
area, and the other areas to 0.6, 0.3, and 0.1 respectively.
                    Table 1. FACS prototypes for modeled expressions.
     Expression    Created prototype     Detail description
     Anger         AU4+5+7+15+24         Brow Lowerer + Upper Lid Raiser + Lid Tightener + Lip Corner Depressor + Lip Pressor
     Disgust       AU9+10+17             Nose Wrinkler + Upper Lip Raiser + Chin Raiser
     Fear          AU1+2+4+5+7+20+25     Inner Brow Raiser + Outer Brow Raiser + Brow Lowerer + Upper Lid Raiser + Lid Tightener + Lip Stretcher + Lips Part
     Happiness     AU6+12+25             Cheek Raiser + Lip Corner Puller + Lips Part
     Sadness       AU1+4+7+15+17         Inner Brow Raiser + Brow Lowerer + Lid Tightener + Lip Corner Depressor + Chin Raiser
     Surprise      AU1+2+5+25+26         Inner Brow Raiser + Outer Brow Raiser + Upper Lid Raiser + Lips Part + Jaw Drop
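
One way to turn area weights such as 0.6, 0.3, and 0.1 into the diagonal of $W$ is sketched below. It is only an illustration under stated assumptions: the area weight is read as a per-pixel weight, the rectangular boxes are hypothetical stand-ins for detected eye and mouth areas, and the result is rescaled so that the weights sum to the number of pixels $n$.

```python
import numpy as np


def build_weight_map(shape, regions, region_weights, other_weight):
    """Per-pixel weights for an image of the given (rows, cols) shape.

    regions maps an area name to a (top, bottom, left, right) box. The boxes
    are hypothetical; in practice they would come from face-area detection.
    """
    w = np.full(shape, other_weight, dtype=np.float64)
    for name, (top, bottom, left, right) in regions.items():
        w[top:bottom, left:right] = region_weights[name]
    # Rescale so that the weights sum to n, the number of pixels.
    w *= w.size / w.sum()
    return w.ravel()                          # diagonal of W, length n


# Example with the weights chosen for "Fear": eyes 0.6, mouth 0.3, others 0.1.
fear_weights = build_weight_map(
    shape=(10, 10),
    regions={"eyes": (2, 4, 1, 9), "mouth": (6, 8, 3, 7)},
    region_weights={"eyes": 0.6, "mouth": 0.3},
    other_weight=0.1,
)
```

After rescaling, the ratios between the areas are preserved: an eye pixel still counts six times as much as a pixel outside the eye and mouth areas.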

5 Tests and results

We evaluated the performance of our facial expression recognition method on the
CMU-Pittsburgh AU-Coded Face Expression Image Database, which contains
recordings of the facial behavior of 210 adults aged 18 to 50 years; 69% are female
and 31% male; 81% are Caucasian, 13% African, and 6% from other groups. In this
database, image sequences from neutral to target display were
digitized into 640 by 480 or 490 pixel arrays with 8-bit precision for grayscale values.
   In our experiment, we randomly chose 560 images from the CMU database. These
images are classified into 7 basic expressions: neutral (listed as “nature” in
Table 3), anger, disgust, fear, joy, sadness, and surprise. Of the 80 images of each
facial expression, 70% (56 images) are used for training and the remaining 24 for
testing. For the PCA method, we use the whole face images, while for the WPCA method
we extract the local areas of the face by the integral projection method, as shown
in Figure 4. Figure 5 shows the first three Eigeneyes and Eigenmouths obtained.

                      Fig. 4 Expression areas of our sample images
             Fig. 5 The Eigeneyes (top) / Eigenmouths (bottom) in our experiment
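
The integral projection idea can be illustrated with a short sketch. How the projections are turned into eye and mouth areas is not detailed here, so the sketch assumes the common heuristic that the eyes and the mouth are darker than the surrounding skin and therefore appear as minima of the horizontal projection. The function names are hypothetical.

```python
import numpy as np


def integral_projections(gray):
    """Horizontal projection (one sum per row) and vertical projection (one sum per column)."""
    g = np.asarray(gray, dtype=np.float64)
    return g.sum(axis=1), g.sum(axis=0)


def darkest_rows(gray, count):
    """Indices of the `count` rows with the smallest horizontal projection, top to bottom."""
    row_proj, _ = integral_projections(gray)
    return sorted(int(i) for i in np.argsort(row_proj)[:count])


# Synthetic 10 x 10 "face": bright skin with a dark eye band and a dark mouth band.
face = np.full((10, 10), 200.0)
face[3, 2:8] = 20.0    # eye band
face[7, 3:7] = 50.0    # mouth band
candidate_rows = darkest_rows(face, 2)
```

On real images the projections are noisy, so a practical system would smooth them and search for minima within expected bands rather than take the global minima as done here.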

   Our experiment used the PCA, WPCA, and HLAC+WPCA methods to train and test
the system. The recognition rate was obtained by averaging the recognition rates of all
7 types of facial expression. The recognition rate of our HLAC+WPCA method was
83.9%, compared with 71.3% for the PCA method and 78.9% for the WPCA method
(see Table 2). Our hybrid method outperformed the others, and with WPCA we compute
the Eigeneyes and Eigenmouths in place of eigenfaces of the whole face, which
reduces the computation.

                        Table 2. Recognition rates of different methods
                                Method                 Rate(%)
                                 PCA                    71.3
                                WPCA                    78.9
                              HLAC+WPCA                 83.9

   Table 3 shows the recognition rates of our HLAC+WPCA method for the different
facial expressions. The first row gives the recognition results, while
the left column gives the real facial expressions. The table shows that
“joy” and “surprise” gained better recognition rates, because their facial action are

                Table 3. Recognition rates for 7 basic facial expressions (%)
  Results      nature    anger    disgust    fear     joy      sad     surprise
  nature        68.5      7.2      12.7       0        0       11.6       0
  anger          6.1     81.9       2.2       0        0        9.8       0
  disgust       11.1      5.5      79.6       0        0        3.8       0
  fear            0        0        3.8      89.5      0         0       6.7
  joy            7.8       0         0        0       91.2       0        0
  sad             0        0        4.8      13.1      0       82.1       0
  surprise        0        0         0        5.5      0         0      94.5

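
As a consistency check, the overall rate can be recomputed from Table 3 by averaging its diagonal, which is how the recognition rate is defined in this section. The short sketch below does only that arithmetic on the values of the table.

```python
import numpy as np

labels = ["nature", "anger", "disgust", "fear", "joy", "sad", "surprise"]

# Rows: real expression; columns: recognition result (values from Table 3, in %).
confusion = np.array([
    [68.5,  7.2, 12.7,  0.0,  0.0, 11.6,  0.0],
    [ 6.1, 81.9,  2.2,  0.0,  0.0,  9.8,  0.0],
    [11.1,  5.5, 79.6,  0.0,  0.0,  3.8,  0.0],
    [ 0.0,  0.0,  3.8, 89.5,  0.0,  0.0,  6.7],
    [ 7.8,  0.0,  0.0,  0.0, 91.2,  0.0,  0.0],
    [ 0.0,  0.0,  4.8, 13.1,  0.0, 82.1,  0.0],
    [ 0.0,  0.0,  0.0,  5.5,  0.0,  0.0, 94.5],
])

per_class = np.diag(confusion)            # recognition rate of each expression
average_rate = float(per_class.mean())    # average over the 7 expressions
print(round(average_rate, 1))             # 83.9
```

The average of the seven per-class rates is 83.9%, which agrees with the HLAC+WPCA entry in Table 2.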
6 Concluding remarks

This paper described a facial expression recognition method using HLAC features and
Weighted PCA. Our method is a hybrid of two approaches: geometry-based
approaches [10] and statistics-based approaches. We generalized and combined these
approaches from a more unified standpoint.
   We quantified and visualized the relative importance (weights) of facial areas for
the classification of expressions. In addition, we proposed a new way to determine the
weight coefficients, which combines a statistical method with psychological theory.
Furthermore, the proposed method can be used with other types of local features and
with more complicated but powerful classification methods (e.g. neural networks,
support vector machines, kernel-based Fisher discriminant analysis, etc.).
   Given that facial expression and emotion are dynamic processes, in future work we
will add the recognition of subtle and blended facial expressions to our research and
pay more attention to extracting kinetic features and transitory changes, such as
facial wrinkles. To implement human-computer interaction tasks related to real-time
facial expressions, we will also investigate the algorithm’s efficiency and delays.

7 Acknowledgment

This paper is supported by the key lab named Advanced Information Science and
Network Technology of Beijing (No. TDXX0503) and the key foundation of USTB.


1. A. Mehrabian. Communication without words. Psychology Today, 2(4):53–56, 1968.
2. Yang Guoliang, Wang Zhiliang, Ren Jingxia. Facial expression recognition based on
   AdaBoost algorithm. Computer Applications, Vol. 25, No. 4, Apr. 2005.
3. C. L. Lisetti and D. J. Schiano. Automatic Facial Expression Interpretation: Where Human-
   Computer Interaction, Artificial Intelligence and Cognitive Science Intersect. Pragmatics and
   Cognition (Special Issue on Facial Information Processing: A Multidisciplinary Perspective),
   8(1):185–235, 2000.
4. Neil Alldrin, Andrew Smith, Doug Turnbull. Classifying Facial Expression with Radial
   Basis Function Networks, using Gradient Descent and K-means. CSE253, 2003.
5. Ira Cohen, Nicu Sebe, Fabio G. Cozman. Learning Bayesian network classifiers for facial
   expression recognition using both labeled and unlabeled data. CVPR '03, Volume I, p. 595.
6. T. Kurita, N. Otsu, and T. Sato, “A Face Recognition Method Using Higher Order Local
   Autocorrelation and Multivariate Analysis,” in Proc. IAPR Int. Conf. on Pattern
   Recognition, pp. 213–216, 1992.
7. Qiao Yu, Huang Xiyue, Chai Yi. Face recognition based on Weighted PCA. Journal of
   Chongqing University, Vol. 27, No. 3, Mar. 2004.
8. M. Turk and A. Pentland, “Eigenfaces for Recognition,” Journal of Cognitive Neuroscience,
   vol. 3, pp. 71–86, 1991.
9. P. Ekman et al., “Facial Action Coding System,” Consulting Psychologists Press, 1978.
10. Philipp Michel, Rana El Kaliouby, “Real Time Facial Expression Recognition in Video using
   Support Vector Machines,” ICMI’03, November 5-7, 2003.