					LOCAL ALIGNMENT OF GRADIENT FEATURES FOR FACE PHOTO AND FACE

                       SKETCH RECOGNITION



                                 Thesis

                              Submitted to

                    The School of Engineering of the

                     UNIVERSITY OF DAYTON



              In Partial Fulfillment of the Requirements for

                             The Degree of

               Master of Science in Electrical Engineering



                                   By

                            Ann Theja Alex

                              Dayton, Ohio

                            December, 2012
LOCAL ALIGNMENT OF GRADIENT FEATURES FOR FACE PHOTO AND FACE

                            SKETCH RECOGNITION




Name: Alex, Ann Theja


APPROVED BY:




Vijayan K. Asari, Ph.D                       Tarek M. Taha, Ph.D
Advisory Committee Chairman                  Committee Member
Professor                                    Associate Professor
Department of Electrical & Computer          Department of Electrical & Computer
Engineering                                  Engineering




                           Eric J. Balster, Ph.D
                           Committee Member
                           Assistant Professor
                           Department of Electrical & Computer
                           Engineering




John G. Weber, Ph.D                          Tony E. Saliba, Ph.D
Associate Dean                               Dean, School of Engineering
School of Engineering                        & Wilke Distinguished Professor


                                      ABSTRACT



LOCAL ALIGNMENT OF GRADIENT FEATURES FOR FACE PHOTO AND FACE

                               SKETCH RECOGNITION



Name: Alex, Ann Theja
University of Dayton

Advisor: Asari, Vijayan K.


       Automatic recognition of human faces (face photo recognition) irrespective of the

expression variations and occlusions is a challenging problem. In the proposed technique,

the edges of a face are identified, and a feature string is created from edge pixels. This

forms a symbolic descriptor corresponding to the edge image, referred to as an 'edge-string'.

The 'edge-strings' are then compared using the Smith-Waterman algorithm to match

them. The class corresponding to each image is identified based on the number of string

primitives that match. This method needs only a single training image per class. The

proposed technique is also applicable to face sketch recognition. In face sketch

recognition, a sketch drawn based on the descriptions of the victims or witnesses is

compared against the photos in the mug shot database to facilitate a faster investigation.

The effectiveness of the proposed method is compared with state-of-the-art algorithms on

several databases. The method is observed to give promising results for both face photo

recognition and face sketch recognition.

                Dedicated to
    My precious gifts from almighty God-
My husband, parents, siblings and grandparents




                               ACKNOWLEDGEMENTS


       Over the past two and a half years I have received support and encouragement

from many wonderful individuals. I would not have finished this thesis without the

support from my advisor, my friends and my family.


       Dr. Vijayan K. Asari has inspired me with his continuing support. He has been a

great mentor, a patient guide and an inspiring teacher without whom this thesis would

have been impossible. He taught me the greatest lesson in life: 'to take every stumbling

block as an opportunity to improve'. I would like to thank all my professors at the

University of Dayton who prepared me with the courses that were necessary for

understanding literature related to my thesis. My sincere thanks go to my committee

members Dr. Tarek M. Taha and Dr. Eric Balster, who spared time for reading and

commenting on this thesis.


       I appreciate my colleagues and friends in Vision Lab whose suggestions helped

me throughout the phases of algorithm development. My sincere thanks to Saibabu

Arigela and Binu M. Nair who helped me a lot with setting up the defense.


       My parents, grandparents and sisters have been supporting me right from my

childhood. I extend my gratitude to them for encouraging me continuously. I extend

special gratitude to my grandfather, V.A. Kuriakose, for opening the world of engineering

to me when I was a toddler and for inspiring me to be a great engineer.

       I would say: 'Behind every successful wife there is a loving and understanding

husband'. This thesis would have been impossible without my wonderful husband Alex.

He inspired me every day to pull through this journey. He makes my days sunny even

when things go wrong. His parents' and sister's love and continuous encouragement

made me comfortable with research even after our marriage.


       Last but not least, I would like to thank the omnipresent God for answering

my prayers and giving me the strength; without him I would be nothing. I believe "It was

all God's doing and it is marvelous in our eyes."




                                                TABLE OF CONTENTS



ABSTRACT ....................................................................................................................... iii

DEDICATION ....................................................................................................................... iv

ACKNOWLEDGEMENTS ................................................................................................ v

LIST OF ILLUSTRATIONS .............................................................................................. x

LIST OF TABLES ........................................................................................................... xiii

LIST OF ABBREVIATIONS AND NOTATIONS ........................................................ xiv

CHAPTER I INTRODUCTION ......................................................................................... 1

1.1.          Contributions of this Thesis ................................................................................ 5

1.2.          Thesis Organization ............................................................................................ 6

CHAPTER II BACKGROUND ......................................................................................... 7

2.1        Face Perception ....................................................................................................... 7

   Is Face photo/Face sketch recognition a completely dedicated process in human

   perception? ...................................................................................................................... 9

   Is Face photo/Face sketch recognition the result of a holistic analysis or a feature based

   analysis? .......................................................................................................................... 9

   Significance of the facial features ................................................................................. 10

   Effect of high and low frequency components ............................................................. 10

   Role of race/gender ....................................................................................................... 10

   Human memory is poor for faces that are not frequently seen ..................................... 11

2.2       History of Face Photo/Face Sketch Recognition .................................................. 11

   Face detection ............................................................................................................... 14

   Face feature extraction .................................................................................................. 17

   Face recognition/face classification .............................................................................. 20

CHAPTER III PATTERN MATCHING BY LOCAL ALIGNMENT OF GRADIENT

FEATURES ...................................................................................................................... 24

3.1       Algorithm Architecture ......................................................................................... 26

3.2       Edge Feature Extraction ........................................................................................ 27

   Edge detection ............................................................................................................... 27

    Gaussian smoothing ..................................................................................................... 27

    Gradient filtering .......................................................................................................... 27

    Non-maxima suppression ............................................................................................. 28

    Adaptive threshold computation .................................................................................. 28

    Hysteresis thresholding ................................................................................................ 29

   Reference line determination ........................................................................................ 30

   Edge string generation .................................................................................................. 33

3.3       Edge Feature Matching ......................................................................................... 35

   Local string alignment .................................................................................................. 35

   Percentage similarity computation ................................................................................ 37

3.4       Local String Alignment Example with Real Image Patterns ................................ 38

3.5       Local Alignment Match Procedure vs. Global Alignment Match Procedure ....... 39

CHAPTER IV FACE PHOTO RECOGNITION ............................................................. 41

4.1       Training and Testing Procedure for Face Photo Recognition Applications ......... 42



4.2       Experiments on Expression Invariant Face Recognition ...................................... 43

   Experiments on Yale database ...................................................................................... 44

   Experiments on the Japanese Female Face Expression (JAFFE) database .................. 46

   Experiments on the CMU AMP Expression database .................................................. 47

4.3       Experiments on Occlusion Invariant Face Recognition........................................ 48

   Experiments on the LFW (Labeled Faces in the Wild) database .................................. 48

   Experiments on the AR database .................................................................................. 49

CHAPTER V FACE SKETCH RECOGNITION ............................................................ 50

5.1       Training and Testing Procedure for Face Sketch Recognition Applications ........ 52

5.2       Synthesis Based Methods vs. Proposed Method: Complexity Comparison ............ 53

5.3       Experiments on the Sketch Databases .................................................................. 55

   Experiments on the CUHK database ............................................................................ 55

   Experiments on the AR database .................................................................................. 56

CHAPTER VI CONCLUSION AND FUTURE WORK ................................................. 58

LIST OF PUBLICATIONS .............................................................................................. 61

BIBLIOGRAPHY ............................................................................................................. 63




                                             LIST OF ILLUSTRATIONS



Fig 1.1 Face sketch recognition problem- images from CUHK database [2]. .................... 4

Fig 1.2 Occlusion invariant face recognition problem- images from AR database [3]. ..... 4

Fig 1.3 Expression invariant face recognition problem- images from JAFFE database [4].

     ..................................................................................................................................... 5

Fig 1.4 Expression invariant face recognition problem- images from Yale database [5]... 5

Fig 2.1 The Thatcher effect was named after the British Prime Minister Margaret

     Thatcher on whose photo this famous effect was first demonstrated by Peter

     Thompson in 1980. This is a phenomenon where it becomes difficult to detect local

     feature changes in an upside down face while identical changes are obvious in an

     upright face. ................................................................................................................ 8

Fig 2.2 Typical steps involved in a Face Recognition System. In all face recognition

     systems, the first step is a face detection step where the face regions are detected.

     The second is the feature extraction step that extracts relevant features. These

     features are used by the face recognition step for classification. .............................. 12

Fig 2.3 Face Photo Authentication and Face Sketch Authentication problems are shown

     above. ........................................................................................................................ 13

Fig 2.4 Face Photo Identification and Face Sketch Identification problems are shown

     above. ........................................................................................................................ 14



Fig 3.1 Algorithm architecture of the proposed approach. ............................................... 26

Fig 3.2 Edge detection algorithm. ..................................................................................... 28

Fig 3.3 A sample image and the outputs corresponding to each of the steps in the Canny

       edge detection algorithm are shown above. Original Image, Gaussian Smoothed

       Image, Gradient Filtered Image, Non-maxima suppression output and the Canny

       edge detection output image are shown in the respective order in this image. ......... 30

Fig 3.4 This shows the drawback of using a fixed line of reference. The reference line is

       the same even when the face rotates in-plane. This results in a wrong angle

       calculation, as the angle is always computed with reference to the 'reference line'. 30

Fig 3.5 Iris detection. ........................................................................................................ 31

Fig 3.6 CHT illustration. ................................................................................................... 33

Fig 3.7 Reference line determination using the procedure shown in fig 3.5. ................... 33

Fig 3.8 Sample image and the corresponding edge image with the δ and θ values shown.

       The reference line for the angle calculation is the horizontal line passing through the

       origin. This is the case when a fixed reference line is used. ..................................... 34

Fig 3.9 The two sample patterns which are considered for illustrating the local string

       alignment algorithm mechanism. .............................................................................. 38

Fig 3.10 The Ψ matrix before and after backtracking. ..................................................... 38

Fig 3.11 Ψ matrix and resulting alignment. ...................................................................... 39

Fig 3.12 Sample images from the CMU AMP Expression database [77]. ....................... 40

Fig 4.1 Flow diagram of test procedure for face identification. ....................................... 44

Fig 4.2 Performance comparison on Yale Expression database: Our method vs other

       methods (Hu and Wang, 2006) [78]. ........................................................................ 46



Fig 4.3 Performance comparison on a subset of Yale Expression database: Our method vs

      other methods (Ekenel and Stiefelhagen, 2005) [79]................................................ 46

Fig 4.4 Performance comparison on JAFFE database [80][81] (Wang and Ruan, 2010). 47

Fig 4.5 Sample images from the LFW database with manually added occlusion. ........... 48

Fig 5.1 Flow diagram of test procedure for face sketch recognition. ............................... 53

Fig 5.2 Sample images from the CUHK database. ........................................................... 55

Fig 5.3 Rank curve for CUHK student dataset. ................................................................ 55

Fig 5.4 Sample images from AR database and corresponding sketches provided by the

      CUHK group. ............................................................................................................ 56




                                        LIST OF TABLES



Table 4.1 Proposed method vs. error rates in [55] (Gao and Leung, 2002)...................... 45

Table 4.2 Comparison of recognition rates on CMU AMP Expression database. ........... 48

Table 4.3 Comparison of recognition rates on AR database. ........................................... 49

Table 5.1 Comparison of recognition rates on AR database. ........................................... 56




           LIST OF ABBREVIATIONS AND NOTATIONS


CUHK – Chinese University of Hong Kong


JAFFE – Japanese Female Face Expression


SIFT – Scale Invariant Feature Transform


PCA – Principal Component Analysis


SVM – Support Vector Machine


SNoW – Sparse Network of Winnows


LDA – Linear Discriminant Analysis


ICA – Independent Component Analysis


SURF – Speeded Up Robust Features


HoG – Histogram of Gradients


LBP – Local Binary Patterns


CHT – Circular Hough Transform


LFW – Labeled Faces in the Wild


GPU – Graphics Processing Unit



                                       CHAPTER I

                                    INTRODUCTION




       Automatic face photo and face sketch recognition are two rapidly growing areas.

The first application has already entered the commercial world to some extent. The

ultimate aim of face recognition and face sketch recognition systems is to enable fast

identity matching in the domains of surveillance, authentication and investigations. Some

of the prominent applications are in entertainment, authentication, information security

and law enforcement sectors. Entertainment applications include virtual reality, video

gaming and console based gaming systems. The techniques can also be used for

validating identity proofs. These include passports, driving licenses, state IDs or other

identity cards used in workplaces. The mechanism can be used to authenticate users and

ensure a double level of security without any intrusion into the privacy of the individual.

It also has several applications in information security. A good biometric measure can be

used for database encryption, file security, internet security and secure login to high

security computers. Another application is in law enforcement. Face sketch recognition is

of primary importance in criminal investigation and forensic applications. These techniques are used

in surveillance systems for post event analysis and suspect tracking. Such a recognition

system can also be used by banks to make ATM use more secure. A perfect biometric

authentication system that can recognize faces with extreme accuracy can effectively

replace the use of PIN numbers in ATM transactions or can be used as a

secondary security measure along with the PIN number. They can also be used to ensure

secure access to buildings.

       Biometric systems can be categorized into invasive and noninvasive systems

based on the extent of co-operation required from the individuals involved. High-

accuracy biometric authentication systems are available to enable iris recognition and

finger print analysis which can be used for the applications discussed earlier. However,

such authentication mechanisms are invasive and therefore require user permission and

active participation. For example, iris recognition requires the subject's eye to be in close

proximity to the IR camera. On the other hand, the frontal face and the profile face of a

person can be easily captured without active involvement of the person. For this reason,

they are considered noninvasive.

       An ideal face recognition system should be able to capture face images and do the

relevant tasks of recognition/authentication or tracking based on the purpose for which

the system is employed. Face recognition and face sketch recognition are fields that are

making considerable progress. Although face recognition has moved into the application

sector, the systems are not flawless. Challenges in this field can be categorized into

extrinsic variations (pose, lighting and occlusions) and intrinsic variations (expression

and age variations). One of the intrinsic challenges in the field of face recognition is the

change in expressions of the individuals. This makes the captured images different from

those in the verification database. Another factor is the presence of occlusions. If the

person happens to wear a scarf or sunglasses, then the captured image becomes different

from the one that is already in the database. Yet another factor is aging. A face image



taken now will be considerably different from that taken 10 years later. So the biometric

systems need some mechanism to update the stored templates periodically. Pose

variations make face recognition even tougher to achieve. The frontal face and the profile

face of the same person can differ more from each other than the frontal faces of two different

people do. Solutions to the pose variation problem either take templates for various poses as

input or find features that are pose invariant. An ideal face recognition system should be

pose, expression, occlusion and aging invariant. However, to achieve all these together is

next to impossible. So algorithms typically try to focus on a couple of challenges

together and solve them before considering the other challenges [1]. The most effective

systems today do not give holistic solutions. Face recognition works perfectly well if

images have no expression, pose, or age variations and there are no occlusions.

       Face sketch recognition is also a very tough research problem. It is a relatively young

area compared to face recognition. The major

challenge in face sketch recognition is that a sketch is drawn based on the description of

the victims or the witnesses. In such a scenario there are considerable differences

between the actual face and the sketch of a convict. This makes face sketch recognition a

very difficult task. On the other hand, face sketch recognition can be used as an

alternative to face recognition where a machine generated sketch is used for the

recognition task. Though a relatively new field, face sketch recognition is a very useful

application which can be used for criminal investigations. Face sketch recognition makes

search over a mug shot database easier. In this work, our focus is on both face

recognition and face sketch recognition. We have devised a method which can be used

for effectively solving both these problems.



       Fig 1.1 Face sketch recognition problem- images from CUHK database [2].




  Fig 1.2 Occlusion invariant face recognition problem- images from AR database [3].


       Examples of face images where our method can be applied for recognition are

shown in figures 1.1, 1.2, 1.3 and 1.4. We solve the face recognition problem handling both

the expression and occlusion variations. The second problem handled by our method is

the face sketch recognition problem. The method is useful for applications such as




damaged photograph recognition for recognizing faces in damaged identity cards, face

authentication/identification and convict recognition in criminal investigations.




Fig 1.3 Expression invariant face recognition problem- images from JAFFE database [4].




 Fig 1.4 Expression invariant face recognition problem- images from Yale database [5].


1.1.   Contributions of this Thesis

    Most of the databases currently have the face images detected and normalized. For

the same reason it is assumed that the face images are detected and normalized prior to

being used as input for the proposed system. This thesis contributes to the feature

extraction and recognition subtasks of a face photo/face sketch recognition system. In

addition, the method can also be used in conjunction with face detection algorithms if the

faces are not readily available.

    Listed below are the main contributions of this thesis:

       1. A unified framework that uses the edge features for face photo and face sketch

           recognition is developed.

       2. We have developed a novel method for face photo/ face sketch recognition

           using the local string alignment concept which was originally developed for

           genome and protein sequence matching.

       3. A modified approach is incorporated into the algorithm that enables in-plane

           rotation invariant face photo and face sketch recognition.

       4. The method is a unique framework that can handle both expression variation

           and occlusions in face photos.

       5. The method is tested on different face photo databases and face sketch

           databases and is compared with similar methods to prove its effectiveness.

       6. In face sketch recognition this method is unique since it allows recognition

           across modalities and does not require any synthesis steps.

           The proposed method is inherently parallelizable and hence is capable of

       making accurate real time recognition a reality.



1.2.   Thesis Organization

    The thesis is organized as follows. Chapter 2 provides a brief background study of the

topics that are covered in this thesis. Chapter 3 defines the proposed method. In Chapter 4

the applications of the proposed method in face photo recognition are discussed.

Applications of the proposed method for face sketch recognition are discussed in Chapter

5. Finally we conclude our findings and describe the future work in Chapter 6.




                                        CHAPTER II

                                    BACKGROUND




   Face photo recognition and face sketch recognition were originally treated as

cognitive psychology problems to analyze and understand the human cognitive process

behind them. So an analysis of these two topics requires a brief understanding of their

theoretical background. In this chapter, a brief history of face perception and face photo/

face sketch recognition is discussed.



2.1 Face Perception

   Face perception is an important feature of the human cognitive system. Earliest works

in face recognition were in psychology, done in the 1950s [6]. The earliest work on face

recognition as an engineering problem dates back to the 1960s [7]. The research on automatic

face recognition was first initiated by Kanade in 1973 [8]. An early work in the field of

automatic face sketch recognition dates back to 2002 [9].

   In the field of computer vision we apply pattern recognition algorithms and try to

imitate the same cognitive intricacy associated with face perception. The biggest

challenge faced by a computer vision scientist who tries to solve the problem of face

photo/ face sketch recognition is the task of replicating the human visual system and the

cognitive behavior involved.

Fig 2.1 The Thatcher effect was named after the British Prime Minister Margaret
Thatcher on whose photo this famous effect was first demonstrated by Peter Thompson in
1980. This is a phenomenon where it becomes difficult to detect local feature changes in
an upside down face while identical changes are obvious in an upright face.

   The human visual system is the primary means of sensing in most human beings. The

visual system transmits light waves detected by the eyes, to the visual cortex. Once the

information reaches the visual cortex, it is transmitted through the dorsal and ventral

streams of the brain. The ventral stream is shown to perform most of the object/face

photo/face sketch recognition tasks. There is strong evidence that shows that the

appearance of the face enters the ventral stream and the task of identifying the face is

performed by the fusiform area. The famous Thatcher effect [10] shown in figure 2.1,

proved that there is a dedicated area in the brain associated with face recognition.

   Any attempt to solve even simple face photo/ face sketch recognition problems

proves to be a challenging venture. In effect, we are primarily trying to replicate the


behavior of a portion of the human brain. It is not only associated with the visual system,

but also with the brain. For the same reason, a better understanding of face perception

helps in implementing such a system. Some of the cognitive facts associated with face

recognition as performed by the human brain are as follows [10][11][12].

Is Face photo/Face sketch recognition a completely dedicated process in human
perception?

   The proof of existence of a dedicated process associated with recognition is the

Thatcher effect shown above in figure 2.1. Faces are better remembered when presented

in an upright manner than in any other orientation. It is also noted that infants are 'pre-wired'

to be attracted to faces. Prosopagnosia patients cannot recognize previously familiar

individuals from their face. But they can identify the same people based on their voice,

hair color, dress and other features. On the other hand, they can recognize individual

features such as the nose, eyes and mouth [13]. All these prove the existence of a dedicated

area in the brain associated with face perception.

Is Face photo/Face sketch recognition the result of a holistic analysis or a feature
based analysis?

   Both holistic and feature based analysis are equally important. A holistic analysis

helps recognize face regions and the features help identify individual faces. If the features

appear to be more prominent, such as a big ear or a long nose, then the features are

more important compared to the holistic face. This is the key reason for caricatures to be

easily recognizable. A caricature is defined by Perkins as 'a symbol that exaggerates

measurements relative to any measure that varies from person to person' [14]. For

example, the length of nose varies from person to person. Such prominent features are




exaggerated in caricatures, making them easy to recognize as those associated with a

specific face. Thus caricatures are good enough for us to recognize an individual [15].

Significance of the facial features

   The nose is considered a less useful feature compared to other features on the face.

However, the nose is a key feature in case of profile face recognition. The upper part of

the face is said to have more information than the lower part. It is also noted that

faces are recognized based on attractiveness: more attractive faces are recognized

faster than less attractive ones. Faces with distinctive features (atypical faces) are

recognized faster [15].

Effect of high and low frequency components

   Low frequency components are found more useful in giving a global description. For

example, gender identification is easy with only low frequency components. On the

other hand, high frequency components are crucial to the identification task. Therefore,

all the frequency components play a significant role in face photo / face sketch

recognition [16].

Role of race/gender

   It is noted that humans easily identify faces from their own race. This is because

the brain is thought to create an average face based on which other faces are encoded.

However, the average faces are different in different races. So people tend to easily

identify individuals from the same race as that to which they belong. It has also been

noted that in the Japanese population, the majority of female facial features are

heterogeneous, as opposed to male features, making it easier to identify females [17].




Human memory is poor for faces that are not frequently seen

   The human memory is not capable of effectively memorizing the faces that are not

familiar. This makes face sketching in criminal investigations a very challenging task.

Most often, the descriptions given by the witnesses recalling from memory may not be

correct, leading to a less accurate sketch rendering. Very familiar faces can easily be

recognized despite any degradation in the image information. The gait and behavioral

traits of the target individual also help fast recognition. An experiment conducted by

Burton et al. in 1999 showed that even when the gait and body are hidden, fast

recognition is possible. They also showed that despite an individual being familiar, if the

face is occluded then the recognition accuracy is very low. The aspects of familiarity that

contribute to fast encoding and recognition of faces are not well established [18] [19].

   There are many psychological aspects associated with face photo and face sketch

recognition that are still intriguing to the psychology community exploring these areas.

The understanding of face perception from a psychological perspective definitely helps in

replicating the same behavior in a computer algorithm [10] [11] [12].



2.2 History of Face Photo/Face Sketch Recognition

   A typical face photo/face sketch recognition system involves face detection, face

feature extraction and face recognition. The steps involved are shown in figure 2.2. Based

on the purpose of the system, the input and the output may differ. Some algorithms solve

face recognition as a single problem and may not have such a sub-task based solution.




Fig 2.2 Typical steps involved in a Face Recognition System. In all face recognition
systems, the first step is a face detection step where the face regions are detected. The
second is the feature extraction step that extracts relevant features. These features are
used by the face recognition step for classification.


   There are typically two types of applications for face photo and face sketch

recognition. One of the applications is verification or authentication of face images where

a one to one match of the input photo or sketch is performed. This is shown in figure 2.3.




Fig 2.3 Face Photo Authentication and Face Sketch Authentication problems are shown
above.

   The other application is identification or recognition of face images which involves a

one to many match. This is shown in figure 2.4. In the following subsections, the

emerging technologies in each of the sub tasks involved with face photo and face sketch

recognition are discussed. The research in each sub-problem is equally important.

Isolating the subtasks makes it easier to focus on specific aspects of face photo/ face

sketch recognition [20].




Fig 2.4 Face Photo Identification and Face Sketch Identification problems are shown
above.

Face detection

   A conventional computer vision based system for face photo / face sketch recognition

requires a detection step. Most of the databases have inputs that are normalized. They

have a standard format which eliminates the need for any detection step. For example, the

police mug shot database has the images with the faces of people involved in previous

criminal reports. These images are passport-style photographs and do not require a detection

step. The detection step is necessary in video surveillance and face tracking where a

particular face has to be monitored in a cluttered background. Face detection involves

several challenges, such as pose variations, feature occlusions, expression variations and

lighting conditions. Face detection algorithms are classified into four categories [21]. The

categorization may overlap and a particular algorithm may fall in more than one category

[22]. They are

   Knowledge based methods – where rules are used to encode the human face. They

translate our knowledge of the human face into a set of rules. Some simple rules are that eyes

appear above the nose when the face is upright and that the eye areas are darker than the

cheek. Han, Liao, Yu and Chen in 1997 formulated an algorithm which was a knowledge

based approach [23]. It is very hard to formulate an effective set of rules. Hence more

often the rule based methods are combined with other methods to ensure better

recognition rates. Some methods use color based rules to detect the face regions.

However, color information may vary depending on the lighting condition.

   Feature invariant methods – that tend to find features that are invariant across faces

and different poses. Scale Invariant Feature Transform (SIFT) is one such algorithm [24].

In most of the cases, the feature based methods can directly provide a recognition result

without a separate detection step.

   Template matching based methods – where a template is matched against the test

image to see if they match. Faces are detected based on whether the templates are

matched. The templates include face contours, edge information or silhouette of faces.

However, they are not very good solutions because even an expression variation may

bring considerable deformation to the face. Such deformations cannot be captured by templates.

Some researchers use deformable templates to solve this problem.

   Appearance based methods – which are modifications of the template matching

method. Some appearance based methods treat the face image or the feature vector as a

random variable and find the probability for a face or a non-face based on statistical

information. Another approach is to define a function that distinguishes face and non-face

regions.

   Some common algorithms that are used for face detection are described below. Some

of these methods are also used for feature extraction and face recognition.



   PCA based methods - Sirovich and Kirby developed a PCA (Principal Component

Analysis) based method to differentiate face and non-face regions [25][26]. In their

method, faces are represented as points in a special coordinate system. Turk and Pentland

modified this approach to define the Eigen-face method [27].

   Neural Network based methods- In such methods, a large training set of face

regions is used to train a neural network. The network is also trained with non-face

regions. Based on the output of the trained network, any input image can be classified as

a face or non-face. The major disadvantage of using neural network based methods is the

large convergence time associated with them [28].

   Support Vector Machines (SVM)- They model a hyper-plane in the n-dimensional

space where the face and non-face classification is carried out. The hyper-plane margins

are maximized to avoid false detections. SVM was first used by Osuna et al. [29] for face

detection.

   Sparse Network of Winnows (SNoW)- is another neural network based approach

which was introduced by Yang et al. for face detection [30]. The system is trained with

positive and negative samples. The method is less time consuming than many other

systems.

   Inductive Learning – This is another approach where a rule is used to determine if

the presented region is a face region or a non-face region. The most common algorithm

used is the C4.5 algorithm for decision tree learning put forth by Quinlan in [31].

   Viola-Jones Algorithm- This is the most famous and most widely used algorithm for

face detection. The availability of the algorithm in Open CV libraries has made this the

first choice for face detection implementations [32][33]. The real time capability and the



high accuracy make this algorithm a prominent presence in face detection

implementations.
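
   For illustration, a minimal usage sketch of this detector through OpenCV's Python
bindings is given below; the bundled cascade file and the detection parameters are
common defaults we assume for the example, not values taken from this thesis.

    import cv2

    # Load the pre-trained frontal-face Haar cascade shipped with OpenCV.
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

    img = cv2.imread('input.jpg')                  # hypothetical input image
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)   # detector works on grayscale
    # Returns a list of (x, y, w, h) rectangles, one per detected face.
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)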

   The accuracy of the two other steps (feature extraction and face classification)

depends on the accuracy of the face detection step.

Face feature extraction

   Human beings can recognize faces even as kids. Anyone can easily recognize his or

her grandparents' wedding photo even if they have changed their appearance over the

years. Though these appear to be simple tasks for human beings, they are challenging

tasks for a computer to perform. The accuracy of these tasks depends on what

information the human brain extracts from the images. Feature extraction is the process

of extracting relevant information from the images. The accuracy of this step determines

the classification or recognition accuracy. Feature extraction involves dimensionality

reduction, feature extraction and feature selection. However, there are no clear-cut

boundaries between these sub-tasks. The feature extraction step replaces an image with a

set of features. All inputs have a corresponding feature set associated with them. These

features are the inputs to the classifier [21]. However, the feature extraction and

recognition are two subtasks which are very hard to isolate in most cases. For the same

reason, there are some overlaps in methods described in this section and the section on

face recognition.

   Some common methods used for feature extraction are described below. They include

PCA (Principal Component Analysis) which forms a linear mapping of the image to a

subspace. Face images are projected into a feature space called face space. The face is

encoded using a set of Eigen vectors called the Eigen faces. PCA is one of the first


methods that was used for face feature extraction. Another approach called modular PCA

was introduced later, where the image is divided into sub-regions and each region is

projected into specific face sub-spaces. Modular PCA provided more accurate results

than PCA [34].
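
   As a concrete illustration of the projection described above, the following minimal
sketch (Python with NumPy) computes a set of Eigen faces and projects a face into the
face space; the data layout and the use of SVD are our own illustrative assumptions,
not the exact procedure of [25][26][27].

    import numpy as np

    def eigenfaces(X, k):
        # X: (n_images, n_pixels) matrix of vectorized training faces (assumed layout)
        mean_face = X.mean(axis=0)
        A = X - mean_face                       # center the training data
        # Rows of Vt are the principal directions, i.e. the Eigen faces.
        _, _, Vt = np.linalg.svd(A, full_matrices=False)
        return mean_face, Vt[:k]                # keep the k leading Eigen faces

    def project(face, mean_face, eig_faces):
        # Coordinates of a vectorized face in the k-dimensional face space.
        return eig_faces @ (face - mean_face)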

Kernel PCA which is Eigen vector based forms a nonlinear mapping to a subspace. In

simple terms, this is a nonlinear form of PCA. It uses Integral Operator kernel functions

to compute the principal components in the higher dimensional space. The points in this

space are related to the input space in a nonlinear manner. This nonlinear mapping forms

a better feature when compared to the linear mapping provided by PCA [35].

Weighted PCA which is a weighted variant of PCA [36]. It offers a better recognition

result than PCA.

LDA (Linear Discriminant Analysis) which is Eigen vector based and forms a

supervised linear map [37]. It uses Fisher faces instead of Eigen faces for representing the

face images in the subspace defined using LDA. Several modified variants of LDA have

been introduced since LDA was first proposed. Most of the variants perform better

than the conventional LDA.

Kernel LDA which is LDA with kernel methods [38]. This is similar to kernel PCA.

Face image data distribution is highly complex. They form complex manifolds in higher

dimensional space. To effectively represent these manifolds, a nonlinear mapping is

required. The Kernel LDA is a variant of LDA that offers a nonlinear mapping. It is

observed to perform better than LDA and Kernel PCA.




ICA (Independent Component Analysis) which forms a linear map [39]. This separates

multivariate signals into components, based on the assumption that non-Gaussian signals are

statistically independent.

Neural Network based methods that encompass a very large collection of algorithms

that use neural networks [40]. They include the Multilayer Neural Networks based on

error back propagation. Another popular architecture is Self Organizing Maps. The

Nonlinear attractor is also a very promising Neural Network based nonlinear mapping

architecture. They help model complex manifolds using Neural Networks. They are built

based on the basic Hopfield Network architecture.

Active Shape Models that search boundaries using statistical methods [41]. This is a

statistical model which involves two steps. They are the model construction step and the

searching step. A set of feature points are identified in all images. The same feature

points are marked on all the faces. The shape is represented using a shape vector. The

mean shape and the new profiles are compared using Mahalanobis distance for

recognition.

Graph Models that use a graphical structure as a feature for recognition task [42]. In

some methods, probabilistic graphical models are used in conjunction with the feature

set. The nodes in these graphs represent the conditional dependencies between the feature

sets. These graphs are used to classify individuals.

SIFT features (Scale Invariant Feature Transform) which was introduced by David

Lowe in 2004 for object recognition [24]. The features are also useful for face

recognition [43]. This method captures the grey level features of the image by a scale

space decomposition of the image.



SURF (Speeded Up Robust Features), which is another feature extraction technique

used for face recognition [44] [45]. It relies on integral images. The detector is based on

the Hessian matrix. SURF is partially inspired by SIFT. It is much faster than SIFT.

HoG (Histogram of Gradients) which is a method that was proposed in 2005 for object

recognition [46]. The method describes shapes using the distribution of intensity

gradients or edge directions. The method was found useful for face recognition later [47].

It performs better than many other conventional feature extraction techniques.

LBP (Local Binary Patterns) which is perhaps the most effective of all the available

features so far [48]. The feature was initially defined for texture analysis. Later it was

found that the same features were useful for face recognition and resulted in higher

accuracy than other methods [49]. The LBP feature creates a binary number for each

local neighborhood. LBP features are later classified using statistical measures such as

Chi-Square distance or Log likelihood statistic.
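
A minimal sketch of the basic 3x3 LBP operator and the Chi-square comparison described
above is given below (Python with NumPy); the neighbor ordering is one common
convention and not necessarily the exact variant of [48][49].

    import numpy as np

    def lbp_3x3(img):
        # img: 2-D grayscale array; returns the 8-bit LBP code image.
        c = img[1:-1, 1:-1]                     # center value at each pixel
        # Eight neighbors, scanned clockwise from the top-left corner.
        shifts = [(-1,-1), (-1,0), (-1,1), (0,1), (1,1), (1,0), (1,-1), (0,-1)]
        code = np.zeros_like(c, dtype=np.uint8)
        for bit, (dy, dx) in enumerate(shifts):
            nb = img[1+dy:img.shape[0]-1+dy, 1+dx:img.shape[1]-1+dx]
            code |= (nb >= c).astype(np.uint8) << bit   # one bit per neighbor
        return code

    def chi_square(h1, h2, eps=1e-10):
        # Chi-square distance between two LBP histograms.
        return np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))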

Once the features are extracted, a feature selection mechanism is used to find the most

relevant features. This is a computationally costly step. The feature selection process is

an NP-hard problem. One of the solutions is to use brute force search, which is very

time consuming. Another approach is the branch and bound approach whose time

complexity is O(2^n). Yet another approach is to select the best individual features. This is

not an effective approach. Many more feature selection methods are being introduced

these days [50].

Face recognition/face classification

   Once the features are extracted and selected, the next step involved is the

classification step, where the face is identified as that of a specific person. As mentioned


earlier, the classification task can be either to authenticate or identify an individual.

Classification algorithms can be of three types: supervised algorithms, semi-supervised

algorithms and unsupervised algorithms.

Supervised learning – In this approach, all the examples are tagged.

Semi-supervised learning – In this approach, all the examples are not tagged. There is

only a small tagged set. These techniques are used when getting more tagged examples is

not possible.

Unsupervised learning – In this approach, there are no tagged examples. All the

examples are untagged. Such systems learn on the fly.

Most of the face recognition systems are supervised or semi-supervised systems.

        The classifiers can be further categorized based on how they perform classification. They are

similarity based classifiers, probability based classifiers and decision boundary based

classifiers.

Similarity based classifiers are those that find the most similar stored pattern to that of

an input pattern. Some of the common methods that fall into this category are

       • Template matching where a stored pattern is matched against the input pattern to

         see how similar they are.

       • Nearest Mean based classifier that assigns the pattern to the class whose mean is

         closest to the test input.

       • K-nearest neighbor based classifier where the test input is assigned to the class

         that has the highest number of neighbors close to the input in the feature space

         under consideration (a minimal sketch of these rules follows this list).
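
Below is that minimal sketch (Python with NumPy); the Euclidean metric, the data layout
and majority voting are illustrative assumptions, not choices made in this thesis.

    import numpy as np

    def nearest_mean(x, class_means):
        # class_means: dict mapping class label -> mean feature vector of the class.
        return min(class_means, key=lambda c: np.linalg.norm(x - class_means[c]))

    def knn(x, feats, labels, k=3):
        # feats: (n, d) array of training features; labels: list of n class labels.
        idx = np.argsort(np.linalg.norm(feats - x, axis=1))[:k]
        votes = [labels[i] for i in idx]
        return max(set(votes), key=votes.count)   # majority vote among k nearest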




    Probability based classifiers are those that are built based on probability rules. They

    are all based on the Bayesian decision rule. The most common probabilistic classifier

    is the Naïve Bayesian classifier which is purely based on the Bayes theorem.

    Decision boundary based classifiers are those which are defined based on certain

    decision rules. The decision is made based on some criteria. The common decision

    boundary based classifiers are

           • Decision trees which are defined based on the information associated with

            each attribute. The information gain is used to decide the order of the node

            placements in the tree. The common algorithm in this classifier is the C4.5

            defined by Quinlan in [31].

           • Perceptron learning algorithm which is a neural network based algorithm,

            where the boundaries are defined by the plane defined by the neural network

            architecture.

           • Multilayer Perceptron learning algorithm which is an extension of the

            perceptron learning algorithm that defines complex hyper-planes. It is more

            effective than the perceptron learning when the patterns form complex

            structures in the feature space.

           • Support vector machine which makes maximum margin boundaries between

            the classes making them more appropriate in many classification tasks.

           • LDA, PCA, etc., which are also used for feature extraction. They define specific

             subspaces.

  Classifiers are usually combined together to offer better results. A particular classifier

may perform better for a subset of data. The classifiers are in general combined in series


where each classifier is followed by the next. Another approach is the parallel approach

where individual classifiers are executed in parallel and then the classification outcome is

combined later. The hierarchy approach is where the classifiers are modeled in a tree like

structure.

    Face Detection, Feature Extraction and Face Identification are three independent

subtasks, but the boundaries between these tasks are ill-defined, making it hard to define

each method as falling under a specific subtask [21]. Though the methods are defined

under specific subtask headings, there is a considerable amount of overlap across the

subtasks. For example, PCA by itself is used for classification though it can also be used

as a dimensionality reduction step during feature extraction.




                                     CHAPTER III

  PATTERN MATCHING BY LOCAL ALIGNMENT OF GRADIENT FEATURES



   Researchers have proven that edge/gradient features play a significant role in human

vision and learning. Edges capture the most important details in images [51]. In the

context of human vision, line-drawings with gradient information appear to be sufficient

for recognition purposes.

   As edge detection algorithms in image processing are based on gradient information,

edge detection outputs retain the key feature points in the image. This supports the

premise that the initial mental perception of an object can be defined as a matching of an

edge-based representation. Moreover, Magnetoencephalography (MEG) studies by Liu et

al. have also experimentally proven that line drawings of a face can produce the same

amount of magnetic response as grayscale images [52].

   The machine learning concept of using line edge maps for face recognition tasks was

first introduced by Takeo Kanade in 1973 [53] and is also mentioned by Brunelli &

Poggio [54]. The same idea was used by Gao & Leung (2002) for face recognition [55].

The approach was to generate edge maps and to use them in conjunction with the

Hausdorff distance to recognize faces. Takács used Sobel edge maps and combined them

with Hausdorff distance [56]. Gao & Leung proposed a similar technique for human

profile face recognition [57]. Another technique was proposed in [58] by Chen & Gao,

where the line edge map was computed first for occlusion invariant face recognition. The

line edge map technique relies on the creation of line edges to approximate the edge

features of the face.

    Once the features are extracted, there should be a mechanism to recognize and

classify them. A promising memory representation for visual patterns is to represent them

as strings, trees or a set of propositions. All these methods are referred to as syntactic

pattern recognition methods. Models that treat the visual pattern as a symbolic

description have been proposed in the psychology literature. Some examples are Palmer

[59], Buffart et al. [60], Buffart and Leeuwenberg [61], van der Helm and Leeuwenberg

[62], and van der Helm et al. [63]. The studies and experiments by Richard A. Chechile et

al. provide psychological evidence for the significance of the syntactic approaches in the

area of familiarity analysis for visual patterns [64]. The structural and syntactic method is

a high level approach that finds a numeric or non-numeric description of the pattern

[65][66]. The approach is used for simple shape recognition and character recognition

problems where string matching is used for recognition. Gao & Leung used string

matching for recognizing profile faces. In the work by Chen & Gao, the same approach

was used for occlusion invariant face recognition. The matching procedure that is

employed in this method is merge dominant. There are two separate steps involved with

the match procedure: aligning the substrings and merging the substrings. The match of

each substring pair is identified first. Their method uses a global alignment procedure to

match the individual substrings pairs. The substrings that belong to the same image are

merged to determine the overall similarity between the reference image and the test

image. Matching the substrings individually and then merging them is a cumbersome

procedure [58].



   Inspired by the above findings, we propose an algorithm for expression and occlusion

invariant face photo recognition and face sketch recognition. The proposed method uses

the intrinsic edges on the face image. As there are no approximation steps involved in edge

detection, the features are more reliable than those obtained from a line edge map as in

[58]. The edge features are represented as an attributed string which is a concatenation of

the polar coordinates of the edge pixels scanned in the raster order. The method also

offers a single step matching where the string corresponding to the face image is aligned

with the reference image using a local alignment algorithm. The key advantage of using

the string representation is that it is relatively immune to noise. The proposed algorithm

is described in greater detail in the following sections.

3.1 Algorithm Architecture




                 Fig 3.1 Algorithm architecture of the proposed approach.


The algorithm architecture of the proposed system is shown in figure 3.1. The algorithm consists of two steps: Edge Feature Extraction and Edge Feature Matching. Each of these steps has subtasks associated with it, which are also shown in figure 3.1.




Edge Feature Extraction is defined in more detail in section 3.2. Edge Feature Matching

is described in section 3.3.

3.2 Edge Feature Extraction

The three steps involved in the edge feature extraction process are Edge Detection, Reference Line Determination and 'Edge-String' Generation.

Edge detection

The proposed method uses Canny's algorithm instead of other edge detection

mechanisms as it offers better edge output with good localization and works well even if

the image is noisy. The block diagram of the adaptive approach for edge detection that

we have used in the proposed method is shown in figure 3.2.


Gaussian smoothing

   This is the first step involved in Canny Edge detection. This step is for noise

reduction. The Gaussian kernel used in the Canny edge detection algorithm for

smoothing the image contributes to the removal of any noise [67]. The standard deviation

of the Gaussian kernel used in our edge detection algorithm is 1.4.


Gradient filtering

   This is the second step in Canny Edge detection algorithm where the horizontal (Gx)

and vertical gradient (Gy) of the image for which the edges are to be detected are

computed first. This computation can be done using the Roberts cross gradient mask, the Prewitt mask or the Sobel mask. Our method needs only a simple mask, hence we have used the Roberts cross operator. From the horizontal and vertical gradient values, the

magnitude and direction of the gradient can be found as follows:

$|G| = \sqrt{G_x^2 + G_y^2}$                                 (3.1)

$\theta = \arctan\left(\frac{G_y}{G_x}\right)$                                (3.2)

The value of θ is then rounded to one of the values representing the vertical, horizontal and the two diagonal directions. That is, θ is rounded to 0°, 45°, 90° or 135°.
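To make Eqs. 3.1 and 3.2 concrete, the following minimal Python/numpy sketch (an illustrative helper, not the thesis implementation; the function name and border handling are our own choices) applies the 2x2 Roberts cross kernels through array slicing and quantizes the gradient direction to the four values listed above.

    import numpy as np

    def roberts_gradient(img):
        # Roberts cross gradients, magnitude (Eq. 3.1) and direction (Eq. 3.2).
        # Border pixels are left at zero in this sketch.
        f = img.astype(float)
        gx = np.zeros_like(f)
        gy = np.zeros_like(f)
        gx[:-1, :-1] = f[:-1, :-1] - f[1:, 1:]   # kernel [[1, 0], [0, -1]]
        gy[:-1, :-1] = f[:-1, 1:] - f[1:, :-1]   # kernel [[0, 1], [-1, 0]]
        mag = np.sqrt(gx ** 2 + gy ** 2)
        theta = np.degrees(np.arctan2(gy, gx))
        # Quantize the direction to 0, 45, 90 or 135 degrees
        theta = (np.round(theta / 45.0) * 45.0) % 180.0
        return mag, theta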




                            Fig 3.2 Edge detection algorithm.


Non-maxima suppression

If a pixel's gradient magnitude is not a maximum compared to its neighbors along the gradient direction, its value is suppressed. This thins the edges and produces better edge detections.


Adaptive threshold computation

Otsu's algorithm is used between the non-maxima suppression step and the hysteresis step to determine the high (Th) and the low (Tl) thresholds. Otsu's algorithm finds the optimum threshold by maximizing the between-class variance. The between-class variance is given by the criterion function in Eq. 3.3. Here $\sigma_B^2(k)$ is the between-class variance when the intensity value used for computing the criterion function is k.

$\sigma_B^2(k) = \dfrac{\left[\mu_T\,\omega(k) - \mu(k)\right]^2}{\omega(k)\left[1 - \omega(k)\right]}$                    (3.3)

where $\mu_T = \sum_{i=1}^{L} i\,p_i$ is the global mean intensity, $p_i$ is the probability of intensity level i, and L is the number of distinct intensity levels in the image. In Eq. 3.3,

$\omega(k) = \sum_{i=1}^{k} p_i$                    (3.4)

$\mu(k) = \sum_{i=1}^{k} i\,p_i$                    (3.5)

The criterion function is computed for every intensity value k. The value of k that maximizes Eq. 3.3 is chosen as the high threshold Th for the Canny edge detection. The lower threshold Tl is set as a fixed fraction of Th as in [69].


Hysteresis thresholding

In the final hysteresis step, the method uses the adaptive thresholds obtained using Otsu's algorithm in the previous step as the Canny edge detector thresholds [68][69]. In this step, all pixels with gradient magnitude greater than Th are set to one, and all pixels below Tl are set to zero. Pixels whose magnitudes are greater than Tl are also set to one if they are connected to a pixel whose magnitude is greater than Th.
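Putting the steps together, a compact OpenCV-based sketch of the adaptive pipeline might look as follows. Note two assumptions: OpenCV's Canny computes Sobel gradients internally rather than the Roberts cross, and the fraction used for Tl is our placeholder, since the exact value is taken from [69] rather than stated here.

    import cv2

    def adaptive_canny(gray, sigma=1.4):
        # Gaussian smoothing for noise reduction (sigma = 1.4 as in the text)
        blurred = cv2.GaussianBlur(gray, (5, 5), sigma)
        # Otsu's threshold (maximizes between-class variance) serves as T_h
        t_high, _ = cv2.threshold(blurred, 0, 255,
                                  cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        t_low = 0.5 * t_high  # lower threshold as a fraction of T_h (assumed)
        return cv2.Canny(blurred, t_low, t_high)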

   A sample image and the output for each step are shown in figure 3.3. The edge

detection output is passed on to the next stage of edge feature extraction – ‗Reference

Line Determination‘.


Fig 3.3 A sample image and the outputs corresponding to each of the steps in the Canny edge detection algorithm are shown above. The original image, Gaussian smoothed image, gradient filtered image, non-maxima suppression output and the Canny edge detection output image are shown in the respective order.




Fig 3.4 This shows the drawback of using a fixed line of reference. The reference line is the same even when the face rotates in-plane. This results in a wrong angle calculation, as the angle is always computed with reference to the 'reference line'.

Reference line determination

The edges are represented in the polar coordinate form to create the 'Edge-Strings'

during the ‗Edge-String‘ generation step. To compute the angle corresponding to each

edge pixel, a reference line is needed. The simplest method is to use the horizontal or the

vertical line as the reference line. This is not a very efficient approach as it cannot handle

in-plane rotations. An example is shown in figure 3.4. As the reference line changes, the

angle for a pixel marked on the eyebrow of a person changes.

An alternative is to use a reference line that does not vary with in-plane rotation of the face. Determining the face orientation when it is rotated in-plane, and fixing the reference line accordingly, is difficult. The easiest option is to fix the reference line as a line connecting two features on the face. Then, even when the face rotates, the line remains the same relative to the face, so the angles are not miscomputed. A feasible option is to use the line connecting the two eyes. The Viola-Jones eye detection algorithm works fairly well [33]. However, the localization offered by this algorithm is not sufficient for us, since the angle values have to match accurately. A better option is to use the line connecting the two irises as the reference line. Iris detection is a mature field in computer vision that offers accurate detection.

   We have used a framework that offers an accurate detection of the irises so as to use

the line connecting them as the reference line. The proposed framework is a combination

of Viola-Jones algorithm, Canny Edge detection and Circular Hough Transform (CHT)

Algorithm. The framework is shown in figure 3.5.

   In this work, the initial analysis and experiments were done based on the fixed

reference line. This is because it is unlikely to find significant in-plane rotation in most of

the databases. If there are any in-plane rotations, the proposed framework handles them.

The framework is an adaptation of a method that is proven to work for iris detection [70].




                                    Fig 3.5 Iris detection.


    Our method uses a Viola-Jones eye detector, available in OpenCV for detecting and

localizing the eye regions [33][71]. The detection algorithm helps extract eye regions

using the Haar cascades for eyes. The method creates a strong classifier using several

weak classifiers, where each classifier considers only one Haar feature. The stronger

classifiers are created using an adaptive boosting algorithm referred to as Adaboost.

Adaboost builds a strong classifier using a linear combination of all the weak classifiers

[72]. The classifiers form a cascade where the initial classifiers are less accurate but they

become progressively stronger and offer robust detection with very low false positives.
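A minimal OpenCV sketch of this eye-detection step is given below. The cascade file is the standard haarcascade_eye.xml shipped with OpenCV, and the detectMultiScale parameters are typical defaults rather than values taken from the thesis.

    import cv2

    # Haar-cascade eye detector bundled with OpenCV
    eye_cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + 'haarcascade_eye.xml')

    def detect_eyes(gray):
        # Returns bounding boxes (x, y, w, h) of candidate eye regions
        return eye_cascade.detectMultiScale(gray, scaleFactor=1.1,
                                            minNeighbors=5)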

    The second step is the edge detection step, where the Canny edges of the detected eye

images are computed. The procedure followed is the same as described in the section 3.2.

    The third step is the computation of the Circular Hough Transform (CHT) [73]. The

edge image is then scanned to find pixels with nonzero values. For each of these pixels,

the center $(a, b)$ is estimated as follows:

$a = x - r\cos\theta, \qquad b = y - r\sin\theta$

Here, x and y are the coordinates of the pixel p, 'r' ranges over the candidate radii and $0 \le \theta \le \pi$. For a particular 'r', the values of $a$ and $b$ are stored in an accumulator. The accumulator counter is incremented for all the points in the edge image and all the possible 'r' values.

The accumulator entry with the maximum value points to the center of the edge circle in

the edge image. The principle behind the CHT is shown in figure 3.6.




                                 Fig 3.6 CHT illustration.

   The CHT provides the center of the circle in the edge image. In the case of an eye

edge image, the only circle present is that representing the pupil. This method finds the

center of the pupil, which is used as the position of an end point of the reference line. The

process is repeated for both the eyes to find the line connecting the two eyes. This line is

used as the reference line.
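For illustration, OpenCV's HoughCircles (which runs a Canny stage internally, mirroring the Canny-plus-CHT framework above) can estimate the pupil center within a detected eye region. The radius range and voting parameters below are hypothetical, image-dependent choices.

    import cv2
    import numpy as np

    def iris_center(eye_gray, r_min=5, r_max=30):
        # Median blur suppresses speckle before the circle search
        smoothed = cv2.medianBlur(eye_gray, 5)
        circles = cv2.HoughCircles(smoothed, cv2.HOUGH_GRADIENT, dp=1,
                                   minDist=eye_gray.shape[0],
                                   param1=100, param2=15,
                                   minRadius=r_min, maxRadius=r_max)
        if circles is None:
            return None
        x, y, r = np.round(circles[0, 0]).astype(int)
        return (x, y)   # accumulator maximum: the pupil center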




        Fig 3.7 Reference line determination using the procedure shown in fig 3.5.

Edge string generation

   In this step, 'edge-strings' are generated from the edge detection output image

[73][74]. The angle and distance values in the polar coordinate representation of an edge

pixel are used as the two attributes to form a string primitive. Each 'edge-string' is a

sequence of string primitives representing a pixel of the edge image in polar coordinates.

The edge image is scanned in the raster order to generate the 'edge-string'. The origin is marked as a fixed point at the center of the face image. This approach works as all images are of the same size after the preprocessing steps. If the centroid were determined based on the edges available in the edge image, it could differ between the reference and test images. To find the angle in the polar coordinate representation, we use a horizontal line that passes through the origin as the reference, as shown in figure 3.8.

$\theta_i = \arctan\left(\dfrac{y_i - y_o}{x_i - x_o}\right)$                    (3.6)

Here $(x_o, y_o)$ is the Cartesian coordinate of the origin and $(x_i, y_i)$ is the Cartesian coordinate of the edge pixel under consideration. $\theta_i$ is the angle corresponding to the pixel $p_i$. Similarly, the distance $\delta_i$ from the origin to the edge pixel $p_i$ is computed as:

$\delta_i = \sqrt{(x_i - x_o)^2 + (y_i - y_o)^2}$                    (3.7)




Fig 3.8 Sample image and the corresponding edge image with the δ and θ values shown.
The reference line for the angle calculation is the horizontal line passing through the
origin. This is the case when a fixed reference line is used.

In our approach, the distance $\delta_i$ of each edge pixel from the origin and the angle $\theta_i$ from the horizontal axis are the two features that are of interest. Each string element is a distance and angle pair separated using a special character ':'. The string primitives are separated from each other using the character ','. The special characters ',' and ':' are used to make parsing easier during the alignment process. An example of such a string representation corresponding to the pixels $(\delta_1, \theta_1), (\delta_2, \theta_2), \ldots, (\delta_n, \theta_n)$ is given below:

$\Lambda_1 = \delta_1{:}\theta_1,\ \delta_2{:}\theta_2,\ \ldots,\ \delta_n{:}\theta_n$

Here $\delta_i{:}\theta_i$ represents a string primitive corresponding to the edge pixel $p_i$.
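A small Python sketch of the 'edge-string' generation is given below. The integer quantization of δ and θ is our assumption, made so that primitives from different images can match exactly during alignment; the thesis does not spell out the quantization step.

    import numpy as np

    def edge_string(edge_img, origin=None):
        # Build an 'edge-string' from a binary edge image
        h, w = edge_img.shape
        if origin is None:
            origin = (w // 2, h // 2)   # fixed point at the image center
        xo, yo = origin
        primitives = []
        # Raster-order scan keeps primitive ordering consistent across images
        for y in range(h):
            for x in range(w):
                if edge_img[y, x]:
                    delta = int(round(np.hypot(x - xo, y - yo)))      # Eq. 3.7
                    theta = int(round(np.degrees(
                        np.arctan2(y - yo, x - xo))))                 # Eq. 3.6
                    primitives.append(f"{delta}:{theta}")
        return ",".join(primitives)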

3.3 Edge Feature Matching

   The two steps involved in Edge Feature Matching are Local String Alignment and

Percentage Similarity Computation.

Local string alignment

   The similarity between the 'edge-strings' is computed using string matching. We use

the Smith-Waterman algorithm, a dynamic programming algorithm for string matching

[75]. The algorithm was proposed for genetic sequence alignments where minor

mutations render similar sequences dissimilar. Similar to these mutations in genome

sequences, there can be local changes in edges on faces due to expression variations. In the case of sketches, the artist's rendering style and artistic strokes induce such variations. These variations create dissimilarities in the edges on the face image, although other locally similar regions are retained. In the proposed string matching procedure, when we align the elements of two 'edge-strings', a comparison is performed. An element in string Λ1 is said to match with an element in string Λ2 only if both the corresponding 'δ' and 'θ' values match. The algorithmic steps of our matching algorithm are shown in

Algorithm 3.1.




                    Algorithm 3.1. Modified Smith Waterman Algorithm.

The proposed algorithm offers an optimal local alignment. The string matching procedure requires a scoring mechanism based on which the match values and the gap penalties are assigned. To compute a scoring metric mathematically, there should be a sufficiently large number of samples to analyze the statistical distributions. For a reliable computation of metrics, thousands of reference images of the same individual would be needed. However, such a statistical analysis is not possible, as there is only one reference image per individual; this is strictly a single image in the case of face sketch recognition. An alternative is to use a large number of training images per individual. This is not a realistic assumption for face recognition algorithms, since it is impractical to obtain a large number of training images in a real-life face photo or face sketch recognition application. For this reason, we choose the simplest metric, which gives equal weight to a match or a mismatch. The gap penalty is -1 and the match value is 1 in the matching algorithm. The algorithm uses a mismatch value of 0. An example of the proposed alignment mechanism is shown in section 3.4.
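The following Python sketch renders the matching procedure as plain dynamic programming with the stated scores (match = 1, mismatch = 0, gap penalty = -1) and counts the matched primitive pairs σ during backtracking. It is an illustrative rendering of Algorithm 3.1, not the thesis code itself.

    import numpy as np

    def smith_waterman(s1, s2, match=1, mismatch=0, gap=-1):
        # Local alignment of two comma-separated edge-strings
        a, b = s1.split(','), s2.split(',')
        n, m = len(a), len(b)
        psi = np.zeros((n + 1, m + 1), dtype=int)   # the Psi matrix
        best, bi, bj = 0, 0, 0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                s = match if a[i - 1] == b[j - 1] else mismatch
                psi[i, j] = max(0, psi[i - 1, j - 1] + s,
                                psi[i - 1, j] + gap, psi[i, j - 1] + gap)
                if psi[i, j] > best:
                    best, bi, bj = psi[i, j], i, j
        # Backtrack from the best cell, counting matched primitive pairs
        sigma, i, j = 0, bi, bj
        while i > 0 and j > 0 and psi[i, j] > 0:
            s = match if a[i - 1] == b[j - 1] else mismatch
            if psi[i, j] == psi[i - 1, j - 1] + s:
                if a[i - 1] == b[j - 1]:
                    sigma += 1
                i, j = i - 1, j - 1
            elif psi[i, j] == psi[i - 1, j] + gap:
                i -= 1
            else:
                j -= 1
        return sigma   # the match count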

Percentage similarity computation

In this step, the similar string primitive pairs common to the two strings that are aligned are identified. This value is referred to as the 'match count', represented by 'σ'. The percentage similarity ρ is computed as

$\rho = \dfrac{\sigma}{\min(\lambda_1, \lambda_2)} \times 100$                    (3.8)

Here λ1 and λ2 are the lengths of the strings Λ1 and Λ2 respectively. The class that gives the maximum value of ρ is identified as the class to which the test image belongs.

One of the advantages of the proposed approach is that it is not merge dominant. If the approach did not take care of local region similarities, then the edge strings Λ1 and Λ2 would have to be divided into sub-regions. These sub-regions would have to be compared using a global alignment algorithm and merged to find the overall match count. The proposed approach handles the local similarities intrinsically, making it a better approach than merge dominant methods.




Fig 3.9 The two sample patterns which are considered for illustrating the local string
alignment algorithm mechanism.

3.4 Local String Alignment Example with Real Image Patterns

An example of local string alignment is included in this section. A fixed reference line is assumed for this particular example. The two patterns that are considered for alignment

are shown in figure 3.9. Here, in this specific case λ1 =length(Λ1) and λ2=length(Λ2). The

dynamic programming matrix Ψ is a λ1×λ2 matrix. In this case, we assume μ =1 as the

match value and γ =-1 as the gap penalty. σ represents the match count, η1 and η2 the

indices. The resultant Ψ is shown in figure 3.10; the first matrix is before backtracking and the second shows the backtracking in blue arrows.




                  Fig 3.10 The Ψ matrix before and after backtracking.




   The detailed structure and the resulting alignment are shown in figure 3.11. The two

strings have seven string primitives in common.

3.5 Local Alignment Match Procedure vs. Global Alignment Match Procedure

We evaluated the performance of two common alignment algorithms: the Smith-Waterman local alignment algorithm and the Needleman-Wunsch global alignment algorithm [76]. The Smith-Waterman algorithm was introduced to obtain the best alignment between sequences whose similarity is confined to local regions. A global alignment algorithm cannot bring out such local similarities. Thus a local alignment algorithm is a better choice than global alignment for matching 'edge-strings'. The Smith-Waterman algorithm is guaranteed to give

the best local alignment, making it the best choice for our purpose.




                        Fig 3.11 Ψ matrix and resulting alignment.




         Fig 3.12 Sample images from the CMU AMP Expression database [77].


        To prove this, we performed an experiment on a subset of the CMU AMP

Expression database [77]. CMU AMP Expression database has 13 subjects and 75 images

for each subject. Out of the 75 images, one image was chosen as the

training/reference image. Some sample images from this database are shown in figure

3.12.

        All experiments were done choosing a random image in each class as the training

image, since it is not necessary for the reference image to be a neutral expression image.

From the remaining 74 images per class, one random image per class was chosen as the

test image. It should be noted that the test set is a small set consisting of 13 images with

one image from each class. Even for such a small test set, the global alignment based

approach recognizes only 4 out of the 13 (recognition rate=30.77%). Repeating the same

experiment using the proposed method gives 100% recognition. For the alignment step, the same gap penalties and scores were used for both the Needleman-Wunsch algorithm and the proposed method.


                                      CHAPTER IV

                              FACE PHOTO RECOGNITION



        As stated earlier, the use of edge features for recognition was a contribution from

Takeo Kanade [53]. Gao & Leung conceived the idea of using edge maps in conjunction

with the Hausdorff distance for face recognition [55]. Another approach is to use an edge map based method for profile face recognition [57]. The latest work was done by Chen & Gao [58] for occlusion invariant face recognition using a different edge extraction and

alignment procedure. Inspired by the scope of edges in face recognition applications we

have applied the proposed method for face photo recognition. The proposed method is

described in detail in chapter III.

        In this chapter, the training and testing procedures for face photo recognition

using the proposed method is discussed. The results obtained by our experiments to show

the effectiveness of the proposed method as a face photo identification/authentication and

damaged photo identification method are also included. All procedures work the same

way except for the fact that the number of images involved in face identification and face

authentication are different. For face identification, the test image is compared against the

entire class of individuals. Face authentication is a relatively simple application that does not require a comparison with all other classes of individuals. It is just a similarity check between the photograph in the face database and the photo to be authenticated. To test and

evaluate the performance of the proposed approach on damaged IDs, these damages are

considered as occlusions. In other words, the problem of identifying a person from a

damaged ID is similar to recognizing a partially occluded face. In occlusion invariant

recognition, the test procedure is the same. Since a major portion of the photograph to be

authenticated is damaged, the procedure is more similar to the identification task than the

authentication task. The procedures are described in section 4.1. Section 4.2 describes the

experiments on expression invariant face recognition which includes both the

authentication and identification tasks. Section 4.3 describes the preliminary experiments

in occlusion invariant face recognition which is similar to damaged identity proof

recognition.

4.1    Training and Testing Procedure for Face Photo Recognition Applications

       The proposed method is slightly modified for face photo recognition. There can

be several applications for face photo recognition such as face authentication, face

identification and damaged identity proof identification. In face photo authentication

applications, the reference image is the photo stored in the face database of the system.

The test images are those images which are present on authentication documents in case

of face authentication. When the test image comes from a surveillance feed of a security

system, the face can have expression variations. In such applications, there can be a large

variation in the possible test set of face photos. However, even in this case, the system

requires only one reference image. Damaged identity proofs may have only partial faces.

However, the database has undamaged reference images against which the test image is

validated. In a way, the damaged photographs can be considered as partially occluded

faces. The proposed method can hence be used for occlusion invariant face recognition.

The occlusion invariant aspect of the proposed method has not been fully validated. Only



preliminary results are available to substantiate this finding. The key advantage of the

proposed method is the requirement of only one reference image. This makes the system

highly scalable.

       A flow diagram of the system as used for face authentication/identification

/identity proof identification applications is shown in figure 4.1. The test image set

consists of several face images with expression variations. The 'edge-string'

representations corresponding to the test images are generated. All the 'edge-strings' are

generated by scanning the edge image in the raster order. Thus there is a correspondence

in the ordering of the string primitives of the 'reference-string' and the test image 'edge-

string'. The `edge-strings' of these test images are matched with the `reference-string' of

each class using the matching procedure described in section 3.3. Match count, σ, refers to

the number of string elements that are common between the two edge strings ('reference-

string' and 'edge-string') under consideration; it indicates how similar the two strings are.

       The match count is calculated from the alignment by counting similar primitives

during the backtracking step of the matching procedure. The value of match count (σ)

cannot exceed the size of the smaller of the two strings λ1 and λ2. As σ depends on the

length of the two sequences, a second measure called the percentage similarity (ρ) is

used, which is computed as shown in Eq 3.8. The class corresponding to the maximum ρ

value is identified as the class to which the test image belongs.
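As a sketch of the identification loop just described, the hypothetical helper below reuses the edge_string and smith_waterman sketches from chapter III and ranks classes by ρ. The normalization by the shorter string length follows Eq. 3.8 as reconstructed above.

    def identify(test_edge_img, reference_strings):
        # reference_strings: {class_label: 'reference-string'}
        test = edge_string(test_edge_img)
        lam_test = test.count(',') + 1
        best_cls, best_rho = None, -1.0
        for cls, ref in reference_strings.items():
            sigma = smith_waterman(ref, test)                         # match count
            rho = 100.0 * sigma / min(ref.count(',') + 1, lam_test)   # Eq. 3.8
            if rho > best_rho:
                best_cls, best_rho = cls, rho
        return best_cls, best_rho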

4.2    Experiments on Expression Invariant Face Recognition

       The effectiveness of the proposed method is compared with state-of-the-art

algorithms on the Yale Face database, the Japanese Female Face Expression database

(JAFFE) and CMU AMP Face Expression database.



              Fig 4.1 Flow diagram of test procedure for face identification.



Experiments on Yale database

       The Yale database contains 165 grayscale images of 15 individuals in GIF format

[5]. There are 11 images per subject, one per different facial expression or configuration:

"center-light", "w/glasses", "happy", "left-light", "w/no glasses", "normal", "right-light",

"sad", "sleepy", "surprised", and "wink". We have chosen the center-light image as the


training image/reference image. All the 10 images other than the reference image were

used for testing. Fig. 4.2 shows a chart comparing the results of the proposed method

with the results in [78]. As Ekenel & Stiefelhagen used a subset of the Yale database for

testing their approach, we tested the proposed method on the same subset for comparison.

The test images are those labeled as "glasses", "happy", "left light", "right light",

"surprised" and "sad". The faces are all of the same size and are used as provided in the

database. The proposed method offers a recognition rate of 100%.



        Table 4.1 Proposed method vs. error rates in [55] (Gao and Leung, 2002).
                     Method                                  Error Rate

                    Edge map                                   26.06%

                Eigen face (PCA)                                24.4%

                   Correlation                                  23.9%

                Linear Subspace                                 21.6%

                 Line Edge Map                                 14.55%

               Fisher Face (LDA)                                7.3%

               Proposed Method                                     6%


        The comparison with methods in [79] is given in Fig. 4.3. A comparison of the

results of our method and the error rate in [55] is also provided in Table 4.1. A higher

recognition rate is offered by the proposed method as edge features are capable of

capturing the similarities across faces of the same individual even when the expressions

vary.




  Fig 4.2 Performance comparison on Yale Expression database: Our method vs other
                        methods (Hu and Wang, 2006) [78].




Fig 4.3 Performance comparison on a subset of Yale Expression database: Our method vs
                 other methods (Ekenel and Stiefelhagen, 2005) [79].

Experiments on the Japanese Female Face Expression (JAFFE) database

       This database contains 213 images of 10 Japanese female models [4]. There is one

neutral image per person. The other images correspond to basic facial expressions. A

random image is chosen as the reference image for every class and the remaining images


are used as the test images. In figure 4.4, the results obtained using the proposed method are compared with the results in [80] and [81]. The proposed method offers a recognition rate of 99.52%.




 Fig 4.4 Performance comparison on JAFFE database [80][81] (Wang and Ruan, 2010).

The proposed approach outperforms other methods even when there is a wide range of expression variations in the images. The edge features are retained across the

expression variant faces of the same individual. This makes our method superior to other

methods.

Experiments on the CMU AMP Expression database

       This database has 13 subjects, and for each subject there are 75 images with

different facial expressions [77]. All the images are captured in the same lighting

condition, with variations in expression. A random image of each individual is taken as

the training image, and the remaining 74 images are used for testing. All images in the

database are of the same size, 64x64 pixels. We have achieved 100% accuracy on this database.




The results of this experiment are compared with methods in [82], [83] and [84] as shown

in table 4.2. Our method offers similar results as the methods in [83] and [84].

      Table 4.2 Comparison of recognition rates on CMU AMP Expression database.
                       Method                              Recognition Rate


                      SRC [82]                                     99.49%

                    CS Based [83]                                  100%

        Class Dependent Factor Analysis [84]                       100%

                 Proposed Method                                   100%


        The advantage of our approach over these methods is that our approach requires

only a single training image.

4.3     Experiments on Occlusion Invariant Face Recognition

Occlusions are similar to the damages seen on damaged identity proofs. Damaged identity proof recognition is thus one of the applications of occlusion invariant face recognition.

Experiments on the LFW (Labeled Faces in the Wild) database




      Fig 4.5 Sample images from the LFW database with manually added occlusion.

        The method is tested on the LFW database with manually added occlusions [85].

The images are shown in figure 4.5. Ten images were randomly chosen from the LFW

database. Occlusions were added to cover the nose, the left eye, the right eye, the lower

half of the face, the lower half diagonally and the vertical half. The sizes of the

occlusions were adjusted to cover the features appropriately. These occlusions are similar

to damages that occur to photographs in ID cards. It is observed that the method gives

100% recognition. The preliminary results are promising.

Experiments on the AR database

       The method was tested on the AR dataset [3] using face images as query images.

The method was tested for 123 individuals. 9 images per person were used for this

experiment. The image set included an image in neutral expression as the training image.

Two images with sunglass and scarf were chosen for testing the occlusion invariance.

Three images were chosen to test the lighting invariance and another three were chosen

for testing the expression variation. The results of this experiment are shown in table 4.3.

The proposed method offers a better recognition rate than other methods.

Table 4.3 Comparison of recognition rates (%) on the AR database.
                                               Proposed Method

                           Lighting                    83

                           Expression                  88

                           Occlusion                   79




                                      CHAPTER V

                            FACE SKETCH RECOGNITION


       Face sketch recognition refers to the problem of recognizing a person from an

artist drawn sketch. This has wide application in criminal investigation. A sketch drawn

based on the descriptions is compared with a large database of face images to retrieve

similar images. Since artist drawn sketches are vulnerable to variations due to the artist's

rendering style, the problem presented is not a one to one matching problem; the focus is

on eliminating unrelated images, and narrowing down the search. The victim or the

witnesses are allowed to cross-verify and identify the suspect from the list of images retrieved from the database. In face sketch recognition, the only features that are available

are the edges that are common to both the photo and the sketch.

       There are two categories of methods for face sketch recognition. One set of

methods involves preprocessing to reduce the distance between the sketch space and the photo

space [87][88][89]. The sketches corresponding to the photo images are first synthesized.

Once we have a sketch corresponding to the face photo, the final recognition task is

performed in the sketch space using conventional classification methods. The photo

images corresponding to the synthesized sketches that show highest similarity are then

retrieved from the database for analysis by the witnesses or the victims. The second

category involves methods which allow classification across modalities and do not

require sketch synthesis [90]. The proposed method belongs to the second category. The


edges in the sketches and the photo images are used as the features for classification and

image retrieval.

       All sketch synthesis based methods require a very large set of training images for

training the system. The images are first divided into patches. The final comparison step

required for sketch synthesis has a complexity of O((nd)2), where n is the number of

images and d the number of patches associated with each image. In addition to this, the

conventional methods that are used in the final recognition task after sketch synthesis

also involve additional complexity. Consequently, all sketch synthesis based methods are

computationally expensive. As the classifiers used are standard classifiers such as PCA

and LDA, the accuracy of all these methods depends on the accurate synthesis of

sketches from the training patches. However, for very accurate sketch synthesis, a very

large training patch set is required. Another drawback of these methods is that they are

not scalable. The photos corresponding to the training patches should belong to the same

race as the test image subject. If a new test subject from a different race is added to the

system, a very large set of training images from the specific race should also be added.

        The difficulties posed by conventional methods call for a new approach for face

sketch recognition. The major advantage of our method is that it requires only one image

for training and therefore the proposed approach is scalable. Even if a subject of a

different gender or a different race is added to the system, it can easily perform

recognition task as long as there is one training image available. Our method does not

rely on any sketch synthesis step. This greatly reduces the computational complexity. The

algorithm finds application in face image retrieval from a large database such as a police

mug shot database. The proposed technique is a novel approach that uses the inherent



edges in the face photo and sketch images to solve the problem of face sketch

recognition. To the best of our knowledge this is the first contribution that uses a

syntactic approach for face sketch recognition. Experiments are carried out on the CUHK

student dataset to prove the effectiveness of the proposed approach for face sketch

recognition and sketch based face image retrieval [2].

       The training and testing procedures associated with the sketch recognition are

described in section 5.1. The experiments are described in detail in section 5.2.

5.1 Training and Testing Procedure for Face Sketch Recognition Applications

The flowcharts for the training and testing procedures are shown in figure 5.1. It is assumed that face regions are detected, cropped, registered/aligned, resized and centered. The algorithm is a supervised learning technique: the training inputs and their respective classes are known a priori. The training images are also referred to as reference images.

We need only one reference image per class. The reference images used for face sketch

recognition application are the face photo images in the mugshot database. In the training

phase, the edges are detected for all reference images. The 'edge-string' representations

corresponding to each edge image is created. These 'edge-strings' corresponding to the

reference images are referred to as 'reference-strings'. In the testing phase, the edge

string corresponding to the test image (the artist drawn sketch) is first generated. The

match count (σ) and percentage similarity (ρ) are computed. The class which yields the

maximum percentage similarity is identified as the class to which the test sketch belongs.

ρ is computed using Eq 3.8.




           Fig 5.1 Flow diagram of test procedure for face sketch recognition.

For face sketch recognition, most of the time, only one sketch image is available as the query. The proposed method uses the test sketch image and retrieves

related photo images from the database based on the local alignment similarity measure.

The retrieved images may be shown to the witnesses or the victims to identify the

suspect.
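For sketch-based retrieval, the same machinery can return a ranked list of mugshot photos instead of a single class. The helper below is a hypothetical illustration reusing the earlier edge_string and smith_waterman sketches; top_k is an application-dependent choice.

    def retrieve(sketch_edge_img, mugshot_strings, top_k=10):
        # Rank mugshot photos by percentage similarity to the query sketch
        query = edge_string(sketch_edge_img)
        lam_q = query.count(',') + 1
        scores = []
        for photo_id, ref in mugshot_strings.items():
            sigma = smith_waterman(ref, query)
            rho = 100.0 * sigma / min(ref.count(',') + 1, lam_q)
            scores.append((rho, photo_id))
        scores.sort(reverse=True)
        return [pid for _, pid in scores[:top_k]]   # shown to witnesses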

5.2 Synthesis Based Methods vs. Proposed Method: Complexity Comparison

       The methods that need sketch synthesis require a large set of training images.

There is an additional preprocessing step involved, where the sketches are divided into

patches of small size such as 3x3, 5x5, 7x7 pixels or larger. As the patch size increases, the

accuracy of the sketch synthesis decreases. On the other hand, as the patch size decreases,


the space and time complexity of the method increases. The accuracy of the sketch

synthesis based methods depends on the number of training patches and consequently,

the requirement of a large training set is inevitable. The entire training set and the test set

need to be first divided into such small patches. This is a very costly preprocessing step.

The space complexity of these methods is O(nd), where n is the number of images and d is the number of patches associated with each image. The classification step introduces additional complexity. After

preprocessing, the sketch synthesis is performed by matching each patch. The time

complexity associated with the synthesis step is O((nd)2). As the number of patches (d) is

a very large quantity, the d2 term further increases the complexity. This makes sketch

synthesis based methods highly inefficient in terms of space and time complexities.

Furthermore, the approach is not scalable. If the patches belong to a race different from

that to which the test image belongs, these methods require a new set of training images

with images from the same race as that of the test image. This means that even if a single

person from a different race gets added to the test set, a large set of training images of

subjects from the same race is needed. If the synthesis step is inaccurate the method will

not recognize the sketch correctly.

       The proposed method requires only one training image per person, making it

more scalable than other state-of-the art approaches. It does not require any additional

training information. The space complexity involved is O(n), since there is only one edge

image corresponding to each image. The time complexity is O(n2). Quadratic time

complexity O(n2) of the proposed method makes it superior to conventional methods

with complexity O((nd)2).




5.3 Experiments on the Sketch Databases

       Experiments are performed on the CUHK (Chinese University of Hong Kong)

database and the AR database.

Experiments on the CUHK database

       In the CUHK student dataset, the probe set consists of 100 face sketches and a

gallery set of 100 photo images [87][88][89]. Sample images are shown in figure 5.2.

       The rank curve in figure 5.3 shows the effectiveness of the proposed method in

face sketch recognition when compared with other methods. Like our method, the Geometry method and the Eigenface method do not require sketch synthesis.




                   Fig 5.2 Sample images from the CUHK database.




                     Fig 5.3 Rank curve for CUHK student dataset.
 Fig 5.4 Sample images from AR database and corresponding sketches provided by the
                                  CUHK group.


Our method performs better than the other methods, as shown in figure 5.3. This is because the edges are the only features that are common across the photos and sketches. Methods such as the Geometry method and the Eigenface method do not capture these features.


Experiments on the AR database

       The method was tested on the AR dataset [3] using the procedure described in

[86] with sketch images as query images. The sketch images corresponding to the AR

database images are provided by the CUHK (Chinese University of Hong Kong) group.

Sample images are shown in figure 5.4. The results of this experiment are compared with

the results in [86] as shown in table 5.1. The proposed method offers better recognition rates than the other methods.

Table 5.1 Comparison of recognition rates (%) on the AR database.
                      Euclidean Distance    Random Sampling    Proposed
                      Matching              LDA                Method
      Lighting               48.9                77.8             78
      Expression             64.4                75.6             80
      Occlusion              37.8                62.2             70



       These experiments have proven the effectiveness of the proposed approach as a

robust method for face sketch recognition.




                                      CHAPTER VI

                        CONCLUSION AND FUTURE WORK




       In this thesis, we have presented a method that relies on edge features for face

photo and face sketch recognition. The proposed method is useful for face recognition

applications that require expression and occlusion invariance. For expression invariant

face recognition, experiments were conducted on the Yale face database, the JAFFE

(Japanese Female Face Expression) database and the CMU AMP Expression database.

The results on expression invariant face recognition exceed the standards of state-of-the-

art systems. The preliminary results in occlusion invariant face recognition are promising,

but require more work. Experiments were conducted on a subset of the LFW (Labeled

Faces in the Wild) database and a subset of the AR database.

       The proposed method can also be used for face sketch recognition. The algorithm

does not require face sketch synthesis and allows recognition across photo and sketch

space. The effectiveness of our approach for face sketch recognition was shown by the

experimental results on the CUHK student dataset and AR database. The use of edge

features and attributed strings for sketch recognition is a novel contribution. Our method

can be further improved by incorporating learning rules to extract other commonalities

between the photo and sketch images. Future work includes modifying and testing the

proposed algorithm on other face sketch databases such as CUFSF (CUHK Face Sketch

FERET Database). Further analysis has to be performed to assess the tolerance of this method to differences in artists' rendering styles.

The method was tested with a fixed reference line for 'edge-string' computation. The line connecting the two eye centers was also tested. However, this choice does not introduce any significant difference in the recognition accuracy. This is because the

faces in all the databases are upright. The advantage of our approach over other methods

is that our approach requires only a single training image. This makes it highly scalable

as opposed to other methods in face photo and face sketch recognition. Thus the method

is appropriate for security and surveillance applications where only a single reference

image is available. This fact also makes it the best choice for authentication applications.

To the best of our knowledge this is the first method that uses a syntactic approach for

face photo and face sketch recognition.

           The only disadvantage of the current algorithm is the computationally expensive

Local String Alignment process. However, the algorithm is an excellent candidate for

parallel     implementation.   There   are   GPU     (Graphics   Processing   Unit)   based

implementations for similar alignment algorithms that are used for genome and protein

sequence recognition. Future work includes parallelizing our algorithm on dedicated hardware such as GPUs or FPGAs to make the execution

faster. The method is to be further evaluated by testing on larger sets of databases such as

CK+ database for face photo recognition and databases such as XM2VTS and CUFSF

(CUHK Face Sketch FERET) database for face sketch recognition.

The method fails if the face photo/sketch images involve pose variations. Pose variation is a major challenge in face photo and face sketch recognition. To handle this, the



method can be combined with other state-of-the-art methods that are used for pose invariant face recognition. A possible approach is to interpolate the face at different angles. To implement this, we have to include three training images corresponding to each individual in the database: the frontal face, the left profile face and the right profile face. This is the minimum requirement for interpolating the face. The recognition system may

receive a test image in a different pose other than the ones for which the training image is

available. The pose of the test image can be estimated as an angle value using pose

estimation techniques. The training image can be interpolated to a face in that specific

pose angle using face reconstruction algorithms. The new training image can then be

compared with test image for recognition using the proposed method. This is a potential

modification to the proposed method for incorporating pose invariant face recognition.

       The method can also be used for other applications such as action recognition and

expression recognition. However, this requires a lot of experimentation and may also

require slight modifications of the proposed approach.




                             LIST OF PUBLICATIONS




1.    "Gradient feature matching for in-plane rotation invariant face sketch

recognition," Ann Theja Alex, Vijayan K. Asari, and Alex Mathew, IS&T/SPIE

International Conference on Electronic Imaging: Image Processing: Machine Vision

Applications VI, San Francisco, CA, USA, 3 - 7 February 2013.

2.    "Gradient feature matching for expression invariant face recognition using single

reference image," Ann Theja Alex, Vijayan K. Asari, and Alex Mathew, IEEE

International Conference on Systems, Man and Cybernetics - SMC 2012 , Seoul, Korea,

14-17 October 2012.

3.     "Learning embedded lines of attraction by self organization for pose and expression invariant face recognition," Ming-Jung Seow, Ann Theja Alex, and Vijayan K. Asari, SPIE Journal of Optical Engineering, vol. 51, no. 10, October 2012.

4.    "Local alignment of gradient features for face sketch recognition," Ann Theja

Alex, Vijayan K. Asari, and Alex Mathew, 8th International Symposium on Visual

Computing (ISVC 2012), Rethymnon, Crete, Greece, vol. 7431, pp. 378-387, 16-18

July 2012.

5.     "A Self Organized Learning Strategy for Object Recognition by an Embedded Line of Attraction," Ming-Jung Seow, Ann Theja Alex, Vijayan K. Asari, Proceedings of SPIE Defense, Security, and Sensing (SPIE DSS 2012), Baltimore, Maryland, USA, vol. 8398, pp. 839803: 1-13, April 2012 - Invited Talk.

6.     "Neighborhood Dependent Approximation by Nonlinear Embedding for Face Recognition," Ann Theja Alex, Vijayan K. Asari, Alex Mathew, Proceedings of the 16th International Conference on Image Analysis and Processing (ICIAP 2011), Ravenna, Italy, vol. 6978, pp. 544-553, September 2011.

7.     "A Linear Manifold Representation for Color Correction in Digital Images," Alex Mathew, Ann Theja Alex, Vijayan K. Asari, Proceedings of the International Conference on Image Processing (ICIP 2011), Bangalore, India, pp. 652-658, 5-7 August 2011.

8.     "A Linear Manifold Representation for Color Constancy," Ann Theja Alex, Vijayan K. Asari, Alex Mathew, in the 6th Annual ASME Dayton Engineering Symposium (DESS 2010), Dayton.

9.     "A Manifold Based Methodology for Color Constancy," Alex Mathew, Ann Theja Alex, Vijayan K. Asari, Proceedings of the 39th IEEE Applied Imagery and Pattern Recognition Workshop (AIPR 2010), Washington D.C., pp. 1-7.




                              BIBLIOGRAPHY



1. W. Zhao, R. Chellappa, P. J. Phillips, and A. Rosenfeld, "Face recognition: A literature survey," ACM Comput. Surv., Vol. 35, pp. 399-458, 2003.

2. X. Wang and X. Tang, "Face Photo-Sketch Synthesis and Recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 31, pp. 1955-1967, 2009.

3. A.M. Martinez and R. Benavente, "The AR face database," CVC Tech. Report #24, pp. 1-8, 1998.

4. M. Lyons, S. Akamatsu, M. Kamachi and J. Gyoba, "Coding facial expressions with Gabor wavelets," Proceedings of the Third IEEE Conference on Face and Gesture Recognition, pp. 200-205, 1998.

5. P.N. Belhumeur, J.P. Hespanha, and D.J. Kriegman, "Eigenfaces vs. Fisherfaces: recognition using class specific linear projection," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, pp. 711-720, 1997.

6. I. S. Bruner and R. Tagiuri, "The perception of people," in Handbook of Social Psychology, Vol. 2, G. Lindzey, Ed., Addison-Wesley, Reading, MA, pp. 634-654, 1954.

7. W. W. Bledsoe, "The model method in facial recognition," Technical report PRI:15, Panoramic Research Inc., Palo Alto, CA, 1964.

8. T. Kanade, "Picture Processing by Computer Complex and Recognition of Human Faces," Technical report, Dept. Information Science, Kyoto University, pp. 1-148, 1973.

9. X. Tang and X. Wang, "Face Photo Recognition Using Sketch," Proceedings of IEEE International Conference on Image Processing, Vol. 1, pp. 257-260, 2002.

10. P. Sinha, B. Balas, Y. Ostrovsky and R. Russell, "Face recognition by humans: 20 results all computer vision researchers should know about," pp. 1-26, 2006.

11. R. Chellappa, C.L. Wilson, S. Sirohey, "Human and machine recognition of faces: a survey," Proceedings of the IEEE, Vol. 83, pp. 705-741, 1995.

12. P. F. de Carrera, "Face Recognition Algorithms," pp. 1-78, 2010.

13. H. Ellis, M. Jeeves, F. Newcombe and A. Young, "Aspects of Face Processing," pp. 1-509, 1986.

14. D. Perkins, "A definition of caricature and recognition," Studies in the Anthropology of Visual Communication, Vol. 2, pp. 1-24, 1975.

15. V. Bruce, "Perceiving and Recognizing Faces," Mind & Language, Vol. 5, pp. 342-364, 1990.

16. J. Sergent, "Ontogenesis and microgenesis of face perception," Cahiers de Psychologie Cognitive/Current Psychology of Cognition, Vol. 9, pp. 123-128, 1989.

17. A.G. Goldstein, "Facial feature variation: Anthropometric data II," Bulletin of the Psychonomic Society, Vol. 13, pp. 191-193, 1979.

18. A. M. Burton, S. Wilson, M. Cowan and V. Bruce, "Face recognition in poor-quality video," Psychological Science, Vol. 10, pp. 243-248, 1999.

19. S. L. Sporer, R. S. Malpass, G. Koehnken, "Psychological Issues in Eyewitness Identification," Mahwah, NJ: Lawrence Erlbaum Associates, pp. 205-231, 1996.

20. Rein-Lien Hsu, "Face Detection and Modeling for Recognition," Ph.D. Dissertation, Michigan State University, East Lansing, MI, USA. Advisor: Anil K. Jain, pp. 1-198, 2002.

21. J. Rabia, A. R. Hamid, "A Survey of Face Recognition Techniques," JIPS, Vol. 5, pp. 41-68, 2009.

22. M.-H. Yang, D. Kriegman, and N. Ahuja, "Detecting faces in images: A survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, pp. 34-58, 2002.

23. C.C. Han, H.Y.M. Liao, K. Chung Yu, and L.H. Chen, "Fast face detection via morphology-based pre-processing," Lecture Notes in Computer Science, Vol. 1311, pp. 469-476, Springer Berlin / Heidelberg, 1997.

24. David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, Vol. 60, pp. 91-110, 2004.

25. L. Sirovich and M. Kirby, "Low-dimensional procedure for the characterization of human faces," Journal of the Optical Society of America A - Optics, Image Science and Vision, Vol. 4, pp. 519-524, 1987.

26. M. Kirby and L. Sirovich, "Application of the Karhunen-Loeve procedure for the characterization of human faces," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 12, pp. 103-108, 1990.

27. M. Turk and A. Pentland, "Eigenfaces for recognition," Journal of Cognitive Neuroscience, Vol. 3, pp. 71-86, 1991.

28. H. A. Rowley, S. Baluja, and T. Kanade, "Neural network-based face detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 20, pp. 23-38, 1998.

29. E. Osuna, R. Freund, and F. Girosi, "Training support vector machines: An application to face detection," Proceedings of the IEEE Conf. Computer Vision and Pattern Recognition, pp. 130-136, 1997.

30. M.H. Yang, D. Roth, and N. Ahuja, "A SNoW-based face detector," Advances in Neural Information Processing Systems, Vol. 12, pp. 855-861, 2000.

31. J. R. Quinlan, "C4.5: Programs for Machine Learning," Morgan Kaufmann Publishers, pp. 71-80, 1993.

32. U. Park, "Face Recognition: face in video, age invariance, and facial marks," Ph.D. Dissertation, Michigan State University, pp. 1-158, 2009.

33. P. A. Viola and M. J. Jones, "Robust real-time face detection," International Journal of Computer Vision, Vol. 57, pp. 137-154, 2004.

34. R. Gottumukkal and V.K. Asari, "An improved face recognition technique based on modular PCA approach," Pattern Recognition Letters, Vol. 25, pp. 429-436, 2004.

35. S. Mika, et al., "Kernel PCA and de-noising in feature spaces," Advances in Neural Information Processing Systems, Vol. 11, pp. 536-542, 1999.

36. Hui-Yuan Wang and Xiao-Juan Wu, "Weighted PCA space and its application in face recognition," Proceedings of IEEE International Conference on Machine Learning and Cybernetics, Vol. 7, pp. 4522-4527, 2005.

37. H. Yu and J. Yang, "A direct LDA algorithm for high-dimensional data — with application to face recognition," Pattern Recognition, Vol. 34, pp. 2067-2069, 2001.

38. Rui Huang, et al., "Solving the small sample size problem of LDA," Proceedings of the 16th IEEE International Conference on Pattern Recognition, Vol. 3, pp. 29-32, 2002.

39. Marian Stewart Bartlett, Javier R. Movellan, and Terrence J. Sejnowski, "Face recognition by independent component analysis," IEEE Transactions on Neural Networks, Vol. 13, pp. 1450-1464, 2002.

40. Ming-Jung Seow, Ann Theja Alex, and Vijayan K. Asari, "Learning embedded lines of attraction by self organization for pose and expression invariant face recognition," SPIE Journal of Optical Engineering, Vol. 51, pp. 1-14, 2012.

41. Zheng, Zhong-Long, and Fan Yang. "Enhanced active shape model for facial

   feature localization", IEEE International Conference on Machine Learning and

   Cybernetics, Vol. 5, pp. 2841-2845, 2008.

42. Chen, Yi, et al. "Discriminative Local Sparse Representations for Robust Face

   Recognition,‖ pp. 1111-1947, 2011.

43. Bicego, Manuele, et al. "On the use of SIFT features for face authentication,"

   IEEE Computer Vision and Pattern Recognition Workshop, pp. 1-35, 2006.

44. H. Bay, A. Ess, T. Tuytelaars, Luc Van Gool, "SURF: Speeded Up Robust

   Features", Computer Vision and Image Understanding (CVIU), Vol. 110, pp. 346-

   359, 2008.




                                        67
45. P. Dreuw, P. Steingrube, H. Hanselmann and H. Ney, ―SURF-Face: Face

   Recognition Under Viewpoint Consistency Constraints,‖ In A. Cavallaro, S.

   Prince and D. Alexander, editors, Proceedings of the British Machine Conference,

   pp. 7.1-7.11, 2009.

46. N. Dalal and B. Triggs, ―Proceedings of IEEE Conference Computer Vision and

   Pattern Recognition,‖ pp. 886 - 893, 2005.

47. O. Déniz, G. Bueno, J. Salido, and F. De la Torre, "Face recognition using Histograms of Oriented Gradients," Pattern Recognition Letters, Vol. 32, pp. 1598-1603, 2011.

48. T. Ojala, M. Pietikäinen, and D. Harwood, "Performance evaluation of texture measures with classification based on Kullback discrimination of distributions," Proceedings of the 12th IAPR International Conference on Pattern Recognition, Vol. 1, pp. 582-585, 1994.

49. T. Ahonen, A. Hadid, and M. Pietikäinen, "Face description with local binary patterns: Application to face recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 469-481, 2006.

50. A. K. Jain, R. P. W. Duin, and J. Mao, "Statistical pattern recognition: A review," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22, pp. 4-37, 2000.

51. I. Biederman and J. Gu, "Surface versus edge-based determinants of visual recognition," Cognitive Psychology, Vol. 20, pp. 38-64, 1988.

52. J. Liu, M. Higuchi, A. Marantz, and N. Kanwisher, "The selectivity of the occipitotemporal M170 for faces," Cognitive Neuroscience and Neuropsychology, Vol. 11, pp. 337-341, 2000.



53. T. Kanade, "Computer recognition of human faces," Interdisciplinary Systems Research, Vol. 47, pp. 1-106, Birkhäuser, 1977.

54. R. Brunelli and T. Poggio, "Face Recognition: Features versus Templates," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 15, pp. 1042-1052, 1993.

55. Y. Gao and M. K. H. Leung, "Face recognition using line edge map," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, pp. 764-779, 2002.

56. B. Takács, "Comparing face images using the modified Hausdorff distance," Pattern Recognition, Vol. 31, pp. 1873-1881, 1998.

57. Y. Gao and M. K. H. Leung, "Human face profile recognition using attributed string," Pattern Recognition, Vol. 35, pp. 353-360, 2002.

58. W. Chen and Y. Gao, "Recognizing partially occluded faces from a single sample per class using string-based matching," Proceedings of the European Conference on Computer Vision, Vol. 3, pp. 496-509, 2010.

59. S. E. Palmer, "Hierarchical structure in perceptual representation," Cognitive Psychology, Vol. 9, pp. 441-474, 1977.

60. H. Buffart, E. L. J. Leeuwenberg, and F. Restle, "Coding theory of visual pattern completion," Journal of Experimental Psychology: Human Perception and Performance, Vol. 7, pp. 241-274, 1981.

61. H. Buffart and E. L. J. Leeuwenberg, "Structural information theory," in H. Geissler (Ed.), Modern Issues in Perception, New York: Elsevier Science, pp. 48-72, 1983.



62. P. A. van der Helm and E. L. J. Leeuwenberg, "Accessibility: a criterion for regularity and hierarchy in visual pattern codes," Journal of Mathematical Psychology, Vol. 35, pp. 151-213, 1991.

63. P. A. van der Helm, R. J. van Lier, and E. L. J. Leeuwenberg, "Serial pattern complexity: irregularity and hierarchy," Perception, Vol. 21, pp. 517-544, 1992.

64. R. A. Chechile, J. E. Anderson, S. A. Krafczek, and S. L. Coley, "A syntactic complexity effect with visual patterns: evidence for the syntactic nature of the memory representation," Journal of Experimental Psychology: Learning, Memory, and Cognition, Vol. 22, pp. 654-669, 1996.

65. K. S. Fu, "Syntactic Pattern Recognition and Applications," Prentice-Hall, Englewood Cliffs, NJ, pp. 1-640, 1982.

66. H. Bunke, "Structural and syntactic pattern recognition," in C. H. Chen, L. F. Pau, and P. S. P. Wang (Eds.), Handbook of Pattern Recognition and Computer Vision, World Scientific Publishing Company, Singapore, pp. 163-209, 1994.

67. J. Canny, "A computational approach to edge detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 8, pp. 679-698, 1986.

68. N. Otsu, "A threshold selection method from gray-level histograms," IEEE Transactions on Systems, Man, and Cybernetics, Vol. 9, pp. 62-66, 1979.

69. Y. Huo, G. Wei, Y. Zhang, and L. Wu, "An adaptive threshold for the Canny operator of edge detection," International Conference on Image Analysis and Signal Processing, pp. 371-374, 2010.




70. R. C. Tompkins, "Multimodal Recognition using Simultaneous Images of Iris and Face with Opportunistic Feature Selection," Advisor: Dr. Vijayan Asari, University of Dayton.

71. http://pr.willowgarage.com/wiki/Face_detection

72. Y. Freund and R. E. Schapire, "A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting," Journal of Computer and System Sciences, pp. 119-139, 1995.

73. A. T. Alex, V. K. Asari, and A. Mathew, "Gradient feature matching for expression invariant face recognition using single reference image," IEEE International Conference on Systems, Man and Cybernetics, pp. 1-6, 2012.

74. A. T. Alex, V. K. Asari, and A. Mathew, "Local alignment of gradient features for face sketch recognition," 8th International Symposium on Visual Computing (ISVC), Vol. 7432, pp. 378-387, 2012.

75. T. F. Smith and M. S. Waterman, "Identification of common molecular subsequences," Journal of Molecular Biology, Vol. 147, pp. 195-197, 1981.

76. S. B. Needleman and C. D. Wunsch, "A general method applicable to the search for similarities in the amino acid sequence of two proteins," Journal of Molecular Biology, Vol. 48, pp. 443-453, 1970.

77. X. Liu, T. Chen, and B. V. K. Vijaya Kumar, "Face Authentication for Multiple Subjects Using Eigenflow," Pattern Recognition, Vol. 36, pp. 313-328, 2003.

78. Y. Hu and Z. Wang, "A similarity measure based on Hausdorff distance for human face recognition," Proceedings of the 18th International Conference on Pattern Recognition, Vol. 3, pp. 1131-1134, 2006.



79. H. K. Ekenel and R. Stiefelhagen, "Local appearance based face recognition using discrete cosine transform," EUSIPCO, pp. 1-5, 2005.

80. Z. Wang and Q. Ruan, "Facial expression recognition based on orthogonal local fisher discriminant analysis," Proceedings of the 10th IEEE International Conference on Signal Processing, pp. 1358-1361, 2010.

81. P. Tsai, T. P. Tran, and L. Cao, "Expression-invariant facial identification," IEEE International Conference on Systems, Man and Cybernetics, pp. 5151-5155, 2009.

82. J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, "Robust face recognition via sparse representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 31, pp. 210-227, 2008.

83. P. Nagesh and B. Li, "A compressive sensing approach for expression invariant face recognition," IEEE Conference on Computer Vision and Pattern Recognition, pp. 1518-1525, 2009.

84. B. Tunc, V. Dagli, and M. Gokmen, "Robust face recognition with class dependent factor analysis," Proceedings of the International Joint Conference on Biometrics, pp. 1-6, 2011.

85. G. B. Huang, M. Ramesh, T. Berg, and E. Learned-Miller, "Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments," University of Massachusetts, Amherst, Technical Report 07-49, 2007.

86. X. Wang and X. Tang, "Face Photo-Sketch Synthesis and Recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 31, pp. 1955-1967, 2009.



87. X. Tang and X. Wang, "Face Photo Recognition Using Sketch," Proceedings of the IEEE International Conference on Image Processing, Vol. 1, pp. 257-260, 2002.

88. X. Tang and X. Wang, "Face Sketch Recognition," IEEE Transactions on Circuits and Systems for Video Technology (CSVT), Special Issue on Image and Video Based Biometrics, Vol. 14, pp. 50-57, 2004.

89. X. Wang and X. Tang, "Face Photo-Sketch Synthesis and Recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 31, pp. 1955-1967, 2009.

90. W. Zhang, X. Wang, and X. Tang, "Coupled Information-Theoretic Encoding for Face Photo-Sketch Recognition," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 513-520, 2011.



