					Acknowledgements
I would like to express my gratitude to all the people who helped me complete
this project. First of all, I want to thank my project supervisor, Dr. Andreas
Lanitis, who gave me many suggestions during my work, on both general issues
and specific problems. I am also glad to thank Dr. Georgios Stylianou, who
gave me much advice on graphics-related techniques. Finally, I thank Dr.
Andreas Grondoudis for taking part in my presentation and helping me
improve this project.
Once again, thank you all!




Jiang Shan                                                                  1
Abstract
This thesis describes a system that reconstructs a 3D model of a face from two
orthogonal images. The distinguishing feature of this system is that it is
entirely independent of a database: all information required to reconstruct the
3D face model comes from the input images. It is also relatively easy to use, so
that users who are not proficient in computer graphics and image processing
can learn it in a short time. The thesis covers the background of 3D face
reconstruction techniques, the CANDIDE model on which the system is based,
and the specific techniques used in the software. A performance evaluation is
also provided.




Table of Contents
Acknowledgements.............................................................................................. 1
Abstract ................................................................................................................ 2
Table of Contents ................................................................................................. 3
1    Introduction ................................................................................................. 4
  1.1      Overview.............................................................................................. 4
  1.2      Aims and Objectives ............................................................................ 4
  1.3      Feasibility ............................................................................................ 5
  1.4      Structure of the Report......................................................................... 5
2    Literature Review ........................................................................................ 6
  2.1      What Is 3D Face Reconstruction? ....................................................... 6
  2.2      Overview of Different Methods........................................................... 6
     2.2.1         Shape from Shading Methods...................................................... 6
     2.2.2         Shape from Motion Methods ....................................................... 7
     2.2.3         Stereo Methods ............................................................................ 7
     2.2.4         Geometry-Based Methods ........................................................... 8
3    Introduction to the CANDIDE Model ........................................................ 9
  3.1      Overview of the CANDIDE Model ..................................................... 9
  3.2      The CANDIDE Versions ..................................................................... 9
  3.3      Description of CANDIDE v3.1.6 ...................................................... 10
     3.3.1         The Reasons to choose CANDIDE v3.1.6................................. 10
     3.3.2         The Content and Structure of CANDIDE v3.1.6 ....................... 11
     3.3.3         Significant Shortages of v3.1.6 Affecting Design and
     Implementation .......................................................................................... 11
4    Design and Implementation of 3D Face Reconstruction ........................... 13
  4.1      Software Structure ............................................................................. 13
  4.2      Framework ......................................................................................... 14
  4.3      Load Images....................................................................................... 16
  4.4      Load the CANDIDE Model ............................................................... 17
  4.5      Calculate Normal Vectors.................................................................. 18
  4.6      Project Vertices on 2D Planes and Adjust Coordinates..................... 19
  4.7      Fit Face Images by Graphic Operation .............................................. 21
  4.8      Global, Part and Vertex Based Operation.......................................... 22
  4.9      Difference of Implementation between Frontal and Side Images ..... 23
  4.10 Improve Quality by Interpolation ...................................................... 24
  4.11 Texture Mapping and Reconstruction ............................................... 29
5    Performance Evaluation ............................................................................ 31
6    Conclusions and Future Work ................................................................... 34
  6.1      Conclusions ....................................................................................... 34
  6.2      Future Work ....................................................................................... 34
References.......................................................................................................... 35
User Manual....................................................................................................... 36
Software CD ...................................................................................................... 40




1 Introduction
1.1 Overview
Techniques in the 3D domain are developing rapidly, and many real-world
applications now involve 3D imagery. Many people believe that the future
belongs to 3D images rather than 2D ones. However, obtaining 3D images is
still out of reach for most people, partly because the equipment used to capture
them is relatively expensive. Using software to build 3D models from 2D
information is therefore a practical alternative, and this project was designed
for that reason. By employing suitable techniques, it can reconstruct a 3D
model of a face from 2D photographs. The hardware requirements are
relatively low, so the software can run on most modern computers.

1.2 Aims and Objectives
The objective of this project is to reconstruct a 3D face model from two
orthogonal input images of a person's face, as illustrated in Figure 1. The
reconstructed 3D model should capture both the geometry and the texture of
the person's face. In terms of performance, high resolution and short execution
time are desired. A Graphical User Interface (GUI) is required to make the
system convenient to operate.




       [Figure: orthogonal input images → reconstructed 3D model]

             Figure 1: Objective of 3D Face Reconstruction




1.3 Feasibility
In order to ensure a high level of performance and meet the development
schedule, the MATLAB development environment was chosen as the only
implementation tool.
      Because of the schedule constraints and the limitations of the MATLAB
development environment, this software requires MATLAB as its runtime
environment.


1.4 Structure of the Report
This report is composed of four main chapters. First, a literature review is
provided, introducing the main ideas, advantages and disadvantages of
different 3D face reconstruction methods, with emphasis on the geometry-based
approach. After that, the CANDIDE model, which this software uses as the
basis of 3D face reconstruction, is introduced, together with the reasons for
choosing it and the ways its shortcomings are mitigated. The next chapter
describes the design and implementation of the system; the technical issues
involved are explained and solutions are provided. Then the algorithm used to
test the software is introduced, and the testing results and evaluation are given
at the end of that chapter. Beyond those four chapters, a chapter on
conclusions and future work closes the report. References and a user manual
are given in the appendix.




2 Literature Review
2.1 What Is 3D Face Reconstruction?
3D face reconstruction is a technology used for reconstructing three-
dimensional (3D) face geometry from media such as images and video. The
reason we need this technology is to provide acquisition of 3D data without
expensive scanning equipment so that 3D data could easily be used in daily
applications.

      Based on the methodology employed, the methods of 3D face
reconstruction can be classified into the following categories: Methods based
on shape from shading (SFS), shape from motion methods, stereo methods and
geometry-based methods.


2.2 Overview of Different Methods
2.2.1 Shape from Shading Methods
Shading is the variation in brightness from one point to another in the image. It
carries information about shape because the amount of light a surface point
reflects depends on its orientation relative to the incident light. Shape from
shading (SFS) methods attempt to recover the 3D face by accurately estimating
the lighting conditions.
       SFS methods usually use a single face image as input, but in some
cases they can also utilise multiple non-orthogonal images of a subject. This
flexibility is considered one of the most important advantages of SFS methods:
they can be applied both when source images are scarce and when a high level
of quality is required. The main steps of a typical SFS-based reconstruction
method are as follows. Given a novel face image, the light source's position
and orientation are first inferred. Using the shading information, the normal
vector and the 3D point for each pixel of the face are
recovered. In the last step, the 3D face is reconstructed. Optimization processes
can be done after that.
       How to utilise the shading information is the key question for SFS
methods. One approach is based on statistical information. Atick et al. [Atick
et al. 1996] [1] proposed the first statistical shape from shading method for
reconstructing a 3D face from a single image. First, they acquire a database of
200 scans of 3D faces, parameterise all faces in cylindrical coordinates and
align them using 3D rigid transformations. The quality of the result is
therefore directly determined by the content of the database, which is the main
shortcoming of this approach.


2.2.2 Shape from Motion Methods
Shape from motion techniques are used for generating a 3D face model based
on the extracted shapes of the face in different frames in image sequences.
       Leung et al [Leung et al. 2000] [2] manually located 44 facial points
on a face shown in the input image sequences. In a pre-processing step, they
manually identified the same feature points on a generic 3D model, so that the
features in the video and in the generic model are in correspondence. To
compute the correspondence of the remaining points, they performed a
cylindrical projection to map all points of the 3D generic mesh to 2D, and then
triangulated the feature points to help compute the inverse affine map for every
non-feature point. Finally, they texture mapped the face onto the 3D generic
model and morphed the generic model to better match the original.
       Many improved methods build on the above idea. However, most of
them share the same disadvantages: they require a large amount of source
information, and their operation is relatively complex.


2.2.3 Stereo Methods
Stereo based face reconstruction is a technique used for 3D reconstruction in
very diverse scenes. Stereo requires two off-the-shelf digital cameras which are
connected together and calibrated so that they aim at the same object. The
framework of stereo based face reconstruction is as follows. Both cameras are
used to take two snapshots of the face. Then pixel correspondences are
established between the two images to create the disparity map. The disparity
map and the knowledge of the relative distance between the two cameras are
used to compute the depth map.
       The most important shortcoming of stereo-based methods is that the
required equipment is not always available. In addition, the performance is
often affected by environmental conditions.


2.2.4 Geometry-Based Methods
Geometry-based methods extract the geometric features (eyes, mouth, nose, etc.)
from the face images and use them to reconstruct the 3D face. The advantage
of this class of methods is that no 3D face database is required. The inputs are
at least two face images that are orthogonal to each other; the geometry
extracted from all images of the same face is blended to produce the 3D face.
The disadvantage, however, is that the quality of the reconstruction is usually
limited by the quality of the source images.
       The first step of all geometry-based methods is to acquire a frontal and
a side photo of the face to be reconstructed. The second step is to find all
corresponding points in both photos: the frontal photo provides the x and y
coordinates of each point of the face, and the side photo provides the y and z
coordinates. As a result, the x, y and z coordinates of all points on the face are
available. Finally, the reconstructed face is texture mapped using the blended
texture from the orthogonal photos.
       All geometry-based methods rely on the assumption that faces have
symmetrical shape and texture. Since only a frontal and one side image are
provided, information from that side image is used to infer the appearance of
the other side. When dealing with non-symmetrical faces, geometry-based
methods fail to produce accurate results. Nevertheless, this shortcoming can be
removed by processing a third image showing the other side of the face.




3 Introduction to the CANDIDE Model
3.1 Overview of the CANDIDE Model
CANDIDE is a parameterised face mask specifically developed for model-
based coding of human faces. Although it has a low number of polygons
(approximately 100), it captures the most important feature points of human
faces. It can therefore offer a relatively high level of modelling detail while
allowing fast reconstruction with moderate computing power.
         CANDIDE is controlled by global and local Action Units (AUs). The
global ones correspond to rotations around three axes. The local Action Units
control the mimics of the face so that different expressions can be obtained [3].
         The concept of Action Units was first described about 30 years ago by
the Swedish researcher Carl-Herman Hjortsjö in his book Man's Face and the
Mimic Language (in Swedish). This work was later extended by Paul Ekman
and Wallace V Friesen of the Department of Psychiatry at University of
California Medical Centre [3].
         The CANDIDE model was created by Mikael Rydfalk at the Linköping
Image Coding Group in 1987. This work was motivated by the first attempts to
perform image compression through animation [3].
      The CANDIDE model became known to a larger public through journal
articles. It is publicly available and is now used by research groups around the
world.


3.2 The CANDIDE Versions
The original CANDIDE, described in the report by M. Rydfalk, contained 75
vertices and 100 triangles and is demonstrated by the Java Demo. This version
is rarely used because of its low modelling quality and lack of Action Units [3].
         The most widespread version, the de facto standard CANDIDE model,
is a slightly modified model with 79 vertices, 108 surfaces and 11 Action Units.



This model was created by Mårten Strömberg while implementing the xproject
package, and is here referred to as Candide-1 [3].
       Later, Bill Welsh at British Telecom created another version with 160
vertices and 238 triangles covering the entire frontal head (including hair and
teeth) and the shoulders. This version, known as Candide-2 is also included in
the xproject package, but is delivered with only six Action Units [3].
      A third version of CANDIDE has been derived from the original one.
The main purpose of this model is to simplify animation by MPEG-4 Facial
Animation Parameters [3]. Therefore, about 20 vertices have been added, most
of them corresponding to MPEG-4 feature points. This model is called
Candide-3 and is included in the WinCandide package [3]. In Figure 2, the
wireframes of the different versions of CANDIDE are shown:




             Candide-1             Candide-2             Candide-3
                   Figure 2: Different versions of CANDIDE



3.3 Description of CANDIDE v3.1.6
3.3.1 The Reasons to choose CANDIDE v3.1.6
For this project, CANDIDE v3.1.6 was chosen as the basic face modelling tool,
for the following reasons. First, it is a relatively recent version of the
CANDIDE family; many researchers are working with it, so it is easy to share
experience and gather advice about it. Second, it contains a reasonable
number of feature points and polygons, so a model based on it can reach an
acceptable level of quality while keeping the running time relatively low.
Finally, because of its structure, it can easily be utilised by a MATLAB program.



3.3.2 The Content and Structure of CANDIDE v3.1.6
There are two files in the CANDIDE v3.1.6 package, with exactly the same
content. One is a text file used as a description file; the other is an object file,
which is the one actually utilised.
        The structure of CANDIDE v3.1.6 is matrix-based, and its content can
be divided into six parts. The first part is a description of the modifications in
each version (from v3.1.1 to v3.1.6). The second part is a list of vertices: there
are 113 vertices in v3.1.6, and for each vertex the x, y and z coordinates are
given, forming a 113-by-3 matrix. All coordinates lie between -1.0 and 1.0.
The third part is a list of faces, which are triangles: there are 184 faces in
v3.1.6, and for each face three vertex indices are given, forming a 184-by-3
matrix. Since this part contains only indices, the list of vertices must be
consulted in order to draw a face. The fourth part is a list of animation units;
sixty-five animation units are provided, each specifying the indices of the
vertices to move and how to move them. The fifth part is a list of shape units,
which describes how to change the shape of each facial feature. The last part
is about texture. For this project, only the second and third parts are
considered useful.
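The two useful parts can be pictured as plain matrices with the shapes described above. The following sketch uses Python/NumPy as a stand-in for the MATLAB matrices used by the thesis; the arrays are hypothetical placeholders, not actual CANDIDE data:

```python
import numpy as np

# Part 2 -- 113 vertices, one (x, y, z) row each, coordinates in [-1.0, 1.0]
vertices = np.zeros((113, 3))

# Part 3 -- 184 triangular faces, each row holding three vertex indices
faces = np.zeros((184, 3), dtype=int)

# Sanity checks implied by the description above
assert vertices.shape == (113, 3)
assert faces.shape == (184, 3)
assert np.all((vertices >= -1.0) & (vertices <= 1.0))
assert np.all((faces >= 0) & (faces < len(vertices)))
```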


3.3.3 Significant Shortages of v3.1.6 Affecting Design and
        Implementation
There are three significant shortcomings of v3.1.6 that directly affect design
and implementation. First, matrix indices in v3.1.6 start at 0, whereas in
MATLAB they start at 1. This mismatch may lead to out-of-range index errors
in some conditions, so it must be considered during design and implementation.
Second, v3.1.6 is a normal-free model: it contains no information about the
normal vectors of either faces or vertices. However, in an application such as
this project, avoiding the use of normal vectors is almost impossible. As a
result, calculating and storing the normal vectors of faces and vertices is
included in the task domain. The last shortcoming is that not all faces of
v3.1.6 list their vertices in the same order, which makes calculating normal
vectors difficult. This problem is discussed in more detail in the next chapter.




4 Design and Implementation of 3D Face
    Reconstruction
4.1 Software Structure
In order to fulfil the aims of this software, and considering the features of
MATLAB, the software consists of three major parts.
       The first part is the main program. Its major functions are: (1) offering
a user-friendly GUI; (2) receiving users' input; (3) showing the outputs or
reacting to users' operations; (4) offering information related to the software;
(5) storing data as global variables. Figure 3 is a screenshot of the GUI.




                             Figure 3: GUI Layout

       The second part is a set of functions (procedures). Each function
performs a specific task or a set of related tasks. Most of the work related to
calculation, decision making and data manipulation is done in this part.



       The last part is the CANDIDE model, which is obviously necessary for
this software.


4.2 Framework
The whole procedure of 3D face reconstruction can be divided into the
following steps:
• Step 1: Load Images. Two images are loaded: a frontal image of a face and
  a side image of the same face.
• Step 2: Load CANDIDE Model. The CANDIDE model is loaded, and all
  useful data are stored as global variables in MATLAB matrix format.
• Step 3: Project CANDIDE Model. The vertices of the CANDIDE model are
  projected on both images as feature points.
• Step 4: Morph CANDIDE Model. By moving feature points, the CANDIDE
  model is adjusted to fit the face more accurately.
• Step 5: Interpolation. To improve the quality of the 3D model, interpolation
  is performed; the newly generated vertices give the model a much higher
  resolution than before.
• Step 6: Texture Mapping. Each vertex is assigned a colour extracted from
  either the frontal or the side image.
• Step 7: Reconstruct 3D Face. A 3D face model is generated from all the
  information obtained in the previous steps.
• Step 8: Export 3D Model. In this last step of the whole procedure, users can
  save the 3D model for other uses.
       Figure 4 demonstrates the relationships and outputs of the major steps.
In the remainder of this chapter, the whole procedure is discussed step by step,
with details offered for important specific issues.




[Figure: flowchart — the orthogonal images (after adjustment) and the
CANDIDE model feed Projection; Projection is followed by Morph, which
yields Feature Extraction; Interpolation yields the Improved Model; Feature
Extraction and the Improved Model feed Texture Mapping, which produces
the final 3D Model]

             Figure 4: Framework of 3D Face Reconstruction




4.3 Load Images
Unlike pure image processing software, this project does not really open the
loaded images, i.e. the images themselves are never modified. To use an image,
only its directory and file name are required. Thus, the MATLAB function
uigetfile is used to load images: it provides a standard GUI-based file selector
and returns the directory and file name of the selected file. The function also
has a variant that allows users to select more than one file at once; however,
this variant is not used in this project, so that the two images do not have to be
in the same directory and users do not have to move files in order to load them
at the same time.
       The most challenging issue in this part is that, in most cases, the
frontal and side images are either not the same size or zoomed at different
ratios. In this project, most image-processing issues are left to the user in
order to keep the focus on the core purpose of the software. However, this
particular problem can be very annoying, especially for large amounts of work,
so its solution is included as part of this project. After selecting the frontal and
side images, users are asked to define two points on each image, corresponding
pair by pair. Figure 5 shows an example; the blue points are defined by the
user:




                 Figure 5: Adjusting Frontal and Side Images




        The idea is the following. Let the points in the frontal image be P1(x1,
y1) and P2(x2, y2), and the points in the side image be P3(x3, y3) and P4(x4, y4).
Since the origin of coordinates in MATLAB is at the top left, if y1 = a and
y3 = c, then y2 = a + b and y4 = c + d. Set r = b / d, which can be considered
the zooming ratio between the frontal and side images, and n = a - c * r,
which is the vertical offset between the two images at the same zooming ratio.
Let Yf denote y coordinates in the frontal image and Ys those in the side image.
Then, for all points:
        Ys = (Yf - n) / r, (r ≠ 0).
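As a sketch of this mapping (in Python rather than the MATLAB used by the thesis; the function name is my own invention), assuming the four user-defined points are given as (x, y) tuples:

```python
def frontal_to_side_y(yf, p1, p2, p3, p4):
    """Map a y coordinate from the frontal image into the side image.

    p1, p2: the two user-defined (x, y) points on the frontal image.
    p3, p4: the corresponding (x, y) points on the side image.
    """
    a, c = p1[1], p3[1]
    b = p2[1] - p1[1]          # vertical span between the points, frontal image
    d = p4[1] - p3[1]          # vertical span between the points, side image
    r = b / d                  # zooming ratio between the two images
    n = a - c * r              # vertical offset at the frontal scale
    return (yf - n) / r        # Ys = (Yf - n) / r
```

For example, with p1 = (0, 10), p2 = (0, 30) on the frontal image and p3 = (0, 5), p4 = (0, 15) on the side image, the frontal y values 10 and 30 map to 5 and 15 respectively, i.e. exactly onto the chosen landmarks.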
        Besides aligning the frontal and side images, this solution provides an
extra benefit: two points of the face are located at the very beginning of the
reconstruction procedure. For example, in Figure 5 the two user-defined points
are the tip of the nose and the midpoint of the mouth, so these two feature
points are located already at the image-loading stage. All the other feature
points can then be moved relative to these two points, which means that all
feature points are placed automatically at positions relatively close to the face.
This feature reduces the users' work dramatically.


4.4 Load the CANDIDE Model
As mentioned before, the CANDIDE model is a text-based data collection, so
string operations are necessary to extract information from a CANDIDE file.
To ensure high efficiency, a function provided by my project supervisor is used
for this task. It accepts a file name as input, in this case the name of the
CANDIDE file. Using string operations, this function reads vertices,
connections, faces, vertices_texture_cords and normal, and returns them as
output. However, as mentioned in the previous chapter, the CANDIDE model
does not contain any information about normal vectors, so the function returns
an empty matrix as normal.



       The only problem here is that all indices in the CANDIDE model start
at 0, while MATLAB indices start at 1. The solution is to add 1 to all indices
after loading the data.
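The shift is a one-line array operation. As a minimal NumPy illustration (the actual loading is done by the supervisor's MATLAB function; the array below is a made-up fragment, not real CANDIDE data):

```python
import numpy as np

# A made-up fragment of the 0-based face list as stored in the CANDIDE file
faces_zero_based = np.array([[0, 1, 2],
                             [1, 2, 3]])

# MATLAB indexing starts at 1, so every index is shifted by one after loading
faces_matlab = faces_zero_based + 1
```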


4.5 Calculate Normal Vectors
Normal vector is an attribute with significant meaning in most 3D applications.
For this project, it can be used at least for interpolation. Thus, calculating
normal vectors correctly is an important part of the task domain. For a 3D
application, usually normal vectors can be divided into two categories: face
normal vectors and vertex normal vectors. In this project, vertex normal vectors
are more useful than face ones. However, in order to calculate vertex normal
vectors, calculating face normal vectors is necessary.
       Since all polygons can be divided into a limited number of triangles, in
this report I only mention triangle faces to demonstrate all methods related to
polygons. Look at Figure 4, for a face such as face ABC, the normal vector is a
vector perpendicular to it. In order to get such a vector, the easiest way is
calculating the cross product of two unparallel vectors in this face. In face ABC,
vector AB and vector AC are unparallel to each other obviously. Thus, both AB
X AC and AC X AB can be used as normal vector. The only difference
between them is the directions of them are opposite. Because of this issue,
while calculating face normal vectors, the sequence of vectors is very important.
Either clock-wise or counter-clock-wise doesn‟t really matter in most of cases,
but consistency must be insured. The last step of calculating normal vectors is
normalization, which makes the magnitude of a normal vector equal to 1. In
Figure 6, vector N is a normal vector of face ABC, and N‟ is the normalized N,
we have:

        N = AC × AB, and
        N' = N / |N|.

                     Figure 6: Calculating Normal Vectors

       Because the CANDIDE model does not use the same vertex ordering
for all faces, it is difficult to make all face normals point the same way, either
outwards from or inwards into the 3D model. Fortunately, this inconsistency is
not random, and the CANDIDE model is modified manually to fix the problem.
       Once the normal vectors of all faces are available, the vertex normal
vectors can be obtained by averaging the normals of the faces adjacent to each
vertex. For example, if vertex A belongs to faces f1, f2 and f3 whose normal
vectors are n1, n2 and n3 respectively, the normal vector of A is
N = (n1 + n2 + n3) / 3. One further problem appears: in the CANDIDE model
not every vertex belongs to a face, so for such vertices a normal vector cannot
(and need not) be calculated. To avoid any potential problems, these are left as
zero vectors.
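The whole computation can be sketched as follows (a Python/NumPy stand-in for the MATLAB code; vertices not used by any face keep a zero normal, as decided above):

```python
import numpy as np

def compute_normals(vertices, faces):
    """vertices: (N, 3) array; faces: (M, 3) array of 0-based vertex indices
    with a consistent winding order.  Returns unit face normals and vertex
    normals obtained by averaging the normals of the adjacent faces."""
    a = vertices[faces[:, 0]]
    b = vertices[faces[:, 1]]
    c = vertices[faces[:, 2]]
    face_n = np.cross(c - a, b - a)                      # N = AC x AB
    norms = np.linalg.norm(face_n, axis=1, keepdims=True)
    face_n = face_n / np.where(norms == 0, 1.0, norms)   # normalisation

    vert_n = np.zeros_like(vertices, dtype=float)
    counts = np.zeros(len(vertices))
    for i, tri in enumerate(faces):
        for v in tri:                    # accumulate adjacent face normals
            vert_n[v] += face_n[i]
            counts[v] += 1
    used = counts > 0                    # unused vertices stay zero vectors
    vert_n[used] /= counts[used, None]
    return face_n, vert_n
```

For a single triangle with vertices (0,0,0), (1,0,0) and (0,1,0), AC × AB gives the unit normal (0, 0, -1), and each of its three vertices receives that same normal.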


4.6 Project Vertices on 2D Planes and Adjust
      Coordinates
As mentioned in chapter 2, the basic idea of this project is a geometry-based
method of 3D face reconstruction. As Figure 7 shows, the project requires two
orthogonal images: the frontal image provides the x and y coordinates of each
point, and the side image provides the z and y coordinates. As a result, the x, y
and z coordinates are available for every point, and reconstructing a 3D face
model becomes possible.




             Figure 7: Geometry-Based 3D Face Reconstruction


       It is now obvious that obtaining coordinates from both images is
essential to this software. To generate a useful 3D model, a large number of
points is required, and obtaining their coordinates manually would be not only
time-consuming but also unwise. This issue is one of the most important
reasons for using the CANDIDE model. The idea is the following: by
projecting the vertices of the CANDIDE model on both images (in different
ways, of course), feature points are generated; by utilising the information
gathered through those feature points, more points can be involved
automatically.
       As a model for general use, the CANDIDE model follows the most
common mathematical convention: the centre of the model lies at the origin of
a rectangular coordinate system, and coordinates range from -1 to 1 along each
axis. By contrast, in MATLAB the origin is at the top left of an image, and
only positive coordinates appear inside the image. Thus, converting vertices
from CANDIDE format to MATLAB format is the most important job of the
projection step. Again, I use a function provided by my project supervisor to
perform this task. This function, however, only works with x and y
coordinates, so I do the same thing for z coordinates as well. The basic idea of
this function is the following: (1) get the width (w) and height (h) of the image
the vertices will be projected on; (2) divide both w and h by 2 to get w′ and h′;
(3) for all x and z coordinates:
       x′ = x · w′ + w′,
(4) for all y coordinates:
       y′ = h′ − y · h′.



4.7 Fit Face Images by Graphic Operation
After projection, the next key job is moving the feature points to fit the face in
both images. In this stage, graphic transformation operations are really useful.
By performing translation, rotation and scaling around all three axes, the
CANDIDE model can be morphed as close to the face as possible. All these
operations are matrix-based, so in order to satisfy the mathematical rules, the
coordinates of each feature point are temporarily appended with an extra
column in the format [x y z 1]. Figure 8 shows all the matrices involved in the
graphic transformation operations:




   Translation Matrix        Scaling Matrix     Rotation around X-axis Matrix




      Rotation around Y-axis Matrix           Rotation around Z-axis Matrix


                   Figure 8: Graphic Transformation Matrices

       To perform these operations, multiplying the vector of coordinates by
the corresponding matrix is enough. However, for this project, rotation does
not serve the aim much. This software accepts two orthogonal images of a face
as inputs, which implies that both images must show a face looking straight
ahead, not a face shaking or nodding. Although the software could generate 3D
models for askew faces by employing rotation, those models would not be as
accurate as expected. In order to avoid any confusion, the rotation capability
was abandoned in the end.
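The translation and scaling operations (rotation omitted, as explained above) can be sketched as follows. This is a Python illustration of the row-vector [x y z 1] convention, not the software's MATLAB code; the helper names are assumptions.

```python
import numpy as np

def translation(tx, ty, tz):
    """4x4 homogeneous translation matrix for row vectors [x y z 1]."""
    m = np.eye(4)
    m[3, :3] = [tx, ty, tz]   # last row carries the offsets for row vectors
    return m

def scaling(sx, sy, sz):
    """4x4 homogeneous scaling matrix."""
    return np.diag([sx, sy, sz, 1.0])

# Append the extra homogeneous column, multiply, then drop it again.
pts = np.array([[1.0, 2.0, 3.0]])
homog = np.hstack([pts, np.ones((len(pts), 1))])      # -> [x y z 1]
moved = homog @ scaling(2, 2, 2) @ translation(10, 0, 0)
result = moved[:, :3]   # scale by 2, then shift x by 10
```

Because row vectors are used, the matrices apply left to right: here the point is scaled first and translated second.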


4.8 Global, Part and Vertex Based Operation
In the previous section, the basic idea of fitting the CANDIDE model onto a
face is described. Performing those graphic transformation operations on the
whole model can be considered as global operations. They are useful, but not
perfect. The CANDIDE model is designed for general use, which means it
cannot fit every particular face just by being morphed as a whole. Global
operations are, so to speak, the first level of operations provided by this
software. The next level is operations on individual parts of a face. In this
project, the CANDIDE model is divided into 12 parts based on the structure of
human faces: outer hair line, inner hair line, left eyebrow, right eyebrow, left
eye, right eye, nose, left cheek, right cheek, upper lip, lower lip, and chin. By
choosing a part, users can perform graphic operations on only that part.
Moreover, if necessary, users can mark more than one part as active and morph
them together. This level of operations makes fitting a face much more
flexible, so the generated 3D model can come closer to being accurate.
       Although the second level of operations makes things much better, it is
not perfect either. Based on my experiments, in most cases there are always
some points that cannot be fitted any other way. In order to deal with those
trouble makers, a third level of operations, vertex-based operations, is added.
At this level, users can move individual vertices one by one. The best way to
implement this function would be standard mouse events: dragging, selecting
with a rectangular frame, right-clicking, and so on. However, in MATLAB all
those events rely on ActiveX controls, so for a program without ActiveX
controls, such as this one, they cannot be used. (I have not studied every
possible solution provided by MATLAB, so there may be a way to use them
without ActiveX controls; at least so far I have not found one.) The solution I
provide is selecting and moving only one vertex at a time. This makes the
function a little tedious and not very user friendly; despite this shortcoming,
however, it works well.
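The first two levels of operations might look like this in outline. This is a Python sketch; the part names follow the text, but the vertex index lists are hypothetical, not the real CANDIDE part definitions.

```python
import numpy as np

# Hypothetical index lists -- the real CANDIDE part definitions differ.
parts = {
    "nose": [10, 11, 12],
    "upper lip": [20, 21],
}

def translate_parts(vertices, parts, active, offset):
    """Apply a translation only to the vertices of the selected parts.

    With active=None the operation is global (first level); with a list
    of part names it becomes a part-based operation (second level).
    """
    vertices = vertices.copy()
    if active is None:
        idx = slice(None)   # global: every vertex is affected
    else:
        # union of the vertex indices of all active parts
        idx = sorted({i for p in active for i in parts[p]})
    vertices[idx] += np.asarray(offset, float)
    return vertices

verts = np.zeros((30, 3))
out = translate_parts(verts, parts, ["nose"], [5.0, 0.0, 0.0])
```

The third level (moving one vertex at a time) is just the degenerate case of a "part" containing a single index.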


4.9 Difference of Implementation between Frontal
      and Side Images
Another issue related to fitting faces is that frontal and side images differ in
implementation. Code that works on frontal images cannot be used directly for
side images, and vice versa. Figure 9 shows the difference between feature
points in frontal and side images:




            Frontal image                                Side image
           Figure 9: Feature Points in Frontal and Side Images
      The most significant difference is that on frontal images, x and y
coordinates are manipulated, but on side images, z and y coordinates are
manipulated. This difference means that not all graphic operations can be done
on both frontal and side images: in frontal images, any operation on z
coordinates will show no effect, and similarly, in side images, any operation on
x coordinates will show no change.
       The other difference is that the feature points shown on frontal and side
images are different. In frontal images, most of the feature points can be
shown; by contrast, in side images, because only one side of the face is visible,
only points on the left half and the middle line of the face can be shown. The
benefit of hiding the points on the right half is that graphic operations can be
done without confusion, especially vertex-based movement. The drawback is
that the information gathered from feature points on the left half is mirrored to
the right half, which makes the generated 3D model symmetric. This is also the
most significant drawback of the geometry-based approach; a solution is left as
future work. During testing, showing all the points on the left half and the
middle line of the face on the side image proved too confusing because of their
number. I therefore reduced the number of visible feature points for side
images: only a few feature points are drawn on side images, while the others
cannot be seen but are still there.
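The mirroring of left-half depth information to the right half could be sketched as follows. This is a Python illustration under the simplifying assumption that every left vertex has an exact mirror partner at the same y coordinate; the real model's correspondence is more involved.

```python
import numpy as np

def mirror_left_to_right(vertices, mid_x=0.0):
    """Copy depth (z) gathered on the left half of the face to the
    mirrored vertex on the right half, producing a symmetric model.

    Assumes each left vertex (x < mid_x) has a mirror partner at
    (2*mid_x - x, y); coordinates are rounded to build a stable lookup.
    """
    vertices = vertices.copy()
    # Lookup from rounded (x, y) to vertex index.
    key = {(round(x, 6), round(y, 6)): i
           for i, (x, y, _) in enumerate(vertices)}
    for x, y, z in vertices:
        if x < mid_x:
            j = key.get((round(2 * mid_x - x, 6), round(y, 6)))
            if j is not None:
                vertices[j, 2] = z   # mirror the depth to the right half
    return vertices

verts = np.array([[-1.0, 0.0, 0.7],   # left vertex with measured depth
                  [ 1.0, 0.0, 0.0]])  # its right-hand mirror partner
out = mirror_left_to_right(verts)
```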


4.10 Improve Quality by Interpolation
As I mentioned in chapter 3, there are only 113 vertices and 184 faces in the
CANDIDE model. Obviously such a small number of vertices and faces cannot
produce a 3D model of high quality. The solution is to employ interpolation to
improve the quality. There are many interpolation methods with different
performance; in the remainder of this section, I describe all the methods of
interpolation I have tested for this project.
        The easiest way of interpolation is linear interpolation. Figure 10 shows
the basic idea of linear interpolation.




                         Figure 10: Linear Interpolation

       For a face to be interpolated, the coordinates of the midpoints of all
three edges are calculated. Those midpoints are considered new vertices. By
connecting these new vertices pairwise, the original face is divided into four
new faces. As a result, the number of vertices grows by roughly one and a half
times the number of faces, and the number of faces becomes four times what it
was before. The advantages of linear interpolation are that it is easy to
implement and fast to compute. The disadvantage is that although it does
increase the number of vertices and faces, it can only improve the colour
quality, not the geometric quality: the geometric shape of the model remains
the same as the original.
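One subdivision step of this linear interpolation can be sketched as follows. This is a Python illustration; the shared-midpoint bookkeeping is what makes the vertex count grow by roughly 1.5 per face rather than 3.

```python
import numpy as np

def subdivide_linear(vertices, faces):
    """One step of linear (midpoint) interpolation: each triangle is
    split into four by the midpoints of its edges. Midpoints on shared
    edges are created only once.
    """
    vertices = [tuple(v) for v in vertices]
    midpoint_index = {}
    new_faces = []

    def mid(a, b):
        key = (min(a, b), max(a, b))     # edge key, orientation-independent
        if key not in midpoint_index:
            m = tuple((np.array(vertices[a]) + np.array(vertices[b])) / 2.0)
            midpoint_index[key] = len(vertices)
            vertices.append(m)
        return midpoint_index[key]

    for a, b, c in faces:
        ab, bc, ca = mid(a, b), mid(b, c), mid(c, a)
        # one original triangle -> four smaller triangles
        new_faces += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return np.array(vertices), np.array(new_faces)

verts = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
tris = np.array([[0, 1, 2]])
new_verts, new_tris = subdivide_linear(verts, tris)
```

Since all new vertices lie on the original edges, the silhouette of the mesh is unchanged, which is exactly the geometric limitation described above.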
       The other option is non-linear interpolation. Its basic idea is that instead
of taking the midpoint of the edge connecting two vertices as the new vertex,
the midpoint of a curve passing through both of those vertices is taken. Unlike
linear interpolation, there is no single correct way to find the proper curve; the
method of locating it is based on hypotheses and experiments. Also, every
kind of non-linear interpolation requires normal vectors, whereas linear
interpolation does not involve them.
       The first hypothesis was suggested by Dr. Georgios Stylianou. The
assumption is that the generated vertex and the two original vertices form an
equilateral triangle. In Figure 11, vertices P1 and P2 are the original vertices,
and P is the new vertex.




          Figure 11: Demonstration of Dr. Georgios Stylianou‟s idea

       To calculate the position of P, the normal vectors of P1 and P2 must be
available; call them N1 and N2 respectively, and let a be the length of edge
P1P2. Since triangle PP1P2 is equilateral:
       h = a · sin(π/3), and
       P = (P1 + P2) / 2 + h · (N1 + N2) / 2.
       The normal vector of P is the average of N1 and N2.
       There is also a variation of this assumption, in which the generated
vertex is supposed to be not P, the apex of the equilateral triangle, but the
midpoint of the Bézier curve defined by the triangle. Referring to Figure 11
again, the point P′ is the assumed new vertex. To calculate the position of P′,
the following formula is evaluated:
       P′ = b(t) = (1 − t)² P1 + 2t(1 − t) P + t² P2,   with t = 1/2.
       Thus, P′ = (P1 + P2 + 2P) / 4. Its normal vector is the same as that of P.
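Both variants of this assumption can be sketched as follows. This is a Python illustration of my reading of the formulas above; averaging the two normals with a factor of 1/2 (rather than normalizing their sum) is an assumption of the sketch.

```python
import numpy as np

def raised_midpoint(p1, p2, n1, n2, use_bezier=False):
    """New vertex under the equilateral-triangle assumption.

    The midpoint of P1P2 is pushed outward along the averaged normals
    by h = a * sin(pi/3), the height of an equilateral triangle with
    side a = |P1 - P2|. With use_bezier=True, the Bezier-curve variant
    P' = (P1 + P2 + 2P) / 4 is returned instead.
    """
    p1, p2 = np.asarray(p1, float), np.asarray(p2, float)
    a = np.linalg.norm(p2 - p1)
    h = a * np.sin(np.pi / 3)
    p = (p1 + p2) / 2 + h * (np.asarray(n1, float) + np.asarray(n2, float)) / 2
    if use_bezier:
        return (p1 + p2 + 2 * p) / 4   # Bezier midpoint at t = 1/2
    return p
```

The Bézier variant pulls the new vertex halfway back toward the chord, which is why it produces a flatter surface than the raw apex P.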
       Certainly, the running time of this method is higher than that of linear
interpolation. The most significant problem, however, is that although the
equilateral assumption is reasonable for general use, experiments show that in
either variant the assumed height of the generated vertex is too large for a 3D
face model. Figure 12 shows the output of interpolation based on these two
assumptions. Thus, a new assumption had to be found.




        Without Bézier Curve                       With Bézier Curve

             Figure 12: Non-linear Interpolation: Assumption (1)

       While doing more experiments, I found that the closer the assumption
comes to the CANDIDE model itself, the better the result becomes. Based on
this observation, I decided to define the assumption based on the CANDIDE
model. In Figure 13, vertices A, B and C are used to derive my assumption.




            Figure 13: Assumption based on CANDIDE model (1)

       My hypothesis is the following: (1) any three vertices in the CANDIDE
model form a curve; (2) by choosing the three vertices properly, the curve
formed may reflect the shape implied by the CANDIDE model almost
perfectly; (3) based on the relationship among those three vertices, an
assumption can be obtained that supports high-quality non-linear interpolation.
       Let the angle ABC be 2α. Based on the definition of the dot product of
vectors:
       BA · BC = |BA| · |BC| · cos(2α), so cos(2α) = (BA · BC) / (|BA| · |BC|).
       The angle 2α is the assumption. Figure 14 shows how to use it.




             Figure 14: Assumption based on CANDIDE model (2)

       Again, P1 and P2 are the original vertices, P is the generated vertex, h is
the height, and s is the length of edge P1P2. P1, P and P2 correspond to A, B
and C in Figure 13, in that order. Thus, angle P1PP2 = angle ABC = 2α. Since
edge P1P = edge P2P, the height h bisects angle P1PP2, so each sub-angle
equals α. Thus:
       h = s · cot(α) / 2.
       By the half-angle identities, cot²(α) = 2 / (1 − cos 2α) − 1, and
s = |P1 − P2|, so:
       P = (P1 + P2) / 2 + (|P1 − P2| / 2) · √(2 / (1 − cos 2α) − 1) · (N1 + N2) / 2.
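This CANDIDE-based assumption might be implemented as follows. This is a Python sketch of my reading of the formulas above; cos 2α is taken as an input, measured from a suitable vertex triple of the model via the dot-product definition.

```python
import numpy as np

def cos_angle(a, b, c):
    """cos of the angle ABC, via the dot-product definition."""
    ba = np.asarray(a, float) - np.asarray(b, float)
    bc = np.asarray(c, float) - np.asarray(b, float)
    return np.dot(ba, bc) / (np.linalg.norm(ba) * np.linalg.norm(bc))

def candide_midpoint(p1, p2, n1, n2, cos_2a):
    """New vertex under the CANDIDE-derived assumption.

    The apex angle at the new vertex is the angle 2*alpha measured on
    the model itself, so the height over the midpoint of P1P2 is
    h = |P1 - P2| * cot(alpha) / 2, with
    cot(alpha)^2 = 2 / (1 - cos(2*alpha)) - 1 by the half-angle identity.
    """
    p1, p2 = np.asarray(p1, float), np.asarray(p2, float)
    s = np.linalg.norm(p1 - p2)
    h = s * np.sqrt(2.0 / (1.0 - cos_2a) - 1.0) / 2.0
    return (p1 + p2) / 2 + h * (np.asarray(n1, float) + np.asarray(n2, float)) / 2
```

As a sanity check, requesting an apex angle of 90° (cos 2α = 0) over a chord of length 2 with unit vertical normals should raise the midpoint by exactly 1, so the angle P1–P–P2 at the new vertex is indeed 90°.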

       Based on the results of testing, this assumption works much better than
all the previous methods. Figure 15 shows the result. For this reason, I decided
to use this assumption as the interpolation method.




             Figure 15: Non-linear Interpolation: Assumption (2)

       However, testing revealed one problem: this non-linear interpolation
runs much more slowly than linear interpolation. If users choose to construct a
model with very high resolution, the non-linear interpolation takes
unacceptably long to execute. Thus, I decided to mix linear and non-linear
interpolation. The strategy is the following: at lower resolutions, non-linear
interpolation is used in order to construct a relatively good geometric shape; at
higher resolutions, linear interpolation is used in order to improve the quality
of texture mapping.


4.11 Texture Mapping and Reconstruction
Once the wire frame of the 3D model is well constructed, the quality of the
geometric shape of the model can be ensured. The only element left is texture.
In this project, the texture of the 3D model relies on the two source images: by
extracting sample colours from them, the colour of every vertex of the 3D
model can be defined. Vertices at different positions take their colour from
different sources. Generally speaking, the colour of most central points is
defined by the frontal image, and the colour of most points near the boundary
is defined by the side image. In Figure 16, the colour of the red part is defined
by the frontal image, and that of the blue part by the side image.

         Figure 16: Red = Defined by Front; Blue = Defined by Side



       When a generated 3D model is shown on screen, the colour of pixels
that are not vertices is generated automatically by MATLAB, based on a
built-in interpolation algorithm.
       Once again, because the CANDIDE model uses a Cartesian coordinate
system while the origin of MATLAB's image coordinates is at the top left
corner, the y coordinates must be reversed to make sure every vertex is
assigned the right colour. The colour system used is Red-Green-Blue (RGB)
with 256 levels per channel.
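The colour-sampling step could be sketched as follows. This is a Python illustration; it assumes the projected coordinates already have the y axis flipped as described above, and the function name is hypothetical.

```python
import numpy as np

def sample_vertex_colours(image, xs, ys):
    """Assign each vertex the RGB colour of the pixel it projects to.

    image: (H, W, 3) uint8 array; xs, ys: projected image coordinates
    with the y axis already flipped (origin at top left).
    """
    h, w, _ = image.shape
    # Round to the nearest pixel and clamp to the image bounds.
    rows = np.clip(np.round(ys).astype(int), 0, h - 1)
    cols = np.clip(np.round(xs).astype(int), 0, w - 1)
    return image[rows, cols]

img = np.zeros((4, 4, 3), dtype=np.uint8)
img[0, 3] = [255, 0, 0]   # a red pixel at the top-right corner
colours = sample_vertex_colours(img, np.array([3.0]), np.array([0.0]))
```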
       After all these steps, the reconstructed 3D face model is complete. To
display it, the MATLAB function patch is recommended. Two of its arguments
are FaceColor and EdgeColor; when they are assigned 'interp' and 'none'
respectively, the quality of the output is maximized, based on the current
research.




5 Performance Evaluation




                  Figure 17: Constructed 3D Face Models (1)
Figure 17 shows two 3D face models constructed from real photos of two
people. The first column shows the frontal images, the second column the side
images, and the third column the constructed models. The results show that the
current system captures texture very well, the resolution of the constructed
models can be ensured, and basic geometric features are captured as well.
       In order to evaluate how well this software captures geometric features,
a second kind of experiment was performed. In these experiments, the source
images are no longer real photos but screenshots of existing 3D face models
taken from two angles, and the constructed models are compared with the
source models.
       Figures 18 and 19 show two 3D face models constructed following this
idea. In each figure, the first row shows the two screenshots used as input
images, and the second row compares the source model (left) with the
constructed model (right). As can be seen, the geometric features of the models
constructed by the software are not the same as those of the source models.
There are two main reasons for this. One is interpolation: the interpolation
algorithm is still not good enough for this application. The other is MATLAB:
the MATLAB function patch is used to display the constructed models, but
while rendering them on screen, MATLAB rescales their shape along the
z-axis, so the models shown are not exactly as they are supposed to be.




                  Figure 18: Constructed 3D Face Models (2)




                  Figure 19: Constructed 3D Face Models (3)
       In order to eliminate this confusion caused by MATLAB, the Iterative
Closest Point (ICP) algorithm is required for testing. This algorithm compares
the geometric shapes of two 3D models. Its basic idea is the following: first,
the two 3D models A and B are overlapped; then, for each point Ai of A, the
point Bj of B at the shortest distance from Ai is found, and Ai and Bj are
considered a pair; once every point of A is paired with a point of B, the
average pair distance is calculated; by rotating one of the two models, this
average distance may decrease; the procedure is repeated until the average
distance reaches its lowest value. If this final distance is small enough, the two
models can be considered geometrically close to each other; otherwise, they
are not.
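The pairing and averaging step of ICP can be sketched as follows. This is a Python illustration of the distance evaluation only; the iterative rotation update that a full ICP would apply between evaluations is omitted.

```python
import numpy as np

def icp_average_distance(a, b):
    """Pair every point of model A with its nearest point in model B and
    return the mean pair distance.

    A full ICP would estimate a rotation/translation from the pairs,
    apply it, and repeat until this value stops decreasing.
    a: (Na, 3) array; b: (Nb, 3) array.
    """
    # (Na, Nb) matrix of pairwise distances via broadcasting,
    # then the nearest neighbour per row of A.
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return d.min(axis=1).mean()

a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
b = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 1.0]])
dist = icp_average_distance(a, b)   # each point of A is exactly 1 away
```

The brute-force distance matrix is quadratic in the number of points; for full face meshes a spatial index (e.g. a k-d tree) would normally replace it.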
       Implementing and testing the ICP algorithm requires a relatively long
time, so I leave it as future work in order to keep to the schedule.




6 Conclusions and Future Work
6.1 Conclusions
Based on Chapter 5, the software can be summarized as follows so far:
1. Texture features can be captured accurately.
2. High resolution can be ensured (188,416 polygons at most).
3. Execution time is reasonably short (from less than 1 second to about 1
   minute, depending on the resolution).
4. The GUI is direct and relatively easy to use.
5. Basic geometric features are reconstructed, but not very accurately in detail.
   More testing is required.


6.2 Future Work
1. More testing based on ICP algorithm.
2. Improving the GUI to make it easier to use, especially for mouse operations.
3. Improving the algorithms used to construct 3D models in order to minimize
   execution time.
4. Finding an accurate way to display constructed 3D models instead of using
   the default display procedure provided by MATLAB.




References
1. Atick, J., Griffin, P., and Redlich, N. 1996. Statistical approach to shape
from shading: reconstruction of 3D face surfaces from single 2D images.
Neural Computation 8, 1321–1340.
2. Leung, W., Tseng, B., Shae, Z., Hendriks, F., and Chen, T. 2000. Realistic
video avatar. Multimedia and Expo, IEEE.
3. CANDIDE – a parameterized face. http://www.bk.isy.liu.se/candide/main.html.
Last accessed: Tue Sep 26 17:08:18 2000.




User Manual
1. When the software is initiated, all functions are disabled except loading
   source images and 3D models. Users should first click the File menu and
   choose the item Load Images… to load source images.
2. The item Load Images… opens two standard open-file dialogs, in which
   users select the images to be used for 3D face reconstruction.
3. After selecting the images to be used, users are asked to click the tip of the
   nose and the midpoint of the mouth in both the frontal and side images.
   Blue points indicate where the user has clicked.




4. After opening the images, most of the functions provided by the GUI
   become active.




5. GUI groups 1 and 2 are used to translate and scale feature points to fit
   faces. Group 1 changes the units of translation and scaling: for translation,
   users can indicate how many pixels count as one translation unit; for
   scaling, users can set the factor by which scaling is performed. Clicking the
   buttons in group 2 performs the translation and scaling; the effect is shown
   in Area 5.
6. GUI group 3 controls which feature points translation and scaling will
   affect. By default, the scope is global. By ticking the check boxes
   corresponding to the parts of a face, users can manipulate individual parts
   instead of the whole face. Feature points of the selected parts are shown in
   red in Area 5.
7. By clicking the button Adjust Points by Mouse in Area 4, users can
   manipulate individual feature points: first click a feature point to select it
   (the selected point turns red), then click the destination position, and the
   point moves there.


8. By selecting the radio buttons in Area 6, users can switch between editing
   the frontal and side images.
9. While editing side images, Z Translation and Z Scaling in Groups 1 and 2
   are active, and Area 5 shows the side image and the side feature points.




10. GUI group 7 controls the quality of constructed models. More polygons
   mean higher quality, but also a longer construction time.
11. After fitting all feature points to the face, a 3D face model is constructed
   and displayed by clicking the button Construct 3D Face in Area 8.
12. By clicking the Rotate 3D button, users can rotate the constructed 3D face
   model.
13. Users can save a constructed face model by clicking the File menu and
   selecting the item Save 3D Model As… .




Software CD





				