
Acknowledgements

I would like to express my gratitude to all the people who helped me to complete this project. First of all, I want to thank my project supervisor, Dr. Andreas Lanitis, who gave me many suggestions during my work, on both general issues and specific problems. I am also glad to thank Dr. Georgios Stylianou, who gave me much advice on graphics-related techniques. Finally, I thank Dr. Andreas Grondoudis for attending my presentation and helping me to improve this project. Once again, thank you all!

Jiang Shan

Abstract

This thesis describes a system that reconstructs a 3D model of a face from two orthogonal images. The distinguishing feature of this system is that it is totally independent of any database: all information required to reconstruct the 3D face model comes from the input images. It is also relatively easy to use, so that users who are not proficient in computer graphics and image processing can learn to use it in a short time. This thesis describes the background of 3D face reconstruction techniques, the CANDIDE model which is used as the base of this system, and the techniques used in this software. A performance test is provided as well.

Table of Contents

Acknowledgements
Abstract
Table of Contents
1 Introduction
  1.1 Overview
  1.2 Aims and Objectives
  1.3 Feasibility
  1.4 Structure of the Report
2 Literature Review
  2.1 What Is 3D Face Reconstruction?
  2.2 Overview of Different Methods
    2.2.1 Shape from Shading Methods
    2.2.2 Shape from Motion Methods
    2.2.3 Stereo Methods
    2.2.4 Geometry-Based Methods
3 Introduction on the CANDIDE Model
  3.1 Overview of the CANDIDE Model
  3.2 The CANDIDE Versions
  3.3 Description of CANDIDE v3.1.6
    3.3.1 The Reasons to choose CANDIDE v3.1.6
    3.3.2 The Content and Structure of CANDIDE v3.1.6
    3.3.3 Significant Shortages of v3.1.6 Affecting Design and Implementation
4 Design and Implementation of 3D Face Reconstruction
  4.1 Software Structure
  4.2 Framework
  4.3 Load Images
  4.4 Load the CANDIDE Model
  4.5 Calculate Normal Vectors
  4.6 Project Vertices on 2D Planes and Adjust Coordinates
  4.7 Fit Face Images by Graphic Operation
  4.8 Global, Part and Vertex Based Operation
  4.9 Difference of Implementation between Frontal and Side Images
  4.10 Improve Quality by Interpolation
  4.11 Texture Mapping and Reconstruction
5 Performance Evaluation
6 Conclusions and Future Work
  6.1 Conclusions
  6.2 Future Work
References
User Manual
Software CD

1 Introduction

1.1 Overview

Nowadays, 3D techniques are developing rapidly, and many real applications involve techniques based on 3D images. Many people believe that the future belongs to 3D images rather than 2D ones. However, obtaining 3D images is still out of reach for most people. One of the reasons is that the equipment used to capture 3D images is relatively expensive.
Thus, using software to build 3D models from 2D information is considered a practical approach. This project is designed for this reason. By employing several techniques, it can reconstruct a 3D model of a face from 2D photos. The hardware requirements are relatively low, so the software can be installed on most modern computers.

1.2 Aims and Objectives

The objective of this project is to reconstruct a 3D face model from two orthogonal input images of a person's face, as Figure 1 shows. The reconstructed 3D model should include both the geometry and the texture features of the person. In terms of performance, high resolution and short execution time are desired. A Graphical User Interface (GUI) is required to make the system convenient to operate.

Figure 1: Objective of 3D Face Reconstruction (orthogonal images to 3D model)

1.3 Feasibility

In order to ensure a high level of performance and to meet the development schedule, the MATLAB development environment is chosen as the only tool for implementation. Because of the schedule restrictions and the limits of the MATLAB development environment, this software requires MATLAB as its runtime environment.

1.4 Structure of the Report

This report is composed of four main chapters. First, a literature review is provided. That chapter introduces the main ideas, advantages and disadvantages of different methods of 3D face reconstruction, with emphasis on the geometry-based approach. After that, the CANDIDE model, which is used as the base of 3D face reconstruction in this software, is introduced. The reasons for using the CANDIDE model and the ways to limit its shortcomings are also covered in that chapter. The next chapter describes the main ideas of the design and implementation of the system. The technical issues involved are explained, and solutions are provided. Then, the algorithm used to test the software is introduced. The test results and evaluation are provided at the end of that chapter.
Besides those four chapters, a chapter on conclusions and future work is provided at the end of this report. The references and the user manual are given in the appendix.

2 Literature Review

2.1 What Is 3D Face Reconstruction?

3D face reconstruction is a technology for reconstructing three-dimensional (3D) face geometry from media such as images and video. The purpose of this technology is to acquire 3D data without expensive scanning equipment, so that 3D data can easily be used in everyday applications. Based on the methodology employed, 3D face reconstruction methods can be classified into the following categories: shape from shading (SFS) methods, shape from motion methods, stereo methods and geometry-based methods.

2.2 Overview of Different Methods

2.2.1 Shape from Shading Methods

Shading is the variation in brightness from one point to another in the image. It carries information about shape because the amount of light a surface point reflects depends on its orientation relative to the incident light. Shape from shading (SFS) methods attempt to recover the 3D face by accurately computing the lighting conditions. SFS methods usually use a single face image as input, but in some cases they can also use multiple non-orthogonal images of a human subject. This flexibility is regarded as one of the most important advantages of SFS methods: they can be applied both when an application lacks source images and when it requires a high level of quality.

The main steps of a typical SFS-based reconstruction method are as follows. Given a novel face image, the position and orientation of the light source are first inferred. Using the shading information, the normal vector and the 3D point for each pixel of the face are then recovered. In the last step, the 3D face is reconstructed. Optimization can be performed after that. How the shading information is used is the key to SFS methods.
One approach is based on statistical information. Atick et al. [Atick et al. 1996] [1] proposed the first statistical shape from shading method for reconstructing a 3D face from a single image. First, they acquired a database of 200 scans of 3D faces, parameterized all faces in cylindrical coordinates and aligned them using 3D rigid transformations. The quality of the reconstruction is therefore directly determined by the content of the database, which is the shortcoming of this approach.

2.2.2 Shape from Motion Methods

Shape from motion techniques generate a 3D face model from the shapes of the face extracted from different frames of an image sequence. Leung et al. [Leung et al. 2000] [2] manually located 44 facial points on a face shown in the input image sequences. In a pre-processing step, they manually identified the same feature points on a generic 3D model, so that the features in the video and in the generic model are in correspondence. To compute the correspondence of the remaining points, they performed a cylindrical projection to map all the points of the generic 3D mesh to 2D, and then triangulated the feature points to help compute the inverse affine map for every non-feature point. Finally, they texture mapped the face onto the generic 3D model and morphed the generic model to better match the original. There are many improved methods based on this idea. However, most of them share the same disadvantages: they require much source information, and their operation is relatively complex.

2.2.3 Stereo Methods

Stereo-based face reconstruction is a technique used for 3D reconstruction in very diverse scenes. Stereo requires two off-the-shelf digital cameras which are connected together and calibrated so that they aim at the same object. The framework of stereo-based face reconstruction is as follows. Both cameras are used to take two snapshots of the face.
Then pixel correspondences are established between the two images to create the disparity map. The disparity map and the known relative distance between the two cameras are used to compute the depth map. The most important shortcoming of stereo-based methods is that the required equipment is not always available. Also, performance is often affected by environmental conditions.

2.2.4 Geometry-Based Methods

Geometry-based methods extract the geometric features (eyes, mouth, nose, etc.) of the face images and use them to reconstruct the 3D face. The advantage of this class of methods is that no 3D face database is required. The inputs of geometry-based methods are at least two face images which are orthogonal to each other. The geometry from all images representing the same face is extracted and then blended to produce the 3D face. The disadvantage, however, is that the quality of the reconstruction is usually limited by the quality of the source images.

The first step of all geometry-based methods is acquiring a frontal and a side photo of the face to be reconstructed. The second step is finding all the corresponding points in both photos. The frontal photo provides the x and y coordinates of each point of the face, and the side photo provides the y and z coordinates. As a result, the x, y and z coordinates of all points of the face are available. Finally, the reconstructed face is texture mapped using the blended texture from the orthogonal photos.

All geometry-based methods assume that faces have symmetrical shape and texture. Since only a frontal image and one side image are provided, information from the one side image is used to infer the appearance of the other side. When dealing with non-symmetrical faces, geometry-based methods fail to produce accurate results. Nevertheless, this shortcoming can be removed by processing a third image of the other side of the face.
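The coordinate merging described above can be illustrated with a minimal sketch. It is written in Python rather than the MATLAB used by this project, and everything in it is an assumption made for illustration: the function names, the `y_tolerance` threshold, the choice to average the two y estimates, and the mirroring across a vertical midline as a stand-in for the symmetry assumption.

```python
def merge_orthogonal_views(frontal_xy, side_zy, y_tolerance=1.0):
    """Combine frontal (x, y) and side (z, y) points into (x, y, z) points.

    Both lists must describe the same points in the same order.
    y_tolerance is a hypothetical limit on how far the two views may
    disagree about the y coordinate of one point.
    """
    if len(frontal_xy) != len(side_zy):
        raise ValueError("both views must contain the same number of points")
    points = []
    for (x, y_front), (z, y_side) in zip(frontal_xy, side_zy):
        if abs(y_front - y_side) > y_tolerance:
            raise ValueError("corresponding points disagree on y")
        # x comes from the frontal view, z from the side view;
        # the two y estimates are averaged (an assumption of this sketch).
        points.append((x, (y_front + y_side) / 2.0, z))
    return points


def mirror_across_midline(points, midline_x=0.0):
    """Reflect points across the plane x = midline_x (symmetry assumption)."""
    return [(2.0 * midline_x - x, y, z) for (x, y, z) in points]
```

For example, a point seen at (10, 20) in the frontal view and at (5, 20) in the side view becomes the 3D point (10, 20, 5), and mirroring it across x = 0 gives (-10, 20, 5).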
3 Introduction on the CANDIDE Model

3.1 Overview of the CANDIDE Model

CANDIDE is a parameterised face mask specifically developed for model-based coding of human faces. Although it has a low number of polygons (approximately 100), it captures the most important feature points of human faces. Thus, it offers a relatively high level of modelling quality and at the same time allows fast reconstruction with moderate computing power.

CANDIDE is controlled by global and local Action Units (AUs). The global ones correspond to rotations around three axes. The local Action Units control the mimics of the face so that different expressions can be obtained [3].

The concept of Action Units was first described about 30 years ago by the Swedish researcher Carl-Herman Hjortsjö in his book Man's Face and the Mimic Language (in Swedish). This work was later extended by Paul Ekman and Wallace V Friesen of the Department of Psychiatry at the University of California Medical Centre [3].

The CANDIDE model was created by Mikael Rydfalk at the Linköping Image Coding Group in 1987. This work was motivated by the first attempts to perform image compression through animation [3].

The CANDIDE model became known to a larger public through journal articles. It is publicly available and is now used by research groups around the world.

3.2 The CANDIDE Versions

The original CANDIDE, described in the report by M. Rydfalk, contained 75 vertices and 100 triangles and is demonstrated by the Java Demo. This version is rarely used because of its low modelling quality and lack of Action Units [3].

The most widespread version, the de facto standard CANDIDE model, is a slightly modified model with 79 vertices, 108 surfaces and 11 Action Units. This model was created by Mårten Strömberg while implementing the xproject package, and is here referred to as Candide-1 [3].
Later, Bill Welsh at British Telecom created another version with 160 vertices and 238 triangles covering the entire frontal head (including hair and teeth) and the shoulders. This version, known as Candide-2, is also included in the xproject package, but is delivered with only six Action Units [3].

A third version of CANDIDE has been derived from the original one. The main purpose of this further model is to simplify animation by MPEG-4 Facial Animation Parameters [3]. Therefore, about 20 vertices have been added, most of them corresponding to MPEG-4 feature points. This model is called Candide-3 and is included in the WinCandide package [3].

Figure 2 shows the wire frames of the different versions of CANDIDE.

Figure 2: Different versions of CANDIDE (Candide-1, Candide-2, Candide-3)

3.3 Description of CANDIDE v3.1.6

3.3.1 The Reasons to choose CANDIDE v3.1.6

For this project, CANDIDE v3.1.6 is chosen as the basic face modelling tool, for the following reasons. First of all, it is a relatively new version of the CANDIDE family. Many researchers are working on it, so it is easy to share experience and gather advice about it. Also, it contains a reasonable number of feature points and polygons, so that a model based on it can reach an acceptable level of quality while keeping the running time relatively low. Finally, because of its structure, it can easily be used by a MATLAB program.

3.3.2 The Content and Structure of CANDIDE v3.1.6

There are two files in the CANDIDE v3.1.6 package, and their contents are exactly the same. One is a text file, used as a description file. The other is an object file, which is the one that can actually be used. The structure of CANDIDE v3.1.6 is matrix based, and its content can be divided into six parts. The first part is a description of the modifications made in each version (from v3.1.1 to v3.1.6). The second part is a list of vertices. There are 113 vertices in v3.1.6, and for each point the x, y and z coordinates are given.
Thus, it is a 113-by-3 matrix. All coordinates are between -1.0 and 1.0. The third part is a list of faces, which are triangles. There are 184 faces in v3.1.6, and for each face three vertices are given. Thus, it is a 184-by-3 matrix. This part contains only vertex indices, so the list of vertices is needed in order to draw a face. The fourth part is a list of animation units. Sixty-five different animation units are provided; each of them lists the indices of the vertices to be moved and how to move them. The fifth part is a list of shape units, which tell users how to change the shape of each facial feature. The last part is about texture. For this project, only the second and third parts are useful.

3.3.3 Significant Shortages of v3.1.6 Affecting Design and Implementation

There are three significant shortcomings of v3.1.6 which affect design and implementation directly. First, the matrix indices in v3.1.6 begin with 0, whereas in MATLAB they begin with 1. This can lead to "out of index" errors in some conditions, so it must be taken into account during design and implementation. Second, v3.1.6 is a normal-free model, which means that it contains no information about the normal vectors of either faces or vertices. However, in an application such as this project, avoiding normal vectors is almost impossible. As a result, calculating and storing the normal vectors of faces and vertices is included in the task domain. The last shortcoming is that not all faces of v3.1.6 list their vertices in the same order, which makes calculating normal vectors difficult. This problem is discussed more specifically in the next chapter.
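The index mismatch described above can be illustrated with a short sketch. It is written in Python rather than MATLAB, and the function name `to_one_based` and the range check are assumptions made for illustration, not part of the CANDIDE package or of this project's code.

```python
def to_one_based(faces, vertex_count):
    """Convert 0-based vertex indices of triangular faces to 1-based ones.

    faces is a list of 3-tuples of 0-based vertex indices, as stored in the
    model file. The returned indices are suitable for a 1-based environment
    such as MATLAB.
    """
    converted = []
    for face in faces:
        if len(face) != 3:
            raise ValueError("each face must reference exactly three vertices")
        for index in face:
            # Catch bad indices early instead of failing later while drawing.
            if not 0 <= index < vertex_count:
                raise IndexError("vertex index %d is out of range" % index)
        converted.append(tuple(index + 1 for index in face))
    return converted
```

With four vertices, the faces (0, 1, 2) and (2, 1, 3) become (1, 2, 3) and (3, 2, 4). Checking the range before shifting is one way to surface the "out of index" condition at load time.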
4 Design and Implementation of 3D Face Reconstruction

4.1 Software Structure

In order to fulfil the aims of this software, and considering the features of MATLAB, the software consists of three major parts. The first part is the main program. Its major functions are: (1) offering a user-friendly GUI; (2) getting input from the users' operations; (3) showing output or reacting to the users' operations; (4) offering information related to the software; (5) storing data as global variables. Figure 3 is a screenshot of the GUI.

Figure 3: GUI Layout

The second part is a set of functions (procedures). Each function performs a specific task or a set of related tasks. Most of the work related to calculation, decision making and data manipulation is done in this part.

The last part is the CANDIDE model, which is obviously necessary for this software.

4.2 Framework

The whole procedure of 3D face reconstruction can be divided into the following steps:

Step1: Load Images. In this step, two images are loaded: one is a frontal image of a face, and the other is a side image of the same face.
Step2: Load CANDIDE Model. In this step, the CANDIDE model is loaded. All useful data are stored as global variables in MATLAB matrix format.
Step3: Project CANDIDE Model. In this step, the vertices of the CANDIDE model are projected onto both images as feature points.
Step4: Morph CANDIDE Model. In this step, the CANDIDE model is adjusted to fit the face more accurately by moving feature points.
Step5: Interpolation. In this step, interpolation is performed in order to improve the quality of the 3D model. Because new vertices are generated, the resolution of the 3D model becomes much higher than before.
Step6: Texture Mapping. In this step, each vertex is assigned a colour extracted from either the frontal image or the side image.
Step7: Reconstruct 3D Face.
In this step, a 3D face model is generated from all the information obtained in the above steps.
Step8: Export 3D Model. This is the last step of the whole procedure. Users can save the 3D model for other uses.

Figure 4 shows the relationships between the major steps and their outputs. In the remaining part of this chapter, the whole procedure is discussed step by step, and details are given for important specific issues.

Figure 4: Framework of 3D Face Reconstruction (orthogonal images, adjustment, CANDIDE model, projection, morph, feature extraction, interpolation, improved model, texture mapping, 3D model)

4.3 Load Images

Unlike in pure image processing software, the images loaded in this project are not really opened, i.e. they are not modified in any way. In order to use the images, only their directory and file name are required. Thus, the MATLAB function uigetfile is used to load images. This function provides a standard GUI-based file selector and returns the directory of the file and the file name. The function also has a variant which allows users to select more than one file at once. However, this variant is not used in this project, in order to allow the two images to reside in different directories, so that users do not have to move a file in order to load both images at the same time.

The most challenging issue in this part is that in most cases the frontal and side images either differ in size or are zoomed at different ratios. In this project, most image-processing issues are left to users in order to keep the focus on the main purpose of the software. However, this problem can be very annoying, especially for large amounts of work. Thus, I decided to include a solution to this problem as part of the project. The solution is that, after selecting the frontal and side images, users are asked to define two points on each image.
Those points should correspond pair by pair. Figure 5 is an example, in which the blue points are defined by the user.

Figure 5: Adjusting Frontal and Side Images

The idea is the following. Let the points in the frontal image be P1(x1, y1) and P2(x2, y2), and the points in the side image be P3(x3, y3) and P4(x4, y4). The origin of coordinates in MATLAB is at the top left. Write y1 = a and y3 = c, so that y2 = a + b and y4 = c + d. Set r = b / d, which can be considered the zooming ratio between the frontal and the side images. Set n = a - c * r, which is the offset between the two images at the same zooming ratio. Let Yf be a y coordinate in the frontal image and Ys the corresponding y coordinate in the side image. Then, for all points: Ys = (Yf - n) / r, (r ≠ 0).

Besides adjusting the frontal and side images, this solution provides an extra benefit: two points of the face are located at the very beginning of the reconstruction procedure. For example, in Figure 5, the two user-defined points are the tip of the nose and the midpoint of the mouth, so these two feature points are located at the image loading stage. All the other feature points can then be moved based on the location of these two points, which means that all feature points are automatically placed relatively close to the face. This feature can reduce the users' work dramatically.

4.4 Load the CANDIDE Model

As mentioned before, the CANDIDE model is a text-based data collection. Thus, string-based operations are necessary to get information from a CANDIDE file. In order to ensure high efficiency, I decided to use a function provided by my project supervisor to complete this task. This function accepts a file name as input; in this case, the name of the CANDIDE file is passed. By performing string operations, this function reads vertices, connections, faces, vertices_texture_cords and normal, and returns them as output.
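Looking back at the image adjustment of section 4.3, the relation Ys = (Yf - n) / r can be checked with a small sketch. It is written in Python rather than MATLAB, and the function names are hypothetical; only the symbols a, b, c, d, r and n come from the text.

```python
def vertical_alignment(front_y1, front_y2, side_y1, side_y2):
    """Return (r, n) relating frontal y coordinates to side y coordinates.

    front_y1 and front_y2 are the y values of the two user-defined points in
    the frontal image; side_y1 and side_y2 are those of the corresponding
    points in the side image.
    """
    a, b = front_y1, front_y2 - front_y1
    c, d = side_y1, side_y2 - side_y1
    if b == 0 or d == 0:
        raise ValueError("the two reference points must differ in y")
    r = b / d      # zooming ratio between the frontal and the side image
    n = a - c * r  # offset between the images at the same zooming ratio
    return r, n


def frontal_to_side_y(y_frontal, r, n):
    """Map a frontal y coordinate to the side image: Ys = (Yf - n) / r."""
    return (y_frontal - n) / r
```

With reference points at y = 100 and y = 200 in the frontal image and y = 50 and y = 100 in the side image, r is 2 and n is 0. The two reference points map onto each other, and a frontal y of 150 maps to 75 in the side image.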
However, as mentioned in the previous chapter, the CANDIDE model does not contain any information about normal vectors, so the loading function returns only an empty matrix as normal.

The only problem here is that all indices of the CANDIDE model start with 0, but indices in MATLAB start with 1. The solution is to add 1 to all indices after loading the information.

4.5 Calculate Normal Vectors

The normal vector is an attribute of significant importance in most 3D applications. In this project, it is used at least for interpolation. Thus, calculating normal vectors correctly is an important part of the task domain.

In a 3D application, normal vectors can usually be divided into two categories: face normal vectors and vertex normal vectors. In this project, vertex normal vectors are more useful than face normal vectors. However, in order to calculate vertex normal vectors, face normal vectors must be calculated first.

Since all polygons can be divided into a finite number of triangles, this report uses only triangular faces to demonstrate all methods related to polygons. As Figure 6 shows, for a face such as face ABC, the normal vector is a vector perpendicular to it. The easiest way to obtain such a vector is to calculate the cross product of two non-parallel vectors in the face. In face ABC, vectors AB and AC are obviously not parallel to each other. Thus, both AB × AC and AC × AB can be used as the normal vector; the only difference between them is that they point in opposite directions. Because of this, the order of the vectors is very important when calculating face normal vectors. Whether clockwise or counter-clockwise order is used does not really matter in most cases, but consistency must be ensured.

The last step of calculating normal vectors is normalization, which makes the magnitude of a normal vector equal to 1. In Figure 6, vector N is a normal vector of face ABC, and N' is the normalized N. We have:

$N = \vec{AC} \times \vec{AB}$, and $N' = \dfrac{N}{\lVert N \rVert}$.

Figure 6: Calculating Normal Vectors

Because the CANDIDE model does not list the vertex indices of all faces in the same order, it is difficult to make sure that all face normal vectors point in the same direction, either towards the outside or towards the inside of the 3D model. Fortunately, this inconsistency is not random. The CANDIDE model was modified manually to fix this problem.

Once the normal vectors of all faces are known, vertex normal vectors can be obtained by averaging the normal vectors of the faces related to each vertex. For example, if vertex A is related to faces f1, f2 and f3, whose normal vectors are n1, n2 and n3 respectively, then the normal vector of A is N = (n1 + n2 + n3) / 3.

Now another problem appears: in the CANDIDE model, not all vertices are related to a face, which means that it is impossible to calculate normal vectors for such vertices (nor would it make sense to do so). To avoid any potential problems, I decided to leave them as zero vectors.

4.6 Project Vertices on 2D Planes and Adjust Coordinates

As mentioned in chapter 2, the basic idea of this project is a geometry-based method of 3D face reconstruction. As Figure 7 shows, this project requires two orthogonal images. From the frontal image, the x and y coordinates of points are obtained, and from the side image, the z and y coordinates are acquired. As a result, x, y and z coordinates are available for each point. Thus, reconstructing a 3D face model is possible.

Figure 7: Geometry-Based 3D Face Reconstruction

Obtaining coordinates from both images is therefore essential to this software. In order to generate a useful 3D model, a large number of points are required. Obtaining their coordinates manually would be both time consuming and unwise. This is one of the most important reasons for using the CANDIDE model.
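Before moving on to projection, the normal calculation of section 4.5 can be summarised in a minimal sketch. It is written in Python rather than MATLAB, uses 0-based vertex indices, and all function names are assumptions made for illustration. The face normal is taken as AC × AB, one of the two orders the text allows, and vertex normals are plain averages, with zero vectors for vertices that belong to no face.

```python
import math


def subtract(p, q):
    """Return the vector from q to p."""
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])


def cross(u, v):
    """Return the cross product u x v."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def normalize(v):
    """Scale v to unit length; a zero vector is returned unchanged."""
    length = math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)
    if length == 0.0:
        return (0.0, 0.0, 0.0)
    return (v[0] / length, v[1] / length, v[2] / length)


def face_normals(vertices, faces):
    """Return one unit normal per triangle, always computed as AC x AB."""
    normals = []
    for a, b, c in faces:
        ab = subtract(vertices[b], vertices[a])
        ac = subtract(vertices[c], vertices[a])
        normals.append(normalize(cross(ac, ab)))
    return normals


def vertex_normals(vertices, faces):
    """Average the normals of the faces that share each vertex."""
    sums = [[0.0, 0.0, 0.0] for _ in vertices]
    counts = [0] * len(vertices)
    for face, normal in zip(faces, face_normals(vertices, faces)):
        for index in face:
            for axis in range(3):
                sums[index][axis] += normal[axis]
            counts[index] += 1
    result = []
    for total, count in zip(sums, counts):
        if count == 0:
            # Vertex used by no face: keep a zero vector, as in the text.
            result.append((0.0, 0.0, 0.0))
        else:
            result.append(tuple(component / count for component in total))
    return result
```

Using the same vector order for every face gives consistent normal directions only if the faces themselves list their vertices consistently, which is why the inconsistent faces had to be corrected by hand.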
The idea is the following: by projecting the vertices of the CANDIDE model onto both images (in different ways, of course), feature points are generated. Using the information gathered from those feature points, more points can be added automatically.

As a model for general use, the CANDIDE model follows the most common mathematical convention: the centre of the model is at the origin of a rectangular coordinate system, and the coordinates range from -1 to 1 in each direction. By contrast, in MATLAB the origin is at the top left of an image, and only positive values occur in the image. Thus, converting vertices from CANDIDE format to MATLAB format is the most important job in projection. Again, I use a function provided by my project supervisor to perform this task. This function, however, only works with x and y coordinates, so I do the same thing for z coordinates as well. The basic idea of this function is the following: (1) get the width (w) and height (h) of the image that the vertices will be projected on; (2) divide both w and h by 2 to get w' and h'; (3) for all x and z coordinates: $x' = x \cdot w' + w'$; (4) for all y coordinates: $y' = h' - y \cdot h'$.

4.7 Fit Face Images by Graphic Operation

After projection, the next key job is moving the feature points to fit the face in both images. At this stage, graphic transformation operations are very useful. By performing translation, rotation and scaling with respect to all three axes, the CANDIDE model can be morphed to be as close to the face as possible. All those operations are matrix-based, so in order to satisfy the mathematical rules, the coordinates of all feature points are temporarily extended with an extra column into the following format: [x y z 1].
Figure 8 shows all the matrices involved in graphic transformation operations.

Figure 8: Graphic Transformation Matrices (translation matrix, scaling matrix, and rotation matrices around the X, Y and Z axes)

To perform those operations, it is enough to multiply the vector of coordinates by the corresponding matrix. However, for this project, rotation does not serve the aim much. This software accepts two orthogonal images of a face as inputs. This requirement implies that both images must show a face looking straight ahead, not a face that is shaking or nodding. Although this software could generate 3D models of askew faces by employing rotation, the models would not be as accurate as expected. In order to avoid any confusion, the rotation capability was abandoned in the end.

4.8 Global, Part and Vertex Based Operation

The previous section described the basic idea of fitting the CANDIDE model onto a face. Performing those graphic transformation operations on the whole model can be considered global operations. They are useful, but not perfect. The CANDIDE model is designed for general use, which means that it cannot fit every particular face merely by being morphed as a whole. Global operations are the first level of operations provided by this software.

The next level is operations on individual parts of a face. In this project, the CANDIDE model is divided into 12 parts based on the structure of human faces: outer hair line, inner hair line, left eyebrow, right eyebrow, left eye, right eye, nose, left cheek, right cheek, upper lip, lower lip and chin. By choosing a part, users can perform graphic operations on the chosen part only. Moreover, if necessary, users can choose more than one part as active parts and morph them together. This level of operations makes fitting a face much more flexible.
With part-level operations, the generated 3D model can be more accurate.

Although the second level of operations improves things considerably, it is not perfect either. In my experiments, in most cases some points still could not be fitted. To deal with those troublesome points, a third level, vertex-based operations, is added. At this level, users can move individual vertices one by one. The best way to implement this would be standard mouse events: dragging, selecting by rectangular frame, right-clicking, etc. However, in MATLAB, all those events rely on ActiveX controls. Thus, for a program without ActiveX controls, such as this one, those events cannot be used. (I have not studied every possible solution provided by MATLAB, so there may be a way to use them without ActiveX controls; so far I have not found one.) The solution I provide is selecting and moving only one vertex at a time. This makes the function a little tedious and not very user-friendly; despite this shortcoming, it works well.

4.9 Difference of Implementation between Frontal and Side Images

Another issue related to fitting faces is that frontal and side images differ in implementation. Code that works on frontal images cannot be used directly for side images, and vice versa. Figure 9 shows the difference between feature points in frontal and side images:

Figure 9: Feature Points in Frontal and Side Images Reconstruction (panels: Frontal image, Side image)

The most significant difference is that on frontal images the x and y coordinates are manipulated, whereas on side images the z and y coordinates are manipulated. This means that not every graphic operation has an effect on both kinds of image. On frontal images, operations on z coordinates have no visible effect. Similarly, on side images, operations on x coordinates produce no visible change.
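The effect can be illustrated with a small Python sketch. The function names are hypothetical and the sketch only shows which coordinates each view uses; it is not taken from the project's code.

```python
def visible_coords(vertex, view):
    """Return the two coordinates that a view actually displays."""
    x, y, z = vertex
    if view == "frontal":
        return (x, y)  # the frontal image shows x against y
    if view == "side":
        return (z, y)  # the side image shows z against y
    raise ValueError("view must be 'frontal' or 'side'")


def translate(vertex, dx=0.0, dy=0.0, dz=0.0):
    """Translate a vertex along any of the three axes."""
    x, y, z = vertex
    return (x + dx, y + dy, z + dz)


if __name__ == "__main__":
    v = (1.0, 2.0, 3.0)
    moved = translate(v, dz=0.5)
    # A z translation leaves the frontal view unchanged but moves the side view.
    print(visible_coords(v, "frontal") == visible_coords(moved, "frontal"))
    print(visible_coords(v, "side"), visible_coords(moved, "side"))
```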
The other difference is that the feature points shown on frontal and side images are different. On frontal images, most of the feature points can be shown. On side images, by contrast, because only one side of the face is visible, only points on the left half and the middle line of the face can be shown. The benefit of hiding points on the right half is that graphic operations can be done without confusion, especially vertex-based movement. The drawback is that the information gathered from feature points on the left half is mirrored to the right half, which makes the generated 3D model symmetric. This is also the most significant drawback of the geometry-based approach; a solution is left as future work.

During testing, showing all points on the left half and the middle line of the face on the side image was found to be too confusing because of the number of points. Thus, I reduced the number of visible feature points for side images: only a few feature points are drawn on side images, while the others are hidden but still present.

4.10 Improve Quality by Interpolation

As I mentioned in Chapter 3, there are only 113 vertices and 184 faces in the CANDIDE model. Obviously, so few vertices and faces cannot provide a high-quality 3D model. The solution is to use interpolation to improve the quality. There are many interpolation methods with different performance. In the remainder of this section, I describe all the interpolation methods I tested for this project.

The simplest method is linear interpolation. Figure 10 shows the basic idea of linear interpolation.

Figure 10: Linear Interpolation

For a face to be interpolated, the midpoints of all three edges are calculated. Those midpoints are treated as new vertices. By connecting those new vertices in pairs, the original face is divided into four new faces.
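The midpoint split described above can be sketched as follows. This is a minimal Python illustration, not the project's MATLAB code; it assumes faces are triples of vertex indices and shares each midpoint between the two faces that meet at an edge.

```python
def subdivide(vertices, faces):
    """Split every triangle into four by inserting edge midpoints.

    vertices: list of (x, y, z) tuples
    faces:    list of (i, j, k) index triples
    Returns the extended vertex list and the new face list.
    """
    vertices = [tuple(v) for v in vertices]
    midpoint_index = {}  # edge (low index, high index) -> index of its midpoint

    def midpoint(i, j):
        key = (min(i, j), max(i, j))
        if key not in midpoint_index:
            a, b = vertices[i], vertices[j]
            vertices.append(tuple((p + q) / 2.0 for p, q in zip(a, b)))
            midpoint_index[key] = len(vertices) - 1
        return midpoint_index[key]

    new_faces = []
    for a, b, c in faces:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        # Three corner triangles plus the central one.
        new_faces += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return vertices, new_faces


if __name__ == "__main__":
    # Two triangles sharing one edge: 4 vertices, 5 edges, 2 faces.
    verts = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]
    tris = [(0, 1, 2), (1, 3, 2)]
    new_verts, new_tris = subdivide(verts, tris)
    print(len(new_verts), len(new_tris))
```

Each pass multiplies the face count by four, so five passes over the 184 CANDIDE faces would give 184 * 4**5 = 188416 faces, which agrees with the maximum polygon count quoted in the conclusions.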
Each pass of this subdivision adds new vertices numbering about one and a half times the number of faces (one new vertex per edge; each face has three edges, and most edges are shared by two faces), and the number of faces becomes four times what it was. The advantages of linear interpolation are that it is easy to implement and fast to compute. The disadvantage is that, although it increases the number of vertices and faces, it only improves the colour quality, not the geometric quality. The geometric shape of the model remains the same as the original.

The other option is non-linear interpolation. The basic idea is that, instead of taking the midpoint of the edge connecting two vertices as the new vertex, the midpoint of a curve passing through both vertices is taken. Unlike linear interpolation, there is no single definitely right way to find the proper curve. The method of locating the curve is based on hypothesis and experiment. Also, all kinds of non-linear interpolation require normal vectors, whereas linear interpolation does not involve them.

The first hypothesis was suggested by Dr. Georgios Stylianou. The assumption is that the generated vertex and the two original vertices form an equilateral triangle. In Figure 11, vertices P1 and P2 are the original vertices, and P is the new vertex.

Figure 11: Demonstration of Dr. Georgios Stylianou's idea

To calculate the position of P, the normal vectors at P1 and P2 must be available. Let them be N1 and N2 respectively. Since triangle PP1P2 is equilateral, with a denoting its side length, we have:

$$h = a \cdot \sin\frac{\pi}{3}, \quad \text{and} \quad P = \frac{P_1 + P_2}{2} + h \cdot \frac{N_1 + N_2}{2}.$$

The normal vector of P is the average of N1 and N2.

There is also a variation of this assumption, in which the generated vertex is not P, a vertex of the equilateral triangle, but the midpoint of the Bézier curve defined by the triangle. Referring to Figure 11 again, the point P' is the assumed new vertex.
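Both variants of the equilateral-triangle assumption can be sketched in Python as follows. This is an illustration under stated assumptions, not the project's MATLAB code: a is taken to be the length of edge P1P2, the averaged normal is used without renormalisation, exactly as in the formula for P, and P' is obtained by evaluating the quadratic Bézier curve with control points P1, P, P2 at t = 1/2.

```python
import math


def equilateral_vertex(p1, p2, n1, n2):
    """P = (P1 + P2) / 2 + h * (N1 + N2) / 2, with h = a * sin(pi / 3)."""
    a = math.dist(p1, p2)          # assumed: a is the length of edge P1P2
    h = a * math.sin(math.pi / 3)  # height of an equilateral triangle of side a
    return tuple((u + v) / 2 + h * (nu + nv) / 2
                 for u, v, nu, nv in zip(p1, p2, n1, n2))


def bezier_midpoint(p1, p, p2):
    """Point at t = 1/2 on the quadratic Bezier curve with control points P1, P, P2."""
    t = 0.5
    return tuple((1 - t) ** 2 * a + 2 * t * (1 - t) * b + t ** 2 * c
                 for a, b, c in zip(p1, p, p2))


if __name__ == "__main__":
    p1, p2 = (0.0, 0.0, 0.0), (2.0, 0.0, 0.0)
    normal = (0.0, 1.0, 0.0)  # both vertex normals point along +y in this toy case
    p = equilateral_vertex(p1, p2, normal, normal)
    print(p)                           # apex of the equilateral triangle
    print(bezier_midpoint(p1, p, p2))  # half as far from the edge as P
```

In this toy case P' lies half as far from the edge as P, which follows from P' = (P1 + P2 + 2P) / 4.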
To calculate the position of P', evaluate the quadratic Bézier curve at its midpoint:

$$P' = b(t) = (1 - t)^2 P_1 + 2t(1 - t) P + t^2 P_2, \quad t = \frac{1}{2}.$$

Thus, P' = (P1 + P2 + 2P) / 4. The normal vector is the same as that of P.

The running time of this method is certainly higher than that of linear interpolation. However, the most significant problem is that, although it is a good general-purpose method, with either variant the generated vertex turns out to lie too high above the edge for a 3D face model. Figure 12 shows the output of interpolation based on these two assumptions. Thus, a new assumption has to be found.

Figure 12: Non-linear Interpolation: Assumption (1) (panels: Without Bézier Curve, With Bézier Curve)

While doing more experiments, I found that the closer the assumption is to the CANDIDE model itself, the better the result. Based on this observation, I decided to define the assumption from the CANDIDE model. In Figure 13, vertices A, B and C are used to obtain my assumption.

Figure 13: Assumption based on CANDIDE model (1)

My hypothesis is the following: (1) any three vertices in the CANDIDE model form a curve; (2) by choosing the three vertices properly, the curve formed may reflect the shape implied by the CANDIDE model almost perfectly; (3) from the relationship among those three vertices, an assumption can be obtained that provides high-quality non-linear interpolation.

Let the angle ABC be 2α. From the definition of the dot product of vectors:

$$\vec{BA} \cdot \vec{BC} = |\vec{BA}| \cdot |\vec{BC}| \cdot \cos 2\alpha \;\Rightarrow\; \cos 2\alpha = \frac{\vec{BA} \cdot \vec{BC}}{|\vec{BA}| \cdot |\vec{BC}|}.$$

The angle 2α is the assumption. Figure 14 shows how to use this assumption.

Figure 14: Assumption based on CANDIDE model (2)

Again, P1 and P2 are the original vertices, P is the generated vertex, h is the height, and s is the length of edge P1P2. P1, P and P2 correspond to A, B and C in Figure 13, in that order. Thus, angle P1PP2 = angle ABC = 2α. Since edge P1P = P2P, the height h bisects angle P1PP2, and each half angle is α. Thus:

$$h = \frac{s \cdot \operatorname{ctg}\alpha}{2}.$$

Since $\operatorname{ctg}^2\alpha = \frac{2}{1 - \cos 2\alpha} - 1$ and $s = |P_1P_2|$:

$$P = \frac{P_1 + P_2}{2} + \frac{|P_1P_2| \cdot \sqrt{\frac{2}{1 - \cos 2\alpha} - 1}}{2} \cdot \frac{N_1 + N_2}{2}.$$

Based on testing, this assumption works much better than all the previous methods. Figure 15 shows the result. Because of this benefit, I decided to use this assumption as the interpolation factor.

Figure 15: Non-linear Interpolation: Assumption (1)

However, testing revealed one problem. This non-linear interpolation is much slower than linear interpolation. If users choose to construct a model with very high resolution, it takes an unacceptably long time to execute. Thus, I decided to mix linear and non-linear interpolation. The strategy is the following: for lower resolutions, non-linear interpolation is used, in order to construct a relatively good geometric shape; for higher resolutions, linear interpolation is used, in order to improve the quality of texture mapping.

4.11 Texture Mapping and Reconstruction

Once the wire frame of the 3D model is well constructed, the quality of the model's geometric shape is ensured. The only element left is texture. In this project, the texture of the 3D model relies on the two source images. By sampling colours from those images, the colour of every vertex of the 3D model can be defined. The source image used depends on the position of the vertex. Generally speaking, the colour of most of the central points is defined by frontal images, and the colour of most of the points near the boundary is defined by side images. In Figure 16, the colour of the red part is defined by frontal images, and the blue part is defined by side images.

Figure 16: Red = Defined by Front; Blue = Defined by Side

When a generated 3D model is shown on screen, the colour of pixels that are not vertices is generated automatically by MATLAB using a built-in interpolation algorithm.
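The per-vertex colour lookup can be sketched as follows. This is a hypothetical Python illustration, not the project's MATLAB code. The thesis does not state the exact rule that decides which image a vertex takes its colour from, so the `frontal_limit` threshold on |x| is purely an assumption, as are the image layout (rows of RGB tuples) and the rounding to the nearest pixel. The vertical coordinate is reversed for the same top-left-origin reason as in section 4.6.

```python
def sample_colour(image, u, v):
    """Pick the pixel for coordinates u (horizontal) and v (vertical, up) in [-1, 1].

    image is a list of rows, each row a list of (r, g, b) tuples.
    """
    height, width = len(image), len(image[0])
    col = int(round((u + 1) / 2 * (width - 1)))
    row = int(round((1 - v) / 2 * (height - 1)))  # reversed: row 0 is the top
    return image[row][col]


def vertex_colour(vertex, frontal, side, frontal_limit=0.6):
    """Choose the source image for one vertex.

    frontal_limit is a hypothetical threshold: vertices with |x| at or below
    it are treated as central and coloured from the frontal image, the rest
    from the side image.
    """
    x, y, z = vertex
    if abs(x) <= frontal_limit:
        return sample_colour(frontal, x, y)  # frontal image shows x against y
    return sample_colour(side, z, y)         # side image shows z against y


if __name__ == "__main__":
    frontal = [[(1, 1, 1), (2, 2, 2), (3, 3, 3)],
               [(4, 4, 4), (5, 5, 5), (6, 6, 6)]]
    side = [[(7, 7, 7)]]
    print(vertex_colour((0.0, 1.0, 0.0), frontal, side))  # central vertex
    print(vertex_colour((0.9, 0.0, 0.3), frontal, side))  # boundary vertex
```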
Once again, because the CANDIDE model is based on the mathematical Cartesian coordinate system, whereas in MATLAB the origin of image coordinates is at the top left corner, y coordinates must be reversed to make sure all vertices are assigned the right colour. The colour system I am using now is Red-Green-Blue (RGB) 256 colour.

After all these steps, a 3D face model is complete. To display it, the MATLAB function patch is recommended. Two of the arguments of this function are FaceColor and EdgeColor. When they are set to 'interp' and 'none' respectively, the output quality is the best I have found so far.

5 Performance Evaluation

Figure 17: Constructed 3D Face Models (1)

Figure 17 shows two constructed 3D face models based on real photos of two persons. The first column shows the frontal images, the second column the side images, and the third column the constructed models. The results show that the current system captures texture very well. The resolution of the constructed models can be ensured. Basic geometric features are captured as well.

To evaluate how well this software captures geometric features, another kind of experiment was performed. For these experiments, the source images are no longer real photos but screenshots of existing 3D face models from two angles. The constructed models are compared with the source models. Figures 18 and 19 show two constructed 3D face models based on this idea. In each figure, the first row shows the two screenshots used as input images. The second row shows the comparison between source models and constructed models: the source models are on the left, and the constructed models are on the right. As we can see, the geometric features of the models constructed by the software are not the same as those of the source models.

There are two main reasons for this. One is interpolation.
The interpolation algorithm is still not good enough for this application. The other reason is MATLAB. To show the constructed models, the MATLAB function patch is used. However, when displaying on screen, MATLAB modifies the shapes of the models along the z-axis. Thus, the models shown on screen are not exactly what they are supposed to be.

Figure 18: Constructed 3D Face Models (2)

Figure 19: Constructed 3D Face Models (3)

To get rid of this confusion caused by MATLAB, the Iterative Closest Point (ICP) algorithm is required for testing. Basically, this algorithm is used to compare the geometric shapes of two 3D models. Its basic idea is the following: first, two 3D models A and B are overlapped; then, distances between points are calculated from one model, say A, to the other, B; a point Ai from A and a point Bj from B are considered a pair if the distance between them is the shortest from Ai to any point in B; each point from A is paired with a point from B, and the average distance is calculated; by rotating one of the two models, this average distance may be decreased; the algorithm is repeated until the average distance reaches its lowest value; if this distance is small enough, the two models can be considered close enough to each other in geometric features; otherwise, they are not.

The ICP algorithm requires a relatively long time to implement and test, so I leave it as future work in order to keep to the schedule.

6 Conclusions and Future Work

6.1 Conclusions

Based on Chapter 5, the software can be summarized so far as follows:

1. Texture features can be captured accurately.
2. High resolution can be ensured (188416 polygons at most).
3. Execution time is reasonably short (from less than 1 second to about 1 minute, depending on how high the resolution is).
4.
The GUI is direct and relatively easy to use.
5. Basic geometric features are reconstructed, but not very accurately in detail. More testing is required.

6.2 Future Work

1. More testing based on the ICP algorithm.
2. Improving the GUI to make it easier to use, especially for mouse operations.
3. Improving the algorithms used to construct 3D models in order to minimize execution time.
4. Finding an accurate way to display constructed 3D models instead of using the default display procedure provided by MATLAB.

References

1. Atick, J., Griffin, P., and Redlich, N. 1996. Statistical approach to shape from shading: Reconstruction of 3d face surfaces from single 2d images. Neural Computation 8, 1321-1340.
2. Leung, W., Tseng, B., Shae, Z., Hendriks, F., and Chen, T. 2000. Realistic video avatar. Multimedia and Expo. IEEE.
3. CANDIDE – a parameterized face. Last accessed: Tue Sep 26 17:08:18 2000. http://www.bk.isy.liu.se/candide/main.html

User Manual

1. When the software starts, all functions are disabled except loading source images and 3D models. Users should first click the File menu and choose the item Load Images… to load source images.
2. The item Load Images… opens two standard Windows open-file dialogs, in which users select the images to be used for 3D face reconstruction.
3. After selecting the images, users are asked to click the tip of the nose and the midpoint of the mouth in both frontal and side images. The blue points indicate where the user clicked.
4. After the images are opened, most of the functions provided by the GUI become active.
5. GUI groups 1 and 2 are used to translate and scale feature points to fit faces. Group 1 is used to change the units of translation and scaling. For translation, users can specify how many pixels count as one translation unit. For scaling, users can set the factor by which scaling is done. Clicking the buttons in group 2 performs translation and scaling. The effect is shown in Area 5.
6.
GUI group 3 controls which feature points translation and scaling affect. By default, the scope is global. By ticking the check boxes corresponding to the parts of a face, users can manipulate individual parts instead of the whole face. Feature points belonging to the selected parts are shown in red in Area 5.
7. By clicking the button Adjust Points by Mouse in Area 4, users can manipulate individual feature points. First click on a feature point to select it; the selected point turns red. Then click on the destination, and the point moves there.
8. By selecting the radio buttons in Area 6, users can switch between editing frontal and side images.
9. While editing side images, Z Translation and Z Scaling in Groups 1 and 2 are active, and Area 5 shows the side image and the side feature points.
10. GUI group 7 controls the quality of constructed models. More polygons mean higher quality, but also a longer construction time.
11. After fitting all feature points to the face, clicking the button Construct 3D Face in Area 8 constructs and displays a 3D face model.
12. By clicking the Rotate 3D button, users can rotate the constructed 3D face models.
13. Users can save constructed face models by clicking the File menu and selecting the item Save 3D Model As… .

Software CD
