External-Self-Calibration of a 3D time-of-flight camera in real environments



                    Frederic Garcia Becerro




           Academic supervisor: Prof. F. Meriaudeau
             Industrial supervisor: Dr. B. Mirbach
                 Industrial partner: IEE S.A.




              A Thesis Submitted for the Degree of
      MSc Erasmus Mundus in Vision and Robotics (VIBOT)
                           · 2008 ·
                                              Abstract

    IEE S.A. has developed a 3D camera based on the time-of-flight principle. A variety of applications
based on this technology are under development, such as the surveillance of buildings or traffic
crossings.
    In order to be able to transform a range image acquired by such a camera into Cartesian coordinates,
the internal and external camera parameters need to be known. While the internal parameters are specified
with high accuracy and verified in qualification and production tests by IEE, the external parameters
(the position and orientation of the camera with respect to an external reference system) must be
estimated, as they depend on the installation of the camera. To our knowledge, nobody has attempted
external self-calibration with this technology, much less under non-laboratory conditions where no
geometrically well-known pattern is used. In practice, that is, under non-laboratory conditions, the
calibration methods commonly used with 2D cameras are often not easy to handle, in particular because
they require the installation of a reference pattern.
    The goal of this thesis consists of estimating the external camera parameters from the raw data
supplied by the IEE 3D camera, taking advantage of its technology and of the information provided by
the range imaging array. In addition, no element other than the camera itself needs to be present in the scene.
    To my family and friends...




Contents

List of Figures                                                                                               v

List of Tables                                                                                               vi

Acknowledgements                                                                                             vii

1 Introduction                                                                                                1
   1.1   Camera calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .       1
         1.1.1   Internal camera parameters (intrinsic parameters) . . . . . . . . . . . . . . . .            1
         1.1.2   External camera parameters (extrinsic parameters) . . . . . . . . . . . . . . . .            3
         1.1.3   Homography between the model plane and its image projection . . . . . . . . .                4
         1.1.4   Constraints on the rotation matrix . . . . . . . . . . . . . . . . . . . . . . . . .         4
   1.2   3D camera technology vs 2D cameras . . . . . . . . . . . . . . . . . . . . . . . . . . . .           5
         1.2.1   Electro-Optical Distance Measurement . . . . . . . . . . . . . . . . . . . . . . .           5
         1.2.2   MLI Evaluation Tool Kit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .          7
   1.3   Context and motivations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .        9
         1.3.1   MSc Erasmus Mundus in Vision and Robotics (VIBOT) . . . . . . . . . . . . .                 10
         1.3.2   International Electronics and Engineering - IEE S.A. . . . . . . . . . . . . . . .          10
   1.4   Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .    11
   1.5   Thesis outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .    12

2 State-of-the-art in 3D camera calibration techniques                                                       13
   2.1   Classification of camera calibration methods . . . . . . . . . . . . . . . . . . . . . . . .         13
         2.1.1   Practical Range Camera Calibration . . . . . . . . . . . . . . . . . . . . . . . .          14
         2.1.2   3D camera based on the time-of-flight principle calibration . . . . . . . . . . .            14
         2.1.3   Calibration for increased accuracy of the range imaging camera Swissranger . .              15
   2.2   Conclusion    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   16

3 A new calibration approach based on 3D camera technology                                                   17
   3.1   Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .    17
   3.2   Basic approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .      18
         3.2.1   Range angle definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .       22


        3.2.2   Recovering the external camera parameters (position and orientation) . . . . .              22
        3.2.3   Evaluation of the method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .        23
  3.3   Generic approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .      24
        3.3.1   RANSAC based plane detection          . . . . . . . . . . . . . . . . . . . . . . . . . .   26
        3.3.2   Defining the world coordinate system {W } . . . . . . . . . . . . . . . . . . . .            26
  3.4   Improvements on the generic approach . . . . . . . . . . . . . . . . . . . . . . . . . . .          29
        3.4.1   RANSAC runtime improvements . . . . . . . . . . . . . . . . . . . . . . . . . .             30
        3.4.2   Method for enhancing the contrast in range images . . . . . . . . . . . . . . . .           30
  3.5   External camera parameters estimation from all the sequence . . . . . . . . . . . . . .             31
  3.6   Camera configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .         32

4 Experimental results                                                                                      34
  4.1   A real environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .      34
  4.2   Basic approach evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .       34
        4.2.1   Determining the accuracy and precision of the method . . . . . . . . . . . . . .            35
        4.2.2   Sensitivity to the noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .    36
  4.3   Generic approach evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .       37
        4.3.1   Discrimination of objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .       38
        4.3.2   Identification of the calibration plane . . . . . . . . . . . . . . . . . . . . . . . .      38

5 Conclusions and further work                                                                              40
  5.1   Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .     40
  5.2   Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .     41
  5.3   Further work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .      42
        5.3.1   Opto-mechanical Orientation Sensor . . . . . . . . . . . . . . . . . . . . . . . .          42

A External-self-calibration toolbox                                                                         44

B Motorised Goniometric Test Bench                                                                          46
  B.1 Test Bench Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .         46
  B.2 Implementation of the controller driver in Matlab . . . . . . . . . . . . . . . . . . . . .           47
        B.2.1 Paths settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .      48
        B.2.2 ISEL Configuration         . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   48
        B.2.3 Robot Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .         49
        B.2.4 Error reporting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .       49

Bibliography                                                                                                49




List of Figures

 1.1   Internal camera parameters to transform from the camera to the image coordinate system.            2
 1.2   External camera parameters to transform from the world to the camera coordinate
       system. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .    3
 1.3   Time-of-flight distance measurement ways. . . . . . . . . . . . . . . . . . . . . . . . . .         6
 1.4   Phase shift distance measurement principle. . . . . . . . . . . . . . . . . . . . . . . . .        6
 1.5   Sampling process and demodulation depth. . . . . . . . . . . . . . . . . . . . . . . . .           7
 1.6   Range imaging principle. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .     8
 1.7   Lateral section of a RIM sensor from PMDTec. . . . . . . . . . . . . . . . . . . . . . .           8
 1.8   MLI Evaluation Tool Kit developed by IEE. . . . . . . . . . . . . . . . . . . . . . . . .          9

 2.1   Left: Scene captured by the 3D camera. Right: Scene captured by the 2D camera.
       This pattern is used for calibration and the difference between them is noticeable. . .            15

 3.1   Unit vectors representation for each pixel of the sensor. . . . . . . . . . . . . . . . . .       18
 3.2   Simplest camera setup. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .    19
 3.3   (a) structure of the setup, (b) rotations over x-axis (pitch angle) and y-axis (roll angle),
       (c) amplitude image acquired by IEE 3D camera and (d) digital level used for empirical
       measurements. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .     24
 3.4   Camera coordinate system definition. . . . . . . . . . . . . . . . . . . . . . . . . . . . .       27
 3.5   World coordinate system definition. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .      27
 3.6   Defined camera rotation over its x-axis. . . . . . . . . . . . . . . . . . . . . . . . . . .       28
 3.7   Identification of the calibration plane with reference to the camera coordinate system.            28
 3.8   First and second approach to consider several frames in the recorded sequence. . . . .            31
 3.9   Last approach to consider several frames in the recorded sequence. . . . . . . . . . . .          32
 3.10 Camera configuration. (a) default position, (b) camera rotated 180°, (c) camera rotated
        90° and (d) camera rotated 270°. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .     33
 3.11 External-Self-Calibration toolbox interface. . . . . . . . . . . . . . . . . . . . . . . . .       33

 4.1   A real environment setup. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .     35
 4.2   Accuracy versus precision. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .    36


5.1   Function of the SFH 7710. Orientation 1: Light path is blocked by ball. Orientation 2:
      Light reaches the detector. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   43

A.1 Implemented user interface in Matlab. . . . . . . . . . . . . . . . . . . . . . . . . . . .         45
A.2 Data allowed to be shown to the user. . . . . . . . . . . . . . . . . . . . . . . . . . . . .        45

B.1 Motorised Goniometric Test Bench 3D design. . . . . . . . . . . . . . . . . . . . . . . .           47
B.2 Motorised Goniometric Test Bench prototype. . . . . . . . . . . . . . . . . . . . . . . .           47
B.3 Motorised Goniometric Test Bench user interface: Path Settings tab. . . . . . . . . . .             48
B.4 Motorised Goniometric Test Bench user interface: ISEL Configuration tab. . . . . . .                 48
B.5 Motorised Goniometric Test Bench user interface: Robot Control tab. . . . . . . . . .               49
B.6 Motorised Goniometric Test Bench user interface: Error Reporting tab. . . . . . . . .               49




List of Tables

 1.1   MLI Evaluation Tool Kit camera specifications . . . . . . . . . . . . . . . . . . . . . .          9

 3.1   Results for the basic approach method evaluation . . . . . . . . . . . . . . . . . . . . .       24

 4.1   Arbitrary method execution on different 3D camera configurations . . . . . . . . . . .             36
 4.2   Sensitivity to the noise in a complete sequence (50 frames) . . . . . . . . . . . . . . . .      37
 4.3   Mean (x) and standard deviation (σ) in a sequence of frames . . . . . . . . . . . . . .          37
 4.4   Object discrimination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .    38
 4.5   Identification of the calibration plane . . . . . . . . . . . . . . . . . . . . . . . . . . . .   39




Acknowledgements

I have the satisfaction of being one of the students of the first VIBOT promotion. The person who
introduced me to this MSc was Dr. Quim Salvi: professor during my MSc degree in Computer Science,
colleague after some collaborations in the EIA1 department at the University of Girona and, finally,
the "culprit" who guided and encouraged me to take a key decision two years ago. Thanks to Quim I have
spent many months among new international colleagues, new friends, new professors and new cultures and,
in spite of being a guinea pig, after the amount of assessments, courseworks and presentations, what
remains in my mind are the special events, dinners and parties shared with "vibotians". I will never
forget these experiences, which have brought me to where I am today.
   Without any relation to the context of this thesis, but with a lot of relation to the knowledge I
have acquired during these two years, I would like to thank my special colleagues Luca Giancardo
(Gianca), or "the musician", and Josep Aulinas (Pep). Together we made a really good team to overcome
the handicaps imposed by professors who thought that we are "brilliant"; well, maybe you, Gianca and
Pep... but not in my case.
   I would like to thank all the collaborating universities, especially the University of Girona, with
which I have a "special" feeling due to my relationship with some of its professors, such as Dr. Xavier Cufí,
Dr. Rafael García or Dr. Jordi Freixenet, whom I consider more than professors, and also PhD students
and former PhD students such as Tudor Nicoseivic and Dr. Carles Matabosch.
   What to say about the UdG Sport Center and its members... they are part of my family and, despite
my having left my collaboration with them, they still support me; indeed, they hope that one day, in the
future, I will come back. Thanks to Oscar Ramiro (we still have plenty of things pending, pollo!),
Anna Viñolas (the chief of the tribe), Josep Ma Gasull (the boss of the tribe), Jordi Buch (I love your
sense of humour), Jesús Téllez (he seems quiet and calm... you don't know him!), Adam Fuentes (no
comment needed), Christian Mata (remember that when I grow up I want to be like you), David Grau (what
could we say...) and Francisca Carballo, or Paqui (who always spreads well-being around her). I miss you all!
   Regarding the context of this thesis project, I would like to express my gratitude to Dr. Fabrice
Meriaudeau and Dr. David Fofi, both VIBOT professors who helped us not only in class but also with our
personal difficulties, accommodation, grants, and more. Special thanks to Fabrice, who gave me the
opportunity to carry out my MSc thesis project in an industrial company, at IEE. You gave me this
opportunity and I will make the most of it!
   With reference to the accommodation, special thanks go to the "Foyer de jeunes travailleurs" where
  1 Website:   http://eia.udg.edu


Gianca, Pep and I lived during the third VIBOT semester. Special thanks to Adelaide (my "new" mum
during that period), Sora, Philip and Regis. I greatly enjoyed every dinner, party and special event
with you.
   More recently, three months ago, my MSc thesis project was assigned to be carried out at IEE, in
Luxembourg. As I mentioned before, it was Fabrice who encouraged me to do it. I seized this great
opportunity without thinking too much about what carrying out the thesis project in an industrial
company would entail. Alone, I arrived at a big company, jumping directly from the university, where
almost everything is abstract, to industry, where almost nothing is.
   I would like to express my special gratitude to Dr. Bruno Mirbach, who helped me from the first day
at IEE with every doubt and small problem I had. And, of course, thanks to the members of the advanced
engineering department: Frederic, Roberto, Thomas, Jean-Luc, Marta, Gilles, and all the IEE employees
who helped me during this period or simply shared a coffee with me; sometimes it is necessary!
   To sum up, I would like to dedicate this thesis, and my special gratitude, to all those people who
have been with me and supported my decisions.
   Now... I would like to dedicate a few words to those people to whom, unfortunately, we never say how
much we love them, even though no matter how much we said it, it would NEVER be enough. I want to thank
my parents for their unconditional support. They have been by my side not only during this period but
throughout my whole life, from the day I started, some years ago now, until today and, I would put my
hand in the fire, it will not end here.
   Thank you for being such wonderful parents; you know that I consider you not only my parents but
also my friends and that, however clever one may be, without you I am sure I would not have achieved
anything of what I have achieved. I recognise the effort you have made, and keep making, for us, and I
will be eternally grateful. Without any doubt, the hardest part of the Master's has been the distance;
I am very attached to my homeland, to you, to everyday life at home, and that is what I have missed the
most. I do not know when I will return, but I do know that you will be waiting for me with open arms.
THANK YOU!
   Of course, I must also acknowledge the effort, the support and everything my grandparents, Carmen
and Simón, have given me (don't worry, yayo, I will not tell anyone everything I know); my yaya Leonor,
whom nobody can ever find at home; and a person who was very special to me and whom I will always carry
with me: not just an engineer, yayo, I have gone much further, a kiss. I will leave it here, although I
could write several volumes expressing what I feel for you.
   I dedicate a paragraph to my little brother, Jose, whom we could thank for driving me up the wall on
more than one occasion, hahaha, but well, that is what being the older brother is about and, as I said,
I am sorry to be away from home and unable to help you as much as I would like; although I can attest
that you have found your feet and are getting ahead on your own merits. I am very proud of you and I
hope you keep going, making your own way.
   Finally, a few words for my bichín, the one to blame for my not knowing whether to buy a ticket to
Girona or to Santander. To me you are a very special person, with whom I want to share not only this
thesis but everything still to come. I love being by your side and, although for now they have been
lightning trips of two or three days, I wholeheartedly hope that in time they turn into a journey
without end.
   Many thanks to all the people who believe in me, who are happy for me and who have been, are and
will be by my side, especially my hombres de bar2: Gerard, Jordi, Pascual, Lluís, Iván, Sergi, Agus;
and I will stop here because the thesis is limited to a total of 60 pages. Let us hope the next one has
no limit.
   A heartfelt kiss.




  2 Website:   http://www.hombresdebar.com


Chapter 1

Introduction

Calibration and camera parameter estimation constitute the framework of this thesis project. This
chapter presents a general overview of the topics covered in the thesis as well as its motivation and
objectives. In addition, an outline of the thesis is given at the end of the chapter.



1.1     Camera calibration
It is well known that camera calibration is the first step in any computer vision application. The more
time invested in improving this step, the better the results obtained at the end.
   Roughly, a camera calibration process consists of establishing a set of parameters that relate the
data acquired by the camera to the scene in its field of view.
   Basically, two main steps define this process: the estimation of the intrinsic parameters, or
internal camera parameters, and the estimation of the extrinsic parameters, or external camera
parameters [12], [24], [15]. While the former define the relationship between the camera-centred
coordinate system and the image coordinate system, the latter define the relationship between a
scene-centred coordinate system and a camera-centred coordinate system.


1.1.1    Internal camera parameters (intrinsic parameters)
Internal camera parameters model the behaviour of the internal geometry and the optical characteristics
of the camera.
   Before entering into mathematical details, the notation and camera configuration used in this
thesis are presented. The camera coordinate system has its origin at the centre of projection {C}, its
z-axis along the optical axis, and its x and y axes parallel to the u and v axes of the image coordinate
system {I}, as shown in figure 1.1.
   A 2D point in the image is denoted by m = [u, v]T and a 3D point in the camera coordinate system by
M = [x, y, z]T. The augmented vectors obtained by adding 1 as the last element, m = [u, v, 1]T and
M = [x, y, z, 1]T, allow us to



[Figure 1.1 diagram: image coordinate system {I} with axes u, v and principal point (u0, v0); camera
coordinate system {C} with axes cx, cy, cz; world coordinate system {W} with axes wx, wy, wz.]

Figure 1.1: Internal camera parameters to transform from the camera to the image coordinate system.


represent the homography between the camera and image coordinate systems. Then, the relation between
M and m is given by:


                                                       sm = A · M                                         (1.1)

where s is the scale factor and A is the intrinsic camera matrix, given by:
                                                                              
                              A = \begin{bmatrix} f_x & \gamma & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix}                              (1.2)

   The intrinsic camera parameters, presented in (1.1) and (1.2), usually include the effective focal
lengths fx and fy, the scale factor s and the image centre (u0, v0), also called the principal point.
γ describes the skewness of the two image axes; it is often not estimated, which may not be a problem,
as most currently manufactured cameras do not have such centring imperfections [4].
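   As an illustration of (1.1) and (1.2), the following minimal sketch (not part of the original thesis;
the intrinsic values are made up and do not correspond to the IEE camera) projects a 3D point given in
camera coordinates onto the image plane:

```python
import numpy as np

# Hypothetical intrinsic values (illustrative only, not the real camera parameters).
fx, fy, u0, v0, gamma = 90.0, 90.0, 28.0, 30.0, 0.0

A = np.array([[fx, gamma, u0],
              [0.0,   fy, v0],
              [0.0,  0.0, 1.0]])

M = np.array([0.5, -0.2, 2.0])   # 3D point in camera coordinates (metres)

sm = A @ M                        # s*m = A*M, as in equation (1.1)
u, v = sm[:2] / sm[2]             # divide by the scale factor s (equal to z here)
print("pixel coordinates:", u, v)
```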
   Distortion parameters of the optics are also estimated in this step. Usually, the pinhole model is
taken as a basis and extended with some corrections for the systematically distorted image coordinates.
The most commonly used correction is for the radial lens distortion, which causes the actual image
point to be displaced radially in the image plane. Typically, one or two coefficients are enough to
compensate for the distortion. The centres of curvature of the lens surfaces are not always strictly
collinear; this introduces another common distortion type, decentering distortion, which has both
radial and tangential components [11].
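   The radial and decentering corrections mentioned above can be sketched as follows; the coefficient
names (k1, k2 for the radial terms and p1, p2 for the tangential terms) follow the common Brown
distortion convention, and the values are purely illustrative, not parameters of the IEE camera:

```python
import numpy as np

def distort(x, y, k1=-0.1, k2=0.01, p1=1e-3, p2=-1e-3):
    """Apply radial (k1, k2) and tangential (p1, p2) distortion to the
    normalised image coordinates (x, y) = (X/Z, Y/Z)."""
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2 * r2
    x_d = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x * x)
    y_d = y * radial + p1 * (r2 + 2 * y * y) + 2 * p2 * x * y
    return x_d, y_d

print(distort(0.25, -0.1))   # distorted normalised coordinates
```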
   In many cases, this necessary step of estimating the internal camera parameters is solved by
recovering the position and attitude of a calibration target in the camera coordinate system. For this
task a few toolboxes are available on the Internet, such as the Camera Calibration Toolbox for Matlab
from J. Y. Bouguet [4] or J. Heikkilä [10] and the Microsoft Easy Camera Calibration Tool from
Z. Zhang [23], to cite a few.
   Intrinsic parameter estimation is required for prototype cameras with unknown parameters. For an
industrialised camera system, such as the IEE 3D camera, these parameters are usually specified with
high accuracy and verified in qualification and production tests. Therefore, the goal here is to
recover the extrinsic parameters, under the compulsory condition of not using any geometrically known
target in the scene. For more information on intrinsic parameter estimation, please refer to the cited
references and the bibliography.


1.1.2        External camera parameters (extrinsic parameters)
External camera parameters model the position and orientation of the camera with respect to the
world coordinate system.
   The relationship between a 3D point wM and its camera projection cM is given by a transformation
which consists of a translation and a rotation (Figure 1.2), i.e. position and orientation,
respectively [12]. If wM are the coordinates of a point measured in the world coordinate system and
cM are the coordinates measured in the camera coordinate system, then:

                                              cM = R · wM + t                                         (1.3)

where R is a 3x3 rotation matrix and t is a translation vector, t = [tx, ty, tz]T.

[Figure 1.2 diagram: the image {I}, camera {C} and world {W} coordinate systems of Figure 1.1, with the
transformation [R t] relating the world coordinate system to the camera coordinate system.]

Figure 1.2: External camera parameters to transform from the world to the camera coordinate system.

   Another way to write (1.3), which allows subsequent transformations, is:

                                              cM = [R  t] · wM                                        (1.4)
Chapter 1: Introduction                                                                                4


   It must be noticed that this transformation has six degrees of freedom: three for rotation and three
for translation.
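   A minimal numerical sketch of (1.3), with a hypothetical rotation about the x-axis and a hypothetical
translation (values chosen only for illustration, not taken from the thesis):

```python
import numpy as np

def rot_x(pitch):
    """Rotation matrix about the x-axis by 'pitch' radians."""
    c, s = np.cos(pitch), np.sin(pitch)
    return np.array([[1, 0, 0],
                     [0, c, -s],
                     [0, s,  c]])

R = rot_x(np.deg2rad(30.0))      # hypothetical camera orientation
t = np.array([0.0, 0.0, 2.5])    # hypothetical camera position (metres)

wM = np.array([1.0, 2.0, 0.0])   # point in world coordinates (on the floor, Z = 0)
cM = R @ wM + t                   # equation (1.3)
print("camera coordinates:", cM)
```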


1.1.3     Homography between the model plane and its image projection
From (1.1) and (1.4), the relationship between a 3D point M in the world coordinate system and its
image projection m in the image plane coordinate system is given by:


                                          sm = A [R  t] M                                             (1.5)

   Without loss of generality, the world coordinate system can be any system convenient for the
particular design of the target. In the case of a planar calibration target, e.g. the floor, the z-axis is
chosen perpendicular to the plane, and Z = 0 in the target plane [12].
   Let’s denote the ith column of the rotation matrix R by ri . From (1.5):
                                                              
                                                            X                             
                          u                                                               X
                                                             Y 
                       s v  = A   r1       r2    r3    t    =A          r1    r2   t   Y      (1.6)
                                                                                          
                                                             0
                          1                                                                 1
                                                              
                                                              1

   By abuse of notation, M denotes a point on the model plane, but M = [X, Y]T since Z is always equal
to 0. In turn, the augmented vector is M = [X, Y, 1]T. Therefore, a model point M and its image m are
related by the homography H:

                              sm = H M ,   with   H = A [r1  r2  t]                                   (1.7)

   The 3x3 matrix H is defined up to a scale factor.
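   Following (1.7), the sketch below (reusing the hypothetical A, R and t from the previous sketches,
not values from the thesis) builds H from the first two columns of R and maps a model-plane point
(X, Y) to pixel coordinates:

```python
import numpy as np

# Hypothetical intrinsic matrix and pose (see the earlier sketches).
A = np.array([[90.0,  0.0, 28.0],
              [ 0.0, 90.0, 30.0],
              [ 0.0,  0.0,  1.0]])
R = np.array([[1.0, 0.0,   0.0],
              [0.0, 0.866, -0.5],
              [0.0, 0.5,   0.866]])
t = np.array([0.0, 0.0, 2.5])

H = A @ np.column_stack((R[:, 0], R[:, 1], t))   # H = A [r1 r2 t], equation (1.7)

M_plane = np.array([1.0, 2.0, 1.0])              # model-plane point [X, Y, 1]^T (Z = 0)
sm = H @ M_plane
print("image projection (u, v):", sm[:2] / sm[2])
```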


1.1.4     Constraints on the rotation matrix
Any rotation matrix is a 3x3 orthonormal matrix, hence it has six orthogonality constraints between
its row vectors (or columns). These constraints will be crucial in order to determine the camera
position and orientation during the computations. Each vector is mutually orthogonal with a norm
equal to 1 [1], [8].

                              \sum_{i=1}^{3} r_{ij} \cdot r_{ik} = \delta_{kj},   k, j = 1, 2, 3      (1.8)

where δkj denotes the Kronecker symbol (δkj is 1 if both indices are equal and 0 otherwise) [1].
   Another way to represent this orthogonality is:

                                              A^T · A = I                                             (1.9)

where A is a rotation matrix and I is the identity matrix.


    Hence, a rotation matrix is represented by nine parameters and six nonlinear constraints [1]. From
(1.9):

    • The inverse of a rotation matrix is equal to its transpose: A^{-1} = A^T.

    • The determinant of a rotation matrix is equal to 1: det(A) = 1.
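    These constraints can be checked numerically; a minimal sketch (not from the thesis) is:

```python
import numpy as np

def is_rotation(R, tol=1e-9):
    """Check the constraints of section 1.1.4: R^T R = I and det(R) = 1."""
    orthonormal = np.allclose(R.T @ R, np.eye(3), atol=tol)
    proper = np.isclose(np.linalg.det(R), 1.0, atol=tol)
    return orthonormal and proper

c, s = np.cos(0.4), np.sin(0.4)
R = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])   # rotation about the z-axis
print(is_rotation(R))        # True
print(is_rotation(2 * R))    # False: scaling breaks the orthonormality
```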


1.2      3D camera technology vs 2D cameras
3D cameras, also called Range Imaging (RIM) are a fusion of two different technologies. According
to the terminology, they integrate distance measurement as well as imaging aspects.
    While with 2D cameras it is necessary to use techniques such as triangulation [9] from two or
more cameras or a camera and a laser beam, both involving stereo vision techniques [6] or structured
light techniques [17]; with 3D cameras technology it is possible to acquire directly from the camera
capacities a cloud of 3D points of the scene. This cloud of points is referenced to the camera coordinate
system and nothing else than the 3D camera is necessary for this acquisition, as is presented in the
following section. Although this technology still being quite expensive, it starts to be more and more
the focus for many researchers.


1.2.1     Electro-Optical Distance Measurement
The distance measurement principle of range imaging cameras is based on the time-of-flight principle.
The time the light needs to travel from one point to another is directly proportional to the distance
the light has travelled:

                                              d = \frac{t}{2} \cdot c                                 (1.10)
where d corresponds to the distance between the sensor and the object, c is the speed of light and t is
the time between emitting and receiving the light [13], [14].
    A modulated radiation (e.g. light) is emitted, reflected by an object and partially mapped onto
the receiver. Therefore, the sensor-object distance is half the distance travelled by the radiation.
    Mainly, two different ways of measuring the time-of-flight are known:

    • Pulse Runtime Method (Figure 1.3(a)).

    • Phase Shift Determination (Figure 1.3(b)).

    In the first case the runtime of a single pulse is directly measured. In order to reach a distance
accuracy of a few millimetres, the clock accuracy has to be as low as a few picoseconds; thus very
good clock circuits are indispensable.
    On the other hand, the phase shift measurement principle avoids high-precision clocks and uses a
more complex and integrative sensor design. The sensor used in this thesis project is based on this
principle. The emitted (incoherent) light is modulated in amplitude with a sinusoidal modulation




                 (a) Pulse Runtime.                                        (b) Phase Shift.


                       Figure 1.3: Time-of-flight distance measurement ways.


(Figure 1.4), whereas other methods include FMCW, pseudo-noise or polarisation modulation [13]. The
reflected, sinusoidally modulated light is demodulated by means of four sampling points that are
triggered relative to the emitted wave. From the four tap measurements c(τ0), c(τ1), c(τ2) and c(τ3),
the phase shift ϕ, the offset B and the amplitude A can be calculated:

                              B = \frac{c(\tau_0) + c(\tau_1) + c(\tau_2) + c(\tau_3)}{4}             (1.11)

                              A = \frac{\sqrt{(c(\tau_0) - c(\tau_2))^2 + (c(\tau_1) - c(\tau_3))^2}}{2}   (1.12)

                              \varphi = \arctan\frac{c(\tau_0) - c(\tau_2)}{c(\tau_1) - c(\tau_3)}    (1.13)
The distance between the sensor and the object can then be calculated as follows:

                              d = \frac{\lambda_{mod}}{2} \cdot \frac{\varphi}{2\pi}                  (1.14)
where d corresponds to that distance, λmod is the modulation wavelength and ϕ is the phase shift.
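    A small numerical sketch of (1.11)-(1.14) follows; the four tap values are made up, while the 20 MHz
modulation frequency matches Table 1.1 (np.arctan2 is used so that the quadrant of the phase in (1.13)
is resolved):

```python
import numpy as np

c0, c1, c2, c3 = 1.30, 1.05, 0.70, 0.95   # hypothetical four-tap samples c(tau_i)
f_mod = 20e6                               # modulation frequency (Hz), as in Table 1.1
c_light = 3e8                              # speed of light (m/s)

B = (c0 + c1 + c2 + c3) / 4.0                               # offset, eq. (1.11)
Amp = np.sqrt((c0 - c2) ** 2 + (c1 - c3) ** 2) / 2.0        # amplitude, eq. (1.12)
phi = np.arctan2(c0 - c2, c1 - c3)                          # phase shift, eq. (1.13)

lambda_mod = c_light / f_mod                                # modulation wavelength (15 m)
d = (lambda_mod / 2.0) * (phi / (2.0 * np.pi))              # distance, eq. (1.14)
print(B, Amp, phi, d)
```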




                       Figure 1.4: Phase shift distance measurement principle.


    As presented in [13], the distance accuracy limit, given by the nature of light, for the range
imaging camera investigated in that paper is around 5 to 10 mm. For the IEE camera it is around 7 mm,
but in practice this value is affected by environmental influences such as sunlight, noise or the
scene configuration.
    Because of light's quantum structure, the amplitude of light cannot be measured at a specific point
in time; thus a time window ∆t is necessary. Two opposing aspects define the size of this window: at a
larger ∆t the quantum structure of the light is dominated by the light statistics, but the larger ∆t
is, the lower the demodulation contrast. According to [13], the sampling process is usually done with
∆t at about half the modulation wavelength, which results in a demodulation depth of about 64%
(Figure 1.5).




                         Figure 1.5: Sampling process and demodulation depth.



1.2.2    MLI Evaluation Tool Kit
The distance measurement principle outlined in section 1.2.1 is primarily able to measure distances in
one direction at a time. Implemented in a scanning device (laser scanner, geodetic total station), a 3D
capture of the environment is possible, but the serial mode of operation still remains. Range imaging,
and in particular the 3D camera developed by IEE, called the MLI Evaluation Tool Kit (Modulated Light
Intensity Evaluation Toolkit), combines the distance measurement technology with the advantages of an
imaging array. Simplified, it enables each pixel to measure the distance to the corresponding object
point; it can be regarded as an array of range finders. Figure 1.6 presents the schematic principle of
range imaging. A single signal emitter sends out modulated light. This light is partially reflected by
an object back to the sensor and is mapped onto a custom-built solid-state sensor by means of optics
(not shown in the figure), and therefore onto the smart pixels, which are able to measure the distance.
The measured distances, in connection with the geometrical camera relationships, can afterwards be
used to compute the 3D coordinates, which represent a reconstruction of the imaged object.
    One possible realisation, similar but not equal to the one used in the MLI Evaluation Tool Kit




                                 Figure 1.6: Range imaging principle.


camera, is shown in figure 1.7. Four photogates allow a four-tap measurement, which means that the four
required intensities c(τi) are acquired.




                      Figure 1.7: Lateral section of a RIM sensor from PMDTec.

   The MLI Evaluation Tool Kit is presented in figure 1.8. The emitting system consists of 20 LEDs,
controlled simultaneously in order to reach a sufficient optical output power. The incoming radiation
is mapped onto the sensor array, which is produced in a specialised mixed CCD and CMOS process.
   Table 1.1 outlines some of the MLI Evaluation Tool Kit characteristics. Its sensor has a
non-ambiguity distance of 7 m and 56x61 active pixels. The large pixel size corresponds to the high
degree of integration of the electronics.




                            Figure 1.8: MLI Evaluation Tool Kit developed by IEE.


                           Table 1.1: MLI Evaluation Tool Kit camera specifications

                                        Number of sensor pixels        64x64
                                        Effective sensor pixels        56x61
                                        Pixel size                68 x 49 µm
                                        Mod. frequency                20 MHz
                                        Non-ambiguity distance           7 m
                                        Interface                    USB 2.0



      In order to cover a large signal dynamic range, the sensor can be run with different integration
times, which determine how long the intensities c(τi) are collected and thus integrated. For a longer
integration time, the electrons (including the photon-generated ones) are collected over a larger
number of cycles.
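      As a quick check of Table 1.1 (an illustrative calculation, not taken from the thesis text): with
a 20 MHz modulation frequency the modulation wavelength is λmod = c/fmod = 3·10^8 / 20·10^6 = 15 m, so
the non-ambiguity range, half the modulation wavelength, is about 7.5 m, consistent with the roughly
7 m quoted in the specification.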



1.3        Context and motivations
This thesis has been submitted for the Degree of MSc Erasmus Mundus in Vision and Robotics1
(VIBOT). In addition, this thesis is one of the first VIBOT promotion thesis projects, and with the
aim to introduce briefly this MSc, a small section has been dedicated to summarise its contents and
objectives, and also, to notice why VIBOT MSc is slightly different from the rest.
     As is mentioned in section 1.3.1 the last semester corresponds to the training stage. Training which
consists of making individual research on a specific subject applying the knowledge acquired during
the courses.
     Almost all the thesis projects are carried out in one of the three collaborative universities except
very few projects (hopefully more in next VIBOT promotions) which are oriented to solve a specific
project, related with the MSc contents, in a industrial company. Concretely, the present thesis has
been one of the selected to be achieved, in this case, at IEE S.A.2
    1 Website:   http://www.vibot.org
    2 Website:   http://www.iee.lu


1.3.1     MSc Erasmus Mundus in Vision and Robotics (VIBOT)
VIBOT MSc is a two-year Master Program in 3D Vision and Robotics accredited in 2006 by the
European Commission in the framework of the Erasmus Mundus program, a cooperation and mobility
program of the European Commission in the field of higher education in order to promote the European
Union as a centre of excellence in learning around the world. It is the only Erasmus Mundus Master
Program in 3D Vision and Robotics among the 103 Erasmus Mundus Master Programs accredited
since 2004 in all disciplines.
VIBOT Master students take courses at the three collaborating universities: the University of Burgundy3
in France, the University of Girona4 in Spain and Heriot-Watt University5 in Scotland. They spend one
semester at each of these three universities and the fourth semester in training. Every year more than
300 international applications are processed and only about 30 students with the best academic records
are selected. The admission of the brightest students from all over the world, some of whom were
already working in a company, makes it possible to create a highly educated, international and mobile
workforce for the European community.
   VIBOT Master courses are given by faculty and visiting professors who belong to well-known
research laboratories with a long-standing reputation for high-quality research. The courses start
from a comprehensive coverage of the prerequisites in the field of digital imaging (hardware and
software) and basic image processing algorithms, and end up with research level teaching of their
applications in the fields of robotics, medical imaging, and 3D vision systems. The close location of
research laboratories on campus allows the faculty to involve students at all stages of research and
offers them many opportunities to participate in state-of-the-art research work.
   Besides technical skills and knowledge, graduate students from the VIBOT Master Program acquire
language skills and a sense of mobility and broad mindedness that will be a real asset in their career.


1.3.2     International Electronics and Engineering - IEE S.A.
Founded in 1989, IEE has demonstrated outstanding innovation performance and an impressive growth
record. IEE is the global leader in automotive safety sensing systems for occupant detection and
classification, supplying electronic sensor products and applications support to major automotive
customers in Europe, North America and Asia. It has demonstrated significant competence in the
development, manufacturing and marketing of sensing applications.
   IEE is extending the reach of its sensing technologies to automotive comfort and convenience
applications as well as consumer and industrial applications.
   In less than 20 years IEE has grown from a small company into a manufacturer with one of the
occupant sensor industry's most extensive research and development programs, more than 1200 employees
worldwide and state-of-the-art production facilities. IEE designs and manufactures electronic sensor
products for use in automotive airbag systems with improved safety. The company's patented products
are also used in the control systems of medical and industrial equipment. IEE is
  3 Website: http://www.u-bourgogne.fr
  4 Website: http://www.udg.edu
  5 Website: http://www.hw.ac.uk


an R&D leader in occupant sensor technologies for airbag deployment control and one of the world's
largest suppliers of sensor systems for advanced restraint systems.
      With its unique 3D vision sensing solutions, the company is entering the safety and security,
process automation and medical markets.
      In 2001, IEE started the development of a novel system for occupant detection and classification
based on the 3D camera used for this thesis project. Nowadays, several projects under development take
advantage of this system and, consequently, of its camera technology. One of them is PeCo (People
Counting), a project under development to count the number of people, e.g. entering a building. This
MSc thesis will be useful for these projects, and also for future projects where the 3D camera is
needed, since it allows an easier installation by retrieving the external camera parameters by itself,
avoiding the current tedious calibration processes.



1.4       Objectives
The calibration process is the first step in any computer vision application. It makes it possible to
relate what comes from the camera to what is happening in reality.
      The more robust and accurate the calibration, the better the final results and, consequently,
the better the achievement of the thesis objectives.
      In spite of the variety of techniques related to range camera calibration, this thesis proposal
requires a change in the philosophy used up to the present day. In our setting there is no known
target, such as an 'X'-shaped target [16] or a chessboard pattern [22], that can be placed in the scene
to provide references for the calibration. The ambition of this thesis is to calibrate the camera
without the need to add any target to the scene. Independently of the scene configuration,
illumination, people or objects, the only constraint is that at least a few 3D points must come from a
planar surface, preferably the floor.
      Hence, the goal of this thesis is to develop a method for determining the external camera
parameters of the IEE MLI Evaluation Tool Kit camera which uses as a reference in the scene any planar
surface (e.g. the floor or a wall), being thus independent of additional reference boards. The thesis
aims at the development of theoretical concepts and algorithms for a calibration method, which requires
getting involved in camera modelling and the mathematical concept of homogeneous transformations. The
thesis comprises the analysis, validation and testing of the algorithms on real data, involving both
theoretical and practical challenges.
      In order to achieve the main objective of the thesis, a sequence of sub-objectives has to be met
so as to recover the extrinsic parameters in a self-calibrating manner. Hence, the steps followed are:

     • Get involved with 3D camera technology and the time-of-flight principle. Study how it works
       and which raw data is acquired from the scene.

     • Perform a state-of-the-art analysis on the current calibration methods for such cameras.

   • Perform a feasibility study to estimate whether the thesis project is viable.


   • Design, implement and test a basic approach in order to identify the calibration planar surface
      from the cloud of acquired points.

   • Propose a generic approach that solves some existing drawbacks, especially in the identification
      of planar surfaces.

   • Design a new calibration method by identifying and segmenting the planar target of the scene
      in order to compute the external camera parameters from its inliers, using homographies with
      specific matrix constraints (a minimal plane-detection sketch is given after this list).

   • Carry out experimental procedures and develop a prototype application to analyse the agreement
      between the obtained results and the empirical measurements.
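   As referenced in the list above, the following is a minimal RANSAC plane-fitting sketch, given only
as an illustration of the plane identification idea; it is not the implementation developed in this
thesis (that is described in Chapter 3), and the tolerance and iteration count are arbitrary:

```python
import numpy as np

def ransac_plane(points, n_iter=200, tol=0.02, seed=0):
    """Fit a plane n.x + d = 0 to a 3D point cloud with RANSAC.
    Returns the plane (n, d) with the largest inlier set and its inlier indices."""
    rng = np.random.default_rng(seed)
    best_inliers, best_plane = np.array([], dtype=int), None
    for _ in range(n_iter):
        # 1. Sample three distinct points and build a candidate plane.
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        if np.linalg.norm(n) < 1e-9:          # degenerate (collinear) sample
            continue
        n = n / np.linalg.norm(n)
        d = -n @ p0
        # 2. Count the points within 'tol' metres of the candidate plane.
        dist = np.abs(points @ n + d)
        inliers = np.flatnonzero(dist < tol)
        if len(inliers) > len(best_inliers):
            best_inliers, best_plane = inliers, (n, d)
    return best_plane, best_inliers

# Synthetic example: a noisy floor plane z = 0 plus random clutter points.
rng = np.random.default_rng(1)
floor = np.column_stack([rng.uniform(-2, 2, 500), rng.uniform(-2, 2, 500),
                         rng.normal(0, 0.005, 500)])
clutter = rng.uniform(-2, 2, (100, 3))
(plane_n, plane_d), inliers = ransac_plane(np.vstack([floor, clutter]))
print(plane_n, plane_d, len(inliers))
```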


1.5     Thesis outline
This thesis project is structured in 5 chapters, following a natural flow through the presented
objectives. Chapter 2 comprises a state-of-the-art review of calibration methods for 3D cameras based
on the time-of-flight principle. A variety of projects work with 3D cameras and require a calibration
process for their purpose; hence, a few different techniques have been analysed in order to be
presented and discussed.
   Chapter 3 describes a new calibration method for the IEE 3D camera. This method takes advantage
specifically of the 3D camera technology and the raw data it provides. The chapter is mainly divided
into two sections: the former deals with a basic approach where there is only one planar surface in
the scene, while the latter deals with a generic approach where the scene is a real environment.
   To make the calibration method even more generic, the camera may be installed in four different
ways. Four camera orientations are allowed (0°, 90°, 180° and 270°) in order to increase the
portability of the camera to other applications. This new challenge, which may seem easy to overcome,
is presented and discussed at the end of Chapter 3. The main problem that appears when the camera
orientation is modified is that what in the initial scene is the floor, hence the planar surface
identified to perform the calibration, becomes the ceiling after rotating the camera 180°. In the same
way, by rotating the camera 90° or 270°, what was used to perform the camera calibration in the
initial configuration becomes a wall in the new configuration.
   Chapter 4 presents exhaustive testing carried out to determine the accuracy and precision of the
method. The application has been run a considerable number of times over different scene
configurations (concerning real environments). In addition, different camera configurations have been
used during the acquisitions in order to obtain specific values of accuracy and precision, determined
by contrasting the estimated values with the empirically measured values. Furthermore, the interface
and a few implementation details are presented, basically because some parameters are configurable in
order to adapt the application to the setup configuration.
   Finally, Chapter 5 presents the conclusions of the thesis and the further work derived from the results.
Chapter 2

State-of-the-art in 3D camera
calibration techniques

Nowadays, 3D cameras are becoming more and more available. They promise to make the 3D inspection of
scenes easier, avoiding the practical issues resulting from 2D imaging techniques based on
triangulation or disparity estimation. In spite of their low resolution and frame rates, as well as
their low signal-to-noise ratio, many researchers are seriously considering this technology for current
purposes in computer vision, such as segmentation or the extraction of metric information.
   Camera calibration is a necessary step in 3D computer vision in order to extract metric information
from the acquired data. Much work has been done, and the state of the art is presented throughout this
chapter.


2.1        Classification of camera calibration methods
Starting in the photogrammetry community and more recently in computer vision, a large amount of work
can be found dealing with camera calibration [24]. Roughly, those techniques can be classified into
two categories:

   • Photogrammetric calibration. This category comprises techniques which perform the camera
      calibration by observing a calibration target whose geometry in 3D space is known with very high
      precision. Calibration can be done very efficiently, since the combined information given by the
      scene and the images is vast. The calibration target usually consists of two or three planes
      orthogonal to each other. Sometimes a plane undergoing a precisely known translation is also
      used. On the contrary, the downside is that an elaborate setup is required [15].

   • Self-calibration. Techniques in this category do not use a calibration target. Just by moving
      a camera in a static scene, the rigidity of the scene provides in general two constraints on the
      camera's internal parameters. Therefore, if the images are taken by the same camera with fixed
      internal parameters, correspondences between three images are sufficient to recover both the
      internal and external parameters. Although this approach is very flexible, it is not yet mature,
      mainly because of the many parameters to estimate. It must be noticed that it is not always
      possible to obtain reliable results. Obviously, the main advantage of these methods is that no
      special calibration target needs to be prepared; however, it has to be remembered that finding
      correspondences over many images is not a trivial task.

   The documentation available on the Internet related to fields such as 3D camera calibration, range
imaging camera calibration or time-of-flight camera calibration belongs to one of the two presented
categories, with adaptations to each approach.
   Moreover, there are other techniques which cannot be classified into the two previous categories,
for instance calibration from vanishing points of orthogonal directions or calibration from pure
rotation [24]. However, these techniques are not applicable to the approach of this thesis.
   In summary, despite the number of techniques developed for 2D camera technology, it is really
difficult to find any calibration method specific to 3D cameras; all the consulted bibliography is based
on an extrapolation of 2D camera calibration techniques, most of them taking advantage of calibration
toolboxes available on the Internet [4], [10], [23].
   To cite a few of them, the following subsections present the most characteristic papers in which 3D
cameras were used.


2.1.1     Practical Range Camera Calibration
Beraldin et al. [3] present in their paper a calibration procedure adapted to a range camera intended
for space applications. The range camera they used can measure objects from about 0.5m to 100m
with a field of view of 30x30 degrees. They propose a two-step calibration methodology, the first step
being a specific calibration for the close-range volume (from 0.5m to 1.5m). For this first
step they use an array of targets positioned at known locations in the field of view of the range camera
using a precise linear stage. A large number of targets are evenly spaced in that field of view because
this is the region of highest precision. In the second step, they propose to use a smaller number of
targets (the number of targets is affected by the number of parameters in the model) positioned at
distances greater than 1.5m with the help of an accurate electronic distance measuring device. Beyond
10m, these instruments have angular and distance measuring accuracies 10 times better than their
range imaging camera. Once the registered range and intensity images have been acquired for all
the targets, the edges of those targets are extracted to sub-pixel accuracy using a moment preserving
algorithm. The centre of each target is then determined by fitting an ellipse to the edge points. An
iterative nonlinear simultaneous least-squares adjustment method extracts the internal and external
parameters of the camera model.


2.1.2     Calibration of a 3D camera based on the time-of-flight principle
Every pixel in the image contains distance information as well as intensity. This is useful for measuring
the shape, size and location of objects in a scene, hence it is well suited to certain machine vision


applications. For example, the use of range data instead of intensity data can simplify the segmentation
of objects in an image, especially when the objects contain high contrast patterns or artifacts.
     The paper presented in this section [22] deals with segmentation, but due to the low resolution
of the 3D camera the authors resort to an auxiliary standard 2D camera. They acquire images of the
same scene from both cameras and, in order to calculate the mapping between the two images, a
calibration step has to be performed before the camera combination can be used for segmentation tasks.
     If the 3D camera’s image is scaled to 640x480, both images can be compared directly. There is an
offset caused by the different positions of the cameras, which can be corrected by a simple translation.
Both images are then centred around the same point, taking into account that the camera distortion
varies slightly depending on the distance to the scene.
   They use a calibration pattern with at least 5x5 characteristic points, distributed as evenly as
possible on the screen of both cameras (Figure 2.1).




Figure 2.1: Left: Scene captured by the 3D camera. Right: Scene captured by the 2D camera. This
pattern is used for calibration and the difference between them is noticeable.

     Any pattern (e.g. circles, squares, ellipses or line intersections) can be used as calibration template,
as long as it is distributed across the image of both cameras and the difference between the points is
visible in every part of both images.
     It must be noted that no pattern can be projected into the scene (using e.g. an overhead projector)
for the 3D camera, as it only senses light reflected from real objects, such as pigments on a sheet of
paper.
     Once both cameras are calibrated, the segmentation is performed in two main steps. A fast
segmentation is obtained from the 3D camera and used as a mask applied to the image acquired by
the 2D camera. After a pre-processing step using the SIOX engine (Simple Interactive Object
Extraction), they achieve real-time segmentation at 25 frames per second on a video of 640x480 pixels.


2.1.3      Calibration for increased accuracy of the range imaging camera Swissranger
After a brief introduction to the time-of-flight (TOF) concept, this paper [13] makes a characterisation
of the Swissranger SR-2 camera and presents the sensor calibration method followed in order to increase
the accuracy of the camera.
     The calibration method is again based on photogrammetric calibration. Due to the low resolution
of the sensor, the selection of a suitable test field is crucial. A large test field would not allow precise
measurement of its targets because of the small sensor format. Large rounded targets occupy too
many pixels; squared targets have no well-defined corners; retro-reflective targets generated noise
effects in the images. Moreover, the lighting of the room strongly influences the data acquisition.
For all these reasons, a planar test field whose targets are represented by NIR LEDs was built. These
active targets can be automatically extracted from the intensity image provided by the camera due
to their high intensity in the image. They are extracted by means of a threshold and a centroid
operator, and used as input measurements for the calibration procedure.
    The calibration network of the TOF camera included 35 well-distributed stations, in particular in
the depth direction, since a planar test field was used. In order to calibrate the distance measurement
accuracy of a single pixel, they apply different techniques such as using different integration times or
up to five targets with different reflectivities (white, light grey, middle grey, dark grey and black).


2.2     Conclusion
There is no available documentation on a calibration method specific to 3D cameras that takes
advantage of their capabilities. As has been presented, the available calibration methods relating the
world to the camera are based on an extrapolation of 2D camera techniques.
   All the consulted bibliography, as well as the techniques presented in this chapter, depends on a
well-known target in the scene; hence, these techniques belong to the photogrammetric calibration
category.
   From this state of the art, we are forced to devise a new method differing from those presented
here: a method where the target used for the calibration may be unknown as far as its dimensions are
concerned, and where the rigidity constraint of self-calibration cannot be exploited, since there is no
movement between the camera and the scene. Hence, a completely new method for 3D camera
calibration, concerning the camera position and orientation with reference to the world, will be
presented in this thesis project.
Chapter 3

A new calibration approach based
on 3D camera technology

After reviewing, in Chapter 2, the existing calibration techniques applied to 3D cameras, it was
concluded in section 2.2 that a new calibration method must be investigated in order to achieve the
initial objective proposed for this thesis. Range imaging cameras are a fusion of two different
technologies, integrating distance measurements as well as imaging aspects. This chapter presents a
new calibration method which takes advantage of the 3D camera technology and its capabilities.


3.1     Introduction
Starting from the premise that the intrinsic parameters of the camera are known, the most critical task
is estimating the camera position and orientation with reference to the world, which is the goal of this
thesis.
   The resolution of the MLI Evaluation Tool Kit camera is 64x64 pixels; however, not all the pixels
are used, since a few arrays of pixels are dedicated to electronic functionalities and checking. In
addition, one row of pixels is used to estimate the mean phase value (the value used as reference to
compute the distances from the scene to each pixel in the camera plane). Thus, the active area, or
region of interest, of the sensor is 56x61 pixels.
   The raw data acquired from the camera is saved in a *.c3b file (IEE file format) and interpreted
by the readC3B.m procedure, which returns:

   • header: header that contains the camera parameters.

   • distance: 56x61 matrix with the distance values for each pixel.

   • phase: 56x61 matrix with the phase values for each pixel.

   • amplitude: 56x61 matrix with the amplitude values for each pixel.

   • ITimg: 56x61 matrix with the integration time used in each pixel.




   • RefRow: 1x61 array with the reference line.

   • TimeStamp: time stamp provided by the microcontroller.

   • IT: Integration time values.

   • T: temperature inside the camera at which each frame was recorded.

   Therefore, with this information and with the previously determined intrinsic parameters, the
extrinsic parameters can be estimated.
   At this point, to calculate the metric coordinates from the range image, unit vectors accounting for
the optical distortion are used. More information about the computation of these unit vectors and how
to correct the optical distortion is available in the IEE patent [18].
   These unit vectors are returned by the create_unit_vectors.m procedure as three 56x61 matrices,
where each matrix corresponds to the x, y and z vector coordinates respectively.
   Figure 3.1 shows a graphical representation of these unit vectors. It must be noted that the
coordinates of each unit vector are referenced to the camera coordinate system.




                 Figure 3.1: Unit vectors representation for each pixel of the sensor.

   Hence, with the raw data acquired from the camera and these unit vectors it is possible to compute,
for each sensor pixel, the 3D coordinates of the corresponding scene point with reference to the camera
coordinate system. To do so, each unit vector is multiplied by the distance retrieved by the camera for
that pixel, as sketched below.
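   As an illustration, the following MATLAB sketch shows how this back-projection could be done from
the fields returned by readC3B.m and the unit vectors of create_unit_vectors.m; the exact function
signatures and field access are assumptions made for the example.

      % Minimal sketch (assumed interfaces): back-project the range image into 3D
      % camera coordinates by scaling each pixel's unit vector by its measured distance.
      data = readC3B('acquisition.c3b');                % assumed to return the fields listed above
      [ux, uy, uz] = create_unit_vectors(data.header);  % assumed to return three 56x61 matrices

      X = data.distance .* ux;                          % camera-frame X coordinates (56x61)
      Y = data.distance .* uy;
      Z = data.distance .* uz;

      cloud = [X(:), Y(:), Z(:)];                       % N x 3 cloud of points used in what follows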
   Once these 3D coordinates are computed, it is necessary to compute the transformation matrix
which relates the 3D points in the camera coordinate system to the world coordinate system. This
transformation matrix contains the extrinsic camera parameters and is thus the goal of this thesis.


3.2     Basic approach
This basic approach is characterised by the simplest scene configuration, where only one planar surface
in the scene is visible to the 3D camera. The presented method is able to determine the external
camera parameters knowing only the orientation of that planar surface.
   The idea for computing these parameters is based on the scheme shown in Figure 3.2.
   As depicted in figure 3.2, in this scene configuration only one planar surface, in this case the floor,
is considered as calibration target. Without loss of generality, it is assumed that this plane lies at
w Z = 0 with reference to the world coordinate system, so that the homography between the model
plane and its image projection presented in section 1.1.3 can be applied.




                                                                  c                                     Rotation (R) and
                                                                      uz                                translation (t)
                                                 c
                                                     ux
                                                                       {C}
                                                                  c
                                                                      uy
                                                                                                        w
                                                                                                            z
                                             d
                                                                                        w
                                                                                            x            {W }
                                                                                                                w
                                                                  Floor                                             y
                         c
                             [du   x   , du y , du z      ]   T



                                           Figure 3.2: Simplest camera setup.


     Analytically, from figure 3.2 and from the theory presented in section 1.1.3 the following
transformation can be derived:

                        {}^{w}M = \begin{bmatrix} R & t \end{bmatrix} \cdot {}^{c}M                                        (3.1)

where R (orthonormal rotation matrix) and t (translation vector) constitute the extrinsic transformation
matrix that makes it possible to transform 3D points known in the camera coordinate system into the
world coordinate system.
     The rotation matrix R is expressed using the Euler angles γ, β and α, which define a sequence of
three elementary rotations around the x, y and z-axis respectively. The rotations are performed
clockwise, first around the x-axis, then around the y-axis that has already been rotated once, and
finally around the z-axis that has been rotated twice during the previous stages [1], [11]:


                                                          [R] = [Rz ] · [Ry ] · [Rx ]                                                   (3.2)

where each rotation matrix [R_z], [R_y] and [R_x] is given by:

        [R_z] = \begin{bmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{bmatrix}                    (3.3)

        [R_y] = \begin{bmatrix} \cos\beta & 0 & \sin\beta \\ 0 & 1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{bmatrix}                        (3.4)

        [R_x] = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\gamma & -\sin\gamma \\ 0 & \sin\gamma & \cos\gamma \end{bmatrix}                    (3.5)


   By substituting the rotation matrices (3.3, 3.4, 3.5) in (3.2) the resulting matrix is:

        [R] = \begin{bmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} \cos\beta & 0 & \sin\beta \\ 0 & 1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{bmatrix} \cdot \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\gamma & -\sin\gamma \\ 0 & \sin\gamma & \cos\gamma \end{bmatrix}

        [R] = \begin{bmatrix} \cos\alpha\cos\beta & \cos\alpha\sin\beta\sin\gamma - \sin\alpha\cos\gamma & \cos\alpha\sin\beta\cos\gamma + \sin\alpha\sin\gamma \\ \sin\alpha\cos\beta & \sin\alpha\sin\beta\sin\gamma + \cos\alpha\cos\gamma & \sin\alpha\sin\beta\cos\gamma - \cos\alpha\sin\gamma \\ -\sin\beta & \cos\beta\sin\gamma & \cos\beta\cos\gamma \end{bmatrix}        (3.6)

   To simplify the notation, let a_{ij} denote the element in the i-th row and j-th column of the
rotation matrix R. From (3.1):

        \begin{bmatrix} {}^{w}X \\ {}^{w}Y \\ {}^{w}Z \\ 1 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} & t_x \\ a_{21} & a_{22} & a_{23} & t_y \\ a_{31} & a_{32} & a_{33} & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} {}^{c}X \\ {}^{c}Y \\ {}^{c}Z \\ 1 \end{bmatrix}        (3.7)

   By fixing the Z-coordinate of the plane to {}^{w}Z = 0:

        \begin{bmatrix} {}^{w}X \\ {}^{w}Y \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} & t_x \\ a_{21} & a_{22} & a_{23} & t_y \\ a_{31} & a_{32} & a_{33} & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} {}^{c}X \\ {}^{c}Y \\ {}^{c}Z \\ 1 \end{bmatrix}        (3.8)

   From (3.8) the equation of interest is the one resulting from the product between the third row of
the transformation matrix (the last row of [R  t]) and the point {}^{c}M, which is:

                        0 = a_{31}\,{}^{c}X + a_{32}\,{}^{c}Y + a_{33}\,{}^{c}Z + t_z        (3.9)

   Equation (3.9) contains four unknown variables, which implies that at least four points (n = 4) of
the calibration target (the floor in this case) are needed in order to solve the system:

                        \begin{cases} 0 = a_{31}\,{}^{c}X_1 + a_{32}\,{}^{c}Y_1 + a_{33}\,{}^{c}Z_1 + t_z \\ \quad\vdots \\ 0 = a_{31}\,{}^{c}X_n + a_{32}\,{}^{c}Y_n + a_{33}\,{}^{c}Z_n + t_z \end{cases}        (3.10)


   The more points of the calibration object that are used, the more precise the values of the unknown
variables [a_{31}, a_{32}, a_{33}, t_z] will be. Thus, a system of n equations has to be solved in order
to obtain these unknown variables.
   One way to solve the system in (3.10) efficiently is by singular value decomposition (SVD), but first it
is necessary to write the system in matrix form:

                        \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix} = \begin{bmatrix} {}^{c}X_1 & {}^{c}Y_1 & {}^{c}Z_1 & 1 \\ \vdots & \vdots & \vdots & \vdots \\ {}^{c}X_n & {}^{c}Y_n & {}^{c}Z_n & 1 \end{bmatrix} \cdot \begin{bmatrix} a_{31} \\ a_{32} \\ a_{33} \\ t_z \end{bmatrix}        (3.11)

      After the SVD computation, the right singular vector v corresponding to the smallest singular
value gives the least-squares approximation to the solution [8].
   However, the solution given by the SVD satisfies the system only up to a scale factor, called λ, and
is therefore generally not yet the physically valid one. The valid and unique solution is determined by:

                        a_{31} = \lambda v_1, \quad a_{32} = \lambda v_2, \quad a_{33} = \lambda v_3, \quad t_z = \lambda v_4        (3.12)

      Fitzgibbon et al. [7] presented a way to solve a homogeneous system while avoiding the trivial
solution a_i = 0 for the unknown variables. They propose to constrain the parameters to be estimated:
the aim of their paper is to fit an ellipse to a set of points, and the constraint they impose is the
equality 4ac − b^2 = 1.
      Our goal is not to fit the points to an ellipse, but conceptually what is needed is the same, namely
solving a homogeneous system, which Fitzgibbon et al. achieve by placing a quadratic constraint on
the unknown parameters. In [7] it is shown that, if a quadratic constraint is imposed on the
parameters, the minimisation can be solved as a rank-deficient generalised eigenvalue problem.
      Following the same idea, it is necessary to impose a quadratic constraint on the unknown variables
[a_{31}, a_{32}, a_{33}, t_z]. Since a_{31}, a_{32} and a_{33} form the last row of a rotation matrix,
the orthonormality constraints presented in section 1.1.4, equation (1.9), give:

                \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} \cdot \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}^{T} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}        (3.13)

      Therefore, this matrix product yields the following system:

                        \begin{cases} a_{11}^2 + a_{12}^2 + a_{13}^2 = 1 \\ a_{21}^2 + a_{22}^2 + a_{23}^2 = 1 \\ a_{31}^2 + a_{32}^2 + a_{33}^2 = 1 \end{cases}        (3.14)

where the last equation provides the quadratic constraint needed to fix the system of equations (3.11)
to one unique valid solution:

                        a_{31}^2 + a_{32}^2 + a_{33}^2 = 1        (3.15)


    To sum up, the SVD computation yields a solution to the system of equations (3.11) up to a scale
factor λ (3.12). The correct and unique solution is therefore fixed by applying the constraint presented
in (3.15) as follows:

                        (\lambda v_1)^2 + (\lambda v_2)^2 + (\lambda v_3)^2 = 1

                        \lambda^2 \left( v_1^2 + v_2^2 + v_3^2 \right) = 1        (3.16)

    Notice that (3.16) admits two possible solutions, but only the one with the positive sign is
considered:

                        \lambda = \pm \frac{1}{\sqrt{v_1^2 + v_2^2 + v_3^2}}        (3.17)
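   To make the procedure concrete, the following MATLAB sketch shows one way in which equations
(3.11)–(3.17) could be implemented; the variable names and the assumption that the floor points are
available as an n x 3 matrix P are illustrative only, not the actual implementation.

      % Minimal sketch: solve (3.11) by SVD and fix the scale with (3.15)-(3.17).
      % P is an n x 3 matrix of floor points in camera coordinates (assumed input).
      A = [P, ones(size(P, 1), 1)];                  % n x 4 design matrix [cX cY cZ 1]
      [~, ~, V] = svd(A);
      v = V(:, end);                                 % right singular vector of the smallest singular value

      lambda = 1 / sqrt(v(1)^2 + v(2)^2 + v(3)^2);   % positive solution of (3.17)
      a31 = lambda * v(1);
      a32 = lambda * v(2);
      a33 = lambda * v(3);
      tz  = lambda * v(4);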


3.2.1     Range angle definition
The two rotations estimated by this method are the rotation about the x-axis (γ, or pitch angle) and
the rotation about the y-axis (β, or roll angle). The rotation about the camera's optical axis or z-axis,
defined by α or the yaw angle, cannot be determined since no reference points are used in the scene.
Assuming that the camera is not rotated about its optical axis (z-axis), the α value can be fixed
to 0◦.
    Concerning the range of rotations, the pitch rotation is restricted to [0, π/2], since the camera must
always be looking towards the floor (more details are given in section 3.3.2), and the roll rotation is
restricted to [−π/4, π/4]. If a wider rotation is needed, another camera configuration must be
considered (Section 3.6).


3.2.2     Recovering the external camera parameters (position and orientation)
Once the λ value is determined from (3.17), the unknown variables [a_{31}, a_{32}, a_{33}, t_z] can be
obtained directly from (3.12).
    t_z is determined immediately, since t_z = λv_4, but in order to determine the angle values it is
necessary to solve the following system of equations derived from (3.6):

                        \begin{cases} a_{31} = -\sin\beta \\ a_{32} = \cos\beta\,\sin\gamma \\ a_{33} = \cos\beta\,\cos\gamma \end{cases}        (3.18)
where β can be determined directly as follows:

                        a_{31} = -\sin\beta \;\Rightarrow\; \sin\beta = -a_{31} \;\Rightarrow\; \beta = \arcsin(-a_{31})        (3.19)

    Notice that there are two β values for which −sin β takes the same value. This would be a problem,
but since the range of β is restricted in section 3.2.1, only one solution is possible. The γ angle can be
determined from the following system of equations:

                                                                   a32
                                     a32 = cos β sin γ → sin γ =  cos β
                                                                    a33
                                                                                                  (3.20)
                                     a33 = cos β cos γ → cos γ   = cos β

      There are several ways to solve the previous system. One of them is to divide both equations:

                        \tan\gamma = \frac{a_{32}}{a_{33}}        (3.21)
      where γ can be determined directly as follows:

                        \gamma = \arctan\!\left(\frac{a_{32}}{a_{33}}\right)        (3.22)
      In the implementation of the method it is recommended to use atan2 instead of arctan, so as not
to limit the γ value to the interval [−π/2, π/2]; the atan2 function selects the value according to the
signs of both arguments.
      With these computations, the β and γ angles, which correspond to the roll and pitch rotations
respectively, have been estimated. The remaining angle needed to complete the rotation matrix is the
yaw angle (α), which, as mentioned before, is fixed for convenience to 0◦.
      Regarding the translation vector t, it is fixed to [0, 0, t_z]^T, since the world coordinate system
is placed directly below the camera (there is no translation along the x and y-axes) and lies on the
model plane (Figure 3.5). Finally, from the determined values and the imposed constraints, the
external camera parameters are estimated, allowing the 3D points to be transformed from the camera
coordinate system to the world coordinate system, as depicted in (3.1).
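      Continuing the sketch given after (3.17), the angle recovery and the assembly of the extrinsic
transformation could look as follows in MATLAB; this is again an illustrative sketch under the same
assumptions, not the exact implementation.

      % Minimal sketch: recover the angles of (3.18)-(3.22) and assemble R and t.
      beta  = asin(-a31);                    % roll, rotation about the y-axis, equation (3.19)
      gamma = atan2(a32, a33);               % pitch, rotation about the x-axis; atan2 keeps the right quadrant
      alpha = 0;                             % yaw is not observable and is fixed to 0

      Rz = [cos(alpha) -sin(alpha) 0; sin(alpha) cos(alpha) 0; 0 0 1];
      Ry = [cos(beta) 0 sin(beta); 0 1 0; -sin(beta) 0 cos(beta)];
      Rx = [1 0 0; 0 cos(gamma) -sin(gamma); 0 sin(gamma) cos(gamma)];
      R  = Rz * Ry * Rx;                     % rotation matrix as in (3.2)
      t  = [0; 0; tz];                       % world origin directly below the camera

      % A camera-frame point cM (3x1 vector) is mapped to world coordinates by
      % wM = R * cM + t, i.e. equation (3.1) in non-homogeneous form.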


3.2.3      Evaluation of the method
In order to evaluate this new external self-calibration method, the setup depicted in figure 3.3 (a) has
been used to acquire raw data from the scene.
      Since this basic approach has been designed to work in scenes where only one plane is visible to
the camera, the camera has been oriented towards the floor (Figure 3.3 (c)) and no other object or
plane has been placed in the scene.
      A few acquisitions have been performed with different camera configurations, i.e. different pitch
and roll angles (Figure 3.3 (b)), in order to run the calibration process and determine the external
camera parameters.
      The results of this evaluation are shown in table 3.1. The estimated values come from a single
execution of the algorithm over the raw acquired data. It can be noticed that the estimated values
and the empirical measurements differ by between 0◦ and 1.5◦, which is quite good, considering that
there is a small external error in the empirical measurements coming from the digital level
(Figure 3.3 (d)) used for this purpose.
      It must be taken into account that the method takes several arbitrary points from the cloud of
points in order to make the SVD computation of (3.11) possible. Any point of the scene can be used since




Figure 3.3: (a) structure of the setup, (b) rotations over x-axis (pitch angle) and y-axis (roll angle), (c)
amplitude image acquired by IEE 3D camera and (d) digital level used for empirical measurements.


                       Table 3.1: Results for the basic approach method evaluation

                                      measured      20◦       30◦       40◦       0◦        30◦        50◦
                  Pitch (γ) angle
                                      estimated     19.81◦    29.48◦    39.93◦    0.68◦     29.70◦     50.07◦
                                      measured      0◦        −10◦      10◦       30◦       −20◦       −10◦
                  Roll (β) angle
                                      estimated     0.01◦     −9.78◦    11.51◦    31.13◦    −20.82◦    −10.09◦


all the points will belong to the only planar surface present in the scene. However, the position and
orientation determined in different executions will present slight differences depending on the points
used to solve (3.11). Hence, an exhaustive evaluation should be carried out to determine the accuracy
and precision of the method; this is done in the next chapter, together with the experimental results
of the generic approach. For this basic approach, the results shown in table 3.1 are enough to confirm
that the implemented method works and that it is therefore viable to continue by adding more
complexity to the scene.



3.3       Generic approach
At this point the main objective of this thesis has been achieved, since the external camera parameters
(R, t) have been determined with high accuracy and precision, as shown in the next chapter. However,
until now it has only been demonstrated that the method is applicable to scenes where a single plane


is visible to the camera. The next step is to add more complexity to the scene in order to achieve the
same objective in a real environment.
      For the moment, the external camera parameters cannot be retrieved directly from a real scene,
because there are many factors to take into account, such as the distinction between the ceiling and
the floor, and the discrimination of objects, furniture or people. While in the basic approach the scene
contains only one planar surface, more objects (e.g. furniture or people) will now be added to increase
its complexity. Notice that the denser the scene, the more complex it becomes to determine the
external camera parameters.
      Starting from a scene with one planar surface (the floor) and a few objects (e.g. furniture or people),
the first problem is to determine, from the acquired cloud of points, only the 3D points which belong
to the floor, since this is the region of interest for performing the calibration. If the algorithm is able
to distinguish points belonging to the floor (at least three non-collinear points are necessary to
determine a plane) from the rest of the scene, the complexity of the scene is reduced to that of the
basic approach (Section 3.2), and thus the external parameters can be obtained.
      In order to distinguish the 3D points which belong to the floor (modelled mathematically as a
plane so that it can be used as a calibration target) from the 3D points which belong to objects, a
good approach is to identify the 3D points which fit a plane, that is, to determine the parameters of a
plane Π with the maximum number of 3D points belonging to it (inlier points).
      Bauer et al. [2] present a system for creating plane-based models of buildings from dense 3D point
clouds. In order to detect all dominant facade planes they use a RANSAC-based plane detection
method, which follows these steps:

     1. Randomly choose a seed point from the cloud of points (this is comparable to a bucketing
         approach).

     2. Perform a RANSAC based plane detection algorithm in a local neighbourhood around the seed
         point (a KD-tree is used for fast nearest neighbour queries).

     3. Verify the resulting local plane hypothesis.

     4. Refine the local RANSAC solution with an iterative least square fit using the whole point cloud.

     5. Repeat from step 1 until no new planes are found.

      Instead of a 3D cloud of points resulting from a dense image matching process applied to multiple
images, in our case we have a 3D cloud of points resulting from a single acquisition (Section 3.1).
However, we share the approach presented in [2]: in the same way, the extraction of the 3D points
which belong to the plane can be performed by applying a RANSAC-based plane detection algorithm.


3.3.1       RANSAC based plane detection
The RANSAC algorithm is an algorithm for the robust fitting of models, robust in the sense of a good
tolerance to outliers in the experimental data. The estimate is only correct with a certain probability,
since RANSAC is a randomised estimator.
   The structure of the RANSAC algorithm is simple but powerful. First, samples are drawn uniformly
and randomly from the input data set; each point has the same probability of selection. For each
sample, a model hypothesis is constructed by computing the model parameters from the sample data.
The size of the sample depends on the model one wants to fit; typically it is the smallest size sufficient
to determine the model parameters, since drawing more than the minimal number of sample points is
inefficient [5]. In our case the sample size is three, since three points are needed to define a plane.
   A RANSAC-based plane detection has been implemented in fitPlaneRansac.m. During a certain
number of iterations (maxTrials), it searches for the three 3D points of the scene that determine the
plane coefficients with the maximum number of inliers (according to the plane definition function
definePlane.m). The inliers corresponding to this best plane estimate are returned by the function and
used as input to the fitPlane.m procedure, which computes by least squares the coefficients [a, b, c, d]
of a plane Π [8]:

                        \Pi : \; ax + by + cz + d = 0        (3.23)
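   The following MATLAB sketch illustrates the kind of RANSAC loop and least-squares refit described
above. The function name, the parameters (maxTrials, distTol) and the internal details are assumptions
made for the example, not the actual content of fitPlaneRansac.m and fitPlane.m.

      % Minimal RANSAC plane-detection sketch (assumed names and parameters).
      % cloud: n x 3 matrix of 3D points; maxTrials: number of iterations;
      % distTol: inlier distance threshold in metres.
      function [coeffs, inliers] = fitPlaneRansacSketch(cloud, maxTrials, distTol)
          bestInliers = [];
          n = size(cloud, 1);
          for trial = 1:maxTrials
              idx = randperm(n, 3);                       % minimal sample: three points
              p1 = cloud(idx(1), :); p2 = cloud(idx(2), :); p3 = cloud(idx(3), :);
              normal = cross(p2 - p1, p3 - p1);           % plane normal from the sample
              if norm(normal) < eps, continue; end        % degenerate (collinear) sample
              normal = normal / norm(normal);
              d = -dot(normal, p1);                       % plane: normal * [x y z]' + d = 0
              dist = abs(cloud * normal' + d);            % point-to-plane distances
              inl = find(dist < distTol);
              if numel(inl) > numel(bestInliers)
                  bestInliers = inl;                      % keep the hypothesis with most inliers
              end
          end
          inliers = bestInliers;
          % Least-squares refit on all inliers (the role of fitPlane.m): the normal is the
          % singular vector associated with the smallest singular value of the centred points.
          pts = cloud(inliers, :);
          c = mean(pts, 1);
          [~, ~, V] = svd(bsxfun(@minus, pts, c), 0);
          normal = V(:, 3)';
          coeffs = [normal, -dot(normal, c)];             % [a b c d] of (3.23)
      end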


3.3.2       Defining the world coordinate system {W }
At this point the algorithm is able to distinguish between objects (e.g. furniture or people) and the
calibration target, in our case a plane which corresponds to the floor. The difficulty appears when more
complexity is added to the scene, understanding as more complexity the presence of more than one
plane: the algorithm must be able to distinguish the calibration plane (e.g. the floor) from other planes
(e.g. the ceiling or walls).
   A scene comprising more than one plane as well as objects is considered a real environment.
Therefore, by solving the calibration in this new setting, the objective of the thesis will be completely
achieved.
   Figure 3.4 depicts how the camera coordinate system is established in order to introduce some
constraints regarding the positioning of the camera. The right-hand rule is used to fix the axis
directions: the index finger corresponds to the x-axis, the middle finger to the y-axis and the thumb
to the z-axis. The origin of the camera coordinate system is placed at the bottom-right of the camera,
where the x-axis corresponds to the base and the y-axis to the lateral side of the camera.




                            Figure 3.4: Camera coordinate system definition (camera frame {C} with axes cx, cy and cz).


      The world coordinate system is established according to our preferences. If the camera is situated
at a certain altitude (tz) and looking directly at the floor (default position), the world coordinate
system coincides with the camera coordinate system up to a translation along the z-axis. This default
configuration is depicted in figure 3.5.
                              Figure 3.5: World coordinate system definition: the world frame {W} lies on the floor
                              plane Π, directly below the camera frame {C} at height tz; n is the floor normal.

      The camera can move freely in the scene but must always be looking towards the floor, since the
floor is the calibration target and must be visible to the camera. The allowed rotation about the x-axis
of the camera coordinate system is therefore restricted to the range [0, π/2], as discussed in
section 3.2.1, and the unit vector at the principal point of the camera, cu, must always point towards
the floor, as depicted in figure 3.6.
      In order to distinguish the calibration plane (e.g. the floor) from the rest of the planes in the scene
(ceiling or walls), the following criterion is used:



                          Figure 3.6: Defined camera rotation over its x-axis (pitch angle γ between the camera
                          frame {C} and the floor plane Π, at distance d along the unit vector c[dux, duy, duz]T).


    Notice that all the 3D point coordinates in the scene are referenced to the camera coordinate
system, whose origin is p0 = (0, 0, 0). A line s is defined from the origin p0 through the calibration
target Π, intersecting it at p1. In addition, the distance between p0 and p1 must be minimal, which
implies that s is orthogonal to Π and hence the direction vector of s is the same as n, the normal
vector of Π (Figure 3.7).




  Figure 3.7: Identification of the calibration plane with reference to the camera coordinate system:
  the line s from the camera origin p0 along the floor normal n intersects the floor plane Π at p1.



   The definition of the line s is:


                        s: \;\frac{x - p_0(1)}{n(1)} = \frac{y - p_0(2)}{n(2)} = \frac{z - p_0(3)}{n(3)}        (3.24)


or in the parametric form:

                        s: \begin{cases} x = p_0(1) + \lambda n(1) \\ y = p_0(2) + \lambda n(2) \\ z = p_0(3) + \lambda n(3) \end{cases} \quad \text{with } p_0 = (0, 0, 0) \qquad\Longrightarrow\qquad s: \begin{cases} x = \lambda n(1) \\ y = \lambda n(2) \\ z = \lambda n(3) \end{cases}        (3.25)

      In order to determine the intersection point p1, it is necessary to obtain the λ value from the
system of equations composed of (3.23) and (3.25). Combining these equations gives:

                        \lambda\, n(1) \cdot n(1) + \lambda\, n(2) \cdot n(2) + \lambda\, n(3) \cdot n(3) + d = 0

                        \lambda \left( n(1)^2 + n(2)^2 + n(3)^2 \right) = -d

                        \lambda = \frac{-d}{n(1)^2 + n(2)^2 + n(3)^2}        (3.26)
      Once λ has been determined, it is possible to calculate the intersection point p1 between the line
s and the plane Π. Its coordinates must satisfy the following constraints (a sketch of this verification
is given at the end of this section):

      1. The Z component of p1 must be negative, since the calibration target should be below the
        camera. Notice that the unit vector cu corresponding to the camera's principal point is
        (0, 0, −1), to be consistent with the camera coordinate system.

      2. The absolute value of the dot product between n and the camera's x-axis is the cosine of the
        angle between the plane normal and the base of the camera (camera x-axis), i.e. the sine of the
        roll angle β (rotation about the y-axis). As defined in section 3.2.1, the roll rotation is restricted
        to [−π/4, π/4]; therefore the absolute value of this dot product must lie in the range [0, √2/2].

      With this verification, the calibration object can be identified and used to accomplish the thesis
objective. It must be noted that only one plane below the camera is considered as the calibration
target. Thus, if e.g. a step is seen by the camera, the height of this step may introduce an offset in the
translation vector returned by the algorithm, while the determination of the camera orientation is not
affected.
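      As an illustration, the plane-candidate verification described above could look as follows in
MATLAB; the function name and its interface are assumptions made for the example.

      % Minimal sketch: decide whether a detected plane [a b c d] can be the calibration (floor) plane.
      % Constraint 1: the intersection of the principal ray with the plane lies below the camera.
      % Constraint 2: the roll implied by the plane normal stays within [-pi/4, pi/4].
      function ok = isFloorCandidate(coeffs)
          n = coeffs(1:3) / norm(coeffs(1:3));         % unit normal of the plane
          d = coeffs(4)   / norm(coeffs(1:3));
          lambda = -d / (n * n');                      % (3.26); equals -d for a unit normal
          p1 = lambda * n;                             % intersection point of line s with the plane
          belowCamera = p1(3) < 0;                     % constraint 1: negative Z component
          rollOk = abs(dot(n, [1 0 0])) <= sqrt(2)/2;  % constraint 2: |n . x-axis| <= sqrt(2)/2
          ok = belowCamera && rollOk;
      end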


3.4       Improvements on the generic approach
Two main improvements have been implemented in order to increase the computational efficiency of
the method and the accuracy and precision of the results.


   What requires the most computation time in the external parameter estimation is the execution of
the RANSAC-based plane detection. As RANSAC works iteratively, the time needed to fit a plane
model to the cloud of points can be considerable. In [5], different ways of improving its efficiency
significantly are presented, and the most emphasised ones have been implemented.



3.4.1     RANSAC runtime improvements

The speed of execution of the RANSAC algorithm depends on two factors: firstly, the number of
samples which have to be drawn to guarantee a certain confidence of obtaining a good estimate; and
secondly, the time spent evaluating the quality of each hypothetical model. The latter is proportional
to the size of the data set [5].
   Regarding the first factor, the number of iterations is kept fixed; the data set, on the other hand,
can be reduced significantly by discarding all the 3D points which belong to a rejected (false) plane.
   A certain number of inliers is required in order to accept a sub-cloud of 3D points as a calibration
plane. This quantity depends on the scene configuration: if the scene contains only the calibration
plane, all the scene points can be considered inliers, whereas the more objects and planes there are in
the scene, the fewer inliers can be required to define the calibration plane. Thus, the number of
required inliers is a threshold defined by the user, easily configurable in the user interface (Figure 3.11).
   If an estimated plane does not fulfil the restrictions imposed on a calibration plane (Section 3.3.2),
all the points which belong to this plane can be neglected in the following RANSAC iterations. Hence,
at each RANSAC iteration more and more 3D points are discarded, reducing the data set and
accelerating the RANSAC execution, as sketched below.
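   A minimal sketch of this data reduction is shown below, reusing the hypothetical helpers
fitPlaneRansacSketch and isFloorCandidate introduced earlier; minInliers, maxTrials and distTol stand
for the configurable thresholds and are assumptions of the example.

      % Minimal sketch: discard the points of rejected planes so that later RANSAC
      % iterations work on a progressively smaller data set.
      remaining = cloud;                               % n x 3 cloud of points
      while size(remaining, 1) >= minInliers
          [coeffs, inliers] = fitPlaneRansacSketch(remaining, maxTrials, distTol);
          if numel(inliers) < minInliers
              break;                                   % no sufficiently supported plane left
          end
          if isFloorCandidate(coeffs)
              break;                                   % calibration plane found
          end
          remaining(inliers, :) = [];                  % discard wall/ceiling points for good
      end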



3.4.2     Method for enhancing the contrast in range images

TOF cameras rely on active illumination and deliver range data by measuring the time needed for
light to travel from the camera light source to the scene and back to the camera. However, the depth
measurement may be affected by secondary reflections, namely reflections between the lens and the
imager, hereafter designated as scattering. In situations where a wide range of depths is imaged, the
large signal range can make scattering from close objects compete with the primary signal from far
objects, causing artifacts in the depth measurements [20].
   A method for enhancing the contrast in range images is presented in an IEE patent [19]. The patent
describes how to correct the distance measurement error caused by the loss of contrast, using an
image processing algorithm that involves the measured amplitude and distance images.
   After applying this correction to the raw data, the 3D points which lie beyond the non-ambiguity
distance (Table 1.1) can be neglected, and thus the amount of data to process is reduced, which
improves the computation time.


3.5       External camera parameter estimation from the whole sequence
Until now the calibration approach has been applied directly to a single frame of an acquisition
sequence. In order to increase the accuracy and precision of the recovered external parameters, a
natural extension is to consider several or all of the frames of an acquisition sequence.
      From this point, three possible approaches can be followed:

      1. The first possibility consists of computing the calibration plane with the most inliers for each
        frame, then retrieving the camera orientation (β and γ rotation angles) and position (tz value)
        as documented in section 3.2 for each calibration plane, and finally computing the median of
        all these values [Figure 3.8(a)]. By computing the median, the extreme values (outliers) are
        discarded, whereas with the mean these values would perturb the result.

      2. The second possibility consists of merging the clouds of points of all frames and computing a
        single calibration plane with the most inliers. Once the calibration target is determined, the
        camera orientation and position can be estimated. However, this approach is not viable because
        it runs out of memory [Figure 3.8(b)].



        Figure 3.8: First and second approach to consider several frames in the recorded sequence.
        (a) First approach: a plane Πi and the parameters (βi, γi, tzi) are computed for each frame i
        and the medians are taken. (b) Second approach: the clouds of points of all frames are merged
        and a single plane Π with parameters (β, γ, tz) is computed.


      3. The last possibility consists of computing the calibration plane with the most inliers for each
        frame and then computing a plane which fits all the previous planes; the external camera
        parameters are derived from this global plane [Figure 3.9]. In this case, a problem appears
        when there is more than one plane below the camera, e.g. a step on the road: this case is
        avoided by the first two approaches, but here the results would be perturbed.


            Figure 3.9: Last approach to consider several frames in the recorded sequence: the inliers of
            the per-frame planes Π1 ... Πn are merged into a single plane Π, from which β, γ and tz are derived.


   From the presented approaches to consider more than one frame of the sequence, the proposed
solution corresponds to the first possibility. Hence, a plane is fitted to the cloud of points of each
frame and the camera position and orientation are estimated per frame. At the end, all the values are
sorted and the median is taken as the representative value for the whole sequence.
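As an illustration of this first approach, the following minimal Python sketch shows how per-frame
estimates could be combined through the median. It is not the thesis' Matlab implementation: the
helper fit_plane_ransac and the angle and height conventions used in estimate_from_plane are
assumptions made for this example only.

    import numpy as np

    def estimate_from_plane(a, b, c, d):
        # Normal of the plane a*X + b*Y + c*Z + d = 0, normalised.
        n = np.array([a, b, c], dtype=float)
        n = n / np.linalg.norm(n)
        # One plausible convention (an assumption of this sketch, not the thesis' formulas):
        beta = np.degrees(np.arctan2(n[0], n[2]))    # roll: tilt of the normal in the X-Z plane
        gamma = np.degrees(np.arctan2(n[1], n[2]))   # pitch: tilt of the normal in the Y-Z plane
        t_z = abs(d) / np.linalg.norm([a, b, c])     # distance from the camera centre to the plane
        return beta, gamma, t_z

    def calibrate_sequence(frames, fit_plane_ransac):
        # frames: list of (m, 3) point clouds, one per frame.
        # fit_plane_ransac: hypothetical helper returning (a, b, c, d) of the plane
        # with the most inliers for a single frame (the RANSAC step of Chapter 3).
        estimates = np.array([estimate_from_plane(*fit_plane_ransac(pts)) for pts in frames])
        # Median over all frames as the representative value for the whole sequence.
        beta, gamma, t_z = np.median(estimates, axis=0)
        return beta, gamma, t_z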


3.6     Camera configuration
The proposed method is able to estimate the external camera parameters from a scene where at least
a few points coming from the calibration target (the floor for this purpose) are visible to the camera
during the calibration process. The method is able to distinguish the plane which corresponds to
the calibration target from the rest of the potential planes, such as walls or the roof, that appear in
the camera's field of view. In addition, it does not matter how many objects (e.g. furniture or people)
are present in the scene, since they do not fit the plane model and are thus discarded. However,
a drawback appears if the camera configuration changes.
   Until this point, the camera has been moved and rotated over the scene and its position and
orientation have been determined with the new proposed calibration method. However, the camera
coordinate system has always been fixed, as depicted in figure 3.4. The new challenge which has
been added to the project is the capability to allow different camera configurations (Figure 3.10).
   Depending on the application it may be necessary to install the camera in a vertical position
instead of the horizontal (default) position. This implies applying a 90° or 270° rotation to the default
position, so that the x-axis of the default configuration becomes the y-axis of the new configuration
and the y-axis of the default configuration becomes the x-axis.


     In order to allow these new camera configurations (Figure 3.10), four buttons have been added to
the application interface (Figure 3.11), where the user can select in which configuration the camera
has been installed (more details are presented in Appendix A).




Figure 3.10: Camera configuration. (a) default position, (b) camera rotated 180°, (c) camera rotated
90° and (d) camera rotated 270°.




                  Camera configuration buttons


                       Figure 3.11: External-Self-Calibration toolbox interface.

     According to the implementation, the external values are always estimated in the same way and,
once they are computed, a rotation of 90°, 180° or 270° is applied to the resulting extrinsic rotation
matrix. Unfortunately, if one wants to exclude the roof or walls in the scene, it is unavoidable to
select the camera configuration in the interface, since the camera by itself is not able to know in
which configuration it is.
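As an illustration only, the following Python sketch applies such a correction under the assumption
that the configuration rotation acts about the camera's optical (z) axis and is left-composed with the
estimated rotation; the actual composition used in the Matlab toolbox may differ.

    import numpy as np

    def rot_z(angle_deg):
        # Rotation matrix about the optical (z) axis of the camera.
        a = np.radians(angle_deg)
        c, s = np.cos(a), np.sin(a)
        return np.array([[c, -s, 0.0],
                         [s,  c, 0.0],
                         [0.0, 0.0, 1.0]])

    def apply_camera_configuration(R_estimated, configuration_deg):
        # configuration_deg: 0 (default), 90, 180 or 270, as selected by the user.
        # The extrinsic rotation is estimated as if the camera were in the default
        # configuration and corrected afterwards (composition order assumed here).
        if configuration_deg not in (0, 90, 180, 270):
            raise ValueError("unsupported camera configuration")
        return rot_z(configuration_deg) @ R_estimated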
Chapter 4

Experimental results

This chapter reports on an analysis of the results obtained with the methods presented in Chapter 3.
It is divided into two main sections: firstly, the accuracy and precision of the basic approach, where
only a planar surface is considered in the scene, are analysed; secondly, it is shown how the method
is able to discriminate objects such as furniture or people and how it can distinguish the calibration
plane from the rest of the planes in the scene. By identifying the calibration plane in a complex
scene, its complexity is reduced to that of the basic approach. All data has been acquired in a real
environment with different camera configurations, orientations and positions.



4.1     A real environment
In order to achieve the goal of the thesis, the method has to work in non-lab conditions, such as the
setup depicted in figure 4.1. As can be observed from the figure, objects such as people or furniture,
and also walls, are considered in this new setup. With the generic approach presented in Chapter 3,
all the 3D data which do not belong to the calibration plane are removed in order to reduce a complex
scene to a trivial scene where the method is able to extract the external camera parameters.
   It must be noticed that no geometrically known calibration object, such as a chessboard pattern
or known reference points, has been used in the scene. Only the points coming from a planar surface
below the camera are used to model the calibration plane. Normally this plane corresponds to the
floor, and at least part of it has to be visible in the camera's field of view during the calibration process.



4.2     Basic approach evaluation
This basic approach evaluation determines the accuracy and precision of the presented method. For
this purpose a Motorised Goniometric Test Bench has been built by Roberto Orsello (Appendix B)
in order to orientate the 3D camera in any desired roll and pitch rotation angle (inside the working
range). With this device it is possible to achieve an accuracy of ±1◦ since it is controlled by two
stepper motors.






                                 Figure 4.1: A real environment setup.


     It has been necessary to implement a controller unit interface under Matlab in order to control
both stepper motors and to set the camera to the desired orientation (more details are presented in
Appendix B).
     With reference to the test used for the accuracy evaluation, a new module to handle the controller
unit has been implemented in C and integrated by Thomas Solignac into the current 3D camera
acquisition software. Once it was functional, a script has been executed to make an acquisition every
5 degrees of displacement in roll and pitch over the whole working range. In total, with 19 steps in
pitch angle ([−π/2, π/2]) and 7 steps in roll angle ([0, π/4]), 133 acquisitions have been performed to
validate the method accuracy.
     The difference between the empirical orientation, determined by the test bench, and the estimated
orientation given by the presented calibration method determines the accuracy of the method.
     In order to determine the precision it is necessary to execute the method several times over the
same acquired raw data; in our case, 10 executions on the same raw data have been performed to
compute the variability in the reported results.


4.2.1      Determining the accuracy and precision of the method
It must be noticed that accuracy is the degree of veracity, while precision is the degree of
reproducibility (Figure 4.2).




                               (a) High accuracy, but low precision.   (b) High precision, but low accuracy.


                                Figure 4.2: Accuracy versus precision.


   Since the method uses RANSAC to identify the calibration object and RANSAC starts with an
arbitrary seed, the results from different executions on the same acquisition may differ. A total
of 20 executions (N = 20) on the same raw data have been performed and, by analysing the variation
in the measurements statistically, that is, by computing the standard deviation σ of the estimated
values, the precision of the method is defined.
   The accuracy, in turn, is determined from the difference between the mean x̄ of the estimated
values and the empirical measures obtained from the motorised goniometric test bench.
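For clarity, the two quantities can be summarised in a few lines of Python; this is only a sketch
mirroring the definitions above, not the evaluation script used in the thesis, and the example numbers
are invented.

    import numpy as np

    def accuracy_and_precision(estimates, empirical):
        # estimates: the N values returned by repeated executions on the same raw data.
        # empirical: the reference value given by the motorised goniometric test bench.
        estimates = np.asarray(estimates, dtype=float)
        accuracy = abs(estimates.mean() - empirical)   # degree of veracity
        precision = estimates.std(ddof=1)              # degree of reproducibility
        return accuracy, precision

    # Example with invented numbers: 20 executions around an empirical pitch of 90 degrees.
    rng = np.random.default_rng(0)
    acc, prec = accuracy_and_precision(90 + rng.normal(1.1, 0.6, size=20), empirical=90)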
   Table 4.1 depicts the accuracy and precision on four arbitrary method executions with different
3D camera configurations:


           Table 4.1: Arbitrary method executions on different 3D camera configurations

   Pitch angle in ° (γ)      Empirical value   x          90        70        60        40
                             Estimated value   x̄      91.1213   71.8179   62.3639   42.1312
                             Accuracy       |x̄ − x|    1.1213    1.8179    2.3639    2.1312
                             Precision          σ       0.6377    0.6452    0.9090    0.5631

   Roll angle in ° (β)       Empirical value   x           0         5        10        15
                             Estimated value   x̄      -2.1718    3.7822   11.3024   14.3043
                             Accuracy       |x̄ − x|    2.1718    1.2178    1.3024    0.6957
                             Precision          σ       1.4166    1.0131    0.9926    0.8626

   Distance in cm (z-axis)   Empirical value   x         105       112       107        98
                             Estimated value   x̄       105.12    112.65    107.74     97.78
                             Accuracy       |x̄ − x|      0.12      0.65      0.74      0.22
                             Precision          σ         1.09      0.70      0.59      0.47

   As can be observed from table 4.1, a good estimation is achieved by the proposed method. Taking
into account the 133 different configurations, a roll accuracy better than 2 degrees and a pitch accuracy
better than 4 degrees can be stated. Regarding the precision, the standard deviation of the estimated
values stays below 1.5 degrees, which also indicates a very high precision.


4.2.2    Sensitivity to the noise
Internal camera imperfections from the manufacturing process disturb the measurement accuracy,
but more important are the environmental measurement uncertainties coming from scattering and
multiple reflections, which are harder to model statistically. These imperfections and environmental
effects appear in the camera acquisitions as noise, changing arbitrarily from frame to frame.
     To evaluate how sensitive the method is to noise, a complete static sequence of 50 frames has been
acquired and the calibration method has been executed 100 times on each frame. Then, the mean x̄
and standard deviation σ have been computed over all the estimated roll, pitch and altitude values,
and the sensitivity to noise has been estimated from their variability. It must be noticed that the
dispersion introduced by RANSAC has converged over the 100 executions on each frame.
     Table 4.2 shows the variability obtained between the estimated values during one acquired
sequence (50 frames). It can be observed that this variability is less than half a degree, which means
that the sensitivity to noise can be neglected.


                   Table 4.2: Sensitivity to the noise in a complete sequence (50 frames)

                                                 β       γ     z-axis
                                    σ of x̄     0.26    0.33     0.34
                                    σ of σ      0.12    0.14     0.08

     Table 4.3 depicts an example of the resulting values for six selected frames of a sequence. The total
mean and standard deviation, however, have been computed over the whole sequence.


                Table 4.3: Mean (x̄) and standard deviation (σ) in a sequence of frames

                               Empirical     fr.1      fr.5     fr.10     fr.15     fr.20     fr.25
               x̄ of γ              30       29.03     29.60     29.10     28.90     28.48     29.15
               σ of γ                0        0.81      0.99      1.07      1.18      1.00      1.03

               x̄ of β              10       10.33     10.63     10.49     10.27     10.34     10.79
               σ of β                0        0.76      1.03      0.87      0.89      0.68      0.85

               x̄ of (z-axis)      108      108.38    108.43    108.75    109.17    109.18    109.19
               σ of (z-axis)         0        0.45      0.65      0.62      0.45      0.53      0.57




4.3       Generic approach evaluation
The generic approach derives from the basic approach with improvements related to the scene com-
plexity. The method to determine the external camera parameters is exactly the same since it is based
on retrieving the external parameters from a calibration plane.
     Hence, what this generic approach contributes to the method is the identification of the calibration
plane among all the planes present in the scene and the discrimination of the objects in the scene.
Summarising, the goal of this generic approach is to transform a complex scene (understanding as
complex a real environment containing different objects, people or planes) into a trivial scene where
only the calibration plane is considered.


4.3.1     Discrimination of objects
Since the calibration method tries to fit the 3D points to a plane, all the 3D points which belong to
an object without a planar shape are directly neglected. Objects which do have a planar side are also
neglected, since normally they contain fewer 3D points than the required number of inliers. A problem
would be a planar object with a considerable area visible to the camera, which could lead to false
calibration parameters. Another problem could occur when a step appears in the scene: two or more
planes could then be selected as calibration target, leading to different estimates of the camera height.
The estimated orientation would, however, not be affected by this situation.
   Table 4.4 presents a few tests where objects appear in the scene during the external calibration
process. From this table it can be observed how the method is able to distinguish between objects and
the calibration plane. A 2D image has been captured with a conventional 2D camera in order to show
what is present in the scene, since it is difficult to interpret the scene from the amplitude image alone.
The points which the method considers to belong to the calibration plane are depicted in blue. If there
is any object in the scene with a planar side, its inliers are neglected since it is considered a false
calibration plane; in that case, these points are depicted in yellow. More details are given in the next
section, which deals specifically with the distinction between planes.


                                  Table 4.4: Object discrimination
           2D scene       Amp. Image 3D points Empirical measures               Estimated values

                                                               β=7                ∆β = 6.82
                                                               γ = 80             ∆γ = 80.81
                                                            z-axis = 92         ∆z-axis = 93.14


                                                               β=5                ∆β = 6.26
                                                              γ = 65             ∆γ = 66.36
                                                            z-axis = 90         ∆z-axis = 90.07


                                                              β=0                 ∆β = −0.35
                                                              γ = 28              ∆γ = 28.13
                                                           z-axis = 190         ∆z-axis = 191.67




4.3.2     Identification of the calibration plane
Section 3.3.2 presented the way to identify the calibration plane among the potential planes present
in the scene. In order to distinguish between planes such as walls and the floor (the latter being
considered as the calibration object), a few computations are needed to obtain the normal vector of
each plane and identify its orientation with respect to the camera.
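A minimal sketch of such a check is given below. It assumes, for illustration only, that the expected
floor normal points along the camera y-axis in the default configuration and reuses the π/4 tolerance
discussed in Chapter 3; the actual axis conventions and thresholds of the implementation may differ.

    import numpy as np

    def plane_normal(p1, p2, p3):
        # Unit normal of the plane passing through three non-collinear 3D points.
        n = np.cross(np.asarray(p2, float) - p1, np.asarray(p3, float) - p1)
        return n / np.linalg.norm(n)

    def is_calibration_plane(p1, p2, p3, down=(0.0, 1.0, 0.0), max_tilt=np.pi / 4):
        # Accept the plane as calibration target if its normal is within max_tilt of
        # the expected 'down' direction; the axis convention is an assumption of this sketch.
        n = plane_normal(p1, p2, p3)
        cos_angle = abs(np.dot(n, np.asarray(down, float)))   # normal sign is irrelevant
        return np.arccos(np.clip(cos_angle, -1.0, 1.0)) <= max_tilt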


     A few tests have been performed in order to show how the method is able to distinguish the
calibration plane from the rest of the planes in the scene, such as walls or the roof. Table 4.5 presents
a few tests where the blue points correspond to the calibration object inliers while the yellow points
are neglected, as they belong to false calibration planes (roof or walls in this case).


                           Table 4.5: Identification of the calibration plane
            2D scene      Amp. Image 3D points Empirical measures Estimated values

                                                            β=3                 ∆β = 3.11
                                                            γ = 80             ∆γ = 80.40
                                                         z-axis = 170        ∆z-axis = 170.89


                                                            β=4                 ∆β = 4.09
                                                            γ = 85             ∆γ = 87.01
                                                         z-axis = 176        ∆z-axis = 176.60


                                                            β=3                 ∆β = 3.12
                                                            γ = 72             ∆γ = 71.97
                                                         z-axis = 150        ∆z-axis = 149.05
Chapter 5

Conclusions and further work

Conclusions and personal impressions about each chapter of the thesis are presented in this chapter.
Afterwards, the scientific contributions of this thesis are discussed and, finally, further work and
future perspectives are presented.



5.1     Conclusions
This thesis introduces a new method for the calibration of the external parameters of a range imaging
camera, intended for real-environment applications. The method extracts the camera position and
orientation with an iterative non-linear simultaneous least-squares adjustment process.
   After reviewing the current calibration techniques applied to 3D cameras, it was necessary to
propose a new method which takes advantage of the specific 3D camera technology, since the existing
bibliography only adapts 2D calibration techniques.
   The presented method is able to determine the external camera parameters from a planar surface
present in the camera's field of view. From that plane, assumed by default to lie below the camera
and parallel to the camera x-axis, the method determines the roll and pitch orientation of the camera
and its altitude with respect to the world coordinate system. This scene configuration constitutes the
basic approach of this project and is the starting point for achieving the same goal in a real-environment
scene.
   To consider a scene as a real environment it is necessary to add more complexity than a simple
planar surface (the initial scene configuration). Objects such as furniture or people and, in addition,
more than one plane (walls and roof) are then added to the scene, and the method must be able to
discriminate between objects and planes and identify the calibration plane to be used for the
calibration process.
   Since the method is based on fitting a plane to the raw 3D data, the discrimination of objects is
quite easy, as long as the object contains no planar surface, by using the ranging arrays supplied by
the camera. It is for this reason that one of the most important applications of 3D cameras is
segmentation. The presented method uses a RANSAC algorithm to fit a plane to the data and returns
the three 3D points which generate the best plane, that is, the plane which contains the largest number
of inliers. It is evident that any planar surface in the scene will contain more points than any object;
thus, objects are discriminated automatically.
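A compact illustration of this idea, reduced to its essentials, could look as follows; the iteration count
and inlier tolerance are placeholder values, and the thesis' actual implementation, with its additional
constraints, is more elaborate.

    import numpy as np

    def ransac_plane(points, n_iterations=200, inlier_tol=0.02, rng=None):
        # points: (m, 3) array of raw 3D data; inlier_tol: point-to-plane distance threshold.
        # Returns the three sample points generating the best plane and its inlier mask.
        rng = np.random.default_rng() if rng is None else rng
        best_samples, best_inliers = None, np.zeros(len(points), dtype=bool)
        for _ in range(n_iterations):
            p1, p2, p3 = points[rng.choice(len(points), size=3, replace=False)]
            n = np.cross(p2 - p1, p3 - p1)
            norm = np.linalg.norm(n)
            if norm < 1e-9:                        # degenerate (collinear) sample, skip it
                continue
            n = n / norm
            distances = np.abs((points - p1) @ n)  # distance of every point to the plane
            inliers = distances < inlier_tol
            if inliers.sum() > best_inliers.sum():
                best_samples, best_inliers = (p1, p2, p3), inliers
        return best_samples, best_inliers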
      Nevertheless, in order to determine which plane corresponds to the calibration plane it is necessary
to impose some constraints on the camera installation with respect to the world coordinate system.
These constraints are that the calibration plane must be below the camera and at least part of it must
be present in the camera's field of view; although an orientation in the range [−π/4, π/4] with respect
to the camera x-axis is allowed, it must be as horizontal as possible.
      With these constraints, the scene complexity is reduced to the initial scene configuration and the
method can be applied to determine the camera orientation and position.
      Once the generic approach had been implemented and tested, a few improvements were applied
in order to reduce the computation time. Since the calibration method is iterative and considers the
distance of all raw data points to each potential plane, reducing the number of points to be checked
decreases the computation time. Hence, the points which belong to a plane oriented outside the
allowed range are discarded for the following iterations.
      Another improvement to reduce the amount of data to process is related to the scattering
compensation for time-of-flight cameras [19]. A correction of the distance measurement error allows
neglecting the 3D points which lie beyond the non-ambiguity distance of the camera (7 m).
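As a toy example of this kind of data reduction (the scattering compensation itself is described in [19]
and not reproduced here), points whose radial distance exceeds the non-ambiguity range could simply
be dropped before plane fitting; the function name below is hypothetical.

    import numpy as np

    def drop_far_points(points, max_range=7.0):
        # Remove 3D points whose radial distance from the camera exceeds the
        # non-ambiguity range of the camera (7 m); points: (m, 3) array in metres.
        radial = np.linalg.norm(points, axis=1)
        return points[radial <= max_range]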
      Rigorous range camera calibration is essential for vision applications if one wants to speed up the
transition of vision systems from the laboratory to real environments where accurate measurements
are required. An important limiting factor is the lack of reliable methods for predicting the performance
of a proposed solution for a given application. It is also very difficult to model or predict the
performance without a near perfect and stable calibration process.



5.2       Contributions
This section briefly lists the contributions of this thesis project, although they have been discussed
and analysed in the previous sections.
     The contributions are:


   • A new state of the art on calibration methods for range imaging cameras. Several methods have
     been reviewed and shown to be extrapolations and adaptations of 2D camera techniques to
     3D cameras.

   • A basic approach which demonstrates that it is possible to determine the external camera
     parameters from the information supplied by the 3D camera.

     • Design and implementation of a generic approach where the external camera parameters can be
       retrieved from a real environment, where no geometrically known calibration target is used.


5.3      Further work
The objectives proposed at the beginning of this thesis have been satisfactorily achieved. A new method
for the estimation of the external camera parameters has been presented and discussed in chapters 3
and 4. Its accuracy and robustness make it useful for the applications currently under development
and also for future applications, allowing an easier installation of the camera in any setup. The person
in charge of installing the camera does not have to follow a strict camera setup, since the camera itself
is capable of retrieving its position and orientation as long as a few points of the calibration object
(e.g. the floor) are visible for a few instants.
   Since the application allows four different camera installations, user interaction is required to
indicate in which configuration the camera is installed, as the camera coordinate system must be
established before starting the external-self-calibration. If the camera coordinate system is not
determined, the method is unable to distinguish between the different planes in the scene and thus
to determine the one which corresponds to the calibration object.
   By software alone it is impossible to detect in which configuration the camera is, because there is
no appreciable difference between the floor and the roof: both are planes and both could be used as a
calibration object. Hence, as further work, a hardware modification to the IEE 3D camera is
proposed.


5.3.1     Opto-mechanical Orientation Sensor
Specific devices exist to determine their own orientation at any moment. An example is the SFH 7710
opto-mechanical orientation sensor from Osram [21]. This sensor detects vertical upright and
horizontal orientations. It is based on a moving mass (a steel ball) which is controlled by gravity. The
position of the ball corresponds to the orientation of the sensor and is detected by a light barrier at
one end of the ball's path (Figure 5.1). The light barrier consists of a phototransistor (PT) and a light
emitting diode (LED). If the phototransistor is unblocked, it receives light and generates a
photocurrent. An ASIC pulses the emitter, measures the photocurrent during the emitter on-time and
sets the tilt sensor output to ground (ball blocking the light) or to open drain (ball not blocking the
light). Therefore a pull-up resistor is needed to provide a high voltage level under the open-drain
condition and to limit the output current of the sensor when the output is low.
   The following real applications are described:


   • Digital cameras to automatically recognise whether pictures are taken in portrait or landscape
      format.

   • Portable devices with displays in order to rotate the image according to the device’s orientation.

   • Devices which need a defined horizontal position to operate (e.g. electronic compass).

   • Any application in which the distinction between horizontal and vertical upright positions matters.




Figure 5.1: Function of the SFH 7710. Orientation 1: Light path is blocked by ball. Orientation 2:
Light reaches the detector.


      For our purpose, by installing two opto-mechanical orientation sensors in the IEE 3D camera, its
orientation could be detected and the external-self-calibration would then be completely automatic,
independently of the camera configuration.
Appendix A

External-self-calibration toolbox

Together with the implementation of the new calibration method, a user interface has been
implemented under Matlab to facilitate the use of the calibration method by any user. From this
user interface the user can configure some input parameters of the method and directly obtain the
estimation of the camera position and orientation. It must be noticed that all user interaction is
checked in order to avoid wrong inputs.
   The user interface in question, presented in figure A.1, is divided into two main sections:

  1. Settings: This frame contains all the camera configuration aspects, specifying which optics are
     used, which data (*.c3b file) must be analysed, how the camera is installed (camera orientation)
     and which data must be shown (Figure A.2).

  2. Self Calibration: This frame allows the user to introduce the minimum number of required inliers
     in the calibration plane, the maximum number of RANSAC iterations, whether the calibration
     process must take into account all the sequence frames and, finally, whether the 3D point
     corresponding to the camera's principal point should be fixed as one of the points used to
     determine the calibration plane (at least three points are needed for this purpose).

   In addition, once the calibration process has been done, all the raw 3D points are represented,
allowing the user operations such as zooming in/out, rotating or moving the 3D representation.
   The estimated values are shown at the bottom right of the interface.








        Figure A.1: Implemented user interface in Matlab.




     (a) Distance image.                     (b) Phase image.




     (c) Amplitude image.               (d) Integration time image.


        Figure A.2: Allowed data to be showed to the user.
Appendix B

Motorised Goniometric Test Bench

In order to validate the results reported by the presented external-self-calibration method, it is unviable
to configure the camera roll and pitch angles with high accuracy by hand over the whole working range.
Thus, it has been necessary to build a Motorised Goniometric Test Bench prototype in order to control
the camera orientation with an accuracy of ±1° for each rotation angle to be determined.
   This prototype has been designed and built by Roberto Orsello, the chief designer of the advanced
engineering department at IEE.


B.1      Test Bench Architecture
Two subassemblies allow a direct angle control around the two axes (roll and pitch) without the need
for any complex kinematic computation. These two rotation axes cross the optical axis ten millimetres
behind the camera sensor. This choice avoids interference between the mechanical parts of the test
bench and the camera's field of view, a fact which must be taken into account to avoid the need for
scattering compensation.
   Stepper motors (NEMA23) have been used because:

   • There is no need for a positioning transducer. After the controller unit initialisation, homing
     is performed to obtain an absolute positioning reference.

   • There is a direct position control (no PID required).

   To achieve high angular accuracy and high torque, a harmonic gearbox (300:1) has been chosen.
   The full test bench has been designed in 3D (Pro/ENGINEER Wildfire 3.0), including all mechanical
components in full detail (Figure B.1).
   The better the design, the shorter the manufacturing time; hence the number of components and
mechanical parts has been minimised.








                       Figure B.1: Motorised Goniometric Test Bench 3D design.


     All mechanical parts of the two subassemblies have been CNC milled directly out of the 3D design.
      The final assembly was like a kit and the achieved precision matches the expected one
(Figure B.2).




                       Figure B.2: Motorised Goniometric Test Bench prototype.



B.2       Implementation of the controller driver in Matlab
The communication with the Motorised Goniometric Test Bench is done through a driver implemented
in Matlab, which allows setting the controller configuration, such as the serial port parameters, timeout,
transmission ratio between the stepper motors and the ....., or the axis configuration, and also
interacting with the stepper motors in order to reach a desired camera orientation.
      A friendly interface with user input control has also been implemented in order to make the test
bench usable by any non-expert user. Thus, just by setting the correct parameters and selecting the
desired orientation and velocity, the user is able to orientate the camera without worrying about
damaging the device.
      The user interface is divided into four different tabs:


B.2.1     Paths settings
The default controller settings, such as the default communication port number, baud rate, axis
configuration, transmission ratio or homing velocity, are saved in a configuration file which can be
browsed from this user interface tab (Figure B.3).




          Figure B.3: Motorised Goniometric Test Bench user interface: Path Settings tab.



B.2.2     ISEL Configuration
This tab allows the user to set all the controller settings before initialisation. It must be noticed that
all user inputs are checked in order to avoid any damage to the controller unit or stepper motors
(Figure B.4).




       Figure B.4: Motorised Goniometric Test Bench user interface: ISEL Configuration tab.


B.2.3     Robot Control
From this tab the user is able to change the camera orientation to any other orientation within the
working range. Only the axes which will be used are enabled (X and Y in our case) in order to prevent
any damage (Figure B.5).




         Figure B.5: Motorised Goniometric Test Bench user interface: Robot Control tab.



B.2.4     Error reporting
If there is any error in the serial port communication, a timeout overrun, or any unexpected response
from the controller unit, it is reported in this tab (Figure B.6).




        Figure B.6: Motorised Goniometric Test Bench user interface: Error Reporting tab.
Bibliography

 [1] Annexe a représentation d'une rotation. Bibliothèques de l'INSA de Lyon.

 [2] Joachim Bauer, Konrad Karner, Konrad Schindler, Andreas Klaus, and Christopher Zach. Seg-
     mentation of building models from dense 3d point-clouds. VRVis Research Center for Virtual
     Reality and Visualization. Institute for Computer Graphics and Vision, Graz University of Tech-
     nology.

 [3] J. Beraldin, S. El-Hakim, and L. Cournoyer. Practical range camera calibration. Proc. Soc.
     Photo-Opt. Instrum. Eng.

 [4] J. Y. Bouguet. Camera calibration toolbox for matlab.
     http://www.vision.caltech.edu/bouguetj/calib_doc/index.html.

 [5] H. Cantzler. Random sample consensus (ransac). Institute for Perception, Action and Behaviour,
     Division of Informatics, University of Edinburgh.

 [6] J. Forest Collado. New methods for triangulation-based shape acquisition using laser scanners.
     PhD thesis, Department of Electronics, Computer Science and Automatic Control. University of
     Girona, May 2004.

 [7] Andrew Fitzgibbon, Maurizio Pilu, and Robert B. Fisher. Direct least square fitting of ellipses.
     IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 21, No. 5, May 1999.

 [8] G. H. Golub and C. F. Van Loan. Matrix Computations. Third edition, 1996. Johns Hopkins
     University, Press, Baltimore, MD, USA.

 [9] R. I. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge
     University Press, ISBN: 0521540518, second edition, 2004.

[10] J. Heikkila. Camera calibration toolbox for matlab (version 3.0).
     http://www.ee.oulu.fi/~jth/calibr/.

[11] Janne Heikkila and Olli Silven. A four-step camera calibration procedure with implicit image
     correction. Infotech Oulu and Department of Electrical Engineering. University of Oulu. FIN-
     90570 Oulu, Finland.

                                                  50
51                                                                                   BIBLIOGRAPHY


[12] Berthold K.P. Horn. Tsai’s camera calibration method revisited. 2000.

[13] T. Kahlmann, F. Remondino, and H. Ingensand. Calibration for increased accuracy of the range
     imaging camera swissranger. ISPRS Commission V Symposium 'Image Engineering and Vision
     Metrology', XXXVI, Part 5. Dresden 25-27, 2006.

[14] Robert Lange and Peter Seitz. Solid-state time-of-flight range camera. IEEE Journal of Quantum
     Electronics, Vol. 37, No.3, March 2001.

[15] Mattias Marder. Comparison of calibration algorithms for a low-resolution, wide angle, 3d camera.
     18th of March, 2005. Master of Science Thesis, Stockholm, Sweden.

[16] Tiberiu Marita, Florin Oniga, Sergiu Nedevschi, Thorsten Graf, and Rolf Schmidt. Camera
     calibration method for far range stereovision sensors used in vehicles. Intelligent Vehicles Sympo-
     sium, June 2006. Computer Science Department, Technical University of Cluj-Napoca, Romania.
     Electronic Research Volkswagen AG, Germany.

[17] Carles Matabosch. Hand-held 3D-scanner for large surface registration. PhD thesis, Department
     of Electronics, Computer Science and Automatic Control. University of Girona, April 2007.

[18] Bruno Mirbach, Marta Castillo, and Romuald Ginhoux. Method for distortion-free coordinate
     determination from 3d-camera data. Patent.

[19] Bruno Mirbach, Romuald Ginhoux, Marta Castillo, Thomas Solignac, and Frederic Grandidier.
     Method for enhancing the contrast in range images. Patent.

[20] James Mure-Dubois and Heinz Hugli. Real-time scattering compensation for time-of-flight cam-
     era. University of Neuchatel. Institute of Microtechnology, 2000 Neuchatel, Switzerland.

[21] OSRAM. Opto-mechanical orientation sensor sfh 7710. April 2007. Application Note.

[22] Neven Santrac, Gerald Friedland, and Raul Rojas. High resolution segmentation with a time-of-
     flight 3d-camera using the example of a lecture scene. September 2006. Freie Universitat Berlin.
     Department of Mathematics and Computer Science.

[23] Zhengyou Zhang. Microsoft easy camera calibration tool.
     http://research.microsoft.com/~zhang/calib/.

[24] Zhengyou Zhang. A flexible new technique for camera calibration. Technical Report
     MSR-TR-98-71, Microsoft Research, 1998, last updated on Aug. 10, 2002.

				