Object Tracking with Iphone 3Gs

                  Lars Alin




                  May 25, 2010
Master’s Thesis in Computing Science, 30 credits
      Supervisor at CS-UmU: Ola Ågren
           Examiner: Per Lindström




            Umeå University
   Department of Computing Science
          SE-901 87 UMEÅ
              SWEDEN
                                        Abstract

In June of 2007 Apple Inc. released the smartphone Iphone. It was a groundbreaking
success that set a new standard for what a smartphone should be able to do. Apple has
improved the Iphone every year since then and the 3Gs is the newest Iphone model. As
the phones have improved, both when looking at hardware and software, the applications
have improved as well. The Iphone 3Gs provides the possibility to use the camera as an
application background and with that the possibility to analyze the surroundings, making
it possible to track objects that the phone is pointed towards.
    This thesis examines how object tracking can be implemented in applications for Iphone
3Gs as well as providing a survey of four different areas of use that have been implemented
in Xcode: an augmented reality car game, a letter tracking application, a face recognition
application and an object recognition application.
Contents

1 Introduction
  1.1   Task
  1.2   Iphone 3Gs
  1.3   Augmented Reality and Tracking
  1.4   Outline of the Thesis

2 Problem Description
  2.1   Problem Statement
  2.2   Purposes
  2.3   Methods
  2.4   Related Work

3 Tracking in Handheld Devices
  3.1   Introduction
  3.2   Algorithms
        3.2.1   Marker tracking
        3.2.2   Edge detection
        3.2.3   Mean-shift algorithm
        3.2.4   Parallel Tracking and Mapping
  3.3   Fields of interest

4 Accomplishment
  4.1   Preliminaries
  4.2   How the Work was done
        4.2.1   The Preparing and Designing phase
        4.2.2   Early Development phase
        4.2.3   Development of Tracking
        4.2.4   Completion
  4.3   Conclusions

5 Results
  5.1   Tracking
  5.2   Augmented Reality Car Game
        5.2.1   Icon and Menus
        5.2.2   Game Mode
  5.3   Object Tracking
        5.3.1   Icon and Menus
        5.3.2   Object recognition
        5.3.3   Letter recognition
        5.3.4   Face recognition

6 Conclusions
  6.1   Limitations
  6.2   Future Work

7 Acknowledgements

References

A Concept sketches

B Lo-Fi

C Interactive prototypes
List of Figures

 1.1   Different companies' shares of the smartphone market worldwide in percent
 1.2   The front and the camera on the back of an Iphone 3Gs

 3.1   An Iphone using a mean-shift algorithm tracking an orange object
 3.2   An early marker used by the ARToolKit
 3.3   An illustration of how camera angle and marker angle are mapped
 3.4   Edge detection on calculator and pen
 3.5   Histogram of a normalized colorspace
 3.6   Parallel tracking and mapping

 4.1   Preliminary time chart on the project

 5.1   Cargame icon
 5.2   Cargame splashscreen
 5.3   Screenshot from the game played on a desk at North Kingdom
 5.4   Icon to the application: What?What?
 5.5   Splashscreen and menu of the application
 5.6   Tracking of the Apple logo on a MacBook
 5.7   The letter tracking function in progress
 5.8   Face tracking in progress

 A.1   Concept sketch on the car game, before it was reduced to 2D
 A.2   Concept sketch on the letter reading application

 B.1   Lo-fi sketches on possible ways to steer the car
 B.2   Lo-fi sketches on possible ways to steer the car

 C.1   HiFi prototype to test the usability of a spinning steering wheel with gas and
       brake pedals
 C.2   HiFi prototype to test the usability of a steeringcross

Chapter 1

Introduction

With the technical progress of smartphones today, designers and developers of software
suited for these smartphones strive to push the edge of what is possible to create. One
of these fields, for which the technical progress is essential for development, is augmented
reality. Augmented reality (AR) is a term for merging computer generated material into
the physical world in real time, see section 1.3.
    AR applications can be created in two ways. In the first, the application does not
consider its surroundings and merges digital objects into the view irrespective of what is
displayed in the physical world. In the second, the application reacts to what is displayed
in the physical world and merges it with appropriate digital objects. In order to match the
digital objects with physical objects, the physical world has to be analyzed. This is where
object tracking comes into play, and this is the kind of augmented reality that this thesis
is built around. Tracking makes it possible for the computer, which in this thesis is the
Iphone 3Gs, to find and identify objects, and then react to different situations.
    There are several reasons for choosing the Iphone 3Gs as the device. It contains all the
hardware needed to run these kinds of applications, and the smartphone market is
surrounded by hype and is growing continuously, see section 1.2. In just a couple of years
the Iphone has taken approximately 17.8 percent of the worldwide smartphone market, as
measured in the third quarter of 2009, see Figure 1.1.




 Figure 1.1: Different companies' shares of the smartphone market worldwide in percent [1]





1.1     Task
The task is to investigate how far it is possible to take AR and object tracking with the
Iphone 3Gs. The main goal of the task is to produce software which shows the possibilities
in the form of an augmented reality car game and a tracking application that both track
and make some recognition of what is tracked.
    The work is carried out in collaboration with North Kingdom, a digital creative
agency from Sweden with its main offices in Stockholm and Skellefteå. North Kingdom
uses digital storytelling in innovative ways to provide clients with digital media [2]. The
assignment provided by North Kingdom lies outside their ordinary area of work at the
moment but, as they strive to be at the leading edge of development, investigations like
this one are essential to push the limit of what they can offer their clients even further.


1.2     Iphone 3Gs
The Iphone 3Gs is the third version of Apple's praised smartphone.




          Figure 1.2: The front and the camera on the back of an Iphone 3Gs [3]


    It contains features such as a touch screen, voice control, accelerometers, a proximity
sensor, an ambient light sensor, Wi-Fi, a digital compass, GPS and more. When the phone is
used as a tool for augmented reality and tracking, its key features and limitations are:
   – Camera The smartphone is equipped with a 3 megapixel camera with autofocus
     and a frame rate of 30 frames per second [3]. A limitation is that no flash is
     provided, which limits use to areas that are already well lit.
   – 3.5 inch multi-touch display The screen has a 480-by-320-pixel resolution which
     enables the user to easily interact with the phone [3]. Because of the widescreen format
     the camera view has to be scaled by a factor of 1.3 in order to fill the whole screen.
   – Processor The Iphone 3Gs has a 600 MHz CPU and 256 MB of RAM that contribute
     to a fast and powerful handheld device [4].
   – Iphone SDK At the time of writing, the newest version of the Iphone SDK is
     3.1.2 [3]. This update makes it possible to capture (print) the screen, so that the
     screenshot can be analyzed. A huge limitation of this version of the SDK is that
     it is impossible to get access to the raw data stream from the camera, either through
     the SDK or through any workaround supported by Apple.


1.3     Augmented Reality and Tracking
Although this thesis focuses on the tracking part of augmented reality, there is a need to go
a little deeper into what AR actually is. Augmented reality is a term first coined in the
1990s and, as stated in the introduction, the most commonly used description is that AR is
digital objects merged into the physical world [5]. The technology is traditionally used to
enhance the physical world, providing the user with information and assistance regarding
the field it is used in [5].
    Some fields where AR has been implemented throughout the years:
   – Military. Aircraft pilots use head mounted displays to help navigation [5]. Surveys
     have also been done regarding the use of AR in military operations in urban terrain [6].
   – Healthcare. AR is used, for example, to create live scenarios in simulators where
     surgeons can develop their skills [7]. It could also be used in real surgeries to assist
     the doctor [5].
   – Entertainment. The entertainment industry has also adopted this technology. The
     idea of having digital creatures in the real world can be found in several games, such
     as ARhrrrr - An augmented reality shooter [8].
    In order to map digital objects to the physical world, some kind of analysis of the world
has to be done. This is usually performed using a tracking algorithm. An in-depth study
of how the algorithms work and how they are implemented in handheld devices
is presented in Chapter 3.


1.4      Outline of the Thesis
    – Chapter 2 presents a detailed view of the task. In this chapter the task is stated
      and the purpose of the task is defined. It also contains an overview of the methods
      used when conducting this thesis and a look at what has already been done within
      this subject.
    – Chapter 3 presents a theoretical study of tracking in handheld devices.
    – Chapter 4 presents the preliminary timeframe and what was planned to be done.
      It also presents a detailed description of how the work was actually done and ends
      with a comparison of planned and actual activities.
    – Chapter 5 presents the final results of this project: a walk through the central parts
      of the resulting applications, complete with screenshots and pseudo code.
    – Chapter 6 presents the conclusions of the results. This chapter also states the
      limitations of the result and future work that the result could lead to.
    – Chapter 7 presents acknowledgements to those who contributed to this master's thesis.
Chapter 2

Problem Description

In this chapter an in-depth explanation of the task is presented. To clarify things the
problem is divided into sub-problems. This chapter also contains the purpose of the task,
how the task is solved and related work.


2.1     Problem Statement
The main problem is stated as: how well suited is the Iphone 3Gs as a platform for aug-
mented reality and object tracking?
   This question is not merely rhetorical; it is the starting point for developing
applications for the Iphone that test it by pushing the limits of what can be done.
   The sub-tasks are:
   – Augmented reality car game. The main idea of this game is for the user to be able
     to play a car game on any physical area with the physical objects posing as obstacles.
   – Tracking application. The focus of this application is to track and recognize
     objects and patterns. Examples include logotypes, desktop material and human faces.
     Another function is to read handwritten letters and display the resulting word.


2.2     Purposes
The purpose of this master's thesis is to provide insight into the abilities of the Iphone
3Gs when it comes to handling applications with object tracking and augmented reality.
The applications are therefore not meant to be uploaded to Apple's App Store and introduced
to the public, but rather to be used to display what is possible to create within this field.
    The intention is not to transfer this knowledge directly into North Kingdom's ordinary
activities, but as the mobile application market progresses this kind of work will most
likely become a part of those activities in the future.


2.3     Methods
This master's thesis begins with a literature review on the subject of tracking in
handheld devices. The review is the foundation of the thesis and it influences both the
design process and the development process of the project, see chapter 3.
After the literature review the project moves on to its second phase, development.
After a review of the capabilities of the development environment Xcode, a couple
of applications are designed. The design process contains sketches, LoFi prototypes, HiFi
prototypes and usability testing. The finished designs are implemented in Xcode.


2.4     Related Work
Numerous companies have produced demo videos that display their visions of what they
think they can do with augmented reality and tracking. But since no actual applications
are displayed, these cannot be considered here. One example of an AR and
object tracking application is Sudoku Grab [9]. This application can track a sudoku
puzzle and solve it, adding the missing numbers in the empty sudoku slots. At the time
of writing, the application is competing for an award for the most innovative use of
hardware in an Iphone application [10]. I use ideas similar to those of Sudoku Grab
in my implementations.
    Another example is produced by Georg Klein and David Murray. They have created an
application where the surroundings are analyzed, making it possible to render
3D characters so that they look like they are sitting on a desk in the physical world [11].
Their study is conducted on an older version of the Iphone on which it was possible to
access the raw camera stream.
    An example of an application that tracks its environment is Red Laser. It is
developed by Occipital [12] and is a good example of how it is possible to scan and analyze
the camera view on the Iphone.
Chapter 3

Tracking in Handheld Devices

This chapter gives an in-depth survey of some of the ways that tracking can be used in
mobile devices, as well as what they are used for. It discusses some of the most commonly
used tracking algorithms, such as marker tracking and edge detection, but it also
highlights some alternative methods.


3.1     Introduction
Tracking in handheld devices is almost synonymous with marker tracking. The reason for
this is how easy it is to calculate the angle of the camera and then rotate the AR
object according to that angle [13]. The process is described in the following section. One
of the earliest successful attempts to implement this on “off-the-shelf hardware”
was made by Daniel Wagner and Dieter Schmalstieg in 2003 [14]. They implemented an AR
marker tracking system on an unmodified personal digital assistant (PDA).
    The single biggest contribution within the field of marker tracking was made by Hirokazu
Kato. He developed the ARToolKit, an AR and marker tracking framework which became
open source in 2004 and has since had hundreds of thousands of downloads [15]. The
ARToolKit has since evolved into versions more optimized for handheld devices [16].
    Even if marker tracking is a big part of tracking with handheld devices, the subject
has a lot more to offer. If the main goal of the tracking is to detect shapes, an edge
detection algorithm is a better fit [17]. This can be done using a number of different
approaches [18, 19], but the common idea is to highlight pixels that differ from their
surroundings by more than a fixed threshold value in order to detect object edges.
    Another, more unconventional method is tracking with a mean-shift algorithm. This
method relies on features in the picture, such as the histogram of a specific area,
in order to track an object [20]. It is often used only with single objects that are
present in the view at all times. Figure 3.1 shows this method implemented on an Iphone
device [21].








    Figure 3.1: An Iphone using a mean-shift algorithm tracking an orange object [21]

    Positioning a 3D generated object at the correct angle without a marker is a far
more complicated process. Here there is no use in tracking a single object; instead the
whole environment is tracked and a grid is matched to specific points in the environment.
When this grid changes its position, the 3D object changes its angle accordingly [11].
A great example of how to do this, named parallel tracking and mapping, was created by
Georg Klein and David Murray [11].


3.2     Algorithms
To give a better understanding of these algorithms, this section explains how the
algorithms mentioned in the section above operate. It focuses on the theory, explaining
how each algorithm works rather than showing exactly how it is implemented.

3.2.1    Marker tracking
The essential part of this method is the marker. A key feature of the marker is that the
pattern must not look identical from two different angles. In addition, both the
pattern and its size have to be known. Figure 3.2 shows a marker used in the ARToolKit
project [15].




                   Figure 3.2: An early marker used by the ARToolKit

   Because the size of the marker is known, it is possible to map the camera
angle to how much of the marker is seen. This is done by image processing, in which the
black borders of the marker are searched for and, when found, the pattern inside is analyzed
to calculate the angle. Figure 3.3 illustrates how this is done. By knowing the angle
of the camera with respect to the x, y and z axes, it is possible to rotate a 3D object,
creating the illusion of a digital object in the physical world [13]. Changing the distance
between the marker and the camera adjusts the size of the 3D object.
Increasing the distance shrinks the object, as long as the camera is still able to recognize
the marker, and decreasing the distance enlarges the object.
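
   The relation between distance and displayed size can be sketched with the pinhole camera
model, in which the projected size of a marker is proportional to its real size and inversely
proportional to its distance from the camera. The Python sketch below is an illustration
only and is not taken from the ARToolKit or from the applications in this thesis. The
function names and the focal length value are assumptions.

```python
def projected_size_px(marker_size_m, distance_m, focal_length_px):
    """Side length in pixels of a marker that faces the camera (pinhole model)."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return focal_length_px * marker_size_m / distance_m


def distance_from_projection(marker_size_m, size_px, focal_length_px):
    """Invert the pinhole relation: estimate the distance from the observed size."""
    if size_px <= 0:
        raise ValueError("observed size must be positive")
    return focal_length_px * marker_size_m / size_px


def object_scale(reference_distance_m, distance_m):
    """Scale factor for the 3D object relative to its size at a reference distance."""
    return reference_distance_m / distance_m


# Assumed focal length in pixels; not a measured Iphone 3Gs parameter.
FOCAL_LENGTH_PX = 500.0

# An 8 cm marker seen from 0.4 m and from 0.8 m.
near = projected_size_px(0.08, 0.4, FOCAL_LENGTH_PX)
far = projected_size_px(0.08, 0.8, FOCAL_LENGTH_PX)
```

   In this sketch, doubling the distance halves the projected size of the marker, so a 3D
object drawn on top of it would be scaled by object_scale(0.4, 0.8), that is by one half.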




      Figure 3.3: An illustration of how camera angle and marker angle are mapped


3.2.2     Edge detection
There are a large number of ways to implement edge detection, but the two
main categories are search-based and zero-crossing based. Search-based algorithms use edge
strength as the measurement and search for local maxima of that value, while zero-crossing
based algorithms use, as the name implies, zero crossings computed from the image to find
edges. Common to both approaches is that they use deviations in the picture to localize
edges. The most common way is a gradient operation that determines the level of variance
between selected pixels [18]. Figure 3.4 shows how edges are detected in a mobile camera
photo. In this case all pixels in the picture are processed, and if the color value of a
pixel deviates from that of its neighbours, the pixel is set to white. If there is no
deviation, the pixel is set to black. After all pixels are processed, the resulting picture
has a black background with the edges from the starting picture drawn in white. Once the
edges are calculated, the resulting image makes it possible to identify objects in the
picture. With the objects identified, it is possible to position digital artifacts in
relation to the physical objects. A walkthrough of an edge detection algorithm is given in
section 5.1. This is a simple way of implementing edge detection and a large reduction
of the algorithms used in computer vision, like the extensive work of John Canny [19] and
the work by Harris and Stephens [22]. The reduction is vital because of the difference in
computing power between a handheld device and a stationary computer. The reduced
algorithm does not perform as well as the full ones, but that is a sacrifice that has to
be made.
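
   The neighbour comparison described above can be sketched in a few lines of Python. This
is a minimal illustration and not the implementation used in the thesis applications. It
assumes a grayscale image stored as a list of rows, compares each pixel only with its right
and lower neighbour, and uses an arbitrarily chosen threshold of 30. The function name
detect_edges is hypothetical.

```python
def detect_edges(gray, threshold=30):
    """Return a black (0) image with white (255) pixels where an edge is found.

    gray is a list of rows of integer intensities. A pixel is marked as an
    edge when it deviates from its right or lower neighbour by more than
    the threshold.
    """
    height = len(gray)
    width = len(gray[0]) if height else 0
    edges = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            value = gray[y][x]
            # At the image border, compare the pixel with itself (no deviation).
            right = gray[y][x + 1] if x + 1 < width else value
            below = gray[y + 1][x] if y + 1 < height else value
            if abs(value - right) > threshold or abs(value - below) > threshold:
                edges[y][x] = 255
    return edges


# A 4x4 test image: dark left half, bright right half.
image = [[0, 0, 200, 200] for _ in range(4)]
result = detect_edges(image)
```

   For this test image, each row of the result has a single white pixel at the boundary
between the dark and the bright half. Comparing with only two neighbours keeps the cost per
pixel low, which matters on a handheld device.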




     Figure 3.4: Edge detection on calculator and pen


3.2.3    Mean-shift algorithm
A mean-shift algorithm needs a pre-decided object to track. It starts with a cluster
of pixels being chosen, and this area must contain the object. After this initial stage the
mean-shift algorithm works frame by frame, calculating the area in the frame that has the
closest color distribution to the pre-selected area [20]. In Figure 3.1 the pre-selected area
is the bottom left side of the orange object. The picture to the right shows how the object
has moved and how the blue box surrounding the area moves to the best matching area. To
illustrate this, Figure 3.5 shows a selected histogram of a normalized colorspace [23]. The
method tries to find the local maximum that matches this histogram and, in the case of
Figure 3.1, moves the blue square in that direction [24].
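
   The frame-by-frame search can be sketched as follows. This is a simplified variant based
on histogram back-projection, where every pixel in the window is weighted by how common its
color is in the pre-selected area and the window is moved to the centroid of those weights.
It is not the kernel-weighted formulation of [20] and not the code behind Figure 3.1. The
sketch assumes that each pixel has already been reduced to a color bin index, and all
function names are hypothetical.

```python
import math


def colour_histogram(bins_image, window, n_bins):
    """Normalised histogram of the colour-bin indices inside window (x, y, w, h)."""
    x0, y0, w, h = window
    counts = [0] * n_bins
    for y in range(y0, y0 + h):
        for x in range(x0, x0 + w):
            counts[bins_image[y][x]] += 1
    return [c / float(w * h) for c in counts]


def mean_shift(bins_image, window, target_hist, max_iter=20):
    """Move the window towards the region that best matches target_hist."""
    height, width = len(bins_image), len(bins_image[0])
    x0, y0, w, h = window
    for _ in range(max_iter):
        total = sum_x = sum_y = 0.0
        for y in range(y0, y0 + h):
            for x in range(x0, x0 + w):
                weight = target_hist[bins_image[y][x]]
                total += weight
                sum_x += weight * x
                sum_y += weight * y
        if total == 0.0:
            break  # nothing inside the window resembles the target
        # Centre the window on the weighted centroid (rounding half up).
        new_x0 = int(math.floor(sum_x / total - (w - 1) / 2.0 + 0.5))
        new_y0 = int(math.floor(sum_y / total - (h - 1) / 2.0 + 0.5))
        new_x0 = max(0, min(width - w, new_x0))
        new_y0 = max(0, min(height - h, new_y0))
        if (new_x0, new_y0) == (x0, y0):
            break  # converged
        x0, y0 = new_x0, new_y0
    return (x0, y0, w, h)


def make_frame(obj_x, obj_y, size=10, obj=3):
    """Synthetic frame: background in bin 0, a square object in bin 1."""
    frame = [[0] * size for _ in range(size)]
    for y in range(obj_y, obj_y + obj):
        for x in range(obj_x, obj_x + obj):
            frame[y][x] = 1
    return frame


first = make_frame(2, 2)
target = colour_histogram(first, (2, 2, 3, 3), n_bins=2)
second = make_frame(4, 3)  # the object has moved
tracked = mean_shift(second, (2, 2, 3, 3), target)
```

   In this synthetic example the window starts at the old object position and ends up on
the moved object. The search only works while the old window still overlaps the object,
which is one reason the method suits objects that stay in view at all times.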




                     Figure 3.5: Histogram of a normalized colorspace


3.2.4    Parallel Tracking and Mapping
The main goal of this method is the same as that of marker tracking: to estimate
the camera pose in order to adjust the augmented reality content. It does so by
tracking key points in the user's environment and mapping them to a digital representation
of the environment. The method contains two processes running in parallel: a
point-based tracking system and a mapping system that bundles points and keyframes into a
map representation of the environment [25]. This section only covers the point-based
tracking system.


   One way of performing point-based tracking is described in six steps in
Parallel Tracking and Mapping for Small AR Workspaces [25]:

     – Step 1. A new frame is received and an initial camera pose is estimated from the
       prior frame.
     – Step 2. Using the estimate from step 1, the map points are projected into the frame.
     – Step 3. 50 of the coarsest-scale points are searched for in the frame.
     – Step 4. When they are found, the estimated camera pose is updated from these
       matches.
     – Step 5. 1000 points are projected into the frame again and searched for.
     – Step 6. Finally, a new camera pose is estimated from all the points that were found
       in step 5.
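
   The coarse-to-fine structure of these steps can be sketched with a strongly simplified
model. The Python sketch below is not the method of [25]: it assumes that the camera pose is
only a 2D translation, that the map points are ordered with the coarsest first, and that
searching means finding the nearest observed point within a radius. All function names,
radii and the test data are assumptions made for illustration.

```python
def project(map_points, pose):
    """Step 2 and 5: place the map points in the frame using the pose (dx, dy)."""
    dx, dy = pose
    return [(x + dx, y + dy) for x, y in map_points]


def refine_pose(map_points, observed, pose, radius):
    """Search for each projected point and re-estimate the translation."""
    matches = []
    for map_point, (px, py) in zip(map_points, project(map_points, pose)):
        best, best_d2 = None, radius * radius
        for ox, oy in observed:
            d2 = (ox - px) ** 2 + (oy - py) ** 2
            if d2 <= best_d2:
                best, best_d2 = (ox, oy), d2
        if best is not None:
            matches.append((map_point, best))
    if not matches:
        return pose  # nothing found, keep the prior estimate
    # Least-squares translation: mean offset between observation and map point.
    dx = sum(o[0] - m[0] for m, o in matches) / len(matches)
    dy = sum(o[1] - m[1] for m, o in matches) / len(matches)
    return (dx, dy)


def track_frame(map_points, observed, prior_pose,
                n_coarse=50, n_fine=1000,
                coarse_radius=8.0, fine_radius=2.0):
    """Steps 1 to 6 for one frame, starting from the prior pose estimate."""
    # Steps 2-4: few points, wide search, rough pose update.
    coarse_pose = refine_pose(map_points[:n_coarse], observed,
                              prior_pose, coarse_radius)
    # Steps 5-6: many points, narrow search, final pose.
    return refine_pose(map_points[:n_fine], observed, coarse_pose, fine_radius)


# Synthetic data: a grid of map points seen after the camera moved by (5, -3).
grid = [(x * 20.0, y * 20.0) for x in range(5) for y in range(5)]
frame_points = project(grid, (5.0, -3.0))
pose = track_frame(grid, frame_points, (0.0, 0.0), n_coarse=5)
```

   The point of the two stages is that a small number of points with a wide search range
gives a rough pose cheaply, after which the larger set can be searched with a narrow range.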

The picture below shows the mapping in progress.




Figure 3.6: Parallel tracking and mapping, courtesy of Georg Klein and David Murray [11]


3.3     Fields of interest
At the moment the interest in tracking in handheld devices is rising, but there
is not a lot of commercial usage yet. Traditionally, tracking algorithms are part of
image processing, and that is basically the biggest challenge. Handheld devices are always
going to produce shaky images, so the better an algorithm withstands obstacles like
motion blur, the more efficient the method will be [11].
    Tracking is an essential part of computer vision, a field that reaches from special
effects in movies to industrial robots inspecting manufacturing. At this time, most
tracking is performed to filter data, keeping the interesting parts and removing the
unnecessary parts of the picture to reduce the amount of data [19]. Therefore, various types
of edge detection are the most commonly used tracking algorithms [19].
    Even within computer vision the field of handheld tracking is limited as of today. Aside
from a couple of barcode readers and augmented reality games, there is not a big market
yet. Leading AR researchers predict that the market for AR on computers as well
as on handheld devices will increase rapidly in the not so distant future, so the market
for and contributions to handheld tracking will probably increase as well [26].
Chapter 4

Accomplishment

This chapter compares the preliminary time plan and order of execution with the way the
work was actually carried out.



4.1     Preliminaries
Below is the preliminary estimation of how the work would proceed.




                   Figure 4.1: Preliminary time chart on the project





4.2     How the Work was done
The work is divided into four phases which are described in this section. The first of these
four phases, the preparing and designing phase, refers to an appendix which contains some
of the LoFi and HiFi prototypes that were tested as well as the design concepts.

4.2.1    The Preparing and Designing phase
The preparation weeks in the beginning of the project where spent on gaining insight into
Apples development environment Xcode. As a fairly new user of both Xcode and the
development language Objective-C, this process was important to prepare for the work
later on. In addition to the familiarization of the development environment, studies were
done in order to see what already had been done within this field and to discover essential
limitations.
    Between weeks 40 and 41, the second and third week of this project, design concepts
were created and finalized. Appendix A shows the first conceptual sketches of the two
applications. Because of the extensive work required on the tracking mechanisms, the
surrounding framework was stripped down and kept very minimalistic. Both applications
present a splash screen with information on launch, and tapping the screen starts the
tracking mode. Some LoFi design sketches of how to steer the car are presented in
Appendix B. The final decision was based on the fact that the steering wheel metaphor is
well suited for a car game, so the steering wheel became the steering device of choice.
For the other application the main focus was on the back end, and the interface was
simplified as much as possible by using the standard Iphone button and label classes.
This simplifies the user interaction because the user will immediately be familiar with
the environment.
    The interaction with the game mode was tested with interactive prototypes. A screenshot
of such a prototype and a substitute for the steering wheel can be found in Appendix C.
    In this prototype both gas and brake pedals were tested, but neither made it into the
final version because the number of on-screen objects was kept to a minimum.

4.2.2    Early Development phase
Due to a delay at Telia, who were providing the Iphone, only the framework and menus could
be created at this stage. These were created and tested in the Iphone simulator provided by
Xcode. This took place during weeks 42 and 43.
   In the beginning of November, a couple of weeks behind schedule due to the wait for
the Iphone, the live video feed was implemented. This set the starting point for the
tracking implementations.

4.2.3    Development of Tracking
Between weeks 46 and 50 different tracking algorithms were implemented and tested, striving
for as efficient an algorithm as possible. Running in parallel with the tracking adjustments
were numerous attempts to bypass the standard SDK in order to access the raw data feed
directly from the camera. As all attempts failed, the fact had to be faced that the only way
of analyzing the camera stream was by capturing a screenshot (print screen) of the whole
screen. The original idea of having a 3D generated car had to be withdrawn and instead a
concept of a 2D game was created due to the print screen problem. A 2D solution without
depth consideration reduces the calculation needed and retains the frame rate, and by
removing the 3D rendering performed with OpenGL ES, the print screen method could be used.
This also added to the problem that the project was running late and I had no choice but to
postpone the presentation from January to February.

4.2.4    Completion
Finally an edge detecting algorithm was chosen as the most efficient due to its capability
to sustain a satisfying frame rate despite the limitations. An interpolation technique was
implemented to leave no trace on the prints. This was done by drawing the edge marks
in a separate layer on top of the camera stream with just enough alpha for the user to
see them while the image remains usable. This makes it possible to interpolate the edge
marks so that they won't be present in the next frame captured, see Section 5.1. As this
was done by the end of 2009, the project resumed in week 2 of 2010 and in the following
weeks two applications were built. The car game was now created as a 2D game seen from
above with the edge detecting algorithm keeping track of the location of physical objects
and adjusting the car to these objects. The edge detecting algorithm was also used in the
other application. In this application a back-end thread was created to match the edges
detected with the algorithm to pre-computed edges of letters, logos and human faces.
   In week 5 I held a presentation of the project for a couple of companies also located in
Skellefteå and showed a few demos of my work. In week 6 this report was written and it
was completed at the start of week 7, 2010.


4.3     Conclusions
Comparing the preliminary schedule to the actual outcome, it is clear that there is a rather
big difference. Knowing in advance that it would take several weeks for Telia to supply the
phone might have avoided some of the delay, but not enough to finish on time. The main
reason for the delay was the excessive amount of time spent on trying to access the raw data
of the live video feed. The order in which the tasks had to be done, on the other hand, was
planned rather accurately. The number of weeks spent on each task was also accurate, except
for the tracking algorithm task, which could be prolonged because the time for graphics
could be shortened when the 3D idea was scrapped.
Chapter 5

Results

In this chapter the outcome of this project is presented. Every screenshot of the working
applications has had its edge detection points colored in order to provide the reader of this
thesis with a better understanding of what is tracked in the picture. The car game has
white tracking points and the tracking points for the object tracking application are red.


5.1     Tracking
The heart of the following applications is the edge detection algorithm that I have
implemented. It is therefore essential to explain how the algorithm works. Two versions of
the same algorithm have been implemented and the following pseudo code explains the
different steps. A threshold value is chosen before the algorithm starts. The larger the
value, the less sensitive the edge tracking will be. A low value generates more and thicker
edges.

  1. Capture a screenshot of the whole display

  2. Loop through all the pixels of the captured screenshot
      (a) Check color value of the pixel
      (b) Check color value of pixels that border on the current pixel
      (c) If the color values differ by more than the threshold value then save position in array

  3. Update the array by removing pixels that no longer differ by more than the threshold value
  4. Start all over again at 1.
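
    The loop in step 2 can be sketched in a few lines of Python. This is only an
illustration under stated assumptions, not the Objective-C code used in the applications:
the screenshot is assumed to be a grayscale image stored as a list of rows, the bordering
pixels are assumed to be the four direct neighbours, and the name detect_edges is
hypothetical.

```python
def detect_edges(image, threshold):
    """Return the set of (row, col) positions whose value differs from at
    least one direct neighbour by more than `threshold`.

    `image` is assumed to be a list of rows of grayscale values (0-255).
    """
    height = len(image)
    width = len(image[0]) if height else 0
    edges = set()
    for r in range(height):
        for c in range(width):
            value = image[r][c]
            # Compare with the pixels above, below, left and right.
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < height and 0 <= nc < width:
                    if abs(value - image[nr][nc]) > threshold:
                        edges.add((r, c))
                        break
    return edges
```

    Calling detect_edges again on every new screenshot covers steps 3 and 4, since positions
that no longer exceed the threshold are simply absent from the new set. A larger threshold
returns fewer positions, which matches the sensitivity described above.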

    In order to leave colored traces, like in the screenshots below, some more steps have to
be completed. The trick here is to draw points on a clear canvas instead of saving them
in an array. To make sure that the points do not get captured in the next screenshot, which
would lead to a single-colored screen, interpolation has to be applied. With interpolation
the color of each drawn pixel is replaced by the mean value of the surrounding pixel colors.
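
    The interpolation step can be illustrated with the following Python sketch. It assumes
a grayscale image stored as a list of rows and a collection of marked positions, and it
averages over the eight surrounding pixels that are not marked themselves. The name
erase_marks and the choice of neighbourhood are assumptions made for the example.

```python
def erase_marks(image, marks):
    """Return a copy of `image` in which every marked pixel is replaced by
    the mean of its surrounding, unmarked pixels.

    `marks` is a set of (row, col) positions that were drawn on.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = [row[:] for row in image]
    for r, c in marks:
        neighbours = [
            image[nr][nc]
            for nr in range(max(0, r - 1), min(height, r + 2))
            for nc in range(max(0, c - 1), min(width, c + 2))
            if (nr, nc) != (r, c) and (nr, nc) not in marks
        ]
        # Leave the pixel unchanged if every neighbour is marked as well.
        if neighbours:
            result[r][c] = sum(neighbours) / len(neighbours)
    return result
```

    Running the edge detection on the returned image instead of the raw screenshot keeps the
drawn points from being detected as edges themselves.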






5.2     Augmented Reality Car Game
As mentioned before the car game is implemented in 2D and therefore has a couple of
limitations. The game has to be played directly from above, pointing the camera straight
down. Because it is a 2D game, changing the angle will not rotate the car, and the illusion
of a digital object merged into the real world is therefore lost. Changing the distance
between table and camera is limited by the implementation. For the illusion to make sense,
the car has to be scaled down when the distance increases and scaled up when the distance
decreases. The decrease part is the problem when it is not possible to access the raw camera
stream. If the camera gets too close, the car will probably fill the whole screen; by then
no tracking will be possible and the car will never scale down even if the distance
increases. Another limitation is that some of the physical objects from when the game was
started have to be present at all times. If all of the original objects are replaced, it
becomes another scenario and the car will be adjusted to that scenario instead.

5.2.1    Icon and Menus
To start the game, the application icon has to be pressed in the Iphone menu. The picture
below shows the application icon.




                                  Figure 5.1: Cargame icon

    The game is started by pressing the play button on the splashscreen that appears at
startup and the game mode will soon appear.




                              Figure 5.2: Cargame splashscreen



5.2.2    Game Mode
The play sequence in this game is rather simple. The steering wheel to the right controls
the car and the player rotates it by touch interaction. The phone can be moved in two
dimensions as long as the angle of the camera and the distance between the objects and the
phone do not change too much. The key feature in this game is that the digital car appears
as if it is merged into the physical world, leaving the car at the same place relative to the
physical objects even when the phone is moved. The example below shows two pictures where the
digital car keeps its angle and distance to the physical object, which in this case is a
stapler, even when the position of the phone has been changed.




        Figure 5.3: Screenshot from the game played on a desk at North Kingdom
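
    One simple way to keep a digital object in place relative to tracked points is to move
it by the same amount as the points move between two frames. The Python sketch below
illustrates that idea only; it is an assumption about how such an adjustment could be
made, not a description of the game's actual code. It estimates the shift as the
difference between the centroids of the tracked points, and the names centroid and
follow_scene are hypothetical.

```python
def centroid(points):
    """Mean position of a non-empty list of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def follow_scene(car_position, previous_points, current_points):
    """Shift the car by the movement of the tracked points between frames.

    Assumes pure translation: the camera angle and distance are unchanged,
    which matches the limitations of the 2D game.
    """
    if not previous_points or not current_points:
        return car_position  # nothing to track, leave the car where it is
    px, py = centroid(previous_points)
    cx, cy = centroid(current_points)
    return (car_position[0] + (cx - px), car_position[1] + (cy - py))
```

    A centroid only gives a useful estimate while most of the original objects are still in
view, which is consistent with the limitation that some of the starting objects have to be
present at all times.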

5.3     Object Tracking
This application has three features based on object tracking and object recognition. It
should be pointed out that the main focus of this thesis is object tracking, so the
recognition part of this application has received less attention, and the face recognition
function in particular is a bit of showcase work. It will not tell the difference between
faces, just recognize whether a human face is present.



5.3.1    Icon and Menus
To start this application you have to tap the icon below in the Iphone menu.




                     Figure 5.4: Icon to the application: What?What?


    When the application is started, the splash screen to the left in the picture below is
visible. When the screen is tapped the picture to the right appears. It contains a toolbar at
the bottom of the screen where the user can change from the default mode, which is object
and face recognition, to the letter recognition mode by tapping the cross to the right in the
toolbar. The done button exits the application and the space to the left contains a label
that shows what the application has found.




            Figure 5.5: Splashscreen (left) and menu (right) of the application

5.3.2    Object recognition
In order for this function to work, the application has to have the ratios between different
edges in the objects precalculated. As shown in Figure 5.6, the ratios between six different
edges are measured and compared. Once again it should be pointed out that this is probably
not the most efficient way to build this kind of application, but as stated at the beginning
of this section, it is only implemented to show the potential that this kind of tracking has.




                  Figure 5.6: Tracking of the Apple logo on a MacBook
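
    Comparing measured edges with precalculated ratios can be sketched as follows in Python.
The sketch assumes that the edges are given as positive lengths in a fixed order and that
a match means every ratio lies within a tolerance of the stored one. The function names
and the tolerance value are hypothetical and chosen only for the example.

```python
def edge_ratios(lengths):
    """Express each edge length relative to the longest edge, so that the
    result does not depend on how far away the object is."""
    longest = max(lengths)
    return [length / longest for length in lengths]


def matches(measured_lengths, stored_ratios, tolerance=0.1):
    """True if the measured edges have the same ratios as the stored ones.

    The edges are assumed to be listed in the same order in both inputs.
    """
    ratios = edge_ratios(measured_lengths)
    if len(ratios) != len(stored_ratios):
        return False
    return all(abs(a - b) <= tolerance for a, b in zip(ratios, stored_ratios))
```

    Because only ratios are stored, the same object is matched at different distances from
the camera, as long as it is seen from roughly the same angle.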

5.3.3    Letter recognition
The letter recognition function can be accessed through the bottom menu by pressing the
cross on the right hand side. When pressing this button a red aim will appear on the screen.
The user then has to fit the letters within this aim in order for the recognition to do its job.




                    Figure 5.7: The letter tracking function in progress

    In this mode the regular recognition is turned off and the application switches its focus
only to what is present within the aim. The recognition is built like a grid with every letter
having unique tracking points within that grid. The example above shows how the letter R
is recognized.
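
    The grid idea can be illustrated with a small Python sketch. It assumes that the aim
covers a rectangle of known width and height, that the rectangle is divided into a grid of
3 by 3 cells, and that each letter is described by the set of cells that must contain edge
points. The grid size, the function names and any templates passed in are hypothetical and
do not reproduce the tracking points used in the application.

```python
def occupied_cells(points, width, height, rows, cols):
    """Map (x, y) edge points inside the aim to the grid cells they fall in."""
    cells = set()
    for x, y in points:
        if 0 <= x < width and 0 <= y < height:
            cells.add((int(y * rows / height), int(x * cols / width)))
    return cells


def recognise(points, width, height, templates, rows=3, cols=3):
    """Return the letter whose template equals the occupied cells, or None.

    `templates` maps a letter to the set of (row, col) cells it occupies.
    """
    cells = occupied_cells(points, width, height, rows, cols)
    for letter, required in templates.items():
        if cells == required:
            return letter
    return None
```

    A finer grid separates more letters but also makes the match more sensitive to how
precisely the user fits the letter within the aim.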

5.3.4    Face recognition
The face recognition tracks the ratio between eyes and mouth. Both the eyes and the mouth
generate large clusters of edge pixels, making them easy to locate in a picture. If such a
cluster is found, triangulation is performed in order to check whether there are clusters
within a precalculated range. This range is based on the ratio between the eyes and mouth of
a human face. A feature in this mode is that if a face is recognized, the Facebook page of
the detected person is accessible through the Facebook icon appearing at the bottom of the
screen.




                            Figure 5.8: Face tracking in progress
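
    The triangle check between the eye and mouth clusters can be sketched in Python as
follows. The sketch assumes that the centres of three clusters have already been found and
compares the distance from the midpoint between the eyes to the mouth with the distance
between the eyes. The function name and the accepted ratio range are placeholders chosen
for the example, not the values used in the application.

```python
import math


def looks_like_face(left_eye, right_eye, mouth, ratio_range=(0.8, 1.4)):
    """True if three cluster centres form a triangle with face-like proportions.

    The ratio compared is (eye midpoint to mouth) / (eye to eye).
    `ratio_range` is an assumed placeholder range.
    """
    eye_distance = math.dist(left_eye, right_eye)
    if eye_distance == 0:
        return False  # both "eyes" are the same cluster
    midpoint = ((left_eye[0] + right_eye[0]) / 2,
                (left_eye[1] + right_eye[1]) / 2)
    ratio = math.dist(midpoint, mouth) / eye_distance
    return ratio_range[0] <= ratio <= ratio_range[1]
```

    Since only a ratio is tested, the check says whether the clusters are arranged like a
face, not whose face it is.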
Chapter 6

Conclusions

To summarize this thesis it is appropriate to revisit the main goal of the project.
The goal was to find out how well suited the Iphone 3Gs is as a platform for augmented
reality and object tracking. From this point of view the project has been a success, despite
setbacks like redesigning the whole concept from an augmented reality game in 3D to an
augmented reality game in 2D, because testing the limits was what the whole project was
about. The main conclusion of this thesis can be summed up in one sentence:
      “As long as the SDK does not allow developers to access the raw camera stream,
      the Iphone 3Gs is not suited for augmented reality that depends on analyses of
      the physical world”
Adding pictures on top of the camera stream works just fine, but as long as the print screen
method discussed in the earlier chapters is the only way of analyzing what is present on the
screen, the picture will also be a part of the equation and complicate everything. This
can be seen in Figure 5.3 where, looking closely, both the steering wheel and the car
have white edges around them, showing that the steering wheel is also considered a physical
object by the algorithm.
    When it comes to tracking of the physical world without the interference of digital
objects merged with it, it is another story. The phone possesses enough features and CPU
power to complete very demanding operations. Recognition software in the form of Iphone
applications is just at the beginning of what I believe is an upcoming trend. The area of use
is almost endless and as long as the Iphone continues to thrive on the market the development
will continue.


6.1     Limitations
As for the limitations of my work, some have already been mentioned. Since the work could
be described as an overview of the possibilities of the Iphone 3Gs, the focus has not been
on developing bug-free and solid applications. All of the applications should be seen as
demonstrations of what can be done, and optimizing each of them would probably be a master's
thesis of its own. To sum up some of the limitations, the car game is used as a starting
point: it only works in two dimensions, some of the objects that are present at the start
have to be present at all times, and the car has no collision detection against objects.
The letter tracker only tracks a bold handwritten font. The object tracker only tracks three
different shapes at the moment, and the face tracker only detects that a face is present, not
who the face belongs to. The name of the person and the Facebook link are therefore
implemented for just one person.


6.2     Future Work
As for future work, there is quite a bit that can be done. The work on the letter tracker
will continue, and the first step will be to make it possible to track whole words in a
common font such as Times New Roman. It will also be made possible to save the text to
a document.
    When it comes to object and face recognition, a database of objects and faces seen from
different angles has to be implemented. My guess is that if the database grows large, the
most efficient way to use it is to create a server/client application. In such a case the
phone only provides photos to a back-end server that does all the calculations.
Chapter 7

Acknowledgements

This master's thesis has been really interesting and I'm sure that I will have great use of
this experience later on in my career. I would therefore like to take this opportunity to
thank the CEO of North Kingdom, David Eriksson, for providing me with the subject and letting
me work at their firm. I would also like to thank my supervisor at North Kingdom, Hans
Eklund, for helping me with my work, and my supervisor at the Department of Computing
Science, Ola Ågren, for the help and feedback on this report.




References

 [1] Canalys. Smart phone market shows modest growth in q3 - but apple and rim hit
     record volumes. http://www.canalys.com/pr/2009/r2009112.html, December 25 2009.

 [2] North Kingdom. Official website. http://www.northkingdom.com/about/, January 20
     2010.

 [3] Apple Inc. Apple iphone. http://www.apple.com/iphone, January 10 2010.

 [4] T-mobile Netherland. Leaked Iphone secret. http://www.mobilewhack.com/t-mobile-
     netherlands-leaks-iphone-3g-s-hardware, February 10 2010.

 [5] R. T. Azuma. A survey of augmented reality. Presence: Teleoperators and Virtual
     Environments, pages 355–385, 1997.

 [6] M. A. Livingston, L. J. Rosenblum, S. J. Julier, D. Brown, Y. Baillot, J. E. Swan II,
     J. L. Gabbard, and D. Hix. An augmented reality system for military operations in
     urban terrain. I/ITSEC, page 89, 2002.

 [7] J. Moline. Virtual reality for health care: a survey. Technical report, National Institute
     of Standards and Technology, Gaithersburg, MD, 1997.

 [8] Augmented environments lab. Arhrrrr! http://www.augmentedenvironments.org/lab/
     research/handheld-ar/arhrrrr/, February 14 2010.

 [9] CMG Research. Sudoku grab. http://www.cmgresearch.com/sudokugrab/,
     February 12 2010.

[10] Best App Ever Awards. Second annual iphone os application achievement awards.
     http://bestappever.com/awards/2009/, December 17 2009.

[11] G. Klein and D. Murray. Parallel tracking and mapping on a camera phone. ISMAR’09,
     2009.

[12] Occipital. Redlaser. http://redlaser.com/, February 10 2010.

[13] H. Kato and M. Billinghurst. Marker tracking and hmd calibration for a video-based
     augmented reality conferencing system. IWAR’99, pages 85–94, 1999.

[14] D. Wagner and D. Schmalstieg. First steps towards handheld augmented reality. ISWC
     2003, pages 127–137, 2003.

[15] ARToolKit. Official website. http://www.hitl.washington.edu/artoolkit/, January 20
     2010.



[16] D. Wagner and D. Schmalstieg. Artoolkitplus for pose tracking on mobile devices.
     CVWW’07, pages 139–146, 2007.
[17] D. Marr and E. Hildreth. Theory of edge detection. PROC. ROY. SOC.(London), vol.
     B207, pages 187–217, 1980.

[18] H. S. Neoh and A. Hazanchuk. Adaptive edge detection for real-time video processing
     using fpgas. GSPx 2004, 2004.
[19] J. Canny. A computational approach to edge detection. IEEE Trans. Pattern Analysis
     and Machine Intelligence, pages 679–698, 1986.

[20] D. Comaniciu, V. Ramesh, and P. Meer. Real-time tracking of non-rigid objects using
     mean shift. Proceedings of 2000 IEEE Conference on Computer Vision and Pattern
     Recognition, pages 142–149, 2000.
[21] I. Halil. Mean-shift based moving object tracker.
     http://www.cs.bilkent.edu.tr/~ismaila/MUSCLE/MSTracker.htm, January 14 2010.

[22] C. Harris and M. Stephens. A combined corner and edge detector. Fourth Alvey Vision
     Conference, pages 147–151, 1988.
[23] D. Comaniciu and P. Meer. Mean shift: A robust approach towards feature space
     analysis. IEEE Trans. Pattern Anal. Machine Intell., pages 603–619, 2002.

[24] R. Collins, O. Amidi, and T. Kanade. An active camera system for acquiring multi-
     view video. Proceedings of the International Conference on Image Processing, pages
     517–520, 2002.
[25] G. Klein and D. Murray. Parallel tracking and mapping for small ar workspaces.
     ISMAR’07, pages 225–234, 2007.

[26] T. Carpenter. Gamesalfresco. http://gamesalfresco.com/, January 02 2010.
Appendix A

Concept sketches




   Figure A.1: Concept sketch on the car game, before it was reduced to 2D








     Figure A.2: Concept sketch on the letter reading application
Appendix B

Lo-Fi




        Figure B.1: Lo-fi sketches on possible ways to steer the car








     Figure B.2: Lo-fi sketches on possible ways to steer the car
Appendix C

Interactive prototypes




Figure C.1: HiFi prototype to test the usability of a spinning steering wheel with gas and
brake pedals








     Figure C.2: HiFi prototype to test the usability of a steering cross

				