									                                  Robust Single-Shot Structured Light

                 Christoph Schmalz                                               Elli Angelopoulou
             Siemens AG, CT GTF NDE                                     University of Erlangen-Nuremberg
      Otto-Hahn-Ring 6, 81739 Munich, Germany                        Martenstrasse 3, 91074 Erlangen, Germany

   Structured Light is a well-known method for acquiring 3D surface data. Single-shot methods
are restricted to the use of only one pattern, but make it possible to measure even moving
objects with simple and compact hardware setups. However, they typically operate at lower
resolutions and are less robust than multi-shot approaches. This paper presents an algorithm
for decoding images of a scene illuminated by a single-shot color stripe pattern. We solve the
correspondence problem using a region adjacency graph, which is generated from a watershed
segmentation of the input image. The algorithm runs in real time on input images of 780x580
pixels and can generate up to 10^5 data points per frame. Our methodology gives accurate 3D
data even under adverse conditions, i.e. for highly textured or volume-scattering objects and
low contrast illumination. Experimental results demonstrate the improvement over previous
methods.

1. Introduction
    Structured Light is a general term for many different methods of measuring 3D surface data.
The underlying idea is to project a known illumination pattern on a scene. Shape information is
then extracted from the observed deformation of the pattern (figure 1). The most basic hardware
setup consists of one camera and one projector, but the details of the implementations vary
widely. A good overview of the various proposed pattern designs can be found in [13]. Range data
is generated by triangulating the camera ray and the projection ray to a point in the scene.
This requires solving the correspondence problem: determining which pairs of rays belong
together.

Figure 1: Example input images and color coded depthmap results. DP is the typical decoding
method based on Dynamic Programming [17]. Panels: (a) Dragon without texture, (b) Dragon with
texture, (c) DP result without texture, (d) DP result with texture, (e) Our result without
texture, (f) Our result with texture.

    One way to address this issue is temporal coding. Typical examples of this method are the
Gray Code and the Phase Shifting techniques (e.g. [14]). In these methods, disambiguating
information is embedded into a sequence of patterns that can be evaluated for each camera pixel
separately. This imposes the limitation that the object may not move while the sequence is
acquired. Another approach to solve the correspondence problem is through the use of spatial
coding, where the necessary information is encoded in a spatial neighborhood of pixels. This
requires that the neighborhood stays connected, which means that the object must be relatively
smooth. Nonetheless, spatial coding offers the advantage that only one pattern suffices to
generate 3D data. This makes it particularly applicable to moving scenes. It allows the use of
simpler hardware, which in turn allows easy miniaturization. This can then lead to structured
light-based 3D video endoscopes for medical, as well as industrial

applications.
    We propose a new Single-Shot Structured Light method that offers a combination of high
speed, high resolution and high robustness. Our contributions are:

  • superpixel representation of the input image
  • construction of a region adjacency graph
  • pattern decoding on the region adjacency graph
  • quantitative results on synthetic reference scenes with ground truth
  • qualitative results on real images

2. Single-Shot Structured Light
    The design of Single-Shot Structured Light systems involves a fundamental tradeoff between
resolution and robustness. For high resolution, small pattern primitives are needed, but the
smaller the primitives, the harder it is to reliably detect them. Therefore, many different
single-shot patterns have been proposed. Most designs are based on pseudorandom sequences or
arrays [10],[7]. They have the property that a given window of size N or NxM occurs at most
once. This is known as the window uniqueness property. Observing such a window suffices for
deducing its position in the pattern. Another design decision is the alphabet size of the code,
i.e. the number of different symbols that are used. Ideally one wants to use a large alphabet
for a long code with a small window size. However, the smaller the distance between individual
code letters, the less robust the code.
    A well known single-shot 3D scanning system is the one by Zhang et al. [17]. The color
stripe pattern used in that system is based on pseudorandom De Bruijn sequences [1]. The
decoding algorithm works per scanline and is based on Dynamic Programming. Its largest drawback
seems to be the high processing time of one minute per frame. However, our re-implementation
and testing of this method (see figure 1 and section 5) revealed deficits in robustness as
well. Another pattern type is based on the so-called M-arrays [9], but it results in
comparatively low resolution with depthmaps of only 45x45 pixels. Koninckx et al. [6] present
an interesting approach based on a black-and-white line pattern with one oblique line for
disambiguation. It runs at about 15Hz and the resolution is given as 10^4 data points per
frame, which is also relatively low.
    A recent paper by Kawasaki [5] uses a pattern of vertical and horizontal lines. It is one of
the few articles containing quantitative data about the accuracy of the methodology, which is
given as 0.52mm RMS error on a simple test scene. The runtime is 1.6 sec per frame, but there
is no information on the number of points reconstructed per frame. The paper by Forster [4]
also has hard performance figures. They use a color stripe pattern with scanline-based
decoding, running at 20Hz and with up to 10^5 data points per frame. The RMS error of a plane
measured at a distance of 1043mm is given as 0.28mm. The Forster system offers a good
compromise of resolution and speed. However, the scanline-based decoding leads to suboptimal
robustness. It decomposes a 2D problem into a series of 1D problems and thus loses a
considerable amount of spatial information. Consider for example the scene shown in figure 3.
Although the order of the stripes is clearly visible, it cannot be determined in a 1D scan
because of the holes in the object. In comparison, we apply full 2D decoding using a region
adjacency graph. Hence, our approach combines high robustness with high resolution and high
speed.
    In general, color stripe patterns offer a good compromise between robustness and
resolution. Their primitives are one-dimensional, so resolution is lost only in one direction.
Color checkerboard patterns seem to double the resolution but their use requires triangulation
in two dimensions, which significantly increases the complexity of the system. Additionally,
triangulation at the intersections of edges is less precise, so the potentially higher
resolution is mostly lost again. We therefore chose to project color stripes.

3. Superpixel Representation
    Our goal now is to identify the color stripes in the camera image. We observe that the
pixel representation is redundant: many pixels have approximately the same color. We can,
therefore, reduce the complexity of the image representation by using superpixels instead. The
Watershed Transform offers a computationally efficient way to achieve this. It was popularized
in image processing by Vincent and Soille [16]. A number of variations of the basic algorithm
have been proposed [12]. The basic idea is that pixel values (usually the magnitude of the
gradient of the original image) are treated as height values. As this 3D surface is gradually
flooded, water collects in valleys separated by ridges. An image can thus be segmented into a
set of distinct catchment basins, also known as regions. The advantages of the watershed
transform are that it is unsupervised, parameter free and fast. It is, however, a low-level
method that produces severe oversegmentation. Usually, an area perceived as homogeneous will be
broken up into many small individual regions due to noise. For our system this is immaterial:
Our goal is to represent the image with superpixels that are internally uniform and thus
significantly reduce image complexity. This, in turn, allows us to use graph-based decoding
algorithms in real-time. An additional advantage of superpixel-based representations is that
properties like color are defined as statistics over an ensemble of 'simple' pixels, thus
reducing the effects of defocus blur and noise. Figures 2 and 3 show the input and output of
the watershed transform.
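
To make the catchment-basin idea of Section 3 concrete, the following Python sketch assigns
every pixel of a gradient-magnitude image to the local minimum it drains to. It is a
steepest-descent simplification and not the immersion algorithm of [16]; the function name,
the 4-neighborhood, the tie-breaking rule and the toy input are assumptions made for this
example.

```python
def watershed_labels(height):
    """Label each pixel with the basin (local minimum) it drains to.

    height: 2D list of gradient magnitudes, treated as a height map.
    Ties on plateaus are broken by pixel coordinates (an assumption),
    which guarantees that every descent path terminates.
    """
    rows, cols = len(height), len(height[0])

    def lowest_neighbor(r, c):
        # Smallest (height, row, col) among the pixel and its 4 neighbors.
        best = (height[r][c], r, c)
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                best = min(best, (height[rr][cc], rr, cc))
        return best[1], best[2]

    labels = [[-1] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            path, cur = [], (r, c)
            # Walk downhill until a labeled pixel or a local minimum is hit.
            while labels[cur[0]][cur[1]] == -1:
                path.append(cur)
                nxt = lowest_neighbor(*cur)
                if nxt == cur:  # local minimum: open a new basin
                    labels[cur[0]][cur[1]] = next_label
                    next_label += 1
                    break
                cur = nxt
            basin = labels[cur[0]][cur[1]]
            for pr, pc in path:
                labels[pr][pc] = basin
    return labels


# Toy gradient magnitude of three stripes: high values mark stripe borders.
gradient = [
    [0, 1, 9, 1, 0, 1, 9, 1, 0],
    [0, 1, 9, 1, 0, 1, 9, 1, 0],
]
labels = watershed_labels(gradient)
for row in labels:
    print(row)
```

On this toy input the three flat stripe interiors become three basins, and each ridge pixel is
attached to one of its neighboring basins. On real images, noise creates many more local
minima, which is the oversegmentation discussed above.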
Figure 2: Example plot of the gradient magnitude of a stripe pattern. It is the input to the
watershed transform.

Figure 3: Example of a watershed segmentation performed on a scene illuminated by a stripe
pattern. (a) Input image, (b) Resulting superpixels.

4. Graph-Based Pattern Decoding
    Once the image is segmented into superpixels, we can perform the core pattern decoding
algorithm. It consists of the following series of steps. For details see the subsequent
subsections.

  1. Build the region adjacency graph.
  2. Calculate the color change over each edge. Assign edge symbols and error estimates.
  3. Find a unique path of edges.
  4. Recursively visit all neighbors in a best-first-search, while the edge error is
     sufficiently low.

4.1. Vertices
    The region adjacency graph has one vertex for every superpixel. A one megapixel image of a
scene illuminated by our color stripe pattern is typically represented by a graph with 50000
vertices. Each superpixel has an associated color. It is determined by a robust nonlinear rank
filter over all the original image pixels that belong to the superpixel (figure 3). Since color
is a vector we use marginal ordering [11]. The color is additionally corrected for the color
crosstalk that occurs in the camera, using a technique similar to [3].
    In general, the observed superpixel color is not the original projected color, but rather a
color distorted by the object's reflectivity. Therefore, we cannot use the observed color
directly, but only color changes between two neighboring regions. If the surface color is
constant across the two regions, its influence will be cancelled out. If it is not, spurious
color changes may be detected. However, our decoding algorithm is explicitly designed to handle
them.
    A second effect is that the surface color influences the relative response of the different
color channels. For example, the blue channel will appear very weak compared to green and red
on a yellow surface. So a change of 10 digits in blue may be more significant than a change of
20 digits in red. Our pattern is designed so that each color channel changes after at least
every second stripe. We know, therefore, that each valid superpixel must have at least one
neighbor where a given channel changes. Thus, we can define the color range per channel as the
maximum absolute change over all neighbors and use it to normalize the color change:

    d_i^{inv} = \frac{1}{\max_k |c_{i,k}|}    (1)

where c_{i,k} denotes the color change in the individual channel i towards neighbor k, and k
iterates over all neighbors of a superpixel. The key assumption for this equalization is that
each region is as wide as a stripe in the camera image and thus directly borders with both
neighboring stripes. This is a reasonable conjecture since in our pattern the stripes are very
densely packed in order to produce a high resolution depth map. Empirical evidence shows that
the condition is nearly always met. Recovering the specific position of a stripe in the pattern
can then be performed via graph traversal once the edges and their weights are appropriately
set up.

4.2. Edges
    The edges of the graph describe how the color changes between two adjacent superpixels. The
raw color change Ĉ is a three-element vector. The scalar edge weight w is defined to be its
L∞ norm:

    \hat{C} = [\hat{c}_r \ \hat{c}_g \ \hat{c}_b]^T \in \mathbb{R}^3    (2)

    w = \|\hat{C}\|_\infty = \max_i |\hat{c}_i|    (3)

We would like to assign each element of Ĉ as belonging to one of three categories: channel
rising, constant or falling. Since we use an alphabet with two intensity levels per channel,
there are only three possible labels. We denote triples of labels by symbols, e.g. the symbol
for red rising, green falling, blue constant is S = [+1 −1 0]^T. An alternative representation
is R+G-. The actual assignment of these symbols involves some preparation. To equalize the
response across the three channels we first multiply each component of Ĉ by its corresponding
inverse range d_i^{inv} from eq. 1.
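
As a rough illustration of Sections 4.1 and 4.2 up to this point, the following Python sketch
computes a superpixel color with a per-channel (marginal) median, the inverse range of eq. 1
and the edge weight of eq. 3. The use of the median as the rank filter, the function names, the
eps guard and the sample values are assumptions made for this example; the text only specifies
a robust rank filter with marginal ordering.

```python
from statistics import median


def superpixel_color(pixels):
    """Marginal (per-channel) median of the RGB pixels of one superpixel.

    Assumption: the median stands in for the unspecified rank filter.
    """
    return tuple(median(p[i] for p in pixels) for i in range(3))


def color_change(color_a, color_b):
    """Raw color change across the edge from superpixel a to superpixel b."""
    return tuple(b - a for a, b in zip(color_a, color_b))


def inverse_range(neighbor_changes, eps=1e-9):
    """Eq. 1: one inverse range per channel, taken over all neighbors.

    eps is a hypothetical guard for a channel that never changes.
    """
    return tuple(
        1.0 / max(max(abs(c[i]) for c in neighbor_changes), eps)
        for i in range(3)
    )


def edge_weight(raw_change):
    """Eq. 3: L-infinity norm of the raw color change."""
    return max(abs(c) for c in raw_change)


def equalize(raw_change, d_inv):
    """Elementwise product of the raw change with the inverse ranges."""
    return tuple(c * d for c, d in zip(raw_change, d_inv))


# A superpixel on a yellowish surface: the blue channel responds weakly.
# The third pixel is an outlier that the median ignores.
center = superpixel_color([(200, 180, 20), (204, 176, 22), (90, 60, 5)])
left = (120, 220, 15)
right = (260, 80, 32)

changes = [color_change(center, left), color_change(center, right)]
d_inv = inverse_range(changes)
print(center)
print(edge_weight(changes[0]))
print(equalize(changes[1], d_inv))
```

After equalization, the 12-digit blue change towards the right neighbor has the same magnitude
as the 96-digit green change, which is the effect the per-channel range is meant to achieve.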
The equalized color change C is then normalized so that the maximum absolute channel value is 1.

    C = \frac{\hat{C} \otimes D^{inv}}{\|\hat{C} \otimes D^{inv}\|_\infty}    (4)

Here ⊗ denotes the elementwise multiplication of two vectors. We define the symbol match error
E_match associated with assigning symbol S to C as:

    E_{match}(C, S) = \sum_i e_t(c_i, s_i)^2    (5)

with the single channel error e_t defined as:

    e_t(c_i, s_i) =
    \begin{cases}
      \frac{1 + c_i}{1 - t} & \text{if } s_i = -1 \\
      \frac{|c_i|}{t}       & \text{if } s_i = 0 \\
      \frac{1 - c_i}{1 - t} & \text{if } s_i = +1
    \end{cases}    (6)

where t is a threshold value below which color changes are considered insignificant. One could
use t = 1/3 for an even partitioning of the interval [−1; +1]. We can now finally define the
optimal edge symbol with the lowest possible error by

    s_i(c_i, t) =
    \begin{cases}
      -1 & \text{if } c_i \le -t \\
       0 & \text{if } -t < c_i < t \\
      +1 & \text{if } c_i \ge t
    \end{cases}    (7)

Note that there can be several symbols that fit almost equally well. In later steps the
algorithm can also assign a suboptimal symbol (in the sense that it has a higher match error
than the optimal symbol) if necessary.
    The match error E_match alone cannot correctly capture color transitions that occur at
occlusion boundaries. Consider for example the case shown in figure 4. The blue, yellow,
magenta and green stripes appear to be vertically continuous, but are not. At the border of the
two objects a high secondary gradient occurs, even though the colors of the regions above and
below the boundary are very similar. To capture this information, we define

    E_{gradient} = \sum_e \frac{|\frac{d}{ds} I|}{|\frac{d}{dp} I|}    (8)

where \frac{d}{ds} I and \frac{d}{dp} I denote the components of the image gradient in
secondary and primary direction as shown in figure 4. The index e iterates over all edge
pixels.
    Thus, the total edge error for a given graph edge and symbol is

    E = E_{match} + \alpha E_{gradient}    (9)

with a suitable proportionality constant α, typically set to 0.2.

Figure 4: Importance of the secondary gradient. The upper and the lower half of the image
belong to different objects. Some stripes appear to be continuous in the secondary direction
when in fact they are not. The window size of the code is 5.

    Furthermore, each such edge has to be classified according to its direction in relation to
the primary direction of the pattern. The edge direction can be forward, backward or parallel
to the pattern. We perform a line fitting over all the pixels associated with the image edge.
The position of the superpixel centroids relative to the line allows us to label the
directions. The edges consist of only a few pixels each, so the line approximation works well.

Figure 5: Illustration of edge direction assignment. The location of the region centroids
relative to the fitted lines gives the edge direction. In this case, from the viewpoint of the
red region, the edge to the cyan region is "backward" and the edge to the blue region is
"forward".

4.3. Graph Traversal
    In our methodology, decoding the pattern is equivalent to finding the correspondence
between the vertices of the region adjacency graph and the pattern primitives in the projected
image. The window uniqueness property of the pattern makes this association possible.
Identifying the position of a subsequence within a longer code can be done analytically [8].
However, it is easier and faster to simply use pre-computed lookup tables which store all the
locations where a given primitive occurs. In our case the primitives are colored stripes and
the windows used for identification are represented as sequences of edge symbols.
    To find the correspondences between graph paths and stripes in the pattern, we first sort
all the edges in the adjacency graph by their edge error E, see eq. 9. The lowest error edge is
selected as the starting point of a new graph path. The set of its possible positions in the
pattern is determined by its optimal symbol S. These positions, in turn, determine the next
edge symbols that could be used to extend
the path. If one of the two end-vertices of the path has a qualified edge we add it to the
path. To qualify for extending the path an edge must: a) have the correct direction and b) its
error E for the desired symbol must be less than a certain user-defined threshold A which
controls the 'aggressiveness' of the decoding algorithm. The value of A depends on t from eq. 6
and α from eq. 9, but is typically 0.5. A number of possible positions that would have needed
different neighboring edge symbols to continue the path are invalidated by adding a given edge.
This process is repeated until only one possible position remains. This happens, at the latest,
when the path length equals the unique window length of the pattern. When there is more than
one edge that can be added, the one with the lowest error is selected. If there are no valid
edges to add, we start again with a new edge.
    Once a unique path has been found, the associated pattern positions are uniquely determined
as well. We pick an arbitrary seed vertex on the path and propagate the pattern position to its
neighbors. The neighbors are visited in a best-first search as follows. If the direction of the
edge between the two vertices is correct and the edge error is smaller than A, we add the
neighboring vertex to an 'open' heap. The edge symbol used to calculate the edge error may be
different from the optimal symbol, as long as the error is below the threshold. Additionally,
we maintain a token bucket that can be used to temporarily exceed the error threshold. If the
bucket is not full, tokens are collected when an edge with an error lower than A is
encountered. Available tokens are subtracted from the edge error when it is higher than A. This
allows us to tolerate isolated bad edges. When all neighbors of a vertex have been visited, we
continue the identification process with the best neighbor on the heap, i.e. the one with the
lowest edge error. When the heap is empty, the propagation stops. If there are unused edges
left, we begin a new pass and try to assemble a new unique path starting with the best unused
edge.
    The quality of a position assignment is captured in the position error. It is defined as:

    Q = \min_k \left( \frac{\beta E_k}{w_k} + E_k^{gradient} \right)    (10)

where β is a suitable scale factor (typically 255 since that is the maximum edge weight with
8 bit colors) and k is the neighbor index as before. The inclusion of the edge weight w
reflects the fact that high-weight edges are less likely to be disturbed by noise. At occlusion
boundaries two conflicting pattern positions may occur. In that case, the one with the lower
position error is chosen.
    Note that, because of the normalization in the edge symbol calculation, every edge gets a
symbol, even if it connects two regions of equal color. The color change in this case is only
noise, but that is not known beforehand. In a stripe pattern there are many such null edges
which have to be detected indirectly. This is illustrated in figure 6. The edge between regions
b and c has a low weight relative to the edge between a and b. In such a case we calculate the
optimal symbol of the (possibly virtual) edge between a and c. If it is identical to the edge
symbol assigned between a and b, region c has the same color and the same pattern position as
b. This scheme is independent of local contrast.
    Our decoding algorithm is local. It starts at a certain edge and recursively visits the
neighboring vertices in a best-first-search. We also experimented with an MRF-based graphcut
optimization algorithm [2] for finding the globally optimal labeling of the vertices. However,
the results were less accurate because of the difficulty of reliably modeling long-range
interactions. Furthermore, the runtime of the MRF method was much higher due to the large
number of possible labels, which is typically more than 100.
    An example subgraph with assigned edge symbols is shown in figure 6. The bold edges could
actually be used, the dashed ones were ignored. In the supplemental material we show an
animation of the decoding algorithm at work. There may be shadow areas in the image where there
is no pattern to decode. It is statistically possible that a valid edge sequence can still be
detected, but it is extremely unlikely that the growth process will lead far. We can,
therefore, easily suppress these false positives if an insufficient number of valid neighbors
is found.

Figure 6: An example subgraph of the region adjacency graph

5. Results
    The proposed decoding method is robust because of the rank filtering used to assign the
superpixel colors and the 2D decoding algorithm that allows it to cope with non-smooth
surfaces and textured materials. The edges used during the stripe identification process need
not conform to scanlines, which makes it possible to overcome disruptions in the observed
pattern. We evaluate the performance quantitatively on synthetic images and qualitatively on
real objects.
    When generating depth data we work on the original images. Thus, in the following, edges
refer to edges in the image, not in the region adjacency graph, unless explicitly stated. We
iterate over the segmented image looking for regions that belong to consecutive pattern
positions. If a pair of such regions is found, we calculate the exact location of the edge
between them with subpixel precision by interpolating the gradient. Depth values are then
determined via ray-plane intersection. Precision depends on several factors like camera
resolution, triangulation angle and the quality of the projector. We found a best case standard
deviation from the plane of 0.12mm with the following setup: projector resolution 1024x768,
camera resolution 1388x1038 with pixel size 6.45 microns square, baseline 366mm, triangulation
angle 19.2 degrees, working distance roughly 1000mm. The standard deviation was calculated over
2700 samples, i.e. on a small patch of the image, excluding possible calibration errors.
    Doing a comparative evaluation of the performance of a Structured Light system is difficult
because different methodologies use different patterns. There are thus no standardized test
images as, for example, for stereo algorithms. To test the performance of our method, we
therefore created in POV-Ray a number of synthetic test scenes with known ground truth. There
are also no publicly available implementations of other algorithms. Thus for comparison
purposes we re-implemented [17], one of the most widely known Single-Shot Structured Light
methods.
    The virtual objects are located 1000mm from the camera. The images were corrupted with
different levels of additive white Gaussian noise. One test scene is shown in figure 7. It is
heavily textured and non-planar. Other synthetic test scenes are available on the internet
[15]. No smoothing was used on the depth data. Outliers are defined as deviations from the
ground truth by more than 10mm. The precision of the depth data is shown in figure 8. For
acceptable noise levels and contrast, the standard deviation is around 1/1000 of the working
distance. For the simulated geometry and camera resolution an edge localization error of
1 pixel results in a depth error of about 4mm. Thus the given standard deviations correspond to
about 1/4 of a pixel. This is quite good considering the significant amount of noise in the
images. There is a small systematic error of about 0.2mm or 1/5000 in the depth values,
corresponding to about 1/20 of a pixel. The exact reason for this has yet to be determined. It
is probably an artifact of the simulation, which is sensitive to aliasing. For the simulated
test scenes the calibration data of the sensor is exactly known.
    The comparison of our results with the Dynamic Programming-based decoding shows both a
large increase in inliers and a marked decrease in the number of outliers. Under
less-than-perfect image conditions the DP approach [17] produces unreliable results, especially
considering that the results shown in figure 9 are for the lowest noise level. Note that we did
not confirm the authors' claim that violations of the ordering constraint can be solved by
multi-pass dynamic programming. Many edges were falsely identified in the 1st pass and were
therefore not eligible for a 2nd pass. Our method in contrast is truly independent of the
ordering constraint. The runtime of the DP algorithm was given as 1 min per frame in the
original paper. Our implementation takes 5 seconds (on a faster processor), but that is still
considerably more than the 100ms required by the proposed algorithm.

Figure 7: Test scene 'sun' at medium noise level. (a) Full view, (b) Detail view.

Figure 8: Evaluation of measurement error on the test scene 'sun'. (a) Mean error for various
levels of noise and contrast, (b) Standard deviation for various levels of noise and contrast.

    To demonstrate the scalability of the proposed 3D scanning approach, we show results
obtained with a desktop
                                                                         (a) Input image ’book’ with strong tex-   (b) Input image ’doll’ with strong tex-
     (a) Number of inliers for various levels of noise and contrast
                                                                         ture                                      ture

     (b) Number of outliers for various levels of noise and contrast.    (c) Color coded depth map of the book,    (d) Color coded depth map of the doll,
                                                                         generated with the proposed algorithm.    generated with the proposed algorithm.
Figure 9: Comparison of the results of our algorithm and                 Range is 35 mm.                           Range is 40 mm.
the DP approach on the test scene ’sun’. The reference is
the performance of the DP decoding algorithm a the lowest
noise level.

system (figure 10) and results from a prototype miniature
design based on a projection slide (figure 12). Again, no
postprocessing was applied. The DP decoding method has
problems with the saturated colors and texture of the book               (e) Depthmap of the book, generated       (f) Depthmap of the doll, generated
cover and the doll’s eye, but the proposed algorithm works al-           with the DP approach                      with the DP approach.
most everywhere. Skin has no strong texture but is a difficult
surface to measure because of its subsurface scattering and             Figure 10: Book and doll results. DP decoding results in
the frequent specularities. Nevertheless, the pattern could             large holes and erroneous depth values.
be completely decoded and the accuracy of the depth data
is good enough to discern the ridge lines of the fingerprint.
The fan in figure 11 is also a challenging object because it
violates the local smoothness assumption. Two-dimensional
decoding still succeeds in most areas.

Figure 11: Fan results. (a) Input image 'fan' with non-smooth
geometry. (b) Color coded depth map of the fan, generated with
the proposed algorithm. Range is 200 mm.

Figure 12: Fingertip results. (a) Input image 'fingertip'.
Volume scattering and specularities make skin a difficult
surface. (b) Deviation of the interpolated depth data from
the plane. The ridges of the fingerprint are clearly

6. Conclusion and Future Work
   We presented a robust algorithm for the decoding of
Single-Shot Structured Light patterns that outperforms the
previous method [17] in the number of data points generated
and the number of outliers that have to be rejected. These
improvements are due to: a) the superpixel representation
of the image, which allows robust filtering of the color, and
b) the use of a true 2D decoding algorithm that does not have
to work along scanlines. It works well down to a pattern
contrast of only 0.3 and can run at 15 Hz with input images
of 780x580 pixels on a 3GHz machine, generating up to 10^5
data points per frame. The typical accuracy is 1/1000 of the
working distance, the best case accuracy is 1/10000. The system
works with almost arbitrary scenes and is scalable from
working areas of less than 10mm square to 1000mm square.
We have also demonstrated its miniaturization potential via
3D fingerprinting. We are already working on adapting our
method to endoscopic imaging.

References
 [1] F. Annexstein. Generating de Bruijn sequences: An efficient
     implementation. IEEE Transactions on Computers,
     46(2):198–200, 1997.
 [2] Y. Boykov, O. Veksler, and R. Zabih. Fast approximate energy
     minimization via graph cuts. IEEE Transactions on Pattern
     Analysis and Machine Intelligence, 23(11):1222–1239, Nov.
     2001.
 [3] D. Caspi, N. Kiryati, and J. Shamir. Range imaging with
     adaptive color structured light. IEEE Transactions on Pattern
     Analysis and Machine Intelligence, 20(5):470–480, May 1998.
 [4] F. Forster. A high-resolution and high accuracy real-time 3D
     sensor based on structured light. In 3D Data Processing
     Visualization and Transmission, International Symposium
     on, pages 208–215, Los Alamitos, CA, USA, 2006. IEEE
     Computer Society.
 [5] H. Kawasaki, R. Furukawa, R. Sagawa, and Y. Yagi. Dynamic
     scene shape reconstruction using a single structured light
     pattern. In Proc. IEEE Conference on Computer Vision and
     Pattern Recognition CVPR '08, pages 1–8, June 2008.
 [6] T. P. Koninckx and L. Van Gool. Real-time range acquisition
     by adaptive structured light. IEEE Transactions on Pattern
     Analysis and Machine Intelligence, 28(3):432–445, March
     2006.
 [7] C. J. Mitchell. Aperiodic and semi-periodic perfect maps.
     IEEE Transactions on Information Theory, 41(1):88–95, Jan.
     1995.
 [8] C. J. Mitchell, T. Etzion, and K. G. Paterson. A method for
     constructing decodable de Bruijn sequences. IEEE
     Transactions on Information Theory, 42(5):1472–1478, Sept.
     1996.
 [9] R. A. Morano, C. Ozturk, R. Conn, S. Dubin, S. Zietz, and
     J. Nissano. Structured light using pseudorandom codes. IEEE
     Transactions on Pattern Analysis and Machine Intelligence,
     20(3):322–327, March 1998.
[10] K. G. Paterson. Perfect maps. IEEE Transactions on
     Information Theory, 40(3):743–753, May 1994.
[11] I. Pitas and P. Tsakalides. Multivariate ordering in color
     image filtering. IEEE Transactions on Circuits and Systems
     for Video Technology, 1(3):247–259, 295–6, Sept. 1991.
[12] J. B. T. M. Roerdink and A. Meijster. The watershed
     transform: definitions, algorithms and parallelization
     strategies. Fundam. Inf., 41(1-2):187–228, 2000.
[13] J. Salvi, J. Pagès, and J. Batlle. Pattern codification
     strategies in structured light systems. Pattern Recognition,
     37:827–849, 2004.
[14] G. Sansoni, M. Carocci, and R. Rodella. Three-dimensional
     vision based on a combination of gray-code and phase-shift
     light projection: Analysis and compensation of the systematic
     errors. Appl. Opt., 38(31):6565–6573, 1999.
[15] Structured Light Survey Website. 2009.
[16] L. Vincent and P. Soille. Watersheds in digital spaces: An
     efficient algorithm based on immersion simulations. IEEE
     Transactions on Pattern Analysis and Machine Intelligence,
     13(6):583–598, June 1991.
[17] L. Zhang, B. Curless, and S. M. Seitz. Rapid shape
     acquisition using color structured light and multi-pass
     dynamic programming. In Proc. First International Symposium
     on 3D Data Processing Visualization and Transmission, pages
     24–36, 19–21 June 2002.
