# Computational Vision 493.69 - 051

```					3-D Computer Vision
CSc 83029
Stereo

CSc83029 3-D Computer Vision / Ioannis Stamos
Stereopsis

• Recovering 3-D information (depth) from two images.
  • The correspondence problem.
  • The reconstruction problem.
  • Epipolar constraint.
  • The 8-point algorithm.

The 2 problems of Stereo
The setting: simultaneous acquisition of 2 images (left, right) of a static scene.
• Correspondence: Which parts of the left and right images are projections of the same scene element?
• Reconstruction: Given
  • a number of corresponding points between the left and right images, and
  • information on the geometry of the stereo system,
  find the 3-D structure of the observed objects.

Stereo Vision
[Figure: depth map computed from a stereo pair.]

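A simple stereo system
[Figure: top view of two cameras with parallel optical axes (fixation point at infinity). Labels: centers of projection Ol (left camera) and Or (right camera), baseline T, image centers cl and cr, focal length f, scene point P at depth Z, X-axis and Z-axis.]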

Triangulation
[Figure: the parallel-axis stereo system with calibrated cameras. The scene point P at depth Z projects to pl and pr, with image coordinates xl and xr measured relative to the image centers cl and cr.]
Similar triangles:
$\frac{T - x_l + x_r}{Z - f} = \frac{T}{Z} \quad\Rightarrow\quad Z = f\,\frac{T}{d}, \qquad d = x_l - x_r$
d: disparity (difference in retinal positions).
T: baseline.
Depth (Z) is inversely proportional to d (fixation at infinity).
Baseline T: determines the accuracy and robustness of the depth calculation.
Small baselines: less accurate measurements.
Large baselines: occlusions and foreshortening.

Parameters of Stereo System
[Figure: two-camera stereo system with parallel optical axes. Labels: Ol, Or, T, cl, cr, xl, xr, pl, pr, f, P, Z.]
1) Intrinsic parameters (e.g., f, cl, cr).
2) Extrinsic parameters: relative position and orientation of the 2 cameras.
STEREO CALIBRATION PROBLEM

Stereo – Photometric Constraint
• The same world point has the same intensity in both images.
• Assumes Lambertian, fronto-parallel surfaces.
• Issues: noise, specularities, foreshortening.
• Difficulties: ambiguities, large changes of appearance due to change of viewpoint, non-uniqueness.
From Jana Kosecka

Correspondence Is Difficult
• Ambiguity: there may be many possible 3-D reconstructions.
• No texture: difficult to find a unique match.
• Foreshortening: the projection in each image is different.
• Occlusions: there may not be a correspondence.
• Curved surfaces: triangulation produces an incorrect position.
Assumptions:
1) Most scene points are visible from both views.
2) Corresponding image regions are similar.

Correspondence is difficult: The Ordering Constraint
Points appear in the same order in both images.
But this is not always the case ...

More Correspondence Problems
• Regions without texture
• Highly specular surfaces
• Translucent objects

Methods For Correspondence
• Correlation based (dense correspondences).
• Feature based (such as edges/lines/corners).

Correlation-Based Methods
[Figure: a pixel pl in the left image and its search region R(pl) in the right image.]
1) For each pixel pl in the left image, search a region R(pl) in the right image for the corresponding pixel pr.
2) Compare image windows of size (2W+1)x(2W+1).
3) Select the pixel pr that maximizes a correlation function.

For each pixel pl = [i, j] in the left image:
  For each displacement d = [d1, d2] in R(pl), compute
    $c(d) = \sum_{k=-W}^{W} \sum_{l=-W}^{W} \psi\big(I_l(i+k,\, j+l),\; I_r(i+k-d_1,\, j+l-d_2)\big)$
  The disparity of pl is the d that maximizes c(d).

HAVE TO SPECIFY: region R, window size W, and correlation function ψ.

Choices for the correlation function ψ:
$\psi(u, v) = uv$               CROSS-CORRELATION
$\psi(u, v) = -(u - v)^2$       SUM OF SQUARED DIFFERENCES (SSD)
SSD is usually preferred: handles different intensity scales.
Normalized cross-correlation is better (but is more expensive).

Correspondence Is Difficult
Intensities in the two windows may differ.
Normalized cross-correlation may help.

Image Normalization
• Even when the cameras are identical models, there can be differences in gain and sensitivity.
• The cameras do not see exactly the same surfaces, so their overall light levels can differ.
• For these reasons and more, it is a good idea to normalize the pixels in each window:
  Average pixel:     $\bar{I} = \frac{1}{|W_m(x,y)|} \sum_{(u,v) \in W_m(x,y)} I(u,v)$
  Window magnitude:  $\|I\|_{W_m(x,y)} = \sqrt{\sum_{(u,v) \in W_m(x,y)} [I(u,v)]^2}$
  Normalized pixel:  $\hat{I}(x,y) = \frac{I(x,y) - \bar{I}}{\|I - \bar{I}\|_{W_m(x,y)}}$
From Sebastian Thrun/Jana Kosecka

Comparing Windows:
• Minimize: sum of squared differences (SSD).
• Maximize: cross-correlation.
The two criteria are closely related: for normalized windows, minimizing the SSD is equivalent to maximizing the cross-correlation.
From Jana Kosecka

Region-Based Similarity Metrics
• Sum of squared differences (SSD)
• Normalized cross-correlation (NCC)
• Sum of absolute differences (SAD)
From Jana Kosecka

NCC score for two widely separated views
[Figure: NCC score.]
From Jana Kosecka

Window size
[Figure: matching results for W = 3 and W = 20.]
• Effect of window size: better results with an adaptive window.
  • T. Kanade and M. Okutomi, A Stereo Matching Algorithm with an Adaptive Window: Theory and Experiment, Proc. International Conference on Robotics and Automation, 1991.
  • D. Scharstein and R. Szeliski, Stereo Matching with Nonlinear Diffusion, International Journal of Computer Vision, 28(2):155-174, July 1998.
(S. Seitz)

Stereo results
• Data from University of Tsukuba
[Figure: scene and ground truth.]
(Seitz)

Results with window correlation
[Figure: window-based matching (best window size) next to the ground truth.]
(Seitz)

Results with better method
[Figure: state-of-the-art result next to the ground truth.]
Boykov et al., Fast Approximate Energy Minimization via Graph Cuts, International Conference on Computer Vision, September 1999.
(Seitz)

Feature-Based Methods
Match sparse sets of extracted features.
A feature descriptor for a line could contain:
length l, orientation θ, midpoint m = (x, y), average contrast c.

An example similarity measure (the w's are weights):
$S = \frac{1}{w_0 (l_l - l_r)^2 + w_1 (\theta_l - \theta_r)^2 + w_2 (m_l - m_r)^2 + w_3 (c_l - c_r)^2}$

Correspondence Using Correlation
[Figure: left image and the resulting disparity map. Images courtesy of Point Grey Research.]

Correspondence By Features
[Figure: left image and right image with the extracted features labeled: corner, line, structure.]
• Search in the right image: the disparity (dx, dy) is the displacement at which the similarity measure is maximum.
From Sebastian Thrun/Jana Kosecka

Comparison of Matching Methods
Correlation-Based                              Feature-Based
• Dense depth maps.                            • Sparse depth maps.
• Need textured images.                        • Insensitive to illumination changes.
• Sensitive to foreshortening and              • A-priori info used.
  illumination changes.                        • Faster.
• Need close views.
Problems: occlusions and spurious matches
=> Introduce constraints in matching (e.g., the left-right consistency constraint).

Epipolar Constraint (Geometry)
[Figure: the scene point P, with coordinates Pl and Pr in the left and right camera frames, projects to pl on image plane πl and to pr on image plane πr. The centers of projection Ol and Or together with P define the epipolar plane, which intersects each image plane in an epipolar line. The epipoles el and er lie on the line joining Ol and Or.]

Extrinsic parameters relate the left and right camera frames:
$P_r = R(P_l - T), \qquad T = O_r - O_l$    (1)

Epipolar Constraint
• Ol, Or, and pl are enough to define the right epipolar line (E.L.).
• Given pl, pr is constrained to lie on the corresponding E.L.
• For each left pixel pl, find the corresponding right E.L.
• Searching for pr reduces to a 1-D problem.
• All E.L.s go through the epipoles.
• Parallel image planes => epipoles at infinity.

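Essential Matrix
Estimate the epipolar geometry: the correspondence between points and E.L.s.
$P_l$, $T$, and $P_l - T$ are coplanar:
$(P_l - T)^\top (T \times P_l) = 0$
Using (1), $P_l - T = R^\top P_r$, so
$(R^\top P_r)^\top (T \times P_l) = 0$
Write the cross product as a matrix product, $T \times P_l = S P_l$, with
$S = \begin{bmatrix} 0 & -T_z & T_y \\ T_z & 0 & -T_x \\ -T_y & T_x & 0 \end{bmatrix}$
Then
$P_r^\top (R S) P_l = 0 \quad\Rightarrow\quad P_r^\top E P_l = 0, \qquad E = R S$
This is the link between the epipolar constraint and the extrinsic parameters of the stereo system.

Essential Matrix (perspective projection)
$p_l = [x_l, y_l, f_l]^\top, \quad p_r = [x_r, y_r, f_r]^\top, \qquad p_l = \frac{f_l}{Z_l} P_l, \quad p_r = \frac{f_r}{Z_r} P_r$
Substituting into $P_r^\top E P_l = 0$ gives
$p_r^\top E p_l = 0, \qquad E = R \begin{bmatrix} 0 & -T_z & T_y \\ T_z & 0 & -T_x \\ -T_y & T_x & 0 \end{bmatrix}$    (essential matrix)
$p_l$ and $p_r$ are points in camera coordinates.
Epipolar lines are found by $u_r = E p_l$ and $u_l = E^\top p_r$.
E has rank 2.
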
Camera Models (linear versions)
Elegant decomposition.
No distortion!
Homogeneous coordinates.
[Figure: how is a measured pixel (xim, yim) related to a world point (Xw, Yw, Zw)?]

Fundamental Matrix
$M_l$ ($M_r$): matrix of intrinsic parameters for the left (right) camera.
Camera to pixel coordinates:
$\bar{p}_l = M_l p_l, \qquad \bar{p}_r = M_r p_r$
The essential matrix equation becomes:
$\bar{p}_r^\top M_r^{-\top} E M_l^{-1} \bar{p}_l = 0 \quad\Rightarrow\quad \bar{p}_r^\top F \bar{p}_l = 0, \qquad F = M_r^{-\top} E M_l^{-1}$    (fundamental matrix)
Epipolar lines: $\bar{u}_r = F \bar{p}_l$ and $\bar{u}_l = F^\top \bar{p}_r$.
F: pixel coordinates! E: camera coordinates!

Conclusions
Essential Matrix
• Encodes information on the extrinsic parameters.
• Has rank 2.
• Its 2 non-zero singular values are equal.
Fundamental Matrix
• Encodes information on both the extrinsic and intrinsic parameters.
• Has rank 2.

Estimating the epipolar geometry
[Figure: image pair with epipoles el and er.]
Problem: find the fundamental matrix F from a set of image correspondences $\{\bar{p}_l^i, \bar{p}_r^i\}$, each of which satisfies
$(\bar{p}_r^i)^\top F\, \bar{p}_l^i = 0$
Least-squares formulation:
$\min_F \sum_i \big( (\bar{p}_r^i)^\top F\, \bar{p}_l^i \big)^2$
subject to the constraint rank(F) = 2.

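The 8-point algorithm

n >= 8 correspondences, each giving one linear equation in the entries of F:
$(\bar{p}_r^i)^\top F\, \bar{p}_l^i = 0 \quad\Rightarrow\quad A v = 0$
v: the 9 elements of F.
A: n x 9 measurement matrix.
Solve using SVD (the solution is defined up to a scale factor).
Enforce rank(F) = 2 => apply SVD to the computed F and set its smallest singular value to zero.
Be careful: numerical instabilities (normalizing the image coordinates first helps).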
Epipolar Lines – Example

Example
[Figure: two views; point feature matching.]
From Jana Kosecka

Example
[Figure: epipolar geometry; camera pose and sparse structure recovery.]
From Jana Kosecka

Locating the Epipoles from E & F

Accurate epipole localization is useful for:
1) Refining epipolar lines.
2) Checking for consistency.
3) Uncalibrated stereo.

F => el, er in pixel coordinates.
E => el, er in camera coordinates.

Fact: All epipolar lines pass through the epipoles.

Image rectification

• Given a general displacement, how do we warp the views so that the epipolar lines are parallel to each other?
• How do we warp the pair back to the canonical configuration? (more details later)
(Seitz)

Epipolar rectification

• Rectified image pair.
• Corresponding epipolar lines are aligned with the scan-lines.
• The search for dense correspondence is a 1-D search.
[Figure: rectified image pair.]

Rectification (Trucco, Ch. 7)
• Rotate the left camera so that the epipole goes to infinity (known R, known epipoles).
• Apply the same rotation to the right camera.
• Rotate the right camera by R.
• Adjust the scale in both camera reference frames.

Rectification
• Problem: epipolar lines are not parallel to the scan lines.
[Figure: before rectification. The scene point P (Pl, Pr) projects to pl and pr; the epipolar plane cuts the two images in epipolar lines that pass through the epipoles el and er, which lie between Ol and Or.]
[Figure: after rectification. In the rectified images the epipolar lines are parallel and the epipoles are at infinity.]
From Sebastian Thrun/Jana Kosecka

3-D Reconstruction
[Figure] Reprinted from “Stereo by Intra- and Inter-Scanline Search,” by Y. Ohta and T. Kanade, IEEE Trans. on Pattern Analysis and Machine Intelligence, 7(2):139-154 (1985). © 1985 IEEE.

3-D Reconstruction
A Priori Knowledge              3-D Reconstruction from two views
• Intrinsic and extrinsic       Unambiguous (triangulation)
• Intrinsic only                Up to an unknown scaling factor
• No information                Up to an unknown projective transformation

Projective Reconstruction
[Figure: Euclidean reconstruction next to projective reconstruction.]
From Sebastian Thrun/Jana Kosecka

Euclidean vs Projective reconstruction
• Euclidean reconstruction: true metric properties of objects, such as lengths (distances), angles, and parallelism, are preserved.
• Unchanged under rigid body transformations.
• => Euclidean geometry: properties of rigid bodies under rigid body transformations, similarity transformations.

• Projective reconstruction: lengths, angles, and parallelism are NOT preserved. We get distorted images of objects and their distorted 3-D counterparts --> 3-D projective reconstruction.
• => Projective geometry.
From Sebastian Thrun/Jana Kosecka

Check this out!

http://www.well.com/user/jimg/stereo/stereo_list.html

How can We Improve Stereo?
Space-time stereo scanner uses unstructured light to aid in correspondence.
Result: dense 3-D mesh (noisy).
From Sebastian Thrun/Jana Kosecka

Active Stereo: Adding Texture to Scene
By James Davis, Honda Research, now UCSC
From Sebastian Thrun/Jana Kosecka

Active Stereo (Structured Light)
[Figure: rectified images.]
From Sebastian Thrun/Jana Kosecka

Range Images (depth images, depth maps, surface profiles, 2.5-D images)
• Sensors that produce depth directly.
• A pixel of a range image is the distance between a known reference frame and a visible point in the scene.
• Representations:
  • Cloud of points (x, y, z)
  • Rij form (spatial information is explicit)

Active Range Sensors

• Project energy or control the sensor's parameters.
• Laser, radars (accurate) / sonars (inaccurate).
• Active focusing/defocusing.

Triangulation
Light Stripe System.
[Figure: camera frame (Xc, Yc, Zc), image plane, and the projected light plane.]
Light plane (in camera frame): AX+BY+CZ+D=0
Image point (perspective): x=f X/Z, y=f Y/Z
Triangulation: Z=-D f/(A x + B y + C f)
Move the light stripe or the object.

Time of Flight

```
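The correlation-based matching and triangulation steps from the slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the course's implementation: the pair is assumed to be rectified, so the search region R(pl) is a segment of the same scanline; the correlation function is ψ(u, v) = −(u − v)² (SSD); and the disparity is taken as d = xl − xr, so that Z = f T / d. The function names `ssd_disparity` and `depth_from_disparity`, the tiny synthetic images, and the values of f and T are hypothetical.

```python
def ssd_disparity(left, right, row, col, half_window, max_disparity):
    """Return the disparity d = xl - xr that maximizes c(d), with
    psi(u, v) = -(u - v)^2, searching along the same scanline."""
    best_d, best_score = None, float("-inf")
    for d in range(max_disparity + 1):
        if col - d - half_window < 0:
            break  # the window would leave the right image
        score = 0.0
        for k in range(-half_window, half_window + 1):
            for l in range(-half_window, half_window + 1):
                diff = left[row + k][col + l] - right[row + k][col + l - d]
                score -= diff * diff  # accumulate -(u - v)^2
        if score > best_score:
            best_d, best_score = d, score
    return best_d


def depth_from_disparity(focal_length, baseline, disparity):
    """Triangulation for parallel optical axes: Z = f * T / d."""
    if disparity <= 0:
        raise ValueError("non-positive disparity: depth is undefined")
    return focal_length * baseline / disparity


# Synthetic rectified pair (assumed data): the right image is the left
# image shifted 2 pixels to the left, so interior pixels have disparity 2.
left = [[(7 * r + 3 * c * c) % 23 for c in range(12)] for r in range(7)]
right = [row[2:] + [0, 0] for row in left]

d = ssd_disparity(left, right, row=3, col=6, half_window=1, max_disparity=4)
# f in pixels and T in meters give Z in meters (illustrative values).
Z = depth_from_disparity(focal_length=500.0, baseline=0.5, disparity=d)
print(d, Z)
```

With a larger `half_window` the match becomes more stable in weakly textured regions but blurs depth discontinuities, which is the window-size trade-off shown on the "Window size" slide.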
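To make the essential-matrix derivation concrete, the following sketch builds E = RS from an assumed rotation R and baseline T, maps a scene point between the two camera frames with Pr = R(Pl − T), and checks numerically that Pr^T E Pl vanishes. The helper names (`cross_matrix`, `matmul`, `matvec`, `dot`) and the numeric values of R, T, and Pl are illustrative assumptions, not taken from the slides.

```python
import math


def cross_matrix(t):
    """Matrix S such that S p equals the cross product t x p."""
    tx, ty, tz = t
    return [[0.0, -tz, ty],
            [tz, 0.0, -tx],
            [-ty, tx, 0.0]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(a, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


# Assumed extrinsic parameters: a 10-degree rotation about the Y-axis
# and a baseline vector T (hypothetical values, arbitrary units).
angle = math.radians(10.0)
R = [[math.cos(angle), 0.0, math.sin(angle)],
     [0.0, 1.0, 0.0],
     [-math.sin(angle), 0.0, math.cos(angle)]]
T = [0.2, 0.01, 0.03]

E = matmul(R, cross_matrix(T))  # essential matrix E = R S

# A scene point in the left camera frame and the same point expressed
# in the right camera frame, Pr = R (Pl - T).
P_l = [0.4, -0.3, 5.0]
P_r = matvec(R, [a - b for a, b in zip(P_l, T)])

residual = dot(P_r, matvec(E, P_l))  # Pr^T E Pl, zero up to round-off

# Perspective projection with fl = fr = 1 only rescales the vectors,
# so the constraint also holds for the image points pl and pr.
p_l = [c / P_l[2] for c in P_l]
p_r = [c / P_r[2] for c in P_r]
residual_image = dot(p_r, matvec(E, p_l))

# E T = R (T x T) = 0, so E is singular, in line with rank(E) = 2.
null_vector = matvec(E, T)
print(abs(residual) < 1e-12, abs(residual_image) < 1e-12)
```

Replacing `P_r` with a point that does not correspond to `P_l` generally gives a clearly non-zero residual, which is how the constraint can be used to reject spurious matches.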