Computer Vision
Patrik Haslum
COMP3620/6320
Computer Vision: What’s the Problem?
Computer Vision
Image Formation: From World to Image
* Camera model (optics & geometry): From points in 3D scene to
points on 2D image.
* Photometry: From lights and surfaces in scene to intensity
(brightness) and color in image.
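The camera-model bullet can be illustrated with a minimal sketch of an ideal pinhole (perspective) projection, which maps a 3D point (X, Y, Z) in camera coordinates to the image point (f·X/Z, f·Y/Z). The function name `project_point`, the focal length, and the sample points are assumptions made for illustration; the slides do not commit to a particular camera model.

```python
def project_point(point, focal_length=1.0):
    """Project a 3D point (X, Y, Z), given in camera coordinates, onto the image plane."""
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (focal_length * X / Z, focal_length * Y / Z)


# Two different 3D points on the same viewing ray land on the same image point,
# so depth cannot be read back from a single image.
near = project_point((1.0, 2.0, 4.0), focal_length=2.0)
far = project_point((2.0, 4.0, 8.0), focal_length=2.0)
```

Both calls return the same image coordinates, which is one way to see why going from the image back to the world is ambiguous.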
Vision: From Image to (Knowledge of the) World
* Reconstruct scene (world model) from images.
* Extract sufficient information for detection/control task.
A Hard Problem
* Under-constrained inverse problem – 3D world from 2D image.
* Images are noisy – shadows, reflections, focus, (ego-)motion
blur – and noise is hard to model.
* Appearances – shape, size, color – of objects change with pose
and lighting conditions.
* Image understanding requires cognitive ability (“AI-complete”).
* Robotics & Control: massive data rate, real-time requirements.
Example: Edges of Shadows
(Figure: panels “Image” and “Result of Edge Detection”.)
Example: Color Non-Constancy
(Figure: panels “Image” and “Color Histogram”.)
Computer Vision: What’s the Problem?
Successful for specific tasks in
controlled environments:
* Optical Character Recognition
(OCR).
* Face recognition and tracking.
* Industrial
- Inspection & quality control.
- Automation: Part recognition & pose
estimation.
* Medical image analysis.
* Remote sensing – Surveillance &
satellite images.
* Robotics – Navigation & Control.
Vision Techniques Feature Extraction / Low-Level Vision
Template Matching
* Measure correlation (similarity)
between images.
* Match template image against
each point in some image area of
interest.
* Basic operation, many uses:
- Recognition & Pattern matching.
- Tracking.
- Motion detection.
- Registration (stereo vision).
* Runtime proportional to
|template| ∗ |area|.
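The matching loop described above can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes zero-mean normalised cross-correlation as the similarity measure (the slides only say "correlation"), and the names `ncc` and `match_template` as well as the toy image are hypothetical.

```python
import math


def ncc(patch, template):
    """Zero-mean normalised cross-correlation of two equally sized 2D lists."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mean_p, mean_t = sum(p) / len(p), sum(t) / len(t)
    num = sum((a - mean_p) * (b - mean_t) for a, b in zip(p, t))
    den = math.sqrt(sum((a - mean_p) ** 2 for a in p) * sum((b - mean_t) ** 2 for b in t))
    return num / den if den > 0 else 0.0


def match_template(image, template):
    """Compare the template against every position; return (best_score, (row, col))."""
    th, tw = len(template), len(template[0])
    best_score, best_pos = -2.0, None
    for r in range(len(image) - th + 1):          # every row offset in the area
        for c in range(len(image[0]) - tw + 1):   # every column offset in the area
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)          # cost proportional to |template|
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_score, best_pos


image = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 1, 0],
    [0, 0, 1, 9, 0],
    [0, 0, 0, 0, 0],
]
template = [[9, 1], [1, 9]]
score, position = match_template(image, template)
```

The two nested position loops, each doing work proportional to the template size, make the |template| ∗ |area| runtime visible.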
Edge Detection
* Edge: Discontinuity in intensity
(brightness).
- Depth discontinuity.
- Surface orientation.
- Surface properties (reflectance,
color, etc).
- Illumination (shadows).
* Orientation (θ = arctan(dy/dx)) per
pixel.
* Usually combined with filter to
cancel out (Gaussian) noise.
* Problems:
- Selecting appropriate threshold.
- Orientation accurate only to about
±20°.
(Figure: plots of Intensity and Gradient against X Coordinate.)
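A minimal sketch of gradient-based edge detection with a threshold is given below. It assumes 3x3 Sobel kernels as the gradient estimate (the slides do not name an operator), uses `atan2(dy, dx)` as the quadrant-aware form of arctan(dy/dx), and picks an arbitrary threshold; the names `gradient_at` and `detect_edges` are hypothetical.

```python
import math

# Sobel kernels (an assumption): each combines a difference with a little smoothing.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def gradient_at(image, r, c):
    """Return the intensity gradient (dx, dy) at interior pixel (r, c)."""
    dx = sum(SOBEL_X[i][j] * image[r - 1 + i][c - 1 + j] for i in range(3) for j in range(3))
    dy = sum(SOBEL_Y[i][j] * image[r - 1 + i][c - 1 + j] for i in range(3) for j in range(3))
    return dx, dy


def detect_edges(image, threshold):
    """Map each pixel whose gradient magnitude exceeds threshold to its orientation in degrees."""
    edges = {}
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            dx, dy = gradient_at(image, r, c)
            if math.hypot(dx, dy) > threshold:
                edges[(r, c)] = math.degrees(math.atan2(dy, dx))
    return edges


# Dark left half, bright right half: a vertical step edge in intensity.
image = [[50, 50, 50, 250, 250, 250] for _ in range(5)]
edges = detect_edges(image, threshold=100)
```

Only the pixels on either side of the intensity step exceed the threshold. Choosing a threshold that works across images is the difficulty listed under "Problems".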
Vision Techniques Object Recognition
Basic Shape Detection
Generalised Hough Transform
* Voting algorithm: Each edge
pixel “votes” for a set of
shape parameters.
“If I were the edge of a circle
with radius R, the center
would be at x, y.”
* General, robust, but
computationally expensive –
many more efficient
specialised methods.
(Figure: panels “Image”, “Edges”, “R = 20” and “R = 30”.)
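The voting idea can be sketched for circles of one fixed radius. This is a minimal illustration under stated assumptions: the edge pixels are synthetic (sampled from a circle of radius 20 centred at (50, 40)), the accumulator is a plain `Counter` over integer centre cells, and the names `circle_points` and `hough_circle_votes` are hypothetical.

```python
import math
from collections import Counter


def circle_points(cx, cy, radius, n_angles=360):
    """Integer pixel positions sampled around a circle (synthetic edge pixels)."""
    return {
        (round(cx + radius * math.cos(2 * math.pi * k / n_angles)),
         round(cy + radius * math.sin(2 * math.pi * k / n_angles)))
        for k in range(n_angles)
    }


def hough_circle_votes(edge_pixels, radius, n_angles=360):
    """Each edge pixel votes once for every centre cell lying `radius` away from it."""
    votes = Counter()
    for x, y in edge_pixels:
        candidates = {
            (round(x - radius * math.cos(2 * math.pi * k / n_angles)),
             round(y - radius * math.sin(2 * math.pi * k / n_angles)))
            for k in range(n_angles)
        }
        votes.update(candidates)  # one vote per distinct candidate centre
    return votes


edges = circle_points(50, 40, 20)
votes = hough_circle_votes(edges, radius=20)
best_centre, n_votes = votes.most_common(1)[0]
```

The cell with the most votes is the centre on which the edge pixels agree. Searching over several radii adds another accumulator dimension, which is where much of the computational cost comes from.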
Object / Pattern Recognition
Pattern Recognition
* Classification problem:
Image −→ yes/no.
* Variety of machine learning
methods.
Abstract Pattern Matching
* Detect basic shapes and match
their relationships to (abstract)
pattern.
- Object-object similarity: reducible
to weighted graph matching.
- Multiple object detection in
cluttered scene is harder.
* Problem: Tolerance to
deformations / partial matches.
Vision Techniques Motion & 3D
Apparent Motion: Optical Flow
* Measure pixel displacement by
matching neighbourhood
(template) in image t with window
in image t + 1.
* Detects apparent motion.
- Still camera & background: Detect
moving objects.
- Assuming no movement in scene,
egomotion translates into optical
flow in known way – usable for
control.
* Problem: Template/window size
(time vs. quality).
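Measuring displacement by neighbourhood matching can be sketched as block matching between two frames. The sketch assumes the sum of absolute differences (SAD) as the match cost, which the slides do not specify, and uses hypothetical names (`block_flow`, `make_frame`) and a synthetic pair of frames.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized 2D lists."""
    return sum(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def crop(image, r, c, size):
    """Return the size x size block whose top-left corner is (r, c)."""
    return [row[c:c + size] for row in image[r:r + size]]


def block_flow(frame_t, frame_t1, r, c, size=3, search=2):
    """Displacement (dr, dc) of the block at (r, c) in frame t within a window of frame t + 1."""
    block = crop(frame_t, r, c, size)
    rows, cols = len(frame_t1), len(frame_t1[0])
    best_cost, best_disp = None, (0, 0)
    for dr in range(-search, search + 1):
        for dc in range(-search, search + 1):
            rr, cc = r + dr, c + dc
            if 0 <= rr <= rows - size and 0 <= cc <= cols - size:
                cost = sad(block, crop(frame_t1, rr, cc, size))
                if best_cost is None or cost < best_cost:
                    best_cost, best_disp = cost, (dr, dc)
    return best_disp


def make_frame(top, left, size=8):
    """Synthetic frame: a textured 3x3 patch on a black background."""
    frame = [[0] * size for _ in range(size)]
    for i in range(3):
        for j in range(3):
            frame[top + i][left + j] = 10 * (3 * i + j + 1)
    return frame


frame_t = make_frame(2, 2)
frame_t1 = make_frame(3, 4)   # the patch moved 1 row down and 2 columns right
flow = block_flow(frame_t, frame_t1, 2, 2)
```

The `size` and `search` parameters expose the trade-off named in the last bullet: larger templates and windows give more reliable matches and cover larger motions, at a cost that grows with both.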
3D Reconstruction
* Something varies while something
stays the same.
* Structure from motion:
- Camera moving, static scene.
- Rigid objects moving, static
camera.
* Shape from texture/shading:
- Uniformity.
* Stereopsis (two points of view).
* Shape from understanding:
- Known object shapes, constraints
on lines.
(Figure: panels “Texture” and “Motion”.)
Stereo Vision
* Different points of view cause
disparity between views of same
object: in parallel views, disparity is
inversely proportional to depth.
* Measure disparity by matching:
- Pixel neighbourhood (template).
- Features (edges & basic shapes).
- Matching constraints: Similarity,
uniqueness, continuity, order.
* Focused (non-parallel) view: higher
depth resolution around focus point
(depth).
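For parallel views, the inverse relation between disparity and depth is commonly written Z = f·B/d, with focal length f, baseline B (the distance between the two cameras) and disparity d. A minimal sketch follows; the function name and the numeric values are assumptions chosen for illustration.

```python
def depth_from_disparity(disparity, focal_length, baseline):
    """Depth Z = f * B / d for two parallel cameras.

    disparity and focal_length are in pixels; depth is in the unit of baseline.
    """
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point at finite depth")
    return focal_length * baseline / disparity


# Assumed setup: focal length 700 pixels, baseline 0.5 metres.
near = depth_from_disparity(70, focal_length=700, baseline=0.5)
far = depth_from_disparity(35, focal_length=700, baseline=0.5)
```

Halving the disparity doubles the estimated depth, so a one-pixel matching error matters much more for distant objects than for near ones.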
Applications / High-Level Vision
Gaze Tracking
* Detect face and facial features,
reconstruct head position & gaze
direction.
* Exploits:
- All faces look more or less the
same.
- Known camera position.
Demos by Seeing Machines (seeingmachines.com)
Finding & Tracking Cars from the Air
* Identifying “a car” is difficult (shape? color?).
* Detect & track “car-sized” moving objects, compute object
position in world – hypothesise objects to be cars as long as
they move on roads.
* Exploits: Known camera position and road/ground geometry to
map image objects into the world.
LiU UAVTech group (http://www.ida.liu.se/~patdo/auttek/)
Road Sign Detection & Identification
* Fast shape-detector extracts regions of interest (potential signs).
* Pattern matching applied only to those regions.
* Exploits: Distinct shapes of road signs; signs are made to be
seen.
Demo by ANU/NICTA Smart Cars project
(http://users.rsise.anu.edu.au/~rsl/car/)
Links & References
Vision on the Web
General Info
* http://homepages.inf.ed.ac.uk/rbf/CVonline/
* http://www.cs.cmu.edu/afs/cs/project/cil/ftp/html/vision.html
* http://iris.usc.edu/Vision-Notes/bibliography/contents.html
* ANU Computer Vision Course
- http://users.rsise.anu.edu.au/~luke/cvcourse.htm
- http://studyat.anu.edu.au/courses/ENGN4528.html
Demos!
* http://users.ecs.soton.ac.uk/~msn/book/new_demo/
* http://aakash.ece.ucsb.edu/imdiffuse/segment.aspx
* http://extra.cmis.csiro.au/IA/changs/motion/