SIFT: Scale Invariant Feature Transform
David G. Lowe
“Distinctive image features from scale-invariant keypoints” (IJCV 2004)

Presented by:
Kirill Dyagilev 317845089
Ayelet Dominitz 034431304
Why Features?
• A brief yet comprehensive representation of the
  image
• Can be used for:
 ▫   Image alignment
 ▫   Object recognition
 ▫   3D reconstruction
 ▫   Motion tracking
 ▫   Indexing and database search
 ▫   More…
Desired Feature Properties
• Robustness => invariance to changes in illumination,
  scale, rotation, and affine or perspective transformations

• Locality => robustness to occlusion and clutter.

• Distinctiveness => easy to match to a large database
  of objects.

• Quantity => many features can be generated for even
  small objects

• Efficiency => computationally “cheap”, real-time
  performance
Related Research
• Corner-based local interest points:
  ▫ Moravec (1981), Harris (1992)

• Descriptors:
  ▫ Correlation window around each corner:
     Zhang (1995)
  ▫ Local, rotationally invariant:
     Schmid & Mohr (1997)
  ▫ Scale-invariant:
     Crowley & Parker (1984), Shokoufandeh et al. (1999),
     Lindeberg (1993, 1994), Mikolajczyk & Schmid (2002)

• Maximally-Stable Extremal Regions (MSER)
  ▫ Matas (2002)
Algorithm
1.   Scale-space extrema detection
2.   Keypoint localization
3.   Orientation assignment
4.   Keypoint descriptor
Step 1: Scale-Space Extrema Detection
• Need to find “characteristic scale” for features
• Scale space representation:




                              L  x, y,   G  x, y,    I  x, y 

                                                 1                           2 2
                             G  x, y ,   
                                                               x2  y 2
                                                          e
                                                2   2
Step 1: Scale-Space Extrema Detection
• Mikolajczyk (2002): experimentally, extrema of
  the LoG give the best notion of scale.
Step 1: Scale-Space Extrema Detection
• LoG is computationally expensive

• Approximation: DoG

     D  x , y ,     G  x , y , k   G  x , y ,     I  x , y 
                     L  x, y, k   L  x, y ,  

 ▫ The smoothed images must be computed anyway,
   so the calculation reduces to an image subtraction.
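As noted above, D reduces to subtracting adjacent smoothed images. A sketch, assuming SciPy's `gaussian_filter` for the smoothing (σ₀, k, and the number of levels are illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_stack(image, sigma0=1.6, k=2 ** 0.5, levels=6):
    # The blurred images L(x, y, sigma_i) are needed anyway, so each
    # DoG level D_i = L_{i+1} - L_i costs only one image subtraction.
    L = [gaussian_filter(image.astype(float), sigma0 * k ** i)
         for i in range(levels)]
    return [b - a for a, b in zip(L, L[1:])]

dogs = dog_stack(np.random.rand(64, 64))
```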
Step 1: Scale-Space Extrema Detection
DoG scale space
Step 1: Scale-Space Extrema Detection
• A sample X is selected if it is larger than all 26 of its
  neighbors, or smaller than all of them (8 neighbors at its
  own scale, plus 9 in each adjacent DoG scale)
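The 26-neighbor test can be written directly on a stack of DoG images. A minimal sketch, assuming `dog` is a (scales, height, width) array:

```python
import numpy as np

def is_extremum(dog, s, y, x):
    # 3x3x3 cube of DoG samples around (s, y, x): center + 26 neighbors.
    cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2].ravel()
    v = cube[13]                      # center of the flattened cube
    neighbors = np.delete(cube, 13)   # the other 26 samples
    return bool(v > neighbors.max() or v < neighbors.min())

dog = np.zeros((3, 5, 5))
dog[1, 2, 2] = 1.0   # a lone peak is an extremum
```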
Algorithm
1.   Scale-space extrema detection
2.   Keypoint localization
3.   Orientation assignment
4.   Keypoint descriptor
Step 2: Keypoint Localization




         (a) 233×189 image         (b) 832 DoG extrema



• Too many keypoints, some are unstable:
 ▫ points with low contrast (sensitive to noise)
 ▫ points that are localized along an edge
Step 2: Keypoint Localization
Low-contrast point elimination:
• Fit a quadratic to the DoG values around each keypoint x
  (second-order Taylor expansion):

          D(x) = D + (∂D/∂x)ᵀ x + ½ xᵀ (∂²D/∂x²) x

• Locate the extremum of the fitted function:

          x̂ = −(∂²D/∂x²)⁻¹ (∂D/∂x)

• Discard low-contrast keypoints: |D(x̂)| < 0.03
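A sketch of the refinement and contrast test, assuming the gradient and Hessian of D at the sample point have already been estimated (e.g. by finite differences over the DoG stack):

```python
import numpy as np

def refine_and_test(D0, grad, hess, thresh=0.03):
    """D0: DoG value at the sample; grad, hess: first and second
    derivatives of D w.r.t. (x, y, sigma). Returns the sub-pixel
    offset x_hat = -H^-1 g and whether the interpolated value
    |D(x_hat)| = |D0 + 0.5 * g . x_hat| passes the contrast test."""
    x_hat = -np.linalg.solve(hess, grad)
    d_hat = D0 + 0.5 * grad @ x_hat
    return x_hat, abs(d_hat) >= thresh

# A sample already at the extremum (zero gradient): no offset,
# kept because its contrast 0.1 exceeds the 0.03 threshold.
offset, keep = refine_and_test(0.1, np.zeros(3), np.eye(3))
```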
Step 2: Keypoint Localization




 729 out of 832 are left after contrast thresholding
Step 2: Keypoint Localization
• “Edge” keypoints are poorly localized and sensitive to
  noise, and thus should be eliminated.
  ▫ Solution: check the “cornerness” of each keypoint.

• On an edge, one of the principal curvatures is much
  larger than the other.

• High cornerness <=> no single dominant principal
  curvature component.
Step 2: Keypoint Localization
• Principal curvatures are proportional to the
  eigenvalues λmax, λmin of the Hessian matrix:

                  H = ⎡ Dxx  Dxy ⎤
                      ⎣ Dxy  Dyy ⎦

• Harris (1988): equivalently, with r = λmax / λmin,

              Tr(H)² / Det(H) = (r + 1)² / r
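The resulting edge test needs only the trace and determinant, never the eigenvalues themselves. A minimal sketch (r = 10 is the value Lowe reports):

```python
def passes_edge_test(dxx, dyy, dxy, r=10.0):
    # Keep the keypoint only if Tr(H)^2 / Det(H) < (r + 1)^2 / r,
    # i.e. the ratio of principal curvatures is below r.
    tr = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:        # curvatures of opposite sign: reject
        return False
    return tr * tr / det < (r + 1) ** 2 / r
```

For equal curvatures the ratio is 4 (kept); on an edge one curvature dominates, the ratio blows up, and the point is rejected.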
Step 2: Keypoint Localization




536 out of 729 are left after cornerness thresholding
Algorithm
1.   Scale-space extrema detection
2.   Keypoint localization
3.   Orientation assignment
4.   Keypoint descriptor
Step 3: Orientation assignment
• Required: Rotation invariance of features
• Solution:
 ▫ Assign an orientation to each feature based on
   local gradients
 ▫ Describe all subsequent data relative to this
   orientation
Step 3: Orientation assignment
• Create a histogram of local gradient
  directions computed at the selected scale,
  weighted by gradient magnitude and a
  Gaussian window

• Assign the canonical orientation at the
  peak of the smoothed histogram

• If the histogram has multiple strong peaks,
  create multiple keypoints, one per peak
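The orientation rule can be sketched with a weighted histogram. A minimal version (the 36-bin count and the 80%-of-peak rule follow Lowe's paper; the Gaussian weighting, smoothing, and parabolic peak interpolation are omitted):

```python
import numpy as np

def dominant_orientations(mags, angles, bins=36, peak_ratio=0.8):
    # Histogram of gradient directions (degrees), weighted by magnitude.
    hist, edges = np.histogram(angles, bins=bins, range=(0.0, 360.0),
                               weights=mags)
    centers = 0.5 * (edges[:-1] + edges[1:])
    # Every bin within 80% of the highest peak spawns its own keypoint.
    return centers[hist >= peak_ratio * hist.max()]

# Gradients all pointing near 92 degrees -> one peak, centered at 95
peaks = dominant_orientations(np.ones(10), np.full(10, 92.0))
```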
Algorithm
1.   Scale-space extrema detection
2.   Keypoint localization
3.   Orientation assignment
4.   Keypoint descriptor
Step 4: Keypoint descriptor
• We have assigned location, scale, and orientation to each
  keypoint:
  ▫  Imposes a repeatable local 2D coordinate system
  ▫  Provides invariance to these parameters


Remaining goal:
• Define local descriptor invariant to remaining variations:
  ▫ Illumination
  ▫ 3D Viewpoint
Step 4: Keypoint descriptor
• Create a 4×4 grid of 16 gradient histograms (8 bins each)
 ▫ Weighted by gradient magnitude and a Gaussian window
   (σ is half the window size)
 ▫ Histogram and gradient values are interpolated and smoothed




                                           => 128-dimensional feature vector
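The 4×4 grid of 8-bin histograms can be sketched as follows (a bare-bones version: the interpolation and Gaussian weighting from the slide are omitted, and the 16×16 patch is assumed to be already rotated to the canonical orientation):

```python
import numpy as np

def raw_descriptor(mags, angles):
    # mags/angles: 16x16 arrays of gradient magnitudes and directions
    # (degrees) around the keypoint. Each 4x4 cell contributes one
    # 8-bin orientation histogram -> 16 * 8 = 128 dimensions.
    desc = []
    for gy in range(4):
        for gx in range(4):
            m = mags[4 * gy:4 * gy + 4, 4 * gx:4 * gx + 4]
            a = angles[4 * gy:4 * gy + 4, 4 * gx:4 * gx + 4]
            hist, _ = np.histogram(a, bins=8, range=(0.0, 360.0), weights=m)
            desc.append(hist)
    return np.concatenate(desc)

# Uniform magnitudes, all gradients at 0 degrees
vec = raw_descriptor(np.ones((16, 16)), np.zeros((16, 16)))
```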
Step 4: Keypoint descriptor
• Invariance to affine illumination changes:
 ▫ Additive brightness shifts do not affect gradients
 ▫ Normalization to unit length removes the effect of gain (contrast)

• Non-linear illumination changes:
 ▫ Saturation affects magnitudes much more than orientations
 ▫ Clamp normalized gradient magnitudes at 0.2 and renormalize
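The normalize-clamp-renormalize step is a three-line operation (the 0.2 clamp is the value from the slide):

```python
import numpy as np

def normalize_descriptor(v, clamp=0.2):
    # Unit-normalize (cancels multiplicative illumination gain),
    # clamp large entries to damp saturation effects, renormalize.
    v = v / np.linalg.norm(v)
    v = np.minimum(v, clamp)
    return v / np.linalg.norm(v)

out = normalize_descriptor(np.ones(128))
```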
Step 4: Keypoint descriptor
• Justification:
  ▫ Inspired by the human visual system
  ▫ Parameters r (# of bins) and n×n (# of
    histograms) chosen empirically
Keypoint Matching
•       Nearest-neighbor matching based on L2
        distance

•       How to discard bad matches?
    ▫    A threshold on the L2 distance => poor performance
    ▫    Solution: threshold on the ratio

             d(best match) / d(second-best match)
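The ratio test can be sketched with brute-force L2 distances (the 0.8 threshold here is illustrative; Lowe's paper uses 0.8 as well):

```python
import numpy as np

def ratio_test_matches(query, database, ratio=0.8):
    # Accept a nearest-neighbor match only when the best L2 distance
    # is well below the second best: d1 / d2 < ratio. Ambiguous
    # descriptors (several near-equal candidates) are discarded.
    matches = []
    for i, q in enumerate(query):
        d = np.linalg.norm(database - q, axis=1)
        j1, j2 = np.argsort(d)[:2]
        if d[j1] < ratio * d[j2]:
            matches.append((i, int(j1)))
    return matches

db = np.array([[0.0, 0.0], [10.0, 10.0], [20.0, 20.0]])
good = ratio_test_matches(np.array([[0.5, 0.0]]), db)
```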
Typical Usage:
For set of database images:
  1. Compute SIFT features
  2. Save descriptors to database

For query image:
  1. Compute SIFT features
  2. For each descriptor find its match in the database
  3. Verify object recognition by checking feature
     consistency (relative location, scale and orientation)
    -   RANSAC
    -   Hough transform
  4. Verify with affine transform
Recognition under occlusion
Test of illumination robustness
• Same image under differing illumination




                              273 keys verified in final match
Location recognition
Image Registration Results




             [Brown & Lowe 2003]
Evaluation
• Robustness to:
  ▫ Viewpoint
  ▫ Lighting
  ▫ Scale

• Repeatability (50%) [Lowe]:
  ▫ Textured planar surfaces: up to ~50° viewpoint rotation
  ▫ Can be improved by adding transformed images to the database
  ▫ Considerably lower for 3D objects

• Weak point [Lowe]: keypoint detection
Evaluation
• Very popular: ~ 3000 citations

• [Mikolajczyk & Schmid]:
    ▫ A close second-best descriptor, behind GLOH

• [Moreels & Perona]:
    ▫ The best descriptor
    ▫ A better detector exists: affine-rectified

•   K. Mikolajczyk, C. Schmid. “A Performance Evaluation of Local Descriptors”. (CVPR, 2003)
•   P. Moreels, P. Perona. “Evaluation of Features Detectors and Descriptors based on 3D objects”. (ICCV, 2005)
Evaluation
• Computationally inexpensive

• Efficient implementations available online

• Simplified implementations yield
  considerably worse results

• Does not exploit color information

				