# A Robust Algorithm for Feature Point Matching


Ji Zhou, Youbing Zhao, Jiaoying Shi

Abstract: Image matching is a key problem in computer vision and is frequently used in 3D model
reconstruction, object recognition, image alignment, camera self-calibration and so on. Feature
point matching is the most common kind of image matching. Its results are strongly affected by
many factors, such as object occlusion, lighting conditions and noise, so it is important to find a
robust feature point matching algorithm. In this paper we extend the method for the standard
assignment problem to solve the extended assignment problem and propose a new feature point
matching algorithm. It employs the condition that the depth of the scene is locally continuous as
an extra constraint, and uses the method for the extended assignment problem to perform global
optimization. Moreover, the algorithm needs only two optimizations and can be implemented almost
entirely with matrix computation, so it is more efficient than existing algorithms. Experiments
show that its results are very satisfactory.
Keywords: feature point matching, match strength, extended assignment problem

1. INTRODUCTION
The algorithm for feature point matching can be decomposed into three stages: feature point
extraction, feature point matching and outlier elimination. So far a large number of algorithms
for feature point extraction have been proposed. Roughly, they can be divided into three
categories. The first uses non-linear filters, such as the SUSAN corner detector proposed by
Smith [1], which relates each pixel to an area centered on the pixel. In this area, called the
SUSAN area, all pixels have intensities similar to the center pixel. If the center pixel is a
feature point (sometimes a feature point is also referred to as a "corner"), its SUSAN area is the
smallest among the pixels around it. The SUSAN corner detector can suppress noise effectively,
because it does not need image derivatives. The second category is based on curvature, such as
Kitchen and Rosenfeld's method [2]. These methods need to extract edges in advance, and then find
the feature points using the curvature of the edges. Their disadvantage is that they need
complicated computation, e.g. curve fitting, so they are relatively slow. The third category
exploits the change of pixel intensity; the typical example is Harris and Stephens' method [3].
It produces a corner response through eigenvalue analysis. Since it does not need to use a sliding
window explicitly, it is very fast. However, it is sensitive to noise because it uses first-order
image derivatives.
Feature point matching aims to find the pixel pairs that are projections of the same points of a
scene. According to the cue used for matching, existing algorithms can be divided into two
categories. The first is area-based matching (ABM); most algorithms are of this kind. They pick
out matching pixel pairs according to a local area correlation coefficient. Unfortunately, the local area
correlation coefficient only uses the local characterization of the image, so a high coefficient
value may not lead to a correct match.

* This work is supported by NSFC under Grants 69823003 and 60083009, and in part by the NSF of
Zhejiang Province, China.

The second kind of matching method is feature-based matching
(FBM). These methods require one to compute features of edges or areas. Such features are more
abstract descriptions of the image content and are invariant under different lighting conditions
and wide-baseline transforms, but their computational cost is usually very high. For example,
Daniel P. Huttenlocher et al. [4] proposed an algorithm that can decide whether two sets of points
are projections of the same set of 2D or 3D points. The authors proved that its time complexity is
close to O(n^7), with n the number of feature points, and no experimental example is presented in
that paper. Feature point matching algorithms can also be classified by their optimization scheme.
Some use global optimization methods, such as dynamic programming [5], exhaustive search,
relaxation methods and so on. Others use local optimization algorithms, such as greedy algorithms,
simulated annealing and randomized search [6,7]. Most of the methods above need extra constraints;
if these constraints are not satisfied, the methods become useless. João Maciel et al. [8] use
linear programming, which is very efficient for optimization. However, their method consumes too
much memory and needs an estimate of the number of correct matches.
One more important problem must be taken into account: in the second stage we cannot guarantee
that all found matches are correct, so these outliers must be eliminated. The most effective
current method was proposed by Zhengyou Zhang et al. [9]; it uses the epipolar line constraint to
eliminate the outliers.

2. FEATURE POINT EXTRACTION
Our method is similar to that of Harris [3]: it uses only matrix calculations and is therefore
very efficient. However, we find Harris' corner response too sensitive to noise, so we use another
feature point response function.
First of all, the average gradient matrices at each pixel must be calculated:
  I 2           I I      
    ,W         x y ,W  
          
x
M     
  I I 

2
  

 I 
 
      ,W        ,W  
 
  x y 
                    y       

where <>denotes 2D convolution, W is a 2D weight matrix, and in fact it is a low-pass filter.
Let λ1 and λ2 be the eigenvalues of M, and V1, V2 its eigenvectors; then λ1 and λ2 can be seen as
the average rates of change of intensity in the directions of V1 and V2 respectively. Based on
this analysis, we can use the smaller of λ1 and λ2 to decide whether a point is a feature point.
The feature point response function is:

$$f_{rsp}(x, y) = \min\big(\lambda_1(x, y),\, \lambda_2(x, y)\big) \qquad (1)$$

First, normalize the map of λ2 (here we assume λ1 ≥ λ2); second, apply a non-maximum suppression
operator to the result; and finally threshold the remaining maxima. Our method suppresses noise
effectively, as shown in Figure 1.
Figure 1 Comparison of the algorithms for feature point extraction: (a) our algorithm; (b) Harris' algorithm.
The length l of W depends on the variance σ of the image. However, estimating σ is difficult, so
it is usually set to 1 (pixel). Thus l can be set to 3, for when the window length is greater than
3σ, more than 95% of the signal energy can pass. The threshold is used to suppress false feature
points, and can be chosen according to the histogram of λ2.
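The response of equation (1) can be sketched as follows. This is a minimal illustration under our
own naming, not the authors' implementation: it assumes a simple box window for W and uses the
closed-form eigenvalues of the 2×2 matrix M.

```python
import numpy as np

def min_eig_response(img, w=3):
    """Feature point response f_rsp = min(lambda1, lambda2) of the averaged
    gradient matrix M (Eq. 1). `w` is the side length of the weight window W;
    a plain box filter stands in for the paper's low-pass weight matrix."""
    img = np.asarray(img, dtype=float)
    Iy, Ix = np.gradient(img)          # first-order image derivatives

    def smooth(a):
        # <a, W>: average `a` over the w x w window around each pixel.
        p = w // 2
        pad = np.pad(a, p)
        out = np.empty_like(a)
        for i in range(a.shape[0]):
            for j in range(a.shape[1]):
                out[i, j] = pad[i:i + w, j:j + w].mean()
        return out

    A, B, C = smooth(Ix * Ix), smooth(Iy * Iy), smooth(Ix * Iy)
    # Eigenvalues of M = [[A, C], [C, B]]:
    # (A + B)/2 +- sqrt(((A - B)/2)^2 + C^2); keep the smaller one (lambda2).
    return (A + B) / 2.0 - np.sqrt(((A - B) / 2.0) ** 2 + C ** 2)

# A synthetic corner: the response peaks near (8, 8) and vanishes on flat areas.
img = np.zeros((20, 20))
img[8:, 8:] = 1.0
r = min_eig_response(img)
print(r[8, 8] > r[2, 2])   # True: the corner beats the flat region
```

On the synthetic image, the response is exactly zero in flat regions and near zero along edges,
so only genuine corners survive thresholding.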

3. FEATURE POINT MATCHING
3.1 Match strength
Most applications use the local area correlation coefficient (LACC) to decide whether two feature
points match. The LACC is:
$$c_{ij} = \frac{\displaystyle\sum_{k=-n}^{n} \sum_{l=-m}^{m} \left[ I_1\!\left(u_i^1 + k,\, v_i^1 + l\right) - \bar{I}_1\!\left(u_i^1, v_i^1\right) \right] \left[ I_2\!\left(u_j^2 + k,\, v_j^2 + l\right) - \bar{I}_2\!\left(u_j^2, v_j^2\right) \right]}{(2n+1)(2m+1) \sqrt{\sigma_i^2(I_1)\, \sigma_j^2(I_2)}} \qquad (2)$$

where I1 and I2 are the intensities of the two images, (u_i^1, v_i^1) and (u_j^2, v_j^2) the i-th
and j-th feature points to be matched, and m and n the half width and half length of the sliding
window. Here

$$\bar{I}(u, v) = \sum_{i=-n}^{n} \sum_{j=-m}^{m} I(u+i,\, v+j) \,\Big/\, \big[(2m+1)(2n+1)\big]$$

is the average intensity of the window, and

$$\sigma^2(u, v) = \frac{\displaystyle\sum_{i=-n}^{n} \sum_{j=-m}^{m} I^2(u+i,\, v+j)}{(2m+1)(2n+1)} - \bar{I}^2(u, v)$$

is the variance of the window. c_ij ranges from −1 to 1, indicating similarity from smallest to
largest. The c_ij's are thresholded: possible matches whose LACC is greater than a threshold
become candidate matches, and the LACCs smaller than the threshold are set to 0.
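As a concrete illustration, the LACC of equation (2) amounts to a normalized cross-correlation of
two windows and can be computed with a few array operations. This is a sketch with our own naming;
window-boundary handling is left to the caller.

```python
import numpy as np

def lacc(I1, p1, I2, p2, n=3, m=3):
    """Local area correlation coefficient c_ij of Eq. (2) between feature point
    p1 = (u1, v1) in I1 and p2 = (u2, v2) in I2, over a (2n+1) x (2m+1) window.
    Windows are assumed to lie fully inside the images."""
    (u1, v1), (u2, v2) = p1, p2
    W1 = np.asarray(I1, dtype=float)[u1 - n:u1 + n + 1, v1 - m:v1 + m + 1]
    W2 = np.asarray(I2, dtype=float)[u2 - n:u2 + n + 1, v2 - m:v2 + m + 1]
    # Deviations from the window means (the I-bar terms of Eq. 2).
    d1, d2 = W1 - W1.mean(), W2 - W2.mean()
    # Denominator: (2n+1)(2m+1) * sqrt(sigma_i^2 * sigma_j^2).
    denom = W1.size * np.sqrt((d1 ** 2).mean() * (d2 ** 2).mean())
    return float((d1 * d2).sum() / denom) if denom else 0.0

# Identical windows correlate perfectly; an inverted window anti-correlates.
I = (np.arange(100).reshape(10, 10) % 7).astype(float)
print(round(lacc(I, (5, 5), I, (5, 5)), 6))    # 1.0
print(round(lacc(I, (5, 5), -I, (5, 5)), 6))   # -1.0
```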
A large LACC is only a prerequisite of feature point matching, so if we use the LACC alone there
can be outliers, and sometimes the ratio of outliers is high enough to invalidate the algorithm.
We must use extra matching constraints.
The most straightforward one is the disparity constraint. Since the relative displacement and
rotation of the camera are not very large, the feature point matching a given point in the first
image will appear in a relatively small area of the second image. So we can set a radius and set
the LACC of candidate pairs outside it to 0.
The disparity constraint is not enough, especially when the density of feature points is high. In
most cases, the change of depth of the scene is not very large. Let m_i^1, m_k^1 and m_j^2, m_l^2
be feature points of the two images. If m_i^1 and m_k^1 match m_j^2 and m_l^2 respectively, and
m_i^1 is close to m_k^1, we can expect the relative position of m_j^2 to m_l^2 to be similar to
that of m_i^1 to m_k^1. To quantify this, let

$$d_{pq}^r = m_p^r - m_q^r$$

Then $d_{ik}^1$ should be similar to $d_{jl}^2$. We use

$$\xi_{ijkl} = \frac{d_{ik}^1 \cdot d_{jl}^2}{\max\!\left( \left\| d_{ik}^1 \right\|^2, \left\| d_{jl}^2 \right\|^2 \right)} \qquad (3)$$

to describe the similarity, where $\cdot$ denotes the dot product of two vectors.
ξ_ijkl reflects how the feature points around m_i^1 and m_j^2 "support" their match. We call the
candidate pair m_k^1 and m_l^2 a supporter of m_i^1 and m_j^2 if ξ_ijkl is positive. If m_i^1 and
m_j^2 are really matching points, they will have many supporters, and vice versa. Naturally, the
supporting value should decrease as the distance from the nearby feature points to the one being
matched grows. So we define a function of that distance:

$$df_{ik} = \frac{1}{1 + \left\| m_i - m_k \right\|^2} \qquad (4)$$
In addition, if ξ_ijkl < 0, the candidate match gets an "opposite". In some cases, such as
repeated texture, even if a candidate match gets some supporters, it is still not a correct match,
because it has a considerable portion of opposites. Use r_n to denote the percentage of opposites
of a candidate; then the more opposites it has, the less support it gets, even if it has a large
number of supporters.
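A tiny sketch of the two ingredients above: ξ from equation (3), here normalized by the larger
squared vector length (our reading of the formula), and the distance falloff df from equation (4).
Function names are ours.

```python
def xi(d1, d2):
    """Support measure xi_ijkl of Eq. (3): dot product of the two relative
    position vectors, normalized by the larger squared length."""
    dot = d1[0] * d2[0] + d1[1] * d2[1]
    longest = max(d1[0] ** 2 + d1[1] ** 2, d2[0] ** 2 + d2[1] ** 2)
    return dot / longest if longest else 0.0

def df(mi, mk):
    """Distance falloff of Eq. (4): closer neighbors contribute more support."""
    return 1.0 / (1.0 + (mi[0] - mk[0]) ** 2 + (mi[1] - mk[1]) ** 2)

print(xi((3, 4), (3, 4)))     # 1.0: parallel, equal displacements fully support
print(xi((3, 4), (-3, -4)))   # -1.0: opposite displacements oppose the match
```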
We define the match strength as:

$$ms_{ij} = c_{ij} \left[ \sum_{m_k^1 \in \mu_1(m_i)} \max_{m_l^2 \in \mu_2(m_j)} \left\{ c_{kl}\, \xi_{ijkl}\, df_{ik} \right\} + \sum_{m_l^2 \in \mu_2(m_j)} \max_{m_k^1 \in \mu_1(m_i)} \left\{ c_{kl}\, \xi_{ijkl}\, df_{jl} \right\} \right] (1 - r_n) \qquad (5)$$

where μ1(m_i) and μ2(m_j) are neighborhoods of m_i and m_j respectively. Note that the two terms
inside the brackets of equation (5) are there to achieve symmetry, so that either image can be
specified as the first image while ms_ij stays the same. The ms_ij's are also thresholded, and
those smaller than the threshold are set to 0.
The match strength algorithm is analogous to a poll: the candidate that gets the most supporters
wins. This idea is shown in Figure 2.
Figure 2 Local continuity condition in feature point matching. m1i1 and m2i1 form a candidate
match to be polled. The feature point pairs linked by solid lines, such as m1i2 and m2i2, are
supporters, and those linked by dotted lines, such as m1i4 and m2i2, are opposites.

3.2 Disambiguation
Even though the match strengths are thresholded, ambiguity still exists, i.e. a feature point can
have more than one candidate. This problem can be solved through optimization: our objective is to
maximize the sum of the match strengths of the selected candidate matches. Since one feature point
in the first image matches exactly one in the second image and vice versa, this is obviously an
assignment problem [10].
The classical algorithm for the assignment problem was proposed by H. W. Kuhn [10] in 1955.
However, this method requires the coefficient matrix to be square. Usually the match strength
matrix MS = (ms_ij)_{m×n} is not square, so we cannot use Kuhn's method directly. We first extend
the method.
Let C = (c_ij)_{m×n} be the coefficient matrix of an assignment problem. If m = n, it is a
standard assignment problem; otherwise we call it an extended assignment problem. Next we prove
the following theorem.
Theorem: Let C_e = (c_ij)_{m×n} be the coefficient matrix of an extended assignment problem with
solution Z_e* = (z_{e,ij}*)_{m×n}. Without loss of generality, assume m < n and let h = n − m.
Construct a standard assignment problem whose coefficient matrix is:

$$C_s = \begin{pmatrix} (c_{ij})_{m \times n} \\ (a)_{h \times n} \end{pmatrix}_{n \times n} \qquad (6)$$

where a is an arbitrary constant, and let its solution be Z_s* = (z_{s,ij}*)_{n×n}. Then Z_e* is
the first m rows of Z_s*.
Proof: Let f_s* = Z_s* ⊙ C_s be the minimum of the standard assignment problem (for the assignment
problem, the optimum refers to the minimum). Here ⊙ denotes the dot product of two matrices, i.e.

$$(a_{ij})_{m \times n} \odot (b_{ij})_{m \times n} = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} b_{ij}$$

Let f_e* = Z_e* ⊙ C_e be the minimum of the extended assignment problem. Use Z_{s,m×n}* to denote
the first m rows of Z_s*, and let f_{m×n} = Z_{s,m×n}* ⊙ C_e. Now assume that Z_{s,m×n}* is not a
solution of the extended assignment problem. Since z_{s,ij}* ∈ {0, 1} and each of the last h rows
of Z_s* contains exactly one entry equal to 1, we have f_s* = f_{m×n} + h·a. By the assumption,
f_{m×n} > f_e*, so f_s* = f_{m×n} + h·a > f_e* + h·a. Yet extending Z_e* with any valid assignment
of the last h rows gives a feasible solution of the standard problem with cost f_e* + h·a, which
contradicts the premise that Z_s* is the solution of the standard assignment problem. Therefore
Z_{s,m×n}* must be a solution of the extended assignment problem.                              □
Let

$$MSE = \begin{pmatrix} (M - ms_{ij})_{m \times n} \\ 0_{h \times n} \end{pmatrix}_{n \times n} \qquad (7)$$

where M is the maximum of the ms_ij (subtracting from M turns the maximization into a
minimization), 0_{h×n} is an h×n matrix whose entries are all zeros, and h = n − m. If the number
of rows of MS is greater than the number of columns, we simply transpose it. According to the
theorem above, we solve the standard assignment problem whose coefficient matrix is MSE, and the
first m rows of its solution form the solution of the original problem.
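The padding construction of equations (6)-(7) can be demonstrated with a brute-force solver
(adequate for tiny examples; in practice the Kuhn/Hungarian algorithm replaces the permutation
search). The function names and the sample match strengths below are ours.

```python
import itertools

def solve_assignment(cost):
    """Brute-force minimum-cost assignment for a square cost matrix.
    Returns (best_cost, mapping), where mapping[i] is the column given to row i."""
    n = len(cost)
    best_cost, best_perm = float("inf"), None
    for perm in itertools.permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_cost:
            best_cost, best_perm = total, perm
    return best_cost, best_perm

def solve_extended(cost):
    """Extended (m x n, m < n) assignment, per the theorem: pad with h = n - m
    constant rows (zeros here) to get a square matrix, solve, keep first m rows."""
    m, n = len(cost), len(cost[0])
    padded = [list(row) for row in cost] + [[0.0] * n for _ in range(n - m)]
    _, perm = solve_assignment(padded)
    return perm[:m]

# Hypothetical match strengths: 2 features in image 1, 3 in image 2.
ms = [[0.9, 0.2, 0.1],
      [0.3, 0.1, 0.8]]
M = max(max(row) for row in ms)
cost = [[M - v for v in row] for row in ms]   # Eq. (7): maximize ms -> minimize M - ms
print(solve_extended(cost))                   # (0, 2): feature 0 -> 0, feature 1 -> 2
```

The padded rows cost the same under every permutation, which is exactly why the first m rows of
the square solution solve the rectangular problem.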

4. ELIMINATION OF OUTLIERS
We use the LMedS algorithm proposed by Zhengyou Zhang et al. [9] to eliminate outliers. The steps
of this method are as follows:
1) Select m samples from the point set, each comprising 8 point pairs. m is determined by the
estimated percentage of outliers in the matching point set through:

$$P = 1 - \left[ 1 - (1 - \varepsilon)^8 \right]^m$$

where ε is the estimated proportion of outliers and P is the probability that at least one of the
m samples is outlier-free.
2) Compute the fundamental matrices: for each sample J, use the 8-point algorithm to get an F_J,
J = 1, …, m.
3) For each F_J, compute the squared residuals over the whole data set and pick out their median
M_J, i.e.:

$$M_J = \mathop{\mathrm{med}}_{i=1,\dots,n} \left[ d^2\!\left(m_i^2, F_J m_i^1\right) + d^2\!\left(m_i^1, F_J^T m_i^2\right) \right]$$

where m_i^1 and m_i^2 are the i-th pair of matched feature points obtained in Section 3 and d is
the point-to-line distance.
4) Find the minimum among the M_J's, M_M, and let its corresponding fundamental matrix be F_M.
5) Calculate the so-called robust standard deviation:

$$\hat{\sigma} = 1.4826 \left[ 1 + 5/(n - 8) \right] \sqrt{M_M}$$

and assign a weight to each match:

$$w_i = \begin{cases} 1 & \text{if } r_i^2 \le (2.5 \hat{\sigma})^2 \\ 0 & \text{otherwise} \end{cases}$$

where

$$r_i^2 = d^2\!\left(m_i^2, F_M m_i^1\right) + d^2\!\left(m_i^1, F_M^T m_i^2\right)$$

Matches with zero weight are outliers and are eliminated.
6) Solve the following optimization problem by the weighted least squares method:

$$\min \sum_i w_i r_i^2$$

to get a more accurate estimate of the fundamental matrix.
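For step 1, the number of samples m follows directly from the probability formula. A small helper
(our naming, with P = 0.99 as an assumed confidence target) illustrates the computation:

```python
import math

def num_samples(eps, p=0.99, sample_size=8):
    """Number of samples m such that, with probability p, at least one sample
    of `sample_size` matches is outlier-free, given outlier ratio eps.
    Solves P = 1 - [1 - (1 - eps)^8]^m for m."""
    good = (1.0 - eps) ** sample_size   # chance one sample is all inliers
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - good))

print(num_samples(0.4))   # 272 samples suffice for a 40% outlier ratio
```

The count grows quickly with the outlier ratio, which is why a high inlier rate from the matching
stage keeps LMedS cheap.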

5. EXPERIMENTAL RESULTS
The following are some of our experimental results.

Figure 3 Feature point matching of the first scene without elimination of outliers. There are 46 pairs of feature
points. The ratio of correct match is 91%.

Figure 4 The epipolar lines of the first scene.

Figure 5 Feature point matching of the second scene without elimination of outliers. There are 179 pairs of
feature points. The ratio of correct match is also 91%.
Figure 6 The epipolar lines of the second scene.

6. CONCLUSION
In this article we propose a robust algorithm for feature point matching. Experimental results
show that the method eliminates many kinds of outliers effectively and achieves a very high ratio
of correct matches; it is therefore robust. As far as efficiency is concerned, the algorithm needs
to scan the image only once in the feature point extraction stage: if an image has r rows and c
columns, and s = r×c, the time complexity is O(s). In the feature point matching stage, the
algorithm is faster than linear programming, so the time complexity does not exceed O(m×n), with m
and n the numbers of feature points in the first and second images. In the outlier elimination
stage, since the experimental results show that the ratio of correct matches achieved in the
second stage is far above 50%, the LMedS algorithm stops within 300 iterations, and the number of
iterations of the weighted least squares method is far smaller than the number of candidate
matches, i.e. m×n. Therefore the time complexity of the whole algorithm is smaller than
O(max(s, m×n)). The resolution of our test images is 600×800, and the execution time is less than
1 minute. It is notable that our algorithm is very suitable for matrix computation.

ACKNOWLEDGEMENTS
We would like to thank Ning Qiu for her hard work on the programming.

REFERENCES
[1] Stephen Smith. Feature Based Image Sequence Understanding. PhD thesis, Department of
Engineering Science, University of Oxford, 1992.
[2] L. Kitchen and A. Rosenfeld. Gray-Level Corner Detection. Pattern Recognition Letters,
1:95-102, 1982.
[3] C. G. Harris and M. Stephens. A Combined Corner and Edge Detector. In Proc. of the 4th Alvey
Vision Conference, pages 147-151, 1988.
[4] Daniel P. Huttenlocher and Jon M. Kleinberg. Comparing Point Sets under Projection. In
Proceedings of the Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1-7, Arlington,
Virginia, 23-25 January 1994.
[5] Y. Ohta and T. Kanade. Stereo by Intra- and Inter-scanline Search Using Dynamic Programming.
IEEE Trans. PAMI, 7(2):139-154, March 1985.
[6] G. Sudhir, S. Banerjee and A. Zisserman. Finding Point Correspondence in Motion Sequence
Preserving Affine Structure. CVIU, 68(2):237-246, November 1997.
[7] P. Torr. Motion Segmentation and Outlier Detection. PhD thesis, Department of Engineering
Science, University of Oxford, 1995.
[8] João Maciel and João Costeira. Robust Point Correspondence by Concave Minimization.
VisLab-TR 11/99, December 1999.
[9] Zhengyou Zhang, Rachid Deriche, Olivier Faugeras and Quang-Tuan Luong. A Robust Technique for
Matching Two Uncalibrated Images Through the Recovery of the Unknown Epipolar Geometry. INRIA
(National Institute for Research in Computer Science and Control, France).
[10] Songdi Qian et al. Operations Research. Tsinghua University Press, pages 128-134, 1990.
