# Pascal VOC Classification challenges

```
Pascal VOC Classification challenges

Lear Group
INRIA - France

01/02/2010    Pascal Workshop Challenge   1
Kernel-Based Classification of Visual
Objects using a Sparse Image
Representation

Jianguo Zhang, Cordelia Schmid

INRIA Rhone-Alpes, 665, avenue de l'Europe
38330 Montbonnot, FRANCE
Email: {jianguo.zhang, cordelia.schmid}@inrialpes.fr

Sparse Representation

- Scale-invariant regions: robustness against geometric transformations
- A sparse representation: saliency, compactness
[Figure panels: without spatial selection; with spatial selection]

Outline

Pascal VOC: image signatures #1

[Figure: keypoint description -> codebook -> histograms, compared with the χ² distance]
Standard ‘Bag of words’ representation

Pascal VOC: image signatures #2

[Figure: keypoint description -> codebooks #1 to #n -> center coordinates + histograms, compared with the EMD distance]

Technical details

- Extraction of a sparse set of descriptors from scale-invariant interest
  regions: Harris-Laplace and Laplacian.
- SIFT is used as the region descriptor, resulting in 128-dimensional
  description vectors. Note that the version of SIFT used here is not
  rotation invariant.
- Vocabulary construction with k-means: we cluster the descriptors of
  each class separately and then concatenate the resulting clusters.
- Here we extract 250 clusters per class with the k-means algorithm.
  With 4 classes, the concatenation results in 1000 clusters.

Distance measure
- For each image we compute a frequency histogram over our set of
  visual words.
- We compare these histograms with the χ² distance:

    \chi^2(h_1, h_2) = \sum_i \frac{(h_1(i) - h_2(i))^2}{h_1(i) + h_2(i)}

  where h_1 and h_2 are the vocabulary histograms of two different images.

Using EMD kernel

- Cluster the descriptors of each image into 40 cluster centers to
  form a signature for each image.

- Compute the earth mover’s distance (EMD) between the signatures of two images.

- Kernelization and classification proceed in the same way as for the
  χ² distance.

Classification / histograms

- We use Support Vector Machines (SVM) for classification.
  Our kernel is a Gaussian kernel based on the χ² distance:

    K(I_1, I_2) = \exp\left(-\frac{1}{A} \chi^2(h_1, h_2)\right)

- The parameter A is obtained by 5-fold cross-validation on the training
  images. The distance between images is computed separately for each
  detector/descriptor pair. Results are combined by adding the distances
  and estimating A for the combination. Here we combine Harris-Laplace /
  SIFT and Laplacian / SIFT.
- For each of the 4 classes we train a binary classifier which separates
  that class from the others. The output of the SVM is normalized to
  [0, 1] and used as a confidence measure.

Results

- For the training images and test set 1 (1373 images in total), the
  average number of points detected per image is 796 for
  Harris-Laplace and 2465 for the Laplacian.
- The minimum number of points detected for an image is 15
  (Harris-Laplace) and 71 (Laplacian).

ROC curve

[Figure: ROC curves of the χ² kernel and the EMD kernel on test set 1. Panels: Cars, Bikes]

ROC Curve: test1

[Figure: ROC curves of the χ² kernel and the EMD kernel on test set 1. Panels: Motorbikes, People]

ROC Curve: test2

[Figure: ROC curves of the χ² kernel and the EMD kernel on test set 2. Panels: Bikes, Cars]

ROC Curve: test2

[Figure: ROC curves of the χ² kernel and the EMD kernel on test set 2. Panels: Motorbikes, People]

Difficult images in test set 1

Difficult bikes

Difficult cars

Difficult images in test set 1

Difficult motorbikes

Difficult people

Difficult images in test set 2

Difficult bikes

Difficult cars

Difficult images in test set 2

Other results

- We tried several combinations of detectors and descriptors.
- We denote:
  • the Harris detector with different levels of invariance as HS, HSR and HA;
  • the Laplacian detector with different levels of invariance as LS, LSR and LA.
  Note that HA and LA are both, by construction, rotation invariant.
- The combination of detectors and descriptors is denoted by
  (detector+detector)(descriptor+descriptor); e.g.,
  (HS+LS)(SIFT+SPIN) means the combination of the HS and LS
  detectors, each described with the SIFT and SPIN descriptors.

Other results

Conclusions

- The framework using a sparse image representation with kernels
  works well for image categorization.
  Under the current experimental settings, the χ² kernel works slightly
  better than the EMD kernel.
- The parameter k is not critical.
- The Laplacian detector works better.
- Comparable recognition performance can also be achieved with EMD,
  without building the ‘global’ vocabulary. This could be useful
  for large data sets.
- Positive / negative examples in test2: only background should be
  taken as negative; several objects can appear in the same test
  image, which is not the case for the training set.


```
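The "Technical details" and "Distance measure" slides describe turning an image's descriptors into a frequency histogram over visual words. A minimal sketch of that step follows, assuming the vocabulary (cluster centers) has already been built. The function names `assign_visual_words` and `bow_histogram` are hypothetical, and nearest-center assignment by Euclidean distance is an assumption, since the slides do not state the assignment rule.

```python
import numpy as np


def assign_visual_words(descriptors, vocabulary):
    """Index of the nearest vocabulary center for each descriptor.

    descriptors: array of shape (n_descriptors, dim), e.g. dim = 128 for SIFT.
    vocabulary:  array of shape (n_words, dim), e.g. k-means cluster centers.
    Assumption: nearest center under squared Euclidean distance.
    """
    diff = descriptors[:, None, :] - vocabulary[None, :, :]
    sq_dist = (diff ** 2).sum(axis=2)  # shape (n_descriptors, n_words)
    return sq_dist.argmin(axis=1)


def bow_histogram(descriptors, vocabulary):
    """Frequency histogram of visual words for one image."""
    words = assign_visual_words(descriptors, vocabulary)
    counts = np.bincount(words, minlength=len(vocabulary)).astype(float)
    total = counts.sum()
    # Guard against an image with no detected points.
    return counts / total if total > 0 else counts


# Toy 2-D example standing in for 128-dimensional SIFT descriptors.
toy_vocabulary = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
toy_descriptors = np.array([[1.0, 0.0], [9.0, 1.0], [0.0, 1.0], [0.0, 9.0]])
toy_histogram = bow_histogram(toy_descriptors, toy_vocabulary)
```

The broadcasted distance computation is simple but needs memory proportional to n_descriptors × n_words × dim; for a 1000-word vocabulary and thousands of points per image, a chunked or tree-based search may be preferable.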
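The slides compare histograms with a χ² distance, add the distances of the detector/descriptor channels, and turn the sum into a Gaussian kernel with parameter A. The sketch below illustrates those three steps under stated assumptions: the function names are hypothetical, bins that are empty in both histograms are assumed to contribute zero, and A is passed in by hand rather than selected by the 5-fold cross-validation the slides mention.

```python
import numpy as np


def chi2_distance(h1, h2):
    """Chi-square distance: sum_i (h1(i) - h2(i))^2 / (h1(i) + h2(i))."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    num = (h1 - h2) ** 2
    den = h1 + h2
    nonzero = den > 0  # assumption: bins empty in both histograms add 0
    return float(np.sum(num[nonzero] / den[nonzero]))


def combined_distance_matrix(channels):
    """Sum of pairwise chi-square distances over all channels.

    channels: list of arrays, each of shape (n_images, n_bins), one per
    detector/descriptor pair (e.g. Harris-Laplace/SIFT, Laplacian/SIFT).
    """
    n = channels[0].shape[0]
    dist = np.zeros((n, n))
    for hists in channels:
        for i in range(n):
            for j in range(i + 1, n):
                d = chi2_distance(hists[i], hists[j])
                dist[i, j] += d
                dist[j, i] += d
    return dist


def chi2_gaussian_kernel(dist, A):
    """K = exp(-dist / A); A would normally come from cross-validation."""
    return np.exp(-np.asarray(dist, dtype=float) / A)


# Two images, two channels with different vocabulary sizes.
channel_a = np.array([[0.5, 0.25, 0.25], [0.25, 0.5, 0.25]])
channel_b = np.array([[1.0, 0.0], [0.0, 1.0]])
D = combined_distance_matrix([channel_a, channel_b])
K = chi2_gaussian_kernel(D, A=1.0)
```

A matrix like `K` can be handed to any SVM implementation that accepts a precomputed kernel. Because the distances are added before exponentiation, a single A scales the combined distance, which matches the slides' description of estimating A for the combination.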