
Visual Categorization with Bags of Keypoints

Gabriella Csurka
Christopher R. Dance
Lixin Fan
Jutta Willamowski
Cedric Bray

Presented by Yun-hsueh Liu
EE 148, Spring 2006 (5/30/2006)

What is Generic Visual Categorization?

- Categorization: distinguish between different classes.

- Generic Visual Categorization:
  - "Generic": copes with many object types simultaneously and is readily extended to new object types.
  - Handles variation in viewpoint, imaging conditions, lighting, occlusion, and typical object and scene variations.

Previous Work in Computational Vision

- Single Category Detection
  Decide whether a member of one visual category is present in a given image (e.g. faces, cars, targets).

- Content Based Image Retrieval
  Retrieve images on the basis of low-level image features, such as colors or textures.

- Recognition
  Distinguish between images of structurally distinct objects within one class (say, different cell phones).



Bag-of-Keypoints Approach

Pipeline: Interest Point Detection → Key Patch Extraction → Feature Descriptors → Bag of Keypoints → Multi-class Classifier

[Pipeline figure: each image is ultimately summarized as a feature vector, e.g. (0.1, 0.5, ..., 1.5), which is passed to the classifier.]

SIFT Descriptors

[Figure: the pipeline with the Feature Descriptors stage highlighted; each key patch is described by a 128-dimensional SIFT vector, e.g. (0.1, 0.5, ..., 1.5).]

Bag of Keypoints (1)

- Construction of a vocabulary (sketched below)
  - K-means clustering → find "centroids" (computed on all the descriptors we find from all the training images).
  - Define a "vocabulary" as the set of centroids, where every centroid represents a "word".

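A minimal sketch of the vocabulary-construction step, assuming scikit-learn's KMeans; the random arrays below are a stand-in for the per-image SIFT descriptors produced by the previous sketch, and all variable names are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

# Stand-in for the SIFT descriptors of all training images (one (n_i, 128) array per image);
# in the real pipeline these come from the descriptor step sketched above.
rng = np.random.default_rng(0)
all_descriptors = [rng.random((300, 128), dtype=np.float32) for _ in range(20)]

# Cluster every descriptor from every training image; the k centroids are the vocabulary.
k = 1000   # the operating point discussed later in the deck
vocabulary = KMeans(n_clusters=k, n_init=10, random_state=0).fit(np.vstack(all_descriptors))

print(vocabulary.cluster_centers_.shape)   # (1000, 128): one visual "word" per row
```
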



Bag of Keypoints (2)

- Histogram (see the sketch below)
  - Count the number of occurrences of each visual word in each image.

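Continuing the previous sketch (it reuses the fitted `vocabulary` and the per-image `all_descriptors` list defined there), this shows how one image's descriptors become its bag-of-keypoints histogram; names are illustrative.

```python
import numpy as np

def bag_of_keypoints(descriptors, vocabulary):
    """Map each descriptor to its nearest centroid and count occurrences of each visual word."""
    words = vocabulary.predict(descriptors)                      # nearest-centroid index per keypoint
    return np.bincount(words, minlength=vocabulary.n_clusters)   # histogram of length k

# One histogram per image; stacking them gives the matrix fed to the classifier.
X_train = np.array([bag_of_keypoints(d, vocabulary) for d in all_descriptors])
print(X_train.shape)   # (num_images, 1000)
```
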




Multi-class Classifier

- In this paper, classification is based on conventional machine learning approaches:
  - Naïve Bayes
  - Support Vector Machine (SVM)



Multi-class classifier – Naïve Bayes (1)

- Let V = {v_t}, t = 1, …, N, be the visual vocabulary, in which each v_t is a visual word (cluster centroid) in the feature space.

- Let I = {I_i} be the set of labeled training images.

- Denote the classes by C_j, j = 1, …, M.

- N(t,i) = number of times v_t occurs in image I_i (the keypoint histogram).

- Scoring approach: for each class, evaluate P(C_j | I_i), where

      P(C_j | I_i) ∝ P(C_j) · Π_{t=1..N} P(v_t | C_j)^N(t,i)        (*)

Multi-class Classifier – Naïve Bayes (2)

- Goal: assign image I_i to the class C_j for which the score (*), P(C_j | I_i), has maximum value.

- To avoid zero probabilities (words never seen in a class during training), estimate P(v_t | C_j) with Laplace (add-one) smoothing, as sketched below:

      P(v_t | C_j) = ( 1 + Σ_{i: I_i ∈ C_j} N(t,i) ) / ( N + Σ_{s=1..N} Σ_{i: I_i ∈ C_j} N(s,i) )

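As a sketch of this classifier on bag-of-keypoints histograms: scikit-learn's MultinomialNB with alpha=1.0 corresponds to the multinomial model above with add-one (Laplace) smoothing. The data below are random stand-ins for the real histograms and labels.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Illustrative stand-in data: 6 images, k=1000 visual words, 3 classes.
# In the real pipeline the rows of X_train are the bag-of-keypoints histograms.
rng = np.random.default_rng(0)
X_train = rng.integers(0, 5, size=(6, 1000))
y_train = np.array([0, 0, 1, 1, 2, 2])

nb = MultinomialNB(alpha=1.0)   # alpha=1.0 gives the add-one (Laplace) smoothing above
nb.fit(X_train, y_train)

X_test = rng.integers(0, 5, size=(2, 1000))
print(nb.predict(X_test))            # class with the maximum posterior score for each image
print(nb.predict_log_proba(X_test))  # per-class log-posteriors (used for the mean-rank measure later)
```
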
Multi-class classifier – Support Vector Machine (SVM)

- Input: the keypoint histogram for each image.

- Multi-class → one-against-all approach.

- A linear SVM gives better performance than quadratic or cubic SVMs.

- Goal: find hyperplanes which separate the multi-class data with maximum margin (see the sketch below).

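A minimal one-against-all linear SVM sketch using scikit-learn's LinearSVC, which trains one maximum-margin hyperplane per class by default; the histograms and labels below are random stand-ins for the real data.

```python
import numpy as np
from sklearn.svm import LinearSVC

# Illustrative stand-in data: bag-of-keypoints histograms for 9 images, 3 classes.
rng = np.random.default_rng(0)
X_train = rng.integers(0, 5, size=(9, 1000)).astype(float)
y_train = np.repeat([0, 1, 2], 3)

# LinearSVC trains one maximum-margin hyperplane per class (one-against-all by default).
svm = LinearSVC(C=1.0).fit(X_train, y_train)

print(svm.coef_.shape)                     # (3, 1000): one hyperplane (w, b) per class
print(svm.decision_function(X_train[:2]))  # per-class scores; the highest score wins
```
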




Evaluation of Multi-class Classifiers

- Three performance measures (see the sketch below):
  - The confusion matrix
    - Each column of the matrix represents the instances in a predicted class.
    - Each row represents the instances in an actual class.
  - The overall error rate
    - = Pr(output class ≠ true class)
  - The mean ranks
    - The mean position of the correct label when the labels output by the multi-class classifier are sorted by classifier score.

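A minimal sketch of the three measures on a toy set of predictions; the labels and per-class scores below are made up purely for illustration, and the mean-rank computation follows the definition above.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Illustrative stand-ins: true labels, predicted labels, and per-class classifier scores.
y_true = np.array([0, 0, 1, 2, 2, 1])
y_pred = np.array([0, 1, 1, 2, 0, 1])
scores = np.array([[0.7, 0.2, 0.1],
                   [0.3, 0.5, 0.2],
                   [0.1, 0.8, 0.1],
                   [0.2, 0.2, 0.6],
                   [0.5, 0.1, 0.4],
                   [0.3, 0.4, 0.3]])

# Confusion matrix: rows = actual class, columns = predicted class.
print(confusion_matrix(y_true, y_pred))

# Overall error rate = Pr(output class != true class).
print((y_pred != y_true).mean())

# Mean rank: average position (1 = best) of the true class when classes are sorted by score.
order = np.argsort(-scores, axis=1)                    # classes sorted by decreasing score
ranks = (order == y_true[:, None]).argmax(axis=1) + 1  # 1-based position of the true class
print(ranks.mean())
```
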




n-Fold Cross Validation

- What is a "fold"?
  - Randomly break the dataset into n partitions.
  - Example (n = 10), as sketched below:
    - Train on partitions 2, 3, …, 10; test on partition 1 → result 1
    - Train on partitions 1, 3, …, 10; test on partition 2 → result 2
    - …
    - Answer = average of result 1, result 2, …

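A minimal sketch of 10-fold cross-validation with scikit-learn's KFold, here wrapped around the Naïve Bayes classifier from earlier; the data are random stand-ins for the real histograms and labels.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.naive_bayes import MultinomialNB

# Illustrative stand-in data: 30 images' histograms over a 50-word vocabulary, 3 classes.
rng = np.random.default_rng(0)
X = rng.integers(0, 5, size=(30, 50))
y = np.repeat([0, 1, 2], 10)

# 10-fold cross-validation: train on 9 partitions, test on the held-out one, average the results.
error_rates = []
for train_idx, test_idx in KFold(n_splits=10, shuffle=True, random_state=0).split(X):
    model = MultinomialNB().fit(X[train_idx], y[train_idx])
    error_rates.append((model.predict(X[test_idx]) != y[test_idx]).mean())

print(np.mean(error_rates))   # the averaged "answer" over the n folds
```
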




Experiment on Naïve Bayes – k's effect

- Plot the overall error rate as a function of the number of clusters k.

- Result
  - The error rate decreases as k increases.
  - Operating point selected: k = 1000.
  - Beyond this point, the error rate decreases only slowly.




Experiment on Naïve Bayes – Confusion Matrix

                  faces   buildings   trees   cars   phones   bikes   books
  faces             76        4         2       3      4        4      13
  buildings          2       44         5       0      5        1       3
  trees              3        2        80       0      0        5       0
  cars               4        1         0      75      3        1       4
  phones             9       15         1      16     70       14      11
  bikes              2       15        12       0      8       73       0
  books              4       19         0       6      7        2      69
  error rate (%)    24       56        20      25     27       27      31
  mean rank        1.49     1.88      1.33    1.33   1.63     1.57    1.57


Experiment on SVM – Confusion Matrix

                  faces   buildings   trees   cars   phones   bikes   books
  faces             98       14        10      10     34        0      13
  buildings          1       63         3       0      3        1       6
  trees              1       10        81       1      0        6       0
  cars               0        1         1      85      5        0       5
  phones             0        5         4       3     55        2       3
  bikes              0        4         1       0      1       91       0
  books              0        3         0       1      2        0      73
  error rate (%)     2       27        19      15     45        9      27
  mean rank        1.04     1.77      1.28    1.30   1.83     1.09    1.39

Interpretation of Results

- The confusion matrix
  - In general, SVM makes more correct predictions than Naïve Bayes does.

- The overall error rate
  - In general, Naïve Bayes > SVM (i.e., SVM achieves the lower error rate).

- The mean rank
  - In general, SVM < Naïve Bayes (i.e., SVM ranks the correct label closer to the top).




Why do we have errors?

- Some images contain objects from two or more classes.
- The data set is not totally clean (noise).
- Each image is given only one training label.




Conclusion

- Bag-of-Keypoints is a new and efficient generic visual categorizer.

- Evaluated on a seven-category database, the method proved robust to:
  - the choice of the number of clusters, background clutter, and multiple objects in an image.

- Any questions?

- Thank you for listening to my presentation!! :)



