Shared Features and Joint Boosting
Sharing visual features for multiclass and multiview object detection
A. Torralba, K. P. Murphy, and W. T. Freeman. IEEE TPAMI, vol. 29, no. 5, pp. 854-869, May 2007.


                            Yuandong Tian
Outline
   Motivation to choose this paper
   Motivation of this paper
   Basic ideas in boosting
   Joint Boost
   Feature used in this paper
   My results in face recognition
Motivation to choose this paper
Axiom:
  Computer vision is hard.
Assumption (smart-stationary):
  Equally smart people are equally distributed over time.
Conjecture:
  If computer vision has not been solved in 30 years, it will never be solved!
   Wrong!
Because we are standing on the shoulders of giants.
Where are the Giants?
More computing resources?
Lots of data?
Advancement of new algorithms?
Machine Learning?   <- What I believe
Cruel Reality
Why does ML seem not to help much in CV (at least for now)?

     My answer: CV and ML are weakly coupled.
 A typical question in CV
Q:
Why do we use feature A instead of feature B?

    A1: Feature A gives better performance.
    A2: Feature A has some fancy properties.
A3: The following step requires the feature to have a certain property that only A has.
                    (A strongly-coupled answer)
 Typical CV pipeline


Preprocessing steps ("Computer Vision")  ->  Feature/Similarity  ->  ML black box

The preprocessing steps carry domain-specific structure; the ML black box is designed for generic structures.
Contribution of this paper
 Tune the ML algorithm in a CV context
 A good attempt to break open the black box and integrate the two
Outline
   Motivation to choose this paper
   Motivation of this paper
   Basic ideas in boosting
   Joint Boost
   Feature used in this paper
   My results in face recognition
This paper
 Object recognition problem
   Many object categories
   Few images per category
 Solution: feature sharing
   Find common features that distinguish a subset of classes from the rest
Feature sharing
Concept of feature sharing (typical behavior):
  Template-like features: 100% accuracy for a single object, but too specific.
  Wavelet-like features: weaker discriminative power, but shared across many classes.
Result of feature sharing
Why feature sharing?
 ML: Regularization, which avoids over-fitting
   Essentially more positive samples
   Reuse the data
 CV: Utilize the intrinsic structure of object categories
   Use a domain-specific prior to bias the machine learning algorithm
Outline
   Motivation to choose this paper
   Motivation of this paper
   Basic ideas in boosting
   Joint Boost
   Feature used in this paper
   My results in face recognition
Basic idea in Boosting
 Concept: binary classification
   Samples v_i, labels z_i (+1 or -1)

Goal: Find a function (classifier) H that maps positive samples to positive values, i.e. sign(H(v_i)) = z_i.

Optimization: Minimize the exponential loss w.r.t. the classifier H:
   J = sum_i exp(-z_i * H(v_i))
Basic idea in boosting(2)
 Boosting: Assume H is additive:
    H(v) = sum_m h_m(v)

  Each h_m is a "weak" learner (classifier).
    Almost random, but uniformly better than random
  Example:
    Single-feature classifier:
       makes its decision based on a single dimension only
  What a weak learner looks like:
    h(v) = a * [v_f > theta] + b   (a regression stump on feature dimension f)

Key point:
The sum of weak classifiers gives a strong classifier!
Basic idea in boosting(3)
 How to minimize?
   Greedy Approach
     Fix H, add one h in each iteration
   Weighting samples
     After each iteration, wrongly classified
      samples (difficult samples) get higher
      weights
Technical parts
 Greedy -> second-order Taylor expansion of the loss in each iteration, which reduces to a weighted least-squares problem:

    J_wse = sum_i w_i (z_i - h_m(v_i))^2

  with weights w_i = exp(-z_i * H(v_i)), labels z_i, and h_m the weak learner to be optimized in this iteration.
  Solved by (weighted) least squares.
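
As a concrete illustration, here is a minimal sketch of one GentleBoost-style round with a single-feature regression stump fit by weighted least squares, assuming numpy arrays X (n_samples x n_features) and labels z in {-1, +1}. The function names (fit_stump, boost_round) are illustrative, not taken from the paper's code.

import numpy as np

def fit_stump(X, z, w):
    """Fit h(v) = a*[v_f > theta] + b by weighted least squares,
    scanning every feature f and candidate threshold theta."""
    best = None
    for f in range(X.shape[1]):
        vals = np.unique(X[:, f])
        for theta in (vals[:-1] + vals[1:]) / 2:          # midpoints as candidate thresholds
            idx = X[:, f] > theta                          # indicator [v_f > theta]
            wp, wn = w[idx].sum(), w[~idx].sum()
            if wp == 0 or wn == 0:
                continue
            b = np.dot(w[~idx], z[~idx]) / wn              # weighted mean of z on the "low" side
            a = np.dot(w[idx], z[idx]) / wp - b            # so that a + b is the mean on the "high" side
            err = np.dot(w, (z - (a * idx + b)) ** 2)      # weighted squared error J_wse
            if best is None or err < best[0]:
                best = (err, f, theta, a, b)
    return best[1:]

def boost_round(X, z, H):
    """One boosting iteration: reweight samples, fit a stump, add it to H."""
    w = np.exp(-z * H)                     # difficult (wrongly classified) samples get large weight
    f, theta, a, b = fit_stump(X, z, w)
    H = H + a * (X[:, f] > theta) + b      # additive update of the strong classifier
    return H, (f, theta, a, b)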
Outline
   Motivation to choose this paper
   Motivation of this paper
   Basic ideas in boosting
   Joint Boost
   Feature used in this paper
   My results in face recognition
Joint Boost: Multiclass
 We can minimize a similar function using a one-vs-all strategy:

    J = sum_c sum_i exp(-z_i^c * H(v_i, c))

 This alone doesn't work very well, since the loss is separable in c: it decouples into C independent binary problems.
 Put constraints on H -> shared features!
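
For reference, a tiny sketch of this multiclass one-vs-all loss, assuming Z and H are n x C numpy arrays holding the membership labels z_i^c in {-1, +1} and the classifier outputs H(v_i, c); the variable names are illustrative.

import numpy as np

def multiclass_exp_loss(H, Z):
    # J = sum_c sum_i exp(-z_i^c * H(v_i, c))
    # The sum splits into C independent terms, one per class c,
    # which is why unconstrained one-vs-all just trains C separate boosters.
    return np.exp(-Z * H).sum()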
Joint Boost (2)
 In each iteration, choose
   One common feature
   A subset of classes that use this feature
 So that the objective decreases the most
Sharing Diagram
[Diagram: rows are classes I-V, columns are boosting iterations; each iteration selects one feature (1, 3, 4, 5, 2, 1, 4, 6, 2, 7, 3) and marks the subset of classes that share it.]
Key insight
 Each class may have its own favorite feature
 A common feature may not be any of them; however, it simultaneously decreases the errors of many classes.
Joint Boost – Illustration
Computational issue
 Choosing the best subset is prohibitive
 Use a greedy approach (see the sketch below)
   Choose one class and one feature so that the objective decreases the most
   Iteratively add more classes until the objective increases again
     Note: the shared feature may change as classes are added
 From O(2^C) to O(C^2)
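
As promised above, a minimal sketch of this greedy class-subset search for one joint boosting round, assuming a helper best_stump_for(S) that fits the best shared weak learner for the classes in S and returns (loss, stump). Both the helper and the names are hypothetical; only the O(C^2) greedy structure follows the slides.

def greedy_subset(classes, best_stump_for):
    S, best_loss, best_stump = set(), float('inf'), None
    remaining = set(classes)
    while remaining:
        # Try adding each remaining class; keep the single best addition.
        trials = [(best_stump_for(S | {c}), c) for c in remaining]
        (loss, stump), c = min(trials, key=lambda t: t[0][0])
        if loss >= best_loss:          # objective stopped improving: stop growing S
            break
        S = S | {c}
        best_loss, best_stump = loss, stump
        remaining.remove(c)            # note: the shared feature may change as S grows
    return S, best_stump, best_loss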
#features = O(log #classes)

[Plot: number of features needed to reach 0.95 area under the ROC vs. number of classes; with greedy joint boosting the count grows roughly logarithmically, whereas without sharing it grows much faster. 29 objects, averaged over 20 training sets.]
Outline
   Motivation to choose this paper
   Motivation of this paper
   Basic ideas in boosting
   Joint Boost
   Feature used in this paper
   My results in face recognition
Features used in the paper
 Dictionary
   2000 randomly sampled patches
     Sizes from 4x4 to 14x14
     No clustering
   Each patch is associated with a spatial mask
The candidate features




[Figure: template / position mask pairs. Dictionary of 2000 candidate patches and position masks, randomly sampled from the training images.]
Features
 Building feature vectors (see the sketch below)
   Normalized correlation with each patch to get a response map
   Raise the response to some power
     Large values get even larger and dominate the response (approximates a max operation)
   Use the spatial mask to align the response to the object center (voting)
   Extract the response vector at the object center
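
A rough sketch of how one candidate feature might be computed, assuming numpy/scipy, grayscale float images, and an illustrative power p; the helper names and the simplified normalization are assumptions, not the paper's exact pipeline.

import numpy as np
from scipy.signal import fftconvolve

def patch_response(image, patch):
    """Crude normalized correlation: zero-mean, unit-norm patch correlated with the image."""
    patch = patch - patch.mean()
    patch = patch / (np.linalg.norm(patch) + 1e-8)
    return fftconvolve(image, patch[::-1, ::-1], mode='same')   # correlation via flipped convolution

def feature_value(image, patch, mask, center, p=10):
    resp = patch_response(image, patch)           # 1) correlate the image with the patch
    resp = np.abs(resp) ** p                      # 2) raise to a power: large responses dominate (soft max)
    voted = fftconvolve(resp, mask, mode='same')  # 3) the spatial mask votes responses toward the object center
    return voted[center]                          # 4) read out the value at the object center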
Results
 Multiclass object recognition
   Dataset: LabelMe
   21 objects, 50 samples per object
   500 rounds
 Multiview car recognition
   Train on LabelMe, test on PASCAL
   12 views, 50 samples per view
   300 rounds
[Figure: multiclass results, 70 rounds, 20 training samples per class, 21 objects.]
[Figure: multiview car results, 12 views, 50 samples per class, 300 features.]
Outline
   Motivation to choose this paper
   Motivation of this paper
   Basic ideas in boosting
   Joint Boost
   Feature used in this paper
   My results in face recognition
Simple Experiment
 Main point of this paper
   They claim that shared features help when there are
      many categories, with only a few samples per category.
 Test it!
   Task: face recognition
   "Faces in the Wild" dataset
      Many famous figures
Experiment configuration
 Use a GIST-like feature, but
   Only Gabor responses
   Use a finer grid to gather the histograms
     Faces are aligned in the dataset.
 Feature statistics (see the sketch below)
   8 orientations, 2 scales, 8x8 grid
   1024 dimensions
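
A minimal sketch of such a GIST-like descriptor, assuming a grayscale float image and numpy/scipy; the Gabor parameters (kernel size, sigma, frequencies) are illustrative choices, not the exact values used in the experiment.

import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(freq, theta, size=31):
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    sigma = 0.6 / freq                                     # envelope width tied to the frequency
    env = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))   # Gaussian envelope
    return env * np.cos(2 * np.pi * freq * xr)             # even (cosine) Gabor filter

def gist_like(image, n_orient=8, freqs=(0.1, 0.25), grid=8):
    feats = []
    h, w = image.shape
    for freq in freqs:
        for k in range(n_orient):
            g = gabor_kernel(freq, np.pi * k / n_orient)
            resp = np.abs(fftconvolve(image, g, mode='same'))       # Gabor response magnitude
            for i in range(grid):                                    # average-pool on a grid x grid lattice
                for j in range(grid):
                    cell = resp[i * h // grid:(i + 1) * h // grid,
                                j * w // grid:(j + 1) * w // grid]
                    feats.append(cell.mean())
    return np.asarray(feats)    # 2 scales x 8 orientations x 64 cells = 1024 dimensions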
Experiment
 Training and testing (nearest-neighbor baseline sketched below)
   Find the 50 identities with the most images
   For each identity, randomly select 3 images for training
   The rest are used for testing
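
For reference, a small sketch of the nearest-neighbor baseline, assuming feature matrices train (n_train x d) and test (n_test x d) with integer label arrays; the chi-square form used here is one common variant and is an assumption.

import numpy as np

def nn_accuracy(train, train_y, test, test_y, dist='L1'):
    correct = 0
    for x, y in zip(test, test_y):
        if dist == 'L1':
            d = np.abs(train - x).sum(axis=1)
        elif dist == 'L2':
            d = ((train - x) ** 2).sum(axis=1)
        else:   # chi-square distance between non-negative histograms
            d = ((train - x) ** 2 / (train + x + 1e-10)).sum(axis=1)
        correct += int(train_y[np.argmin(d)] == y)   # label of the nearest training sample
    return correct / len(test_y)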
Nearest neighbor (50 classes, 3 per class)
Chance rate = 0.02

Orientations  Scales  Grid  K  Blur  L1      L2      Chi-sq
8             2       8     3  No    0.1338  0.1033  0.1300
8             2       8     1  No    0.1868  0.1350  0.1681
8             2       6     1  No    0.1651  0.1285  0.1544
8             2       8     1  1.0   0.1822  0.1407  0.1754
8             2       8     1  2.0   0.1677  0.1365  0.1616
Joint Boost: ~80% better than NN
Results on more images
 50 people, 7 images each
 Chance rate = 2%
 Nearest neighbor
   L1 = 0.2856 (0.1868 in the 50/3 setting)
   L2 = 0.2022
   Chi-sq = 0.2596
[Plot: Joint Boost doubles the accuracy of NN. More features are shared; the steps from single to pairwise and from pairwise to joint boosting yield gains of roughly 7%.]
Results on more identities
 100 people, 3 images each
 Chance rate = 1%
 Nearest neighbor
   L1 = 0.1656 (0.1868 in the 50/3 setting)
   L2 = 0.1235
   Chi-sq = 0.1623
Joint Boost is still better than NN, but the improvement is smaller (~60%) than in the previous cases.




[Plot: the performance of single (non-shared) boosting is the same as NN.]
Conclusion
 Joint Boosting indeed works
   Especially when the number of images per class is not too small (otherwise NN is competitive)
 Better performance in the presence of
   Many classes, each with only a few samples
   It introduces regularization that reduces overfitting
 Disadvantages
   Training is slow, O(C^2)
Thanks!
 Any questions?

								