Learning Image Similarity from Flickr Groups Using Stochastic

					Gang Wang   Derek Hoiem   David Forsyth

(implementation details)


Using online photo sharing sites → Flickr (groups)
Determine which images are similar and how they are similar

  Learn these group membership likelihoods
  Because of the time it would take to learn the categories,
  propose a new method for stochastic learning of SVMs
        using the Histogram Intersection Kernel (HIK)
               Combines [14] and [18]
Related work: algorithm classes
(for training very large-scale kernel SVMs)
i)    Exploit the sparseness of the Lagrange multipliers → SMO [22]

ii)   Use stochastic gradient descent
                         without touching every example

Kivinen [14] → method applies to kernel machines
Maji [18] → very fast evaluation of a histogram
                intersection kernel
Flickr provides an organizational structure
              How people like to group images

The SIKMA classifier allows efficient and accurate
      learning of these categories

This property generalizes well,
 even when the test dataset was not obtained from Flickr
Suppose we have a list of
training examples

For the test example u
The classification score
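
A minimal sketch of such a score, assuming the usual kernel-machine form f(u) = sum_t alpha_t K(u, x_t) with the histogram intersection kernel K(u, x) = sum_d min(u_d, x_d). The function names `hik` and `score` and the toy histograms are illustrative assumptions, not taken from the slides.

```python
def hik(u, x):
    """Histogram intersection kernel: sum of element-wise minima."""
    return sum(min(a, b) for a, b in zip(u, x))


def score(u, support_vectors, alphas):
    """Kernel expansion f(u) = sum_t alpha_t * K(u, x_t)."""
    return sum(a * hik(u, x) for a, x in zip(alphas, support_vectors))


# Toy 3-bin histograms with made-up weights (illustration only).
xs = [[0.5, 0.25, 0.25], [0.0, 0.5, 0.5]]
alphas = [1.0, -1.0]
u = [0.5, 0.5, 0.0]
print(score(u, xs, alphas))
```

Each call to `score` costs one kernel evaluation per stored example, which is why the number of support vectors matters for speed.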
      Approximate the gradient by replacing the sum over all
  examples (batch) with a sum over a subset chosen at random.
             It is usual to consider a single example.

New decision

  It is expensive to calculate f_{t-1}. The NORMA algorithm [14] keeps a set of
  support vectors of fixed length by dropping the oldest ones.

  Doing so comes at a considerable cost in accuracy!
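
The single-example update and the fixed-length truncation can be sketched as follows. This is a hedged illustration in the spirit of NORMA [14] with a hinge loss, not the SIKMA update itself; the step size `eta`, the regularization weight `lam`, the buffer length, and the toy data are all assumptions made for the example.

```python
from collections import deque


def hik(u, x):
    """Histogram intersection kernel: sum of element-wise minima."""
    return sum(min(a, b) for a, b in zip(u, x))


def norma_step(model, x, y, eta=0.1, lam=0.01):
    """One stochastic update on a single example (x, y), with y in {-1, +1}.

    `model` is a deque of (alpha, support_vector) pairs.
    """
    # f_{t-1}(x): one kernel evaluation per stored support vector.
    f_prev = sum(a * hik(x, sv) for a, sv in model)
    # Regularization shrinks every existing coefficient.
    shrunk = [(a * (1.0 - eta * lam), sv) for a, sv in model]
    model.clear()
    model.extend(shrunk)
    # Hinge loss: only a margin violation adds a new support vector.
    if y * f_prev < 1.0:
        model.append((eta * y, x))  # a full deque silently drops its oldest entry
    return f_prev


# Keep at most 2 support vectors (hypothetical budget).
model = deque(maxlen=2)
data = [([1.0, 0.0], +1), ([0.0, 1.0], -1), ([0.5, 0.5], +1)]
for x, y in data:
    norma_step(model, x, y)
print(len(model))
```

After the third example the first support vector has been discarded, which is the source of the accuracy loss noted above: the dropped term no longer contributes to the decision function.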
                        SIKMA       Conventional SVM

Computational cost      O(TMD)      O(T^2 D)

Space cost              O(MD)       O(D) is evaluation
                                    for each example

T: number of training examples
M: number of quantization bins
D: number of feature dimensions
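
One way to see where costs of this shape can come from: with the intersection kernel, the score splits into a sum of one-dimensional functions, each of which can be tabulated at M quantization levels. The sketch below assumes features scaled to [0, 1] and nearest-level lookup; the class name `TableHIK` and these choices are illustrative assumptions, and the quantization actually used in [18] or by SIKMA may differ.

```python
class TableHIK:
    """Per-dimension tables h[d][m] = sum_t alpha_t * min(s_m, x_t[d])."""

    def __init__(self, dims, bins):
        self.bins = bins
        # Quantization levels s_0..s_{M-1}, evenly spaced in [0, 1].
        self.grid = [m / (bins - 1) for m in range(bins)]
        # M values per dimension: O(M*D) space, independent of T.
        self.h = [[0.0] * bins for _ in range(dims)]

    def add(self, alpha, x):
        """Fold one weighted example into the tables: O(M*D) time."""
        for d, xd in enumerate(x):
            row = self.h[d]
            for m, s in enumerate(self.grid):
                row[m] += alpha * min(s, xd)

    def score(self, u):
        """Approximate f(u) with one table lookup per dimension: O(D) time."""
        total = 0.0
        for d, ud in enumerate(u):
            m = round(ud * (self.bins - 1))  # nearest quantization level
            total += self.h[d][m]
        return total


table = TableHIK(dims=2, bins=5)
table.add(1.0, [0.5, 0.25])
table.add(-1.0, [0.0, 0.5])
print(table.score([0.5, 0.5]))
```

Touching T examples at O(MD) each gives O(TMD) overall, and no support vectors need to be stored, only the tables.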
Measuring image similarity

   Use a simple Euclidean distance between
    the SVM outputs.

   Since the groups have names, we can also
    perform text-based queries
    (get images like "people dancing") and
    determine how two images are similar
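
A minimal sketch of this similarity measure, assuming each image is represented by a vector holding one SVM output per Flickr group. The function names, image ids, and output values are made up for illustration.

```python
import math


def distance(p, q):
    """Euclidean distance between two vectors of SVM outputs."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def rank_by_similarity(query, database):
    """Return image ids sorted from most to least similar to the query."""
    return sorted(database, key=lambda name: distance(query, database[name]))


# Hypothetical SVM outputs for two groups, per image.
database = {
    "a": [0.9, -0.8],
    "b": [-0.7, 0.6],
    "c": [0.8, -0.5],
}
query = [1.0, -1.0]
print(rank_by_similarity(query, database))
```

Because each coordinate corresponds to a named group, the coordinates on which two images agree also indicate how they are similar.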
    Use four types of features:

   SIFT feature
           Detect and describe local patches
   Gist feature
            960-dimensional Gist descriptor
   Color feature
         RGB space, values range from 1 to 512
   Gradient feature
        The whole image is represented as a
                  256-dimensional vector
                   Combine the outputs of these four classifiers into a final
                            prediction on a validation data set
For 103 Flickr categories, using

15,000 ~ 30,000 positive images and
   60,000      negative images.

The average AP over these categories is
Select the top five negative examples and five randomly chosen positive
examples from among the top 50 ranked images

            y_i is 1 if example i is positive, otherwise 0
A Flickr category can be described with several words, so we can support
text-based queries.

Input a word query → find the Flickr group whose description contains
                     that word
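
The lookup step can be sketched as below. The group names, descriptions, and scores are invented for illustration, and ranking images by the matched groups' classifier scores is an assumption about how the lookup would be used, not something the slides spell out.

```python
def groups_for_query(word, group_descriptions):
    """Return the groups whose description contains the query word."""
    word = word.lower()
    return [
        group
        for group, description in group_descriptions.items()
        if word in description.lower().split()
    ]


def rank_images(word, group_descriptions, image_scores):
    """Rank images by their summed classifier scores for the matched groups."""
    groups = groups_for_query(word, group_descriptions)
    return sorted(
        image_scores,
        key=lambda image: sum(image_scores[image][g] for g in groups),
        reverse=True,
    )


# Hypothetical group descriptions and per-image classifier scores.
descriptions = {
    "aircraft": "airplane and aircraft photos",
    "dance": "people dancing",
}
scores = {
    "img1": {"aircraft": 0.9, "dance": -0.3},
    "img2": {"aircraft": -0.2, "dance": 0.8},
}
print(groups_for_query("airplane", descriptions))
print(rank_images("airplane", descriptions, scores))
```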

Test this on the Corel data set,
with two queries "airplane" and
SIKMA, an algorithm to quickly train an SVM with the histogram
intersection kernel using tens of thousands of training examples

Two images that are likely to belong to the same Flickr groups are
considered similar.

Experimental results show that matching with prediction features works better
than matching with visual features
