# Learning Image Similarity from Flickr Groups Using Stochastic


```
Gang Wang, Derek Hoiem, David Forsyth
INTRODUCTION

APPROACH
(implementation details)

EXPERIMENTS

CONCLUSION
Using online photo-sharing sites → Flickr (groups)
Determine which images are similar, and how they are similar

Learn these group membership likelihoods
Because learning the categories would take a long time:
Propose a new method for stochastic learning of SVMs
using the Histogram Intersection Kernel (HIK):
SIKMA
Combines the ideas of [14] and [18]
Related work: algorithm classes
(for training very large-scale kernel SVMs)
i)    Exploit the sparseness of the Lagrange multipliers → SMO [22]

without touching every example
http://0rz.tw/BDHWJ

Kivinen [14] → method that applies to kernel machines
Maji [18] → very fast evaluation of the histogram
intersection kernel
Flickr provides an organizational structure:
how people like to group images

The SIKMA classifier allows efficient and accurate
learning of these categories

This property generalizes well,
even when the test dataset was not obtained from Flickr
Suppose we have a list of
training examples

For the test example u
The classification score
Approximate the gradient by replacing the sum over all
examples (batch) with a sum over some subset, chosen at random.
It is usual to use a single example.

New decision
function

It is expensive to evaluate f_{t-1}. The NORMA algorithm [14] keeps a
fixed-size set of support vectors by dropping the oldest ones.

Doing so comes at a considerable cost in accuracy!
D is feature dimension
                               SIKMA     Conventional SVM solver
Computational complexity       O(TMD)    O(T^2 D)
Space cost                     O(MD)     O(T^2)
Evaluation for each example    O(D)

T: # of training examples
M: # of quantization bins
D: # of feature dimensions
Measuring image similarity

   Use a simple Euclidean distance between
the SVM outputs.

   Since the groups have names, we can also
perform text-based queries
(get images like “people dancing”) and
determine how two images are similar
Use four types of features:

   SIFT feature
Detect and describe local patches
   Gist feature
960-dimensional Gist descriptor
   Color feature
RGB space, values range from 1 to 512
The whole image is represented as a 256-dimensional
vector
Combine the outputs of these four classifiers into a final
prediction on a validation data set
For 103 Flickr categories, using

15,000 ~ 30,000 positive images and
60,000 negative images.

The average AP over these categories is
0.433
Select the top five negative examples and five randomly chosen positive
examples from among the top 50 ranked images

y_i is 1 if it is positive, otherwise 0
Each Flickr category can be described with several words, so we can
support text-based queries.

Input a word query → find the Flickr groups whose descriptions contain
that word

Test this on the Corel data set,
with two queries: “airplane” and
“sunset”.
SIKMA: an algorithm to quickly train an SVM with the histogram
intersection kernel using tens of thousands of training examples

Two images that are likely to belong to the same Flickr groups are
considered similar.

Experimental results show that matching with prediction features works
better than matching with visual features

```
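The stochastic update outlined in the slides (replace the batch gradient with the gradient on a single randomly chosen example, then update the decision function) can be sketched as follows. This is a minimal NORMA-style kernel SVM with hinge loss and the histogram intersection kernel, not the authors' SIKMA implementation: it stores every support vector and evaluates the kernel directly. The function names, the step size `eta`, the regularization weight `lam`, and the toy histograms are all assumptions made for illustration.

```python
import random

def hik(x, z):
    """Histogram intersection kernel: sum of element-wise minima."""
    return sum(min(a, b) for a, b in zip(x, z))

def decision(support, alphas, u):
    """Classification score f(u) = sum_i alpha_i * K(x_i, u)."""
    return sum(a * hik(s, u) for s, a in zip(support, alphas))

def train_stochastic_hik_svm(X, y, lam=0.01, eta=0.1, epochs=5, seed=0):
    """One-example-at-a-time training (hinge loss, no bias term)."""
    rng = random.Random(seed)
    support, alphas = [], []
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            # Score under the previous decision function f_{t-1}.
            score = decision(support, alphas, X[i])
            # The regularizer shrinks all existing coefficients.
            alphas = [(1.0 - eta * lam) * a for a in alphas]
            # A margin violation adds the example as a new support vector.
            if y[i] * score < 1.0:
                support.append(X[i])
                alphas.append(eta * y[i])
    return support, alphas

# Hypothetical 4-bin histograms: class +1 has mass in the first bins,
# class -1 in the last bins.
X = [[0.9, 0.1, 0.0, 0.0], [0.8, 0.2, 0.0, 0.0],
     [0.0, 0.0, 0.2, 0.8], [0.0, 0.1, 0.1, 0.8]]
y = [1, 1, -1, -1]

support, alphas = train_stochastic_hik_svm(X, y)
print(decision(support, alphas, [0.85, 0.15, 0.0, 0.0]))  # positive score
print(decision(support, alphas, [0.0, 0.05, 0.15, 0.8]))  # negative score
```

Each call to `decision` touches every stored support vector, which is the cost that makes evaluating f_{t-1} expensive as training proceeds.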
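The O(MD) space cost and O(D) evaluation cost in the complexity comparison follow from the fact that the histogram intersection kernel decomposes over feature dimensions, so the score can be read from a table with M quantization bins per dimension. The sketch below illustrates that idea under simplifying assumptions: features lie in [0, 1], the query is rounded to the nearest bin, and the table is built from an explicit list of support vectors. The names `build_table` and `evaluate_with_table` are hypothetical, and this is not the authors' code.

```python
def hik_svm_direct(support, alphas, u):
    """Direct evaluation: cost grows with the number of support vectors."""
    return sum(a * sum(min(s_d, u_d) for s_d, u_d in zip(s, u))
               for s, a in zip(support, alphas))

def build_table(support, alphas, num_bins):
    """table[m][d] = sum_i alpha_i * min(x_i[d], grid[m]); M x D entries."""
    dims = len(support[0])
    grid = [m / (num_bins - 1) for m in range(num_bins)]
    return [[sum(a * min(s[d], g) for s, a in zip(support, alphas))
             for d in range(dims)]
            for g in grid]

def evaluate_with_table(table, u):
    """O(D) evaluation: one table lookup per feature dimension."""
    num_bins = len(table)
    total = 0.0
    for d, value in enumerate(u):
        # Round the feature value to the nearest quantization bin.
        m = min(num_bins - 1, max(0, round(value * (num_bins - 1))))
        total += table[m][d]
    return total

# Hypothetical support vectors and coefficients.
support = [[0.1, 0.7, 0.3, 0.9], [0.6, 0.2, 0.8, 0.4], [0.5, 0.5, 0.5, 0.5]]
alphas = [0.5, -0.3, 0.2]
table = build_table(support, alphas, num_bins=11)

u = [0.2, 0.0, 1.0, 0.5]  # values that fall exactly on the 11-bin grid
print(evaluate_with_table(table, u), hik_svm_direct(support, alphas, u))
```

For a query whose values lie on the grid, the lookup agrees with the direct sum up to floating-point rounding; for other values the rounding step introduces a quantization error that shrinks as the number of bins grows.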
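The similarity measure described in the slides, a Euclidean distance between the SVM outputs of two images, can be illustrated with a short ranking function. The score vectors below are made up, with one entry per Flickr group classifier, and the function name is hypothetical.

```python
import math

def rank_by_prediction_distance(query_scores, database_scores):
    """Return database indices sorted by Euclidean distance to the query."""
    dists = [math.dist(query_scores, s) for s in database_scores]
    return sorted(range(len(database_scores)), key=lambda i: dists[i])

# Hypothetical outputs of two group classifiers for three database images.
database = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
ranking = rank_by_prediction_distance([1.0, 0.0], database)
print(ranking)  # [0, 2, 1]
```

Images whose classifier outputs agree across many groups end up close together, which matches the idea that two images likely to belong to the same groups are considered similar.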