					A Relevance Model for
 Web Image Search
Cheng Thao and Ethan V. Munson
 Multimedia Software Laboratory
University of Wisconsin-Milwaukee
How to Search the Web for Images?

 Commercial search engines use HTML
 source to choose relevant images
 Google’s image search FAQ says:
  “Google analyzes the text on the page
  adjacent to the image, the image caption and
  dozens of other factors to determine the
  image content.”
 But what is really being done?
   Which text? What caption? Which other factors?
          Prior Research
Several systems use text to find candidate
images for further image processing
  WebSeer, Diogenes, WebPiction
Shen et al and Lu and William use text
features combined by heuristic weights
All of these systems use heuristic
relevance models
         Prior Research (2)
MARIE-4 uses an empirically-derived model to
identify a likely image caption
  Image is considered relevant when caption matches
  text query
Tsymbalenko and Munson (WDA 2001) built a
search engine that accepted single word queries
  Found image filename, page title and ALT text to
  have useful levels of precision
  No general relevance model, small data sample
  Some methodological problems
           A New Study
Construct relevance model for Web image
search by statistical analysis
Candidate pages retrieved by commercial
text search engine
Multiple human raters set standard for relevance
Constructed model will try to match human ratings
           Methodology
24 (mostly) two-word queries
  3 queries each from 8 categories
Google returned 1000 links for each query
100 valid links chosen at random
  2400 pages total
  Contained 5806 “interesting” images
    Larger than 100-by-100 pixels
    0.5 <= aspect ratio <= 2.0
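The "interesting" image filter above can be sketched as a small predicate. This is a minimal illustration, not the authors' code: the function name `is_interesting` is hypothetical, "larger than 100-by-100" is assumed to mean both dimensions strictly exceed 100 pixels, and the aspect ratio is assumed to be width divided by height.

```python
def is_interesting(width: int, height: int) -> bool:
    """Return True if an image passes the size and aspect-ratio filter.

    Assumptions (not stated in the slides): both dimensions must be
    strictly greater than 100 pixels, and aspect ratio = width / height.
    """
    if width <= 100 or height <= 100:
        return False
    aspect_ratio = width / height
    return 0.5 <= aspect_ratio <= 2.0


# A photo-sized image passes; a tall, narrow banner does not.
print(is_interesting(200, 150))
print(is_interesting(120, 600))
```

Because the accepted range is symmetric (0.5 to 2.0), defining the ratio as height divided by width would select the same images.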
                 Query Selection
24 queries (8 categories)
   Famous people
       “bill gates”, “george bush”, “britney spears”
   Less famous people
       “john white”, “michael brown”, “william black”
   Famous places
       “new york”, “michigan lake”, “yellowstone park”
   Less famous places
       “spokane”, “burlington vermont”, “haw river”
       “new year”, “thanksgiving”, “halloween”
       “happy child”, “sad woman”, “burning house”
       “raining”, “volcano erupt”, “bomb explode”
       “eiffel tower”, “vietnam memorial”, “statue liberty”
          Methodology (2)
Pages cleaned and analyzed for matching
text in 53 HTML features
  Phrase matching, except in URL features
Each image rated for relevance by 3
human raters
  Majority vote
Relevance model constructed by forward-
stepwise logistic regression
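The majority-vote step for the three raters can be sketched as follows. The function name `majority_relevant` and the boolean encoding of a rating are assumptions made for illustration; the slides do not say how ratings were recorded.

```python
def majority_relevant(votes):
    """Return True if more than half of the raters judged the image relevant.

    votes: a sequence of booleans, one per human rater (three in the study).
    """
    return 2 * sum(bool(v) for v in votes) > len(votes)


# Two of three raters say "relevant", so the image is labeled relevant.
print(majority_relevant([True, True, False]))
```

With an odd number of raters there are no ties, so every image receives a definite label.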
     HTML features studied
53 HTML features
  Page level
  Image level
  Text formatting
                                   HTML features
Clue ID  Name               Clue ID  Name               Clue ID  Name
      1  PageFile                19  CellText                37  H2_above
      2  PagePath                20  CellAbove               38  H3_above
      3  PageHost                21  CellBelow               39  H4_above
      4  PageTitle               22  CellRight               40  H5_above
      5  MetaKeywords            23  CellLeft                41  H6_above
      6  MetaDescription         24  Column                  42  H1_below
      7  ImageFile               25  Row                     43  H2_below
      8  ImagePath               26  ColumnHeading           44  H3_below
      9  ImageHost               27  RowHeading              45  H4_below
     10  ImageTitleAttr          28  RowTitleAttr            46  H5_below
     11  ImageIdAttr             29  RowIdAttr               47  H6_below
     12  ImageNameAttr           30  RowNameAttr             48  Bold
     13  ImageAltAttr            31  TableCaption            49  Emphasis
     14  LinkText                32  TableSummary            50  Italic
     15  ObjectText              33  TableTitleAttr          51  Strong
     16  CellTitleAttr           34  TableIdAttr             52  Underline
     17  CellIdAttr              35  TableNameAttr           53  Body
     18  CellNameAttr            36  H1_above
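As an illustration of how a few of these clues might be computed, the sketch below extracts three of them (PageTitle, ImageAltAttr, ImageFile) with Python's standard `html.parser`. It is not the authors' extractor. The helper names are hypothetical, phrase matching is assumed to be a case-insensitive match on whole words, and for the URL-derived feature (where the slides say phrase matching was not used) it is assumed that every query word must appear somewhere in the filename.

```python
import posixpath
import re
from html.parser import HTMLParser
from urllib.parse import urlparse


def normalize(text):
    """Lowercase and reduce text to space-separated alphanumeric tokens."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def phrase_match(query, text):
    """True if the whole query phrase occurs as consecutive words in text."""
    q = normalize(query)
    return bool(q) and f" {q} " in f" {normalize(text)} "


def url_match(query, url_part):
    """Assumed rule for URL features: each query word occurs as a substring."""
    words = normalize(query).split()
    target = normalize(url_part)
    return bool(words) and all(w in target for w in words)


class ClueParser(HTMLParser):
    """Collects the page title and the (src, alt) pair of every <img>."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
        elif tag == "img":
            a = dict(attrs)
            self.images.append((a.get("src") or "", a.get("alt") or ""))

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def extract_clues(html, query):
    """Return one dict of boolean clue values per image on the page."""
    parser = ClueParser()
    parser.feed(html)
    rows = []
    for src, alt in parser.images:
        image_file = posixpath.basename(urlparse(src).path)
        rows.append({
            "PageTitle": phrase_match(query, parser.title),
            "ImageAltAttr": phrase_match(query, alt),
            "ImageFile": url_match(query, image_file),
        })
    return rows


page = ('<html><head><title>Bill Gates photos</title></head><body>'
        '<img src="/img/bill_gates.jpg" alt="portrait">'
        '<img src="logo.gif" alt="Bill  Gates at work"></body></html>')
for clues in extract_clues(page, "bill gates"):
    print(clues)
```

Each image yields one row of binary clue values, which is the shape of input a logistic regression over these features would need.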
                         Feature Precision

7-image filename
13-image alt attribute
14-link text
19-25 table cells
42-43 heading
                        Feature Recall

4-document title
13-alt attribute
14-anchor text
19-cell text
24-25 column and rows
53-body text
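The slides do not define per-feature precision and recall. One plausible reading, sketched here as an assumption, treats each feature as a one-clue classifier: precision is the share of images where the clue matched that raters judged relevant, and recall is the share of relevant images where the clue matched. The function name `feature_precision_recall` is hypothetical.

```python
def feature_precision_recall(matched, relevant):
    """Precision and recall of a single binary clue.

    matched:  per-image booleans, True if the query matched in this feature.
    relevant: per-image booleans, True if the raters judged the image relevant.
    """
    both = sum(1 for m, r in zip(matched, relevant) if m and r)
    n_matched = sum(1 for m in matched if m)
    n_relevant = sum(1 for r in relevant if r)
    precision = both / n_matched if n_matched else 0.0
    recall = both / n_relevant if n_relevant else 0.0
    return precision, recall


# Toy data: the clue matched on two images, one of which is relevant,
# and three images are relevant overall.
print(feature_precision_recall([True, True, False, False],
                               [True, False, True, True]))
```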
         Logistic Regression
Non-linear regression technique
Models the probability of an event based on k
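independent variables x1..xk as

$$P(\text{event}) = \frac{1}{1 + e^{-(B_0 + B_1 x_1 + \cdots + B_k x_k)}}$$

(B0 is the intercept term)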
Coefficients B1..Bk are computed iteratively using the
maximum-likelihood method
Independent variables were chosen using forward
stepwise method
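For reference, the standard logistic model turns a weighted sum of the independent variables into a probability through the function 1 / (1 + e^(-z)). The sketch below evaluates it for one image. The intercept, coefficient values, and function name are made up for illustration and are not the fitted values from the study.

```python
import math


def predict_probability(intercept, coefficients, x):
    """Logistic model: P = 1 / (1 + exp(-(B0 + B1*x1 + ... + Bk*xk)))."""
    z = intercept + sum(b * xi for b, xi in zip(coefficients, x))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients for two binary clues, both matched (x = [1, 1]).
print(predict_probability(-2.0, [1.5, 0.8], [1, 1]))
```

With binary clues, each coefficient is the change in log-odds of relevance when that clue matches.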
Selecting independent variables
 Forward stepwise method
  Start with empty model
  Add independent variables one at a time
  An independent variable is chosen if it causes
  the maximal improvement in the model
  according to some goodness-of-fit statistic
  Uses -2LL (negative 2 times the log-
  likelihood) goodness-of-fit statistic
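The forward stepwise procedure can be sketched in pure Python as below. This is an illustration under stated assumptions, not the authors' procedure: the fit uses plain gradient ascent rather than whatever statistics package was used, and the stopping rule (stop when the best candidate lowers -2LL by less than 3.84, the 5% chi-square critical value for one degree of freedom) is assumed, since the slides do not give one. All function names are hypothetical.

```python
import math


def _prob(beta, row):
    """Logistic probability for one row; z is clamped to avoid overflow."""
    z = beta[0] + sum(b * v for b, v in zip(beta[1:], row))
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, iterations=2000, rate=0.5):
    """Maximum-likelihood fit by gradient ascent; returns [B0, B1, ..., Bk]."""
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(iterations):
        grad = [0.0] * (k + 1)
        for row, target in zip(X, y):
            err = target - _prob(beta, row)
            grad[0] += err
            for j, v in enumerate(row):
                grad[j + 1] += err * v
        beta = [b + rate * g / n for b, g in zip(beta, grad)]
    return beta


def neg2_log_likelihood(X, y, beta):
    """-2LL goodness-of-fit statistic (lower is better)."""
    total = 0.0
    for row, target in zip(X, y):
        p = min(max(_prob(beta, row), 1e-12), 1 - 1e-12)
        total += math.log(p) if target else math.log(1 - p)
    return -2.0 * total


def forward_stepwise(X, y, names, min_improvement=3.84):
    """Start empty; repeatedly add the variable that lowers -2LL the most."""
    def score(cols):
        sub = [[row[c] for c in cols] for row in X]
        return neg2_log_likelihood(sub, y, fit_logistic(sub, y))

    selected = []
    best = score(selected)  # intercept-only model
    while len(selected) < len(names):
        candidates = [c for c in range(len(names)) if c not in selected]
        new_best, choice = min((score(selected + [c]), c) for c in candidates)
        if best - new_best < min_improvement:
            break  # assumed stopping rule; see the text above
        selected.append(choice)
        best = new_best
    return [names[c] for c in selected]


# Toy data: clue A tracks relevance (80% agreement); clue B is unrelated.
rows = ([(1, 1, 1)] * 8 + [(1, 0, 1)] * 8 + [(1, 1, 0)] * 2 + [(1, 0, 0)] * 2 +
        [(0, 1, 1)] * 2 + [(0, 0, 1)] * 2 + [(0, 1, 0)] * 8 + [(0, 0, 0)] * 8)
X = [[a, b] for a, b, _ in rows]
y = [t for _, _, t in rows]
print(forward_stepwise(X, y, ["A", "B"]))
```

On the toy data, adding A lowers -2LL from about 55.5 to about 40.0, while B adds essentially nothing, so only A enters the model.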
Feature Frequency
                  Frequency-based Analysis

4-document title
13-alt attribute
14-anchor text
19-cell text
24-25 column and rows
53-body text
Relevance Model (13 independent variables)
           Model Quality
Classification Table

  Recall: 36.7%
  Precision: 66.0%
Variance accounted for: 22–33% (approx.)
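Recall and precision follow directly from a classification table. The sketch below builds such a table from predicted probabilities and human labels, then derives both measures. The 0.5 cutoff and the function names are assumptions; the slides do not state the cutoff used.

```python
def classification_table(probabilities, labels, cutoff=0.5):
    """Count true/false positives and negatives at the given cutoff."""
    table = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for p, relevant in zip(probabilities, labels):
        if p >= cutoff:
            table["tp" if relevant else "fp"] += 1
        else:
            table["fn" if relevant else "tn"] += 1
    return table


def precision_and_recall(table):
    """Precision = tp / (tp + fp); recall = tp / (tp + fn)."""
    predicted_pos = table["tp"] + table["fp"]
    actual_pos = table["tp"] + table["fn"]
    precision = table["tp"] / predicted_pos if predicted_pos else 0.0
    recall = table["tp"] / actual_pos if actual_pos else 0.0
    return precision, recall


# Toy example with five images.
table = classification_table([0.9, 0.8, 0.3, 0.6, 0.2],
                             [True, False, True, True, False])
print(table)
print(precision_and_recall(table))
```

Lowering the cutoff would generally trade precision for recall.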
       Overfitting Analysis
Regression models may “overfit”
idiosyncratic qualities of data sample
Effect can be estimated by sample splitting
We did this (details on request) and found
  Recall of 35.5%
  Precision of 64.4%
Overfitting accounts for 1.2 percentage points of the recall
and 1.6 percentage points of the precision
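The overfitting estimate is the gap between the full-sample figures and the split-sample figures, in percentage points. The sketch below shows a simple random split and the gap computed from the numbers reported on the slides. The split procedure, the seed, and the function names are assumptions; the authors' exact splitting method is not described.

```python
import random


def split_sample(items, seed=0):
    """Randomly split items into two halves (assumed procedure)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def overfitting_gap(full_sample, held_out):
    """Difference in percentage points between the two sets of scores."""
    return {k: round(full_sample[k] - held_out[k], 1) for k in full_sample}


# Figures reported on the slides (percent).
full_sample = {"recall": 36.7, "precision": 66.0}
held_out = {"recall": 35.5, "precision": 64.4}
print(overfitting_gap(full_sample, held_out))
```

In a split-sample check, the model is fitted on one half and scored on the other, so the held-out scores are not inflated by quirks of the fitting data.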
George Bush Query
Burning House Query
Britney Spears Query
Created an empirically-based relevance
model for Web images
Most important features are: image filename,
page title, and page filename
Confirms our earlier research
           Future Work
Standard text IR techniques including
stemming and synonymy
Richer representations of HTML features
Analysis of CSS rules
Differences between query categories
More queries
More categories of queries
 Cheng Thao & Ethan V. Munson
 Multimedia Software Laboratory
          Dept. of EECS
University of Wisconsin-Milwaukee
           (Otto Project)
