Keyword Generation for Search Engine Advertising
Amruta Joshi*, Yahoo! Research
Rajeev Motwani, Stanford University
* This work was done at Stanford
18 December 2006
Amruta Joshi and Rajeev Motwani, Stanford University
Search Results
Sponsored Search Results
Long Tail
[Figure: long-tail curve of frequency in query logs (y-axis) against queries (x-axis)]
Head: expensive, high-frequency keywords
Tail: target inexpensive, low-frequency keywords instead
Keyword Pricing
Pick the right keywords
Advantages
  More focused audience
  Less competition, easier to get the #1 position
  Cost-effective alternative
Keywords should be
  Highly relevant to the base query
  Nonobvious to guess from the base query
E.g.:
  hawaii vacation  $3
  kona holidays  $0.11
Objective
To generate, with good precision and recall, a large number of keywords that are relevant to the input word, yet nonobvious in nature.
Who’s doing all this?
Large advertisers
SEO companies and small start-ups manage advertising profiles
  E.g.: www.adchemy.com, www.wordtracker.com, http://www.globalpromoter.com
Eventually every advertiser is interested in optimizing his portfolio
Other Techniques …
Meta-tag Spidering:
  Extract Keyword & Description tags from top search hits
  Example of meta-tags for query ‘hawaii travel’
    Relevant: hawaii travel, hawaii vacation, hawaiian islands, hawaii tourism
    Off-topic: hawaii homes, moving to hawaii, hawaii living, hawaii news, living in hawaii, hawaii products
    Irrelevant: sovereignty, volcanoes, sports, music
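The extraction step of meta-tag spidering could be sketched as follows. This is only an illustration using Python's standard `html.parser`; the names `MetaKeywordParser` and `extract_meta_keywords` and the sample page are hypothetical, and fetching the top search hits is left out.

```python
from html.parser import HTMLParser


class MetaKeywordParser(HTMLParser):
    """Collects the comma-separated terms of <meta name="keywords"> tags."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names arrive lowercased; values may be None.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "keywords":
            for term in attr.get("content", "").split(","):
                term = term.strip().lower()
                if term:
                    self.keywords.append(term)


def extract_meta_keywords(html):
    """Return the keyword terms declared in the page's meta tags."""
    parser = MetaKeywordParser()
    parser.feed(html)
    return parser.keywords


# Hypothetical page source standing in for one of the top search hits.
sample_page = """
<html><head>
<meta name="keywords" content="hawaii travel, Hawaii vacation, hawaiian islands">
<meta name="description" content="Plan a trip to Hawaii.">
</head><body></body></html>
"""
print(extract_meta_keywords(sample_page))
```

As the example on the slide shows, terms harvested this way still need filtering, since page authors often list off-topic or irrelevant keywords.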
Other Techniques …
Proximity-based tools
  Pick phrases in the proximity of the given word
  e.g.: family hawaii vacations, discount hawaii vacations
Query-log mining
  Suggest popular queries containing seed keywords
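A minimal sketch of query-log mining, assuming the log is available as a plain list of query strings. The function name `suggest_from_query_log` and the sample log are made up for illustration; real tools are not described in the slides.

```python
from collections import Counter


def suggest_from_query_log(seed, query_log, top_n=5):
    """Return the most frequent logged queries that contain every seed word."""
    seed_words = seed.lower().split()
    seed_norm = " ".join(seed_words)
    counts = Counter()
    for query in query_log:
        words = query.lower().split()
        normalized = " ".join(words)
        # Skip the seed itself; keep queries containing all seed words.
        if normalized != seed_norm and all(w in words for w in seed_words):
            counts[normalized] += 1
    return [query for query, _ in counts.most_common(top_n)]


# Hypothetical query log.
sample_log = [
    "cheap flights", "Cheap flights", "flights to europe",
    "hotels in paris", "flights", "cheap flights",
]
print(suggest_from_query_log("flights", sample_log))
```

Every suggestion produced this way contains the seed words, which is why such techniques tend to score low on nonobviousness.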
Other Techniques
Advertiser-log mining or query co-occurrence based mining
  Exploits co-occurrence in advertiser keyword search logs
  Increases competition!
Directed Relevance Relationships
Word A strongly suggests word B, but the reverse may not hold true
A --x--> B
B --y--> A
x ≠ y
Example:
  eurail --25--> railways
  railways --2--> eurail
Building Context
Characteristic Document
  Build the context of a term using terms found in the proximity of the seed term in the top 50 hits from a search engine for that term
[Figure: seed term "europe" → search engine → characteristic document C]
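One way to picture a characteristic document is as a bag of terms that appear within a few words of the seed term in the retrieved hits. The sketch below assumes the text of the top hits has already been fetched, handles single-word seeds only, and uses a hypothetical window size; the slides do not say how "proximity" was defined.

```python
import re
from collections import Counter


def characteristic_document(seed, hit_texts, window=5):
    """Count terms found within `window` tokens of each seed occurrence."""
    seed = seed.lower()
    context = Counter()
    for text in hit_texts:
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        for pos, token in enumerate(tokens):
            if token != seed:
                continue
            lo = max(0, pos - window)
            hi = min(len(tokens), pos + window + 1)
            # Neighbors on both sides of the seed occurrence.
            for neighbor in tokens[lo:pos] + tokens[pos + 1:hi]:
                if neighbor != seed:
                    context[neighbor] += 1
    return context


# Hypothetical hit texts standing in for the top search results.
sample_hits = [
    "Eurail passes cover railways across Europe",
    "Buy a Eurail pass for European railways",
]
print(characteristic_document("eurail", sample_hits, window=3))
```

A practical version would also drop stop words and cap the number of hits (the slide uses the top 50).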
Building the Graph
TermsNet
  Nodes = terms
  Edges = directed relevance relationships
  Weights = strength of the directed relationship, i.e., the frequency of the destination term in the characteristic document of the source term
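The graph definition above can be sketched with nested dictionaries, assuming each term's characteristic document is given as a mapping from context term to frequency. The helper names `build_termsnet` and `in_neighbors` are illustrative; the eurail/railways weights come from the earlier example, and the other number is made up.

```python
def build_termsnet(characteristic_docs):
    """Build {source: {destination: weight}} from characteristic documents.

    The weight of edge source -> destination is the frequency of the
    destination term in the source term's characteristic document.
    """
    graph = {}
    for source, doc in characteristic_docs.items():
        graph[source] = {
            dest: freq for dest, freq in doc.items()
            if dest != source and freq > 0
        }
    return graph


def in_neighbors(graph, term):
    """Return {source: weight} for every edge pointing at `term`."""
    return {
        source: edges[term]
        for source, edges in graph.items()
        if term in edges
    }


# Characteristic documents as term -> frequency maps (partly hypothetical).
docs = {
    "eurail": {"railways": 25, "europe": 30},
    "railways": {"eurail": 2},
}
graph = build_termsnet(docs)
print(in_neighbors(graph, "railways"))
```

Because edges are directed, looking up suggestions for a query term means walking its in-neighbors, that is, the terms whose context mentions it.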
TermsNet
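[Figure: example TermsNet graph with nodes including europe, railways, eurail, maps, euro, atlas, and schengen, each with a characteristic document C, joined by weighted directed edges (weights shown: 25, 32, 30, 14, 19, 15)]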
Ranking Suggestions
Quality Score incorporates
  Edge weights
  Normalization for common words
For an edge from candidate term x to query term q with weight w_{x,q}:
$$Q(x, q) = \frac{w_{x,q}}{1 + \log\left(1 + \sum_{i} w_{x,i}\right)}$$
where each i is an outneighbor of x
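The quality score can be computed directly from a weighted adjacency map. The sketch below assumes the graph is stored as `{source: {destination: weight}}` and that "log" means the natural logarithm (the slide does not state the base); `quality`, `rank_suggestions`, and the sample weights are illustrative.

```python
import math


def quality(graph, x, q):
    """Q(x, q) = w_{x,q} / (1 + log(1 + sum of x's outgoing edge weights))."""
    out_edges = graph.get(x, {})
    w_xq = out_edges.get(q, 0)
    total_out = sum(out_edges.values())
    return w_xq / (1 + math.log(1 + total_out))


def rank_suggestions(graph, q, top_n=10):
    """Rank the in-neighbors of q by quality score, best first."""
    scored = [
        (quality(graph, x, q), x)
        for x in graph
        if x != q and q in graph[x]
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [x for _, x in scored[:top_n]]


# Hypothetical weights: "the" points at "europe" as strongly as "eurail"
# does, but its large total out-weight marks it as a common word.
sample_graph = {
    "eurail": {"europe": 30, "railways": 25},
    "the": {"europe": 30, "a": 500, "of": 700},
    "europe": {"eurail": 32},
}
print(rank_suggestions(sample_graph, "europe"))
```

With equal edge weights into the query term, the term with the smaller total outgoing weight ranks higher, which is the normalization for common words.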
Ratings
Relevance
  Indicates relevance of the suggested keyword to the seed word
  Given by human editors
  e.g.: For query ‘flights’
    Relevance (‘flights’, ‘cathay pacific’) = 1
    Relevance (‘flights’, ‘cheap flight’) = 1
    Relevance (‘flights’, ‘magazines’) = 0
Nonobviousness
  Indicates nonobviousness of the suggested keyword relative to the seed word
  Calculated as: if no base query word or stem is present in the suggested keyword, Nonobviousness = 1, else 0
  e.g.: For query ‘flights’
    Nonobviousness (‘flights’, ‘cathay pacific’) = 1
    Nonobviousness (‘flights’, ‘cheap flight’) = 0
    Nonobviousness (‘flights’, ‘magazines’) = 1
Used standard Porter stemmer for automating this rating
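The nonobviousness rule can be sketched as below. Note that `crude_stem` is NOT the Porter stemmer used in the work; it is a deliberately simple suffix-stripping stand-in so the example stays self-contained, and it only approximates what a real stemmer would do.

```python
def crude_stem(word):
    """Rough stand-in for a Porter stemmer: strips a few common suffixes."""
    for suffix in ("ing", "es", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def nonobviousness(seed, suggestion):
    """Return 0 if the suggestion shares a word or stem with the seed, else 1."""
    seed_words = set(seed.lower().split())
    seed_stems = {crude_stem(word) for word in seed_words}
    for word in suggestion.lower().split():
        if word in seed_words or crude_stem(word) in seed_stems:
            return 0
    return 1


for keyword in ("cathay pacific", "cheap flight", "magazines"):
    print(keyword, nonobviousness("flights", keyword))
```

On the three examples for the query ‘flights’, this reproduces the ratings listed above.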
Evaluation
Evaluation Measures
  Average Precision: ratio of the number of relevant keywords retrieved to the number of keywords retrieved.
    Indicates quality of results
  Average Recall: the proportion of relevant keywords that are retrieved, out of all relevant keywords available.
    For our experiments, Recall(Ti) = (# retrieved by Ti) / (# retrieved by T1 ∪ T2 ∪ … ∪ Tn)
  Average Nonobviousness: average of all nonobviousness ratings of suggested keywords
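For a single query, the three measures could be computed as in the sketch below; the averages would then be taken over all test queries. The recall function follows the pooled formula as stated on the slide, where the union of all techniques' outputs stands in for the set of available keywords. Function names and sample data are hypothetical.

```python
def precision(retrieved, relevant):
    """Fraction of retrieved keywords that are rated relevant."""
    if not retrieved:
        return 0.0
    return sum(1 for keyword in retrieved if keyword in relevant) / len(retrieved)


def pooled_recall(results_by_technique, technique):
    """Recall(Ti) = # retrieved by Ti / # retrieved by the union of all Tj."""
    pool = set().union(*results_by_technique.values())
    if not pool:
        return 0.0
    return len(set(results_by_technique[technique])) / len(pool)


def average_nonobviousness(ratings):
    """Mean of the 0/1 nonobviousness ratings of the suggested keywords."""
    return sum(ratings) / len(ratings) if ratings else 0.0


# Hypothetical outputs of two techniques for one query.
results = {"T1": ["a", "b", "c"], "T2": ["c", "d"]}
print(precision(results["T1"], {"a", "c"}))
print(pooled_recall(results, "T1"), pooled_recall(results, "T2"))
print(average_nonobviousness([1, 0, 1, 1]))
```

Pooling avoids having to enumerate every relevant keyword in existence, at the cost of making recall relative to the set of techniques compared.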
Output for query ‘flights’
Co-occurrence Based: Airfare airfares airlines Cyprus goa flys holidays trains aer aeroflot aeromexico aircanada alicante bwia heathrow icelandair bookings Consolidator
Query Log: Flights cheap flights airline flights cheap airline flights cheap international flights flights to europe business class flights flights new york australia flights cheap flights to europe cheap flights to orlando cheap flights las vegas track flights flights florida flights europe las flights cheap flights to australia
Meta-Tag Spidering: real time flight arrivals airfare flights flight map delays cruises us flight arrivals flight arrivals state map flight arrival flight cancellations arrival times arrival delays flight departure vacation packages street map
Meta-Crawler Lists: air travel airline discount tickets airline fares airline tickets airline tickets under 100 american airlines bargain flights bmibaby british airways british airways flights british airways home page british airways timetable british midland budget airline
Query-log Mining: flight cheap flight las vegas flight flight tracker flight to orlando flight to london flight to new york airline flight flight to los angeles flight 93 flight to fort lauderdale flight of the phoenix flight to honolulu flight to chicago flight to miami
TermsNet: cheap flights airline flights air newzealand flight prices bmibaby globespan low cost airlines united airlines airlineconsolidators charter flights airfare flight reservations cathay pacific british midland airways discount airfare flight tickets jet2 travelocity
Avg. Precision, Recall, Nonobviousness
[Figure: bar chart of Avg. Precision, Avg. Recall, and Avg. Nonobviousness for Query Cooccurrence, Query-Log Mining, Meta-Tag Spidering, MetaCrawler Lists, Query Logs with recency, and TermsNet]
Evaluation Measures
F-measures
  Measure of overall performance
  Harmonic mean of:
    F(PR): Avg. Precision & Avg. Recall
    F(RN): Avg. Recall & Avg. Nonobviousness
    F(PN): Avg. Precision & Avg. Nonobviousness
    F(PRN): Avg. Precision, Avg. Recall & Avg. Nonobviousness
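Since each F-measure is a harmonic mean of two or three averaged scores, one small helper covers all four variants. The sketch below uses made-up scores rather than the reported results, and treating a zero score as yielding an F-measure of zero is an assumption.

```python
def f_measure(*scores):
    """Harmonic mean of the given scores; 0.0 if any score is zero."""
    if not scores or any(score <= 0 for score in scores):
        return 0.0
    return len(scores) / sum(1.0 / score for score in scores)


# Hypothetical averaged scores for one technique.
avg_precision, avg_recall, avg_nonobviousness = 0.9, 0.3, 0.6

print("F(PR)  =", f_measure(avg_precision, avg_recall))
print("F(RN)  =", f_measure(avg_recall, avg_nonobviousness))
print("F(PN)  =", f_measure(avg_precision, avg_nonobviousness))
print("F(PRN) =", f_measure(avg_precision, avg_recall, avg_nonobviousness))
```

The harmonic mean is dominated by the weakest component, so a technique cannot score well on F(PRN) by excelling at precision alone.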
F-Measures
[Figure: bar chart of F(PR), F(RN), F(PN), and F(PRN) for Query Cooccurrence, Query-Log Mining, Meta-Tag Spidering, MetaCrawler Lists, Query Logs with recency, and TermsNet; y-axis from 0 to 0.45]
Quality of Suggestions over different intervals of ranked results
[Figure: Avg. Precision and Avg. Nonobviousness over the number of top suggestions; x-axis: top n keyword suggestions (0 to 600), y-axis: 0 to 1]
Figure 2: Quality of keywords over different ranked intervals
Future Directions
Incorporate keyword frequency in ranking suggestions
Incorporate keyword pricing information in ranking suggestions
Applications to other domains
  Find related movies, papers, people
Thank You!
Questions? amrutaj@cs.stanford.edu