PageRank and HITS

Shared by: Zach McClure
Posted: 4/30/2009
Internet Information 2006/2007
Web retrieval (3)
March 5, 2007
Valentin Jijkoun, Maarten de Rijke
ISLA, University of Amsterdam
http://ilps.science.uva.nl/Teaching/II0607

PageRank and HITS
  ✓ PageRank: repeated here for comparison only
  - HITS: hypertext induced topic search

PageRank
  PageRank example (1) to (5) [figures]

HITS
  - Idea due to Kleinberg [1998]
  - There are two kinds of web pages:
      - Authorities
      - Hubs
  - Authorities are web pages to which many hubs point
  - Hubs are web pages that point to many authorities
  - A web page is an authority or a hub to a certain degree
  - The degree is computed recursively

HITS (2)
Computing hubs and authorities
  - Given a query, perform regular term-based retrieval
  - The set W contains the top t (~200) pages
  - Expand the set W to the set S by adding all pages that link to or are linked from a page in W
  - Restriction: a page in W cannot add more than m (~50) pages
  - (Delete domain-internal links)
  - Put the remaining links into E and the pages into V, and we have a (small) web graph G = (V, E)
  [Figure: sets W and S]

Flashback
  - Bibliometric analysis
      - Citation analysis
      - Citations generate "links"
  - Two key notions
      - Co-citation: if papers i and j are both cited by k, they are said to be co-cited by k (~authority)
      - Bibliographic coupling: if papers i and j both cite paper k, there is a bibliographic coupling between them (~hub)

HITS algorithm
  - Compute iteratively the hub score and the authority score for each page in V
  - Rank the documents with respect to their authority score
  - The original HITS algorithm identifies authorities per query
  - Alternative: compute authorities globally (T = whole collection)
  - Three shortcomings of the HITS algorithm
      - ...
      - ...
      - ...
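
The iterative computation of hub and authority scores on the graph G = (V, E) can be illustrated with a minimal sketch. The slides do not give the update formulas, so this assumes the usual mutually recursive rules from Kleinberg's formulation: a page's authority score is the sum of the hub scores of the pages linking to it, and its hub score is the sum of the authority scores of the pages it links to. The function names `hits` and `normalize`, the `iterations` parameter, the edge-list input, and the choice of Euclidean normalization are all assumptions made for illustration.

```python
from math import sqrt


def normalize(scores):
    """Scale scores to unit Euclidean length (an assumed normalization choice)."""
    norm = sqrt(sum(v * v for v in scores.values()))
    if norm == 0:
        return dict(scores)
    return {node: v / norm for node, v in scores.items()}


def hits(edges, iterations=50):
    """Return (hub, authority) score dicts for a list of (source, target) links.

    Sketch only: runs a fixed number of iterations instead of testing convergence.
    """
    nodes = {node for edge in edges for node in edge}
    hub = {node: 1.0 for node in nodes}
    auth = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Authority update: sum the hub scores of all pages pointing to the page.
        new_auth = {node: 0.0 for node in nodes}
        for src, dst in edges:
            new_auth[dst] += hub[src]
        # Hub update: sum the (updated) authority scores of all pages pointed to.
        new_hub = {node: 0.0 for node in nodes}
        for src, dst in edges:
            new_hub[src] += new_auth[dst]
        auth = normalize(new_auth)
        hub = normalize(new_hub)
    return hub, auth


# Hypothetical toy graph: three hub-like pages linking to two authority-like pages.
edges = [("h1", "a1"), ("h1", "a2"), ("h2", "a1"), ("h2", "a2"), ("h3", "a1")]
hub, auth = hits(edges)
# Rank pages by authority score, as in the final step of the algorithm.
print(sorted(auth, key=auth.get, reverse=True)[:2])  # ['a1', 'a2']
```

In this toy graph, a1 receives links from all three hubs and therefore ranks above a2, while h1 and h2 (which point to both authorities) end up with a higher hub score than h3.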

Integration into the retrieval model
  - Page content
  - Page structure
      - Layout
      - Positional information
  - Page ranking (PageRank or HITS)
      - Uses link structure
  - Site structure
      - "Slash counting"
  - Age of page
  - "Physical location" of page
      - Uses positional information about the user
  - Observed user behavior
