Googling



Library Research in Context:
Communication Studies
Fall 2007
   Glossary

      Anatomy of a URL
    http://www.uiowa.edu/commstud/faculty/index.html


• Protocol, i.e. type of connection (http, ftp, telnet)

• Domain name (location on the Internet) and
  server (closest thing to a publisher)

• Path or directory on the computer to this file

• Name of file, and its file extension (usually ending
  in .html or .htm)
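
The four parts listed above can be pulled out of the sample URL programmatically. The following is a minimal sketch using Python's standard library; the variable names are illustrative choices, not part of the original slides.

```python
from urllib.parse import urlsplit
import posixpath

# The sample URL from the slide.
url = "http://www.uiowa.edu/commstud/faculty/index.html"

parts = urlsplit(url)

protocol = parts.scheme   # how the file is retrieved (http, ftp, ...)
domain = parts.netloc     # domain name and server

# Split the remaining path into the directory and the file name.
directory, filename = posixpath.split(parts.path)

# The file extension is whatever follows the last dot in the file name.
extension = posixpath.splitext(filename)[1]

print(protocol, domain, directory, filename, extension)
```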
Searching on the Web - Agenda

   How search engines work
   What search engines can and
    cannot do
   Variety of search tools and
    strategies
How Google Works—
Indexing/Locating Pages

   Pages are indexed by spiders (a.k.a.
    robots or “bots”)
   Spiders read pages, report back,
    and then visit all other pages to
    which those new pages link
   Pages that no other page links to
    are unlikely to be indexed
   Each search engine has its own
    algorithms for retrieving results
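
To illustrate why a page that nothing links to tends to stay out of the index, here is a toy spider that follows links through a made-up, in-memory "web". The page names and the LINKS table are hypothetical; a real spider fetches pages over the network and is far more elaborate.

```python
from collections import deque

# Hypothetical miniature web: each page maps to the pages it links to.
LINKS = {
    "home": ["faculty", "courses"],
    "faculty": ["home", "research"],
    "courses": ["home"],
    "research": [],
    "orphan": ["home"],  # links out, but no page links TO it
}

def crawl(start, links):
    """Visit the start page, then every page reachable by following links."""
    indexed = []
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        indexed.append(page)           # "report back" that this page exists
        for target in links.get(page, []):
            if target not in seen:     # visit each linked page only once
                seen.add(target)
                queue.append(target)
    return indexed

indexed_pages = crawl("home", LINKS)
print(indexed_pages)
```

Starting from "home", the spider reaches every page except "orphan", because no link leads to it.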
How Google Works—
Searching Cached Pages
   When you search, you’re actually
    searching Google’s cache of Web pages.
   And because of this, you can search for
    more than text or phrases in the body of
    a Web page.
   Google has some secret, advanced search
    operators that let you search specific
    parts of Web pages or specific types of
    information.
Source: Google Hacks, p. 5
Linda J. Goff, http://library.csus.edu
How Google Works—
Google looks at:
• Your words as a phrase (Xs)
• Your words as they appear adjacent to
  one another (Ys)
• The number of times your words appear
  within a page (Zs)

   YOUR WORD ORDER MATTERS


Source: Google Hacks, p. 21
Linda J. Goff, http://www.library.csus.edu
How Google Works—
Part 2

Google also:
• Considers about 100 other secret variables
• Throws out everything but the top 2,000 pages
• Multiplies each remaining page’s
  individual score by its “PageRank” (how
  often others link to that page)
• And, finally, displays the top 1,000 in
  order.
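
The sequence described on these two slides (score the text, keep a shortlist, multiply by a link-based score, show the top results) can be sketched as a toy ranking function. Everything here is an assumption made for illustration: the weights, the sample pages, and the link_score numbers are invented, and link_score is only a stand-in for PageRank, not Google's actual calculation.

```python
def text_score(query, page_text):
    """Toy text score: exact-phrase hits count more than single-word hits."""
    text = page_text.lower()
    phrase = query.lower()
    phrase_hits = text.count(phrase)
    word_hits = sum(text.split().count(word) for word in phrase.split())
    return 3 * phrase_hits + word_hits  # the weight 3 is invented

def rank(query, pages, keep, show):
    """Shortlist on text score, then reorder by text score times link score."""
    by_text = sorted(
        pages,
        key=lambda name: text_score(query, pages[name]["text"]),
        reverse=True,
    )
    shortlist = by_text[:keep]  # everything else is thrown out
    by_final = sorted(
        shortlist,
        key=lambda name: text_score(query, pages[name]["text"])
        * pages[name]["link_score"],
        reverse=True,
    )
    return by_final[:show]

# Hypothetical pages with made-up link scores.
PAGES = {
    "a": {"text": "communication studies faculty and communication studies courses",
          "link_score": 1.0},
    "b": {"text": "studies of communication", "link_score": 6.0},
    "c": {"text": "faculty parking map", "link_score": 9.0},
}

results = rank("communication studies", PAGES, keep=2, show=2)
print(results)
```

In this made-up data, page "c" has the highest link score but never makes the shortlist, while page "b" overtakes page "a" once link scores are multiplied in.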
Googling Basics

   AND is the default operator
   Word order matters
   Capitalization does not matter
   “Stop words” are ignored (in, is, it,
    of, etc.)
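
Three of these basics (AND as the default, capitalization ignored, stop words dropped) can be sketched in a few lines. This is a simplified illustration, not how Google is implemented; the stop-word set holds only the examples named above, and the function names are hypothetical.

```python
# Only the stop words given as examples on the slide.
STOP_WORDS = {"in", "is", "it", "of"}

def normalize(query):
    """Lowercase the query and drop stop words."""
    return [word for word in query.lower().split() if word not in STOP_WORDS]

def matches(query, page_text):
    """AND is the default: every remaining term must appear on the page."""
    page_words = set(page_text.lower().split())
    return all(term in page_words for term in normalize(query))

terms = normalize("History of Rhetoric")
found = matches("History of Rhetoric", "A rhetoric primer and its history")
missing = matches("rhetoric television", "A rhetoric primer")
print(terms, found, missing)
```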
Advanced Googling

   Search by field (site:, link:,
    filetype:, intitle:, etc.)
   Advanced search screen
   Other Googles (Scholar, Froogle,
    News, Book, language tools, etc.)
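
Field operators are simply typed into the search box alongside the ordinary search words. As a rough sketch, the helper below (a hypothetical function, with example search terms chosen for illustration) assembles such a query and encodes it into a Google search address.

```python
from urllib.parse import urlencode

def build_query(terms, site=None, filetype=None, intitle=None):
    """Append field operators to ordinary search terms."""
    parts = [terms]
    if site:
        parts.append(f"site:{site}")          # limit results to one site
    if filetype:
        parts.append(f"filetype:{filetype}")  # limit results to one file type
    if intitle:
        parts.append(f"intitle:{intitle}")    # word must appear in the page title
    return " ".join(parts)

query = build_query("rhetoric syllabus", site="uiowa.edu", filetype="pdf")

# Encode the query so it can be used in a web address.
search_url = "https://www.google.com/search?" + urlencode({"q": query})

print(query)
print(search_url)
```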
What Search Engines like Google
CANNOT do (yet)

   Search the Invisible Web
    (proprietary databases, sites that
    can’t be crawled)
   Provide evaluation of content
Other Search Engines

   http://www.yahoo.com
   http://www.askjeeves.com (Teoma)
   http://www.alltheweb.com
   http://www.gigablast.com
   http://www.looksmart.com
   http://dmoz.org/
MetaSearch Engines

   clusty.com
   www.dogpile.com
   www.surfwax.com

				