Googling to the Max

Search Smarter:
Improving Your Search Engine Strategies
MVLS Workshop, March 16, 2006

Linda J. Goff, Head of Instructional Services
   California State University, Sacramento

               Linda J. Goff - Spring 2006 -    1
                  http://library.csus.edu
This presentation was created by
Linda J. Goff, March 16, 2006 for
Mountain Valley Library System.

Some of the content of this
presentation was borrowed with
permission from a presentation by
Patrick Douglas Crispen, “Google
201: Advanced Googology,” which can
be viewed at http://netsquirrel.com

Glossary
   algorithm
   browser
   cache
   cookies
   directory path
   domain name
   HTML
   HTTP
   hypertext link
   invisible Web
   metasearch
   phishing
   portal sites
   URL


Reading Parts of the URL
http://library.csus.edu/databases/
  The part before the colon is the access
   method or protocol (here http, the
   Hypertext Transfer Protocol).
  The part after the double slashes is the
   net address or domain name of the
   computer where the resource is
   located.
  The directory path and filename
   come after the next slash.
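The same breakdown can be reproduced with Python's standard library. This is only an illustrative sketch: the variable name parts is arbitrary, and the URL is the one shown on the slide.

```python
from urllib.parse import urlsplit

# Split the example URL from the slide into its parts.
parts = urlsplit("http://library.csus.edu/databases/")

print(parts.scheme)  # access method or protocol
print(parts.netloc)  # net address or domain name
print(parts.path)    # directory path and filename
```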


Today’s Agenda
 What search engines can & can’t do.
 How search engines think and work.
 Picking the right web search tool.
 Searching techniques & tips.
 Hands-on Exercises.



Today’s Goals – To learn
 How Google and other search engines
  really work.
 About alternatives to Google and
  Yahoo.
 How to construct better searches.
 About some of the newest web tools
  and features.


Part 1
What Search Engines Can and Cannot Do



What they can’t do (yet)
 Access the invisible Web (most
  proprietary databases).
 Search the content of pages that
  require interaction (filling out forms).
 Provide human evaluation of content
  (expert pages).



Expert Pages
 Infomine - Scholarly Internet Resource
  Collection: http://infomine.ucr.edu/
 Librarians’ Internet Index: http://lii.org/
 The WWW Virtual Library:
  http://www.vlib.org
 CSUS Librarian Guides:
  http://library.csus.edu/guides/


Scholarly Research
 Google Scholar
  http://scholar.google.com/ is
  attempting to bring scholarly
  literature into web search.
 Elsevier has produced “Scirus for
  Scientific Information Only”
  http://www.scirus.com/srsapp/



Web search tools can now...
   Search multiple search engines.
   Answer natural language questions.
   Rank sites by links made to them.
   Cluster results into categories.
   Limit searches with advanced features.
   Search by file type.
   Or - a combination of the above.

Natural Language Search
 Type your question in natural language
  (e.g., at http://ask.com, formerly
  AskJeeves.com).
 The engine analyzes words, grammar and
  syntax, and uses "templatics" to look for
  patterns in the way questions are asked.
 Ask.com responds with one or more closely
  related questions that it already knows the
  answer to.

Metasearch engines

 Search simultaneously across multiple
  search engines and display the top
  sites from each:
     Dogpile.com
     Vivisimo.com
     GahooYoogle.com (Yahoo & Google)
     Jux2.com (Yahoo, Google & MSN)
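One way to picture what a metasearch engine does with the lists it gets back is the sketch below. It is not how any of the sites above actually work: the function merge_results, the engine names, and the example.org addresses are all made up for illustration, and a real tool would fetch its results live.

```python
def merge_results(results_by_engine, top_n=3):
    """Interleave the top results from each engine, dropping duplicate URLs."""
    merged = []
    seen = set()
    for rank in range(top_n):
        for engine, urls in results_by_engine.items():
            if rank < len(urls) and urls[rank] not in seen:
                seen.add(urls[rank])
                merged.append((urls[rank], engine))
    return merged

# Hypothetical result lists standing in for two search engines.
sample = {
    "engine_a": ["example.org/a", "example.org/b", "example.org/c"],
    "engine_b": ["example.org/b", "example.org/d"],
}
merged = merge_results(sample)
for url, engine in merged:
    print(url, "(first seen at this rank in", engine + ")")
```

A page returned by both engines is listed once, credited to the engine that supplied it first in the interleaving.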


Part 2
How Search Engines Think and Work



Writing a Search Statement
 Most users are not aware what
  happens once they type in a string of
  words and push the search button.
 We need to understand web searching
  fully and search intelligently in
  order to better serve our users.



All Search Engines Use
Boolean Search Strategy
 a AND b: family AND violence
 a OR c: family OR domestic
 b NOT d: violence NOT sexual abuse
 [Venn diagrams illustrating each operator]
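The three operators map directly onto set operations. The sketch below assumes a tiny hand-made index in which each term points to the set of page numbers that contain it; the page numbers are invented for illustration.

```python
# Hypothetical index: each term maps to the set of page IDs containing it.
index = {
    "family": {1, 2, 3},
    "violence": {2, 3, 4},
    "domestic": {4, 5},
    "sexual abuse": {3},
}

# AND keeps only pages containing both terms (intersection).
family_and_violence = index["family"] & index["violence"]

# OR keeps pages containing either term (union).
family_or_domestic = index["family"] | index["domestic"]

# NOT removes pages containing the second term (difference).
violence_not_abuse = index["violence"] - index["sexual abuse"]

print(sorted(family_and_violence))
print(sorted(family_or_domestic))
print(sorted(violence_not_abuse))
```

AND narrows a search, OR broadens it, and NOT removes unwanted pages.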
How pages are indexed
 Programs called spiders (a.k.a. robots
  or “bots”) constantly search the
  Internet looking for new or updated
  Web pages.
 When a spider finds a new or updated
  page, it reads that entire page, reports
  back, and then visits all of the other
  pages to which that new page links.
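A minimal sketch of this link-following behavior, assuming a made-up five-page Web stored in a dictionary. The page names and the crawl function are hypothetical; a real spider fetches pages over the network and is far more elaborate.

```python
from collections import deque

# Hypothetical miniature Web: each page maps to the pages it links to.
web = {
    "home": ["news", "about"],
    "news": ["home", "story"],
    "about": [],
    "story": ["news"],
    "orphan": ["home"],  # links out, but no other page links to it
}

def crawl(start, links):
    """Visit the start page, then every page reachable by following links."""
    indexed = []
    queue = deque([start])
    seen = {start}
    while queue:
        page = queue.popleft()
        indexed.append(page)  # "read the page and report back"
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return indexed

indexed = crawl("home", web)
print(indexed)
```

The page named orphan never appears in the result because the spider has no link to follow to reach it.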

Why is this significant?
 Pages with no links are unlikely to
  be indexed.
 Different search engines have
  different algorithms, and therefore
  their indexes return different results.
 Let’s look in depth at how the most
  popular search engine (Google)
  actually works.
Searching
 When you search, you’re actually searching
  Google’s cache of Web pages.
 And because of this, you can search for
  more than text or phrases in the body of a
  Web page.
 Google has some secret, advanced search
  operators that let you search specific parts
  of Web pages or specific types of
  information.
  Source: Google Hacks, p. 5

How Google works
 When you search for multiple keywords,
  Google first searches for all of your
  keywords as a phrase.
 So, if your keywords are disney
  fantasyland pirates, any pages on
  which those words appear as a phrase
  receive a score of X.



  Source: http://netsquirrel.com
How Google Works - Adjacency
 Google then measures the adjacency
  between your keywords and gives those
  pages a score of Y.
 What does this mean in English?
  Well …
  Image source: Google
  Source: Google Hacks, p. 21

How Adjacency Works
A page that says
  “My favorite Disney attraction, outside of
  Fantasyland, is Pirates of the Caribbean”
will receive a higher adjacency score than a page
that says
  “Walt Disney was both a genius and a
  taskmaster. The team at WDI spent many
  sleepless nights designing Fantasyland. But
  nothing could compare to the amount of
  Imagineering work required to create Pirates
  of the Caribbean.”
                                       Source http://netsquirrel.com
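To make the comparison concrete, the sketch below measures how many words the shortest stretch of text containing all three keywords spans. This is an assumed stand-in for adjacency, not Google's actual formula, and the function name adjacency_span is hypothetical. A smaller span means the keywords sit closer together.

```python
import re
from itertools import product

def adjacency_span(text, keywords):
    """Length in words of the shortest stretch containing every keyword."""
    words = re.findall(r"[a-z]+", text.lower())
    positions = [[i for i, w in enumerate(words) if w == k] for k in keywords]
    if any(not p for p in positions):
        return None  # at least one keyword is missing from the text
    return min(max(combo) - min(combo) + 1 for combo in product(*positions))

keywords = ["disney", "fantasyland", "pirates"]
close_page = ("My favorite Disney attraction, outside of Fantasyland, "
              "is Pirates of the Caribbean")
far_page = ("Walt Disney was both a genius and a taskmaster. The team at WDI "
            "spent many sleepless nights designing Fantasyland. But nothing "
            "could compare to the amount of Imagineering work required to "
            "create Pirates of the Caribbean.")

close_span = adjacency_span(close_page, keywords)
far_span = adjacency_span(far_page, keywords)
print(close_span, far_span)
```

The first page packs the keywords into a much shorter stretch, which is why it would earn the higher adjacency score.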
How Google Works - Weights
 Then, Google measures the number
  of times your keywords appear on the
  page (the keywords’ “weights”) and
  gives those pages a score of Z.
 A page that has the word disney four
  times, fantasyland three times, and
  pirates seven times would receive a
  higher weights score than a page that
  has each of those words only once.
                                             Source: Google Hacks, p. 21
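Counting keyword occurrences is simple to sketch. The function keyword_weights and the sample sentence are hypothetical, and how Google turns such counts into the score Z is not stated in the source.

```python
import re
from collections import Counter

def keyword_weights(text, keywords):
    """Count how many times each keyword appears in the text."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    return {k: counts[k] for k in keywords}

# Made-up page text for illustration.
page = "Disney pirates! Pirates of Disney, pirates in Fantasyland."
weights = keyword_weights(page, ["disney", "fantasyland", "pirates"])
print(weights)
```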

Putting it All Together
 Google takes
     The phrase hits (the Xs),
     The adjacency hits (the Ys),
     The weights hits (the Zs), and
     About 100 other secret variables
 Throws out everything but the top 2,000
 Multiplies each remaining page’s individual
  score by its “PageRank”
 And, finally, displays the top 1,000 in
  order.
                                       Source http://netsquirrel.com
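The sequence of steps can be sketched as follows. The source does not say how the X, Y, and Z scores and the other variables are combined, so this sketch assumes each page already has a single combined relevance score. The function rank_pages and all of the numbers are invented for illustration.

```python
def rank_pages(pages, keep=2000, show=1000):
    """pages: list of (name, relevance_score, pagerank) tuples."""
    # Step 1: keep only the pages with the highest relevance scores.
    by_relevance = sorted(pages, key=lambda p: p[1], reverse=True)[:keep]
    # Step 2: multiply each remaining page's score by its PageRank.
    final = [(name, score * pagerank) for name, score, pagerank in by_relevance]
    # Step 3: display the top results in order of final score.
    final.sort(key=lambda p: p[1], reverse=True)
    return final[:show]

# Made-up pages; small limits are used so the cut-off is visible.
pages = [("a", 10.0, 0.2), ("b", 8.0, 0.5), ("c", 1.0, 0.9)]
ranked = rank_pages(pages, keep=2, show=2)
print([name for name, _ in ranked])
```

Page c has the highest PageRank but is discarded in the first cut, while page b overtakes page a once PageRank is applied.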
PageRank?
 There is a premise in scholarship
  that the importance of a research
  paper can be judged by the
  number of citations the paper has
  from other research papers.
 Google simply applies this premise
  to the Web: the importance of a
  Web page can be judged by the
  number of hyperlinks pointing to it
  from other pages.
                                             Source: Google Hacks, p. 294
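The premise as stated on this slide amounts to counting inbound links, which the sketch below does for a made-up four-page Web. This is a simplification rather than the real PageRank calculation, which also takes into account how important each linking page is.

```python
from collections import Counter

# Hypothetical link graph: each page maps to the pages it links to.
links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}

# Count how many hyperlinks point at each page.
inbound = Counter(target for targets in links.values() for target in targets)
most_linked = inbound.most_common(1)[0]
print(most_linked)
```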

Also
 Google’s Boolean default is AND.
 The order of your keywords matters.
 Capitalization does not matter.
 Google has a hard limit of 10
  keywords.
 Google ignores a BUNCH of common
  “stop” words.

                                 Source http://netsquirrel.com
Knowledge is Power
 Those who understand how Google
  works can manipulate the end results.
 When we move into the lab, type
  “miserable failure” and click the
  “I’m Feeling Lucky” button.
 This is called Google bombing.



Part 3
Picking the Right Search Engine



 Google and Yahoo are the biggest
 qSearch measures searching by
  monitoring 1.5 million English
  speakers worldwide (1 million in
  the United States) via proxy
  metering (July ’06).




Source:
    http://searchenginewatch.com/reports/article.php/2156431
Choose based on your
Information Need
 Try Noodle Tools:
  http://www.noodletools.com/debbie/literacies/information/5locate/adviceengine.html




Google.com is the most popular
 Rankings are based on the number of
  links made to the sites, so results have
  been “voted on” by these links.
 A link from an .edu page counts more
  than one from a .com page.
 Special features include Advanced
  Search, Image, Froogle, Blogger,
  Google Catalogs, Google World etc.

Yahoo.com
 Originated “Directory” format to
  organize sites by subject and
  subheadings.
 Can be personalized: “My Yahoo”.
 Geographic versions “Get Local.”




Vivisimo.com
 Queries one or more web search engines
  (Metasearch).
 Clusters the returned documents into
  groups. Try Clusty.com also.
 Orders the groups and the documents
  within each group.
 Displays the hierarchical categories.

Ask.com
 Supports natural language searches.
 Recently absorbed Teoma.com.
 Side bar suggests how to Narrow,
  Expand or find Related sites.




GahooYoogle.com
 Single search box queries both
  Google and Yahoo simultaneously.
 Displays results side-by-side.




Consult the Experts
 Searchenginewatch.com
 Searchengineguide.com
 Noodle.com
 netsquirrel.com
 LLRX.com
 Infotoday.com


Part 4
Searching Techniques and Tips



Command Searching
 Most search engines support these
  commands: plus/AND (+),
  minus/NOT (-), and “phrase”.
 Some support truncation (*), but
  Google does not.
 Most ignore “stop” words (articles,
  conjunctions, etc.).


Be as specific as possible
 If you’re planning a trip to Yosemite
  and know you’ll be camping, don’t
  search for just Yosemite. Instead try:
  +Yosemite +"camping reservation" -hotels
 All of this can be done more easily from
  an Advanced Search screen.
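The Yosemite search string can be assembled from its three kinds of terms, as in the sketch below. The function build_query is a hypothetical helper, not part of any search engine, and which commands are honored varies from engine to engine.

```python
def build_query(required=(), phrases=(), excluded=()):
    """Assemble a command-style search string from plain word lists."""
    parts = [f"+{word}" for word in required]           # must appear
    parts += [f'+"{phrase}"' for phrase in phrases]     # must appear as a phrase
    parts += [f"-{word}" for word in excluded]          # must not appear
    return " ".join(parts)

query = build_query(
    required=["Yosemite"],
    phrases=["camping reservation"],
    excluded=["hotels"],
)
print(query)
```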




My Favorite Advanced Features
   Limit to domain name.
   Search in Title.
   Search by file type.
   Links: find pages that link to a
    given page.
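In Google, these features have typed equivalents: the site:, intitle:, filetype:, and link: operators. The sketch below only builds the query text. The helper advanced_query and the sample values are illustrative, and operator support differs between engines and changes over time, so check the engine's own help pages.

```python
def advanced_query(terms, site=None, in_title=None, filetype=None):
    """Append Google-style operators to a plain keyword search."""
    parts = [terms]
    if site:
        parts.append(f"site:{site}")          # limit to a domain name
    if in_title:
        parts.append(f"intitle:{in_title}")   # word must be in the page title
    if filetype:
        parts.append(f"filetype:{filetype}")  # limit to a file type
    return " ".join(parts)

query = advanced_query("information literacy", site="csus.edu", filetype="pdf")
link_query = "link:library.csus.edu"  # pages that link to this address
print(query)
print(link_query)
```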




Guessing works
 Try typing a domain name into the
  address bar – it’s often quicker than
  using a search engine:

 ibm.com
  Pepsi
  whitehouse


Use Shortcuts!
 Shortcuts are keywords that will take
  you to specific search features, such
  as maps, calculations, local info,
  airport conditions etc.

http://www.googleguide.com/shortcuts.html
http://tools.search.yahoo.com/shortcuts/index.html




Not sure the site is trustworthy?
 Erase the file name to get to the root
  of the URL and then see if you can
  find an “About Us” or FAQ page that
  will tell you enough to make a judgment.
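Trimming a URL back to its root by hand is easy, but the same step can be sketched in Python. The function site_root is a hypothetical helper, and the file name index.html in the sample address is made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

def site_root(url):
    """Drop the directory path and file name, keeping protocol and domain."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, "/", "", ""))

root = site_root("http://library.csus.edu/guides/index.html")
print(root)
```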




Lost a page?
Try the Wayback Machine!
 Internet Archive:
  http://www.archive.org/
 The Wayback Machine searches the
  archive for cached copies of older
  versions of web pages.
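Archived copies are commonly reached through addresses of the form web.archive.org/web/ followed by a date stamp and the original address. Treat that pattern as an assumption and confirm it on the Internet Archive site. The helper wayback_url and the sample date are illustrative only, and the sketch builds the address without contacting the archive.

```python
def wayback_url(url, timestamp):
    """Build an address for an archived copy near the given date (YYYYMMDD)."""
    # Assumed pattern: http://web.archive.org/web/<timestamp>/<original address>
    return f"http://web.archive.org/web/{timestamp}/{url}"

archived = wayback_url("http://library.csus.edu/", "20060316")
print(archived)
```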




Can you stand to take in any
more information?
 Yes?
 No?




Part 5
Bonus Section




Google Tools




Google Services




Google Web Search Features




Google Print
 Google has made deals with publishers
  to make new books searchable online.
 Search results link to books containing
  your search terms, as well as other
  information about the title.
 Click one of the links under "Buy this
  Book" and you'll go straight to a
  bookstore selling that book online.
Image source: http://msnbc.msn.com/id/9785346/site/newsweek/
Google’s Digital Library
 Google has made a deal with large
  libraries to scan thousands of
  out-of-print library books and put them
  online.
 The Association of American Publishers
  (AAP) has sued in federal court to stop
  Google.
 Publishers charge that by making electronic
  copies, the search giant is committing
  massive copyright infringement.
 We still recommend using real books!
Source: http://msnbc.msn.com/id/9785346/site/newsweek/
Google Labs




RSS or Feeds
 "RSS" may have come from "Really Simple
  Syndication." A feed is simply a way in
  which a reader may subscribe to website
  content, such as a blog or news site.
 You can use Feedster when you are looking
  for timely information from millions of
  news, blog and podcast sources.
  http://www.feedster.com/
 Most feeds are subscription services (you
  must register and log in).
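Behind the scenes a feed is a small XML file listing recent items, which is what a feed reader checks for updates. The sketch below reads headlines from a made-up RSS 2.0 sample held in a string. The feed content and the example.org addresses are invented, and nothing is fetched from the Web.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed with two items.
sample_feed = """<rss version="2.0">
  <channel>
    <title>Example Library News</title>
    <item><title>New database trial</title><link>http://example.org/1</link></item>
    <item><title>Workshop schedule</title><link>http://example.org/2</link></item>
  </channel>
</rss>"""

root = ET.fromstring(sample_feed)
feed_name = root.findtext("channel/title")
headlines = [item.findtext("title") for item in root.iter("item")]
print(feed_name)
print(headlines)
```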