									     Google:
I’m Feeling Lucky

           Catherine Ure
           October 2009
Google
There are lots of search engines. Why are we just looking at Google?

Top Search Providers for August 2009
(data from Nielsen, SEW http://searchenginewatch.com/3634991)

Search engine            Share of searches (%)
Google                   64%
Yahoo                    16%
MSN/Windows Live/Bing    10.7%
AOL                      3.1%
Ask.com                  1.7%
Google Facts
•   Google's name is a play on the word googol, which refers to the number 1
    followed by one hundred zeroes (its original name was BackRub).

•   Google started as a research project at Stanford University and was
    created by Ph.D. candidates Larry Page and Sergey Brin when they were
    24 and 23 years old, respectively.

•   Google debuted in 1998.

•   Google's index comprises billions of web pages (Google reported seeing
    1 trillion unique URLs in 2008). Google often searches this collection
    in less than half a second.

•   "google" was added to the Merriam-Webster Collegiate Dictionary and the
    Oxford English Dictionary in 2006, meaning "to use the Google search
    engine to obtain information on the Internet".
Google Myths
•   Google searches the whole web
     – No search engine searches the whole web

•   The best information is found in the first 10 results
     – Search engines use formulas or algorithms to rank search results
     – No guarantee that the first 10 results are best for the researcher's purpose

•   Searching is easy
     – Searching is easy; finding is more difficult
     – Successful research does require time for planning and evaluation

•   Everything important is free
     – Information is a commodity

•   Everything is truthful, authoritative and accurate
     – Information may be tainted by bias, opinions and inaccuracies; researchers
       need to develop evaluation skills
How does Google work?
Searching the web is like

 “looking in a very large book with an impressive index telling you exactly where
    everything is located. When you perform a Google search, programs check
    the index to determine the most relevant search results to be returned
    ("served") to you”


The three key processes in delivering search results to you are:

•    Crawling
•    Indexing
•    Serving



http://www.google.com/support/webmasters/bin/answer.py?answer=70897&topic+8843
How does Google work?
Crawling

• Crawling is the process by which computer programs (called Web
  spiders/crawlers/robots) discover new and updated pages to be
  added to the Google index.

• Web spiders fetch (or "crawl") pages on the web and use an
  algorithmic process to determine which sites to crawl, how often,
  and how many pages to fetch from each site.

• As the Web spider visits each of these websites it detects links on
  each page and adds them to its list of pages to crawl. New sites,
  changes to existing sites, and dead links are noted and used to
  update the Google index.
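
The crawling steps above can be sketched in a few lines of Python. This is a toy illustration only, not Google's crawler: the `TOY_WEB` dictionary, the example URLs and the `crawl` function are all hypothetical, and no real network requests are made.

```python
from collections import deque

# Hypothetical in-memory "web": each page maps to the links found on it.
TOY_WEB = {
    "a.example/home": ["a.example/about", "b.example/news"],
    "a.example/about": ["a.example/home"],
    "b.example/news": ["c.example/missing"],
}

def crawl(seed, web, max_pages=10):
    """Visit pages breadth-first, returning crawled pages and dead links."""
    frontier = deque([seed])   # pages waiting to be crawled
    seen = {seed}              # pages already queued, to avoid repeats
    crawled, dead = [], []
    while frontier and len(crawled) < max_pages:
        url = frontier.popleft()
        if url not in web:     # nothing to fetch: note it as a dead link
            dead.append(url)
            continue
        crawled.append(url)
        for link in web[url]:  # detect links and add new ones to the list
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return crawled, dead

crawled, dead = crawl("a.example/home", TOY_WEB)
```

Here `max_pages` stands in for the decision about how many pages to fetch; a real crawler would also decide how often to revisit each site.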
How does Google work?
Indexing

• Web spiders process each of the pages crawled
  in order to compile a massive index of all the
  words they see and their location on each page.

• Information in key content tags is also processed.

• Not all content types can be processed, e.g.
  dynamic pages or rich media.
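
An index of "all the words and their location on each page" is often called an inverted index. The sketch below is a minimal illustration under simple assumptions (words are runs of letters and digits, everything is lower-cased); the function name `build_index` and the sample pages are hypothetical, and this is not how Google's index is actually built.

```python
import re
from collections import defaultdict

def build_index(pages):
    """Map each word to the pages and word positions where it appears."""
    index = defaultdict(lambda: defaultdict(list))
    for url, text in pages.items():
        words = re.findall(r"[a-z0-9]+", text.lower())
        for position, word in enumerate(words):
            index[word][url].append(position)
    return index

# Hypothetical sample pages.
pages = {
    "page1": "Renewable energy from wind",
    "page2": "Wind energy and solar energy",
}
index = build_index(pages)
energy_hits = dict(index["energy"])  # pages containing "energy", with positions
```

Looking up a word in `index` returns every page that contains it without rereading the pages themselves, which is why searching an index is fast.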
How does Google work?
Serving results

• When a user enters a query, machines search the index for
  matching pages and return the results believed to be most relevant
  to the user.

•   Relevancy is determined by over 200 factors, one of which is the
    PageRank for a given page.
• PageRank is a measure of the importance of a page based on the
  incoming links from other pages. In simple terms, each link to a
  page from another page adds to that page's PageRank.

• PageRank is unique to Google.
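
The idea that incoming links add to a page's PageRank can be illustrated with the simplified textbook form of the calculation. This is a sketch only: the damping factor of 0.85, the iteration count and the four-page link graph are assumptions for illustration, and Google's production ranking combines PageRank with many other factors.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank; every link target must also be a key of `links`."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}       # start with equal importance
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if not outgoing:                 # page with no links: share evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:                            # split this page's rank over its links
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical graph: A, B and D all link to C; C links back to A.
links = {"A": ["C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
top_page = max(ranks, key=ranks.get)
```

Page C ends up with the highest score because three pages link to it, and A scores above B and D because it receives a link from the important page C.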
Searching Google
• Accessible from:
    – Toolbar
    – http://www.google.co.uk/
    – Advanced search

• Some basic facts:
• Every word matters. Generally, all the words you put in the query
  will be used. There are some exceptions.

• Search is always case insensitive. Searching for [ new york times ]
  is the same as searching for [ New York Times ].

• With some exceptions, punctuation is ignored (that is, you can't
  search for @#$%^&*()=+[]\ and other special characters).
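
The case and punctuation rules can be pictured as a clean-up step applied to the query before it is matched. The sketch below is a rough illustration, assuming all ASCII punctuation is dropped; it ignores the exceptions mentioned above, and `normalise_query` is a hypothetical name, not a Google function.

```python
import string

def normalise_query(query):
    """Lower-case a query and drop ASCII punctuation, then split into words."""
    table = str.maketrans("", "", string.punctuation)
    return query.lower().translate(table).split()

lower = normalise_query("new york times")
mixed = normalise_query("New York Times!")
```

Both calls produce the same list of words, which is why the two searches return the same results.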
Searching Google: search tips

To see a definition for a word or phrase, type define: followed by the term:

define: renewable energy
Searching Google: search tips
Phrase search ("")

• By putting double quotes around a set of words,
  you are telling Google to consider the exact
  words in that exact order without any change.

• You might miss good results accidentally. For
  example, a search for "Alexander Bell" (with
  quotes) will miss the pages that refer to
  Alexander G. Bell.
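
Why a quoted phrase can miss good results is easier to see in code. This is a minimal sketch of exact-phrase matching, assuming words are runs of letters and digits; the helper names `tokens` and `contains_phrase` are hypothetical.

```python
import re

def tokens(text):
    """Split text into lower-case words, ignoring punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())

def contains_phrase(text, phrase):
    """True if the words of `phrase` appear consecutively, in order, in `text`."""
    words, wanted = tokens(text), tokens(phrase)
    return any(words[i:i + len(wanted)] == wanted
               for i in range(len(words) - len(wanted) + 1))

exact = contains_phrase("Alexander Bell invented the telephone", "Alexander Bell")
with_initial = contains_phrase("Alexander G. Bell invented the telephone",
                               "Alexander Bell")
```

The second page is about the same person, but the extra "G" breaks the exact word sequence, so a phrase search would not return it.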
Searching Google: search tips
Search within a specific website (site:)

• Google allows you to specify that your search
  results must come from a given website. For
  example, the query recession site:FT.com will
  return pages about recession but only from
  FT.com.

• You can also specify a whole class of sites, for
  example renewable energy site:.ac.uk will
  return results only from a .ac.uk domain
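
The effect of site: can be sketched as a filter on the host name of each result. This is an illustration under simple assumptions, not Google's implementation; `matches_site` and the sample URLs are hypothetical.

```python
from urllib.parse import urlparse

def matches_site(url, site):
    """True if the URL's host is `site` or a subdomain of it."""
    host = urlparse(url).hostname or ""   # hostname is returned in lower case
    site = site.lower().lstrip(".")
    return host == site or host.endswith("." + site)

ft = matches_site("https://www.ft.com/markets", "FT.com")
academic = matches_site("http://www.ed.ac.uk/energy", ".ac.uk")
other = matches_site("http://notft.com/", "FT.com")
```

The same check handles both a single website (FT.com) and a whole class of sites (.ac.uk), because each is treated as the ending of the host name.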
Searching Google: Search tips
Terms you want to exclude (-)

• Attaching a minus sign immediately before a
  word indicates that you do not want pages that
  contain this word to appear in your results.

• For example, the query anti-virus -software
  will search for the word 'anti-virus' but exclude
  pages that mention software.
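
The difference between a minus sign before a word and a hyphen inside a word can be sketched as follows. This is a simplified illustration that matches terms as plain substrings; `split_query` and `matches` are hypothetical helpers, not part of any Google API.

```python
def split_query(query):
    """Separate a query into wanted terms and terms excluded with a leading '-'."""
    include, exclude = [], []
    for term in query.split():
        if term.startswith("-") and len(term) > 1:
            exclude.append(term[1:].lower())   # minus sign attached to the word
        else:
            include.append(term.lower())       # a hyphen inside a word is kept
    return include, exclude

def matches(text, query):
    """True if text has every wanted term and none of the excluded ones."""
    include, exclude = split_query(query)
    text = text.lower()
    return all(t in text for t in include) and not any(t in text for t in exclude)

parsed = split_query("anti-virus -software")
kept = matches("Anti-virus tips for home users", "anti-virus -software")
dropped = matches("Anti-virus software reviews", "anti-virus -software")
```

Only a minus sign at the start of a term counts as an exclusion, so the hyphen in "anti-virus" is left alone.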
Searching Google: search tips
Search exactly as is (+)

• Google employs synonyms automatically, so that it finds
  pages that mention, for example, childcare for the query
  [ child care ] (with a space)

• But sometimes Google helps out a little too much and
  gives you a synonym when you don't really want it. By
  attaching a + immediately before a word (don't add a
  space after the +), you are telling Google to match that
  word precisely as you typed it. Putting double quotes
  around a single word will do the same thing.
Searching Google: search tips
The OR operator

Google's default behavior is to consider all the words in a
  search. If you want to specifically allow either one of
  several words, you can use the OR operator (note that
  you have to type 'OR' in ALL CAPS).

For example, renewable energy 2005 OR 2006 will give you
  results containing either one of these years, whereas
  renewable energy 2005 2006 (without the OR) will show
  pages that include both years on the same page.
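
The behaviour of OR can be sketched as follows, assuming that OR joins only the words immediately on either side of it and that all other words must appear. The helpers `parse_groups` and `matches` are hypothetical, and matching here is on whole words split at spaces.

```python
def parse_groups(query):
    """Turn a query into groups of alternatives; every group must match."""
    groups, join_next = [], False
    for term in query.split():
        if term == "OR":                    # only upper-case OR is the operator
            join_next = bool(groups)
        elif join_next:
            groups[-1].append(term.lower()) # alternative to the previous word
            join_next = False
        else:
            groups.append([term.lower()])
    return groups

def matches(text, query):
    """True if the text contains at least one word from every group."""
    words = set(text.lower().split())
    return all(any(t in words for t in group) for group in parse_groups(query))

groups = parse_groups("renewable energy 2005 OR 2006")
page = "Renewable energy statistics 2006"
with_or = matches(page, "renewable energy 2005 OR 2006")
without_or = matches(page, "renewable energy 2005 2006")
```

A page that mentions only 2006 satisfies the OR query but not the query without OR, which requires both years.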
Google Book Search
Background

•   Launched 2004.
•   Full text of approximately seven million books scanned by Google stored in an
    electronic database.
•   Users can read and download entire works in the public domain. Works still in
    copyright are shown in the style of a card catalogue (unless the copyright owner has
    consented to greater access), featuring basic bibliographic information and “snippets”
    of text showing the search term in context.
•   Google has partnered with select libraries around the world to digitize their collections
    and include them in Google Book Search (GBS).
•   On Sept. 20, 2005, the Authors Guild and certain authors filed a class action lawsuit
    against Google, alleging that Google infringed copyright by digitizing works contained
    in the libraries’ collections without the permission of the copyright holders and by
    showing “snippets” of those works as part of GBS (The Authors Guild, Inc., et al. v.
    Google Inc., Case No. 05 CV 8136 (S.D.N.Y.)).
•   On Oct. 19, 2005, five publisher-members of the Association of American Publishers
    (AAP) also filed a lawsuit against Google, raising issues identical to those brought
    in the class action lawsuit.
•   Google maintains that its activities do not infringe copyright. It also argues that even if
    its activities are found to infringe copyright, they fall under the fair use doctrine.
•   After two years of negotiations, a settlement agreement was reached on Oct. 28, 2008.
Google Book Search
The books in Google Books come from two sources.

• The Library Project
   – Google has partnered with libraries around the world to include
     their collections in Book Search. For Library Project books that
     are still in copyright, Google results are like a card catalogue,
     showing information about the book and, usually, a few snippets
     of text showing your search term in context.
   – For Library Project books out of copyright, you can read and
     download the entire book.
• The Partner Programme
   – Google has partnered with over 20,000 publishers and authors
     to make their books discoverable on Google. Preview pages of
     these books are available.
Google Book Search
Why use?

• Library catalogues do not search inside books

• Google Book Search allows you to search inside
  the book to identify books containing information
  on your topic

• Search our Library catalogue to find out if we
  hold the book
Google Scholar

    Stand on the shoulders of
             giants
Google Scholar
Advantages:

•   Freely available
•   Familiar search interface
•   Good starting point
Disadvantages:

•   No information on coverage
•   Limited capabilities for searching, limiting, sorting, printing.

The service cannot be seen as a substitute for the use of special abstracting and
   indexing databases and library catalogues due to various weaknesses
   (in transparency, coverage and currency).


P. Mayr and A. Walter, "An exploratory study of Google Scholar", 2009
Evaluating websites
The quality of information on the Internet

The Internet has no standard system of quality control so
  it's important to be careful about which information you
  use and not to trust everything you read.

There is a danger that the information you find on the
  Internet will:

   – Be from a source that is unreliable, lacking in authority or
     credibility
   – Have content that is invalid, inaccurate or out of date
   – Not be what it seems!
Evaluating websites
Choosing appropriate resources

You need to question the quality of information you find on
  the Internet before you use it in your research.

To evaluate websites use the WWW approach:

• Who? - question the source of information
• What? - question the content of information
• When? - question the currency of the information
Evaluating websites
Who?

It's important to identify who is providing
   the information and to consider whether
   they can be relied on to provide the
   information you need.

• demo
Evaluating websites
Ask yourself:

•   Who is the author?
•   Who is the publisher?
•   Who sponsored or funded the site?
•   Do you recognise them as an authoritative source?
•   What are their credentials, qualifications, background
    and experience?
•   Has the information been edited or peer reviewed?
•   Are the sources trustworthy?
•   What are their motives for publishing the information?
•   What standpoint do they take: impartial? Biased?
Evaluating websites
What?

Can you trust the content of the information?

In selecting resources to quote in your essay would you
   prefer:

- A management journal article with a full list of references
  to all the sources of evidence used?

- Opinions on a personal blog by a manager with no
  supporting evidence?
Evaluating websites
Ask yourself:

• Relevancy - does the information help answer your research
  questions?
• Validity - are the arguments rational and logical, and supported by
  evidence? Can you differentiate fact from opinion?
• Accuracy - are the arguments well reasoned, and is supporting
  evidence relevant and correct?
• Bias - What perspective is the author coming from? Are they giving
  both sides of the story? Or are they arguing from a particular
  position / worldview, or with a particular motivation that might skew
  their writing? Do you need to find counter-arguments that give an
  alternative point of view?
• Evidence - what examples are given to support the arguments?
  How has any evidence been gathered? Is all the evidence
  referenced with a source that could be used for verification?
Evaluating websites
When?

The accuracy of your source may be affected by the date it
  was published. In some fields using the latest research
  is very important, as earlier findings might have been
  disproved by more recent discoveries.

For example:

If you were looking for information on this year's top
    brands, would you choose statistics or surveys on
    websites that have not been updated for three years?
Evaluating websites
Ask yourself:

• When was the information originally produced?
• Has it been / will it be updated?
• Is it still useful, or has the work been updated,
  superseded or disproved?
• Look for a publication date on the title or home
  page, or last updated dates in the header or footer,
  or explore the About page for clues about
  currency.
Evaluating websites
• You must critically evaluate material found
  on the internet

• No quality control

• Much of the material is not of sufficient
  quality to be used for academic
  assignments

Ask yourself:

• Who?
• What?
• When?
Evaluating websites
Further sources



Internet tutorials on using the Web for
  education and research

• http://www.vts.intute.ac.uk/

								