Handout - Searching the Internet by leeonw


Searching the Internet
Internet searching is the “art” of submitting a word or phrase to a web catalog or search engine and receiving a
list of URLs for pages that contain the word or phrase. Search engines have become an important method of locating data on the web.

Uniform Resource Locator (URL) is a specification of the location of a link. It specifies the
protocol (http:// for a web page), site name, path and file name to the resource. Think of it as
a networked extension of the standard filename concept: not only can you point to a file in a
directory, but that file and directory can exist on any machine on the network, can be served via
any of several different methods, and might not even be something as simple as a file: URLs can
also point to … queries stored deep within databases… (…from The Webmaster’s Lexicon)

  The standard format of a URL is:

1. Scheme: appears before the colon, and describes the protocol, or the way the browser should
handle the resource.
       http:   = HyperText Transfer Protocol, the native transfer method on the Web.
       ftp:    = File Transfer Protocol, for downloading files from an FTP server.
       file:   = specifies a file on your computer as the resource
       news:   = specifies a news server and newsgroup as the host and resource
       mailto: = starts the mail program associated with the browser, with a recipient as
                 the resource.

2. Host: appears after two forward slashes (//), and references the host computer (site) on which
the resource resides. The host segment includes the domain name of the host computer. Domain
names end in two- to six-letter zone names that indicate what type of site you are contacting:
        .com = commercial organization
        .edu = educational institutions
        .gov = U.S. government and public sites
        .mil = U.S. military sites
        .net = networking organizations, communications service providers
        .org = non-profit organizations and others not fitting existing categories
        .fr, .uk, .us, .ca, … = international domains end in a two-letter country code

The Internet Corporation for Assigned Names and Numbers (www.icann.org) has added seven new domains:
        .info   = information services
        .biz    = trademarked businesses
        .name   = individual/personal sites
        .pro    = professionals
        .aero   = aviation
        .coop   = business cooperatives
        .museum = museums
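
The zone lists above can be read as a simple lookup table. The following is a minimal Python sketch, assuming the zone is the last dot-separated part of the host name and that any unlisted two-letter zone is a country code. The names ZONES and describe_zone are made up for this example.

```python
# Zone descriptions copied from the lists above.
# ZONES and describe_zone are hypothetical names used only for illustration.
ZONES = {
    "com": "commercial organization",
    "edu": "educational institutions",
    "gov": "U.S. government and public sites",
    "mil": "U.S. military sites",
    "net": "networking organizations, communications service providers",
    "org": "non-profit organizations and others",
    "info": "information services",
    "biz": "trademarked businesses",
    "name": "individual/personal sites",
    "pro": "professionals",
    "aero": "aviation",
    "coop": "business cooperatives",
    "museum": "museums",
}


def describe_zone(host):
    """Describe the zone (last dot-separated label) of a host name."""
    zone = host.rstrip(".").rsplit(".", 1)[-1].lower()
    if zone in ZONES:
        return ZONES[zone]
    if len(zone) == 2:
        # Assumption: an unlisted two-letter zone is a country code.
        return "country code"
    return "unknown zone"


print(describe_zone("www.unf.edu"))
print(describe_zone("www.state.fl.us"))
```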

3. Resource: appears after a single forward slash (/), describing the full path to a file or
document. index.html is commonly the default resource at an http location, although the server
determines the actual default.

      http://www.unf.edu/index.html or http://osprey.unf.edu or http://www.state.fl.us
      file:///C|/temp/jenny.gif (note changes to standard file notation)
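
The three parts described above (scheme, host, resource) can be pulled out of a URL with Python's standard urllib.parse module. This is only a sketch: the helper name url_parts is made up for this handout, and the sample URLs are the ones listed above.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Split a URL into the scheme, host and resource parts described above."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,    # text before the colon
        "host": parts.hostname,    # text after the two slashes; None if absent
        "resource": parts.path,    # path starting at the single slash
    }


for url in ("http://www.unf.edu/index.html",
            "http://osprey.unf.edu",
            "file:///C|/temp/jenny.gif"):
    print(url_parts(url))
```

Note that the second URL has an empty resource, so the server decides which default document to return, and the file: URL has no host at all.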

[K.Brown – rev: 5/02]

    Where do I start?
    A good place to start Internet searching is through the UNF Library’s home page.
    On http://www.unf.edu/library you will find a link to Internet Search Engines.
    That page, http://www.unf.edu/library/guides/search.html, describes and links to several of
    the most popular and useful search tools available.

    I. Search Engines:
    Search Engines are tools that let you explore databases containing text from over a billion
    unclassified Web pages (documents). Most concentrate on providing powerful search
    capabilities, not organization of the data. Search engines index data; they do not review
    the content or value of the data.
    Most of the major search engines now also include additional services such as directories and
    meta-index searches, as discussed below.
           The most comprehensive search engine is AltaVista.
           Others are Fast, HotBot, Infoseek, and Excite.
           Excite includes reviews, discussion groups and classified ads.
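
To show what "indexing" means here, the following Python sketch builds a tiny word index over two made-up pages. It is an illustration only, not how any particular search engine works. The page URLs, page text and the name build_index are all assumptions invented for this example.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


# Hypothetical pages, made up for this example.
pages = {
    "http://example.com/a": "Mickey Mantle played baseball",
    "http://example.com/b": "Mickey Mouse is a cartoon mouse",
}

index = build_index(pages)
print(sorted(index["mickey"]))  # both pages contain "mickey"
print(sorted(index["mouse"]))   # only the second page contains "mouse"
```

The index records where each word occurs, but nothing about whether a page is accurate or useful, which is the point made above.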

    II. Internet Directories:
    Internet Directory tools provide multi-level topic directories of a smaller database of
    documents, allowing you to browse for information on a given subject. Topic directories are
    established based on reviewing and classifying each Web site for content.
    Since classification of Web sites requires human intervention, these directories are smaller in
    scope, but often lead to more precise results. The data is organized!
            Yahoo arranges and reviews over a million sites.
            LookSmart contains over 500,000.
            Magellan reviews sites for value, allowing the user to screen out “content for mature
            audiences.”
            Lycos includes abstracts for sites matching search results.

    III. Meta-Indexes:
    Meta-indexes search other indexes. These tools translate your query into the format of several
    other search tools and return results categorized by the tool used.
            Dogpile and MetaCrawler query most of the major search engines.
            Google, which searches its own index rather than other engines, is currently the
            largest index, with access to over 1.3 billion pages.
            One site, www.37.com, claims to search using thirty-seven different engines.
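
Translating one query into the format of several tools can be sketched in Python as follows. The two engines here are placeholders: their names, addresses and parameter names are invented for this example and do not describe any real search tool.

```python
from urllib.parse import urlencode

# Hypothetical engines: each entry is (address, name of the query parameter).
ENGINES = {
    "engine-one": ("http://search.one.example/find", "q"),
    "engine-two": ("http://search.two.example/query", "terms"),
}


def translate_query(query):
    """Return a request URL for the same query, keyed by engine name."""
    return {
        name: base + "?" + urlencode({param: query})
        for name, (base, param) in ENGINES.items()
    }


for name, url in translate_query("Duke Blue Devils").items():
    print(name, url)
```

A real meta-index would then fetch each URL and present the results grouped by the tool that returned them. That step is left out of this sketch.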

Search Qualifiers / Boolean Operators: examples of commonly used operators

 Operator             Example                  Result
 AND, +               Gore AND Bush            returns documents with both Gore and Bush
                      +Gore +Bush
 OR                   Gore OR Bush             returns documents with either Gore or Bush
 NOT, -               mickey NOT mouse         returns mickey but not mouse. Mickey Mantle
                      +mickey -mouse           would be found, but not Mickey Mouse.
 Capitalization       Mouse                    returns proper name. Mickey Mouse would be
                                               found, but not field mouse.
 “phrase in quotes”   “Duke Blue Devils”       returns exact phrase. Excludes pages about Duke
                                               Power, devil worship or blue suede shoes.
 NEAR                 Duke NEAR Blue           similar to quotes except proximity of words
                      NEAR Devils              determines results
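
The + and - operators can be illustrated with a short Python sketch. It is a simplification under stated assumptions: it handles only +word and -word terms, ignores any term without a prefix, and compares words without regard to capitalization, so it does not reproduce the Capitalization, phrase or NEAR rows. The names words and matches are made up for this example.

```python
import re


def words(text):
    """Return the set of lowercase words in a piece of text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matches(query, text):
    """True if text contains every +word and none of the -words in query."""
    found = words(text)
    for term in query.split():
        word = term[1:].lower()
        if term.startswith("+") and word not in found:
            return False  # a required word is missing
        if term.startswith("-") and word in found:
            return False  # an excluded word is present
    return True


print(matches("+mickey -mouse", "Mickey Mantle"))  # True
print(matches("+mickey -mouse", "Mickey Mouse"))   # False
```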