

									The Invisible Web
      Based on
  Gary Price, MLIS
 George Washington University

    Chris Sherman
       Associate Editor
     Search Engine Watch

How Search Engines Work


[Diagram: The Web (URL3, URL4), Indexer, Search Engine Database. A user query ("Eggs?") returns ranked matches such as "Eggo - 81%" and "Ego - 40%".]
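As a rough illustration of the indexing and lookup steps named on this slide, here is a minimal sketch of an inverted index. The page URLs, page texts, and function names are hypothetical, and real search engines rank results rather than simply matching words.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every query word, sorted for stable output."""
    words = tokenize(query)
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)

# Hypothetical crawled pages (URL -> extracted text); not taken from the slides.
PAGES = {
    "http://example.com/url3": "Fresh eggs for sale",
    "http://example.com/url4": "Eggo waffles and more",
}
INDEX = build_index(PAGES)
```

Calling `search(INDEX, "Eggs?")` looks up only what the crawler fetched and the indexer stored. A page that was never crawled has no entry in the index, so no query can return it.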
  What is the Invisible Web?
• “Stuff” that search engine crawlers
  (spiders) cannot -- or will not --
  add to their databases
• 2 to 50 times larger than the
  visible Web
• Resources often much higher
  quality than the visible Web
• Certain file formats (PDF, Flash,
  Office files, streaming media)
  – Why? They aren’t HTML text
• Most real-time data (stock quotes,
  weather, airline flight info)
  – Why? Ephemeral & storage intensive
• Dynamically generated pages
  (CGI, JavaScript, ASP, or most
  pages with “?” in URL)
  – Why? Spider traps
• Web accessible databases
  – Why? Spiders can’t type
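The exclusion rules listed here can be sketched as a filter that a cautious crawler might apply before fetching a URL. This is only an illustration: the extension list and the function name `skip_reason` are assumptions, and actual crawlers use their own, more elaborate rules.

```python
import posixpath
from urllib.parse import urlparse

# Illustrative list of non-HTML extensions (assumed, not from the slides).
NON_HTML_EXTENSIONS = {".pdf", ".swf", ".doc", ".xls", ".ppt", ".mp3", ".ram"}

def skip_reason(url):
    """Return why a cautious crawler would skip this URL, or None to fetch it."""
    parts = urlparse(url)
    # A "?" followed by parameters usually marks a dynamically generated
    # page, which could lead the spider into an endless set of URLs.
    if parts.query:
        return "dynamic page (query string)"
    # Files that are not HTML text are skipped by file extension.
    ext = posixpath.splitext(parts.path)[1].lower()
    if ext in NON_HTML_EXTENSIONS:
        return "non-HTML file format"
    return None
```

For example, `skip_reason("http://example.com/search.cgi?q=eggs")` reports a dynamic page, while `skip_reason("http://example.com/index.html")` returns `None`, meaning the page would be fetched.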
       Hidden Web sites
• Opaque Web – material that can be,
  but is not, included in search engine
  results. Ex: new material added and not
  yet picked up.
• Private Web – sites intentionally
  excluded from search engine results.
  Ex: password-protected sites
• Proprietary Web – sites that require
  user registration. Ex: eBay, New York
     Invisible Web Gateways
• Complete Planet
• Librarians’ Index to the Internet
• Digital Librarian
• Direct Search (not up to date and not well organized)
           The Invisible Web
            & The Librarian
         The Need For Knowledge!
• Awareness that the IW Exists
  Maybe the IW Holds the Content Your Users Can’t
  Find! What is the cost in both wasted time/effort and
  total frustration?
• Let Others Know About the IW
• Awareness of The Synonyms
   – Invisible Web
   – Deep Web
   – Hidden Web
Why is the IW Useful to the Librarian
          and the End User?
• Quality of Content (Authority)
• Deep Content on Subject Area (Comprehensiveness)
• Focused Databases (Limited Scope)
  Smaller Universe of Documents to Search (Maximize
• Material Unavailable Elsewhere on the Web
• Many Options to Limit, Sort, Interact with the Data
  (Maximize Precision)
• Timeliness vs. Time Lag of General Search Tools
            The Invisible Web
             & The Librarian

•   It’s Not The Magic Bullet. It’s a Tool
•   We Still Need Traditional Online Databases
•   Learning Curve, Sorry!
•   Database Selection, When To Use the IW?
•   Numerous Interfaces, Syntax
•   A Non-Stop Flow of New Material

  Types of IW Content in Librarian Terms

• Bibliographic
  – OPACs
  – Subject Bibliographies
• Non-Bibliographic
  – Full-Text
  – Numeric
  – Graphic
  – Directory
  – Real-Time
• Information stored in tables (Access, Oracle,
  SQL Server, DB2) and accessible only by
• Examples:
   – Phone books, People finders
   – Patents, laws
   – Items for sale in a Web store or Web-based
   – Digital exhibits
   – Multimedia and graphical files
   – Stock and bond prices
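To illustrate why table-stored information stays out of reach of a link-following spider, here is a minimal sketch that uses an in-memory SQLite database as a stand-in for the products named above. It assumes the truncated bullet refers to access by query. The table, names, and phone numbers are made up for the example.

```python
import sqlite3

def make_directory():
    """Create a small in-memory phone book (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (name TEXT, phone TEXT)")
    conn.executemany(
        "INSERT INTO people VALUES (?, ?)",
        [("Ada Example", "555-0100"), ("Bo Sample", "555-0101")],
    )
    return conn

def lookup(conn, name_fragment):
    """Return (name, phone) rows whose name contains the typed fragment."""
    rows = conn.execute(
        "SELECT name, phone FROM people WHERE name LIKE ? ORDER BY name",
        (f"%{name_fragment}%",),
    )
    return rows.fetchall()
```

Each record appears only in response to a search term that someone types, for example `lookup(make_directory(), "Ada")`. No static page lists the rows, so a spider has no link to follow to reach them.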
         Invisible Web:
      Scholarly information
• Citeseer (computer science)
• Google Scholar
• Infomine: Scholarly Internet Resource
• Scirus
         Invisible Web:
      Intellectual Property
• USPTO search
• ESP@CENET (European Patent
  Office) Patent Database
          Invisible Web:
           Art & Artists
• ADAM (Art, Design, Architecture &
  Media Information Gateway)
• Artcyclopedia
         Invisible Web:
     Real-Time Information
• Flight Tracker
• Stock prices
• Weather
• Currency exchange rates
       Invisible Web:
 Maps and Driving Directions
• Google Maps
         Invisible Web:
        Health & Medicine
• Medline Plus – Medical encyclopedia
• WebMD
• Economics of Tobacco Control Database
       Invisible Web:
    News & Current Events
• Google News
• RSS feeds (Web 2.0)
