Sometime in the near future, a research project will be assigned, and information will need to be gathered.
With billions of pages on the Internet, the first thing to figure out is which search engine to use.
How Do Search Engines Work?
Search Engines do not really search the World Wide Web directly. Each one searches a database of the full
text of web pages automatically collected from the billions of web pages out there. When you search the
web using a search engine, you are always searching a somewhat stale copy of the real web page.
However, when you click on the links provided in a search engine's search results, you retrieve the current
version of the page.
Search engine databases are selected and built by computer programs called spiders. These "crawl" the
web, looking for pages by following the links in the pages they already have in their database (i.e., the ones
they already "know about"). They cannot think or use judgment to "decide" to go look something up and see
what's out there on the web.
If a web page is never linked to in any other page, search engine spiders cannot find it. The only way a
brand new page - one that no other page has ever linked to - can get into a search engine is for its address
to be sent by some human to the search engine companies as a request that the new page be included. All
search engine companies offer ways to do this.
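To make the link-following idea concrete, here is a minimal sketch in Python. It does not touch the real web: the page names and the tiny LINKS table are made up for illustration, and real spiders are far more elaborate. It only shows why a page that nothing links to stays invisible until somebody submits its address.

```python
from collections import deque

# Hypothetical miniature "web": each page maps to the pages it links to.
LINKS = {
    "home.html": ["about.html", "news.html"],
    "about.html": ["home.html"],
    "news.html": ["story.html"],
    "story.html": [],
    "orphan.html": ["home.html"],  # links out, but no page links to it
}

def crawl(seeds, links):
    """Return the pages reachable from the seed pages by following links."""
    found = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        found.append(page)
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return found

# Starting from the home page, the spider never reaches orphan.html.
print(crawl(["home.html"], LINKS))

# Once a human submits orphan.html as a starting address, it is included.
print(crawl(["home.html", "orphan.html"], LINKS))
```

The second call stands in for a human sending the new address to the search engine company.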
After spiders find pages, they pass them on to another computer program for "indexing." This program
identifies the text, links, and other content in the page and stores it in the search engine database's files so
that the database can be searched by keyword and whatever more advanced approaches are offered, and
the page will be found if your search matches its content.
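The indexing step can be sketched the same way. The example below is only an illustration under simple assumptions: the three page texts are invented, words are split on anything that is not a letter or digit, and a page matches only if it contains every word of the query. Real search engines store far more than this.

```python
import re
from collections import defaultdict

# Hypothetical page texts standing in for what a spider collected.
PAGES = {
    "page1.html": "Chocolate fudge recipes for beginners",
    "page2.html": "History of chocolate in Europe",
    "page3.html": "Easy cookie recipes",
}

def build_index(pages):
    """Map each lowercase word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index

def search(index, query):
    """Return the pages that contain every word of the query, sorted by name."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    matches = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(matches)

index = build_index(PAGES)
print(search(index, "chocolate recipes"))  # pages containing both words
```

Because the index is built ahead of time, a keyword search only has to look up the stored word lists instead of rereading every page.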
Interesting Facts to Ponder About Search Engines
--no search engine covers more than 25% of the Internet
--all of them work slightly differently in how they find websites; hence, you get somewhat different
returns from different search engines.
--a search engine is not a directory. Directories like Yahoo! or Pandia Plus are collections of links to
websites reviewed and edited by human beings like you and me.
My Top Ten Choices of Search Engines (in no particular order)
Gigablast has a huge index and some nice advanced search options, although not terribly advanced. A tool
called Giga Bits helps you narrow your search. It works like this: If you search for the word “chocolate”, Giga
Bits suggests “chocolate bars”, “chocolate gifts”, “recipes” and more. If you select “recipes” it goes on to
suggest “cookies”, “fudge”, “desserts” and more. A tool like this can be particularly handy if you don’t know
exactly what you are looking for. By adding ?raw=9 to the end of the URL of any search result page, you
can create an RSS feed. For example, http://www.gigablast.com/search?q=chocolate+%22Recipes%22?raw=9
gives you an RSS feed for chocolate recipes.
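If you would rather build such an address with a program than type it by hand, the following Python sketch shows one way. The helper name search_url is made up for this example. Note that in a standard web address the extra parameters after the first are joined with "&" rather than a second "?", so the sketch builds the address that way. The raw=9 setting itself is taken from the description above and has not been checked against the live service.

```python
from urllib.parse import urlencode

def search_url(base, terms, **extra):
    """Build a search address from a base URL, search terms and extra parameters."""
    params = {"q": terms}
    params.update(extra)
    # urlencode escapes spaces and quotation marks and joins parameters with "&".
    return base + "?" + urlencode(params)

# raw=9 is the feed setting described in the text (an unverified assumption).
url = search_url("http://www.gigablast.com/search", 'chocolate "Recipes"', raw=9)
print(url)
```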
This search engine doesn’t just search for web sites that match your search term; it searches the whole
topic area. In this way it can return relevant results on your topic that don’t necessarily mention the word
you searched for. The search results are presented in a way slightly different from your average search
engine: every site is described by up to three meaningful, complete sentences from the site in question.
This means that you can often gain information on a topic without having to leave the search page.
Exalead is a European search engine. It has some very handy advanced search options and a new way of
presenting search results. The advanced features include truncation, proximity search, stemming, phonetic
search, and language field search.
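Two of these features are easy to picture with a little code. The sketch below shows the general idea of truncation (matching any word that starts with a given stem) and proximity search (two words within a few words of each other). The function names and the max_gap setting are assumptions made for this example; this is not Exalead's own syntax or implementation.

```python
import re

def words(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def truncation_match(text, prefix):
    """True if any word in the text starts with the given prefix."""
    return any(w.startswith(prefix.lower()) for w in words(text))

def proximity_match(text, first, second, max_gap=3):
    """True if the two words occur within max_gap positions of each other."""
    tokens = words(text)
    pos_a = [i for i, w in enumerate(tokens) if w == first.lower()]
    pos_b = [i for i, w in enumerate(tokens) if w == second.lower()]
    return any(abs(a - b) <= max_gap for a in pos_a for b in pos_b)

title = "Chocolate and walnut fudge recipes"
print(truncation_match(title, "choc"))                  # stem "choc" matches "chocolate"
print(proximity_match(title, "chocolate", "fudge"))     # three words apart
print(proximity_match(title, "chocolate", "recipes"))   # four words apart
```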
The search result page is brimming with information. In the left column, you can narrow your search by
choosing related terms or related categories. The middle column contains the search results and in the right
hand column you find thumbnail images of the web pages. You can choose to display only the search
results or only the thumbnails. Clicking a search result or thumbnail opens the web page in a small
window. If it’s not what you were looking for, you just close it and your search results are still there. Cool,
Like Exalead, Snap wants to deliver visually enhanced search results. It gives you text on the left and web
site previews on the right. So, you don’t have to click through every search result to see what you’ll get, but
have a chance to “look before you click”.
You can move up and down the list of search results using the arrow keys on your keyboard or the scroll
wheel of your mouse. Snap also offers a live preview on the right. You click the image and the web site
loads – either inside the Snap window or in a new browser window.
Voted four times Most Outstanding Search Engine by Search Engine Watch readers, Google has a well-deserved
reputation as the top choice for those searching the web. The crawler-based service provides
both comprehensive coverage of the web and great relevancy.
Google provides the option to find more than web pages, however. Using the options above the search box on
the Google home page, you can easily seek out images from across the web, follow discussions that are taking
place on Usenet newsgroups, locate news information or perform product searches.
Google search results give you the most relevant sites at the top of the result list. It does this by rating sites
according to how many other sites link to them. Links from popular or important sites count more
than links from smaller, unknown sites. The search engine figures that if many high quality sites link to a
particular site, that site must contain some high quality information.
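The "links from important sites count more" idea can be illustrated with a small calculation. The Python sketch below is a simplified link-scoring loop written for this handout, not Google's actual method: the page names, the damping value of 0.85 and the number of rounds are all assumptions. Pages "x" and "y" each have exactly one incoming link, but x is linked from a heavily linked "hub" while y is linked from a page nobody links to.

```python
# Hypothetical link table: each page maps to the pages it links to.
LINKS = {
    "fan1": ["hub"],
    "fan2": ["hub"],
    "fan3": ["hub"],
    "hub": ["x"],
    "lonely": ["y"],
    "x": [],
    "y": [],
}

def rank_pages(links, damping=0.85, rounds=50):
    """Score pages so that links from high-scoring pages are worth more."""
    pages = list(links)
    score = {p: 1.0 / len(pages) for p in pages}
    for _ in range(rounds):
        # Every page starts each round with a small base score.
        new = {p: (1.0 - damping) / len(pages) for p in pages}
        for page, targets in links.items():
            if targets:
                # A page shares its score equally among the pages it links to.
                share = damping * score[page] / len(targets)
                for target in targets:
                    new[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * score[page] / len(pages)
                for p in pages:
                    new[p] += share
        score = new
    return score

scores = rank_pages(LINKS)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

In this toy example x ends up with a higher score than y, even though each has only one incoming link, because x's link comes from the better-connected page.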
This search tool organizes results into categories and subcategories. It brings up results that would be
buried in the many pages of Google's vertical lists. Clustering 2.0 expands the cluster party by allowing
searchers to remix, to view those clusters that didn't make the first page of results.
The Clusty folks explain:
Although clustering reveals the major topics in the top 200, 500, or more search results, there are
always more topics than can be shown, without overloading the user with a very long list. There
hasn’t been any better approach, until now.
With a single click, remix clustering answers the question: What other, subtler topics are there? It
works by clustering again the same search results, but with an added input: ignore the topics that the
user just saw. Typically, the user will then see new major topics that didn’t quite make the final cut at
the last round, but may still be interesting.
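The "ignore the topics the user just saw" step can be pictured with a toy example. In the Python sketch below, each search result already carries a topic label, which is a big simplification: a real clustering engine works the topics out from the text itself. The result titles, labels and function name are made up for illustration.

```python
from collections import Counter

# Hypothetical search results as (title, topic) pairs.
RESULTS = [
    ("Fudge in ten minutes", "recipes"),
    ("Best brownie recipe", "recipes"),
    ("Chocolate chip cookies", "recipes"),
    ("Hot chocolate at home", "recipes"),
    ("Cacao and the Aztecs", "history"),
    ("Chocolate comes to Europe", "history"),
    ("The first chocolate bar", "history"),
    ("Is dark chocolate good for you?", "health"),
    ("Chocolate and caffeine", "health"),
    ("Gift boxes for chocolate lovers", "gifts"),
]

def top_topics(results, limit, ignore=()):
    """Return the most common topics, skipping any topics listed in ignore."""
    counts = Counter(topic for _, topic in results if topic not in ignore)
    # Sort by count (largest first), then by name so ties are predictable.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [topic for topic, _ in ordered[:limit]]

first_view = top_topics(RESULTS, 2)
remixed = top_topics(RESULTS, 2, ignore=first_view)
print(first_view)  # the major topics shown first
print(remixed)     # the topics that did not make the first cut
```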
The Yahoo! search engine is a very powerful tool, and can probably be compared with Google as regards
quality, although not as regards scope (i.e. number of pages listed). In addition to excellent search results,
you can use tabs above the search box on the Yahoo home page to seek images.
The Yahoo Directory still survives. You'll notice "category" links below some of the sites listed in response to
a keyword search. When offered, these will take you to a list of web sites that have been reviewed and
approved by a human editor. It can be found at Web directory.
Previously known as Ask Jeeves, this search engine is the small sibling of the three big search giants
Google, Yahoo! and MSN Search. The main page contains a search box and a convenient but unobtrusive
set of tools. The tools menu can be configured and lets you search for images, news, weather, blogs, feeds,
maps, shopping, stocks and more. The search results page suggests ways to narrow your search. Some
results are adorned with a small image of binoculars. When you point your mouse to this image, a preview
of the web page in question pops up. There is also a wide variety of tools and answers that show up in the
search results pages when relevant. The Ask search engine integrates the Teoma search technology, the
only search engine known to take the profile of the whole site into consideration when deciding on search
AllTheWeb is powered by Yahoo, but you may find it a lighter, more customizable and pleasant "pure search"
experience than you get at Yahoo itself. The focus is on web search, but news, picture, video, MP3 and
FTP search are also offered.
Live Search (MSN Search)
Live Search (formerly Windows Live Search) is the name of Microsoft's web search engine, successor to
MSN Search, designed to compete with the industry leaders Google and Yahoo. The search engine offers
some innovative features, such as the ability to view additional search results on the same web page
(instead of needing to click through to subsequent search result pages) and the ability to adjust the amount
of information displayed for each search result (i.e. just the title, a short summary, or a longer summary). It
also allows the user to save searches and see them updated automatically on Live.com. Microsoft is now
making MSN Search part of its Web 2.0 strategy, i.e. part of its total set of online services called Windows
Want something REALLY special?
SpaceTime is a kind of stackable metasearch tool. That is, it allows you to search across multiple search tools
and view your results as a visual 3D stack. This looks like it's going to be especially helpful with image and
video searches. In a regular keyword search, SpaceTime creates buttons for your keywords making it
easier to search within your search. I suspect my students are going to love this one. It's elegant!
Note: The PC demo looks great, but unfortunately, the Mac version is not yet ready for prime time.
The SpaceTime folks explain:
The days of mining through pages and pages of tiny thumb-nails in an effort to find the item you are
looking for are over. Because SpaceTime™ has unlimited space, you can display hundreds of
items at once to find exactly what you are looking for.