Search Engine Optimization
SEO Strategy and the Search Engine Industry
As the Internet grew and became an integral part of day-to-day work, it
became almost impossible for a user to fetch exact or relevant
information from such a huge web. Hence 'Search Engines' were developed.
Search engines became so popular that now more than 80% of web-site
visitors come from search engines. But what exactly is a Search Engine?
According to Webopedia, a "Search Engine is a program that searches
documents for specified keywords and returns a list of the documents
where the keywords were found".
For example, if you want to know about the automobile market in India,
you will type keywords like automotive market, automobiles in India,
automobile manufacturers in India, etc., and once you click the search
button, you'll get the most relevant data related to those keywords.
The search engine industry is dominated by three major players - Google,
Yahoo! and MSN - with market shares of 35%, 32% and 16% respectively
(according to a searchenginewatch.com survey, 2004). A report released in
March 2005 indicated that search engines are used between 2 and 3.5
billion times per day to find information online. And of course, people
using search engines are on the hunt for specific information, so the
audience is highly targeted. Many other search engines, such as
AskJeeves, AOL and Excite, are also popular among net users. But what is
striking is the statistics regarding search engine usage.
The use of search engines is a top online activity, and netizens
increasingly feel they get the information they want when they execute a
search.
On the probable eve of Google's initial public offering, new surveys and
traffic data confirm that search engines have become an essential and
popular way for people to find information online. A nationwide phone
survey of 1,399 Internet users between May 14 and June 17 by the Pew
Internet & American Life Project shows:
84% of internet users have used search engines. On any given day
online, more than half those using the Internet use search engines. And
more than two-thirds of Internet users say they use search engines at
least a couple of times per week.
The use of search engines usually ranks second only to email as the most
popular activity online. During periods when major news stories are
breaking, the act of getting news online usually surpasses the use of
search engines.
There is a substantial payoff as search engines improve and people
become more adept at using them. Some 87% of search engine users say they
find the information they want most of the time when they use search
engines.
The convenience and effectiveness of the search experience solidifies
its appeal. Some 44% say that most times they search they are looking for
vital information they absolutely need.
comScore Networks' tracking of Internet use shows:
Americans conducted 3.9 billion total searches in June.
44% of those searches were done from home computers, 49% from work
computers, and 7% at university-based computers.
The average Internet user performed 33 searches in June.
The average visit to a search engine resulted in 4.4 searches.
The average visitor scrolled through 1.8 result pages during a search.
In June, the average user spent 41 minutes at search engine sites.
comScore estimates that 40-45 percent of searches include sponsored
results.
Approximately 7 percent of searches in March included a local modifier,
such as city and state names, phone numbers or the word "map".
The percentage of searches that occurred through browser toolbars in
June was 7%.
Search Engines Market Share:
Voted Most Outstanding Search Engine four times, Google is the
undisputed market leader of the search engine industry. Google is a
crawler-based search engine known for providing both comprehensive
coverage of the web and the most relevant information. It attracts the
largest number of searches, up to 250 million searches per day.
Yahoo! is the second largest player in the industry with 23.4% of market
share. Yahoo! started as a human-edited directory but turned into a
crawler-based search engine in 2002. Until early 2004 it was powered by
Google, but after that it started to use its own technology.
Overture stands next to Google in terms of number of searches per day. It
is owned by Yahoo! and attracts more than 167 million searches per day.
Overture was the first search engine to come up with a PPC
(pay-per-click) program.
AskJeeves initially gained fame in 1998 and 1999 as being the "natural
language" search engine that let you search by asking questions and
responded with what seemed to be the right answer to everything. When
launched, it was run by around 100 editors who monitored search logs.
Today, however, AskJeeves depends on crawler-based technology to provide
results to its users.
Search Engine History:
Though Google is responsible for where search engines stand today, the
first search engine was invented well before Google was incorporated.
Alan Emtage, a student at McGill University, created the first search
engine in 1990 and named it 'Archie'. Back then there was no World Wide
Web! FTP was the means to share data. It was effective in smaller groups,
but the data became fragmented as it accumulated. Archie helped solve
this data scatter problem by combining a script-based data gatherer with
a regular expression matcher for retrieving file names matching a user
query. Essentially, Archie became a database of FTP filenames, which it
would match against users' queries.
As word of mouth spread, Archie gained such popularity that in 1993 the
University of Nevada System Computing Services group developed Veronica.
Veronica served the same purpose as Archie, but it worked on plain text
files. Soon another user interface named Jughead appeared with the same
purpose as Veronica; both of these were used for files sent via Gopher,
which was created as an Archie alternative by Mark McCahill at the
University of Minnesota in 1991.
Now the challenge was to automate the process, and the first internet
robot was introduced. Computer robots are simply programs that automate
repetitive tasks at speeds impossible for humans to reproduce. Matthew
Gray created the World Wide Web Wanderer in 1993; he initially wanted to
measure the growth of the web and created the bot to count active web
servers. He soon upgraded the bot to capture actual URLs. Its database
became known as the Wandex. The Wanderer was as much a problem as it was
a solution, because it caused system lag by accessing the same page
hundreds of times a day.
By December of 1993, three full-fledged bot-fed search engines had
surfaced on the web: JumpStation, the World Wide Web Worm, and the
Repository-Based Software Engineering (RBSE) spider. JumpStation gathered
info about the title and header from web pages and retrieved these using
a simple linear search. As the web grew, JumpStation slowed to a stop.
The WWW Worm indexed titles and URLs. The problem with JumpStation and
the World Wide Web Worm was that they listed results in the order they
found them, and provided no discrimination. The RBSE spider did
implement a ranking system.
Brian Pinkerton of the University of Washington released WebCrawler
on April 20, 1994. It was the first crawler that indexed entire pages.
Soon it became so popular that during daytime hours it could not be used.
AOL eventually purchased WebCrawler and ran it on their network. Then in
1997, Excite bought out WebCrawler, and AOL began using Excite to power
its NetFind. WebCrawler opened the door for many other services to follow
suit. Within a year of its debut came Lycos, Infoseek, and OpenText.
In 1998 the last of the current search superpowers, and the most
powerful to date, Google, was launched. It decided to rank pages using an
important concept: value implied by inbound links. This makes the web
somewhat democratic, as each outgoing link is a vote. Google has become
so popular that major portals such as AOL and Yahoo! have used Google,
allowing that search technology to own the lion's share of web searches.
MSN Search was also launched in 1998, as were the Open Directory and
Direct Hit.
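The idea of counting inbound links as votes can be illustrated with a small sketch. The site names and link graph below are hypothetical, and this is plain link counting, not Google's full algorithm:

```python
# A toy illustration (not Google's actual algorithm): rank pages by
# counting inbound links, treating each outgoing link as a "vote".
from collections import Counter

# Hypothetical mini-web: page -> pages it links to
links = {
    "a.com": ["b.com", "c.com"],
    "b.com": ["c.com"],
    "c.com": ["a.com"],
    "d.com": ["c.com"],
}

votes = Counter()
for page, outgoing in links.items():
    for target in outgoing:
        votes[target] += 1  # each outgoing link is one vote for the target

ranking = [page for page, _ in votes.most_common()]
print(ranking[0])  # c.com collects the most votes
```

Real engines weight each vote by the importance of the linking page, but even this crude count already favors the page the rest of the web points at.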
Search Engines and Directories
Web Directory is a web search tool compiled manually by human editors.
Once websites are submitted with information such as a title and
description, they are assessed by an editor and, if deemed suitable for
addition, will be listed under one or more subject categories. Users can
search across a directory using keywords or phrases, or browse through
the subject hierarchy. The best-known examples of directories are Yahoo!
and the Open Directory Project.
The major difference between a search engine and a directory is the human
factor. A web site directory indexes a web site based on an independent
description of the site. While directories perform many of the same
functions as a web page search engine, their indexing format is
different. The main difference is that directories do not spider your
site to gather information about it. Instead they rely on a few text
entries, typically a site title, domain name, and description, to
determine which keywords describe your site. While sites in search
engines are scanned and listed automatically by a program (the crawler),
they are edited manually in directories. Directories group websites
according to theme or industry, e.g. automobile-related sites are placed
in one sub-directory, sports sites in another, and so on. So directories
help organize thousands of web sites together. A directory contained
inside another directory is called a subdirectory of that directory.
Together, the directories form a hierarchy, or tree structure.
There are directories on the web for almost any category you could name.
Some search engines are adding general directories to their web pages.
While helping researchers by suggesting general topics to search under,
these general directories also place banner ads on their search engine,
which encourage some users to spend more time browsing their sites, and
the banner ads help pay the directories' costs for posting on the
internet. A web directory is a directory on the World Wide Web that
specializes in linking to other web sites and categorizing those links.
Web directories often allow site owners to submit their site for
inclusion; editors review submissions for suitability.
There are five types of directories, namely Human Edited, User
Categorized, User Classified, Independently Classified and Pay Per Click
(PPC).
1. Human Edited (Categories):
This is the 'traditional' directory. It is the most prestigious, as each
listed site is 'hand picked' and reviewed by a human editor. The
assumption is that the editor is an 'expert' in his/her field and will
select for inclusion only appropriate sites. Such directories usually
have very clear and stringent acceptance rules, which ensure the quality
of the search results. Invariably, the Directory is comprised of
categories to which sites are 'assigned'. This type of Directory is
relatively hard to maintain, as it is labor intensive and hence
expensive. That also explains why many such directories are using
volunteers to do the work. Notable examples of Human Edited Directories
are Yahoo, Dmoz, Joeant and Gimpsy, but there are many more. There is no
doubt that this is the most important type to submit your site to. Only
the scrutiny of an independent human reviewer can ensure the quality and
suitability of a web site to a given category.
2. User Categorized:
The Directory is structured in a very similar way as the Edited
Directory, but it is the user's decision as to the best category to place
the site in. While this is quite attractive for the Directory Owner (the
users do the 'hard work') as well as the Site Owner (freedom to place the
site in any category), the search results may be far from satisfactory.
One such Directory is Websquash. You may get benefits from registering in
such a directory, but make sure you consider all the relevant aspects.
3. User Classified:
Sites are classified by keywords, entered by the Site Owner in the Meta
Tags of the home page. The attraction here is that the site is classified
(potentially) by many keywords and the process is fully automatic (low
maintenance). While easy to register, the sorting algorithm has very
little to go by, hence the position of the site in the search results
doesn't mean much. Moreover, should you choose popular keywords you have
little chance of being found due to the number of sites competing with
you. On the other hand, selecting a rare combination of keywords suffers
from the obvious problem of the minuscule number of searchers using that
combination. One of the better-known examples is ExactSeek, which enjoys
significant popularity. Its attraction may be related to its use of the
Alexa ranking, which measures a site's popularity, as a primary sorting
criterion for the search results.
4. Independently Classified:
Instead of letting the Site Owner decide which keywords to use for
finding his site, this type of directory allows every user to determine
the relevancy of keywords. This latest addition to the Directory family
harnesses the public vote to examine and determine relevancy of keywords
to sites. Each user may choose to rate a (random) site and voice his/her
opinion of the suitability of specific keywords to that site. The best
example for such a site is Netnose. Due to the democratic process, it is
highly likely that relevancy will be good. However, for such a site to
achieve prominence requires a larger number of users willing to donate
their time and effort to that rating activity.
5. Pay Per Click:
While technically PPC Directories are of the User Classified type, their
business model implies some significant characteristics that Site Owners
should be aware of:
· A link from a PPC directory is never a direct or simple link. Hence
being listed in a PPC directory will never help to increase Link
Popularity with search engines.
· A link from a PPC directory remains in place only as long as the
user's account is cash positive.
· PPC directories try to maximize their revenues by encouraging
Site Owners to bid for as many keywords as they can, even those that are
only remotely related to their site's business.
At the beginning of the web era, users would go to directories to find
sites relevant to their interests. In fact, Yahoo!, the web's number one
destination, started as a directory. Nowadays, most users rely on search
engines, not directories, to find what they're looking for.
When search engines started to become popular, they relied on web pages'
'keyword metatags' to determine the topic and relevance of the page (the
keyword metatag is a section within a web page's HTML code where
webmasters can insert words that are relevant to the page's content).
Webmasters discovered that by stuffing their meta tags with popular
search terms repeated hundreds of times, they could propel their pages to
the top of the search results.
Search engines caught on to the abuse and decided to ignore the meta tags
and rely instead on web page copy. Webmasters then started to overstuff
their page copy with popular search terms, often writing them in the same
color as the web page's background, so that they could be detected by
search engines while being invisible to users.
Again, search engines discovered the trick and decided that the best way
to rank a web page's content and its topical relevance was to rely on
inbound links from other pages. The rationale behind this is that it is
much more difficult to influence other people to link to you than it is
to manipulate your own web page elements.
There are several ways to get inbound links, among them writing articles
that include your bylines with a link to your page, exchanging links, and
listing your site in directories.
Listing your sites in good directories is probably the best way to get
quality links that are highly valued by the search engines. Since
directories rely on human editors who enforce strict criteria to list a
site, and since directories organize the information in highly focused
categories, they are an invaluable resource for search engines to measure
the quality and the relevance of a web page.
In summary, directories are important not because they generate
significant traffic, but because they are given great importance by the
search engines to qualify and rank web pages, and to determine their
relevance.
Major Search Engines
Among the thousands of search engines, very few are famous, thanks to
their algorithms, which help users find the most relevant information.
As observed earlier, Google, Yahoo! and MSN are the top three search
engines in the world. But Teoma, Excite, Ask Jeeves, AOL, HotBot,
AltaVista, Lycos, etc. also handle a large number of searches.
A listing in these search engines can attract huge traffic to a website.
Hence it is very important for a search engine optimizer to know which
search engines are the best and most heavily used. It is very important
for searchers as well! For them, well-known, commercially backed search
engines mean dependable results. These search engines are more likely to
be well maintained and upgraded when necessary, to keep pace with the
growing web.
There are 9 main features on which search engines can be evaluated. They
are as below.
Boolean: Boolean searching refers to how multiple terms are combined
in a search.
and requires that both terms be found.
or lets either term be found.
not means any record containing the second term will be excluded.
( ) means the Boolean operators can be nested using parentheses.
+ is equivalent to AND, requiring the term; the + should be placed
directly in front of the search term.
- is equivalent to NOT and means to exclude the term; the - should
be placed directly in front of the search term.
Operators can be entered in the case shown by the examples:
(salad and (lime or kiwi)) not nuts
+salad -nuts lime kiwi
Default: What happens when multiple terms are entered for a search
using no Boolean operators, + or - symbols, phrase marks, or other
syntax? Two terms could be processed as:
two AND terms
two OR terms
or "two terms" as a phrase.
Proximity: Proximity searching refers to the ability to specify how
close within a record multiple terms should be to each other. The most
commonly used proximity search option in Internet finding aids is a
phrase search that requires terms to be in the exact order specified
within the phrase markings. The default standard for identifying phrases
is to use double quotes (" ") to surround the phrase.
Phrase searching example: “Phrase searching is fun”
Beyond phrase searching, other proximity operators can specify how close
terms should be to each other. Some will also specify the order of the
search terms. Each search engine can define them differently and use
various operator names such as NEAR, ADJ, W, or AFTER.
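As a rough sketch (operator names and exact semantics vary by engine, as noted above), phrase and NEAR-style proximity matching can be implemented by comparing word positions:

```python
# Phrase and proximity matching via word positions (a simplified model;
# real engines use positional inverted indexes).

def phrase_match(text, phrase):
    """True if the phrase's words appear consecutively, in order."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))

def near(text, a, b, distance):
    """True if terms a and b occur within `distance` words, any order."""
    words = text.lower().split()
    pos_a = [i for i, w in enumerate(words) if w == a]
    pos_b = [i for i, w in enumerate(words) if w == b]
    return any(abs(i - j) <= distance for i in pos_a for j in pos_b)

doc = "Phrase searching is fun once proximity operators are understood"
print(phrase_match(doc, "phrase searching is fun"))  # True
print(near(doc, "proximity", "operators", 1))        # True
print(near(doc, "phrase", "proximity", 2))           # False
```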
Truncation: This search technique refers to the ability to search
just a portion of a word. Typically, a symbol such as the asterisk is
used to represent the rest of the term. End truncation is where several
letters at the beginning of a word are specified but the ending can vary.
With internal truncation, a symbol can represent one or more characters
within a word.
Stemming, related to truncation, usually refers to the ability of a
search engine to find word variants such as plurals, singular forms, past
tense, present tense, etc. Some stemming covers only plural and singular
forms.
End truncation example: colleg* finds college, colleges, collegium
Internal truncation example: col*r finds color, colour, colander
Stemming: lights finds light, lights, lighting, lit
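Truncation maps naturally onto regular expressions: the asterisk in the search term becomes ".*" anchored at both ends of the word. A minimal sketch:

```python
# Truncation matching with regular expressions: * in the search term
# stands for zero or more characters.
import re

def truncation_match(pattern, word):
    regex = "^" + pattern.lower().replace("*", ".*") + "$"
    return re.match(regex, word.lower()) is not None

# End truncation: colleg* matches college, colleges, collegium
print([w for w in ["college", "colleges", "collegium", "colony"]
       if truncation_match("colleg*", w)])

# Internal truncation: col*r matches color, colour, colander
print([w for w in ["color", "colour", "colander", "cold"]
       if truncation_match("col*r", w)])
```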
Case Sensitive: In general, most search engines will match upper
case, lower case, and mixed case as the same term. Some search engines
have the capability to match exact case. Entering a search term in lower
case will usually find all cases. In a case-sensitive search engine,
entering any upper case letter in a search term will invoke the exact
case match.
next finds next, Next, NeXt, NEXT
NeXT finds only NeXT
Fields: Fields searching allows the searcher to designate where a
specific search term will appear. Rather than searching for words
anywhere on a Web page, fields define specific structural units of a
document. The title, the URL, an image tag, or a hypertext link are
common fields on a Web page.
Example: title:searching will look for the word 'searching' in the title
of a web page.
Limits: The ability to narrow search results by adding a specific
restriction to the search. Commonly available limits are the date limit
and the language limit. The latter would restrict the search results to
only those Web pages identified as being in the specified language.
Stop Words: Frequently occurring words that are not searchable. Some
search engines include common words such as 'the' while others ignore
such words in a query. Stop words may include numbers and frequent HTML
strings. Some search engines only search stop words if they are part of a
phrase search or the only search terms in the query.
Examples: the, a, is, of, be, 1, html, com
Sorting: The ability to organize the results of a search. Typically,
Internet search engines sort the results by "relevance", determined by
their proprietary relevance ranking algorithms. Other options are to
arrange the results by date, alphabetically by title, or by root URL.
Now let's see the major search engines and their features:
Google: Right from its establishment in 1998, Google has been the most
favored search engine on the net. Since its beta release it has had
phrase searching and the - for NOT, but it did not add an OR operation
until Oct. 2000. In Dec. 2000, it added title searching. In June 2000 it
announced a database of over 560 million pages, which grew to 4 billion
in February 2004. Its biggest strength is its size and scope. Google
includes PDF, DOC, PS and many other file types. It also has additional
databases in the form of Google Groups, News, Directory etc.
Truncation: No, but stemming; word in phrase
Fields: intitle, inurl, link, site, more
Limits: Language, filetype, date, domain
Stop words: Varies; + searches
Yahoo!: Yahoo! is one of the best known and most popular internet
portals. Originally just a subject directory, it is now a search engine,
directory and portal. It has a large, new search engine. It includes
cached copies of pages and also links to the Yahoo! directory. It
supports full Boolean searching, but it lacks some advanced search
features such as truncation. It indexes the first 500KB of a web page,
and link searches require inclusion of http://.
Boolean: AND, OR, NOT, ( ), -
Fields: intitle, url, site, inurl, link, more
Limits: Language, file type, date, domain
MSN: MSN Search is the search engine for the MSN portal site. For years
it used databases from other vendors, including Inktomi, LookSmart, and
Direct Hit. As of Feb. 1, 2005, it began using its own, unique database.
MSN Search uses its own web database and also has separate News, Images,
and Local databases, along with links into Microsoft's Encarta
Encyclopedia content. Text ads are currently provided by Yahoo! Search
Marketing Solutions (formerly known as Overture). Its large and unique
database, query-building Search Builder, Boolean searching, cached
copies of web pages (including date cached) and automatic local search
options are its strengths.
However, limited advanced features, inconsistent availability of
truncation and stemming, and no title search are its weaknesses.
Boolean: AND, OR, NOT, ( ), -
Fields: link, site, loc, url
Stop words: Varies; + searches
Ask Jeeves / Teoma: Debuting in spring 2001 and relaunching in April
2002, this newer search engine has built its own database and offers some
unique search features. It was bought by Ask Jeeves in Sept. 2001. It
lacks full Boolean and other advanced search features, but it has more
recently expanded and improved its search features and added an advanced
search. While Teoma results can show up in three separate sections, there
is only one single database of indexed web pages. It may also include
paid ad results (from Google's AdWords database) under the heading of
'Sponsored Links.' No additional databases or portal features are
directly available. Ask Jeeves switched to Teoma instead of Direct Hit in
Jan. 2002 for the search engine results after its question-and-answer
matches. Identifying metasites and a Refine feature to focus on web
communities are its strengths; a smaller database, no free URL
submissions, and no cached copies of pages are its weaknesses.
Limits: Language, site, date
Stop words: Yes; + searches
Who powers whom
There are thousands of search engines available on the internet. But it
is not possible for all of them to create, maintain and update their own
databases. Therefore they display results from the major search engines
on their own pages.
It is not necessary that all the primary and secondary results be
provided by one search engine. Different search engines can provide
different results to one search engine. Directories can also be used from
a third party. So a supplier-and-receiver relationship is established
between different search engines. It is very important to understand this
relationship if you want a top ranking for your site.
Now let's check out the relationships between the top 10 search engines
and the top 2 directories, i.e. which search engine is a supplier and
which is a receiver.
1. Google :
· Google's main search results are provided solely from Google's
search technology, offering results from no other engine or source.
· The Google Directory is comprised of listings from The Open
Directory Project (ODP, DMOZ).
· Google provides results to AOL, NetScape, IWon, Teoma,
AskJeeves and Yahoo! Web Results.
2. Yahoo!:
· Paid and free submissions (currently includes Inktomi's results).
· Paid results from Overture.
· Provides main results to HotBot, Excite, Go.com, MSN, Infospace
and About, and backup results to LookSmart and Overture.
3. MSN:
· MSN provides sponsored results from paid advertising sources.
· MSN provides primary results from LookSmart.
· Secondary results are provided from Inktomi.
4. AOL:
· AOL results for "Recommended Sites" are listings that have been
hand picked by AOL editors.
· AOL "Sponsored Sites" are supplied by Google AdWords.
· AOL "Matching Sites" are supplied by Google results. The
results in AOL may not always match the results in Google as Google often
updates their database more frequently.
· AOL directory listings are provided by the ODP.
5. Alta Vista:
· Alta Vista receives sponsored listings from Overture and Alta
Vista's own advertisers.
· Alta Vista will use results from their own database for the
main search results.
· Alta Vista obtains its directory results from LookSmart.
6. HotBot:
· HotBot results contain three categories: Top 10 Results,
Directory Results & General Web Results.
· Top 10 results include popular sites and searches.
· Directory results are hand-picked by human editors.
· Web Results are provided by Inktomi.
· HotBot offers the capability to search the HotBot database,
Lycos, Google and/or AskJeeves, all from one location with the click of
a button.
7. iWon:
· iWon Spotlight results are comprised of web pages found within
iWon or web sites that iWon has a direct business partnership with.
· iWon Sponsored Listings are provided by a variety of paid
advertisements through third-party pay-for-performance listings,
including Google AdWords and Overture.
· iWon Web Site Listings are powered by Google.
· iWon Shopping Listings are provided by Dealtime.com
8. Lycos:
· Lycos provides directory results from The Open Directory Project.
· Lycos provides sponsored listings from Overture.
· Lycos provides Web Results from FAST and from the Lycos network.
9. Netscape:
· Netscape's sponsored links are provided by Google AdWords.
· Netscape's matching results include sites that are handpicked
by ODP editors mixed with results powered by Google.
10. AllTheWeb:
· AllTheWeb crawls and indexes ODP results.
· AllTheWeb powers the main search results in Lycos.
· AllTheWeb provides results from Lycos.
· AllTheWeb also powers the Lycos advanced search feature, the
FTP search feature and their MP3 specialty engine.
1. Dmoz: Directory listings are provided to AOL, Google, Lycos and
Netscape and many other web sites, directories & portals.
2. Yahoo!: Yahoo! Directory listings are supplied by Yahoo! editors and
require a fee for commercial sites. Yahoo directory results are provided
to Alta Vista.
How Search Engines Rank Pages
Broadly search engines are divided into 2 main categories:
Crawler based search engines
Human powered directories
A web crawler is a program developed to scan web pages. The crawler scans
the entire page, indexes it and lists it on the search engine. It
evaluates any particular web page based on several different factors such
as keywords, tables, page titles, body copy etc. Since listings are done
automatically, they can change if you change some content of the website.
Manual listing is done in the case of 'Human Powered Directories'. Human
editors are responsible for listing any site in the directory. The
webmaster needs to submit a short description of the entire site to the
directory, and a search looks for matches only in the description
submitted. Listings are not affected if you change some content on your
web site. Listing in a directory and in a search engine are totally
different, and hence the parameters for listing are different in both
cases. But it is very necessary to create an informative and content-rich
site to attract visitors.
Any crawler-based search engine is made up of 3 basic components:
a. Crawler or Spider
b. Index (the database)
c. Search engine software
All these components work one after another to list the page on the
search engine. A search engine finds a website in 2 ways: 1. by accepting
listings sent by webmasters; 2. by crawlers that roam the internet
storing links to, and information about, each page they visit. Once the
site is found by the search engine, crawlers scan the entire site. While
scanning, the crawler visits each web page, reads it and then follows
links to other pages within the site. Major search engines like Google,
Yahoo! and MSN use multiple crawlers simultaneously. Google uses 4
spiders, which crawl over 100 pages per second and generate around 600KB
of data each second.
The index program starts after the crawler. Once web pages are crawled,
it is necessary to transfer them to the database. The index contains a
copy of each web page scanned by the crawler. If a webpage changes, the
index is updated with the new information. It is very important that the
page is added to the index. Unless and until it is indexed, it is not
available to those searching with the search engine.
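The crawl-then-index pipeline described above can be sketched end to end over an in-memory "web". The pages and URLs below are hypothetical; a real crawler would fetch over HTTP and respect robots.txt:

```python
# A sketch of the crawler -> index -> search pipeline over a fake web.
pages = {
    "/home": {"text": "welcome to our automobile site", "links": ["/cars"]},
    "/cars": {"text": "new automobile models in india", "links": ["/home"]},
}

# 1. Crawler: start from a seed page and follow links within the site.
seen, queue = set(), ["/home"]
crawled = {}
while queue:
    url = queue.pop(0)
    if url in seen:
        continue
    seen.add(url)
    crawled[url] = pages[url]["text"]   # store a copy of the page
    queue.extend(pages[url]["links"])   # follow links to other pages

# 2. Indexer: build an inverted index from the crawled copies.
index = {}
for url, text in crawled.items():
    for term in text.split():
        index.setdefault(term, set()).add(url)

# 3. Search software: match a query term against the index.
print(sorted(index["automobile"]))  # both pages mention "automobile"
```

Until step 2 runs for a page, that page cannot appear in step 3's results, which is why indexing matters as much as crawling.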
The search engine software performs the task of relevant listing. It
searches the entire database, i.e. the indexed pages, and matches them
with the search. Then it ranks and lists the most relevant matches. These
listings are based on how the search engine software is programmed: it
lists results according to what it believes is most relevant!
There are many more factors on which search engine rank a page. We will
look at it in detail later.
Broadly, it depends on on-page factors and off-page factors. On-page
factors include keyword targeting, HTML tags, content, anchor text and
URL, while off-page factors include link building and link popularity.
Though these terms are explained later, for now let's see what strategies
a search engine adopts to list a page. Crawler-based search
engines list the sites without any human interference. This means it
ranks a page based on what it thinks is the most relevant page! There are
a few parameters by which the crawler checks whether a site is relevant
to the search query or not. This program is called the Search Engine
Algorithm. No one knows the exact algorithm of any search engine, but
studies and research have shown that a few factors are common to most
search engine algorithms.
Location of keywords: Once the keywords are finalized, the main task is
the placement of those keywords. Search engine algorithms revolve
largely around keyword location. Keywords can be placed in HTML tags, in
the content, in the headline or in the first few paragraphs, and their
importance varies by location: keywords placed in the headline or the
first few paragraphs matter more than keywords elsewhere on the page. If
keywords appear right from the beginning, the search engine assumes that
the page is more relevant to that particular theme.
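The idea of location-dependent weight can be sketched as a scoring function. The weights below are invented for illustration; no engine publishes its real numbers.

```python
import re

# Hypothetical weights: a keyword in the <title> or the headline
# counts for more than the same keyword in the body text.
WEIGHTS = {"title": 3.0, "h1": 2.0, "body": 1.0}

def location_score(html, keyword):
    """Sum weighted occurrences of the keyword across page locations."""
    score = 0.0
    for tag, weight in WEIGHTS.items():
        if tag == "body":
            text = re.sub(r"<[^>]+>", " ", html)  # full page text
        else:
            match = re.search(rf"<{tag}>(.*?)</{tag}>", html, re.S)
            text = match.group(1) if match else ""
        score += weight * text.lower().count(keyword.lower())
    return score

page = ("<title>SEO in India</title><h1>SEO basics</h1>"
        "<p>About SEO services</p>")
```

Here `location_score(page, "SEO")` rewards the title and headline occurrences more heavily than the body ones.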
Frequency: Though it is very important to place keywords in the most
visible parts of the web page, it is equally important to limit their
number. This is called frequency. Search engines measure keyword
frequency when ranking a page: they analyze how often the keywords
appear in relation to the other words on the page, and pages in which
the keywords appear more often are considered more relevant than others.
Added features beyond location and frequency: Location and frequency are
just the basics of a search engine algorithm. Once search engines found
that anyone could play around with these signals to rank their pages,
they increased the complexity of their algorithms. Different search
engines now index different numbers of web pages, some more and some
fewer, and some re-index pages more often than others. Hence no two
search engines have exactly the same collection of web pages to search
through, which is why results on different search engines always differ.
Once webmasters came to know about frequency, they tried to crack the
algorithm by stuffing too many keywords into a page just to get higher
rankings. Search engines therefore started to penalize such sites,
terming the practice "spamming". So it became very necessary for SEO
companies to keep keyword frequency higher than competitors' but below
the spamming threshold. Search engines watch for common spamming methods
in a variety of ways, including following up on complaints from their
users.
Off-page factors: The above are some of the on-page factors; now let's
look at some common off-page factors. Crawler-based search engines have
plenty of experience by now with webmasters who constantly rewrite their
web pages in an attempt to gain better rankings, and some sophisticated
webmasters even go to great lengths to "reverse engineer" the
location/frequency systems used by a particular search engine. Because
of this, all major search engines now also make use of "off the page"
factors.
Off-the-page factors are those that a webmaster cannot easily influence.
Chief among these is link analysis. By analyzing how pages link to each
other, a search engine can both determine what a page is about and
whether that page is deemed to be "important" and thus deserving of a
ranking boost. In addition, sophisticated techniques are used to screen
out attempts by webmasters to build "artificial" links designed to boost
rankings.
• Link analysis: Web-based search engines have introduced one
dramatically different feature for weighing and ranking pages. Link
analysis works somewhat like bibliographic citation practices, such as
those used by the Science Citation Index. It is based on how well
connected each page is, as defined by hubs and authorities: hub
documents link to large numbers of other pages (out-links), while
authority documents are those referred to by many other pages, i.e.
those with a high number of "in-links".
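The hub/authority idea was formalized in Kleinberg's HITS algorithm. A compact sketch on a hypothetical three-page web (the page names are made up):

```python
def hits(links, iterations=20):
    """Iterative hub/authority scoring in the spirit of HITS.
    `links` maps each page to the pages it links out to."""
    pages = set(links) | {t for outs in links.values() for t in outs}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority: sum of hub scores of the pages linking in.
        auth = {p: sum(hub[q] for q in pages if p in links.get(q, ()))
                for p in pages}
        # Hub: sum of authority scores of the pages linked out to.
        hub = {p: sum(auth[t] for t in links.get(p, ())) for p in pages}
        # Normalize so the scores stay comparable across iterations.
        a, h = sum(auth.values()) or 1.0, sum(hub.values()) or 1.0
        auth = {p: v / a for p, v in auth.items()}
        hub = {p: v / h for p, v in hub.items()}
    return hub, auth

# "portal" links out a lot (a hub); "paper" is linked to by
# everyone (an authority).
links = {"portal": ["paper", "blog"], "blog": ["paper"], "paper": []}
hub, auth = hits(links)
```

After a few iterations "paper" dominates the authority scores and "portal" the hub scores, matching the definitions in the bullet above.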
• Popularity: Google and several other search engines add popularity to
link analysis to help determine the relevance or value of pages.
Popularity utilizes data on the frequency with which a page is chosen by
all users as a means of predicting relevance. While popularity is a good
indicator at times, it assumes that the underlying information need
remains the same.
Another off-page factor is click-through measurement. In short, this
means that a search engine may watch which results someone selects for a
particular search, and then eventually drop high-ranking pages that
aren't attracting clicks while promoting lower-ranking pages that do
pull in visitors. As with link analysis, systems are used to compensate
for artificial clicks generated by eager webmasters.
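A toy version of click-through re-ranking, assuming the engine logs per-URL impressions and clicks (the figures below are invented):

```python
def rerank_by_clicks(results, clicks, impressions):
    """Re-order results by observed click-through rate (CTR)."""
    def ctr(url):
        shown = impressions.get(url, 0)
        return clicks.get(url, 0) / shown if shown else 0.0
    return sorted(results, key=ctr, reverse=True)

# Hypothetical logs: the page ranked first draws few clicks, while
# the page ranked second pulls in most of the visitors.
results = ["a.html", "b.html", "c.html"]
clicks = {"a.html": 2, "b.html": 80, "c.html": 10}
impressions = {"a.html": 100, "b.html": 100, "c.html": 100}
reranked = rerank_by_clicks(results, clicks, impressions)
```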
There are a few more factors, such as:
• Date of publication: Some search engines assume that the more recent
the information is, the more likely it is to be useful or relevant to
the user. These engines therefore present results from the most recent
to the less current.
• Length: While length per se does not necessarily predict relevance, it
is a factor when used to compute the relative merit of similar pages.
So, in a choice between two documents both containing the same query
terms, the document that contains a proportionately higher occurrence of
the term relative to its length is assumed more likely to be relevant.
• Proximity of query terms: When the terms in a query occur near each
other within a document, it is more likely that the document is relevant
to the query than if the terms occur at a greater distance. While some
search engines do not recognize phrases per se in queries, some clearly
rank documents higher in the results if the query terms occur adjacent
to one another or in close proximity, as compared to documents in which
the terms occur far apart.
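One simple way to measure proximity is the smallest window of words containing every query term; a smaller window suggests a more relevant page. A sketch (this scoring rule is illustrative, not any engine's actual formula):

```python
def min_window(words, terms):
    """Length, in words, of the smallest span containing every query
    term, or None if some term never occurs in the document."""
    best = None
    for i, first in enumerate(words):
        if first not in terms:
            continue
        need = set(terms)
        for j in range(i, len(words)):
            need.discard(words[j])
            if not need:                      # all terms seen
                span = j - i + 1
                best = span if best is None else min(best, span)
                break
    return best

doc = "the automobile market in india is growing".split()
```

`min_window(doc, {"automobile", "india"})` is 4 here; adjacent terms would score 2, the best possible.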
• Proper nouns: Proper nouns sometimes have higher weights, since so
many searches are performed on people, places or things. While this may
be useful, if the search engine assumes that you are searching for a
name rather than the same word used as an ordinary everyday term, the
search results may be misleading.
SEO is an abbreviation for "Search Engine Optimization". SEO is the
process of improving web pages so that they rank higher in search
engines for targeted keywords, with the ultimate goal of generating more
revenue for the web site. There are many SEO techniques; in general,
they can be categorized as On-Page Optimization, On-Site Optimization
and Off-Site Optimization.
Without citing concrete figures, it is safe to say that it is hard
nowadays to find the information you need on the Internet in one
attempt. The Web now contains billions of documents, and their number
grows roughly exponentially. The number of changes to this data over any
short period is enormous, and there is no fully functional data-updating
system accessible to all Internet surfers worldwide. Search engines were
created for these reasons: they were designed to structure the
information accumulated on the Web and to provide surfers with handy,
comprehensive search results.
Search engine optimization is one of the most effective techniques for
increasing site traffic. Now some figures: first, about 90 percent of
Internet users find new web sites via search engines; hence, SEO is much
more cost effective than other marketing tactics.
Secondly, recent research shows that visitors who arrive through
optimized search results become real clients and partners about five
times more readily than visitors from banner advertising. The
psychological aspect is as follows: when an average web surfer finds
your site in the top positions in the search engines, he considers it to
be one of the best sites on the Web.
Finally, about 80% of search engine users stop browsing the results of a
search query at the first page. Therefore, only the top 15 - 25
positions bring a sizeable inflow of visitors and potential clients /
customers to your site, and reaching them is very hard because of the
severe competition.
Search engine optimization depends on (1) on-page factors and (2)
off-page factors. On-page factors include keywords, HTML tags, content,
CSS and URL rewrites, whereas off-page factors include link popularity,
page rank and anchor text. We will look at all these factors in detail
as we move on, but there is a basic methodology that all SEOs follow.
The very first step is to identify target keywords. Keywords play a very
important role in optimizing any website; these are the words that a
prospective visitor might type into a search engine to find relevant web
sites. For example, if you are an SEO consultant, your target visitors
may search for SEO in India, SEO industry, Internet marketing, SEO
service providers and so on. Identifying your target market and choosing
keywords is therefore very important at the beginning of an SEO
campaign.
The next step is keyword placement. You cannot place all the keywords on
a single page; they should be placed in such a way that the crawler
finds them on multiple pages and concludes that each page is more
relevant than its competitors. The most content-focused pages are the
most suitable for keyword placement. The frequency and positioning of
keywords also play a crucial role in optimizing a website. You can
scatter a keyword throughout a page, but the crawler gives little
importance to that kind of placement. According to search engine
algorithms, some places on a web page are more important than others,
and if keywords are placed in these positions the page is more likely to
rank better. These positions include the header tag, the title tag, the
meta tags and the first few paragraphs of the page. Keyword density also
matters when optimizing a page, i.e. how many times you repeat the
keywords: the ratio of keywords to the total number of words on a page
is called keyword density.
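That ratio is easy to compute. A small sketch, using an invented snippet of page text:

```python
import re

def keyword_density(text, keyword):
    """Keyword occurrences as a percentage of all words on the page."""
    words = re.findall(r"\w+", text.lower())
    hits = words.count(keyword.lower())
    return 100.0 * hits / len(words) if words else 0.0

sample = "SEO guide: this SEO tutorial covers SEO basics in ten steps"
density = keyword_density(sample, "SEO")   # 3 of 11 words
```

A density of roughly 27% like this one would almost certainly be treated as keyword stuffing; in practice the aim is a frequency higher than competitors' but well below the spamming threshold.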
Then follows the linking structure. One of the primary ways search
engines find site pages is by following the links from your home page to
the inner pages of the site. To allow search engines to crawl your site
effectively and locate your inner pages, it is important to ensure that
your menu structure does not present them with any barriers that
interfere with their ability to follow internal links.
Anything in the way of effective crawling should be removed. As far as
search engines are concerned, when it comes to finding links to your
site pages, the simpler the structure, the better.
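A quick way to check this is to simulate the crawl yourself: walk the internal links from the home page and see which pages are never reached. The site map below is hypothetical.

```python
# Hypothetical site map: page -> pages its menus link to.
SITE = {
    "index.html": ["about.html", "products.html"],
    "about.html": ["index.html"],
    "products.html": ["widget.html"],
    "widget.html": [],
    "orphan.html": [],   # no inbound link anywhere on the site
}

def reachable_from(start, site):
    """Pages a crawler can reach from `start` by following links."""
    seen, stack = set(), [start]
    while stack:
        page = stack.pop()
        if page not in seen:
            seen.add(page)
            stack.extend(site.get(page, []))
    return seen

unreachable = set(SITE) - reachable_from("index.html", SITE)
```

Here `orphan.html` is invisible to the spider, exactly the kind of barrier the text warns about.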
Next comes Link Building, or Link Popularity. Another way that search
engines find your site pages is by following links to your site from
other external sites. Having such links to your site not only provides
search engines with additional opportunity to find your pages, but also
provides increased visibility for your site by putting it in front of
visitors on another site. Many top search engines, such as Google, will
factor in the number of sites linking to yours in determining its results
for a particular search query. This is known as “link popularity”. One
way to think about link popularity is that each external link to your
site counts as a “vote” for your site. So, the more links you have
pointing at you the better, right? Well, not necessarily. Because search
engines also know how to count the link popularity of the sites linking
to yours, a single link from a popular site will weigh more heavily than
many links from obscure unpopular sites. When it comes to getting links,
quality over quantity is the way to go.
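The "weighted vote" idea is the essence of Google's PageRank. A simplified sketch (the damping factor and the mini-web are illustrative; the production algorithm is far more elaborate):

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank: a link from a popular page passes along
    more weight than a link from an obscure one.
    `links` maps each page to the list of pages it links out to."""
    pages = set(links) | {t for outs in links.values() for t in outs}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Each page shares its current rank equally among its outlinks.
        rank = {p: (1 - damping) / n + damping * sum(
                    rank[q] / len(outs)
                    for q, outs in links.items() if p in outs)
                for p in pages}
    return rank

# "hub.html" is linked from two other pages; "lone.html" from none.
links = {"a.html": ["hub.html"], "b.html": ["hub.html", "a.html"],
         "hub.html": ["a.html"], "lone.html": []}
rank = pagerank(links)
```

A single link from a well-ranked page ends up worth more than several links from pages nobody links to, which is the quality-over-quantity point above.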
There are some precautions to take while designing a site and planning
an SEO campaign. If these barriers are removed, your site will rank much
higher than others and will certainly get more visitors.
Since the crawler reads only text, it is always better to create a
content-oriented site. Try to avoid dynamic pages and any dynamic forms
as much as possible, and it is advisable not to use frames when
designing a site. Dynamic URLs are also a big barrier for SEO: search
engines generally are not able to crawl dynamic URLs and hence cannot
index them properly, so it is better to use static, search engine
friendly URLs. Also, flash animation, videos and text are harmful for
search engine