Search Engines

					Search Engines & Marketing
Your Web Site

 Dr. Soe, Dr. Westfall & CIS Dept.
 California Polytechnic University,
 Pomona, Updated August 2010
   Introductory exercise: Googlewhacks
   How do Search Engines work?
   How do you get your web site listed?
   Search Engine exercise
   Internet information issues
   identify two words, NOT in quotation
    marks, that get only one result in Google
   examples (used to work)
   Exercise: find another Googlewhack
Search Engine Placement
   Search engines return lists of links
    based on search words entered by user
   Most users only look at 3 - 10 items in
    search output before changing words
   Placement--how high a web page is in
    the listings--is critically important in
    generating traffic from search engines
High Search Engine Placement
   "We can guarantee you a top 10
       what's it worth?
       how can they do it?
       Ixquick search on telecommuting
            does not include Google, but see next page
High Search Engine Placement
   Google searches on specified words
       telecommuting productivity
       Westfall research
       Mrs. Westfall (also try Norwalk Brethren
       textbook ripoff

        Types of Search Engines
   Spiders, webcrawlers, robots
       automatic indexing of key- and other words
       Google, list of others
   Web subject directories
       built by humans who review web pages
       Yahoo!, Open Directory (Yahoo used to look
        like this)
   Hybrids = spiders + humans
        How Web Crawlers Work
   Automated index building
       crawlers or spiders (indexing robot programs)
        go to web sites
       examine pages & extract indexing information
            may simply locate words
            may identify key words, phrases, links
       store data in search engine’s database with
        URL for page
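
The indexing steps above can be sketched in a few lines. This is a minimal illustration, not how any particular search engine is built: it skips the actual fetching of pages, and the `build_index` function and the example.com URLs are made up for this sketch.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        # "may simply locate words": split the page text into lowercase words
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)  # store the word together with the page URL
    return index


# Hypothetical pages standing in for what a crawler would download.
pages = {
    "http://example.com/a": "Telecommuting productivity research",
    "http://example.com/b": "Pizza delivery and telecommuting",
}
index = build_index(pages)
print(sorted(index["telecommuting"]))
```

Looking up a word in `index` returns every stored URL that contains it, which is the database the query engine searches.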
         Search Engines Deliver Indexes

   User requests information via search page
   Query engine searches database
   Delivers list of web resources
       creates results web page based on search
   Listed in order of a calculated index value
       index values based on search words, and also
        on "popularity" of site
       but usually preceded by "paid placements"
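
A toy version of the "calculated index value" is sketched below, assuming the value is simply the number of search-word matches plus a popularity bonus. Real engines use far more elaborate formulas that they do not publish; the `rank` function, the popularity numbers, and the URLs are assumptions for illustration only.

```python
def rank(query, pages, popularity):
    """Return matching URLs, highest toy index value first."""
    words = query.lower().split()
    scored = []
    for url, text in pages.items():
        tokens = text.lower().split()
        matches = sum(tokens.count(word) for word in words)
        if matches:  # pages with no search words are left out of the results
            scored.append((matches + popularity.get(url, 0.0), url))
    return [url for _, url in sorted(scored, reverse=True)]


# Hypothetical page texts and popularity scores.
pages = {
    "http://example.com/a": "telecommuting productivity research",
    "http://example.com/b": "telecommuting news",
    "http://example.com/c": "pizza delivery",
}
popularity = {"http://example.com/b": 1.5}
results = rank("telecommuting productivity", pages, popularity)
print(results)
```

Here page b matches fewer search words than page a but still ranks first because of its popularity bonus, which is the effect the slide describes.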
        Automated Index Building Problems

   No standards
       HTML Documents are not structured so that
        robots can extract routine information:
            Except for <meta> tags: keywords,
             description, publication date, author, etc.
            Robots index text, not graphics, movies, etc.
   Search turns up inappropriate documents
Some Results Not "Genuine"
   Sponsored links at top or side of page
   Overture pay for placement is now
    owned by Yahoo!
   Google AdWords
       only pay for click-throughs
         Web Directories
         Built by Human Indexing

   Analyze site’s purpose
   Classify sites by broad subject area
       hierarchical classification schemes
   Yahoo! - has many people reviewing web
    site submissions
       doesn't have to accept submissions
       6 week delay unless pay for priority service?
Meta Search Engines
   Don't have their own databases or
       instead, combine results from other search engines
   Examples
       Dogpile, Clusty
       Ixquick (top 10 listings from other engines)
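
One way a meta search engine could combine listings is sketched below: each engine's top 10 earns points by position, and the points are added up. This is an assumed scoring rule for illustration, not the documented method of Dogpile, Clusty, or Ixquick; `merge_results` and the site names are hypothetical.

```python
from collections import defaultdict


def merge_results(result_lists, top_n=10):
    """Combine ranked URL lists; URLs placed high by more engines come first."""
    scores = defaultdict(float)
    for results in result_lists:
        for position, url in enumerate(results[:top_n]):
            scores[url] += top_n - position  # higher placement earns more points
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical top listings returned by two other search engines.
engine_a = ["x.example", "y.example", "z.example"]
engine_b = ["y.example", "x.example", "w.example"]
merged = merge_results([engine_a, engine_b])
print(merged)
```

Sites listed by both engines rise above sites listed by only one, with no database of the meta engine's own involved.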
Specialized Search Engines
   Search Engines & Specialized
   List of search engines (Wikipedia)
   Search Engines Directory
   Specialized Search Engines and
    Directories (many for educators)
   Buzz Monitoring: 26 Free Social Media
    Tracking Tools
   Google: specialized "search engines"
Search Engines Ranked by %
of 16.7 Billion US Searches
 Engine             Share    Searches (billions)
 Google             65.8%    10.3
 Yahoo              17.1%     3.4
 Microsoft          11.0%     2.1
 Ask Network         3.8%     0.6
 AOL LLC             2.3%     0.4
Source: comScore Releases July 2010
  U.S. Search Engine Rankings
Global Search Rankings
 Google              69.7%
 Yahoo                5.4
 Bing                 4.8
 Baidu (China)        4.6
 Yahoo (Japan)        4.4
Source: Microsoft and Baidu Gain Share
  on Google. Strategy Analytics, 2010, Q2
Search Engines Ranked by
Pages Indexed (billions)
 Yahoo!            19.2* (Aug. 2005)
 Google**          11.3 (Aug. 2005)
 MSN                5.0 (Nov. 2004)
 Ask Jeeves         2.5 (Nov. 2004)
* web pages (+1.6 B images, etc.)
** Google Now Knows About 1 Trillion
  Pages (July 2008)
        Get Site Into Directories
   Directories (e.g., Yahoo!) require careful
    selection of search categories & keywords
   Search for your keywords on Yahoo! to find
    appropriate categories
   Yahoo! asks for a 25-word description of your site
       make it really good to impress human indexers
    Targeting Spiders

   pick "keywords" that people would use to find
    a page like yours
   make these keywords prominent in your web
    pages, especially in the entry page
   Top Search Engine Ranking Factors for Google
   Search Engine Ranking Factors | SEOmoz
        Meta Tags
   keywords meta tag used to be important
       <meta name="keywords" content=
        "telecommuting, research, telecommuting
        research, telecommute, telecommutes,
        telecommuter, telecommuters">
       many search engines ignore them now because
        of widespread attempts to use them to
        manipulate rankings
Meta Tags - Description
   even though not used much in rankings
    anymore, contents of following tag are
    shown in Google page listing
       <meta name="description"
        content="Westfall research and papers on
        telecommuting, telecommuting
        productivity, telecommuting economic
        analyses, telecommuting strategies">
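
To see what a spider can read from these tags, the sketch below pulls the keywords and description out of a page with Python's standard `html.parser` module. The `MetaTagParser` class name and the shortened sample HTML are made up for this example.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attributes = dict(attrs)
            name = attributes.get("name")
            content = attributes.get("content")
            if name and content is not None:
                self.meta[name.lower()] = content


# Shortened sample page head, modeled on the tags shown above.
html = """<head>
<meta name="keywords" content="telecommuting, research">
<meta name="description" content="Westfall research and papers on telecommuting">
</head>"""
parser = MetaTagParser()
parser.feed(html)
print(parser.meta["description"])
```

The same parser can be pointed at the source of any page (View, Source) to check what description a search listing is likely to show.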
Keywords for Spiders
   All keywords are not created equal;
    spiders give heavier weights to:
       keywords in the <title> (more than once?)
       keywords in <h1> and other headers
       keywords in other text near top of page
       keywords in links (seen by user or in URLs)
       bold faced keywords? italics?
       How Search Engines Rank Web Pages
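
The idea of unequal keyword weights can be sketched as follows. The weights are invented for illustration, since search engines do not publish theirs, and `keyword_score` is a hypothetical helper that only counts exact word matches.

```python
# Hypothetical weights: a keyword in the title counts more than one in the body.
WEIGHTS = {"title": 5.0, "h1": 3.0, "link": 2.0, "body": 1.0}


def keyword_score(keyword, regions):
    """Sum keyword occurrences across page regions, weighted by region."""
    keyword = keyword.lower()
    score = 0.0
    for region, text in regions.items():
        count = text.lower().split().count(keyword)
        score += WEIGHTS.get(region, 1.0) * count
    return score


# Text of each region of a made-up page.
page = {
    "title": "Telecommuting research",
    "h1": "Telecommuting productivity",
    "body": "papers on telecommuting and telecommuting strategies",
}
score = keyword_score("telecommuting", page)
print(score)
```

Under these assumed weights, one mention in the title contributes as much as five mentions in the body text.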
         More Keywords for Spiders
   Use keywords frequently, but don't repeat
    same word more than once in a row
       OK: pizza pizza
       not good: pizza pizza pizza pizza pizza pizza
   Use variations of keywords (plurals)
   Use keywords in alternate text for images
       <img src="file.jpg" alt="[keywords]">
   SEO quizzes
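
A quick self-check for the repetition rule above: the sketch below finds the longest run of the same word in a row. Treating a run longer than 2 as a problem is an assumption drawn from the pizza examples, not a published search engine threshold, and `longest_run` is a hypothetical helper.

```python
from itertools import groupby


def longest_run(text):
    """Return the length of the longest run of one word repeated in a row."""
    words = text.lower().split()
    return max((len(list(group)) for _, group in groupby(words)), default=0)


MAX_RUN = 2  # assumed limit: "pizza pizza" is OK, longer runs are not

for sample in ("pizza pizza", "pizza pizza pizza pizza pizza pizza"):
    verdict = "OK" if longest_run(sample) <= MAX_RUN else "not good"
    print(f"{sample!r}: {verdict}")
```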
Links for Spiders
   Number of pages linking to a site has
    become extremely important
       Google pioneered this
       if high ranking pages link to a site on the
        same topic, it must be good
   Quality of links is also important
       need to be relevant both to page they are
        on and to linked page
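
The notion that links from highly ranked pages count for more can be sketched with a simplified link-scoring loop in the spirit of PageRank. This is not Google's actual algorithm: it ignores topic relevance entirely, and the `link_scores` function, the damping value, and the three-page link graph are assumptions for illustration.

```python
def link_scores(links, damping=0.85, iterations=50):
    """Score pages so that links from high-scoring pages are worth more."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    scores = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it links to.
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                for other in pages:
                    new[other] += damping * scores[page] / n
        scores = new
    return scores


# Hypothetical link graph: a and b both link to c, and c links back to a.
links = {"a": ["c"], "b": ["c"], "c": ["a"]}
scores = link_scores(links)
print(sorted(scores, key=scores.get, reverse=True))
```

Page c scores highest because two pages link to it, and page a outranks page b because its single incoming link comes from the highest-scoring page.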
Trying to Fool Spiders
   Search Engine “Spamming” (16 flavors)
       spiders are being programmed to detect it
   Examples:
       repeat hidden keywords (bottom of page)
            like background color, or <font size=1>
       keywords not related to site content
       irrelevant links: "link farms" or "link
        stuffing" (ethical issues)
"Black Hat" SEO Tactics
   Black Hat SEO (web page)
   Bad SEO example page
   Anti-link management
       forged emails asking for removal of links to
   "Black Hat" SEO (Google search)
Fake Sites on Search Engines
   Security researcher Jim Stickley created
    a phony site for a real credit union
       redirected visitors to real site
   phony site got #2 ranking on Yahoo
    and #1 on Bing
       ahead of even the credit union's real site
Google Blacklist
   It's not nice to try to fool Google!
     had an 87% decrease in
        web traffic after being blacklisted by
       search on "Google blacklist"
        Register your Web Site with
        Search Engines
   Register individually with top sites
       Yahoo!, MSN, Open Directory Project (goes
        into Google, etc.)
   Try site submission web sites?
       Submit Express 75,000 search engines,
       Change content, resubmit every so often?
"I can guarantee a top 10 …"
   Junk mail and web sites
   True, but…
       not for your 1st choices of key words
   Use relatively unique combination of
    several words, and then load them into
    key parts of page (<title>, <H1>, etc.)
       probably not many people will search for
        this combination of words
            e.g., telecommuting productivity
Guaranteed Top 10 Listing
   Use misspelled words
       including 2 words run together (no space)
   Use unique combination of words
       keep adding unrelated words to a search
        until you find combination not found on
        any other page
   Manually submit page and put links to it
    on another page(s) that's in Google
"Google bombing"
   drives traffic to other pages by links and
       Google search on failure
       Google's AdWords ("Why these…")
       pages linking to new biography URL
Scam Website Clusters
   Scam promoters set up hundreds of
    search-optimized sites about the scam
       when you look for more information, most
        search results say good things about it
       example: try searching for keywords from
        Magic Words that Bring You Riches page
        and see how hard it is to find criticisms
Search Engine Exercise
   Search for your keywords on any
    automated search engine
   For top 2-4 sites, look for keywords in:
       <meta...>, <title>, <h1>, <a href="…>,
        <img… alt="…>, etc. (use View, Source)
       words in page, esp. near top
       also use Google advanced search (click
        Date, etc. then put URL in Find pages that
        link to the page:)
       Report any patterns you see
Site Submit Exercise
   Identify a site to submit
       find sites in Google related to Cal Poly
   Go through the process of submitting to
    a search engine or other submittal site
   Take notes, report back on experiences:
       how long it took
       information required, etc.
Locate Information on Internet

   Search Engines, Directories, Meta Search
    Engines, On-Line Indexes
   Pages with information on specific topics
   White & Yellow pages
   Usenet News
   On-line newspapers, magazines, radio and
    TV channels
    Evaluating Information Quality

   Source of site:
       Educational institution (e.g., MIT)
       Professional organization (e.g., IEEE)
       Government agency (e.g., NASA)
   Ratings by independent evaluators
   Corroborating evidence: multiple, reliable sources
Quality of “Did You Know?”
   YouTube video that “went viral”
   exercise: identify statements that
    probably are not true
       count passive references e.g., "predictions
       2010: Data doubling every 11 hours (do
        the math!)
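
"Doing the math" on the 11-hour claim might look like the sketch below, assuming the doubling were sustained for one full 365-day year. The variable names are illustrative.

```python
import math

HOURS_PER_YEAR = 365 * 24

# Number of times the data would double in one year at one doubling per 11 hours.
doublings_per_year = HOURS_PER_YEAR / 11

# Growth factor is 2 ** doublings; report its order of magnitude.
log10_growth = doublings_per_year * math.log10(2)

print(f"{doublings_per_year:.0f} doublings per year")
print(f"growth factor of roughly 10^{log10_growth:.0f} in one year")
```

A growth factor with well over 200 digits in a single year is a strong hint that the statement cannot be literally true.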
Citing Web Information
   Whenever you use someone else’s
    ideas, you have to cite them
   Format for a research paper
       American Psychological Association (APA)
        Beckleheimer, J. (1994). How do you cite URL's in a
        bibliography? Retrieved [month day, year], from [URL]

   Graphics: if owner gives permission,
    follow their directions for giving credit
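
The citation pattern shown above for a research paper can be filled in mechanically, as in this sketch. The retrieval date and the example.com URL are placeholders, not the real values for this reference, and `apa_web_citation` is a hypothetical helper that follows only the pattern on this slide, not the full APA manual.

```python
import datetime

MONTHS = (
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
)


def apa_web_citation(author, year, title, retrieved, url):
    """Build 'Author (year). Title Retrieved month day, year, from URL'."""
    date_text = f"{MONTHS[retrieved.month - 1]} {retrieved.day}, {retrieved.year}"
    return f"{author} ({year}). {title} Retrieved {date_text}, from {url}"


citation = apa_web_citation(
    "Beckleheimer, J.",
    1994,
    "How do you cite URL's in a bibliography?",
    datetime.date(2010, 8, 15),   # placeholder retrieval date
    "http://example.com/cite",    # placeholder URL
)
print(citation)
```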