					Search Engine Optimization
        Black Hat
       Dr. Drew Hwang
      Black Hat Optimization
• Techniques that search engines do not approve of
• May be effective in the short run
• Sites that use them may eventually be reduced in
  PageRank (PR) or be banned, either temporarily or
  permanently, automatically by the search engines'
  algorithms or by a manual site review
• Once banned, it is very hard to be reinstated
• Examples: spamdexing, cloaking, etc.

          Black Hat Optimization
• Manipulate the relevancy or prominence of resources
  indexed by a search engine in a manner inconsistent
  with the purpose of the indexing system
• Also called search spam or search engine spam
• Content spam: altering the logical view that a search
  engine has over the page's contents
• Link spam: any attempt to send a website links that
  are not relevant to the subject matter of a page, or
  are placed inappropriately within unrelated subject
  matter; excessive linking; cross linking from social
  networking websites (e.g., wiki, blog) to the target
  website for the sole purpose of artificially inflating
  link popularity
• Other types: URL redirection, cloaking, etc.

         Black Hat Optimization
                  Content Spam
• Keyword stuffing: calculated placement of
  keywords within a page to raise the keyword count,
  variety, and density of the page; works only for
  keyword-based indexing, not for modern search
  engines
• Hidden unrelated text: disguising keywords and
  phrases by making them the same color as the
  background, using a tiny font size, or hiding them
  within HTML code such as "no frame" sections, ALT
  attributes, zero-width/height DIVs, and "no script"
  sections
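
The keyword count and density mentioned above can be measured with a few lines of Python. This is only a minimal sketch: the function name `keyword_density`, the tokenizing rule, and the sample text are assumptions made for illustration, and real search engines do not publish how (or whether) they compute such a figure.

```python
import re
from collections import Counter

def keyword_density(text, top_n=3):
    """Return the top_n most frequent words and their share of all words.

    Illustrative only: the tokenizer (lowercase letters, digits,
    apostrophes) is an assumption, not any search engine's rule.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return []
    counts = Counter(words)
    total = len(words)
    return [(word, count / total) for word, count in counts.most_common(top_n)]

# Hypothetical stuffed copy: one keyword makes up 4 of the 9 words.
sample = "cheap flights cheap hotels cheap cars book cheap deals"
top = keyword_density(sample, top_n=1)
print(top)
```

A page where a single term accounts for a large share of all words, as in the sample, is the kind of pattern the keyword stuffing bullet describes.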

         Black Hat Optimization
                   Content Spam
• Meta tag stuffing: repeating keywords in the meta
  tags or using meta keywords unrelated to the site's
  content; has been ineffective since 2005
• Doorway pages: web pages (e.g., entrance pages)
  containing very little content but stuffed with many
  similar keywords and phrases; designed to rank
  highly within the search results rather than to
  serve visitors

         Black Hat Optimization
                     Link Spam
• Link farm: any group of websites whose pages all
  reference each other
• Hidden links: putting links where visitors will not
  see them in order to increase link popularity;
  techniques include making the links the same
  color as the background or a tiny font size, or
  placing them in no-frame sections, no-script
  sections, zero-height/width DIV tags, or ALT
  attributes; hidden links can be injected by hackers
• Sybil interlinking: creating multiple web sites at
  different domain names that all link to each other
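
The "all link to each other" pattern of link farms and Sybil interlinking can be illustrated on a small link graph. The sketch below keeps only reciprocal links and groups the sites they connect; the function name `reciprocal_clusters`, the `min_size` threshold, and the example domains are all hypothetical, and this is not a description of how any search engine detects link farms.

```python
def reciprocal_clusters(links, min_size=3):
    """Group sites connected by reciprocal (two-way) links.

    links: dict mapping a site to the set of sites it links to.
    Returns sorted clusters with at least min_size members.
    Illustrative heuristic only; the threshold is an assumption.
    """
    # Keep an edge only if the target links back to the source.
    mutual = {
        site: {t for t in targets if t != site and site in links.get(t, set())}
        for site, targets in links.items()
    }
    seen, clusters = set(), []
    for site in sorted(mutual):
        if site in seen or not mutual[site]:
            continue
        # Depth-first walk over reciprocal edges.
        stack, cluster = [site], set()
        while stack:
            node = stack.pop()
            if node in cluster:
                continue
            cluster.add(node)
            stack.extend(mutual.get(node, set()) - cluster)
        seen |= cluster
        if len(cluster) >= min_size:
            clusters.append(sorted(cluster))
    return clusters

# Hypothetical graph: three sites interlink; a fourth links one way only.
links = {
    "a.example": {"b.example", "c.example"},
    "b.example": {"a.example", "c.example"},
    "c.example": {"a.example", "b.example"},
    "news.example": {"a.example"},
}
clusters = reciprocal_clusters(links)
print(clusters)
```

Reciprocal linking also occurs between legitimate sites, so a cluster found this way is a hint worth inspecting, not proof of spam.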

           Black Hat Optimization
                           Link Spam
• Spam blogs (splogs): artificially created weblog sites
  which the author uses to promote affiliated
  websites or to increase the search engine rankings
  of associated sites; in nature similar to link farms
  Note: Splogs are blogs where the articles are fake for search engine
  spamming. To spam in blogs is to include random comments on the blogs
  of innocent bystanders, in which spammers take advantage of a site's
  ability to allow visitors to post comments that may include links.
• Page hijacking: creating a fake copy of a popular
  website which shows contents similar to the
  original to a web crawler but redirects web surfers
  to unrelated or malicious websites

        Black Hat Optimization
                    Link Spam
• Expired domains: monitoring DNS records for
  expiring domains, buying them when they
  expire, and replacing the pages with links to the
  spammer’s pages with the express purpose of
  obtaining handy PR

         Black Hat Optimization
                     Link Spam
• Buying PageRank: acquiring inbound links through
  payment
• Cookie stuffing: placing an affiliate tracking cookie
  on a website visitor's computer without their
  knowledge to generate traffic for the spammer’s
  website
• Astroturfing: stimulating interest in a website
  artificially by having the spammer’s own people
  post reviews, blog comments and other
  user-generated content to favor the spammer’s
  website, product or service and/or post disparaging
  comments about competitors

           Black Hat Optimization
                         Other Types
• URL redirection: taking the user to the spammer’s
  page without his or her intervention through META
  refresh tags, Flash, JavaScript, Java or server-side
  redirects
• Cloaking:
   – The technique in which the content presented to the search
     engine spider is different from that presented to the user's
     browser
   – When a visitor is identified as a search engine spider based on
     the IP address or the User-Agent HTTP header, a server-side
     script delivers a different version of the web page.
   – The purpose is to try to trick search engines into giving the
     relevant site a higher ranking
   – For this reason, major search engines consider cloaking for
     deception to be a violation of their guidelines.
   – Similar to doorway pages.
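
One way to see the User-Agent variant of cloaking described above is to request the same URL twice, once as a browser and once as a crawler, and compare the responses. The sketch below assumes a caller-supplied `fetch(url, user_agent)` function that returns the page body as text; that function, `cloaking_score`, and the stub fetchers are hypothetical names used for illustration. IP-based cloaking would not be caught this way, because the request still comes from the tester's own address.

```python
from difflib import SequenceMatcher

# User-Agent strings used for the two requests (illustrative values).
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
CRAWLER_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

def cloaking_score(fetch, url):
    """Return 0.0 when both views are identical, approaching 1.0 as they diverge.

    fetch is a hypothetical callable: fetch(url, user_agent) -> str.
    """
    as_browser = fetch(url, BROWSER_UA)
    as_crawler = fetch(url, CRAWLER_UA)
    return 1.0 - SequenceMatcher(None, as_browser, as_crawler).ratio()

# Stub fetchers standing in for real HTTP requests.
def honest_fetch(url, user_agent):
    return "<h1>Welcome to our shop</h1>"

def cloaked_fetch(url, user_agent):
    if "Googlebot" in user_agent:
        return "<h1>Discount watches, cheap watches, buy watches</h1>"
    return "<h1>Welcome to our casino</h1>"

honest_score = cloaking_score(honest_fetch, "http://site.example/")
cloaked_score = cloaking_score(cloaked_fetch, "http://site.example/")
print(honest_score, cloaked_score)
```

Pages that legitimately vary between requests (ads, timestamps, personalization) will also produce a nonzero score, so any cutoff would need tuning.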

        Black Hat Optimization
             Google Sandbox Effect
• The claim that Google temporarily reduces the
  PageRank of new domains, placing them into
  what is referred to as its "sandbox" in an effort
  to counter the ways that search engine
  optimizers attempt to manipulate Google's
  page ranking by using black hat optimization
• A phenomenon that people have claimed to
  observe in the ranking of web pages that is
  performed by Google
• A phenomenon that has been written about and
  debated but not confirmed

          Black Hat Optimization
                  SEO Code Injection
• Back-end code injection: Web applications
  employing back-end systems that
  dynamically modify page content (e.g., metadata,
  heading tags, etc.) to increase page
  relevance to search engines can be (and have
  been) abused by criminals.
• Hidden link spam injection: Trusted web sites
  can be compromised by having hundreds of
  hidden spam links inserted into their pages.

           Black Hat Optimization
                   SEO Code Injection
• A quirky tool to check bad neighborhood linkages:
• Google as a hidden link detection tool: Hackers
  deliberately make their hidden links visible to
  search engine spiders. This makes Google a great
  tool to find infected web pages (e.g., parse the font
  color tags and compare them with the background
  color)
• But Google cannot be used to find other types of
  potentially harmful content such as iframes,
  malicious scripts and redirects.
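
A site owner can also scan a page's own HTML for injected hidden links. The following is a minimal sketch using Python's standard `html.parser`; the class name `HiddenLinkFinder`, the list of suspicious inline styles, and the sample page are assumptions made for illustration. It inspects inline `style` attributes only, so links hidden through external stylesheets or by matching the font color to the background color would need a fuller comparison than is shown here.

```python
import re
from html.parser import HTMLParser

# Inline styles treated as "hidden" in this sketch (an assumed, incomplete list).
HIDING_STYLE = re.compile(
    r"display\s*:\s*none|visibility\s*:\s*hidden"
    r"|font-size\s*:\s*0(?![\d.])"
    r"|(?<![\w-])(?:width|height)\s*:\s*0(?![\d.])",
    re.IGNORECASE,
)
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}

class HiddenLinkFinder(HTMLParser):
    """Collect href values of links that sit inside hidden elements."""

    def __init__(self):
        super().__init__()
        self.stack = []          # (tag, is_hidden) for each open element
        self.hidden_links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        style = attrs.get("style") or ""
        parent_hidden = self.stack[-1][1] if self.stack else False
        hidden = parent_hidden or bool(HIDING_STYLE.search(style))
        if tag == "a" and hidden and attrs.get("href"):
            self.hidden_links.append(attrs["href"])
        if tag not in VOID_TAGS:
            self.stack.append((tag, hidden))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed elements.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

# Hypothetical infected page: one normal link, two hidden ones.
page = """
<p>Welcome <a href="/about">about us</a></p>
<div style="height:0; overflow:hidden"><a href="http://spam.example/pills">pills</a></div>
<a href="http://spam.example/loans" style="font-size:0px">loans</a>
"""
finder = HiddenLinkFinder()
finder.feed(page)
hidden = finder.hidden_links
print(hidden)
```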

        Black Hat Optimization
               Google Bombing
• Placing hyperlinks to directly affect the rank
  of other sites
• Google first algorithmically combated Google
  bombing on January 25, 2007.

