
SEO Report Card



Matt Wasserman
9/26/2007

•  This document is intended to be an honest assessment of the state of the website from an SEO perspective. I've taken other considerations into account whenever possible, but I am not aware of all the factors that went into the decision-making processes. Please accept everything in here as constructive criticism.
•  I used samples from different sections of the website as the basis of this report. Some issues may not apply to all pages.
•  We don't know exactly how search engines work. Any information that might allow a webmaster to fool the algorithms is considered a closely guarded secret by the search organizations. SEO is a game where we aren't allowed to know the rules, and they can change at any time. Therefore we have to work with what we do know, and then make some assumptions (assumptions are marked with an asterisk *):
    o Search engine spiders see pages the way a very simple, text-based browser would. Focusing on providing quality content within those limitations is also useful in making a site friendlier to disabled users, since screen readers function in much the same way.
    o *Spiders don't consider client-side code like JavaScript to be content, but they do factor the entire size of the page into keyword density calculations. This is pretty well accepted, but I haven't found where it has been verified.
    o *Since the stated goal of any search organization is to provide access to relevant, high-quality content for their users, it stands to reason that one of the most effective methods of SEO would be to help them achieve that goal. For that reason I will only be suggesting honest SEO techniques (white hat), and avoiding dishonest or disingenuous ones (black hat). While no SEO technique can be considered foolproof or permanent, we do know that Google et al. are actively working to combat misleading SEO techniques, so using them would be a short-term solution with possibly long-lasting negative consequences.
•  Search engine optimization, when done correctly, does more than just move a page up in the search rankings and increase clicks from organic search, although increasing unique visitors from organic search is the focus of this project. If the optimization project keeps the user experience first and foremost at all times, SEO can also:
    o Drive future direct visits. By ensuring that the content linked from a search is relevant and valuable, the perceived value of the site as a whole increases, along with the likelihood of being added to users' bookmarks.
    o Reduce bounce rate. If the landing page is valuable and relevant, the visitor is more likely to look around, increasing impressions.
    o Reduce cost per click for CPC campaigns. An optimized landing page earns higher keyword quality, which reduces the amount charged for a given keyword and position in paid search.
    o Help market content and build brand awareness. Every search result on Google is an opportunity to present a 66-character blurb and a 128-character sales pitch for both the site and the linked page. Every SERP we show up on is an ad impression for the brand.
•  I will be focusing on Google. Not only do they dominate the search market, but increasing rankings on Google generally increases rankings on the other search engines. The reverse is not necessarily true.
•  Google, more than any other search organization, seems to have taken to heart the concept of becoming successful by serving your customers completely. This is good for us because it gives us some solid ground to stand on in regard to decision making.
•  Automation is going to be the key to a lot of the initiatives suggested here. It's the best, most cost-effective way to achieve consistent results across a site of this size, and allows the most flexibility for the future. It also means that some of the things we need to do are going to be dependent on when we get the tools in place to make them practical. Fortunately, we already have in place a lot of what we need to make significant improvements in search rankings. We just have to make certain decisions and leverage an excellent infrastructure.

Current State
                                    Batanga SEO Report Card

               SEO Success Metric                                           Score
Page Structure                                                                C
Language                                                                      F
Site Structure                                                                C
Content                                                                      B
Inbound Links                                                                D
Internal Links                                                               D
Age of Domain                                                                A+
Total Grade                                                                  C-


Where we are now
Supplemental Entries

As of September 21, 2007, these are the results of a site: search of the domain on Google:

Total Pages Indexed: 80,300

Pages in the main Google index: 11,200

Pages in the supplemental Google index: 69,100

Supplemental Index Ratio: ~86%
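The ratio above is straightforward arithmetic on the two index counts; as a sketch:

```python
# Supplemental index ratio, computed from the site: search counts above.
total_indexed = 80_300   # all pages Google reports as indexed
main_index = 11_200      # pages in the main index

supplemental = total_indexed - main_index
ratio = supplemental / total_indexed
print(f"Supplemental pages: {supplemental:,}")
print(f"Supplemental Index Ratio: ~{ratio:.0%}")
```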

What does this mean?
The supplemental index (Google has stopped calling it that and started denying it exists) is a way for Google to improve the performance of searches for the majority of their users. By moving pages that they estimate are never going to rank well, unless very specific search terms are used, to a secondary index, they reduce the load on the primary index and return better results faster for most users.

Most of the site's pages (86%) aren't going to show up in searches unless some very specific and unique search terms are used. This is because Google's search algorithms have determined that they are mostly duplicate or low-value content, or that they are targeted at a very specific audience. We know that in most cases none of these things are true, which means we have to work towards modifying the pages to reflect their true relevance and value, or modifying the content to increase its relevance and value, or both.

What it means to us is that these pages are doing nothing to increase the value of our other pages or
the site as a whole. Google says that the index formerly known as the supplemental index is no
longer excluded from most searches, which may be true, but internal links from those pages do not
help any of our other pages to rank higher. It also means that these pages are highly unlikely to generate ad impressions without help from CPC.

Current Rankings for Key Terms
Latin Music: 1

Latin Music Videos: 1

Music Videos: 57

Musica: 15

Internet Radio: 100+


Reggaeton: 36

Hip Hop: 100+

Baladas: 16

Mexicana: 100+

Mariachi: 100+

Cumbia: 100+

Tejano: 100+

Salsa: 100+

Merengue: 100+

Bachata: 100+

Flamenco: 100+

Tango: 100+

Cubanisimo: 9

Bolero: 100+

T-Pain: 100+

Chris Brown: 100+

RBD: 100+

Akon: 100+

Aventura: 100+

Fergie: 80

Don Omar: 30

Daddy Yankee: 12

Wisin y Yandel: 7

Rakim & Ken-Y: 6



Page Structure - C
Page structure is everything that goes into a page which is not unique, user-visible content: menus, administrative links, and the code used for presentation of content. Just about everything in the page template is structure, plus a little bit more.

               SEO Success Metric                                         Score

Page Headers                                                                 C
JavaScript                                                                   C
CSS                                                                          B
Page Titles                                                                  C
Page Descriptions                                                            D
Total Grade                                                                  C

Page Headers
The header is where the user agent is told how to render the page. There can be a great deal of
information here, but the language of the content, how to interpret characters, and information
regarding what the page is about are the primary concerns of a search engine spider so that is what
we will focus on.

Issues Found
Invalid Markup – The default templates do not validate using the W3C Markup Validator, which
means they don’t conform to the standards that are used as a baseline by the developers of all user
agents. While there is some debate over the true impact of valid source code on search engine
indexes – if Google only indexed valid pages the index would contain hundreds of thousands of
pages instead of billions – there are several reasons this is an issue worth addressing:

•  The validator guesses that the page is encoded using UTF-8, which indicates that most browsers and spiders will do the same. The intended character encoding, however, is ISO-8859-1. Using an incorrect or misinterpreted character encoding can result in spiders indexing text incorrectly, or browsers presenting it incorrectly.
•  Validation breaks completely, at different places in the templates depending on the content, because of a non-Unicode-compliant character (the XHTML 1.0 standard, which is the doctype this template uses, requires that all characters on the page be defined within Unicode even if the page itself is not encoded using Unicode). This introduces the possibility that no content after the point where validation breaks will be indexed.
•  We don't know how the various spiders break, or how they handle different types of invalid code. It is also nearly impossible to tell how every browser will respond to every possible error. Therefore, using valid code is the approach with the best chance of achieving a predictable result.
•  I can't determine whether there are any other validation errors in the templates because validation stops at that point.


Course of Action
•  Change the default character encoding statement from

       <meta http-equiv="charset" content="ISO-8859-1">

   to

       <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">

•  Test changing to UTF-8 character encoding. This is the current recommendation of the W3C.
•  Retest and continue making changes until pages validate completely.
•  Validate all future changes to the templates using the W3C's validation tools before publishing those changes.
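As a rough sketch of how that validation step might be automated (the helper below is illustrative, not part of any existing tooling), a script could extract the declared encoding from each template and flag any that are missing or wrong:

```python
import re

def declared_charset(html):
    """Return the charset declared in a Content-Type meta tag, or None."""
    match = re.search(
        r'<meta[^>]+content=["\']text/html;\s*charset=([\w-]+)["\']',
        html,
        re.IGNORECASE,
    )
    return match.group(1) if match else None

page = '<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">'
print(declared_charset(page))  # ISO-8859-1
```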

Language Meta-Tag – Google doesn't recognize the language of the pages. Here are the results of a site: search of the domain:

No language specified: 80,300 pages returned.

English language specified: 4 pages returned

Spanish language specified: 4 pages returned

As you can see, users who run their searches in a specific language are not likely to see results returned from Batanga. This can also limit the number of inbound links generated by sites scraping Google for language-specific results. Most likely a small impact, but an impact just the same.

Course of Action
•  Change

       <meta http-equiv="content-language" content="Spanish">

   to

       <meta http-equiv="content-language" content="es">

   or

       <meta http-equiv="content-language" content="en">

   as appropriate.
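A template helper along these lines (the function name and mapping are illustrative assumptions, not existing code) could emit the correct code automatically:

```python
# Map full language names, as an editor might enter them, to the ISO 639-1
# codes the content-language meta tag expects. Illustrative sketch only.
LANG_CODES = {"Spanish": "es", "English": "en"}

def content_language_tag(language):
    code = LANG_CODES[language]
    return f'<meta http-equiv="content-language" content="{code}">'

print(content_language_tag("Spanish"))
```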

Language will be discussed again in this document, several times. This is a major consideration for
this site, as it is for all multi-lingual sites.

Keywords Meta-Tag – Keywords are not tailored to the specific content pages. Tags contain either default keywords for the entire site, or the page title. Words contained in the meta tag do not necessarily appear within the indexed text of the page. While having the right keywords in this tag will not necessarily help you, having the wrong ones can definitely hurt you.


Course of Action
•  Develop a list of keywords that marketing feels it is important to rank for.
•  Use WordTracker and existing analytics to generate lists of relevant keywords and search terms for different types of content, based on the list from marketing.
•  Utilize editorial tags within the current system and the keyword lists generated by marketing to develop a consistent framework for the creation of keyword lists based on information specific to the content and the content type, and automate the creation of meta keyword tags.
JavaScript
JavaScript has become increasingly popular in recent years. Once limited to forms validation (which is why it was created) and then primarily used to add useless features to personal web sites, techniques such as AJAX and the increased demand for rich web interfaces have resulted in more and more reliance on it on professional sites. And there is no question that it can be used to add some great functionality that users will enjoy.

Issues Found
JavaScript is not always called from external .js files, increasing page size and also increasing the
chances that poorly formed code will break search spiders.

Some content is created by JavaScript, and will not be indexed on Google (radio station descriptions,
main slideshow on homepage).

Course of Action
•  Use CSS, combined with JavaScript, to display the slideshow at the top of the homepage instead of pure JavaScript. This will make the content available to spiders.
•  Move all JavaScript to external .js files to reduce the amount of code on the page and increase keyword density. JavaScript-enabled user agents will still access the scripts, and non-JavaScript-enabled user agents will ignore them. This will also make the code easier to manage and maintain.
•  Audit all uses of JavaScript on the site to ensure that it is necessary and well formed.


CSS
CSS, or Cascading Style Sheets, is a standardized mechanism for separating the content of a web page from its formatting. Everything about the content except the words and images themselves, including position and even visibility, is considered formatting. This makes it the best way to build rich, dynamic, SEO-friendly sites, because it never touches the content itself; it just tells browsers how to present it. It's more limited in what it can do than JavaScript, and is in fact often used with JavaScript, but with some planning and thought it can be used to create rich effects without hurting (and sometimes helping) SEO.

Issues Found
Some inline CSS is being used on the site, which may lead Google's algorithms to suspect deceptive practices because it can look like we are trying to hide content.


Source ordering is not being used to improve the relevance of the content.

Course of Action
•  Move all CSS to external style sheets.
•  Look into redesigning the pages using source-ordered code. Cross-browser functionality is an area of particular concern when doing this.

Page Titles
One of the most common reasons for pages failing to rank well is duplicate titles – the sword cuts both ways. Since the title is such an important factor in letting the algorithm know what the page is about, duplicate titles are sometimes interpreted as indicating duplicate content. This becomes truer as the number and complexity of the pages increases, since much of the code on every page is the same.
Page title length is another issue. Google only takes the first 66 characters of the title into account
when indexing, and only shows the first 66 characters in SERPs (Yahoo gives you 120). That means
you have to get as much information as possible that will help the page rank or get a searcher to
click on it into the first 66 characters. This makes for some very intense and creative copywriting.

One area that will require some discussion and possibly compromise is the position of the brand within the title outside the homepage. The homepage is unique, and should have a unique title structure. Putting the brand first there makes sense since it's the page most likely to be bookmarked (and bookmarks default to using the page title), it usually has the most general content, and it's analogous to a storefront. From an SEO standpoint, all other revenue-generating pages should have the brand last or close to last (administrative pages are not an issue as long as we keep them from being crawled), because on searches for the brand the URL will be enough to get the pages to show up, and you want the homepage to show up first when they do. Search algorithms put greater weight on the words closest to the front of the title, and you want words specific to the page content to be the last to fall off if you go over 66 characters.

Current home page title: - Free, Latin Music, Watch Videos, Radio, Música Latina, Cine...

Proposed home page title:

        Batanga – Free Latin Music, Music Videos, and Internet Radio

Current content page title:

        Fergie: Big Girls Don't Cry (Personal) / Videoclip - - Free, Latin Music, Watch
        Videos, Radio, Música Latina, Cine...

Proposed content page title:

        Fergie: Big Girls Don’t Cry (Personal) – Latin music videos at Batanga


Proposed short title (for use in site maps)

        Music Video - Fergie: Big Girls Don’t Cry (Personal)

The most likely search terms are at the front, increasing their importance. All keywords that are
pertinent to this content are included, reinforcing the type of content available on this part of the
site. And if the name of the song or the artist gets longer, the terms least likely to be used in
searches are the first to fall off.

Batanga already ranks #1 for the term Latin music and we’re going to keep it there by always having
those 2 words as close together as possible on every page title and keeping the titles short to
maximize density. Music video pages will say music video in the title, etc… to help get rankings when
the word Latin is not included in the search term.

It’s important to keep the part after the description of the content as short as possible, to increase
keyword density and reduce duplicate content. The goal is to generate uniqueness within a strictly
controlled structure while constantly reinforcing the key points of the site.
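The framework above can be sketched as a small title builder: content-specific segments first, brand last, trailing segments dropped once the 66-character window is exceeded (segment values below are illustrative):

```python
# Build a page title from ordered segments, dropping trailing (least
# critical) segments until it fits Google's 66-character window.
GOOGLE_TITLE_LIMIT = 66

def build_title(*segments, sep=" - ", limit=GOOGLE_TITLE_LIMIT):
    parts = list(segments)
    title = sep.join(parts)
    while len(title) > limit and len(parts) > 1:
        parts.pop()  # brand and generic terms fall off first
        title = sep.join(parts)
    return title

print(build_title("Fergie: Big Girls Don't Cry (Personal)",
                  "Music Videos", "Batanga"))
```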

Keep in mind that the page title is also the link text to the site from a SERP, and clearly
communicating what the searcher will find there increases the chances you will get that click.

Issues Found
•  Page titles are inconsistent and often contain both English and Spanish.
•  Titles aren't being used to market the site effectively.
•  Titles may or may not contain relevant search terms for their specific pages, and they usually won't help the page rank for the relevant search terms.

Course of Action
•  Develop a consistent framework for creating page titles which convey the subject of the page clearly and concisely and fit within the overall branding and positioning strategy.
•  Utilize editorial tags to dynamically create page titles that meet the new standards.
•  Segregate admin pages (privacy statement, signup forms, etc.) into their own directory and prevent spiders from crawling them using robots.txt.

Page Descriptions

You have 128 characters to bring a customer in the door using the description. That's not much, but do it well and you not only get a click, you might keep a competitor from getting one. Both help increase your rankings, so writing good descriptions is worth the effort.

Descriptions need to be unique. Duplicate descriptions are just as harmful to rankings as duplicate
titles and content. Every description should be directly relevant to the specific page it’s associated
with, which should help drive uniqueness.
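A minimal sketch of a description builder that enforces the length limit mentioned above (uniqueness has to come from the input; the function only handles trimming):

```python
# Trim a candidate meta description to Google's ~128 visible characters,
# cutting at a word boundary. Sketch only; limit per the report above.
DESCRIPTION_LIMIT = 128

def meta_description(text, limit=DESCRIPTION_LIMIT):
    text = " ".join(text.split())        # collapse stray whitespace
    if len(text) <= limit:
        return text
    cut = text.rfind(" ", 0, limit + 1)  # last space at or before the limit
    return text[:cut].rstrip() if cut > 0 else text[:limit]

print(meta_description("Watch Fergie's Big Girls Don't Cry video on Batanga."))
```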


Issues Found
•  Page descriptions are hard-coded on many pages, and do not relate directly to the content on the page they are describing.
•  Good descriptions are included in the tags for some content, but they are not being used.

Course of Action
•  Develop standards for content descriptions for all pages on the site.
•  Utilize existing rich page descriptions where they exist, and automate the building of default descriptions using other tags where they do not.
•  Prioritize the top 100 artists, videos, and CDs, plus all genres, and ensure that they have rich descriptions in the description tags, working in groups of 25.


Language - F

This gets its own section because it's the biggest issue and the biggest opportunity facing this site. The majority of our audience is bilingual, but most of them are going to have a preference, and that preference will apply to their searches as well as our content. We have to meet their wants and needs at every step of the transaction, and do it effectively and transparently.

                SEO Success Metric                                           Score

Pages indexed in both languages                       F
Search results returned in both languages             F
Search results returned for a specific language       F
Total Score                                           F

Currently, the site uses JavaScript to let the user select the language they want to use. This only
changes the language of the content, not the language of the page. URLs, page titles, descriptions,
and keywords all stay the same and spiders generally only see and index Spanish content. The
mechanism for changing the language is invisible to them.

We need 2 sites – one in English and one in Spanish.

•  Pages will be optimized for that language. Titles, URLs, descriptions, keywords, and content will all be in the correct language, which is the one that matches the search terms.
•  Searches conducted in English will return results in English, searches conducted in Spanish will return results in Spanish, and the user will be directed to a page in the language of their choice.
•  Fewer keywords, higher keyword density, no mismatched words.
•  Twice as many pages will get indexed. Twice as many opportunities for unique content.
•  Setting up and maintaining two statically linked sites is the only strategy Google recommends for bilingual sites. It's the only method that ensures all content will be accessible to and indexed by their spiders.


It would work in much the same way as what we are doing now, except instead of using JavaScript it would use a static HTML link generated on the server. The link would take users to the opposite-language version of the same page. Links on Spanish-language pages would go to the other Spanish pages, and vice versa. The only potentially complicated technical issue would be dealing with the back button and maintaining the last language selected, but that shouldn't be insurmountable.
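As a sketch, assuming mirrored /es/ and /en/ URL trees (the path convention is an assumption about the eventual design, not the current site):

```python
# Server-side cross-language link: map a page's path to its counterpart
# in the other language's mirrored URL tree. Path scheme is hypothetical.
def other_language_path(path):
    if path.startswith("/es/"):
        return "/en/" + path[len("/es/"):]
    if path.startswith("/en/"):
        return "/es/" + path[len("/en/"):]
    return path  # language-neutral pages link to themselves

print(other_language_path("/es/videos/fergie-big-girls-dont-cry"))
```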

The real issue is management. We all know how much work it takes to maintain one site of this size.
Maintaining 2 wouldn’t be twice as much, since the content is already there in both languages and
the templates would be mostly the same, but it would be a significant increase. The content
management system is going to be the key to making this work, as it will for a lot of the things we
need to do in order to maintain good SEO.

Course of Action
•  Determine what steps can be taken to separate English-language content from Spanish-language content using existing tools.
•  Decide, with the development and content groups, what steps we can implement based on what we learn about the tools.
•  Make SEO a priority in the selection process for the new content management system.
•  Implement whatever changes we can to provide more transparent language support.

Site Structure - C

Having a clear, well-organized site structure is an important part of getting search engine rankings as well as providing value to visitors. The structure should make all the pages easy to find, and the URLs should communicate what the pages are about.

                SEO Success Metric                                           Score

Site Map                                                                       F
URL Structure                                                                  A
Robots.txt                                                                     A
Total Grade                                                                    C

Site Map
There are 2 types of site maps involved here. One is available from the site, usually from a link in the
footer or just the home page, used by visitors on occasion but critical to making sure that search
engine spiders have a link to follow to every one of our pages. The other type is an XML document or
documents that all major search organizations recognize.

The first type should be a dynamically generated list of links, using a short version of the page title as the anchor text (no "– Latin music videos at Batanga", for example). There has to be a hierarchy, because Google recommends that we stay below 100 links per page and because a visitor may want to use it, so it needs to be broken up into sections. And within the pages that make up the sitemap there has to be a link to every page on the site.

The second type should also be dynamically generated, in the root of the site, using XML following
the sitemap protocol. Google, Yahoo, and MSN all support using robots.txt to direct them to a
sitemap, and they all support the sitemap protocol, so this method will automatically update the
sitemap at those search organizations every time they crawl the site.
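Generating the XML document is mechanical once the page list exists; a minimal sketch using the sitemaps-protocol format (URLs shown are illustrative):

```python
# Build a sitemaps-protocol XML document from a list of page URLs.
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page
    return ET.tostring(urlset, encoding="unicode")

print(build_sitemap(["http://www.batanga.com/",
                     "http://www.batanga.com/videos/"]))
```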

Issues Found
•  Batanga currently has no sitemaps.

Course of Action
•  Develop a hierarchical sitemap for use by spiders and visitors, and add a link to it from the page footer.
•  Develop an automated sitemap that follows the sitemaps protocol.
•  Add a sitemap: line pointing to the XML sitemap's location to robots.txt.
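The robots.txt addition would be a single line (the URL shown assumes the sitemap lives at the site root, per the recommendation above):

```
Sitemap: http://www.batanga.com/sitemap.xml
```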

URL Structure
The URLs are well formed, with related content using the same domain and unrelated content
segregated properly.

Robots.txt
The robots.txt file is in place and validates correctly. We can use this to optimize the way the site gets crawled in the future.

Inbound Links - D

Inbound links are the third major component in getting good rankings. The more sites refer to you as
a relevant and important source, the better.

                SEO Success Metric                                           Score

Quantity of Links                                     D
Quality of Links
Leveraging Partners
Leveraging Content                                    D
Leveraging Media and Industry personalities
Total Score

Google takes several aspects of an inbound link into account:

•  Link quality, or the ranking of the page linking to yours for keywords related to yours, as well as its ranking overall. Paid links are being punished, as are link farms and exchanges. Paid directories are not necessarily punished, but the quality of the link is adjusted according to the amount of editorial review the site gives its links.
•  Anchor text, or the text someone clicks on to use the link. If the link says "Click here", it's not going to be as high-quality a link as one that says "Latin music videos".
•  Text near the link, or the context the link is in. Lots of people still use "Click here", so this helps get some "link juice" back when that happens. It also reduces the value of links from pages that are simply lists of links. And it goes to relevance – a link to Batanga from a page about tea in China, even if the site has a Latin context, is probably a paid link or a poorly managed content advertisement.
•  Is the link one-way or reciprocal? A site that links to you for no other reason than to direct its visitors to you can carry more weight than one you link to in return. Reciprocal links are fine, but it's important to be aware of who you are linking to and getting linked from. A reciprocal link is an agreement to allow their reputation to influence yours.

The sheer quantity of links on the web now is creating a trend where search engines feel more and
more comfortable ignoring links they aren’t sure they can trust.

Issues Found
•  Existing relationships with artists, content producers, and business partners are not being leveraged to generate links.
•  Publicity and exposure we provide for artists and events is not being used to generate links.
•  Anchor text in inbound links is usually not relevant to anything but the brand.
•  Directories are not being put to the best use possible.
•  Content is not being exposed to an editorial audience outside the Batanga site.

Course of Action
•  Complete an analysis of current inbound links to the site.
•  Devise a strategy for leveraging content and relationships from all areas of Batanga, including Live and the magazines, to generate links from highly relevant sources.
•  Develop preferred anchor text for links to the site, for use in requested links, which reflects the context and the target of the link.
•  Determine who is linking to our top competitors, and how we can get them to link to Batanga as well or instead.
•  Generate a list of business associates and partners and verify that they are linking to the site. Request a link if they are not.
•  Generate a list of directories that can provide quality links to the site, and submit the site to them (will require some budget).
•  Use existing resources to generate quality content for linking.
•  Explore using community news sites such as Reddit and Digg as alternate venues for publication of content to generate links and traffic.
•  Generate more unique and authoritative content by leveraging the DJs – daily blogs, song-of-the-day recommendations, record reviews, etc. – to drive links and RSS subscriptions.
•  Create or leverage existing relationships with people in the music industry (artists, publicists) and Latin media to generate links.
•  Work with charities. Because it's the right thing to do, but also because it helps generate high-value links from their sites.
•  Build out and promote widgets for all radio stations.

Content – B-

Content is everything, both to search engines and visitors. Everything else we do in SEO is to
highlight and promote the content we have to offer. The best way to rank ahead of your competitors
is to have high quality, keyword rich content and lots of it. It is also important to take advantage of
opportunities to insert keywords into the content in a way that is not intrusive or contrived and to
convey the importance and relevance of terms to search engines and visitors.

               SEO Success Metric                                           Score

Use of keywords                                      B
Use of image attributes                              C
Use of heading tags                                  D
Quantity of Content                                  B
Total Score                                          B-

Issues Found
•  Keyword density is low on many pages. Keywords often appear only once or twice on a page, if at all.
•  Image alt and title attributes are not being used consistently.
•  Image names are sometimes cryptic.
•  Heading tags are sometimes being used for administrative headers instead of content.
•  Copy is sometimes too short, affecting uniqueness and offering fewer opportunities to include keywords.

Course of Action
•  Modify the templates so they make appropriate use of heading tags, use keywords in the anchor text for links, and include keywords in image attributes.
•  Suggest implementing minimum-length guidelines for various types of content.
•  Implement a review process, either automated or manual using tools, to verify that keywords are being included in the content.
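The automated review could start as simply as a density check against each page's keyword list (a hypothetical sketch, not existing tooling):

```python
# Flag pages whose indexable text uses a target keyword too rarely.
def keyword_count(text, keyword):
    return text.lower().count(keyword.lower())

def flag_low_density(text, keywords, minimum=2):
    return [kw for kw in keywords if keyword_count(text, kw) < minimum]

copy = "Watch Latin music videos on Batanga. Stream Latin music all day."
print(flag_low_density(copy, ["latin music", "music videos", "radio"]))
```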

Internal Links - D
Internal links are important because they establish a clickable path to our content and they can help
to establish or reinforce relevance with the proper use of keywords.


Issues Found
•  Some pages don't have internal links that are visible to search engines.
•  Some pages have only a single link leading to them, increasing the chances they will be missed on a given crawl and limiting opportunities to establish relevance.
•  Links to radio stations go nowhere, so radio station pages are not getting indexed.

Course of Action
•  Implement sitemaps, as mentioned above.
•  Review the anchor text being used in internal links to ensure that keywords are being used.
•  Implement a two-way internal linking structure, which uses keywords in the anchor text to reinforce relationships and relevance between general and specific content.

