

    SEMANTIC SEARCH
    What it is and What You Need to Know
    06 April 2012
    By: Jonah A. Berger, SEO Manager


Google makes hundreds of ranking algorithm changes each year – many that fly under the radar or
are headlines one day and back page residents the next. The search engine is in the news again, this
time for potential changes to the look and definitely the feel of its search results that website owners
and agency partners need to pay attention to. The changes, referred to separately (but housed under
a similar content umbrella) as direct answers and semantic search in this article, could have a
profound impact on how search results are displayed. The positive news is that while these changes
may alter the SERP as we know it, current SEO efforts don’t necessarily need to change
dramatically. This article provides extensive background on the history of semantic search and
concludes with five takeaways that can help emphasize what you as a site owner or agency partner
need to know.

First, the ‘Semantic Web’

Tim Berners-Lee is credited not only with inventing the World Wide Web, but also with conceiving what’s widely
referred to today as the “Semantic Web.” The World Wide Web Consortium (i.e. W3C) director
believed early on that Web content was inadvertently designed for humans to read and not for
computer programs to manipulate. In a 2001 article on ScientificAmerican.com1, Berners-Lee
described the Semantic Web in the following manner: “The Semantic Web will bring structure to the
meaningful content of Web pages, creating an environment where software agents roaming from
page to page can readily carry out sophisticated tasks for users.”

Berners-Lee’s vision of “software agents roaming from page to page” of course ties in naturally with
SEO and how search engines of today work. When search was in its most basic form, it was a
content-based index of keywords that returned Web page results based on whatever the searcher
typed in. If you queried the engine for “crab cakes,” what you’d see in return was a multitude of
sites that had content that included the phrase “crab cakes.” Google, always one to tinker and tweak,
has improved its ranking algorithm over the years to add the power of link popularity. Now, if your
content is excellent but you don’t have much of a link portfolio, chances are you won’t rank where
you want to. What used to be a simple “let’s check our index to see where your keyword(s) match
and display the best page” has turned into a several-hundred-search-signals-strong algorithm that
has SEOs and site owners frothing at the mouth for even the slightest bit of insight. But even as Google’s
ranking algorithm has become more complex and relevance-based, it’s still unable to completely
understand word sequences and meaning like humans do. This is where semantic search comes in.
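
To make the “most basic form” of search concrete, here is a minimal Python sketch of a keyword index. It is only an illustration under stated assumptions: the page names and text are invented, the function names build_index and search are hypothetical, and no real search engine is claimed to work this simply.

```python
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of page names that contain it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in text.lower().split():
            # Strip common punctuation so "cakes." and "cakes" match.
            index[word.strip('.,!?"')].add(name)
    return index


def search(index, query):
    """Return the pages that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical pages, invented for illustration only.
pages = {
    "recipes.html": "Maryland crab cakes with lemon.",
    "menu.html": "Burgers, fries and crab cakes.",
    "about.html": "Our story and our team.",
}

index = build_index(pages)
print(sorted(search(index, "crab cakes")))  # prints ['menu.html', 'recipes.html']
```

Note what the sketch cannot do: it matches words, not meaning, so a question phrased in natural language returns only pages that happen to share its vocabulary.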

Take, for example, Google search results for the query “what burgers does kuma’s corner offer?”
The following screen shot reveals what the search engine displays:

[Screen shot: Google search results for the query. Caption: “Search results like these may not be exactly what the searcher is looking for.”]

Search results for the query that references the popular
Chicago burger restaurant are far from optimal – yet
they’re all too familiar to us. What’s wrong? They don’t
directly answer the query, which to a searcher is of utmost
importance. To tie in with Berners-Lee’s thoughts, when a
searcher queries an engine looking for information like this,
what’s returned should be similar to what a family member,
best friend or anyone familiar with the subject might say.
Instead of displaying a generic list of pages that are based
on an elaborate algorithm, a search engine would
understand searcher intent and use a more human element to show a list of Kuma’s Corner’s
burgers (possibly in the search interface itself – more on this in a moment). Additionally, the search
listings on the page would be more tailored to the query and provide information that searchers can
digest and use in their decision to click through to the site. Learning that Kuma’s offers a
burger topped with poblano peppers, bacon and cheddar cheese may provide enough of a boost for
that searcher to click through and learn about Kuma’s appetizers, side dishes, beers on tap, etc.

Changes likely on the way

There are at least two approaches Google may take to better understand user intent and how the
words of a search query and/or on a Web page are related to each other. The first is what’s being
referred to as “direct answers” – the idea that instead of having searchers click through a result and
leave Google to find the information they need, why not just present it on the SERP itself and let
them decide from there? The second is semantic search, which for such a big and bold name can, for
the purposes of this article, be broken into two areas that all SEOs and site owners should be
focusing on today: relevant page content and appropriate markup language use on those pages.

Direct answers

At the core of direct answers is Google’s Knowledge Graph, which the search engine continues to
refine after purchasing semantic search startup Metaweb in 2010. According to Google Fellow and
SVP Amit Singhal, the Knowledge Graph contains millions “of interconnected entities and
their attributes,” and as it grows it will allow Google to display results that
are powered more by artificial intelligence and less by a ranking algorithm. Knowledge Graph entities can be
classified as people, places and things that can help search engines like Google actually provide
answers to questions and not just match keywords located on a page.

As an example, Lexxe is a semantic search engine that uses what it calls “semantic keys” in
combination with related keywords to return results that answer that specific semantic key. If you
visit the search engine and type in the “price:” operator followed by a keyword (we’ll use
“bananas”), you’ll see the following:

At left in the Lexxe screen shot is a comparison box that shows the prices of bananas found in the
3,000-plus results returned by the search engine. The data indicate that 13 percent of the banana
prices in the results are $1 and 7 percent of the prices are $1.96. For a searcher who’s only
interested in what a banana costs, this is a quick and easy way to find out. Lexxe results of course
aren’t perfect, as is obvious from some of the yellow highlighted listings. It appears that the engine
returns pages that have both the keyword “bananas” on them as well as prices in the nearby text,
but those prices may not be related to bananas but instead something else on the page.
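
The behavior guessed at above (a keyword plus any price in the nearby text) can be sketched in a few lines of Python. This is not Lexxe’s actual method, which is not public in this article; the function name price_shares, the 40-character window and the sample snippets are all assumptions made for illustration.

```python
import re
from collections import Counter

# Matches dollar amounts such as $1 or $1.96.
PRICE = re.compile(r"\$\d+(?:\.\d{2})?")


def price_shares(snippets, keyword, window=40):
    """Tally prices found within `window` characters of `keyword`.

    Returns a dict mapping each price string to its percentage share.
    """
    keyword = keyword.lower()
    counts = Counter()
    for text in snippets:
        lowered = text.lower()
        for match in PRICE.finditer(text):
            start = max(0, match.start() - window)
            end = match.end() + window
            if keyword in lowered[start:end]:
                counts[match.group()] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {price: 100 * n / total for price, n in counts.items()}


# Invented result snippets; the last one is a deliberate false positive.
snippets = [
    "Fresh bananas for $1 a bunch",
    "Organic bananas, $1 each",
    "Bananas on sale: $1.96 per pound",
    "Bananas and a blender for $49.99",
]

shares = price_shares(snippets, "bananas")
print(shares)  # prints {'$1': 50.0, '$1.96': 25.0, '$49.99': 25.0}
```

The blender price lands in the tally simply because it sits near the word “bananas,” which is the same kind of mismatch visible in the highlighted listings.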

Search engines like Lexxe are nothing new; in fact, aside from data, some have been providing
searchers with topic/query summaries for years. One of these engines is DuckDuckGo.com2, which
uses what it calls “crowd-sourced” sites like Wikipedia and Wolfram Alpha to provide topic
summaries atop its results. The following screen shot shows an example for the query “who is john elway?”:

Like Lexxe’s, the DuckDuckGo summary isn’t perfect, but it does give the searcher more information
about their query than Google does, as the latter currently displays (in typical SERP fashion) a
Wikipedia link, a link to and links to a variety of other sports sites like ESPN. Google
aims to change this through the use of its Knowledge Graph and its millions and millions of entities.
Once it is unveiled, one could envision a “who is john elway?” search query returning
biographical information, as well as related content and links to Google Images, Maps, News and
even Google’s latest product rollout called “Play,”3 which is a marketplace that’s likely to sell not
only John Elway merchandise, but also merchandise for two teams he played on – the Stanford
Cardinal and Denver Broncos.

Uh oh! Will searchers ever leave Google?

A world in which searchers enter Google and never leave of course isn’t going to sit well with SEOs
and site owners who rely on driving quality search traffic, so what can be done to combat direct
answers? There isn’t one obvious solution, but since direct answers is likely to thrive on
understanding searcher intent and providing the content that matches their query, a content
strategy for your site that’s robust and focused more on expanding the content itself and less on the
keywords is a great start. Google’s direct answers, when simplified, are really just encyclopedia-like
snippets that present a plethora of facts to the searcher. Since that content is factual and relevant, why not
provide something similar on your site if you’re not already?

Markup language is also key

The previous paragraph probably isn’t news to most of you. Since the beginning of time (or, at least,
SEO), the importance of having great content on your site has been preached to site owners by
Google on down the line. So how can you improve the content your site offers even more –
especially as Google changes begin to take form? Through the use of markup language. Berners-Lee,
in his quest for a more Semantic Web, said in the same 2001 article that
“data transmitted across the Web is largely throw-away data that looks good but has little
structure.” He in particular was talking about how HTML is more for human interpretation and
document structure than it is for representation of the document subject (i.e. what the document is
actually about and what machines can do with it).

The Semantic Web of today, when you zero in on semantic search, is more than just HTML, as it uses
rich snippets like microformats, microdata and RDFa to provide site owners with HTML tags that can
be used to mark up their website pages with structured data. Think of these rich snippets in terms of
a restaurant analogy. Say you sit down at a new dining establishment and are presented a thick
menu that runs the gamut on options. Right away you realize that prices are on one page and food
items on another, both in no particular order. Instead of blindly ordering or taking a guess as to what
looks good or may be priced fairly, wouldn’t your experience be much more enriching and rewarding
if food and prices were associated with each other? Yes, this is just an analogy, but now you know
how search engines can benefit from being served a heaping helping of rich snippets.

We mentioned microformats, microdata and RDFa – so which ones should you use? Google, Bing
and Yahoo! changed the markup landscape in June 2011 when they announced the Schema.org4
initiative. Schema.org tags, which like the other markups don’t have any effect on the user
experience as they’re part of the page code, are recognized by the major search engines and allow
them to better understand the most pertinent information. Rather than populating a standard
<TITLE> tag with keywords and calling it a day, Schema.org presents a variety of elements
(remember people, places and things?) to search engines. Instead of an engine crawling a site and
wondering if a piece of content is a review or not, markup language allows for the site owner to
basically say, “Yes, Google, this is a review.” The more search engines understand page content, the
more relevant their search results can be.

To delve a bit into what Schema.org markup looks like, the following snippet tells a search engine
that the actress in a particular movie is Christina Ricci:

<span>Actor: <span itemprop="actor">Christina Ricci</span> (born February 12, 1980)</span>
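
To show how a program might read that attribute, here is a minimal Python sketch built on the standard library’s html.parser module. It is an illustration, not how any search engine’s crawler is known to work: the class name ItempropParser is hypothetical, and the sketch assumes well-formed, properly nested HTML. On a complete page, the snippet would normally sit inside an element that declares an itemscope and an itemtype.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they are not tracked on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ItempropParser(HTMLParser):
    """Collect the text inside every element that carries an itemprop attribute."""

    def __init__(self):
        super().__init__()
        self.properties = []  # finished (property name, text) pairs
        self._open = []       # one entry per open tag: [name, text parts] or None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        name = dict(attrs).get("itemprop")
        self._open.append([name, []] if name else None)

    def handle_data(self, data):
        # Text belongs to every itemprop element that is currently open.
        for entry in self._open:
            if entry is not None:
                entry[1].append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._open:
            return
        entry = self._open.pop()
        if entry is not None:
            name, parts = entry
            self.properties.append((name, "".join(parts).strip()))


snippet = ('<span>Actor: <span itemprop="actor">Christina Ricci</span> '
           '(born February 12, 1980)</span>')

parser = ItempropParser()
parser.feed(snippet)
parser.close()
print(parser.properties)  # prints [('actor', 'Christina Ricci')]
```

The surrounding text (“Actor:” and the birth date) is ignored because it sits outside the tagged element, which is the point of the markup: the machine no longer has to guess which words name the actor.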

Armed with this information, a search engine like Google for the query “christina ricci imdb” can
display a result like this:

Instead of the traditional search result that includes a clickable <TITLE> tag and the META
description blurb that follows, the enhanced result shows movie ratings and lists the director and
cast members. Google has said5 that rich snippets like these do not affect how a page ranks – so why
implement them? They can help increase search engine visibility and click-through rates. Rich
snippets can be used for more than just movie sites – a lot more, in fact. According to Google, they
are also useful for content related to people, events, reviews, products, recipes and even
breadcrumb navigation. Rich snippets are also valuable for sites that offer local content, as local
SEOs and site owners familiar with the acronym NAP (Name, Address, Phone Number) can attest. A
site that has individual location pages should include formatted NAP information on each, as
marked-up information like this is much easier for search engines to extract and use in a variety of
places. It’s also helpful for location-aware browsers and crawlers.
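
As a rough idea of what formatted NAP information can look like, the Python sketch below generates a microdata fragment using the Schema.org LocalBusiness and PostalAddress types. The function name nap_markup and the business details are invented for illustration; treat the output as a starting point to validate, not a guaranteed route to a rich snippet.

```python
from html import escape


def nap_markup(name, street, city, region, postal_code, phone):
    """Return an HTML fragment marking up NAP data as a Schema.org LocalBusiness."""
    e = escape  # escape user-supplied text so it cannot break the markup
    return (
        '<div itemscope itemtype="http://schema.org/LocalBusiness">\n'
        f'  <span itemprop="name">{e(name)}</span>\n'
        '  <div itemprop="address" itemscope '
        'itemtype="http://schema.org/PostalAddress">\n'
        f'    <span itemprop="streetAddress">{e(street)}</span>\n'
        f'    <span itemprop="addressLocality">{e(city)}</span>,\n'
        f'    <span itemprop="addressRegion">{e(region)}</span>\n'
        f'    <span itemprop="postalCode">{e(postal_code)}</span>\n'
        '  </div>\n'
        f'  <span itemprop="telephone">{e(phone)}</span>\n'
        '</div>'
    )


# Hypothetical business details, invented for illustration only.
html_fragment = nap_markup(
    name="Example Burger Bar",
    street="123 Main St",
    city="Chicago",
    region="IL",
    postal_code="60600",
    phone="(312) 555-0100",
)
print(html_fragment)
```

Generating the fragment from one function keeps the NAP format identical across every location page, which matters more than the exact layout chosen.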

Five semantic search takeaways

Semantic this, Google that - obviously there’s a lot to digest when it comes to the potential Google
changes that lie ahead. Here are five takeaways that you as a site owner or agency partner should
start thinking about right now:

1) Content is more than just keywords
If your content is good today, make it great tomorrow by keeping up with the times. When
performing keyword research, try to stray from the traditional and focus on word relationships and
product benefits. A site content audit can also be helpful, as it can outline pages that are lacking in
content or meaning. If Google does make semantic search results more of a priority, you should try
to respond with content that fits the guidelines. Product and services FAQ pages can be valuable to
searchers who ask questions, and factual information about a topic can go a long way toward
increasing your search visibility.

2) Rich snippets are your friend
If you currently use rich snippets like microformats, microdata or RDFa, don’t fret because Google
says that you can keep the formatting on your site(s) for the time being. However, at some point in
the future it’s probably easier to shift to Schema.org (if resources allow) because it is the default
markup used by the major engines. Visit Schema.org6 to learn more about the initiative and how to
effectively mark up your content.

3) Use a schema creator tool
If you’re in charge of implementing markup language on your site and don’t know the first thing
about what to do, there are plenty of schema creator tools out there that can help walk you through
the process. One free tool we recommend was created by Raven Tools7.

4) Test, test, test
Regardless of which markup language is used on your site, Google’s Rich Snippets Testing Tool8 can
quickly show you how the search engine would structure your metadata. If the Rich Snippets Testing
Tool is unable to show a preview, chances are appropriate metadata aren’t being used.

5) Continue to follow SEO best practices
No matter how much any Google change affects your site’s SEO efforts, it’s important to remember
where you came from. As the first sentence in this article indicates, Google makes hundreds of
ranking algorithm changes each year – some that pack more of a punch than others. It’s OK to make
changes to your site as algorithms and the search landscape evolve, but don’t go overboard and try
to reinvent the wheel over and over and over again.

