Metadata, Content Models, and Taxonomies: Strategies for Smart Content Management

					          POINT OF view

By: Laura Lerner, Senior Manager of Content Strategy, SapientNitro

True or false?
• Only developers care about metadata.
• A content model is the same thing as a content matrix.
• It doesn’t matter how content is managed as long as users get what they need.
• Taxonomy is another name for a site map.

If you answered true to any of these or aren’t quite sure, then you’ve come to the right place. Let’s clear
up any confusion on metadata, content models, and taxonomies and put to rest any questions you have
about what they are, the relationships and differences between them, and how they help make digital
experiences work.

Metadata is structured information that describes, explains, locates, or makes it easier to retrieve, use,
or manage an information resource. And an information resource, in this case, is anything
content-related: text, images, video, sound, or anything else that conveys meaning to a user.

Metadata has many benefits, including:
• Enabling machine readability when describing things in their constituent parts,
• Powering content management and flexible delivery to maintain content as a strategic asset, and
• Enabling effective retrieval through disambiguation, that is, removing the guesswork about what
content means.

There are three flavors of metadata: administrative (for example, date created, author, and version),
descriptive (for example, category, audience, and keywords), and structural (for example,
title, file type, and related documents).
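To make the three flavors concrete, here is a minimal sketch in Python of how the metadata for a single image might be grouped. The record, its field names, and its values are all made up for illustration; a real system would define its own.

```python
# Hypothetical metadata record for one image, grouped by the three flavors.
# Every field name and value here is invented for illustration.
image_metadata = {
    "administrative": {
        "date_created": "2012-03-14",
        "author": "J. Smith",
        "version": 3,
    },
    "descriptive": {
        "category": "Movie posters",
        "audience": "General",
        "keywords": ["poster", "drama"],
    },
    "structural": {
        "title": "Example movie poster",
        "file_type": "image/jpeg",
        "related_documents": ["movie-page-001"],
    },
}


def flavor_of(field_name: str) -> str:
    """Return which flavor of metadata a field belongs to."""
    for flavor, fields in image_metadata.items():
        if field_name in fields:
            return flavor
    raise KeyError(field_name)


print(flavor_of("author"))     # administrative
print(flavor_of("file_type"))  # structural
```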

In this example from IMDb, you can see a few pieces of metadata for just one image (the movie poster) on
this page, including the height, the alternate text, and the title. That’s just a small portion of the
metadata that’s been captured for this page.
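Image metadata like the height, alternate text, and title usually lives in the page markup, where a machine can read it. The sketch below uses Python’s standard html.parser module to pull those attributes out of an image tag. The markup, file name, and values are assumptions made up for this example, not copied from any real page.

```python
from html.parser import HTMLParser

# Hypothetical markup for a movie poster image; the file name and values
# are invented for illustration.
SAMPLE_HTML = (
    '<img src="poster.jpg" height="317" '
    'alt="Movie poster" title="Example Movie (2012)">'
)


class ImageMetadataParser(HTMLParser):
    """Collect the attributes of every <img> tag as a dictionary."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        if tag == "img":
            self.images.append(dict(attrs))


parser = ImageMetadataParser()
parser.feed(SAMPLE_HTML)
parser.close()

poster = parser.images[0]
print(poster["height"], poster["alt"], poster["title"])
```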

                     IDEA ENGINEERS                                                          © Sapient Corporation, 2012

       The content model is meant to create order from all that metadata. A content model:

       • Specifies the metadata that is required to manage a digital experience.
       • Provides the content blueprint and architecture for static and dynamic content by capturing
         system data and rules.
       • Enables CMS development by documenting requirements.

       So, let’s take a look at a high-level view, using IMDb again as an example. We’ll start with content
       types, the basic structural pieces of any content strategy. IMDb’s content types include general
       ones like Movie, Studio, Actor, and Director. Although the content types are separate, they can
       still have relationships with one another. For example, a Movie is produced by a Studio, is directed by a
       Director, and stars several Actors, who have appeared in many other Movies and worked with several
       Directors.
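Here is one way the Movie, Studio, Director, and Actor content types and their relationships might be sketched in Python. The class layout, the helper function, and the sample names are assumptions for illustration only; they don’t describe how any real site stores its content.

```python
from dataclasses import dataclass, field


@dataclass
class Studio:
    name: str


@dataclass
class Director:
    name: str


@dataclass
class Actor:
    name: str


@dataclass
class Movie:
    title: str
    studio: Studio        # a Movie is produced by a Studio
    director: Director    # ...is directed by a Director
    cast: list[Actor] = field(default_factory=list)  # ...and stars several Actors


def movies_featuring(actor: Actor, movies: list[Movie]) -> list[str]:
    """Follow the Movie-to-Actor relationship in reverse."""
    return [movie.title for movie in movies if actor in movie.cast]


# Hypothetical sample data.
studio = Studio("Example Studio")
director = Director("Example Director")
lead = Actor("Example Actor")
other = Actor("Another Actor")
catalog = [
    Movie("First Movie", studio, director, [lead]),
    Movie("Second Movie", studio, director, [lead, other]),
    Movie("Third Movie", studio, director, [other]),
]

print(movies_featuring(lead, catalog))  # ['First Movie', 'Second Movie']
```

Each content type stays separate, but the relationships let you move from one to another, which is what lets a page about an actor list that actor’s movies.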

       When do I need to create a content model?

       • Create a content model when there’s a new experience design or CMS and content management
         requirements haven’t been documented.
       • Don’t create a content model when an existing CMS is in use, no updates will be made, or
         content will migrate 1-to-1.

       But how do you know if you have a good content model? A high-quality content model adopts applicable
       best practices and should:

       • Enable reuse by separating content from presentation. This keeps content operationally
         efficient and saves resources and money.
       • Provide and enforce governance by marrying user and content manager needs. Working within
         specified requirements helps streamline decision-making and ensure optimal content performance.
       • Enable the active management of content. You should know how long content has been there,
         whether it’s viable, and who wrote it (among other performance measurements) in order to track
         and adjust it.
       • Document business rules that power content delivery and retrieval. Relationships and
         categories classify content so that users can find and access it and so that it lives in the
         appropriate channels.
       • Be built with a widely accepted standard, such as Dublin Core. Using a standard framework
         helps speed up development and offers a rich library of attributes and classification types.
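To show what building on a standard can look like, the sketch below maps a content model’s own attribute names onto the 15 elements of the Dublin Core Metadata Element Set. The element names come from the standard; the attribute names on the left of the mapping and the sample entry are hypothetical.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# Hypothetical mapping from one content model's attribute names to elements.
ATTRIBUTE_TO_ELEMENT = {
    "headline": "title",
    "author": "creator",
    "category": "subject",
    "summary": "description",
    "date_published": "date",
    "language": "language",
    "unique_id": "identifier",
}


def to_dublin_core(entry: dict) -> dict:
    """Translate an entry's mapped attributes into Dublin Core element names."""
    return {
        ATTRIBUTE_TO_ELEMENT[name]: value
        for name, value in entry.items()
        if name in ATTRIBUTE_TO_ELEMENT
    }


# Made-up entry; "internal_note" has no mapping, so it is left out.
entry = {"headline": "Example story", "author": "J. Smith", "internal_note": "draft"}
print(to_dublin_core(entry))  # {'title': 'Example story', 'creator': 'J. Smith'}
```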

       Now let’s look at the content model from a more in-depth, tactical level. Starting with content types,
       a content model must also describe the attributes of those content types, which can also require
       controlled vocabularies and taxonomies. If we continue with the example, attributes of the
       Movie content type include the Title and Genre.

       The content model also includes the system rules behind content types and their attributes. These
       rules include any piece of logic that is needed for a particular attribute. Examples of rules include:

       • Default values. These are the values that are prepopulated for content authors. For example, the
       content type Press Release might have the attribute Distribution to indicate what audiences can view
       the release. The default value of Distribution might be set to Public (rather than Internal Only) because
       content managers expect that most press releases will be made publicly available.


• Required fields. These are the attributes that must be completed for a content type to be saved or
published. For example, a content type might include both the Title and Subtitle attributes, but only the
Title must be authored.
• Format or validation. These rules provide authors guidance about the form the content or metadata
must take for a particular attribute. Common validation rules include character counts, acceptable
characters, and date format. For example, if a content type includes an Expiration Date attribute, the
rule for the attribute would specify the required format, such as mm/dd/yyyy or DD Month YYYY.
• Repeatable content. The content model should specify which attributes can have multiple values.
This is especially common when an attribute uses a taxonomy of values so that content managers can
assign multiple categories to a particular content type.
• User editable. This rule specifies that an author, rather than the CMS, can edit the value of an
attribute. For instance, the date when content is created may be system generated, but the title of
that content would be user editable.
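The five kinds of rules above can be sketched as data plus one small validation routine. This is only an illustration under stated assumptions: the AttributeRule class, the validate function, the attribute names, and the 80-character title limit are all hypothetical, and a real CMS would express these rules in its own configuration.

```python
import re
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class AttributeRule:
    name: str
    required: bool = False            # required field
    default: Any = None               # default value prepopulated for authors
    pattern: Optional[str] = None     # format rule, written as a regular expression
    max_length: Optional[int] = None  # character-count rule
    repeatable: bool = False          # attribute may hold multiple values
    user_editable: bool = True        # False means the CMS generates the value


# Hypothetical rules for a Press Release content type.
PRESS_RELEASE_RULES = [
    AttributeRule("title", required=True, max_length=80),
    AttributeRule("subtitle"),
    AttributeRule("distribution", default="Public"),
    AttributeRule("expiration_date", pattern=r"\d{2}/\d{2}/\d{4}"),  # mm/dd/yyyy
    AttributeRule("categories", repeatable=True),
    AttributeRule("date_created", user_editable=False),
]


def validate(entry: dict, rules: list) -> tuple:
    """Apply defaults, then return (completed entry, list of error messages)."""
    completed = dict(entry)
    errors = []
    for rule in rules:
        value = completed.get(rule.name)
        if value is None and rule.default is not None:
            value = completed[rule.name] = rule.default
        if value is None:
            if rule.required:
                errors.append(f"{rule.name} is required")
            continue
        if not rule.user_editable and rule.name in entry:
            errors.append(f"{rule.name} is system generated")
        values = value if isinstance(value, list) else [value]
        if len(values) > 1 and not rule.repeatable:
            errors.append(f"{rule.name} accepts a single value")
        for item in values:
            text = str(item)
            if rule.max_length is not None and len(text) > rule.max_length:
                errors.append(f"{rule.name} exceeds {rule.max_length} characters")
            if rule.pattern and not re.fullmatch(rule.pattern, text):
                errors.append(f"{rule.name} has an invalid format")
    return completed, errors


draft = {"title": "Q3 results", "expiration_date": "2012-12-31"}
completed, errors = validate(draft, PRESS_RELEASE_RULES)
print(completed["distribution"])  # Public
print(errors)                     # ['expiration_date has an invalid format']
```

Keeping the rules as data rather than burying them in form code means the content model document and the CMS configuration can be checked against each other.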

An optional piece of the content model is a content entry template. These templates help technology
developers create the CMS forms that authors will use to create content within the CMS. They show the
preferred layout of attributes per content type.

So with all this information and all this complexity around creating a content model, where do you
start? Well, you start by asking a lot of questions to fully develop a working, scalable, sustainable
content model. Questions like:

• What content and metadata already exist?
• What standards are in use?
• How will users find, access, and share this content?
• Will translation and localization be required?
• How is content acquired, created, approved, updated, and retired?
• How will content performance be measured?
• How will the content be stored?
• How does the CMS assemble and manage content?
• How will content be retrieved?


Once you have a good understanding of the pieces within a content ecosystem, you will want to distill
that input into core content types. To do that, look at the content and information you’ve assembled
through four basic lenses:

1. Uniqueness
Uniqueness is judged by the type of information the content contains, where it appears, and its
functional requirements. For instance, an article contains a different type of information than
an image gallery does. Because the two hold distinct types of information, they have different
functional requirements and, therefore, make up two different content types.

2. Reuse
Reuse also helps inform what becomes a content type. A good example is a press release; you may
want to reuse pieces of one to feed other portions of the site. And while you might store all press
releases in the news section, you might want to be able to consume the title in other areas of the site,
such as a widget on a homepage.

3. Presentation
While a good content model should usually separate content and presentation, sometimes we need
to have very fine control over the treatment of text or images. So when we have very specific delivery
requirements, looking at the layout and the format is important.

4. Source
Decide where the information is coming from. Is a particular business unit producing it? Does a third
party provide the content? Because the content source can drive requirements around how the content
is managed, it can be easier to segment this content by type.

Once you’ve defined the content types, you can then break down the attributes. Take it step-by-step:

1. Itemize distinct pieces of content (such as a title, body, and image or some combination thereof).
2. Describe what the content is (the content’s aboutness, in terms of subject, audience, or purpose).
3. Define the relationships (how this content type will interact with other content types and users).
4. Track its lifecycle (what you need to know about how it was created, changed, and archived).
5. Understand ownership (who created, updated, and approved it).
6. Specify where content goes and how to find it.

Typical attribute examples include the title, description, and keywords (all important from an SEO
perspective), as well as subject/category, author, date published, language, and unique ID. These
attributes are crucial because they let the system that manages and delivers content identify it clearly.

So what about taxonomy? At its highest level, taxonomy is a hierarchical list of controlled terms (that
the business has defined and approved) that supports the classification, management, delivery, and
retrieval of content.
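As a rough illustration, a hierarchical list of controlled terms can be sketched as a nested structure in Python, with a lookup that either finds an approved term’s place in the hierarchy or reports that the term isn’t approved. The terms and the path_to helper are hypothetical.

```python
# Hypothetical subject taxonomy: each key is an approved term and each value
# holds that term's narrower terms.
TAXONOMY = {
    "Science": {
        "Archaeology": {},
        "Astronomy": {"Planets": {}, "Stars": {}},
    },
    "Arts": {
        "Film": {},
        "Music": {},
    },
}


def path_to(term, tree=TAXONOMY, trail=()):
    """Return the path from the top of the hierarchy down to a term.

    Returns None when the term is not part of the controlled list.
    """
    for name, narrower in tree.items():
        here = trail + (name,)
        if name == term:
            return here
        found = path_to(term, narrower, here)
        if found:
            return found
    return None


print(path_to("Planets"))  # ('Science', 'Astronomy', 'Planets')
print(path_to("Gossip"))   # None
```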

The content model specifies when a new or refined taxonomy is required. That means that taxonomies
can’t replace a content model, and you should never build a content model around taxonomy.


Sometimes, due to its hierarchical nature, taxonomy is confused with a site map. A site map is how a
user experiences a digital presence, often including user-facing terms, shortcuts, and redundancies
that we wouldn’t include from a content management, or taxonomy, perspective. It’s important to note
that developing taxonomies is a whole process in itself, a process of art and science that’s out of this
paper’s scope.

Given all that, what’s the point of taxonomies anyway? If they’re not site maps, why do we need them?
Take a look at these three examples that illustrate good use.

1. A parametric search. Here, taxonomies power a search that enables a user to filter and
manipulate results using multiple parameters. If you look at, for instance, under the
music tab, you’ll see that files are classified multiple ways: by genre, themes, lyrics, and so on. Those
are all controlled groups of metadata terms — taxonomies — assigned to music files behind the scenes
to enable users to search and refine through those parameters.
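A parametric search of this kind can be sketched in a few lines of Python. The catalog, the parameter names (genre and theme), and both functions are assumptions made up for illustration; they aren’t taken from any real music site.

```python
# Hypothetical music catalog; every genre and theme value comes from a
# controlled list of terms.
TRACKS = [
    {"title": "Track A", "genre": "Jazz", "theme": "Night"},
    {"title": "Track B", "genre": "Jazz", "theme": "Travel"},
    {"title": "Track C", "genre": "Rock", "theme": "Night"},
]


def parametric_search(items, **filters):
    """Keep only the items whose metadata matches every supplied parameter."""
    return [
        item for item in items
        if all(item.get(key) == value for key, value in filters.items())
    ]


def facet_counts(items, facet):
    """Count how many items carry each term of one parameter."""
    counts = {}
    for item in items:
        counts[item[facet]] = counts.get(item[facet], 0) + 1
    return counts


jazz = parametric_search(TRACKS, genre="Jazz")
print([track["title"] for track in jazz])  # ['Track A', 'Track B']
print(facet_counts(jazz, "theme"))         # {'Night': 1, 'Travel': 1}

night_jazz = parametric_search(TRACKS, genre="Jazz", theme="Night")
print([track["title"] for track in night_jazz])  # ['Track A']
```

Because the terms are controlled, an exact match is enough. Free-text tags would need spelling and synonym handling before a filter like this could work.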

2. A recommendation engine. At TED, suggestions for content that might also be of interest are
offered, based on talks that the user has previously viewed. For instance, if you watch a TED
Talk about archaeology in space, the metadata assigned to that talk (an Archaeology tag
from a subject taxonomy) signals your interest in the subject, and the site can use it to recommend
other content that might be of interest as well.
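One simple way to turn subject tags into recommendations is to rank unviewed items by how many tags they share with what the user has already watched. The sketch below does that. It is not how any real recommendation engine is known to work, and the talk titles, tags, and recommend function are hypothetical.

```python
from collections import Counter

# Hypothetical talks and their subject tags.
TALKS = {
    "Archaeology from space": {"Archaeology", "Space"},
    "Digging up lost cities": {"Archaeology", "History"},
    "Life on other planets": {"Space", "Biology"},
    "The future of cities": {"Architecture"},
}


def recommend(viewed, talks=TALKS, limit=3):
    """Rank unviewed talks by how many tags they share with viewed talks."""
    # Count how often each tag appears across everything the user has viewed.
    interests = Counter(tag for title in viewed for tag in talks[title])
    scores = {
        title: sum(interests[tag] for tag in tags)
        for title, tags in talks.items()
        if title not in viewed
    }
    # Highest score first; ties are broken alphabetically for a stable order.
    ranked = sorted(scores, key=lambda title: (-scores[title], title))
    return [title for title in ranked if scores[title] > 0][:limit]


print(recommend(["Archaeology from space"]))
# ['Digging up lost cities', 'Life on other planets']
```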

3. Related content. Related content is another way taxonomies are used. At, viewing a
particular movie’s page can also provide the user with links to photos and videos tagged with that
movie title, creating a relationship from one content type to others.

There’s a lot more taxonomies can do, especially on the back end, and we’ve only scratched the surface
by looking at a few examples of how they work for users on the front end. But for now, let’s switch
gears and look at taxonomy before it’s implemented in a content management system. Sometimes, it’s
not so “behind the scenes” and we do see things that end up in the navigation.


For instance, at, there’s a vehicle taxonomy that goes through the models and breaks them down into trim
levels, which shows a good example of taxonomy being represented in navigation. So there’s a hierarchy
from the make to the model to the more specific styling categories.

Taxonomies are a big deal. From a North American perspective, we’ve got the American National
Standards Institute (ANSI), the National Information Standards Organization (NISO), and the
ISO, which document guidelines for controlled vocabularies and taxonomies, such as ANSI/NISO
Z39.19. Many organizations adhere to the recommendations and standards they provide.

BRINGING IT ALL TOGETHER
Now that you understand these three pieces of the puzzle, all that’s left is to bring it together. NPR
is a great example of a solid, multichannel content model in action. From a Google search, notice how
NPR propagates one item of content across multiple channels. So the metadata behind this powers a
website, Twitter feed, tablet, and smartphone presence; this content is reused across all these
channels through a presentation layer and scaling. NPR also uses this model to relate all its types of
content, including audio files from its radio programming. In this way, NPR is managing this particular
story everywhere it needs to go. And since NPR uses a highly structured model with robust metadata
behind it, NPR’s content can even be found on channels that NPR doesn’t control.
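The idea of one structured content item feeding several channels can be sketched as follows. This is not NPR’s implementation. The Story fields, the three rendering functions, the 140-character limit, and the example URL are assumptions chosen only to show content kept separate from presentation.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Story:
    headline: str
    summary: str
    body: str
    url: str


def for_website(story: Story) -> str:
    """Full HTML presentation for the website."""
    return (
        f"<article><h1>{escape(story.headline)}</h1>"
        f"<p>{escape(story.body)}</p></article>"
    )


def for_social(story: Story, limit: int = 140) -> str:
    """Short post: headline plus link, trimmed to fit a character limit."""
    room = limit - len(story.url) - 1  # one character for the space
    headline = story.headline
    if len(headline) > room:
        headline = headline[: room - 3] + "..."
    return f"{headline} {story.url}"


def for_phone(story: Story) -> dict:
    """Compact payload that a mobile app could lay out itself."""
    return {"headline": story.headline, "summary": story.summary, "url": story.url}


# Made-up story; the URL is a placeholder.
story = Story(
    headline="Example headline " * 10,
    summary="A one-sentence summary.",
    body="The full text of the story.",
    url="https://example.org/story/1",
)

print(len(for_social(story)))      # 140
print(for_phone(story)["summary"]) # A one-sentence summary.
```

The story is authored once. Each channel takes only the attributes it needs, which is why the attributes have to be separate, well-labeled fields rather than one formatted blob.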


    Metadata gets content from point A to points B through Z. It is a superstructure that lets us deliver
    visual experiences to a user and get the business results we need. The content model defines required
    metadata by marrying the needs of content consumers and managers. And the attributes (including
    taxonomies) provide the guardrails for valuable, well-maintained experiences.

    While metadata, content models, and taxonomies can sound like a bunch of technological
    gobbledygook, they’re really the magic behind all the experiences we deliver, so it’s time to put them
    in the limelight where they belong.

About the Author
A Senior Manager of Content Strategy, Laura Lerner is the Midwest regional lead for Content Strategy,
based in our Chicago, IL office, and lead moderator of our global content strategy community. Her nearly
15 years of experience in enterprise content management, editorial strategy, and business process
management have resulted in her modeling more content, both personally and professionally, than she
cares to remember.

