Multiple Personalities
An Intranet That Changes Character For Diverse User Groups
A Document Storage and Presentation Model for a Flexible Intranet

By
Knowledge Management Unit
Nampak Research and Development
Cape Town, South Africa
June 2003
Content Store Structures
o Traditional Models
o Unstable Content Stores
Where do we go wrong?
How do we try to address the problem?
o Our Approach to the problem
Design Elements of Our Solution
o Content Format and Tagging
o Main Components and their Roles
o Overview of System Operation
o Creating New Topics and Editing Existing Topics
o Creating New Channels and Editing Existing Channels
o Our Content Store Architecture
Programmed Retrieval Objects ( PRO )
o How PRO’s work
o Making PRO’s easier to use
o Other Ways to Use PRO’s
o Personal Menus
o E-mail Briefs and Newsletters
o Multi-mode Search Facility
New Possibilities and Further Enhancements Planned
o A: Examples of Dynamically Generated Menus for Various Channels
o B: Multi-mode Search Facility
o C: PRO Generated User Page
o D: Personal Menu
Copyright Statement: Copyright on this article is reserved. It should not be published nor reproduced, in whole nor in part, without the
express written permission of the author and of Nampak Group Research and Development Department. This article is the text of a
conference paper delivered at the Seventh Southern African Online Information Meeting (3-5 June 2003), organised by the South African
Online User Group. Permission is granted to this body for the distribution of this article to delegates at this conference.
Page 1 of 15 2003 - Nampak Group Research & Development
Nampak is a South African packaging manufacturer with operations throughout Africa and Europe. To support
these operations the company has a Strategic Packaging Intelligence Unit (SPI) and a Research &
Development Department (R&D). A virtual team of knowledge workers drawn from these two units provides
various information and intelligence products and services to the Group and its customers.
An intranet portal was required to present these intelligence products to several distinct and diverse user groups.
The system was required to:
• Present many unique, customized views of a single document set according to the user’s interest.
• Present, on a single page, various defined collections of documents from multiple sources, both internal and external.
• Frequently and quickly change subject menus and presentation options to reflect changing priorities,
without disturbing underlying documents or storage structures.
• Provide focused, plug-in “Knowledge Nuggets” to pages in other sections of the company intranet.
• Have a stable and easily manageable content store.
Content Store Structures
Previous experience with content storage and delivery models had taught us that the design of the underlying
content storage structure was a crucial element in the capability, flexibility, manageability and sustainability of
the final delivery portal. A great deal of thought and planning therefore went into this aspect at the early stages.
Traditional Models
Traditionally, document storage structures are subject-based. Users navigate down a single, fixed subject tree, opening folders to find the content they want. At any time, their view is limited to the contents of a particular folder.
When an on-line delivery model is required, the temptation is to simply clone the storage structure tree for a menu and allow users to navigate it in the same way.
Since the content storage structure is fixed, cloning it for use as an on-line menu limits your ability to provide:
• A customized user menu
• Alternative logical views of your content
This on-line delivery model may be appropriate for indexes of paper-based collections or a small electronic
document set which does not change often. However, while simple and quickly rolled-out, it also transfers
many problems and limitations, and compounds them as the body of content grows and matures. It also
greatly limits the possibilities and opportunities of on-line delivery.
Apart from their use as on-line menus, subject-based structures present their own administration and management problems:
• Multiple valid storage locations for a document within a subject tree.
• Ever-expanding branches and complexity in the subject tree.
• Changes to the tree to accommodate new or obsolete subjects require physical rearrangement of documents.
• Physical movement of a document is needed to re-assign it to a different subject.
To illustrate these points, consider the following examples:
Where do I file and retrieve documents?
Using this subject tree:
• Packaging
  o Steel
  o Paper
  o Glass
where would you file this document: “The Effect of Microfleems* on Beverages in Steel Cans”?
• What is a microfleem anyway?
• Should I make a new microfleem folder?
• Should I rather pick one of the other existing folders?
• Where would other people look for this document?
The answers would differ depending on your field of interest and which particular aspect was most important to you today.
To avoid this problem we could store duplicate documents in all appropriate locations. This is wasteful and
causes version problems – which is the “official” copy? If a user edits one copy, all the others are now out of
date. We could rather store it in one location and place cross-references in other possible locations, but what
happens if we later decide to file the document under a different subject – where are all the cross references that
must now be changed?
What happens if we refine the subject tree?
Before:
• Products
  o Beverages
  o Food
  o Cosmetics
After:
• Products
  o Beverages
      Hot
      Cold
We would need to manually sort existing content under Beverages into Hot and Cold to properly file it.
What menu would different user groups prefer?
A Marketing Manager may like:
• Sales Trends
• Product Launches
  o Local
  o Foreign
• Markets
  o Beverage
  o Food
  o Cosmetics
A Materials Scientist would prefer:
• Test Methods
• Production Processes
• Materials
  o Metals
      Steel
      Aluminium
  o Paper
  o Glass
How can you cater for many preferences with one fixed menu?
* Credit to Scott Adams and Dilbert for invention of the word “microfleem”
Unstable Content Stores
A content store has two aspects that must be considered:
• Initial storage and later retrieval of content
• Management of content in the store
An unstable content store results when unpredictable changes are made to its structure.
An intranet based on an unstable content store will become increasingly difficult to use and manage with time.
Where do we go wrong?
Typically, we begin by designing a sensible structure for our content store. We might appoint an administrator
to “look after” it. We populate the store, put in a web front end with a menu that mirrors the storage structure
and then open it to our users.
We then allow administrators (or worse still, any of our users) to change the storage structure at will to
accommodate new nodes as they see fit. We have an ever growing, ever changing and ever more complex
and unpredictable structure – Our definition of unstable. And since the user front end mirrors it, it too will
suffer the same handicap.
It becomes a vicious circle where it is increasingly difficult for users to decide where to store content and find it
again. For administrators, managing the system is like chasing a wagging tail.
How do we try to address the problem?
As a quick fix to our retrieval problems we perhaps put in a basic search engine.
We soon find that users are unimpressed: they don’t know or don’t want to learn how to use it. It has made finding content a little easier but it is still frustrating and inefficient, producing long lists of search results, mostly irrelevant.
We then look at buying a professional answer to our problems - A Knowledge Management Solution from a
reputable vendor with all the modules to handle this and that and a web-based front end. We install it and to a
point, it does help but ultimately it fails to deliver its true potential – Why? The root cause of the problem
remains – The unstable and unmanageable storage structure. This does not belittle the value of such
solutions. Many excellent products exist that will justify their high price tag and deliver excellent results
provided that they are served by a stable and predictable document store.
Our Approach to the problem
We designed a flexible, program generated, on-line menu system. It can generate a variety of custom menus and is completely independent of the content store structure.
With our flexible intranet model we have avoided an unstable content store by divorcing the on-line menu from the content store.
With this system, users no longer need any knowledge of the content store structure. They do not need to remember where they filed a document, and don’t need to guess where a colleague might have filed it. They therefore have no need nor desire to expand or change the content store to their preferences. Instead, they just change the online menu.
Since this menu system is programmed and therefore predictable, it allows us to have a fixed, and therefore stable, storage structure. In fact, it forces this requirement. The content structure design can now be optimized for the needs of maintenance and management, without worrying about how users will navigate it.
The needs of the user interface, on-line menu and content management requirements were developed first and the content store to serve them then followed.
An added benefit is that this fixed structure is much less complex than a subject based one. It is simple for people adding content to select a single, correct storage location.
Design Elements of Our Solution
Content Format and Tagging
Any document format that your Search Engine can index is suitable for use in the content store. Plug-in IFilters
are available which enable indexing of less common formats – Acrobat PDF is one example.
HTML, however, is our preferred format due to its smaller size and speed of delivery. We try, as far as possible,
to keep our content in a standard format. Various HTML templates are used for this purpose. This helps maintain
a common look for users.
Some meta-tagging of documents is also done. As a minimum, each document has an informative title tag.
This is used as the document link in search results. Standard lists of meta-tags are used for additional tagging
in certain document types such as news items. This allows for fine separation of search results as they can be
grouped by meta-tag.
To avoid information overload and irrelevance, content is stored in the smallest chunk feasible. This makes it
easier to supply accurate and focused information to the user. For example, a large 50-page report covering
many topics may be correctly retrieved and returned in search results, but the user will then have to wade
through the document to find the relevant section. XML format also lends itself well to this approach.
Main Components and their Roles
A decision was taken to avoid, as far as possible, any fixed, “hard coded”, menu pages because a large
number of these menus would require extensive maintenance of links and could not be changed quickly and easily.
Instead we based the menu system on program generated menus and display pages, supported by a search
engine and database application.
• The Search Engine indexes all documents in the content store.
• Following a television analogy, different “channels” are available to users that show topics of interest to distinct user groups. All channels are fed from the same content store. The channels allow a range of different logical views of that same data.
• A table in the database associates relevant topics with corresponding channel names.
• Another database table holds topic names with corresponding, expertly written search phrases that the search engine can use to find appropriate content.
• Programmed Retrieval Objects (or PRO’s) use the search engine and database to find, format and display content matching the user’s topic choice.
Each of these components will be described in more detail in the following sections.
Overview of System Operation
When a user selects a channel:
• A set of relevant topics for the channel is obtained from a database table.
• A topic menu for the channel is programmatically built for the user.
[Figure: example channels (Materials, Markets, Technology) and a topic menu (Beer, Dairy, Wine).]
See Appendix A for examples of menus from various channels.
When a user selects a topic:
• The expert search phrase for this topic is obtained from the database, for example:
    Dairy = (milk OR dair* OR yoghurt OR cheese)
    Wine = ...
• The user display page contains several separate sections. Each section displays content on the same topic but from a different source.
• The search engine indexes content in the store.
• Each section runs a separate search using the expert search phrase to find content from different sources.
• Search results from the different sources are built and presented on a single page according to pre-programmed instructions.
[Figure: the content store (Journals, Reports, News) feeds the search engine and its index, which serve the menu and the page for the topic “Dairy”.]
See Appendix C: PRO Generated User Page for a screenshot of an example page.
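The flow described in this section can be sketched in a few lines. This is only an illustration in Python, not our production code; the table names `channel_topics` and `topic_phrases`, the function names, and the assignment of the three example topics to the Markets channel are assumptions made for the example. The search engine is passed in as a plain function, so any engine could stand behind it.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory database holding the two lookup tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE channel_topics (channel TEXT, topic TEXT);
        CREATE TABLE topic_phrases (topic TEXT PRIMARY KEY, phrase TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO channel_topics VALUES (?, ?)",
        [("Markets", "Beer"), ("Markets", "Dairy"), ("Markets", "Wine")],
    )
    conn.execute(
        "INSERT INTO topic_phrases VALUES (?, ?)",
        ("Dairy", "(milk OR dair* OR yoghurt OR cheese)"),
    )
    return conn


def topic_menu(conn, channel):
    """Step 1: the topic menu for a channel comes from the database, not from folders."""
    rows = conn.execute(
        "SELECT topic FROM channel_topics WHERE channel = ? ORDER BY topic",
        (channel,),
    )
    return [row[0] for row in rows]


def expert_phrase(conn, topic):
    """Step 2: look up the expert search phrase for the chosen topic."""
    row = conn.execute(
        "SELECT phrase FROM topic_phrases WHERE topic = ?", (topic,)
    ).fetchone()
    if row is None:
        raise KeyError(f"no search phrase defined for topic {topic!r}")
    return row[0]


def build_page(conn, topic, sources, run_search):
    """Step 3: run one search per source with the same phrase and collect the sections."""
    phrase = expert_phrase(conn, topic)
    return {source: run_search(phrase, source) for source in sources}
```

A page for the Dairy topic would then be built with something like `build_page(conn, "Dairy", ["Journals", "Reports", "News"], run_search)`, where `run_search` wraps whichever search engine is in use.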
Creating New Topics and Editing Existing Topics
To define a new topic, a search phrase for it is simply added to the database table. There is no need to first
populate a folder with content. This topic can then be used in any channel.
Topic                   Search Phrase
Beer                    ( ale OR beer OR lager OR ( brew* AND NOT ( tea OR coffee ) ) )
Coca-Cola (drink)       ( “Coca-Cola” OR CocaCola OR “Coca Cola” OR ( coke AND NOT ( coal OR drugs ) ) )
Coca-Cola (company)     ( “Coca-Cola” OR CocaCola OR “Coca Cola” OR ( coke AND NOT ( coal OR drugs ) ) ) OR Fanta OR TAB
A query language expert writes this search phrase. The expert may have sufficient subject knowledge to write
it themselves or may consult with a subject expert. The search phrase can be tested and adjusted until
sufficiently accurate results are returned.
Some advantages of this approach are:
• No filing inconsistencies, as there is no need to assign a document to its topic(s) when saving. Assignment to topic(s) is rule based and therefore consistent between all users.
• Only one copy of a document is required in the data store. That one copy can then appear under an unlimited number of topics. There is no need to maintain multiple copies under relevant topics or any need for cross-referencing.
• It is easy to include common spelling variants and common misspellings in the search phrase.
• It is possible to have separate search phrases for terms that can have more than one meaning. For example, one search phrase for the Coca-Cola drink itself and another for The Coca-Cola Company as a whole. This makes it possible for the user to get accurately targeted context-aware results.
• It is possible to include associations of which the user may not be aware. For example, how many people know that Coca-Cola owns the Fanta brand? This is taken care of in the expert search phrase. The user then gets results that show associations of which they may never have been aware. This is new knowledge for them.
• The search phrase need only be crafted once and thereafter all users enjoy the benefit of an optimum search phrase by just selecting a friendly topic name. No advanced query language skill is needed.
• The topic can be edited at any time to instantly cater for real-world changes. For example, suppose Coca-Cola and Pepsi merge into a new company called ColaPep. It is easy to add a new “ColaPep” search phrase that includes the terms Coca-Cola and Pepsi. No physical moving of documents is involved and the changes are instantly reflected in user results.
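To make the rule-based assignment concrete, here is a toy matcher in Python. It is a sketch only: a real search engine evaluates the phrase against its index, whereas this example scans the text directly, and the nested-tuple form of the rules is an assumption chosen to avoid writing a query parser. The three rules mirror the search phrases in the table above.

```python
import re


def term_matches(term, text):
    """True if the term occurs as a whole word; a trailing * matches any word ending."""
    pattern = re.escape(term.rstrip("*"))
    if term.endswith("*"):
        pattern += r"\w*"
    return re.search(r"\b" + pattern + r"\b", text, re.IGNORECASE) is not None


def matches(rule, text):
    """Evaluate a nested ("OR" | "AND" | "NOT", ...) rule against a document text."""
    if isinstance(rule, str):
        return term_matches(rule, text)
    operator, *operands = rule
    if operator == "OR":
        return any(matches(operand, text) for operand in operands)
    if operator == "AND":
        return all(matches(operand, text) for operand in operands)
    if operator == "NOT":
        return not matches(operands[0], text)
    raise ValueError(f"unknown operator: {operator}")


COKE = ("AND", "coke", ("NOT", ("OR", "coal", "drugs")))

TOPICS = {
    "Beer": ("OR", "ale", "beer", "lager",
             ("AND", "brew*", ("NOT", ("OR", "tea", "coffee")))),
    "Coca-Cola (drink)": ("OR", "Coca-Cola", "CocaCola", "Coca Cola", COKE),
    "Coca-Cola (company)": ("OR", "Coca-Cola", "CocaCola", "Coca Cola", COKE,
                            "Fanta", "TAB"),
}


def topics_for(text):
    """Return every topic whose rule matches; one document can appear under many."""
    return [name for name, rule in TOPICS.items() if matches(rule, text)]
```

Note that a single stored document can be returned under several topics without being copied or cross-referenced, and that changing a rule changes the results for every user at once.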
Creating New Channels and Editing Existing Channels
To create a new channel, three steps are needed:
1. A new channel name is entered into the database table.
2. A set of topics for the channel is defined and search phrases for them are written.
3. A user display page for the channel is programmed. A basic template is opened and customized to show all the desired content for the channel.
Our Content Store Architecture
Having first designed the user interface and retrieval mechanisms, the content storage architecture to support
them almost writes itself. It is uncomplicated, predictable, very seldom needs changing, and then only by the
system architect. In short, it is stable.
Its features are:
• A few large content storage bins are used. To determine the hierarchy for these bins, we use our OATS model: Ownership, Age, Type, SubType.
  o O: At the top level, split by Ownership. This allows you to clearly demarcate content that is owned, managed and maintained by different entities in the organization.
  o A: Next, split into current and archive material. Fresh, current content is likely to have higher value and will be the most commonly used material, so separate it from old, out-of-date material.
  o T: Then split by document type (news, report, journal etc). This makes it easier to accurately and judiciously apply expensive indexing, search and processing resources. For example, high value, frequently changing news item folders can be automatically indexed hourly, while monthly journals are indexed only when required.
  o S: At the lowest level, split into any further subtypes you may need. For example, news feeds from various information vendors may have different usage and copyright restrictions. Having these split into discrete folders makes it easier to apply access and retirement policies.
• There is no need to file individual documents under their corresponding topics. In fact, topics are purposely excluded from the overall hierarchy. The search engine and database take care of finding content on any particular topic.
• The storage structure does not change nor expand as new topics come and go, so it is stable. This makes the location and management of content much easier and more predictable.
• Little physical movement of content is required, even if user display options are radically changed. Movement is limited to routine maintenance where batches of content are moved from “Current” into “Archive”.
• The content store is easier to maintain and administer. Since content is collected by Ownership and document type, it is easier to set access permissions and apply indexing and archiving policies. Contrast this with a topic-based structure where a single folder may contain mixtures of document types, and require different access permissions for individual documents.
[Figure: example content store folders annotated with maintenance policies and access permissions. Policy annotations: manual re-index as required; archive monthly; content changes continuously; re-index hourly; content changes infrequently; archive annually. Access permission annotations: all users have access, no special folder permissions needed; this content subject to user licences, set folder permissions for licensed user groups only; confidential, restrict permissions.]
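The OATS hierarchy lends itself to a small sketch. The Python below is illustrative only: the class name `StorageBin`, the root folder, the owner and subtype names used in the usage note, and the fallback schedule are assumptions. Only the hourly indexing of news and the on-demand indexing of journals come from the text.

```python
from dataclasses import dataclass, replace
from pathlib import PureWindowsPath

# Indexing schedule per document type. The two entries follow the examples in
# the text; the fallback used for other types is a hypothetical default.
INDEX_SCHEDULE = {"News": "hourly", "Journals": "on demand"}


@dataclass(frozen=True)
class StorageBin:
    """One storage location, named strictly by Ownership, Age, Type, SubType."""
    owner: str
    age: str          # "Current" or "Archive"
    doc_type: str
    subtype: str = ""

    def path(self, root="C:/Data Store"):
        """Build the folder path; note that no topic appears anywhere in it."""
        parts = [self.owner, self.age, self.doc_type]
        if self.subtype:
            parts.append(self.subtype)
        return PureWindowsPath(root, *parts)


def index_schedule(storage_bin):
    """Indexing policy depends only on the document type."""
    return INDEX_SCHEDULE.get(storage_bin.doc_type, "on demand")


def archive(storage_bin):
    """Routine maintenance: the only move is from Current to Archive."""
    return replace(storage_bin, age="Archive")
```

For example, `StorageBin("R&D", "Current", "News", "Vendor A").path()` gives one predictable folder for that vendor's current news, and `archive(...)` changes nothing but the Age level.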
Programmed Retrieval Objects ( PRO’s )
The flexibility of the system comes from the use of Programmed Retrieval Objects or PRO’s for short.
PRO’s are re-usable blocks of code that can be inserted into a user display page. They use the search engine to retrieve a required block of content, format it and insert the content into the page. Examples of PRO’s we use:
News PRO
Displays the latest news on a topic supplied to it. The titles are links to the actual articles. It also gives you options for displaying results:
• Include a short summary with the title.
• The height and width to occupy on the page.
Focused Search PRO
This PRO is typically inserted into a page when the user has selected a topic from the menu. The search box is pre-programmed with the expert search phrase associated with the topic in the database. The user then only has to enter a further word or two to get an accurate and highly focused set of search results. The search that the user is actually running in this example is:
(milk* OR dair* OR yoghurt OR joghurt OR yogurt OR cheese OR whey) AND bottle*
Another PRO, given a topic and a source, provides a list of titles, a short teaser phrase and a thumbnail of an image in the file. The titles are linked to the actual article.
Any combination of PRO’s can be used together in a single page to construct a variety of customized views of one content store. See Appendix C: PRO Generated User Page for an example of this.
How PRO’s work
To get results from a PRO:
• Specify an existing topic
• Specify any additional search phrases
• Specify from where in the data store the content is to be drawn
• Specify which search index to use
• Specify any content age restraints
• Specify any display options
The PRO will then:
• Get the appropriate search phrase from the database and construct search parameters for the search engine
• Pass the search to the search engine
• Receive the raw search results from the search engine
• Format the search results according to the instructions programmed into the object and any display options specified
• Return the completed HTML text snippet for insertion into the user display page
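The inputs and steps listed above map naturally onto a single function. The following Python sketch is an assumption-laden illustration rather than our ASP code: the names `Hit` and `news_pro`, the keyword arguments passed to the search function, and the choice of an HTML list for the output are all hypothetical.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Hit:
    """One raw search result (hypothetical shape)."""
    title: str
    url: str
    summary: str = ""


def news_pro(topic, phrases, search, *, extra="", location="", index="",
             max_age_days=None, show_summary=False):
    """Return an HTML snippet of linked titles for a topic.

    phrases: mapping of topic name to expert search phrase (the database table).
    search:  callable standing in for the search engine.
    """
    # 1. Get the expert phrase and build the search parameters.
    query = phrases[topic]
    if extra:
        query = f"({query}) AND ({extra})"
    # 2. and 3. Pass the search to the engine and receive the raw results.
    hits = search(query=query, index=index, location=location,
                  max_age_days=max_age_days)
    # 4. Format the results according to the display options.
    items = []
    for hit in hits:
        line = f'<a href="{escape(hit.url, quote=True)}">{escape(hit.title)}</a>'
        if show_summary and hit.summary:
            line += f"<br>{escape(hit.summary)}"
        items.append(f"<li>{line}</li>")
    # 5. Return the finished snippet for insertion into the page.
    return "<ul>" + "".join(items) + "</ul>"
```

Because the search engine is injected as a function, the same object can be pointed at a different index or engine without changing the pages that include it.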
Making PRO’s Easier to Use
The News PRO described earlier is, predictably, a very commonly used object. People want news on many
different subjects so this PRO is used many times in various pages. To avoid having to repeatedly supply the
search index, content location and age restraints every time it is used, a further database table is used. This
table contains entries for commonly used content sources.
Source Name     Index Name      Location                        Maximum Age
Fresh News      Current News    C:\Data Store\Current\News      3 days
Recent News     Current News    C:\Data Store\Current\News      1 month
Archive News    Archive         C:\Data Store\Archive\News      Any age
Using this database table makes it much easier for an editor to insert a PRO into a custom page:
    Topic = Coca-Cola (company)
    Source = Fresh News
    <INCLUDE NewsPRO>
Just these three lines will insert the latest news about the Coca-Cola Company into any page.
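The source table simply expands one friendly name into the three parameters an editor would otherwise have to supply each time. A minimal Python sketch follows; the names `Source`, `SOURCES` and `pro_parameters` are hypothetical, and “1 month” is approximated as 30 days for the example.

```python
from datetime import timedelta
from typing import NamedTuple, Optional


class Source(NamedTuple):
    """One row of the content source table."""
    index: str
    location: str
    max_age: Optional[timedelta]   # None means any age


SOURCES = {
    "Fresh News": Source("Current News", r"C:\Data Store\Current\News",
                         timedelta(days=3)),
    # "1 month" is approximated as 30 days in this sketch.
    "Recent News": Source("Current News", r"C:\Data Store\Current\News",
                          timedelta(days=30)),
    "Archive News": Source("Archive", r"C:\Data Store\Archive\News", None),
}


def pro_parameters(topic, source_name):
    """Expand Topic + Source into the full parameter set a PRO needs."""
    source = SOURCES[source_name]
    return {
        "topic": topic,
        "index": source.index,
        "location": source.location,
        "max_age": source.max_age,
    }
```

With this in place, the editor's two lines (Topic and Source) are enough to drive a call such as `pro_parameters("Coca-Cola (company)", "Fresh News")`.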
Other Ways to Use PRO’s
PRO’s are not limited to use on our own intranet site. There are several other departmental intranets and
customized PRO’s can be inserted into any of those pages as well. This allows the sharing of specialized
“knowledge nuggets” between departments and divisions.
PRO’s can also be used to offer content from selected external sources such as other company intranets,
databases and the Internet. To do this the search engine is set to index the selected external content. Then
when a user clicks on a topic, one PRO can display internal content while another can show links to external
content relevant to the topic. This then removes the need for users to visit all these external sources
individually to look for content. Instead, it can be presented to them all on one page.
Since programmed pages are used to find and display content and not hard-coded links, further automated
features become possible. Listed below are some of these additional features.
Personal Menus
Besides the choice of channels, we have also developed a facility that lets users build up their own personal menu. The facility enables users to save preferred views and user defined searches for future use.
Most of the program-generated pages include an “Add to My Channel” button. When a user clicks it, the
address of the page and all the input parameters that generated it are stored in the database along with the
user name. When the user next opens their My Channel page, all of their personal entries are retrieved from
the database and a personal menu is built from them.
When they select an entry on their personal menu the query that produced the original page is re-run. This
means that they get the latest content and not just a copy of the original page.
See Appendix D: Personal Menu for an example of this.
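The key point is that the menu stores the recipe for a page, not the page itself. A small Python sketch of that idea, with a hypothetical `my_channel` table and function names, and with the page renderer passed in as a function:

```python
import json
import sqlite3


def open_store():
    """Create the personal menu table (hypothetical schema) in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE my_channel (user_name TEXT, label TEXT, page TEXT, params TEXT)"
    )
    return conn


def add_to_my_channel(conn, user_name, label, page, params):
    """Store the page address and the input parameters that generated it."""
    conn.execute(
        "INSERT INTO my_channel VALUES (?, ?, ?, ?)",
        (user_name, label, page, json.dumps(params, sort_keys=True)),
    )


def personal_menu(conn, user_name):
    """Rebuild the user's menu from their saved entries, oldest first."""
    rows = conn.execute(
        "SELECT label, page, params FROM my_channel "
        "WHERE user_name = ? ORDER BY rowid",
        (user_name,),
    )
    return [(label, page, json.loads(params)) for label, page, params in rows]


def open_entry(entry, render):
    """Re-run the saved query so the user sees current content, not a stale copy."""
    _label, page, params = entry
    return render(page, **params)
```

Here `render` stands for whatever generates a page from its address and parameters; each time a menu entry is opened it is called afresh.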
E-mail Briefs and Newsletters
Users can subscribe to various briefs and newsletters delivered by e-mail. Once content is saved into the system in a standard format it can not only be displayed in intranet pages, but also be programmatically extracted, reformatted and packaged into an e-mail format. Highly focused newsletters can be quickly produced by editors with very little extra effort.
The editor uses an on-line form to specify topics, type of content, age and display options. As for a normal page, the system then finds relevant content, constructs an index table, includes the content and images from the documents, formats it and produces the finished file ready for sending.
Multi-mode Search Facility
The one-line textbox search facility seen on many sites does not produce adequate search results for many
users. Most users are unskilled in the query language needed to get good results from it. This is serious
because an easy-to-use and accurate search facility is an extremely important component of a site. It is
imperative to spend the time in developing and honing it to assist the user as much as possible.
On our intranet we offer three modes of use, each one with automatic pop-up help to guide the user towards
optimum use of the facility:
• Keyword / All, Any, Exclude: This is the default mode, offering the best trade-off between ease of use and accuracy of results.
• A second mode that is the easiest to use but least accurate. It is useful for getting a wide spread of content around a topic.
• A third mode for expert searchers. It offers the most accurate results for those skilled in query language.
See Appendix B: Multi-mode Search Facility for a screenshot.
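The default Keyword mode amounts to building a query-language expression on the user's behalf. A hedged Python sketch follows; the function name `build_query` is hypothetical, and the AND / OR / NOT syntax is the one used in the topic search phrases in this paper, which may differ in detail from what a given search engine accepts.

```python
def _quote(term):
    """Wrap multi-word terms in quotes so they are searched as a phrase."""
    return f'"{term}"' if " " in term else term


def build_query(all_words=(), any_words=(), exclude_words=()):
    """Turn the three form fields (All, Any, Exclude) into one boolean query."""
    clauses = []
    if all_words:
        clauses.append(" AND ".join(_quote(word) for word in all_words))
    if any_words:
        clauses.append("(" + " OR ".join(_quote(word) for word in any_words) + ")")
    if not clauses:
        # Excluding words only makes sense relative to something being searched for.
        raise ValueError("enter at least one word under All or Any")
    query = " AND ".join(clauses)
    if exclude_words:
        excluded = " OR ".join(_quote(word) for word in exclude_words)
        query += f" AND NOT ({excluded})"
    return query
```

The user fills in three simple boxes and never sees the operators, while the search engine still receives a properly structured query.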
In developing our intranet we aimed to keep the code open-source and technologies interchangeable. This
means that functionality can be quickly altered at any time and individual software components can be mixed
and matched independently.
• Any webserver which supports server-side scripting can be used.
• The scripting language can be whichever one your organization is most comfortable with.
• Any reasonably competent search engine can be used to power the system, for example SiteServer, Microsoft Index Server or Verity.
• The database, likewise, is your choice. For high volume usage SQL Server, Oracle or similar is recommended. For lighter duty applications with limited users something like Access, Approach or MySQL could be feasible.
The system does a lot of processing to generate pages. As a result, a suitably powerful server is required. The
exact specification will depend on the number of simultaneous users. For very high usage, a number of servers
working in tandem to distribute the load may be needed.
Our own system uses a master staging server running Microsoft IIS and the Search and Index Server components of
SiteServer. Two similar production servers are also used. The master server does the very resource intensive
indexing work and holds master copies of the page generation code. The production servers are therefore free
to service user requests. Completed indexes and code pages are mirrored from the master server to the
production servers. More production servers can be plugged in as more capacity is required. ASP and
VBScript are used for server-side scripting and the database is SQL Server.
New Possibilities and Further Enhancements Planned
We focus on content in electronic and paper media. But what about organic media? The human brain.
If we are to harness the full intelligence resources in our organisation, we must recognize that they reside not only in documents, but also in the expertise and knowledge of people. Capturing this knowledge into an electronic format is, for now, a dream. We intend to at least develop further functionality to provide search results that include links to people with expert knowledge of a topic.
The information needed to enable this feature will be generated both manually and programmatically. We hope
to establish and maintain an expert knowledge inventory of people in the organization, linked to the topic
definitions in the database.
We will also investigate ways of collecting this information in real-time by logging individual user activity on the
site. Along with relevant documents we could then also provide “Other people that often access this topic”
links. In an organization with thousands of people, geographically widely spread, this would be a valuable tool
for connecting people to expert human knowledge.
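As a first idea of how such links might be derived, the Python sketch below counts topic accesses per user from an activity log. This feature is still only planned, so everything here is an assumption: the log format of (user, topic) pairs, the function name, and the thresholds.

```python
from collections import Counter


def frequent_visitors(access_log, topic, top_n=3, min_visits=2, exclude=None):
    """Return up to top_n users who opened a topic at least min_visits times.

    access_log: iterable of (user, topic) pairs, one per page request.
    exclude:    the user currently viewing the page, who should not be listed.
    """
    counts = Counter(
        user for user, visited in access_log
        if visited == topic and user != exclude
    )
    return [user for user, visits in counts.most_common(top_n)
            if visits >= min_visits]
```

The minimum visit count is there so that a single casual click does not label someone as a source of expertise.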
You can build an intranet, but will people keep coming back to it?
To keep people coming back to a site, it must be easy and quick to get focused and relevant information. The job of delivering this is never completed and will require constant refinement and improvement of systems.
Even when this is achieved, do people at work these days have the time to spend browsing the site? We
intend to develop an alerting utility that notifies users of the arrival of new content that would interest them.
However, if you are going to push content to people, it had better be highly relevant and free of clutter. As a
first step we would only push content that fits topics they themselves have added to their personal menu.
I would like to thank colleagues in the Strategic Packaging Intelligence Unit and the Research & Development
Department for their collaboration and assistance in developing this intranet model. Thanks also to colleagues
in Business Information and IT for sponsorship, provision of hardware and network services.
Appendix A: Examples of Dynamically Generated Menus for Various Channels
Appendix B: Multi-mode Search Facility
Appendix C: PRO Generated User Page
Appendix D: Personal Menu