Controlled Vocabulary and Folksonomies by forrests


Folksonomies and Social Tagging

What are folksonomies?
• Folksonomies (also known as “social classifications”) are user-created metadata.
• They are a grassroots community classification of digital assets.
• The term “folksonomy” was coined by Thomas Vander Wal and merges the terms “folk” and “taxonomy.”

Where are folksonomies found?
• Folksonomies are found in social bookmark managers such as Furl, which allow users to:
– Add bookmarks of sites they like to their personal collections of links
– Organize and categorize these sites by adding their own terms, or tags
– Share this collection with other people with the same interests.

• The tags are used to collocate bookmarks: (a) within a user’s collection; and (b) across the entire system, e.g., the page for a tag shows all bookmarks that any user has tagged with “blogging”.
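The two kinds of collocation can be sketched in a few lines of Python. This is a minimal illustration, not how any particular bookmarking service works: the sample bookmarks, the example.org URLs, and the `collocate` helper are all hypothetical.

```python
# Hypothetical sample data: (user, url, tags) bookmarks.
bookmarks = [
    ("alice", "http://example.org/a", {"blogging", "tools"}),
    ("alice", "http://example.org/b", {"cooking"}),
    ("bob", "http://example.org/a", {"blogging"}),
    ("bob", "http://example.org/c", {"blogging", "writing"}),
]


def collocate(bookmarks, tag, user=None):
    """Return the sorted URLs tagged with `tag`, optionally limited to one user."""
    urls = {
        url
        for owner, url, tags in bookmarks
        if tag in tags and (user is None or owner == user)
    }
    return sorted(urls)


# (a) within one user's collection
within_alice = collocate(bookmarks, "blogging", user="alice")
# (b) across the entire system, by any user
across_system = collocate(bookmarks, "blogging")
print(within_alice)
print(across_system)
```

The same tag lookup serves both views; only the optional user filter differs.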

Social Bookmarking and Social Tagging
• what is social bookmarking?
– public sharing of links
• association of tags (keywords) with links

– network of related links created by users
• network of related tags created by users

• what is tagging?
– act of associating a term with a link or article
– labelling or classifying for personal use

• Tagging creates an association between user, item and set of tags

Inter-term relationships
• There are no clearly defined relations between and among the terms in the vocabulary, unlike formal taxonomies and classification schemes, where there are multiple kinds of explicit relationships (e.g., broader, narrower, and related terms) between and among terms.
• Folksonomies are simply the set of terms that a group of users tagged content with; they are not a predetermined set of classification terms or labels.

Popular folksonomy sites
• Flickr
• Frassle
• Furl
• Simpy
• Spurl
• Technorati

How folksonomies work
• Registration is free. Little personal information is required; normally just a login name and password.
• Once registered, you add a bookmarklet to your browser. When you find a web page you'd like to add to your list, you use the bookmarklet to add it to the manager site. You then assign keywords to describe the content of the site.
• If the page has already been bookmarked by other people, you will be shown the most popular tags assigned to it; you can assign your own tags, or simply click on the popular tags to have them assigned automatically.
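The "most popular tags" step can be sketched as a frequency count over earlier bookmarks of the same page. This is only an illustration under stated assumptions: the `popular_tags` function, the sample users, and the example.org URLs are hypothetical, and real services may rank suggestions differently.

```python
from collections import Counter


def popular_tags(bookmarks, url, limit=3):
    """Rank the tags other users assigned to `url`, most frequent first."""
    counts = Counter(
        tag
        for _user, bookmarked_url, tags in bookmarks
        if bookmarked_url == url
        for tag in tags
    )
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _count in ranked[:limit]]


# Hypothetical earlier bookmarks: (user, url, tags).
existing = [
    ("ann", "http://example.org/page", ["folksonomy", "tagging"]),
    ("ben", "http://example.org/page", ["tagging", "web2.0"]),
    ("cat", "http://example.org/page", ["tagging", "folksonomy"]),
    ("dan", "http://example.org/other", ["cooking"]),
]
suggestions = popular_tags(existing, "http://example.org/page")
print(suggestions)
```

A new user who clicks the suggested tags reinforces the existing counts, which is one way the imitation effect described later in the slides can arise.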

The popularity of folksonomies
• The growing popularity of folksonomies can be attributed to two principal factors:
– An increasing need to exert control over the mass of digital information that we accumulate on a daily basis.
– A desire to “democratize” the way in which digital information is described and organized by using categories and terminology that reflect the views and needs of the actual end-users, rather than those of an external organization or body.

What is Social Bookmarking?
• Social bookmarking is a server-side, web-based service which allows users to create, manage and share their personal bookmarks in a social community.
• Social bookmarking systems have three major axes: users, tags, and URLs.
• Social bookmarking systems are a type of folksonomy.

…then what is folksonomy?
• Folksonomy is a collaboratively generated, open-ended labeling system that enables users to categorize content by freely chosen labels.
• Thomas Vander Wal coined the phrase by combining “folk” + “taxonomy”.
• While folksonomy appears to be the most popular name, other names for the same phenomenon have been proposed, including: folk classification, folk taxonomy, ethnoclassification, distributed classification, social classification, open tagging, free tagging, faceted hierarchy, etc.

Social Bookmarking as a Classification System
• A classification system is a structured scheme for categorizing knowledge, entities or objects to improve access or study, created according to alphabetical, associative, hierarchical, numerical, ideological, spatial, chronological, or other criteria.
• Traditional methods for organizing information include controlled vocabularies, taxonomies, thesauri, and ontologies.

Function of Social Bookmarking
• Method for organizing and storing information
– Social bookmarking as a type of sense making
– Allows users to organize personal information their way

• Connects users to other related topics and ideas
– Gives the users the ability “to sort the wheat from the chaff”
– More narrowed focus, vetted by humans as opposed to computers
– Collective wisdom: tags are ranked by popularity.

• Connects users to other users
– Allows users to interact with other users’ methods
– “Eavesdropping on someone else’s thought pattern”

Social Bookmarking Characteristics
Common elemental characteristics of social bookmarking (folksonomic) systems.
• Tag – a single word label that is applied to an object (URL)
• Tagging – the process of organizing an object by assigning a label or “tag”
• Tag bundle – a group of tags linked by another tag or “super tag”
• Tag cloud – a visual weighted list of a set or subset of tags

Example of a Tag Cloud
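A tag cloud's weighting can be sketched as a mapping from tag frequency to font size. This is one simple approach (linear scaling), not the method of any specific site; the `tag_cloud` function, the size range, and the sample counts are assumptions for illustration.

```python
def tag_cloud(counts, min_size=10, max_size=30):
    """Map each tag to a font size scaled linearly by its frequency."""
    lowest, highest = min(counts.values()), max(counts.values())
    spread = highest - lowest
    sizes = {}
    for tag in sorted(counts):  # list tags alphabetically
        if spread == 0:
            # All tags equally frequent: render them at the same size.
            sizes[tag] = min_size
        else:
            share = (counts[tag] - lowest) / spread
            sizes[tag] = round(min_size + share * (max_size - min_size))
    return sizes


# Hypothetical tag counts for one collection.
counts = {"tagging": 40, "folksonomy": 25, "library": 10}
cloud = tag_cloud(counts)
for tag, size in cloud.items():
    print(f"{tag}: {size}pt")
```

Logarithmic scaling is a common alternative when a few tags dominate the counts.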

Tagging Issues
• Tagging is Good
– dynamic distributed classification
– related tag networks
– tag cloud shows extent of collection
– user terminology
– diversity
• Tagging is Bad
– mob indexing
– no controlled vocabulary
– poor browsing experience
– no thesaurus
– consensus by a mob or no consensus

Tagging Issues
• spelling variations
• spelling mistakes
• potentially mistaken term usage
• acronyms, homonyms, synonyms
• sesquipedalians (terms made by sticking many smaller terms together, e.g. information_seeking_behaviour)
• non subject tags (e.g. affective tags, time and task tags)

Patterns in Tagging (3 studies of tags)
• Are categories emerging in social tagging that will complement those developed through professional methods?
• What do tag convergence and co-word usage suggest about the utility of tagging?
• What implications does the use of affective or time and task related tags have for the organisation of information?

Convergence and Divergence in Tags
• When enough people tag a site, a set of more frequently applied tags will emerge that start to look like a reasonable description of the item • tag trends do not follow standard power laws for term usage (80/20 rule)
– the drop off tends to be much slower at first before suddenly returning to the normal power law
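One way to check a tag distribution against the 80/20 rule is to measure how much of the total tagging activity the top 20% of tags account for. The sketch below uses invented counts with a slow early drop-off; they are not the data from the studies, and `top_share` is a hypothetical helper.

```python
import math


def top_share(frequencies, fraction=0.2):
    """Share of all tag applications accounted for by the top `fraction` of tags."""
    ranked = sorted(frequencies, reverse=True)
    top_n = max(1, math.ceil(len(ranked) * fraction))
    return sum(ranked[:top_n]) / sum(ranked)


# Invented counts: the first five tags are nearly level, then usage falls away.
flat_head = [50, 48, 45, 44, 40, 10, 5, 3, 3, 2]
share = top_share(flat_head)
print(f"Top 20% of tags account for {share:.1%} of tag applications")
```

With these numbers the top fifth of tags covers well under 80% of applications, which is the kind of slow initial drop-off the slide describes.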

[Figure: Tag Frequency 1. Tag frequency graph; frequency axis from 0 to 300. Tags shown: ontology, tagging, folksonomy, classification, tags, web2.0, shirky, article, taxonomy, metadata, folksonomies, semanticweb, tag, toread, Categorization, categories, reference, Blog, web, information, ontologies, semantic, Internet, library]

[Figure: Tag Frequency 2. Tag frequency graph; frequency axis from 0 to 45. Tags shown: tagging, social, collaboration, folksonomy, tags, article, folksonomies, socialnetworking, research, Web2.0, bookmarking, classification, kcb201, cataloging, academic, articles, indexing, socialtagging, taxonomy, Information, library, network, Social_networking, socialbookmarking]

Tagging Patterns
• Consensus forms after a certain number of users have tagged an item
– the first item was tagged by 2250 people, the second by only 49

• frequency graphs suggest a relative consensus on terms, but tag lists and co-word graphs do not
– high frequency tags are used frequently, but not necessarily with other high frequency terms
– tagging patterns may show group consensus and trends in user communities.

Tag Lists
• Shirky 2005 ( 1f4925233d):
– by nayma to folksonomy tags web2.0 ontology
– by zeft to ontology
– by chrysoberyl to 2.0 libraries thinky
– by peleke12 to ontology shirky tagging
– by alisaepstein to folksonomy folksonomies tagging web2.0 653

Co-word Graph of Tags
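The edges of a co-word graph can be derived by counting how often two tags appear together in the same user's tag list. The sketch below uses the five tag lists quoted for the Shirky 2005 item (leaving out the trailing number on the last list); the `co_word_counts` function is a hypothetical helper, not the method used in the studies.

```python
from collections import Counter
from itertools import combinations

# Tag lists per user, taken from the Shirky 2005 example.
tag_lists = {
    "nayma": ["folksonomy", "tags", "web2.0", "ontology"],
    "zeft": ["ontology"],
    "chrysoberyl": ["2.0", "libraries", "thinky"],
    "peleke12": ["ontology", "shirky", "tagging"],
    "alisaepstein": ["folksonomy", "folksonomies", "tagging", "web2.0"],
}


def co_word_counts(tag_lists):
    """Count how many users applied each unordered pair of tags together."""
    pairs = Counter()
    for tags in tag_lists.values():
        # Sort so each pair is stored in one canonical order.
        for first, second in combinations(sorted(set(tags)), 2):
            pairs[(first, second)] += 1
    return pairs


pairs = co_word_counts(tag_lists)
tag_frequency = Counter(tag for tags in tag_lists.values() for tag in tags)
print(tag_frequency.most_common(1))
print(pairs.most_common(1))
```

In this small sample "ontology" is the most frequent tag, yet the only pair used by more than one person is "folksonomy" with "web2.0", which mirrors the point that frequent tags are not necessarily used together.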

Comparison of Tags with Controlled Vocabulary
• 1. A study of tag use and tag types on articles, compared with subject headings, on CiteULike (which indexes journal articles, which have more metadata)
– the most common relationship between the terms was "related but not in the thesaurus"
– the next most common were RT (related term) and then equivalence

• 2. A study comparing tags and LCSH on LibraryThing
– without further context it is extremely difficult to tell whether an apparently anomalous tag in a tag cloud is a mistake

Non Subject Tags
• some time and task or affective tags are very popular
– cool, fun, funny, toread appeared in main tag cloud

• ToRead and fun are popular tags on all three sites
• affective terms appear more frequently on CiteULike and Connotea than expected
– biology articles more often listed as toread; math and physics as fun

Utility of Tagging
• tagging can be useful for providing a good picture of how users see the material
– Steve Museum project: found that users used very different terminology and tagged specific items seen in the picture which had been absent from professional cataloguing

Tagging Discussion
• tagging has all the problems of free text search/automatic indexing
• but, tag groups tend to converge on a useful set of terms after a threshold number of users
• users use some terminology which is rare or completely absent from subject heading lists (e.g. time and task tags)
• user terms are often not part of a formal thesaurus

Social Bookmarking Characteristics
• Common elemental characteristics of social bookmarking (folksonomic) systems.
– Tag – a single word label that is applied to an object (URL)
– Tagging – the process of organizing an object by assigning a label or “tag”
– Tag bundle – a group of tags linked by another tag or “super tag”. Bundles are a way to group together common tags. For instance, if you have the tags "design", "painting", and "moma", you may want to group these together into a bundle called "art".
– Tag cloud – a visual weighted list of a set or subset of tags
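A tag bundle can be modelled as a named set of member tags. The sketch below uses the "art" bundle from the example above; the bookmarks, the example.org URLs, and the `urls_in_bundle` helper are hypothetical.

```python
# The "art" bundle groups three ordinary tags under one super tag.
bundles = {"art": {"design", "painting", "moma"}}

# Hypothetical bookmarks: (url, tags).
bookmarks = [
    ("http://example.org/gallery", {"painting"}),
    ("http://example.org/museum", {"moma", "nyc"}),
    ("http://example.org/recipes", {"cooking"}),
]


def urls_in_bundle(bookmarks, bundles, bundle_name):
    """Return URLs carrying at least one tag that belongs to the bundle."""
    members = bundles[bundle_name]
    return [url for url, tags in bookmarks if tags & members]


art_urls = urls_in_bundle(bookmarks, bundles, "art")
print(art_urls)
```

Browsing the bundle then surfaces every bookmark tagged with any of its member tags, without the user having to retag anything.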

Folksonomies and user vocabulary
• In information retrieval systems (IRS), the vocabulary used to organize content may be based upon the choices of the authors of the materials, the designer of the IRS, or the designer of the controlled vocabulary in place.
• Folksonomies reflect users’ choices in diction, terminology, and precision.
• Folksonomies can adapt very quickly to changes in user needs and vocabulary, and adding new terms to a folksonomy incurs virtually no cost for either the user or the system.

Folksonomies and online communities
• Folksonomies create a sense of community amongst their users. Most social bookmark managers will recommend new links and other members’ folders or sites that are strongly related to an individual member by analyzing his or her linking pattern.
• As soon as users assign a tag to an item, they can see the cluster of items carrying the same tag. This feedback loop leads to a form of asymmetrical communication between users through metadata.
• The users of a system negotiate the meaning of the terms in the folksonomy.

• The terms in a folksonomy may have inherent ambiguity as different users apply terms to documents in different ways.
– E.g., the tag “ANT” has been used to refer to “Actor Network Theory”, a sociological term, as well as Apache Ant, a Java programming tool

• The polysemous tag “port” could refer to a sweet fortified wine, a porthole, a place for loading and unloading ships, the left-hand side of a ship or aircraft, or a channel endpoint in a communications system.

• Folksonomies provide for no synonym control; the terms “mac”, “macintosh”, and “apple”, for example, are used to describe Apple Macintosh computers.
• Both singular and plural forms of terms appear (e.g., flower and flowers), thus creating a number of redundant headings.

• Related terms that describe an item vary along a continuum of specificity ranging from very general to very specific; so, for example, documents tagged “perl” and “javascript” may be too specific for some users, while a document tagged “programming” may be too general for others.

• Folksonomies provide no guidelines for the use of compound headings, punctuation, word order, and so forth; for example, should one use the tag “vegan cooking” or “cooking, vegan”?

Incorrect Usage
• Tags could be applied incorrectly; the term “archeology”, for example, is used to tag items pertaining to both dinosaurs and primitive microbes

• Users strive to achieve a degree of consensus over the general meaning of tags.
• As a URL receives more and more bookmarks, the set of tags used in those bookmarks becomes stable across different users.
• This stabilization is facilitated through imitation and shared knowledge. The bookmarking service shows users the tags most commonly used by others who have already bookmarked the same URL; users can easily select those tags for use in their own bookmarks, thus imitating the choices of previous users.

Folksonomies and controlled vocabularies
• Folksonomies are not necessarily antithetical to controlled vocabularies.
• Once you have a preliminary system in place, you can use the most common tags to develop a controlled vocabulary that truly speaks the users’ language
– E.g., you can link related tags such as “nyc,” “newyork,” and “newyorkcity”; it may be possible to align these terms with established controlled vocabularies, such as the Getty Thesaurus of Geographic Names, in order to provide a greater range of related terms.
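Linking related tags can be sketched as a lookup table from variant tags to one preferred term. This is a minimal illustration: the `PREFERRED` table and `normalize` function are hypothetical, and the preferred label "New York City" is an assumption rather than an entry copied from the Getty Thesaurus of Geographic Names.

```python
from collections import Counter

# Hypothetical mapping from user tags to a single preferred term.
PREFERRED = {
    "nyc": "New York City",
    "newyork": "New York City",
    "newyorkcity": "New York City",
}


def normalize(tag, mapping=PREFERRED):
    """Return the preferred term for a tag, or the tag itself if unmapped."""
    return mapping.get(tag.lower(), tag)


raw_tags = ["nyc", "NewYork", "newyorkcity", "travel"]
merged = Counter(normalize(tag) for tag in raw_tags)
print(merged)
```

Unmapped tags pass through unchanged, so the user vocabulary is kept while the most common variants are collapsed.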

Other uses for folksonomies
• Could be used to organize resources for an intranet, course collection, etc.
• Could be used to enhance the customizable features of library catalogues. Clients could organize and tag items of interest from the catalogue, as well as external sources (if allowable).
– Could share these tags and sources with other clients with similar interests. This could lead to a user-directed reader advisory service.
– Could use folksonomies to supplement existing LCSH vocabulary in the catalogue, e.g., LCSH does not contain terms for the popular film genres “cult”, “drama,” or “action.”

Advantages of Social Bookmarking
• Low “cognitive” cost – large grassroots community users vs. expert metadata specialists or catalogers
• Self moderating and democratic
• Flexible, inclusive, adaptive and current
• Immediate Feedback
• Usability – easy to use
• Great at serendipitous discovery

Disadvantages of Social Bookmarking
• Low Precision/Recall due to synonymous and polysemous tags
• Basic Level problem – Granularity of tags (too specific, too general)
• Lack of hierarchy – no parent-child, broad-narrow relationship
• Highly susceptible to malicious users.
– Meta Noise - incorrectly "malicious" tags
– Gaming - cheating the system
– Spamming - a universal plague of all social systems

• Fails as a search system, bad at finding specific items
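The effect of synonymous and polysemous tags on precision and recall can be shown with a small worked example. The documents below are hypothetical; they reuse the "mac", "macintosh", and "apple" tags mentioned earlier, and the fruit sense of "apple" is an assumption added to illustrate polysemy.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for a set of retrieved documents."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical tagged documents.
tagged = {
    "doc1": {"mac"},        # about Apple Macintosh computers
    "doc2": {"macintosh"},  # same subject, different tag
    "doc3": {"apple"},      # same subject, third tag
    "doc4": {"apple"},      # about the fruit
}
relevant = {"doc1", "doc2", "doc3"}  # everything about Macintosh computers

# A user searching for Macintosh material by the single tag "apple":
retrieved = {doc for doc, tags in tagged.items() if "apple" in tags}
precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

The fruit document lowers precision (polysemy), while the documents tagged "mac" and "macintosh" are missed and lower recall (synonymy).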

• Folksonomies are undoubtedly fraught with the problems typical of uncontrolled vocabularies, but their growing popularity suggests that people are interested and motivated in assigning their own metatags to items of interest.
• One cannot help but wonder whether such enthusiasm for metadata would be the same if people were asked to use only prescribed and standardized vocabularies.

Other Areas to Explore
• The cognitive and behavioural aspects of folksonomy use:
– What is the tagging behaviour of people who use folksonomies?
– Why do people choose the tags they use; what motivates them to modify these tags; how often do they modify them?
– How are folksonomies used communally?
– How do folksonomies foster consensus in the use of tags?
– How does the community affect which tags are used and how?

Folksonomies in Libraries
• Libraries can’t continue to rely exclusively on in-house cataloging
• We can achieve our overall goals while allowing new mechanisms along the way
• Users are one additional source of metadata we must tap
• We must match appropriate metadata needs to the tasks users are best equipped to perform
• Good interfaces for metadata collection will be key
• We must use the best ideas for user participation, and adapt them for the library environment
