The Value of Metadata in the Google Era

When and Why to Apply Metadata to Digital Production Projects

• What is metadata?
– What does “data about data” mean?

• Who uses metadata?
– Information Architects
– Libraries
– Computer Scientists
– Web Publishers?

Information Architecture Definitions
IA’s primary contribution is a preoccupation with putting content in buckets
• Categorization
– Structured Hierarchical
• Taxonomies
• Ontologies

– Structured Non-hierarchical
• Faceted Classification
• Topic Maps

– Non-structured Non-hierarchical
• Controlled Vocabularies
• Folksonomies
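
To make the difference between the first two kinds of categorization concrete, here is a minimal sketch. It is an illustration, not a recommended data model: the item names, category names, facet names, and both helper functions are hypothetical. A taxonomy gives each item one place in a tree, while a faceted classification describes each item along several independent dimensions.

```python
# All names and values below are hypothetical, for illustration only.

# Structured hierarchical: a taxonomy puts each item at one path in a tree.
taxonomy = {
    "Media": {
        "Images": {"Photographs": ["campus_photo"]},
        "Audio": {"Lectures": ["physics_lecture"]},
    }
}


def taxonomy_path(tree, item, path=()):
    """Return the tuple of category names leading to item, or None."""
    for name, child in tree.items():
        if isinstance(child, dict):
            found = taxonomy_path(child, item, path + (name,))
            if found is not None:
                return found
        elif item in child:
            return path + (name,)
    return None


# Structured non-hierarchical: facets describe each item along
# independent dimensions, with no parent/child relationship.
facets = {
    "campus_photo": {"format": "image", "subject": "architecture", "decade": "1990s"},
    "physics_lecture": {"format": "audio", "subject": "physics", "decade": "2000s"},
    "lab_photo": {"format": "image", "subject": "physics", "decade": "2000s"},
}


def select(records, **wanted):
    """Return sorted item names whose facets match every requested value."""
    return sorted(
        name
        for name, values in records.items()
        if all(values.get(facet) == value for facet, value in wanted.items())
    )
```

With the taxonomy, the only way to reach an item is down its single path; with facets, any combination of dimensions can be used as an entry point, for example select(facets, subject="physics") or select(facets, format="image", decade="2000s").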

Library Definitions
• Descriptive Metadata
– Bibliographic Description
– Subject Analysis

• Administrative Metadata
– Structural
– Technical
• Preservation
• Operation

– Rights

Computer Science Definitions
Computer Scientists have rightly inserted an expectation among all metadata users that metadata generation be automated wherever possible.
• Databases (Keys, Labels, Fields)
• Ontologies (especially re: inheritance)
• Syntax (data encoding and transmission)
• Semantic (Web)

Web Publishing Definitions
• How do you define metadata and how would you characterize the way you use it?
• Metadata is moving beyond description and categorization to become a fundamental part of the operation and of the object itself. (see MIT OpenCourseWare)

Value of Metadata for Digital Production Projects
• Metadata as part of Digital Production Projects is subject to similar evaluation
• Digital Production Projects are:
– Practical
– Return On Investment focused
– Push technology

• Metadata must be worth it
– What does metadata do that is worth it?

Information Organization
• Metadata organizes information
– Metadata makes things accessible
– Metadata makes things discoverable
– Metadata makes things endure
• In order to see a return on investment metadata needs to be employed in an organized fashion.
• Librarian perspective: A good return on investment is information that is easily retrieved by end-users
• Publisher perspective: A good return on investment is realized by information that furthers the organizational mission and grows the business

Information Retrieval
• Search dominates the web
– Dominated by Google

• Search is a pull technology
– Google is Pull

• Most Digital Production Projects are Push (so are library catalogs)
– Metadata takes part both in push and pull
– Both push and pull involve organizing information

• Question of the day: How can we reconcile push and pull web technologies?
• Answer: Proper use of Metadata

The Google Philosophy
• World wide web objects should describe themselves
• You can’t trust “hidden,” user generated information
• The more world wide web objects that link to your world wide web object the more relevant your object must be
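
The third point above can be sketched as a simple count of incoming links. This is a deliberate simplification and not Google's actual algorithm: PageRank also weighs each link by the rank of the page it comes from. The function names and the sample link list are hypothetical.

```python
from collections import Counter


def inlink_counts(links):
    """Count, for each target, how many distinct other pages link to it.

    links is an iterable of (source, target) pairs. Duplicate pairs and
    self-links are ignored so a page cannot inflate its own count.
    """
    distinct = {(src, dst) for src, dst in links if src != dst}
    return Counter(dst for _, dst in distinct)


def rank_by_inlinks(links):
    """Return linked-to pages, most linked first, ties broken by name."""
    counts = inlink_counts(links)
    return sorted(counts, key=lambda page: (-counts[page], page))


# Hypothetical link graph: page "c" is linked from both "a" and "b".
SAMPLE_LINKS = [
    ("a", "c"),
    ("b", "c"),
    ("a", "c"),  # repeated link, counted once
    ("c", "d"),
    ("d", "d"),  # self-link, ignored
    ("a", "b"),
]
```

Under this crude measure, rank_by_inlinks(SAMPLE_LINKS) places "c" first because two distinct pages link to it. Note that the measure says nothing about what "c" is about, which is the objection raised from the library side.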

The Library Philosophy
• Information objects don’t always describe themselves (images, audio, movies, technical writing)
• You can trust information provided by information experts
• PageRank does not accurately denote relevance; meaning does

Reconciliation through Metadata
• Google
– Indices
– Results blurbs
– Links (identifiers)
– HTML (title) element
– Image alt attributes
– Article Citations

• Library
– Google Scholar
– Publishing Metadata
• DSpace
• OCW

– MIT GSA integration
– The Metadata Services Unit
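
Two of the Google-side items above, the HTML title element and image alt text, are metadata embedded in the page itself, so they can be read back out by any program. The following is a minimal sketch using Python's standard html.parser module. The class name, the function name, and the sample page are hypothetical, and this is not a description of how any search engine actually processes pages.

```python
from html.parser import HTMLParser


class EmbeddedMetadataParser(HTMLParser):
    """Collect the page title and the alt text of every image."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.alt_texts = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "img":
            # attrs is a list of (name, value) pairs; alt may be absent.
            alt = dict(attrs).get("alt")
            if alt:
                self.alt_texts.append(alt)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_embedded_metadata(html):
    """Return the title and image alt texts found in an HTML string."""
    parser = EmbeddedMetadataParser()
    parser.feed(html)
    parser.close()
    return {"title": parser.title.strip(), "alt_texts": parser.alt_texts}


# Hypothetical page: the second image carries no description at all.
SAMPLE_PAGE = (
    "<html><head><title>Campus Photographs</title></head>"
    '<body><img src="dome.jpg" alt="The Great Dome at dusk">'
    '<img src="spacer.gif"></body></html>'
)
```

Running extract_embedded_metadata(SAMPLE_PAGE) recovers a description for the first image only. The second image is an information object that does not describe itself, which is where supplied metadata has to fill the gap.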

Metadata Services Unit
• What we do
– We help digital production projects organize their information, both data and metadata
– We design information models and model information flows and user interactions
– We recommend best practices
– We make and document metadata decisions
– We catalog, describe, aggregate, put things in buckets
– We participate in research and we test systems

For More Information
• Please Contact Metadata Services
Robert Wolfe 14E-210B 3-0604
