The Role Of Metadata

Brian Kelly
UKOLN
University of Bath
Bath, BA2 7AY
Email: B.Kelly@ukoln.ac.uk
URL: http://www.ukoln.ac.uk/web-focus/presentations

UKOLN is supported by:

A centre of expertise in digital information management   www.ukoln.ac.uk
Introduction     Contents
                    • Introduction
                    • Background To Metadata
                    • Metadata Standards
                    • Metadata Management
                    • Metadata And Quality
                    • Conclusions
               The Brief
                "I know from conversations … I have had with customers,
                that metadata poses some really difficult questions …"
               The talk addresses the questions:
                What is metadata and why is it important? What's this Dublin
                Core I've heard about (and why Dublin)? What benefits will I
                get if I use metadata? How should I do it? What will it cost
                me?
Introduction   About UKOLN / Web Focus
               UKOLN:
                    • A national centre of expertise in digital information
                      management (including metadata)
                    • Based at University of Bath
                    • Funded by JISC and Resource to support the
                      Higher & Further Education / cultural heritage sectors
               UK Web Focus:
                 • Provides advice and support on Web issues,
                   especially standards and best practices
                 • Provided by Brian Kelly
                 • Funded by JISC from Nov 1996 - August 2003.
                   Now jointly funded by JISC & Resource
               QA Focus:
                 • Developing QA methodology to support JISC
                   digital library programmes
Introduction   About You
               How many are:
                    •   Librarians
                    •   Software / systems developers (techies)
                    •   Commercial vendors
                    •   Others

               What is the extent of your knowledge of metadata?
                Novice                         Average                      Expert

                 ???                           MARC                         RDF
                                               Dublin Core                  OAI
                                               …                            CLD
                                                                            …
Background   What is Metadata?
             "This metadata you've been talking about …. isn't it
             just catalogue records?"
                                 Question at metadata seminar, 1998
             Metadata can be regarded as:
                 • Catalogue records for the Web
                 • Data about data
                 • Structured information suitable for automated
                   processing
                                               Metadata Demystified
             In current practice, the term has come to mean structured
             information that feeds into automated processes, and this
             is currently the most useful way to think about metadata
             http://www.niso.org/standards/resources/Metadata_Demystified.pdf
Background   The Problem
             Back in the mid-1990s:
               • Size of Web growing exponentially
               • Web being used for both scholarly and
                 non-scholarly (!) purposes
               • Need for better searching mechanisms
               • Search engines seemed promising, but
                 concerns over abuse (e.g. porn index
                 spammers) and difficulties in finding
                 quality information
               • Various sectors came together to develop
                 a core set of metadata attributes for
                 resource discovery
Dublin Core   Dublin Core
              In the mid-1990s:
                 • Meeting held in Dublin, Ohio in 1995
                 • Involvement from several sectors
                   (libraries, museums, science, IT, …)
                 • Agreement reached on a core set of
                   metadata attributes for resource discovery
                 • Given the name Dublin Core (DC)
                 • DCMI organisation later formed
                 • DC working parties established to
                   coordinate development of DC
                 • Regular annual conferences held
               See <http://dublincore.org/>
Dublin Core   Why So Complex?
              Why is there a need for working groups,
              annual events, etc. for developing a standard
              for catalogue records?
                   • It's not just documents: an Author record is
                     inappropriate for a painting, a piece of music, etc.
                   • It's not just for humans: the DC records will be
                     processed by software, for which unambiguity is
                     essential
                   • It needs to be integrated: with a rapidly-
                     developing Web architecture
                   • It needs to be future-proofed: so we don't have
                     to do it all again when a new technology emerges
Dublin Core   Using Dublin Core
              Note that DCMI defined a core set of elements:
                Title         A name given to the resource.
                Creator       An entity primarily responsible for
                              making the content of the
                              resource.
                Publisher     An entity responsible for making
                              the resource available.
                Date          A date of an event in the lifecycle
                              of the resource.
                …             …
              How these elements should be represented was not
              defined initially
Dublin Core   Representing Dublin Core
              Initially many people thought that DC would be
              embedded in HTML pages:
                <META NAME="DC.Creator" CONTENT="Brian Kelly">
              but how are multiple authors represented?
                <META NAME="DC.Creator" CONTENT="Brian Kelly">
                <META NAME="DC.Creator" CONTENT="John Smith">
              or
                <META NAME="DC.Creator" CONTENT="Brian Kelly,
                John Smith">
              It is not possible to describe the potential complexities
              of DC in the HTML language
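
The two encodings above can be compared with a minimal Python sketch. It assumes the repeated-element form is chosen; the helper name dc_meta_tags is hypothetical and is not part of any Dublin Core tool. The second half shows why a single comma-separated value is hard for software to split once names are written as "Surname, Initial".

```python
from html import escape


def dc_meta_tags(element, values):
    """Return one <meta> line per value for a Dublin Core element.

    Hypothetical helper for illustration only.
    """
    return [
        '<meta name="DC.{}" content="{}">'.format(element, escape(value, quote=True))
        for value in values
    ]


# Repeating the element keeps each creator as a separate value.
for line in dc_meta_tags("Creator", ["Brian Kelly", "John Smith"]):
    print(line)

# A single combined string cannot be split back reliably: the comma is
# both the separator between people and part of each name.
combined = "Kelly, B., Smith, J."
parts = combined.split(", ")
print(len(parts), "fragments for 2 authors:", parts)
```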



Dublin Core   Dublin Core Is Too Simple!
              Dublin Core was designed as a core set of metadata
              elements for resource discovery. However:
                 • The benefits of the standard became apparent
                   and DC became used in many areas
                 • There was a need to be able to represent richer
                   metadata content and relationships, e.g.
                      • Multiple authors and contact details
                      • Alternative titles
                      • Use of controlled vocabularies from particular
                        schemes
              A mechanism known as Qualified Dublin Core was
              developed to address this.


Dublin Core   Use In HTML
              Dublin Core's potential was recognised and the W3C's
              release of HTML 4.0 included a mechanism for
              defining schemes in the <meta> element:
              <meta name = "DC.Subject"
                     content = "heart attack">
              <meta name = "DC.Subject"
                     scheme = "MeSH"
                     content = "Myocardial Infarction; Pericardial Effusion">
              <meta name = "DC.Type"
                     scheme = "DCMIType"
                     content = "Dataset">
              <meta name = "DC.Type"
                     scheme = "DCMIType"
                     content = "Event">
              See <http://dublincore.org/documents/2001/04/12/usageguide/qualified-html.shtml>


W3C Developments   XML
                   XML (Extensible Markup Language):
                     • Developed by W3C
                     • A meta-language used to create other languages
                     • Addresses HTML's lack of extensibility
                     • A family of standards which form the foundations
                       for a richer and more interoperable Web:
                           • XML              • XML Namespaces
                           • XSLT             • XML Schemas
                           • …
                      • A proven success

                   Rather than slowly tweaking HTML to allow rich DC to
                   be embedded, XML allows new metadata applications
                   to be developed which can be integrated with existing
                   Web services
W3C Developments   Beyond Use In HTML
                   In parallel with the release of HTML 4.0, W3C was working on:
                       • A rich metadata framework which could be used
                         for any metadata application:
                            • Content filtering (this resource contains
                              nudity)
                            • Defining collections of related resources
                              (Web site maps)
                            • Digital signatures
                            • …
                       • Development of the Semantic Web: an
                         ambitious attempt to allow data from distributed
                         services to be integrated

                   RDF (Resource Description Framework) was
                   developed as W3C's solution to both problems
W3C Developments   RDF
                   RDF:
                     • An XML application
                        • Richer than conventional XML applications: a
                          mathematical model which describes
                          relationships is embedded in the RDF
                        • This richness comes with a price - increased
                          complexity
                    RDF applications are being developed. However at present
                    it may be advisable to leave RDF to the research
                    community or well-funded pilot studies to prove its benefits
                    before committing to use in a service environment
                    (However note that metadata in PDF documents is stored
                    as RDF)
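
To make "an XML application" concrete, the following is a minimal Python sketch that writes a Dublin Core description as RDF/XML using only the standard library. The namespace URIs are the published RDF syntax and Dublin Core element set 1.1 namespaces; the function name describe and the choice of fields are assumptions made for illustration, not a recommended production approach.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Ask ElementTree to use the conventional prefixes when serialising.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def describe(resource_url, title, creators):
    """Return an RDF/XML string describing one resource (illustrative helper)."""
    root = ET.Element("{%s}RDF" % RDF_NS)
    desc = ET.SubElement(
        root, "{%s}Description" % RDF_NS, {"{%s}about" % RDF_NS: resource_url}
    )
    ET.SubElement(desc, "{%s}title" % DC_NS).text = title
    # Each creator is its own element, so multiple authors stay unambiguous.
    for name in creators:
        ET.SubElement(desc, "{%s}creator" % DC_NS).text = name
    return ET.tostring(root, encoding="unicode")


print(
    describe(
        "http://www.ukoln.ac.uk/web-focus/presentations",
        "The Role Of Metadata",
        ["Brian Kelly"],
    )
)
```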
Using Metadata   Beyond Resource Discovery
                 Metadata has a role to play beyond item-level resource
                 discovery
                 Other metadata applications include:
                    • Metadata for digitised objects: about the object
                      and about the digitisation process
                    • Management / administrative metadata: review
                      this resource by xx; delete this resource on …;
                      this resource is managed by the XYZ group; …
                    • Metadata about collections (physical and online)
                    • …



Using Metadata   Metadata Modelling (1)
                 You want to use Dublin Core metadata. How do you
                 choose how to model your metadata?
                    • Do you use simple Dublin Core (the basic 15
                      elements)?
                    • Do you use qualified Dublin Core to enable richer
                      metadata to be described?
                    • If the latter, how do you decide which qualified
                      DC metadata to use?
                      These are key issues to address.
                      In some cases answers may be provided for you.
                      In other cases, you must answer these
                      questions for yourself.

Using Metadata   Metadata Modelling (2)
                 Why do you wish to use metadata?
                   • Because it is fashionable?
                   • Because you're a librarian and librarians 'do'
                     metadata?
                   • Because you want your Web site to be no. 1 in
                     Google?
                   • Because you are developing an application which
                     requires use of metadata?

                 Please remember:
                    • Developing applications which make use of
                       metadata can be expensive.
                    • Creating and managing metadata can be expensive
                    • Search engines such as Google typically make little
                       or no use of metadata
Using Metadata   Metadata Modelling (3)
                 Exploit Interactive case study:
                  • EU-funded ejournal
                  • Requirement to provide
                    local searching better than
                    simple free text searching:
                        • Search by title, author and
                          keywords
                        • Search by funding stream
                        • Search by issue and article
                          type
                   • The end-user interface is
                     illustrated

                 See <http://www.ukoln.ac.uk/qa-focus/documents/case-studies/case-study-01/>
Using Metadata   Metadata Modelling (4)
                 How did we manage and model the metadata?
                 Article metadata:
                   doc_title = "The XHTML Interview"
                   author = "Kelly, B."
                   title = "WebWatching National Node Sites"
                   description = "In this issue's Web Technologies column we ask
                     Brian Kelly to tell us more about XHTML."
                   article_type = "regular"
                 Issue metadata:
                   issue_num = "6"
                   pub_date = "25 Oct 2002"
                 Site metadata:
                   name = "Exploit Interactive"
                   publisher = "UKOLN"
                 Processed by server-side script:

<meta name="DC.Title" content="The XHTML Interview">
<meta name="DC.Creator" content="Kelly, B.">
<meta name="DC.Description" content="In this issue's Web Technologies ….">
<meta name="DC.Relation.IsPartOf" content="http://www.exploit-lib.org/issue6/">
<meta name="DC.Type" content="text.article.regular" scheme="Exploit-categories">
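
The processing step can be sketched in Python as follows. This is not the script used for Exploit Interactive, which the slide does not show; it is a minimal illustration, under the assumption that article, issue and site values are held as simple key/value pairs and mapped to Dublin Core names. The function name render_dc and the DC.Publisher line are additions for illustration.

```python
from html import escape

# Values taken from the slide; the dictionary layout is an assumption.
site = {"name": "Exploit Interactive", "publisher": "UKOLN"}
issue = {"issue_num": "6", "pub_date": "25 Oct 2002"}
article = {
    "doc_title": "The XHTML Interview",
    "author": "Kelly, B.",
    "article_type": "regular",
}


def render_dc(site, issue, article):
    """Combine the three metadata layers into HTML <meta> lines."""
    tags = [
        ("DC.Title", article["doc_title"], None),
        ("DC.Creator", article["author"], None),
        (
            "DC.Relation.IsPartOf",
            "http://www.exploit-lib.org/issue%s/" % issue["issue_num"],
            None,
        ),
        ("DC.Type", "text.article." + article["article_type"], "Exploit-categories"),
        # Not shown in the slide's output; added to illustrate site-level data.
        ("DC.Publisher", site["publisher"], None),
    ]
    lines = []
    for name, content, scheme in tags:
        line = '<meta name="%s" content="%s"' % (name, escape(content, quote=True))
        if scheme:
            line += ' scheme="%s"' % scheme
        lines.append(line + ">")
    return lines


print("\n".join(render_dc(site, issue, article)))
```

Keeping the source values in one place and generating the tags means that a change of publisher or URL scheme is made once rather than in every page.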
Metadata Management   Storing DC Metadata
                      It is up to you how you store your metadata. Your
                      choice will be affected by the use which will be made of
                      your metadata and how it will be created and managed.
                      You may wish to store your metadata in a database
                      and make it available according to its use.
                          Author       Book                Pub. Date
                          G.Orwell     1984                1948
                          I. Rankin    Question Of Blood   2003
                      You may wish to:
                          • Embed HTML metadata in HTML pages
                          • Link to HTML metadata from HTML
                          • Embed RDF
                          • Store metadata in application
                            (home-grown scripts, CMS,
                            metadata repository, image
                            management system, …)
                      (Diagram labels: HTML, RDF, Metadata management tool)
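
As a rough illustration of storing metadata in a database and producing an output format on demand, here is a minimal Python sketch using the standard sqlite3 module. The table name, column names and the mapping to DC.Creator, DC.Title and DC.Date are assumptions; the rows are the sample rows from the slide.

```python
import sqlite3
from html import escape

# An in-memory database stands in for whatever store is actually used.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (author TEXT, title TEXT, pub_date TEXT)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?)",
    [("G. Orwell", "1984", "1948"), ("I. Rankin", "Question Of Blood", "2003")],
)


def as_html_meta(row):
    """Render one database row as Dublin Core <meta> lines."""
    author, title, pub_date = row
    return [
        '<meta name="DC.Creator" content="%s">' % escape(author, quote=True),
        '<meta name="DC.Title" content="%s">' % escape(title, quote=True),
        '<meta name="DC.Date" content="%s">' % escape(pub_date, quote=True),
    ]


rows = list(
    conn.execute("SELECT author, title, pub_date FROM books ORDER BY pub_date")
)
for row in rows:
    print("\n".join(as_html_meta(row)))
```

The same rows could be passed to a different rendering function to produce RDF, which is the point of separating storage from presentation.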
Metadata Management   A Simple DC Management Tool
                      DC-dot:
                       • Simple Web-based
                         DC creation and
                         management tool
                       • Output in range of
                         formats (HTML,
                         XHTML, RDF, …)
                       • Provides validation
                       • Useful for small-scale
                         metadata creation
                       See <http://www.ukoln.ac.uk/metadata/dcdot/>
                      But:
                       • Not ideal for large-scale usage
                       • Doesn't provide rich
                         management capabilities
Metadata Management   Management Tools
                      Many types of metadata tools:
                        • Type the metadata by hand
                        • Use File -> Properties menu in MS Office
                          applications and export data
                        • Home-grown database systems
                        • Home-grown scripting solutions
                        • Use of commercial systems:
                              • Library management systems
                              • Image management systems
                              • …

                       There is no single ideal solution.
                       The solution you choose should reflect your needs,
                       expertise, organisational culture, …
Quality Assurance   Quality Assurance
                    The Need for QA:
                       • Metadata is the 'glue' for integration of services
                       • If the metadata quality is poor, services will not be
                         able to interoperate
                       • There is therefore a need for quality assurance
                         procedures to ensure fitness for purpose
                    What Can Go Wrong?
                       • Things that can go wrong include:
                               •   Metadata is out-of-date or incorrect
                               •   Metadata is used inconsistently within service
                               •   Metadata is used inconsistently across services
                               •   Metadata is not modelled correctly
                               •   Metadata not compliant with storage standard
                               •   …
Quality Assurance   Think About The Implementation
                    It is important that when you deploy metadata systems
                    you can manage and maintain the metadata. For
                    example:
                         • Details of the person maintaining the data change
                           (name change due to marriage, person leaves,
                           …)
                         • Organisational details change (mergers,
                           takeovers, …)
                         • Technology changes

                    Prepare for change! People change, organisations
                    change, responsibilities change, technologies change,
                    …
                    Ensure that you can manage the metadata which
                    reflects such changes
Metadata Management   Need For Cataloguing Rules
                      Your Cataloguing Rules
                           • You will need cataloguing rules to support your
                             metadata creation
                           • You will need to provide necessary training and
                             support (especially if you are dependent on
                             cataloguing by non-professionals)
                      Interoperability
                           • How will you interoperate with services which
                             deploy different cataloguing rules:
                                            04/07/03 – what date is this?
                                            LSC – what does this stand for?
                           • Humans use context; software products don't
                           • There is a need to define the standards you're
                             applying (in a machine understandable way)
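
The date example above can be demonstrated with a short Python sketch. It parses the same string under two cataloguing conventions and shows that software obtains two different dates unless the convention is declared. Writing the result in year-month-day form is one possible way to remove the ambiguity and is an assumption here, not a rule stated in the slides.

```python
from datetime import datetime

raw = "04/07/03"

# The same characters read under two different cataloguing conventions.
as_day_first = datetime.strptime(raw, "%d/%m/%y").date()    # 4 July 2003
as_month_first = datetime.strptime(raw, "%m/%d/%y").date()  # 7 April 2003

print(as_day_first.isoformat())    # 2003-07-04
print(as_month_first.isoformat())  # 2003-04-07

# Software has no context to choose between them, so the convention
# must travel with the data or the value must be stored unambiguously.
print(as_day_first == as_month_first)  # False
```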
 Quality Assurance   Need For QA Procedures
                     So we have:
                        • Tools for managing metadata
                        • Cataloguing rules
                     But:
                        • People make mistakes
                        • Software may have bugs
                        • Our rules may be ambiguous
                        • The standards may be ambiguous
                        • The metadata may be correct but confusing in
                          other contexts
                        • …
Although humans can adapt to errors and ambiguities, software
typically can't. We therefore need quality assurance procedures to
ensure that metadata applications will be interoperable.
Quality Assurance   Approaches To QA
                    We may wish to consider:
                      • Systematic checking at data creation
                      • Systematic checking of output
                      • Semi-automated checking (e.g. duplication,
                        common misspellings, out-of-range checks, …)
                      • Automated checking
                      • …
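
The semi-automated checks listed above (duplication, common misspellings, out-of-range values) might look like the following Python sketch. The record layout, the misspelling table and the year limits are all hypothetical and would need to be replaced by rules that match your own cataloguing guidelines.

```python
from collections import Counter

# Hypothetical sample records; field names are assumptions.
records = [
    {"identifier": "a1", "publisher": "UKOLN", "year": 2002},
    {"identifier": "a1", "publisher": "UKOLN", "year": 2002},
    {"identifier": "a2", "publisher": "UKONL", "year": 2002},
    {"identifier": "a3", "publisher": "UKOLN", "year": 1002},
]

# Hypothetical table of known misspellings and their corrections.
KNOWN_MISSPELLINGS = {"UKONL": "UKOLN"}


def check_records(records, min_year=1990, max_year=2003):
    """Return (identifier, problem) pairs for a human to review."""
    problems = []
    counts = Counter(r["identifier"] for r in records)
    for identifier, n in counts.items():
        if n > 1:
            problems.append((identifier, "duplicate identifier"))
    for r in records:
        if r["publisher"] in KNOWN_MISSPELLINGS:
            problems.append((r["identifier"], "misspelt publisher"))
        if not (min_year <= r["year"] <= max_year):
            problems.append((r["identifier"], "year out of range"))
    return problems


for identifier, problem in check_records(records):
    print(identifier, problem)
```

The checks only flag candidates; deciding whether a flagged record is really wrong is left to a person, which is what makes the approach semi-automated.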

                    Worst Case Scenario:
                    Your service is fine, and quality metadata is provided. Your data is
                    integrated with other services to provide an international portal
                    to quality resources. However, the other service providers have
                    poor quality metadata. The poor quality of the final service brings
                    your contribution into disrepute.
Pulling It Together




Conclusions
To conclude:
     • Metadata can provide richer searching and other services
         within a service and the glue for integration across several
         services
     • There are several key standards: Dublin Core, HTML, XML, …
     • You will need to select the standards appropriate to your
         service requirements
     • You will need to choose the metadata according to your
         service requirements
     • You will need to choose the architectural framework and
         applications for managing your metadata according to your
         service requirements
     • You will need to ensure that you have appropriate quality
         assurance mechanisms in place – otherwise the above work
         will have been wasted!
     • It can be worth it!

						