BS 8723 a new British Standard for structured vocabularies - PowerPoint by niusheng11


Crossing the boundaries:
interoperability between
vocabularies

Stella G Dextre Clarke
Senior Metadata Consultant,
Bridgeman Art Library;
Independent Consultant
Summary
   Interoperability:
       At the metadata schema level
       At the vocabulary level
   Practicalities of vocabulary mapping
   Interoperability at the data exchange
    level
   Standards to help us through the maze
In a networked world,
interoperability is all the rage
 CIDOC-CRM
 Web 2.0

 Mash-ups

 Semantic Web (well, not quite with us
  yet, but said to be coming shortly)
… and it’s not just about Museum A
  sharing with Gallery B
How to achieve interoperability?
   Step 1: apply a metadata schema
    consistently to all your records and
    export via a standard metadata format
   Step 2: implement a metadata crosswalk,
    e.g. the Getty crosswalk at
    http://www.getty.edu/research/conducting_research/standards/intrometadata/metadata_element_sets.html

   So far so good – it’s not so difficult
But interoperability needs to
apply at two levels
   Between metadata schemas, e.g:
Artist   → Creator → Maker
Location → Place → Coverage.spatial
Keywords → Subject
   Between vocabulary terms, e.g:
rowing boats → rowboats → pulling boats
gramophone records → phonograph records
garments → clothes → clothing
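A minimal sketch of why both levels matter, using the field names from this slide. The sample record values and the `convert_record` helper are hypothetical, added only for illustration.

```python
# Schema-level crosswalk, using the element names shown on the slide.
FIELD_CROSSWALK = {
    "Artist": "Creator",
    "Location": "Place",
    "Keywords": "Subject",
}


def convert_record(record, crosswalk):
    """Rename the fields of a record; fields without a mapping are kept as-is."""
    return {crosswalk.get(field, field): value for field, value in record.items()}


# Hypothetical record, invented for this example.
record = {"Artist": "Example Artist", "Keywords": "rowing boats", "Title": "Untitled"}
converted = convert_record(record, FIELD_CROSSWALK)
print(converted)
```

The field "Keywords" becomes "Subject", but the value "rowing boats" is carried over unchanged. A target system that indexes with "rowboats" would still miss it, which is the vocabulary-level problem.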
How to achieve interoperability
at the vocabulary level?
   Step 1: apply a controlled vocabulary
    consistently to all your records
   Step 2: implement a vocabulary crosswalk
    (a.k.a. set of mappings)
   But ready-made crosswalks are not so
    easy to find; you may have to build
    your own, and it can be a long job…
Some practicalities of building
and using crosswalks
Sample entries from a crosswalk
Vocabulary A    Vocabulary B
schools         schools
teachers        instructors
pupils          students
textbooks       text-books
study           studies
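One way such a crosswalk might be used, sketched with the sample entries above. The function name and the extra input term "homework" are assumptions for illustration; a real crosswalk would also need to handle one-to-many and partial matches.

```python
# The sample crosswalk entries from the slide, Vocabulary A -> Vocabulary B.
CROSSWALK_A_TO_B = {
    "schools": "schools",
    "teachers": "instructors",
    "pupils": "students",
    "textbooks": "text-books",
    "study": "studies",
}


def translate_terms(terms, crosswalk):
    """Return (translated, unmapped) lists for a sequence of index terms."""
    translated, unmapped = [], []
    for term in terms:
        key = term.strip().lower()  # tolerate case and stray spaces
        if key in crosswalk:
            translated.append(crosswalk[key])
        else:
            unmapped.append(term)  # left for a person to resolve
    return translated, unmapped


translated, unmapped = translate_terms(["Teachers", "pupils", "homework"], CROSSWALK_A_TO_B)
print(translated, unmapped)
```

Terms with no entry are reported rather than silently dropped, so that gaps in the crosswalk stay visible.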
 Building the mappings – an
 easy example


Vocabulary A    Vocabulary B
Churches        Churches
  Look a little closer. Is it so
  easy?

Vocabulary A               Vocabulary B
Churches                   Churches
 NT Byzantine churches      NT Anglican church
    Gothic churches            Protestant church
    Norman churches            Roman Catholic church
Another example: compare 5
different vocabularies
Look for the concept “schools” in the
  following:
 IPSV    (UK public sector)
 AAT     (art/architecture)
 GEMET (environmental)

 ERIC    (education)
 MeSH    (medical)
URLs for those vocabularies
   IPSV http://www.esd.org.uk/standards/ipsv/
   AAT
    http://www.getty.edu/research/conducting_research/vocabularies/aat/
   GEMET http://www.eionet.europa.eu/gemet
   ERIC http://www.eric.ed.gov/
   MeSH http://www.nlm.nih.gov/mesh/
Typical differences between
vocabularies
   Different term for the same concept (and
    the same term can signify a different concept)
   Hierarchical structure around the concept
   Scope note, definition, synonyms and other
    attributes of a term/concept
   Concepts designated by terms or by codes or
    notation
   Language of access (e.g. French, German)
   Layout and format
More practicalities: two-way
versus one-way mappings

Vocabulary 1   Vocabulary 2   Vocabulary 3
Parrots        Birds          Poultry
Canaries                      Chickens
Budgies                       Ducks
                              Geese
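A small sketch of why a mapping that works in one direction may not work in reverse. It assumes the diagram is read as the specific terms of Vocabulary 1 all mapping to the single broader term "Birds" in Vocabulary 2; the helper names are hypothetical.

```python
from collections import defaultdict

# Assumed reading of the diagram: three specific terms map to one broader term.
V1_TO_V2 = {"Parrots": "Birds", "Canaries": "Birds", "Budgies": "Birds"}


def invert(mapping):
    """Group source terms by target term."""
    inverse = defaultdict(list)
    for source, target in mapping.items():
        inverse[target].append(source)
    return dict(inverse)


def is_two_way(mapping):
    """True only if every target term comes from exactly one source term."""
    return all(len(sources) == 1 for sources in invert(mapping).values())


print(invert(V1_TO_V2))
print(is_two_way(V1_TO_V2))
```

Going from "Parrots" to "Birds" is safe, but going back from "Birds" yields three candidates, so the reverse direction needs its own decisions.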
More practicalities – planning
the architecture

 A       B
                     F
 C       D           H
                E         G
Or some people do chain
mapping…

    A           B
                        F
    C           D       H

P       Q   R   S
                    E       G
But what happens with chain
mapping?
buses → coaches                   Timber → wood
coaches → trainers                Wood → woods
trainers → training shoes         Woods → forests

Job vacancies → jobs              Firewood → logs
Jobs → posts                      Logs → records
Posts → post                      Records → archives
post → mail

Any one of the mappings could be OK in one context, but not
  when chained.
Most howlers can be avoided, but only if you check carefully
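The drift can be reproduced with a short sketch that follows two of the chains above, one hop at a time. The `follow_chain` helper is hypothetical; the mappings are the ones on the slide.

```python
def follow_chain(term, mappings):
    """Apply each mapping in turn and record every intermediate term."""
    path = [term]
    for mapping in mappings:
        key = path[-1].lower()
        if key not in mapping:
            break  # the chain stops where a vocabulary has no entry
        path.append(mapping[key])
    return path


BUS_CHAIN = [
    {"buses": "coaches"},
    {"coaches": "trainers"},
    {"trainers": "training shoes"},
]
WOOD_CHAIN = [
    {"firewood": "logs"},
    {"logs": "records"},
    {"records": "archives"},
]

print(" → ".join(follow_chain("buses", BUS_CHAIN)))
print(" → ".join(follow_chain("Firewood", WOOD_CHAIN)))
```

Each hop looks defensible in isolation, yet the end points ("buses" and "training shoes", "Firewood" and "archives") have nothing in common. Keeping the whole path makes it easier for a reviewer to spot where the meaning shifted.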
So best avoided…

Mapping 25 vocabularies (slide from GESIS project)
[Diagram of subject areas, each labelled with a number:]
   Social Sciences (10)
   Universal (3)
   Political science (3)
   Economics (2)
   Sports science (2)
   Information science (1)
   Agricultural science (1)
   Gerontology (1)
   Medicine (1)
   Pedagogics (1)
   Psychology (1)
A bit of practical reasoning
   You can’t rely on a computer to do the
    matching
   But it’s such a huge job, you can’t do it
    without a computer!
   Ergo, use a computer to suggest
    matches, but do a human check on
    each one
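A minimal sketch of that division of labour, assuming simple string similarity from Python's standard `difflib` module as the suggestion method. Real projects may use stemming, synonym rings or other techniques; the function names and the "unchecked" status are hypothetical.

```python
import difflib


def suggest_matches(terms_a, terms_b, cutoff=0.6, limit=3):
    """For each term in A, list similar-looking terms in B (best first)."""
    return {
        term: difflib.get_close_matches(term, terms_b, n=limit, cutoff=cutoff)
        for term in terms_a
    }


def review_queue(suggestions):
    """Flatten suggestions into rows that a person must accept or reject."""
    return [
        (term, candidate, "unchecked")
        for term, candidates in suggestions.items()
        for candidate in candidates
    ]


vocab_a = ["schools", "teachers", "pupils", "textbooks", "study"]
vocab_b = ["schools", "instructors", "students", "text-books", "studies"]

suggestions = suggest_matches(vocab_a, vocab_b)
for row in review_queue(suggestions):
    print(row)
```

String similarity finds spelling variants such as "textbooks" and "text-books", but it cannot see that "teachers" corresponds to "instructors", and it may propose near misses that merely look alike. Both kinds of error are why each suggestion still needs a human check.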
One more practical need for
interoperability
   Data exchange between vocabularies and the
    computer applications that exploit them
   Either for importing a vocabulary into an
    application (e.g. into a search engine or a
    cataloguing package)
   Or to allow online interrogation of a
    vocabulary by a searching or indexing
    application
   What we need are standard formats and
    protocols
So what standards do we
have?
   ISO 2788, ISO 5964 and national
    equivalents
   ANSI/NISO Z39.19
   SKOS, Zthes, ADL, MARC, SRW/SRU
   BS 8723
   ISO NP 25964
    Vocabulary construction and
    management
   ISO 2788-1986 Guidelines for the establishment and
    development of monolingual thesauri
    = BS 5723:1987 and other national standards
   ISO 5964-1985 Guidelines for the establishment and
    development of multilingual thesauri
    = BS 6723:1985 and other national standards
   ANSI/NISO Z39.19-2005 Guidelines for the
    construction, format and management of
    monolingual controlled vocabularies
Vocabulary data formats only
   Simple Knowledge Organization System
    (SKOS) format is in RDF/XML and destined
    for the Semantic Web.
    http://www.w3.org/2004/02/skos/
   Zthes – an application profile of Z39.50, for
    exchange of thesaurus data.
    http://zthes.z3950.org/
   MARC has a format for “authority records”,
    suitable for library applications.
    http://www.loc.gov/marc/authority/
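To give a feel for what a SKOS record looks like, here is a rough sketch that writes one concept using the SKOS properties `prefLabel`, `altLabel` and `broader`. It outputs Turtle, another RDF serialization, rather than RDF/XML, purely for readability. The `example.org` URIs, the helper function and the choice of "rowboats" and "pulling boats" as alternative labels are assumptions for illustration.

```python
SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."


def concept_to_turtle(uri, pref_label, alt_labels=(), broader=None, lang="en"):
    """Serialize one concept as Turtle. Labels are assumed to contain no quotes."""
    statements = [f'skos:prefLabel "{pref_label}"@{lang}']
    statements += [f'skos:altLabel "{label}"@{lang}' for label in alt_labels]
    if broader:
        statements.append(f"skos:broader <{broader}>")
    body = " ;\n    ".join(statements)
    return f"<{uri}> a skos:Concept ;\n    {body} ."


# Hypothetical URIs; the labels reuse terms from an earlier slide.
turtle = concept_to_turtle(
    "http://example.org/vocab/rowing-boats",
    "rowing boats",
    alt_labels=["rowboats", "pulling boats"],
    broader="http://example.org/vocab/boats",
)
print(SKOS_PREFIX)
print(turtle)
```

A production exporter would normally use an RDF library and escape special characters in labels; this sketch only shows the shape of the data.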
Vocabulary data protocols only
   SKOS API designed for live querying of
    vocabularies on the Web.
    http://www.w3.org/2001/sw/Europe/reports/thes/skosapi.html
   ADL Thesaurus Protocol for querying and
    navigation around monolingual thesauri on
    the Web.
    http://www.alexandria.ucsb.edu/thesaurus/specification.html
   SRW/SRU (Search/Retrieve Web Service and
    Search/Retrieve via URL) is for a variety of
    search types, not just vocabularies.
    http://www.loc.gov/standards/sru/
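As an illustration of the protocol style, an SRU request is an ordinary URL with a few standard parameters. The sketch below builds a `searchRetrieve` request for SRU version 1.1. The base URL is hypothetical, and which query indexes a given vocabulary server supports is an assumption that would need checking against that server.

```python
from urllib.parse import urlencode


def build_sru_url(base_url, cql_query, max_records=10, version="1.1"):
    """Build an SRU searchRetrieve URL from a CQL query string."""
    params = {
        "operation": "searchRetrieve",
        "version": version,
        "query": cql_query,
        "maximumRecords": max_records,
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint; a bare term is the simplest form of CQL query.
url = build_sru_url("http://example.org/sru", "schools")
print(url)
```

The server's XML response would then have to be parsed by the searching or indexing application; that step is left out here.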
Vocabulary construction and
management + interoperability
BS 8723: Structured vocabularies for
  information retrieval – Guide
 Part 1: Definitions, symbols and abbreviations

 Part 2: Thesauri

 Part 3: Vocabularies other than thesauri

 Part 4: Interoperability between vocabularies

 Part 5: Exchange formats and protocols for
  interoperability
Motivation throughout is “interoperability”
ISO NP 25964 (adoption of BS
8723 as an ISO standard)
   The proposal to revise ISO 2788 and
    ISO 5964, basing the work on BS 8723,
    was submitted to ISO TC 46/SC 9
    members in April 2007
   Project now approved
   At least 9 countries participating:
    France, Germany, Canada, Finland, New
    Zealand, Sweden, UK, Ukraine, USA
In conclusion
   In a networked world, we need
    interoperability at the vocabulary level
   Building the mappings is a job for people, not
    computers (but computer support is vital)
   Mapping may not be easy, but it’s fun… for
    the person with the right mindset
   We need to apply standards to all aspects of
    vocabulary work, data exchange as well as
    construction and maintenance

								