D4: SKOS and HIVE: Enhancing the Creation, Design and Flow of Information



Speakers: Hollie White
           Jane Greenberg
Coordinator: Alan Keely
                    Overview
• HIVE—Helping Interdisciplinary
  Vocabulary Engineering
  – Motivation—Dryad repository
• HIVE—Goals, status, and design
  – A scenario
• HIVE for Law Library, repositories, etc.
• Challenges
  – Technical and social
• Conclusion and questions
 HIVE model




 <AMG> (automatic metadata generation) approach for integrating discipline CVs
 Model addressing CV cost, interoperability, and usability
 constraints (interdisciplinary environment)

    23/02/2012
Motivation
• Survey of 400 evolutionary biologists: 48% based on other data; 78% data not deposited
• Evolutionary biologists use published data more frequently than they are depositing it themselves!
• Ecology, Paleontology, Physiology, Systematics, Genomics, Population genetics…
           Partner Journals
American Society of Naturalists
    American Naturalist
Ecological Society of America
    Ecology, Ecological Letters, Ecological Monographs, etc.
European Society for Evolutionary Biology
    Journal of Evolutionary Biology
Society for Integrative and Comparative Biology
    Integrative and Comparative Biology
Society for Molecular Biology and Evolution
    Molecular Biology and Evolution
Society for the Study of Evolution
    Evolution
Society for Systematic Biology
    Systematic Biology
Commercial journals
    Molecular Ecology
    Molecular Phylogenetics and Evolution
   Vocabulary needs for Dryad
• Vocabulary analysis
  – 600 keywords, Dryad partner journals
     • Vocabularies: NBII Thesaurus, LCSH, the Getty’s TGN,
       ERIC Thesaurus, Gene Ontology, ITIS (10 vocabularies)
     • Facets: taxon, geographic name, time period, topic, research
       method, genotype, phenotype…
• Results
  – 431 topical terms, exact matches: NBII Thesaurus, 25%; MeSH, 18%
  – 531 terms (research method and taxon): LCSH, 22% exact matches,
    25% partial matches
• Conclusion: Need multiple vocabularies
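
The kind of tally behind these percentages can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the Dryad analysis: the slides do not say how exact and partial matches were decided, so the rule below (exact = same label after normalizing case and whitespace, partial = at least one shared word) is an assumption, and the sample keywords and labels are made up for the example.

```python
from collections import Counter


def normalize(term):
    """Lower-case a term and collapse runs of whitespace."""
    return " ".join(term.lower().split())


def classify(keyword, labels):
    """Return 'exact', 'partial', or 'none' for one keyword.

    Assumed rule (not taken from the slides): a partial match shares
    at least one whole word with some vocabulary label.
    """
    key = normalize(keyword)
    if key in labels:
        return "exact"
    key_words = set(key.split())
    if any(key_words & set(label.split()) for label in labels):
        return "partial"
    return "none"


def match_rates(keywords, vocabulary_labels):
    """Fraction of keywords with exact, partial, and no match."""
    labels = {normalize(label) for label in vocabulary_labels}
    counts = Counter(classify(k, labels) for k in keywords)
    total = len(keywords)
    kinds = ("exact", "partial", "none")
    if total == 0:
        return {kind: 0.0 for kind in kinds}
    return {kind: counts[kind] / total for kind in kinds}


# Illustrative data only; not the 600 Dryad keywords.
keywords = ["Wood pulp", "pulp mills", "Sawdust", "phylogeny"]
vocabulary = ["Wood pulp", "Sawdust", "Paper", "Pulp (wood)"]
rates = match_rates(keywords, vocabulary)
```

Running the same function once per vocabulary gives a per-vocabulary comparison of the sort summarized above.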
Goals, status, and design
         HIVE...as a solution
Address CV (controlled vocabulary) cost,
  interoperability, and usability constraints
• COST: Expensive to create, maintain, and
  use
• INTEROPERABILITY: Developed in silos
  (structurally and intellectually)
• USABILITY: Interface design and
  functionality limitations have been well
  documented
Relevance to the law library community?

• Orphaned data (more of a Dryad issue)
• More important, interdisciplinary needs
     • COST (create, maintain, and use)
     • INTEROPERABILITY
     • USABILITY
HIVE Goals
• Automatic metadata generation approach that dynamically integrates
  discipline-specific controlled vocabularies encoded with the Simple
  Knowledge Organisation System (SKOS)
• Provide efficient, affordable, interoperable, and user friendly access
  to multiple vocabularies during metadata creation activities
• A model that can be replicated -> model and service

Three phases of HIVE:
1. Building HIVE
   - Vocabulary Development
   - Server preparation
     - Primate Life Histories Working Group
     - Wood Anatomy and Wood Density Working Group
2. Sharing HIVE
   - Continuing education (empowering information professionals)
3. Evaluating HIVE
   - Examining HIVE in Dryad
 HIVE Partners

Vocabulary Partners
• Library of Congress: LCSH
• the Getty Research Institute (GRI): TGN (Thesaurus of Geographic Names)
• United States Geological Survey (USGS): NBII Thesaurus
• Agrovoc Thesaurus

Advisory Board
• Jim Balhoff, NESCent
• Libby Dechman, LCSH
• Mike Frame, USGS
• Alistair Miles, Ok
• William Moen, University of North Texas
• Eva Méndez Rodríguez, University Carlos III of Madrid
• Joseph Shubitowski, Getty Research Institute
• Ed Summers, LCSH
• Barbara Tillett, Library of Congress
• Kathy Wisser, Simmons
• Lisa Zolly, USGS

WORKSHOP HOSTS: Columbia Univ.; Univ. of California, San Diego;
  Univ. of North Texas; Universidad Carlos III de Madrid, Madrid, Spain
            HIVE Construction
• HIVE stores millions of concepts from different
  vocabularies and makes them available on the Web
  over simple HTTP
   – Vocabularies are imported into HIVE in SKOS/RDF
     format
• HIVE is divided into two modules:
1. HIVE Core
   – SKOS/RDF storage and management (SESAME/Elmo)
   – SMART HIVE: Automatic Metadata Extraction and Topic
     Detection (KEA++ and MAUI)
   – Concept Retrieval (Lucene and MG4J)
2. HIVE Web
   – Web user Interface (GWT—Google Web Toolkit)
   – Machine oriented interface (SOAP and REST)
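
To make the division of labour in HIVE Core concrete, here is a toy Python sketch of the three roles listed above: concept storage, concept retrieval, and term suggestion from text. Everything in it is hypothetical. `TinyVocabularyStore` and its methods are invented for illustration and are not HIVE's API, the word index is only a stand-in for Lucene/MG4J, and the phrase lookup is a naive substitute for KEA++ and MAUI, which use machine learning.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lower-case alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyVocabularyStore:
    """Hypothetical in-memory stand-in for HIVE Core (not HIVE's API)."""

    def __init__(self):
        self.concepts = {}             # concept URI -> preferred label
        self.index = defaultdict(set)  # word -> URIs whose label has it

    def add(self, uri, pref_label):
        """Storage: register a concept and index its label words."""
        self.concepts[uri] = pref_label
        for word in tokenize(pref_label):
            self.index[word].add(uri)

    def search(self, query):
        """Retrieval: URIs whose label contains every query word."""
        words = tokenize(query)
        if not words:
            return set()
        return set.intersection(*(self.index.get(w, set()) for w in words))

    def suggest(self, text):
        """Naive suggestion: concepts whose whole label occurs in text."""
        padded = " " + " ".join(tokenize(text)) + " "
        return sorted(
            uri
            for uri, label in self.concepts.items()
            if " " + " ".join(tokenize(label)) + " " in padded
        )


NBII = "http://thesaurus.nbii.gov/nbii#"
store = TinyVocabularyStore()
store.add(NBII + "Wood-pulp", "Wood pulp")
store.add(NBII + "Pulp-mills", "Pulp mills")
store.add(NBII + "Sawdust", "Sawdust")

hits = store.search("pulp")
suggested = store.suggest("Sawdust and wood pulp from two mills")
```

Loading several vocabularies into one store like this is the simplest way to query them together; a web front end or a REST layer would then sit on top of `search` and `suggest`.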
SKOS
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
   <rdf:Description rdf:about="http://thesaurus.nbii.gov/nbii#Wood-pulp">
      <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
      <skos:prefLabel>Wood pulp</skos:prefLabel>
      <skos:altLabel>Pulp (wood)</skos:altLabel>
      <skos:broader rdf:resource="http://thesaurus.nbii.gov/nbii#Wood"/>
      <skos:related rdf:resource="http://thesaurus.nbii.gov/nbii#Paper"/>
      <skos:related rdf:resource="http://thesaurus.nbii.gov/nbii#Paper-industry-wastes"/>
      <skos:related rdf:resource="http://thesaurus.nbii.gov/nbii#Pulp-mills"/>
      <skos:related rdf:resource="http://thesaurus.nbii.gov/nbii#Sawdust"/>
      <skos:inScheme rdf:resource="http://thesaurus.nbii.gov/nbii#"/>
      <skos:scopeNote>LSC Life Sciences</skos:scopeNote>
   </rdf:Description>
</rdf:RDF>
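
A record like this can be read with nothing more than Python's standard library. The sketch below is one possible way to do it, not how HIVE imports vocabularies (HIVE uses SESAME/Elmo). The function name `read_concepts` is made up for the example, the sample is a shortened copy of the record above, and the code assumes each concept is written as an `rdf:Description` element directly under `rdf:RDF`, which is only one of the shapes RDF/XML allows.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"

# Shortened copy of the "Wood pulp" record shown on the slide.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:skos="{SKOS_NS}">
  <rdf:Description rdf:about="http://thesaurus.nbii.gov/nbii#Wood-pulp">
    <rdf:type rdf:resource="{SKOS_NS}Concept"/>
    <skos:prefLabel>Wood pulp</skos:prefLabel>
    <skos:altLabel>Pulp (wood)</skos:altLabel>
    <skos:broader rdf:resource="http://thesaurus.nbii.gov/nbii#Wood"/>
    <skos:related rdf:resource="http://thesaurus.nbii.gov/nbii#Paper"/>
    <skos:related rdf:resource="http://thesaurus.nbii.gov/nbii#Sawdust"/>
  </rdf:Description>
</rdf:RDF>"""


def q(namespace, local_name):
    """Build the '{namespace}name' form that ElementTree expects."""
    return "{%s}%s" % (namespace, local_name)


def read_concepts(rdf_xml):
    """Map each skos:Concept URI to its labels and linked concept URIs."""
    concepts = {}
    root = ET.fromstring(rdf_xml)
    for node in root.findall(q(RDF_NS, "Description")):
        types = {t.get(q(RDF_NS, "resource"))
                 for t in node.findall(q(RDF_NS, "type"))}
        if SKOS_NS + "Concept" not in types:
            continue  # skip descriptions that are not SKOS concepts

        def literals(name):
            return [e.text for e in node.findall(q(SKOS_NS, name))]

        def links(name):
            return [e.get(q(RDF_NS, "resource"))
                    for e in node.findall(q(SKOS_NS, name))]

        concepts[node.get(q(RDF_NS, "about"))] = {
            "prefLabel": literals("prefLabel"),
            "altLabel": literals("altLabel"),
            "broader": links("broader"),
            "related": links("related"),
        }
    return concepts


concepts = read_concepts(SAMPLE)
wood_pulp = concepts["http://thesaurus.nbii.gov/nbii#Wood-pulp"]
```

For real vocabularies, a full RDF parser is the safer choice, since RDF/XML permits several equivalent serializations that this element-by-element approach does not handle.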
A scenario
Meet Amy

 • Amy Zanne is a botanist.




 • Like every good scientist,
   she publishes.
• She deposits data in Dryad.
  Law library/data repositories
• http://www.law.harvard.edu/library/research/databases/major.html
• http://www.digitalcurrent.com/legal_webhosting.aspx
                     Challenges
• Building vs. doing/analysis
   – Source for HIVE generation, beyond abstracts
• Combining many vocabularies during the indexing/term
  matching phase is difficult, time-consuming, and inefficient.
   – NLP and machine learning offer promise
• Interoperability = dumbing down
   – ontologies
• Proof of concept: illustrate the differences between HIVE
  and other vocabulary registries (NCBO and OBO
  Foundry)
• General large-team logistics, and having people from
  multiple disciplines (also a plus)
 Conclusion
 • Vocabularies will enrich Dryad data description,
   and assist with access, use, reuse, etc…
 • Nothing novel, but infrastructure is supportive,
   finally…
 • Dryad and HIVE are real-world applications using
   Semantic Web technology
 Links
 •   HIVE
      – http://ils.unc.edu/mrc/hive/
 •   Metadata Research Center <MRC>
      – http://www.ils.unc.edu/mrc/
 •   Dryad
      – http://datadryad.org/
 •   National Evolutionary Synthesis Center (NESCent)
      – http://www.nescent.org/index.php



