					 Environmental Health Science—
     Cross Domain Ontology
  Research (EHS-CORE) Project

     Collaborative Expedition Workshop #38,
  February 22, 2005, National Science Foundation
Jane Greenberg, Associate Professor, School of Information and
Library Science, University of North Carolina at Chapel Hill

Abe Crystal, Research Assistant and Doctoral Student, SILS/UNC

W. Davenport Robertson, Library Director, National Institute of
Environmental Health Sciences
Obesity and the Built Environment:
  An Interdisciplinary Challenge
 Obesity in America has become an “epidemic.” (Health
  and Human Services Secretary Tommy Thompson)

 Accounts for more than 300,000 premature deaths each
  year and direct health care costs in excess of $61 billion

 Burden significantly greater in the lower socioeconomic
  strata, minority and vulnerable populations.

 Promising solution—integrate physical activity into daily
  life by improving the built environment—the physical
  surroundings in which one lives and works.

 Interdisciplinary nature of obesity and the built
   Problem: “Information Silos”
 Researchers are unaware of useful data and
  literature sources in related disciplines, beyond
  their immediate scope, because they are
  confronted with information silos
   Scenario 1: we know it’s there, but “it’s roll the dice
    whether or not we find it”
   Scenario 2: we don’t know it’s there (student PubMed
    search misses many relevant databases)

 Researchers aware of resources in other
  domains must locate all relevant and
  independent data sources, interact with each
  data source in isolation, and manually combine
              Problem impact
 Researchers face:
   A labor-intensive and inefficient interdisciplinary
    research experience (hard to find/integrate data and
    literature from outside own domain)

   Difficulty in locating “undiscovered public knowledge”
    (Swanson, 1986)—research from disparate
    disciplines, that when combined can solve an open

   Duplicative research resulting from the absence of
    knowledge about research in related, but pertinent
   Solution: information integration
Research goals of proposed project:
 Integrate existing domain-specific ontologies to provide
  uniform intellectual access to interdisciplinary data and
  literature on obesity and the built environment.

 Use Semantic Web metadata and technologies to
  provide powerful querying and inferencing capabilities on
  the integrated ontology.

 Develop an ontology server capable of dynamically
  incorporating changes (i.e., “just-in-time” integration) in
  domain-specific ontologies (e.g., new or revised
  vocabularies) into the integrated ontology.
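The "just-in-time" integration goal above can be sketched in a few lines. This is a minimal illustration, not the project's design: the class names, the version counter used to detect a revised vocabulary, and the sample terms and concept identifier are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SourceVocabulary:
    """A domain-specific vocabulary: local term -> shared concept id (hypothetical)."""
    name: str
    version: int = 0
    terms: dict[str, str] = field(default_factory=dict)

    def set_term(self, term: str, concept: str) -> None:
        """Add or revise a term and bump the version to signal a change."""
        self.terms[term.lower()] = concept
        self.version += 1


class IntegratedOntology:
    """Merged index over several source vocabularies, rebuilt on demand."""

    def __init__(self, sources: list[SourceVocabulary]) -> None:
        self.sources = sources
        self._seen_versions: dict[str, int] = {}
        self._index: dict[str, set[str]] = {}

    def _refresh(self) -> None:
        """Rebuild the index only if some source changed since the last build."""
        current = {s.name: s.version for s in self.sources}
        if current == self._seen_versions:
            return
        index: dict[str, set[str]] = {}
        for source in self.sources:
            for term, concept in source.terms.items():
                index.setdefault(concept, set()).add(f"{source.name}:{term}")
        self._index = index
        self._seen_versions = current

    def terms_for(self, concept: str) -> set[str]:
        """Return every source term currently mapped to the shared concept."""
        self._refresh()
        return set(self._index.get(concept, set()))


# Hypothetical vocabularies and terms, for illustration only.
health = SourceVocabulary("health")
planning = SourceVocabulary("planning")
health.set_term("physical activity", "concept:active-living")
ontology = IntegratedOntology([health, planning])
before = ontology.terms_for("concept:active-living")

# A revision in one source vocabulary is picked up on the next query.
planning.set_term("walkability", "concept:active-living")
after = ontology.terms_for("concept:active-living")
print(sorted(before))
print(sorted(after))
```

The index is rebuilt lazily at query time, which is one way to read "on the fly"; a real ontology server might instead poll its sources at intervals or receive change notifications.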
    Proposed Research Team
 Domain science
 (nutrition and public health)
   UNC School of Public Health, Active Living by Design
 Ontology engineering and systems
 development (computer science)
 Ontology and Web semantics
 development and evaluation (information
   Metadata Research Center/SILS/UNC-CH
       Information Integration:
        Ontological Solutions
Functional criteria
 Integrate ontologies from different
  domains/disciplines, using standard languages
  such as OWL

 Provide access to disparate and distributed
  data and literature

 Update vocabulary dynamically (on the fly, or at
  frequent intervals) based on changes in host
     Information Integration:
     Ontological Solutions (2)
Technical criteria
 The components must be openly
  accessible, preferably open source, and
  listed in a standard registry.

 They must use open enabling
  technologies and standards, such as:
   Uniform Resource Identifiers (URIs)
   Resource Description Framework (RDF), RDFS,
    and OWL (Web Ontology Language)
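To make the standards listed above concrete, the sketch below stores statements as URI-based triples and follows rdfs:subClassOf links transitively, a very small form of the inferencing these technologies allow. The example.org namespace and the class names are hypothetical; only the rdfs:subClassOf URI is taken from the RDFS vocabulary. OWL constructs are not modeled here.

```python
# The rdfs:subClassOf property URI from the RDF Schema vocabulary.
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

# Hypothetical namespace and class names, for illustration only.
EX = "http://example.org/ehs/"

# Each statement is a (subject, predicate, object) triple of URIs.
triples = {
    (EX + "Sidewalk", RDFS_SUBCLASS_OF, EX + "PedestrianInfrastructure"),
    (EX + "PedestrianInfrastructure", RDFS_SUBCLASS_OF, EX + "BuiltEnvironment"),
}


def superclasses(cls, graph):
    """Return all classes reachable from cls via rdfs:subClassOf links."""
    found = set()
    frontier = [cls]
    while frontier:
        node = frontier.pop()
        for subject, predicate, obj in graph:
            if subject == node and predicate == RDFS_SUBCLASS_OF and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def to_ntriples(graph):
    """Serialize URI-only triples in N-Triples syntax, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in sorted(graph))


print(to_ntriples(triples))
print(sorted(superclasses(EX + "Sidewalk", triples)))
```

A query for resources about the built environment could then also return items indexed under the narrower class, even though no single statement links the two directly.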
 Domain research
    Multi-method approach (interviews, log analysis…)
 Ontology mapping
    Standardization, pruning, mapping, testing, reviewing,
 Ontology server
    Define functional requirements, system architecture,
     prototyping, evaluation
 Document Cataloging
    Document sampling, cataloging (Dublin Core),
     metadata evaluation
 Unified interface
    Define functional requirements, prototyping, connect
     to ontology server, usability testing
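The document cataloging step above can be illustrated with a small Dublin Core record and a completeness check of the kind a metadata evaluation might start from. This is a sketch under stated assumptions: the helper names and the choice of "required" elements are hypothetical, and the sample record simply reuses the Swanson (1986) entry from the reference list.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def make_record(**fields):
    """Build a record, rejecting names that are not Dublin Core elements.

    Every element is repeatable, so each value is stored as a list.
    """
    unknown = set(fields) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    return {
        name: [value] if isinstance(value, str) else list(value)
        for name, value in fields.items()
    }


def completeness(record, required=("title", "creator", "date", "subject")):
    """Fraction of the required elements (an assumed set) that have a value."""
    present = sum(1 for name in required if record.get(name))
    return present / len(required)


# Sample record taken from the Swanson (1986) reference; no subject assigned yet.
record = make_record(
    title="Undiscovered Public Knowledge",
    creator="Swanson, D. R.",
    date="1986",
    source="Library Quarterly, 56: 103-118",
)
print(completeness(record))
```

In a cataloging workflow the missing subject element would be filled with terms drawn from the integrated ontology, which is what ties the cataloged documents to the unified interface.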
            Three Key Impacts
 Addresses a major social problem: the obesity epidemic

 Validates an approach to dynamic ontological
  integration, which may be applicable
  to many domains

 Facilitates cross-domain research, leading to
  increased scientific productivity and discovery
            Project Status
 Beginning preliminary fieldwork

 Pending proposals: NSF (system design
 and ontological integration), IMLS (user
 access to resource collection at ALbD)

 Environmental Health Science Thesaurus
 Forum (buy-in by many)
                           Selected References
 Greenberg, J. (2004a). Metadata Extraction and Harvesting: A Comparison of Two
    Automatic Metadata Generation Applications. Journal of Internet Cataloging, 6(4): 59-82.
   Gruber, TR. (1993). A Translation Approach to Portable Ontology Specification. Knowledge
    Acquisition, 5: 199-220.
   Gruber, TR. (1994). Toward Principles for the Design of Ontologies Used for Knowledge
    Sharing. IJHCS, 43 (5/6): 907-928.
   Guarino, N. (1998). Formal Ontology and Information Systems. In: N. Guarino, editor,
    Proceedings of the 1st International Conference on Formal Ontologies in Information
    Systems, FOIS '98, Trento, Italy, June, 1998, IOS Press, pp. 2-15.
   Kalyanpur, A, Sirin E, Parsia B, and Hendler, J. (2004). Hypermedia inspired Ontology
    Engineering Environment: Swoop. Submitted to ISWC 2004 as a poster. [Online]. Available
   Lauser, B., Wildemann, T., Poulos, A., Fisseha, F., Keizer, J., and Katz, S. A
    Comprehensive Framework for Building Multilingual Domain Ontologies: Creating a
    Prototype Biosecurity Ontology. In Proceedings of the International Conference on Dublin
    Core and Metadata for e-Communities, 2002, Florence, Italy. October 13-17. Firenze:
    Firenze University Press, pp. 113-123, 2002. [Online]
   Robertson, WD, and Greenberg, J. (2004). Architecting a Cross-Disciplinary Thesaurus for
    the Semantic Web. DC-2004: Metadata across Languages and Cultures. Proceedings of
    the International Conference on Dublin Core and Metadata Applications, October 11-14,
    2004, Shanghai, China. Shanghai: Shanghai Scientific & Technological Literature
    Publishing House, pp. 231-235.
   Sowa, J. F. (2002). Ontology, Metadata, and Semiotics, International Conference on
    Conceptual Structures, ICCS '2000, August 14-18, Darmstadt, Germany.
   Swanson, D. R. (1986). Undiscovered Public Knowledge. Library Quarterly, 56: 103-118.