Standards for Long-Term Retention of
Can Ontologies Help?
National Institute of Standards and Technology
Collaborative Expedition Workshop
National Science Foundation
July 18, 2007
• Too much digital data!
– It takes about 15 minutes for the world to churn out new digital
information equivalent to the entire collection of the US Library of Congress
• Proprietary file formats
– Expected lifetime of typical manufacturing software application only
• Short-lived computing hardware and software
– Expected lifetime of today’s storage/retrieval technologies only 10
• Products often outlive computer software/hardware by an order of magnitude
– Aircraft can last 50 years or more
– Healthcare records should be preserved through the patient’s
lifetime, and perhaps beyond
• Methods/tools address preservation, but not reuse or re-
• Necessary to avoid being locked into a vendor
format or application that could disappear in
the near future
• Likely to be more stable than proprietary formats
• But data standards are only part of the solution
– Information is more than just data!
Information = Data + Interpretation
from Reference Model for an Open Archival Information System (ISO 14721:2003)
An Information Package
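The idea that information is data plus interpretation can be sketched in a few lines of Python. This is only an illustration, not the OAIS model itself: the class name `InformationPackage`, the field `representation_info`, and the `is_interpretable` check are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class InformationPackage:
    """Hypothetical container pairing raw data with what is needed to interpret it."""

    data: bytes
    # Assumed structure: free-form key/value pairs describing how to read the data.
    representation_info: dict = field(default_factory=dict)

    def is_interpretable(self) -> bool:
        # Data alone is not information: require at least a declared format.
        return "format" in self.representation_info


# The same bytes, with and without interpretation attached.
bare = InformationPackage(data=b"ISO-10303-21;")
described = InformationPackage(
    data=b"ISO-10303-21;",
    representation_info={
        "format": "STEP physical file",
        "application_protocol": "ISO 10303-203",
    },
)
```

Under these assumptions, `bare` holds data only, while `described` carries enough context for a future reader to know how the bytes should be read.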
Tools for Tackling Long-
• Standards for representing digital artifacts
– STEP – ISO 10303 (product data)
– XML (documents)
– Graphics, audio, video, multimedia standards
– Scientific modeling standards
• Methods for representing preservation information
– Digital object typing/packaging
• METS (Metadata Encoding and Transmission Standard)
• DOPs (Digital Object Prototypes)
– Ontology languages
– Rules languages
• Schematron (ISO 19757-3:2006)
• Digital format registries (UK Archives, Harvard, Univ.
Sustaining Digital Information
What is sustainability?
From The Free Dictionary:
• Noun - the act of sustaining life by food or providing a means of
subsistence; "they were in want of sustenance"; "fishing was their main
• Transitive verb
– 1. To keep in existence; maintain.
– 2. To supply with necessities or nourishment; provide for.
– 3. To support from below; keep from falling or sinking; prop.
– 4. To support the spirits, vitality, or resolution of; encourage.
– 5. To bear up under; withstand: can't sustain the blistering heat.
– 6. To experience or suffer: sustained a fatal injury.
– 7. To affirm the validity of: The judge has sustained the prosecutor's objection.
– 8. To prove or corroborate; confirm.
– 9. To keep up (a joke or assumed role, for example) competently.
Sustaining Digital Information
– “Prop up”
– Prevent destruction
– Ensure authenticity, availability
– “Care and feeding”
– Enable reuse
• Library of Congress digital format sustainability factors
– External dependencies
– Impact of patents
– Technical protection mechanisms
• What are the sustainability factors for an archiving and/or
records management strategy?
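One way to make such factors comparable is to rate each one and combine the ratings. The sketch below uses only the three factors named on this slide; the 0.0 to 1.0 rating scale, the unweighted average, and the function name `sustainability_score` are assumptions made for illustration, not a published scoring method.

```python
# Factors taken from the slide; the scoring scheme itself is hypothetical.
FACTORS = (
    "external_dependencies",
    "impact_of_patents",
    "technical_protection_mechanisms",
)


def sustainability_score(ratings: dict) -> float:
    """Average the per-factor ratings, each from 0.0 (high risk) to 1.0 (low risk)."""
    missing = [f for f in FACTORS if f not in ratings]
    if missing:
        raise ValueError(f"missing ratings: {missing}")
    return sum(ratings[f] for f in FACTORS) / len(FACTORS)


# Illustrative ratings only; they do not describe any real format.
open_format = {
    "external_dependencies": 1.0,
    "impact_of_patents": 1.0,
    "technical_protection_mechanisms": 1.0,
}
locked_format = {
    "external_dependencies": 0.5,
    "impact_of_patents": 0.25,
    "technical_protection_mechanisms": 0.0,
}
```

A real assessment would likely weight the factors differently and add criteria specific to the archiving or records management strategy in question.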
OAIS Functional Model
Access Scenarios: The Three Rs
• Reference
– Preserve information in its original state
– Example (product data engineering): 3D visualization
• Reuse
– Allow for future modification, re-engineering
– Example: ISO 10303-203:1994 (STEP AP203)
• Rationale
– Encode construction history, design intent, tolerancing info,
lifecycle management info, etc.
– Example: STEP AP203 ed.2 ++
– Ontologies and/or other representations needed
Extended Functional Model
So How Can Ontologies Help?
• Digital object type classification
• Prediction of records management policy
• Evaluating a records management system
based on sustainability criteria
• Tailoring repository access according to the
• Measure long-term sustainability based on the
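The first two bullets, type classification and policy prediction, can be illustrated with a toy subclass hierarchy in which a digital object type inherits the policy of its nearest ancestor. Everything here is hypothetical: the type names, the policy labels, and the helper functions are invented for the sketch, and a real system would more plausibly use an ontology language and a reasoner than Python dictionaries.

```python
# Hypothetical type hierarchy: each key is a subclass of its value.
SUBCLASS_OF = {
    "STEP_AP203_file": "product_model",
    "product_model": "digital_object",
    "XML_document": "document",
    "document": "digital_object",
}

# Hypothetical policies, attached at whatever level they are known.
POLICY = {
    "digital_object": "default retention",
    "product_model": "retain for product lifetime",
}


def ancestors(object_type: str) -> list:
    """Return the type followed by its superclasses, most specific first."""
    chain = [object_type]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain


def predicted_policy(object_type: str):
    """Return the policy of the nearest ancestor that has one, or None."""
    for candidate in ancestors(object_type):
        if candidate in POLICY:
            return POLICY[candidate]
    return None
```

With these assumed tables, a STEP AP203 file inherits the product-model policy, while an XML document falls back to the default attached to the root type.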