reliable - School of Information by waishengda


Lifecycle Metadata for Digital Objects

October 4, 2004
Creation Metadata
Review class metadata sets: What
categories do they fill?
   Dublin Core      NLM Journal
   EAD              OCLC/RLG
   FRBR             ONIX
   INDECS           PRISM
   MARC/MODS        RKMS
   METS             TEI Lite
   MIX/NISO         VRA Core
   MPEG21
What happens at file creation?
   Consciously
    – You place into the file everything you think it
      needs in order to be useful
    – You save it with a file name that will help you find
      it again (you hope)
   Unconsciously
    – Metadata is added using environmental
    – Metadata is added using information elicited from
Metadata from creating app
(Word 2000 Statistics)
Metadata from creating app
(Word 2000 General)
Metadata controlled by the user
(Word 2000 Summary)
Metadata controlled by the user
(Word 2000 Custom)
Viewing Word Metadata in XML
   Add relevant metadata to your Word
    document as outlined above
   If you have Word 2003, save the document
    as XML
   Open the document in an XML editor
   If not, save the document as HTML
   View the document in Notepad or another
    ASCII editor or view source from the HTML
    document displayed in a browser
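
As a rough illustration of what the XML route exposes, the sketch below reads document properties from a small hand-written fragment shaped like a Word 2003 XML file. The sample values are invented for the example, and the namespace URIs and element names are assumptions about the Word 2003 format that should be checked against a file you have saved yourself.

```python
import xml.etree.ElementTree as ET

# Hand-written sample, NOT real Word output; all values are hypothetical.
SAMPLE = """
<w:wordDocument
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
    xmlns:o="urn:schemas-microsoft-com:office:office">
  <o:DocumentProperties>
    <o:Title>Lifecycle Metadata</o:Title>
    <o:Author>Example Author</o:Author>
    <o:Created>2004-10-04T09:00:00Z</o:Created>
    <o:Revision>3</o:Revision>
  </o:DocumentProperties>
</w:wordDocument>
"""

# Assumed namespace for the document-properties block.
OFFICE_NS = "urn:schemas-microsoft-com:office:office"


def document_properties(xml_text):
    """Return the DocumentProperties children as a {name: text} dict."""
    root = ET.fromstring(xml_text.strip())
    props = root.find(f"{{{OFFICE_NS}}}DocumentProperties")
    if props is None:
        return {}
    # Tags look like "{namespace}Title"; keep only the local name.
    return {child.tag.split("}", 1)[1]: (child.text or "") for child in props}


if __name__ == "__main__":
    for name, value in document_properties(SAMPLE).items():
        print(f"{name}: {value}")
```

The same function returns an empty dict for a document with no properties block, which makes it easy to see which files carry creation metadata at all.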
The future of XML in Word?
 Word already provided XML markup of its
  Document Properties and Custom Document
  Properties metadata; in Word 2003 a native
  (and patented) XML schema is used.
 Several vendors made plugins for making
  older Word documents into XML documents:
    – eXportXML from Schultz: a template installed into
      Word using a macro
    – Xfinity Author Wx from B-Bop: makes Word into an
      XML editor for ordinary documents
    – WorX SE from Xyenterprise: Word as XML editor
      or creator of XML objects
Future of XML at Microsoft
   XML and the whole model of the way the web works
    are becoming part of the emerging Microsoft
    operating environment from servers to desktop
   Note already a move to a standard IE interface for
    system functions
   Presently provides tools to programmers under
    “Information Bridge” framework to allow connecting
    XML documents created by Microsoft programs via
    metadata elements to web services
   Creation metadata thus vital to this whole scheme
[non-Microsoft] Uses of creation metadata
   Establishing prior art for an invention
   Identifying who knew what and when
   Showing how an object fits into the larger
    scheme of things (preserving the “archival bond”)
   Keeping track of versions of an object
   Providing assurance of reliability: that the
    object is what it purports to be
   Anchoring the object in the place and time of
    its origin
Placement of creation metadata
 Same options as for all metadata
 Embedded within the object (Word metadata)
 Wrapped around the object (object is
  embedded in metadata document: Word
  document containing metadata embedded in
  XML document extracting reliability metadata)
 Captured, communicated, or kept separately
  from the object (non-text objects but not only
UBC Creation Metadata I

 A word on diplomatics
 The notion of a complete record
    –   Medium
    –   Content
    –   Form
    –   Persons (author, writer, addressee, creator)
    –   Acts
    –   Archival bond
    –   Transmission (intent, capability, success)
UBC Creation Metadata II

   “Elements of intellectual form” inside the
    – Date (time of transmission and receipt; place of
    – Superscription or attestation (author/originator)
    – Inscription (all addressees and receivers)
    – Title and/or subject
    – Disposition/purpose (the intention of the record)
UBC Creation Metadata III

 The notion of a reliable record; in addition
  to completeness, it must have:
 “Document profile” as container for the object
    –   Date available (created or received)
    –   Time available (created or received)
    –   [Date and time of further transmission]
    –   Author
    –   Addressee
    –   Subject [classification code, registry number]
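
One way to picture the document profile is as a small structured record attached to the object. The sketch below is only an illustration: the class name, the field names, and the choice to treat author, addressee, and subject as required are assumptions, not part of the UBC model as stated on the slide.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class DocumentProfile:
    """Hypothetical container for the profile elements listed above."""
    available: datetime                 # date and time created or received
    author: str
    addressee: str
    subject: str
    classification_code: Optional[str] = None
    registry_number: Optional[str] = None
    further_transmission: Optional[datetime] = None  # optional element

    def missing_fields(self) -> List[str]:
        """Names of the text fields that were left empty."""
        required = {
            "author": self.author,
            "addressee": self.addressee,
            "subject": self.subject,
        }
        return [name for name, value in required.items() if not value.strip()]


if __name__ == "__main__":
    profile = DocumentProfile(
        available=datetime(2004, 10, 4, 9, 0),
        author="Example Author",
        addressee="",
        subject="Retention schedule draft",
    )
    print(profile.missing_fields())
```

Checking for empty fields at creation time is the point at which a system can insist on the profile being filled in, rather than trying to reconstruct it later.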
DoD 5015.2 Recordkeeping
Standard: Assumptions
 Note that 5015.2 assumes an entire
  detailed recordkeeping system that fully
  accounts for all records at the series,
  folder, and individual level
 The “file plan” defines the
  recordkeeping system; the “schedule” is
  applied to entities defined in the file plan
DoD 5015.2 Recordkeeping
Standard: record metadata
 Unique identifier*          Author/Originator*
 Supplemental marking list   Addressee*
 Subject/Title*              Other addressees*
 Media type*                 Originating organization*
 Format*                     Location
 Date filed*                 Vital record indicator
 Publication date*           Vital record review/update cycle*
 Date received               User-defined fields
DoD 5015.2 email metadata
 Sender (Author/Originator)
 Primary addressees (Addressee)
 Other addressees (Other addressee)
 Date/time sent (Publication date)
 Date/time received (Date received)
 Subject (Subject/Title)
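
The mapping above can be sketched with Python's standard email module. This is a minimal illustration, not a 5015.2-compliant implementation: the sample message is invented, the function name is hypothetical, treating Cc as "other addressees" is an assumption, and the date received is passed in by the caller because it comes from the receiving system rather than from the message headers used here.

```python
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime

# Invented sample message for illustration only.
RAW = """\
From: Alice Example <alice@example.org>
To: Bob Example <bob@example.org>
Cc: carol@example.org, dave@example.org
Date: Mon, 04 Oct 2004 09:30:00 -0500
Subject: Retention schedule draft

Body text.
"""


def email_record_metadata(raw, date_received=None):
    """Map email headers onto the record metadata names in parentheses above."""
    msg = message_from_string(raw)

    def addresses(header):
        # getaddresses returns (display name, address) pairs.
        return [addr for _name, addr in getaddresses(msg.get_all(header, []))]

    sent = msg.get("Date")
    return {
        "Author/Originator": addresses("From"),
        "Addressee": addresses("To"),
        "Other addressee": addresses("Cc"),  # assumption: Cc only
        "Publication date": parsedate_to_datetime(sent) if sent else None,
        "Date received": date_received,      # supplied by the receiving system
        "Subject/Title": msg.get("Subject", ""),
    }


if __name__ == "__main__":
    for field, value in email_record_metadata(RAW).items():
        print(f"{field}: {value}")
```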
Date/time and persons: vital to
records’ reliability
 Without date/time, not possible to manage
  records by date: cutoffs, retention, destruction
 Without persons (author, recipient, creator),
  nobody would care
 Without hierarchical set of data categories, no
 Note dependence on the systems in which
  record creation is embedded
What about non-text objects?

   Creation metadata for non-text objects covers
    much the same ground:
    – Information about occasion of creation
    – Information about creator, intention, receiver
    – Information about the object itself
   Many kinds of non-text objects:
    – Images, still and moving
    – Sound
    – Multimedia
Connecting metadata to a non-
text object
 Object is kept in specifically-defined file
 File name/ID is crucial to the connection
 XLink is used to connect the two using a
  series of XML attributes:
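
A minimal sketch of such a link, assuming a simple XLink built from the standard xlink:type, xlink:href, and xlink:title attributes. The element names ("record", "title", "object"), the function name, and the file path are hypothetical and chosen only for the example.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"
# Serialize the namespace with the conventional "xlink" prefix.
ET.register_namespace("xlink", XLINK_NS)


def metadata_record(file_name, title):
    """Build a metadata record that points at a separately stored object."""
    record = ET.Element("record")                 # hypothetical element name
    ET.SubElement(record, "title").text = title
    ET.SubElement(record, "object", {
        f"{{{XLINK_NS}}}type": "simple",          # a one-way link
        f"{{{XLINK_NS}}}href": file_name,         # file name/ID of the object
        f"{{{XLINK_NS}}}title": title,            # human-readable label
    })
    return record


if __name__ == "__main__":
    record = metadata_record("images/photo-0001.tif", "Example photograph")
    print(ET.tostring(record, encoding="unicode"))
```

Because the link depends entirely on the value of xlink:href, renaming or moving the object file without updating the metadata record breaks the connection.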
