Reference Model for an Open Archival Information System (OAIS)


       ESIP Summer Meeting

 John Garrett – ADNET Systems at NASA/GSFC


                2009-07-09
          Topics (time permitting)
• OAIS Reference Model
• Follow-on/Related Standards
  – Producer-Archive Interface Methodology Abstract
    Standard (PAIMAS)
  – Repository Audit and Certification Metrics (draft)
     • Requirements for Bodies Providing Audits (draft)
  – Producer-Archive Interface Specification (PAIS) (draft)
  – XML Formatted Data Unit (XFDU)


  Contributors: Don Sawyer, Daniele Boucon, Lou Reich,
   David Giaretta and many others involved in CCSDS
   Archiving and Packaging standards development
      OAIS Reference Model Home
• Consultative Committee for Space Data Systems
  (CCSDS)
  – International group of space agencies
  – Develops a variety of science discipline-independent standards
  – Became the working body for ISO TC 20/SC 13 about 1990
     • TC20: Aircraft and Space Vehicles
     • SC13: Space Data and Information Transfer Systems
  – http://www.ccsds.org/

  – Ensured broad participation, including traditional archives,
    libraries, companies

  (Not restricted to space communities; all participation was
    welcomed!)
                  In the Beginning: OAIS Reference Model
CCSDS 650.0-B-1 Reference Model for an Open Archival Information System (OAIS)
(ISO 14721:2003)               http://public.ccsds.org/publications/archive/650x0b1.pdf

OAIS Mandatory Responsibilities:
  1. Negotiating and accepting information
  2. Obtaining sufficient control of the information to ensure long-term preservation
  3. Determining the "designated community"
  4. Ensuring that information is independently understandable
  5. Following documented policies and procedures
  6. Making the preserved information available

[Overview figure, three further panels: OAIS Information Model; OAIS Functional Model; OAIS Environment and Data Flows]
         OAIS Responsibilities

• Negotiates and accepts information from information
  producers
• Obtains sufficient control to ensure long-term preservation
• Determines which communities (designated) need to be
  able to understand the preserved information
• Ensures the information to be preserved is independently
  understandable to the Designated Communities
• Follows documented policies and procedures that ensure
  the information is preserved against all reasonable
  contingencies
• Makes the preserved information available to the
  Designated Communities in forms understandable to those
  communities
   OAIS Archival Information Package

  Package Description  -- derived from --  Archival Information Package (AIP)  -- delimited by --  Packaging Information
  Content Information  -- further described by --  Preservation Description Information (PDI)

• Package Description
   – e.g., Information supporting customer searches for AIP
• Packaging Information
   – e.g., How to find Content Information and PDI on some medium
• Content Information, e.g.,
   – Hardcopy document
   – Document as an electronic file together with its format description
   – Scientific data set consisting of image file, text file, and format descriptions file
     describing the other files
• Preservation Description Information (PDI)
   – e.g., How the Content Information came into being, who has held it, how it
     relates to other information, and how its integrity is assured
  Preservation Description Information (PDI)
• Reference Information
   – Provides one or more identifiers, or systems of identifiers, by
     which the Content Information may be uniquely identified
   –   Bibliographic Description, Persistent IDs

• Provenance Information
   – Describes the source of the Content Information, who has had custody of it,
     and its history
   –   Logs of migrations

• Context Information
   – Describes how the Content Information relates to other information
     outside the Information Package
   –   Pointers to related collections

• Fixity Information
   – Protects the Content Information from undocumented alteration
   –   Digital signatures, Checksums
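
The Fixity Information bullet above can be illustrated with a minimal Python sketch that records a checksum for a piece of Content Information and re-checks it later. The function names (`make_fixity`, `verify_fixity`), the dictionary layout, and the choice of SHA-256 are assumptions made for illustration; the OAIS Reference Model does not prescribe a particular algorithm or data structure.

```python
import hashlib


def make_fixity(content: bytes, algorithm: str = "sha256") -> dict:
    """Record a checksum for the given Content Information bytes."""
    digest = hashlib.new(algorithm, content).hexdigest()
    return {"algorithm": algorithm, "digest": digest}


def verify_fixity(content: bytes, fixity: dict) -> bool:
    """Return True if the content still matches the recorded checksum."""
    current = hashlib.new(fixity["algorithm"], content).hexdigest()
    return current == fixity["digest"]


# Hypothetical sample content, used only to exercise the two functions.
original = b"temperature,2009-07-09,21.5\n"
fixity = make_fixity(original)
unchanged = verify_fixity(original, fixity)              # same bytes
altered = verify_fixity(original + b"tampered", fixity)  # modified bytes
```

Storing the algorithm name next to the digest lets a later check recompute the value without guessing how it was produced.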
 View of an OAIS Environment

• Producer provides the information to be
  preserved
• Management sets overall OAIS policy
• Consumer seeks and acquires preserved
  information of interest



                   OAIS
Producer         (archive)        Consumer


                Management
             OAIS Functional Entities

                                Preservation Planning


P                                                                                 C
                  Descriptive          Data         Descriptive
R                   Info.           Management        Info.
                                                                                  O
O                                                                     queries     N
D                                                                   result sets   S
                 Ingest                                    Access
U                                                                     orders      U
C    SIP                                                                          M
E                         AIP         Archival      AIP                DIP        E
R                                     Storage                                     R

                                   Administration


                                MANAGEMENT
SIP = Submission Information Package
AIP = Archival Information Package
DIP = Dissemination Information Package
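
As a rough sketch of how the three package types move through the functional entities, the Python below models a toy archive in which Ingest turns a SIP into an AIP and Access answers queries and orders with result sets and DIPs. All class, method, and field names here are hypothetical and chosen for illustration; they are not defined by the OAIS Reference Model.

```python
from dataclasses import dataclass, field


@dataclass
class SIP:
    """Submission Information Package, as delivered by a Producer."""
    producer: str
    content: bytes


@dataclass
class AIP:
    """Archival Information Package held in Archival Storage."""
    aip_id: str
    content: bytes
    pdi: dict = field(default_factory=dict)  # Preservation Description Information


@dataclass
class DIP:
    """Dissemination Information Package returned to a Consumer."""
    aip_id: str
    content: bytes


class Archive:
    """Toy archive: ingest() plays Ingest, query() and order() play Access."""

    def __init__(self) -> None:
        self._storage = {}  # aip_id -> AIP (stands in for Archival Storage)

    def ingest(self, sip: SIP) -> str:
        aip_id = f"aip-{len(self._storage) + 1}"
        pdi = {"provenance": {"submitted_by": sip.producer}}
        self._storage[aip_id] = AIP(aip_id, sip.content, pdi)
        return aip_id

    def query(self, producer: str) -> list:
        """Return a result set of AIP identifiers for the given producer."""
        return [a.aip_id for a in self._storage.values()
                if a.pdi["provenance"]["submitted_by"] == producer]

    def order(self, aip_id: str) -> DIP:
        aip = self._storage[aip_id]
        return DIP(aip.aip_id, aip.content)


archive = Archive()
aip_id = archive.ingest(SIP("GSFC", b"granule"))
result_set = archive.query("GSFC")
dip = archive.order(result_set[0])
```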
     External Data Flow View
• Producer --> OAIS: Submission Information Packages
• OAIS: Archival Information Packages
• Consumer --> OAIS: queries, orders
• OAIS --> Consumer: result sets, Dissemination Information Packages
                    Conformance
• How does an archive conform?
   – It discharges the set of minimal responsibilities
   – It supports the basic information concepts that
     address a definition of information and types of
     information packages
• How do other documents conform?
   – By using OAIS terms and concepts

• Certification Standard in progress
                 OAIS Update

Many improvements including:
• Authenticity
• Information properties
• Risk management
• Emulation
• Federation
Producer-Archive Interface Methodology Abstract Standard

       PAIMAS Focus


                                 Preservation Planning


   P                                      Data                                        C
                                       Management
   R                                                                                  O
   O                                     Descriptive                                  N
                                           Info.                    queries
   D                                                              result sets         S
       SIP            Ingest                             Access
   U                                     Archival                    orders           U
   C                                     Storage                                      M
                                                                                DIP
   E                                       AIP                                        E
   R                                                                                  R

                                       Administration




                                  MANAGEMENT
  SIP = Submission Information Package
  AIP = Archival Information Package
  DIP = Dissemination Information Package
          PAIMAS Methodology
 CCSDS 651.0-B-1 Producer-Archive Interface Methodology Abstract Standard.
 (ISO 20652:2006)   http://public.ccsds.org/publications/archive/651x0b1.pdf

• The Archive Project is broken into four main phases:
  •   Preliminary Phase,
  •   Formal Definition Phase,
  •   Transfer Phase,
  •   Validation Phase.

• PAIMAS identifies:
  •   the phases in the process of transferring information,
  •   the objective of each phase,
  •   extensive tables of the actions that must be carried out,
  •   the expected results.
• PAIMAS is a basis:
  • for further specialization by a particular community,
  • for the identification of standards and implementation guides,
  • for identification and development of a set of software tools.
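                   PAIMAS phases & relationships
[Figure: the four phases in sequence, each with its phase objective]
   Preliminary Phase  -->  Formal Definition Phase  -->  Transfer Phase  -->  Validation Phase
• After the Preliminary Phase: Preliminary Agreement
• After the Formal Definition Phase: Submission Agreement, including Dictionary and
  Formal Model
• Between the Transfer Phase and the Validation Phase: Transferred object files; Anomalies
• After the Validation Phase: Validation agreement; Data ready to archive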
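
The sequence of the four PAIMAS phases and the items associated with them can be sketched as a small state machine. This is an illustration only: the names `Phase`, `PHASE_OUTPUT`, and `next_phase` are hypothetical, and the rule that anomalies found during validation send the project back to the Transfer Phase is an assumption read from the figure, not a quotation from the standard.

```python
from enum import IntEnum
from typing import Optional


class Phase(IntEnum):
    PRELIMINARY = 1
    FORMAL_DEFINITION = 2
    TRANSFER = 3
    VALIDATION = 4


# Item the slide associates with the end of each phase.
PHASE_OUTPUT = {
    Phase.PRELIMINARY: "Preliminary Agreement",
    Phase.FORMAL_DEFINITION: "Submission Agreement, including Dictionary and Formal Model",
    Phase.TRANSFER: "Transferred object files",
    Phase.VALIDATION: "Validation agreement",
}


def next_phase(current: Phase, anomalies_found: bool = False) -> Optional[Phase]:
    """Return the phase that follows `current`, or None when the project is done.

    Assumption: anomalies detected in the Validation Phase return the
    project to the Transfer Phase so the affected objects can be re-sent.
    """
    if current is Phase.VALIDATION:
        return Phase.TRANSFER if anomalies_found else None
    return Phase(current + 1)


after_preliminary = next_phase(Phase.PRELIMINARY)
after_clean_validation = next_phase(Phase.VALIDATION)
after_failed_validation = next_phase(Phase.VALIDATION, anomalies_found=True)
```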
                 Preliminary phase: sub-phases
• First contact
• Preliminary definition, feasibility and assessment
   – Information to be archived, Quantification, Legal and contractual aspects,
     permanent impact on the Archive, Summary of costs, etc.
• Establishment of a preliminary agreement

Action table

   Id     Description (Preliminary phase: quantification)             Involves
   P-19   Estimate the data volume to be transmitted to the Archive   Producer
   P-20   Assess the permanent data volume to store                   Archive
   P-21   Assess the storage capability need for the ingest process   Archive
   P-22   Assess the associated costs                                 Archive
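
An action table like the one above maps naturally onto a list of records, which makes it easy to pull out the actions assigned to each party. The sketch below uses the four quantification actions from the slide; the `Action` type and the `actions_for` helper are hypothetical names introduced only for this example.

```python
from typing import NamedTuple


class Action(NamedTuple):
    action_id: str
    description: str
    involves: str


# The four quantification actions listed in the action table.
QUANTIFICATION_ACTIONS = [
    Action("P-19", "Estimate the data volume to be transmitted to the Archive", "Producer"),
    Action("P-20", "Assess the permanent data volume to store", "Archive"),
    Action("P-21", "Assess the storage capability need for the ingest process", "Archive"),
    Action("P-22", "Assess the associated costs", "Archive"),
]


def actions_for(party: str, actions=QUANTIFICATION_ACTIONS) -> list:
    """Return the ids of the actions that involve the given party."""
    return [a.action_id for a in actions if a.involves == party]


archive_ids = actions_for("Archive")
producer_ids = actions_for("Producer")
```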
    Repository Audit and Certification - Metrics

• http://wiki.digitalrepositoryauditandcertification.org/bin/view
• Closing in on public draft that will be submitted to CCSDS and ISO
•    Builds on previous audit work by TRAC and many others
INCLUDED TOPICS
    ORGANISATIONAL INFRASTRUCTURE
      GOVERNANCE & ORGANIZATIONAL VIABILITY
      ORGANIZATIONAL STRUCTURE & STAFFING
      PROCEDURAL ACCOUNTABILITY & PRESERVATION POLICY FRAMEWORK
      FINANCIAL SUSTAINABILITY
      CONTRACTS, LICENSES, & LIABILITIES
    DIGITAL OBJECT MANAGEMENT
      INGEST: ACQUISITION OF CONTENT
      INGEST: CREATION OF THE AIP
      PRESERVATION PLANNING
      AIP PRESERVATION
      INFORMATION MANAGEMENT
      ACCESS MANAGEMENT
    INFRASTRUCTURE AND SECURITY RISK MANAGEMENT
      Technical Infrastructure Risk Management
      Security risk management

4.2.8 The repository shall verify each AIP for completeness and correctness
at the point it is created.
    Supporting Text
      This is necessary in order to ensure that what is maintained over the long
      term is as it should be and can be traced to the information provided by
      the Producers.
    Examples of Ways the Repository can Demonstrate it is Meeting this Requirement
      Description of the procedure that verifies completeness and correctness of
      the AIPs; logs of the procedure.
    Discussion
      The repository should be sure that the AIPs it creates are as they are
      expected to be by …
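
One way a repository might implement the check in metric 4.2.8 is to compare each newly created AIP against a manifest of expected files and checksums. The sketch below treats "completeness" as every expected file being present and "correctness" as every checksum matching. The manifest layout, the function names, and the use of SHA-256 are all assumptions for illustration; the draft metrics do not specify a procedure.

```python
import hashlib


def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def verify_aip(files: dict, manifest: dict) -> list:
    """Return a list of problems; an empty list means the AIP passed.

    files:    name -> bytes actually present in the AIP (hypothetical layout)
    manifest: name -> expected SHA-256 digest (hypothetical layout)
    """
    problems = []
    for name, expected in manifest.items():
        if name not in files:
            problems.append(f"missing: {name}")            # completeness
        elif sha256(files[name]) != expected:
            problems.append(f"checksum mismatch: {name}")  # correctness
    for name in files:
        if name not in manifest:
            problems.append(f"unexpected: {name}")
    return problems


# Hypothetical two-file AIP used to exercise the check.
manifest = {"image.dat": sha256(b"pixels"), "readme.txt": sha256(b"notes")}
good = verify_aip({"image.dat": b"pixels", "readme.txt": b"notes"}, manifest)
bad = verify_aip({"image.dat": b"pixelz"}, manifest)
```

The returned problem list could be written to a log, which would give the repository the kind of procedural evidence the metric's examples mention.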
                        PAIS Objectives
• Producer-Archive Interface Specification

• Provide formal modelling of data objects that are to be transferred
  from Producer to Archives
    –   XML-based interchange of the model and SIPs


• Implementation standard for Producer – Archive Interface
     – Conformity with the OAIS Reference Model
     – Conformity with the PAIMAS
     – Conformity with the XFDU
• Aimed mainly at the Formal Definition Phase, with applicability to
  the Transfer Phase with Validation


•   Closing in on public draft that will be submitted to CCSDS and ISO possibly
    by end of year
PAIS Basic Entities




      XFDU Packaging Standard Rationale
                  Technology and Requirements Evolution
•   Physical media --> Electronic transfer
•   No standard language for metadata --> XML
•   Homogeneous Remote Procedure Call --> CORBA, SOAP
•   Little understanding of long-term preservation --> OAIS RM
•   Record formats --> Self-describing data formats
                                New Requirements
•   Describe multiple encodings of a data object
•   Better describe the relationships among a set of data objects

                             Technical Drivers
• Use of XML-based technologies
     • Designed to be extensible to include new XML technologies as they emerge
• Linkage of data and software
• Direct mapping to OAIS Information Models
• Support both media and network exchange
• Support for multiple encoding/compression on individual objects or on entire package
• Mapping to current SFDU Packaging & Data Description Metadata where possible
• Maximal use of existing standards and tools from similar efforts
                     XFDU Conceptual View
    CCSDS 661.0-B-1 XML Formatted Data Unit (XFDU) Structure and Construction Rules.
    (ISO 13527:2009)               http://public.ccsds.org/publications/archive/661x0b1.pdf




Open source XFDU Toolkit Library developed as a reference implementation
     (available at: http://sindbad.gsfc.nasa.gov/xfdu)
•    Interoperability testing completed with ESA XFDU Implementation
•    Partnered with JPL/PDS to establish a NASA Testbed
•    XFDU Briefing on Scalability at the Collaborative Expedition Workshop / Toward
     Scalable Data Management (available at:
     http://colab.cim3.net/cgi-bin/wiki.pl?ExpeditionWorkshop/TowardScalableDataManagement_2008_06_10)

				