
Ensuring Access to Mathematics Over Time:
Cooperative Management of Distributed Digital Archives
NSF Award # IIS-0240450
Annual Report for Year 2 (FY 2005)

Ensuring Access to Mathematics Over Time (formerly abbreviated EATMOT, now Math Arc) has
made considerable progress on three fronts over the past year:
    o Metadata formats and schema
    o Scenarios for cooperative management of distributed preservation systems
    o System design
We have built on the foundation of our first year's work, in which we a) identified the
resources to be preserved, b) examined existing metadata formats used in other archiving projects,
and c) analyzed the OAIS Reference Model to break it down into practical work units. This year
we advanced the design of the Math Arc system to the point where we are ready to begin building a
working prototype in the remaining months of the second year and into the third.


The Math Arc Metadata Format
The core of the Math Arc system is the metadata format that supports preservation. Not wanting to
duplicate work done by others, we extended existing tools and standards to create the design for the
Math Arc system’s metadata. In August, the PREMIS working group released its preservation
metadata data dictionary (http://www.loc.gov/standards/premis). Our innovation combines a
metadata framework (METS), the PREMIS standard, and a web protocol (OAI-PMH) to produce a
distributed archive metadata format. The following four paragraphs summarize the three borrowed
components and the format we created.


The METS framework (http://www.loc.gov/standards/mets): We decided to use the Metadata
Encoding and Transmission Standard (METS) as the basis for exchanging complex digital objects
between the two OAIS archives at Cornell University Library (CUL) and at Göttingen State and
University Library (GSUB). We defined an "asset" as a complex digital object together with the
associated metadata that describes the object for preservation. METS is only a framework, so we
had to define the information we want to preserve about the assets we are exchanging.

The preservation standard (http://www.loc.gov/standards/premis): The Research Libraries Group
(RLG) and the Online Computer Library Center (OCLC) initiated a project to develop a data
dictionary of preservation metadata elements. The draft of their standard, the PREMIS data
dictionary, defines a large number of core attributes of digital content and the metadata necessary
for long-term preservation. Because our goal is to allow separate institutions to exchange disparate
collections of assets, we narrowed the element set to those elements most useful in a generalized
setting, expecting that the Math Arc system might be extended beyond the current two archives
and their respective collections of assets.


The protocol (http://www.openarchives.org): We had to choose a protocol that would allow our
servers to talk with each other and to automate the exchange of assets. Early in the project we
decided to use the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). It uses
HTTP as the message carrier and adds a set of requests and responses that support resource
discovery and harvesting. It enables a "pull" mechanism, in which one archive requests information
about new resources at the other, rather than a "push" mechanism, in which the producer of new
assets notifies another archive that it is sending its assets. The pull interchange permits our Math
Arc system to expand to more than two archives.
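
The pull mechanism described above can be illustrated with a minimal Java sketch that builds
OAI-PMH request URLs. The verb ListRecords and the arguments metadataPrefix, from, and
resumptionToken come from the OAI-PMH specification; the endpoint URL and the metadata prefix
"matharc" are assumptions made only for illustration and do not appear in the Math Arc design.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Builds OAI-PMH request URLs for a pull-style harvest (illustrative sketch only). */
class OaiPmhRequest {

    /** Joins a base URL, an OAI-PMH verb, and its arguments into a GET request URL. */
    static String build(String baseUrl, String verb, Map<String, String> args) {
        StringBuilder url = new StringBuilder(baseUrl).append("?verb=").append(encode(verb));
        for (Map.Entry<String, String> arg : args.entrySet()) {
            url.append('&').append(encode(arg.getKey()))
               .append('=').append(encode(arg.getValue()));
        }
        return url.toString();
    }

    /** Asks a data provider for records added or changed since the given UTC date. */
    static String listRecordsSince(String baseUrl, String metadataPrefix, String fromDate) {
        Map<String, String> args = new LinkedHashMap<String, String>();
        args.put("metadataPrefix", metadataPrefix);
        args.put("from", fromDate);
        return build(baseUrl, "ListRecords", args);
    }

    /** Continues an incomplete list; resumptionToken is sent without other arguments. */
    static String resume(String baseUrl, String resumptionToken) {
        Map<String, String> args = new LinkedHashMap<String, String>();
        args.put("resumptionToken", resumptionToken);
        return build(baseUrl, "ListRecords", args);
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical endpoint and metadata prefix, for illustration only.
        System.out.println(
            listRecordsSince("http://archive.example.org/oai", "matharc", "2005-09-01"));
    }
}
```

Because the harvesting archive initiates every request, a third archive could reuse the same
calls against either existing data provider without any change on the provider's side.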


The Math Arc metadata format
(http://www.library.cornell.edu/dlit/MathArc/web/resources/MathArc_metadataschema031a.doc):
We defined how our chosen subset of PREMIS elements would be expressed in the METS
framework. We also added attributes to the metadata that will control the automated operation of
the exchange system. Each local archive will parse the initial metadata that it receives through
OAI-PMH. Depending on the values of certain elements, the system will issue further OAI-PMH
requests to carry out the exchange and to ingest the assets into the local system.
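
To make the idea of expressing PREMIS elements inside the METS framework concrete, the
following Java sketch builds a small METS document that wraps one PREMIS object identifier.
It is not the Math Arc schema: the choice of PREMIS elements, the PREMIS namespace URI, the
ID value, and the identifier "asset-0001" are all assumptions made for illustration, and the
result is not validated against either schema.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Wraps a PREMIS object identifier in a METS techMD section (illustrative sketch only). */
class MetsPremisSketch {
    static final String METS_NS = "http://www.loc.gov/METS/";
    // Assumption: stand-in namespace URI; the PREMIS schema version used is not stated.
    static final String PREMIS_NS = "http://www.loc.gov/standards/premis";

    /** Builds mets > amdSec > techMD > mdWrap > xmlData > premis:object. */
    static Document buildAsset(String idType, String idValue) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No XML parser available", e);
        }
        Element mets = doc.createElementNS(METS_NS, "mets:mets");
        doc.appendChild(mets);

        Element amdSec = child(doc, mets, METS_NS, "mets:amdSec");
        Element techMd = child(doc, amdSec, METS_NS, "mets:techMD");
        techMd.setAttribute("ID", "premis-object-1"); // hypothetical section ID
        Element mdWrap = child(doc, techMd, METS_NS, "mets:mdWrap");
        mdWrap.setAttribute("MDTYPE", "PREMIS");
        Element xmlData = child(doc, mdWrap, METS_NS, "mets:xmlData");

        // The PREMIS part: one object with a single identifier.
        Element object = child(doc, xmlData, PREMIS_NS, "premis:object");
        Element identifier = child(doc, object, PREMIS_NS, "premis:objectIdentifier");
        child(doc, identifier, PREMIS_NS, "premis:objectIdentifierType").setTextContent(idType);
        child(doc, identifier, PREMIS_NS, "premis:objectIdentifierValue").setTextContent(idValue);
        return doc;
    }

    /** Creates a namespaced element and appends it to the given parent. */
    private static Element child(Document doc, Element parent, String ns, String qName) {
        Element element = doc.createElementNS(ns, qName);
        parent.appendChild(element);
        return element;
    }

    public static void main(String[] args) throws Exception {
        Document doc = buildAsset("local", "asset-0001"); // hypothetical identifier
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.transform(new DOMSource(doc), new StreamResult(System.out));
    }
}
```

A receiving archive could read such a wrapper with the same DOM API and decide, from the
values it finds, which further OAI-PMH requests to issue.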

Math Arc Scenarios
Our first step in creating the design was to identify the scenarios in which we imagined the Math
Arc system would be operating. At the end of our first year we created a preliminary list of “what
if” situations. Throughout the first part of the second year, we described these scenarios and the use
cases that occur in each. Our understanding of the system grew as we refined our definitions. To
refine these scenarios we met on three occasions with our colleagues from Göttingen State and
University Library (GSUB). We held face-to-face meetings at GSUB in December 2004, at the
Digital Library Federation Forum in San Diego in April 2005, and again in Göttingen earlier this
month. All of these meetings were very fruitful; we broke through the barriers of cross-cultural
co-operation (both national and institutional) to produce a picture of the commonalities between
our archives and a description of the steps by which the individual systems can communicate and
exchange resources. Our most current understanding is captured in the working document "Math
Arc Scenarios" (http://www.library.cornell.edu/dlit/MathArc/web/resources/MathArcScenarios.doc),
which describes all fifteen scenarios and the use cases we developed to guide the cooperative
management of our two preservation systems.




System Design
Because the Math Arc system is based on the Open Archival Information System (OAIS) Reference
Model, its components can be categorized into the OAIS Functional Entities: Ingest, Data
Management, Archival Storage, Access, Administration, and Preservation Planning, as shown in
Figures 1 and 2.

Producer and Consumer Roles:
In our system, each archive can take the role of Producer at some times and Consumer at others.

Archival Storage:
Archival Storage is largely a hardware issue. Each archive provides its own Archival Storage
system, the design and implementation of which is a local matter and therefore outside the scope of
the project. Suffice it to say that we are considering two such storage systems, one from EMC Corp.
and one from Sun Microsystems.

Data Management:
Some of the metadata in the system is about the institutions and the archives, not about the assets.
We have designed the system to store rights management information, institutional commitments,
and preservation policies in a database separate from the individual assets. These higher-level
attributes will be stored as machine-readable tagged files in a distributed, peer-to-peer network. The
software that runs the LOCKSS system, http://www.lockss.org, permits us to create a network of
low-cost computers that can store and automatically maintain this metadata. We are currently
working with the LOCKSS system designers at Stanford University to modify the LOCKSS system
for our purposes.
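
One way to picture how a peer-to-peer network can automatically maintain replicated files is a
majority vote over content digests. The Java sketch below is a deliberately simplified
illustration and is not the LOCKSS polling protocol: the class name, the strict-majority rule,
and the sample policy strings are all assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Detects and repairs a damaged replica by majority vote (simplified illustration). */
class ReplicaVote {

    /** Returns the SHA-256 digest of the content as a lowercase hex string. */
    static String sha256Hex(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is required on every Java platform", e);
        }
    }

    /** Returns the digest held by a strict majority of replicas, or null if there is none. */
    static String majorityDigest(List<byte[]> replicas) {
        Map<String, Integer> votes = new HashMap<String, Integer>();
        for (byte[] replica : replicas) {
            String digest = sha256Hex(replica);
            Integer count = votes.get(digest);
            votes.put(digest, count == null ? 1 : count + 1);
        }
        for (Map.Entry<String, Integer> vote : votes.entrySet()) {
            if (vote.getValue() * 2 > replicas.size()) {
                return vote.getKey();
            }
        }
        return null;
    }

    /** Overwrites replicas that disagree with the majority; returns how many were repaired. */
    static int repair(List<byte[]> replicas) {
        String winner = majorityDigest(replicas);
        if (winner == null) {
            return 0; // no agreement, so nothing can be repaired safely
        }
        byte[] good = null;
        for (byte[] replica : replicas) {
            if (sha256Hex(replica).equals(winner)) {
                good = replica;
                break;
            }
        }
        int repaired = 0;
        for (int i = 0; i < replicas.size(); i++) {
            if (!sha256Hex(replicas.get(i)).equals(winner)) {
                replicas.set(i, good.clone());
                repaired++;
            }
        }
        return repaired;
    }

    public static void main(String[] args) {
        List<byte[]> replicas = new ArrayList<byte[]>();
        replicas.add("policy-v1".getBytes(StandardCharsets.UTF_8));
        replicas.add("policy-v1".getBytes(StandardCharsets.UTF_8));
        replicas.add("policy-vX".getBytes(StandardCharsets.UTF_8)); // simulated corruption
        System.out.println("Replicas repaired: " + repair(replicas));
    }
}
```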




Fig. 1 A high-level diagram of the OAIS system attributes

Administration:
The system has two levels of administration: the human and the machine. At the machine level,
the metadata format and the exchange protocol manage transactions between Producer, Consumer,
and the Archive. At the human level, administrative decisions are made through co-operation
between the people at the participating institutions.

Preservation Planning:
As with Administration, co-operation is the mechanism for Preservation Planning. Some of the
Scenarios describe the procedures we will use for implementing changes in preservation policy and
technology.

Figure 2 shows another view of the Math Arc system, a modification of the highest-level diagram
of an OAIS system, adapted to show one archive taking the role of both Producer and Consumer.
The second archive takes the role of the OAIS. Both contribute to Data Management. The roles can
be switched, with either archive serving as the archive for the other’s resources.

Ingest and Access:
The OAI-PMH mechanism at each archive is the key to the Ingest and Access systems and is
shown in more detail in Figure 3.
Fig. 2 A diagram of the distributed archives and their functions

Components labeled in Fig. 2 (PA = Partner Archive, OA = Original Archive):
    o <<Administration>>: PA, OA
    o <<Preservation Planning>>: PA, OA
    o <<Data Management>>: Registry {LOCKSS}
    o <<Producer>>: Original Archive
    o <<Consumer>>: Original Archive
    o <<Ingest>>: Partner Archive
    o <<Archival Storage>>: Partner Archive
    o <<Access>>: Partner Archive
    o {OAI-PMH}: shown on both the Producer/Ingest side and the Consumer/Access side

Figure 3 shows how the individual partners will provide the functionality described in the OAIS
reference model (local ingest, archival storage, persistent ID resolvers, etc.). The functionality
that a Partner Archive provides to an Original Archive is defined by the agreements called for
and established in our scenarios and associated policy agreements.


Partner Archive components labeled in Fig. 3: Ingest Processor, OAI-PMH Harvester Service,
Archival Storage, Local URL Resolver.

The Partner Archive's (PA) Ingest Processor uses its OAI-PMH harvester to communicate with the
Original Archive's (OA) OAI-PMH Data Provider Service. The Ingest Processor requests
asset-description metadata from the OA. The OA checks rights and commitment metadata in the
MathArc Registry and returns a list of appropriate assets. The PA then requests individual files
from the OA. After performing quality assurance checks, the Ingest Processor stores the Asset in
the PA storage, mapping local addresses to persistent identifiers in the Local URL Resolver.

When an Original Archive requests its own Assets from a Partner Archive, the roles are reversed.
(Each archive can perform all request and reply functions and can create Dissemination
Information Packages and Submission Information Packages from its stored Archival Information
Packages.)




Original Archive components labeled in Fig. 3: Access Processor, OAI-PMH Data Provider Service,
Archival Storage, Local URL Resolver.

MathArc Registry (LOCKSS cache/server): At least 8 computers in a peer-to-peer network share
machine-readable rights metadata and metadata about institutional commitments and policies. The
LOCKSS software provides a self-repairing mechanism, insuring against data corruption. Because
the network is distributed among many machines, each replicating the data on the others, the
network is owned by all participants.

Fig. 3 Component diagram showing a high-level view of how the system modules work together.
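
The harvesting sequence annotated in Figure 3 can be sketched in Java with in-memory stand-ins
for the components. Every class, method, and address format below is hypothetical: real
OAI-PMH traffic, the registry's data model, and the quality assurance checks are replaced with
the simplest placeholders that preserve the order of the steps.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** In-memory walk-through of the Figure 3 ingest sequence (illustrative sketch only). */
class IngestSketch {

    /** Stand-in for the MathArc Registry: records which partner may receive which asset. */
    static class Registry {
        private final Set<String> permitted = new HashSet<String>();
        void permit(String assetId, String partner) { permitted.add(assetId + "|" + partner); }
        boolean mayShare(String assetId, String partner) {
            return permitted.contains(assetId + "|" + partner);
        }
    }

    /** Stand-in for the Original Archive and its OAI-PMH Data Provider Service. */
    static class OriginalArchive {
        private final Registry registry;
        private final Map<String, byte[]> files = new LinkedHashMap<String, byte[]>();
        OriginalArchive(Registry registry) { this.registry = registry; }
        void add(String persistentId, byte[] content) { files.put(persistentId, content); }

        /** Checks the registry and lists only the assets the partner may harvest. */
        List<String> listAssets(String partner) {
            List<String> ids = new ArrayList<String>();
            for (String id : files.keySet()) {
                if (registry.mayShare(id, partner)) {
                    ids.add(id);
                }
            }
            return ids;
        }

        /** Returns one requested file, or null if the identifier is unknown. */
        byte[] getFile(String persistentId) { return files.get(persistentId); }
    }

    /** Stand-in for the Partner Archive: Ingest Processor, storage, and Local URL Resolver. */
    static class PartnerArchive {
        final String name;
        final Map<String, byte[]> storage = new HashMap<String, byte[]>();
        final Map<String, String> resolver = new HashMap<String, String>(); // persistent ID -> local address
        PartnerArchive(String name) { this.name = name; }

        /** Pulls the permitted assets, applies a placeholder QA check, and stores them. */
        List<String> harvest(OriginalArchive oa) {
            List<String> ingested = new ArrayList<String>();
            for (String id : oa.listAssets(name)) {
                byte[] content = oa.getFile(id);
                if (content == null || content.length == 0) {
                    continue; // placeholder for real quality assurance checks
                }
                String localAddress = "/store/" + name + "/" + storage.size();
                storage.put(localAddress, content);
                resolver.put(id, localAddress);
                ingested.add(id);
            }
            return ingested;
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.permit("asset-1", "PA");
        OriginalArchive oa = new OriginalArchive(registry);
        oa.add("asset-1", new byte[] {1, 2, 3});
        oa.add("asset-2", new byte[] {4}); // not permitted, so never offered to the partner
        PartnerArchive pa = new PartnerArchive("PA");
        System.out.println("Ingested: " + pa.harvest(oa));
    }
}
```

Reversing the roles, as the figure annotation describes, would amount to giving each archive
both the data provider and the harvester behavior.
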

Next Steps
Programming of CUL’s prototype Math Arc system will begin in October. The system will be
written in Java, with XML-tagged metadata, so that its source code can be widely distributed and
the system expanded and used by others. In the next month we will expand the functional
specifications outlined in the Activity diagrams included in the Scenarios. We will hire an
additional programmer in December or early January to help complete the testing and
implementation of the system.

						