Hong

1 Summarization
1.1 Background
     Chinese academic institutions hold a great many special subject collections with distinctive
features, such as cultural, local, and historical character. If these collections are digitized and
turned into special subject repositories serving all kinds of disciplines, they can together form a
group of special subject repositories.
     With this in mind, the China Academic Library & Information System (CALIS) [1] has funded
the project to construct national university special subject repositories from the start, and treats
it as one of its most important sub-projects. Over two phases, the 9th and 10th Five-Year Plans,
the repositories have made great progress.

1.2 System Architecture
     The CALIS special subject repositories system provides unified searching, data harvesting,
and a relatively uniform interface. The system consists of a “special subject repositories center
service system” and “local repositories systems”. The center service system harvests metadata
from the local systems and provides the user search interface; the local systems provide the
digital object service.
     In addition, as part of the CALIS repositories, the special subject repositories can share
information and communicate with CALIS and other systems. The center service system and the
local repositories systems have standard interfaces to the uniform search system, CALIS portal,
CALIS SSO, CALIS balance system, CALIS link resolver system, and CALIS statistics system. A
local system may or may not implement these interfaces, depending on its policy.
     The center system harvests metadata from the local systems, and the local systems provide
data access interfaces. The center system can call back to an object in a local system through the
OpenURL [2] / CALIS-OID resolution protocol. As part of the CALIS resources, the center system
can be invoked by the CALIS portal and the uniform search system.
     The system architecture is shown in Figure 1.
     [Figure 1 layout, top to bottom:
        CALIS portal / Uniform search system (CALIS-OID)
        CALIS special subject metadata repositories
        Metadata Harvest (OAI)
        Local system 1, Local system 2, …
        local user, local user, local user]
     Figure 1: Special subject repositories system architecture


1.3 Three main protocols
1.3.1 OAI [3] protocol
     The OAI Protocol for Metadata Harvesting (OAI-PMH) is the standard protocol for acquiring
metadata in a distributed network environment. The protocol defines standard interfaces through
which an external program or application can harvest metadata from a data provider server.
     The center service system harvests metadata from the local systems through the OAI protocol,
which involves an OAI Repository and an OAI Harvester. Each local system installs an OAI
Repository and configures its parameters. The OAI Harvester, installed in the center system,
sends a “harvest” request over HTTP. The OAI Repository in the local system selects the metadata
added or modified since a given timestamp and forms an XML package in response to the
harvester's request. The OAI Harvester receives and parses the XML package, then updates the
repositories of the center system.
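
     The harvesting exchange described above can be sketched as follows. This is a minimal
illustration, not the CALIS implementation: the base URL is hypothetical, the `oai_dc` metadata
prefix is an assumption, and the function names are invented for this example. It builds an
incremental `ListRecords` request and extracts record identifiers and datestamps from a reply.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of OAI-PMH 2.0 response documents.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_list_records_url(base_url, from_datestamp=None, metadata_prefix="oai_dc"):
    """Build the HTTP GET URL for an (optionally incremental) ListRecords request."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_datestamp:
        # Only records added or modified since this datestamp are returned.
        params["from"] = from_datestamp
    return base_url + "?" + urlencode(params)


def parse_headers(xml_text):
    """Return (identifier, datestamp) pairs for every record in a reply."""
    root = ET.fromstring(xml_text)
    pairs = []
    for header in root.iterfind(".//oai:record/oai:header", OAI_NS):
        identifier = header.findtext("oai:identifier", namespaces=OAI_NS)
        datestamp = header.findtext("oai:datestamp", namespaces=OAI_NS)
        pairs.append((identifier, datestamp))
    return pairs


if __name__ == "__main__":
    # Hypothetical local-system endpoint; no request is actually sent here.
    print(build_list_records_url("http://local.example/oai", "2006-01-01"))
```

     A real harvester would also follow the resumption tokens that a repository returns for
large result sets; that step is omitted here.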

1.3.2 METS[4] standard
     METS stands for Metadata Encoding & Transmission Standard. It is provided by the Library of
Congress as an XML-based framework for encoding, describing, and managing digital library
objects.
     Using the METS standard, the metadata of a digital resource can be packaged together,
including descriptive, administrative, structural, rights, and other metadata, for use in
searching, preserving, and serving the digital object. METS data can be exchanged with the data
objects of other systems. If digital resources are described in METS, they can be used
conveniently in many kinds of systems.
     The CALIS digital library project covers not only the metadata of digital objects but also the
digital objects themselves. Maintaining, exchanging, and transferring digital objects differs
considerably from doing so for metadata, so it is necessary to specify the encoding format,
organizational format, and network interface for digital objects. In this protocol, the items
exchanged are digital objects, separately from the exchange of metadata. The OAI protocol mainly
establishes a standard for metadata exchange, while the METS protocol specifies the digital
object that corresponds to the metadata harvested through OAI. The object can be represented in
many formats, such as PDF files, Word files, pictures, AVI or MPG multimedia files, or an ordered
collection of files.
     METS was developed as an initiative of the Digital Library Federation. The Network
Development and MARC Standards Office of the Library of Congress is currently responsible for
its maintenance.
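
     To make the idea of a METS wrapper concrete, the sketch below builds a very small METS
document that lists the files of one digital object. It is an illustration only: the function name,
the use of a CALIS-OID as the `OBJID` value, and the example URL are assumptions, and the output
carries none of the descriptive or administrative metadata sections a production package would
normally include.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"


def build_mets(object_id, file_urls):
    """Return a minimal METS document (as text) pointing at the given file URLs."""
    ET.register_namespace("mets", METS_NS)
    ET.register_namespace("xlink", XLINK_NS)

    mets = ET.Element(f"{{{METS_NS}}}mets", {"OBJID": object_id})
    file_sec = ET.SubElement(mets, f"{{{METS_NS}}}fileSec")
    file_grp = ET.SubElement(file_sec, f"{{{METS_NS}}}fileGrp")
    struct_map = ET.SubElement(mets, f"{{{METS_NS}}}structMap")
    div = ET.SubElement(struct_map, f"{{{METS_NS}}}div")

    for n, url in enumerate(file_urls, start=1):
        file_id = f"FILE{n}"
        # Each file entry records where the content can be fetched.
        file_el = ET.SubElement(file_grp, f"{{{METS_NS}}}file", {"ID": file_id})
        ET.SubElement(
            file_el,
            f"{{{METS_NS}}}FLocat",
            {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": url},
        )
        # The structural map references every file in order.
        ET.SubElement(div, f"{{{METS_NS}}}fptr", {"FILEID": file_id})

    return ET.tostring(mets, encoding="unicode")


if __name__ == "__main__":
    print(build_mets("urn:CALIS:pul-ETD/S02024", ["http://local.example/S02024.pdf"]))
```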

1.3.3 CALIS-OID
     To solve the problem of naming digital objects in the special subject repositories, CALIS
established a digital object identifier standard. At present, DOI [5] is the most mature digital
object identifier in terms of management, registration, and resolution. However, whether CALIS
applied DOI to its own resources or applied to become a registration agency (RA), it would have
to pay substantial membership fees or registration and maintenance fees. We therefore decided to
base the CALIS digital object identifier standard on an internationally common naming scheme:
       Name space + register code + resource identifier
     Whichever organization an identifier is registered with, the register code is defined by the
registering body itself. Once the register code is added, the identifier becomes an international
identifier. The identifier can be converted quickly to the naming schemes of the relevant
organizations in common use today, which meets the needs of system extensibility and
compatibility.
     The CALIS digital object identifier standard is required to conform to the URN [6] standard
as a subset of URN. The syntax is as follows:
urn:CALIS:LibraryCode-CollectionName[.CollectionName]/ObjID.type.format
     Note the case of the name space, urn (lowercase), and of the register code, CALIS (uppercase).
     A CALIS-OID should be fewer than 255 characters long.
     For example, for a degree thesis of Peking University:
       The complex object of the thesis: urn:CALIS:pul-ETD/S02024
       The first 24 pages: urn:CALIS:pul-ETD/S02024.P.PDF
       The full text: urn:CALIS:pul-ETD/S02024.T.DOC
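
     The syntax above can be checked mechanically. The parser below is a sketch under stated
assumptions: it treats the `.type.format` suffix as optional (the complex-object example has none),
assumes that ObjID, type, and format contain no dots or slashes, and uses names (`CalisOid`,
`parse_calis_oid`) invented for this illustration.

```python
import re
from typing import NamedTuple, Optional, Tuple


class CalisOid(NamedTuple):
    library: str
    collections: Tuple[str, ...]
    obj_id: str
    obj_type: Optional[str]
    obj_format: Optional[str]


# urn:CALIS:LibraryCode-CollectionName[.CollectionName]/ObjID[.type.format]
_OID_RE = re.compile(
    r"^urn:CALIS:(?P<library>[^-/.\s]+)-(?P<collections>[^/\s]+)"
    r"/(?P<obj_id>[^./\s]+)(?:\.(?P<type>[^./\s]+)\.(?P<format>[^./\s]+))?$"
)


def parse_calis_oid(oid):
    """Split a CALIS-OID into its parts; raise ValueError if it is malformed."""
    if len(oid) >= 255:
        raise ValueError("CALIS-OID must be shorter than 255 characters")
    match = _OID_RE.match(oid)
    if match is None:
        # The match is case-sensitive: 'urn' lowercase, 'CALIS' uppercase.
        raise ValueError(f"not a CALIS-OID: {oid!r}")
    return CalisOid(
        library=match["library"],
        collections=tuple(match["collections"].split(".")),
        obj_id=match["obj_id"],
        obj_type=match["type"],
        obj_format=match["format"],
    )


if __name__ == "__main__":
    print(parse_calis_oid("urn:CALIS:pul-ETD/S02024.P.PDF"))
```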

2 Main functions and implementation of the special subject repositories
     The special subject repositories system is designed with a two-tier architecture: the
“special subject repositories center service system” and the “local systems of the constructing
units”. The functions and implementation of the two systems are described below.
2.1 The local system
     The local system mainly handles resource digitization, metadata processing, publishing, and
other related tasks. Within the special subject repositories it plays the role of processing data
and serving digital objects, and it also has the full functionality of a database system.
     Because the local system provides the digital object service, it must be able to interact with
the CADLIS project. Authenticated and authorized users can access the digital objects of a local
system directly from the CADLIS portal and the special subject repositories center service
system. The local system therefore needs to integrate with the CADLIS authentication center,
CADLIS charging center, resource adjusting center, and CALIS-OID digital object identifier
resolution system, which brings the local system into the integrated CADLIS digital library
system.
     The local system mainly includes the following modules.
2.1.1 Processing and publishing system
     The processing and publishing system has three parts: object data processing, metadata
processing, and data publishing. Developers of these systems need to consider validity checking
and convenience for the user.
2.1.2 OAI Repository data providing system
     The OAI protocol defines only six request types (verbs). A regular system is still needed to
handle harvesting management and configuration, so an OAI management system must be set up in
both the local system and the center service system.
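
     For reference, the six OAI-PMH verbs and their required arguments can be written down as a
small table. The checker below is a simplified sketch of the argument validation a data provider
performs; the function name is invented, and it only tests for required and exclusive arguments,
returning the OAI-PMH error codes `badVerb` and `badArgument`.

```python
# The six OAI-PMH verbs, each mapped to its required arguments.
OAI_VERBS = {
    "Identify": set(),
    "ListMetadataFormats": set(),
    "ListSets": set(),
    "ListIdentifiers": {"metadataPrefix"},
    "ListRecords": {"metadataPrefix"},
    "GetRecord": {"identifier", "metadataPrefix"},
}

# Verbs whose list responses may be continued with a resumptionToken.
RESUMABLE = {"ListSets", "ListIdentifiers", "ListRecords"}


def check_request(params):
    """Return an OAI-PMH error code for a request, or None if it looks valid."""
    verb = params.get("verb")
    if verb not in OAI_VERBS:
        return "badVerb"
    if "resumptionToken" in params:
        # resumptionToken is exclusive: no other argument may accompany it.
        if verb in RESUMABLE and set(params) == {"verb", "resumptionToken"}:
            return None
        return "badArgument"
    missing = OAI_VERBS[verb] - set(params)
    return "badArgument" if missing else None


if __name__ == "__main__":
    print(check_request({"verb": "ListRecords", "metadataPrefix": "oai_dc"}))
```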
     Each CALIS special subject repositories local system should deploy an OAI-DP data providing
server to answer harvesting requests from the OAI harvesting server of the special subject
repositories center.
     The OAI-DP in the local system must follow the OAI-PMH protocol and the relevant parts of
the “CALIS OAI harvesting server designing reference” to ensure smooth communication with the
OAI harvesting server of the center system.
     The OAI-DP providing server includes the following modules:
(1) OAI Harvester register module:
     Registers the OAI harvesting server of the special subject repositories center system and
any other OAI harvesters.
(2) OAI Harvester inquiry and management module:
     Provides configuration, querying, listing, adding, deleting, modifying, disabling, and other
functions for OAI Harvesters, and monitors their status.
(3) Configuration module:
     The configuration information of the OAI-DP includes the metadata types provided, the
mapping from non-DC metadata formats to DC, etc.
(4) OAI harvesting log and statistics:
     Records the time, quantity, status, and error information of metadata harvesting. The
administrator keeps track of the OAI-DP data providing server through these statistics.

2.1.3 METS transmitting server
     The METS standard says nothing about network service interfaces. With reference to current
network processing models, and considering reliability, compatibility, security, independence,
and performance when transmitting objects, we decided to adopt Message Queue (MQ) technology,
which is well recognized in industry. As a result, network concerns such as security, resumable
transmission, error recovery, and log management need not be handled by the application when
transmitting digital objects.
     Mature MQ systems at present include JMS (Java Message Service) on the Java platform, MSMQ
(Microsoft Message Queuing) on the Windows platform, and IBM WebSphere MQ. We recommend the
first two, which are free.
     The METS server has the following modules:
  (1) METS register module: Registers the MQ mapping with the special subject repositories center.
  (2) METS responding module: Responds to requests from the special subject repositories center
    system and packages the corresponding object data in METS format for sending to the MQ
    system.
     The most important part of implementing the METS server is defining the message format in
MQ. For the definition details, please refer to the “CALIS digital object exchanging protocol
standard”. So far, most of the resources of the special subject repositories have been identified,
and each record has at most one simple digital object.
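
     The request and response flow of the METS responding module can be sketched with an
in-process queue. Everything here is a stand-in: Python's `queue.Queue` replaces a real MQ product
such as JMS or MSMQ, and the JSON message fields (`meta_ids`, `meta_id`, `found`, `mets`) are
hypothetical, since the actual message format is defined in the “CALIS digital object exchanging
protocol standard”, which is not reproduced here.

```python
import json
import queue


def respond(request_queue, reply_queue, object_store):
    """Answer every pending request with one reply message per requested MetaID.

    object_store maps a MetaID to the METS package (XML text) of its object.
    """
    while True:
        try:
            raw = request_queue.get_nowait()
        except queue.Empty:
            break  # no more pending requests
        request = json.loads(raw)
        for meta_id in request["meta_ids"]:
            package = object_store.get(meta_id)
            reply = {"meta_id": meta_id, "found": package is not None, "mets": package}
            reply_queue.put(json.dumps(reply))


if __name__ == "__main__":
    requests, replies = queue.Queue(), queue.Queue()
    requests.put(json.dumps({"meta_ids": ["S02024"]}))
    respond(requests, replies, {"S02024": "<mets/>"})
    print(replies.get_nowait())
```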
2.1.4 CALIS-OID resolve agent
     To give access to the digital objects stored in a local system, each special subject
repositories local system must have a CALIS-OID resolve agent that resolves CALIS digital object
identifiers and provides the download URL for the digital object.
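
     In its simplest form, a resolve agent is a lookup from identifier to download URL. The sketch
below assumes an in-memory table and hypothetical URLs; how a real agent stores its mappings and
handles access control is not described in this document.

```python
def resolve_oid(oid, location_table):
    """Return the download URL registered for a CALIS-OID, or None if unknown."""
    return location_table.get(oid)


if __name__ == "__main__":
    # Hypothetical mapping maintained by one local system.
    table = {
        "urn:CALIS:pul-ETD/S02024.P.PDF": "http://local.example/etd/S02024_preview.pdf",
        "urn:CALIS:pul-ETD/S02024.T.DOC": "http://local.example/etd/S02024_full.doc",
    }
    print(resolve_oid("urn:CALIS:pul-ETD/S02024.P.PDF", table))
```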

2.2 Special subject repositories center service system
     The main functions of the special subject repositories center service system include
providing resource search services for users, user management, logging and statistics, and so on.
The center service system can run independently, and it also integrates with the other CADLIS
systems to form the CADLIS digital library system together.
2.2.1 The website service system
     This includes the search interface, search functions (searching, browsing, and indexing),
search methods, search language and technology, the search result interface, and the search
history interface; access to full text and related services, such as full-text links (Full Text,
Link to), holdings links, ILL (Interlibrary Loan) links, and the CALIS-OID resolution system; and
personalization functions such as a user-designed web style, my search history, a user-defined
number of results per page, my favorites, and my personal information.
2.2.2 OAI harvesting server
     The CALIS special subject repositories center service system needs to set up an OAI
harvesting server to harvest metadata from the many local systems.
     The OAI harvesting server required by this system conforms to the “CALIS OAI harvesting
server designing reference” so that it can communicate effectively with the OAI Repositories in
the local systems of the constructing units.
     The OAI harvesting server has the following modules:
(1) OAI Repository register module:
     Registers the OAI-DP of each local system.
(2) OAI Repository inquiry and management module:
     Provides management functions for the OAI Repositories, such as querying, browsing, adding,
deleting, modifying, and disabling.
(3) OAI Repository configuration module:
     Configures the harvesting mechanism and error handling for each OAI Repository. The system
carries out the corresponding harvesting actions automatically and monitors the status of the
OAI Repositories.
(4) OAI harvesting log and statistics:
     Records the time, quantity, status, and errors of metadata harvesting. Through these
statistics the administrator can keep track of the state of the OAI Repositories and the OAI
harvesting server, and of metadata updates.
(5) Management of metadata harvesting:
     When metadata harvesting is halted by a network error, server error, hardware error,
software error, communication error, or other cause, the metadata must be harvested from the
local system again.
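
     The re-harvesting rule in module (5) amounts to retrying from the last successful datestamp.
The sketch below shows one way to express that; the function names, the retry limit, and the
choice to treat only `OSError` (which covers connection failures in Python) as retryable are
assumptions made for illustration.

```python
import time


def harvest_with_retry(harvest_once, last_success, max_attempts=3, delay_seconds=0.0):
    """Call harvest_once(last_success) until it succeeds or attempts run out.

    harvest_once takes the datestamp of the last successful harvest and returns
    the new datestamp to remember. Because a failed attempt never advances
    last_success, every retry asks for the same span of changes again.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return harvest_once(last_success)
        except OSError:
            if attempt == max_attempts:
                raise  # give up and let the administrator see the error
            time.sleep(delay_seconds)


if __name__ == "__main__":
    print(harvest_with_retry(lambda since: "2006-03-02", "2006-03-01"))
```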
2.2.3 METS receiving server
     Because it is based on the Message Queue technology widely recognized in industry, the
server does not need to deal with network concerns during transmission, such as security, error
recovery, and log management.

The METS receiving server includes these modules:

(1) METS request module:
    The system takes the MetaIDs of metadata from the OAI harvesting server and creates a
request message to the MQ system. One request can specify one or more MetaIDs, or a period of
time. These two kinds of message correspond to the harvesting messages in the OAI harvesting
server.

(2) METS receiving module:
    Fetches METS packages from the MQ queue, unpacks them, and saves the contents into resource
storage. If a METS object is too long for one package, the module merges the several METS
packages into one complete METS object before storing it.

(3) METS register module: Registers the MQ mapping of each local system.
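
     The merging step of the receiving module can be illustrated as reassembling numbered parts.
The message fields used here (`object_id`, `part`, `total`, `data`) are hypothetical, as is the
assumption that parts are numbered from 1; the real layout is defined by the CALIS exchange
protocol.

```python
def reassemble(messages):
    """Merge multi-part messages into complete METS texts, keyed by object id.

    Objects whose parts have not all arrived are left out of the result.
    """
    parts = {}
    totals = {}
    for msg in messages:
        parts.setdefault(msg["object_id"], {})[msg["part"]] = msg["data"]
        totals[msg["object_id"]] = msg["total"]

    complete = {}
    for object_id, chunks in parts.items():
        expected = range(1, totals[object_id] + 1)
        if set(chunks) == set(expected):
            # Join the chunks in part order, regardless of arrival order.
            complete[object_id] = "".join(chunks[i] for i in expected)
    return complete


if __name__ == "__main__":
    msgs = [
        {"object_id": "S02024", "part": 2, "total": 2, "data": "</mets>"},
        {"object_id": "S02024", "part": 1, "total": 2, "data": "<mets>"},
    ]
    print(reassemble(msgs))
```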

2.2.4 Log and statistics system
     The system records all operations, such as administrator operations, user logins, and other
access operations, and supports configuring, managing, maintaining, and converting the log
database. On the basis of these logs it also provides several statistics functions, such as
operation statistics, resource distribution statistics, and website access statistics.
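
     As a small illustration of deriving statistics from the log, the sketch below counts log
entries per operation. The entry layout (a dictionary with an `operation` field) and the
operation names are assumptions; the actual log schema is not given in this document.

```python
from collections import Counter


def operation_stats(log_entries):
    """Count how many times each operation appears in the log."""
    return Counter(entry["operation"] for entry in log_entries)


if __name__ == "__main__":
    log = [
        {"user": "admin", "operation": "configure"},
        {"user": "reader1", "operation": "login"},
        {"user": "reader2", "operation": "login"},
    ]
    print(operation_stats(log).most_common())
```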

3 Project development
     The two parts of the 10th Five-Year CALIS special subject repositories project were
developed in different ways.
     For the local system, the project group defined the main function modules, technology, data
processing standards, and interface standards. Commercial software companies then developed
software separately and independently. The project group staff tested the products against all
the relevant criteria and recommended to libraries the seven software companies whose products
met the standard. At present, dozens of libraries have adopted one of these seven products as
their local system.
     The special subject repositories center service system was developed jointly by the CALIS
technology center and Wuhan University library. The system environment is Linux, the Oracle 10g
database, the Java language, and the CALIS portal framework. After repeated use and testing, most
of the development tasks have been completed, including harvesting data from several local
systems. The URL of the center service website is http://162.105.139.52:8180/tskopac/.

References:
    1. CALIS, http://www.calis.edu.cn
    2. OpenURL, http://www.niso.org/committees/committee_ax.html
    3. Open Archives Initiative, http://www.openarchives.org/
    4. METS, http://www.loc.gov/standards/mets/
    5. DOI, http://www.doi.org/
    6. URN, http://www.w3.org/TR/uri-clarification/#urn-namespaces

				