Tone Merete Bruvik

HIT Centre, University of Bergen





XML in MALVINE and LEAF

MALVINE and LEAF. Gateways to Europe's Cultural Heritage

4th of December 2000 in the Staatsbibliothek zu Berlin

XML is changing from a buzzword for Internet freaks into a basic standard for the encoding of

document structures and the interchange of information. In the MALVINE project we have

made a converter that translates archive catalogues held in various local formats into one

common format: EAD (Encoded Archival Description). EAD is a DTD (Document Type

Definition) for XML documents, developed by the Society of American Archivists. We also

demonstrate how easily catalogues held in EAD can be transformed into other formats, for

instance into HTML or other local cataloguing formats. In the upcoming LEAF project we

will work, together with other groups, to develop and use a general DTD for the encoding of

biographical information in XML. A DTD in this field will make it easier to interchange and

harvest biographical information from various sources, which is one of the main goals of the

LEAF project.

Contents

What is XML?

How to use XML?

Why is XML important?

So what is XML really?

Demonstration of XML in the MALVINE project

XML in LEAF



What is XML?

Before talking about the use of XML in the MALVINE project, I will give a short

introduction to XML.

XML (Extensible Markup Language) is a W3C (World Wide Web Consortium)

Recommendation from February 1998 [XML 2000]. XML was developed from SGML

(Standard Generalized Markup Language) [ISO 8876].

XML looks similar to HTML, but XML is a meta language: a language to describe markup

languages.

Diagram 1. The relationship between SGML, XML, HTML and XHTML.

A markup language defines four things (from A Gentle Introduction to SGML, p. 13 in [TEI

P3]):

• What markup is allowed

• What markup is required

• How markup is to be distinguished from the content of the document

• What the markup means
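
The first two points can be made concrete with a small sketch. The Python below is not a DTD validator; it assumes a hypothetical rule table (RULES) with invented element names (page, title, para, list, item) and checks only which child elements are allowed and which are required. The third point, how markup is distinguished from content, is handled by the XML parser itself, which separates tags from text.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules, loosely mirroring what a DTD declares:
# for each element, which children may appear and which must appear.
RULES = {
    "page": {"allowed": {"title", "para", "list"}, "required": {"title"}},
    "list": {"allowed": {"item"}, "required": {"item"}},
    "title": {"allowed": set(), "required": set()},
    "para": {"allowed": set(), "required": set()},
    "item": {"allowed": set(), "required": set()},
}


def check(element):
    """Return a list of rule violations found in element and its descendants."""
    rule = RULES.get(element.tag)
    if rule is None:
        return [f"<{element.tag}> is not a known element"]
    problems = []
    children = [child.tag for child in element]
    for tag in children:
        if tag not in rule["allowed"]:
            problems.append(f"<{tag}> is not allowed inside <{element.tag}>")
    for tag in sorted(rule["required"]):
        if tag not in children:
            problems.append(f"<{element.tag}> requires a <{tag}>")
    # Only descend into children that were allowed in the first place.
    for child in element:
        if child.tag in rule["allowed"]:
            problems.extend(check(child))
    return problems


good = ET.fromstring("<page><title>Markup languages</title><para>Text</para></page>")
bad = ET.fromstring("<page><para>Text</para><list/></page>")

print(check(good))
print(check(bad))
```

A real DTD also fixes the order and number of child elements, which this sketch ignores.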







Why is XML important?

• Separates data from the software.

• Software provider independent.

• Simple and readable for both humans and computers.

• Easy to transform into other formats.

• The same information might be presented in many different ways.

• Robustness over time.



So what is XML really?



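This is a very simple XML document:

[XML listing; the element markup is not preserved. Only the closing "]>" of the document type declaration and the following text content remain:]

Markup languages

There are several text markup languages, for

example:

HTML (HyperText Markup Language)

XML (eXtensible Markup Language)

SGML (Standard Generalized Markup Language)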

This document does not contain any information about the layout; that information is kept in a

stylesheet. In this case an XSLT (Extensible Stylesheet Language Transformations) [XSLT]

stylesheet called “apage.xsl” is used. Displayed on the Web with the given XSLT stylesheet,

this XML document will look like this:
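
The separation of content and layout can be sketched without an XSLT processor. The Python below uses a hypothetical XML document (the element names page, title, para, list and item, and the small internal DTD, are assumptions and not the original listing) and two plain functions standing in for stylesheets. It is not XSLT; it only illustrates that one document can be given several presentations while the document itself stays free of layout information.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical reconstruction: element names and DTD are assumptions.
DOCUMENT = """\
<?xml-stylesheet type="text/xsl" href="apage.xsl"?>
<!DOCTYPE page [
  <!ELEMENT page (title, para, list)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT para (#PCDATA)>
  <!ELEMENT list (item+)>
  <!ELEMENT item (#PCDATA)>
]>
<page>
  <title>Markup languages</title>
  <para>There are several text markup languages, for example:</para>
  <list>
    <item>HTML (HyperText Markup Language)</item>
    <item>XML (eXtensible Markup Language)</item>
    <item>SGML (Standard Generalized Markup Language)</item>
  </list>
</page>
"""


def render_html(root):
    """Layout decisions live here, not in the document (the stylesheet's role)."""
    lines = [
        f"<h1>{escape(root.findtext('title'))}</h1>",
        f"<p>{escape(root.findtext('para'))}</p>",
        "<ul>",
    ]
    lines += [f"  <li>{escape(item.text)}</li>" for item in root.iter("item")]
    lines.append("</ul>")
    return "\n".join(lines)


def render_text(root):
    """A second presentation of the same content, as plain text."""
    lines = [root.findtext("title").upper(), root.findtext("para")]
    lines += [f"- {item.text}" for item in root.iter("item")]
    return "\n".join(lines)


root = ET.fromstring(DOCUMENT)
html_view = render_html(root)
text_view = render_text(root)
print(html_view)
print(text_view)
```

Changing the presentation means changing only the rendering function; the XML document is untouched. An XSLT stylesheet plays the same role declaratively.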









Demonstration of XML in the MALVINE project

Our SGML/XML feasibility study in the MALVINE project has shown that the various

catalogue formats used by the libraries and archives can be represented very well in XML or

SGML, and that the translation can be done with a rather simple computer programme. After

considering various DTDs we decided to use EAD (Encoded Archival Description),

developed by the Society of American Archivists [EAD 1998]. Our use of EAD has shown

that this DTD is very well suited for this kind of material. We have made the “Local catalogue

format to EAD converter” available on the web at

http://helmer.hit.uib.no/malvine/EADconverter.html. It works as shown in diagram 3:

[Diagram: Local Catalogue -> Export -> The EAD converter (with Conv. Table) -> XML]


Diagram 3. The EAD converter.
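
To make the role of the conversion table concrete, here is a minimal sketch in Python. It is not the MALVINE converter: the local field names (TITLE, DATE, AUTHOR, SHELF), the sample record, the 'FIELD: value' export format and the table itself are invented for illustration. The target names c, did, unittitle, unitdate and origination are EAD elements, but a real EAD document needs considerably more structure than is shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical conversion table: local field name -> EAD element name.
CONVERSION_TABLE = {
    "TITLE": "unittitle",
    "DATE": "unitdate",
    "AUTHOR": "origination",
}


def parse_local_record(text):
    """Read a made-up 'FIELD: value' export format into a dict."""
    record = {}
    for line in text.splitlines():
        if ":" in line:
            field, value = line.split(":", 1)
            record[field.strip()] = value.strip()
    return record


def to_ead_component(record, table=CONVERSION_TABLE):
    """Map a local record to an EAD-style <c><did>...</did></c> fragment."""
    component = ET.Element("c", level="item")
    did = ET.SubElement(component, "did")
    for field, value in record.items():
        ead_name = table.get(field)
        if ead_name is None:
            continue  # fields without a mapping are skipped in this sketch
        ET.SubElement(did, ead_name).text = value
    return component


# Invented sample export from a local catalogue.
EXPORT = """\
TITLE: Letter to a publisher
DATE: 1899
AUTHOR: Example, Anna
SHELF: Ms. 123
"""

component = to_ead_component(parse_local_record(EXPORT))
xml_text = ET.tostring(component, encoding="unicode")
print(xml_text)
```

Keeping the mapping in a table rather than in code means that supporting another local format is mostly a matter of supplying a new table and a reader for its export format.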





Samples of manuscript catalogues from some of the MALVINE data providers are given on the

page “Catalogues of manuscripts and letters, encoded in XML using EAD” at

http://helmer.hit.uib.no/malvine/EADpage.html. On this page we also demonstrate the power of

XSLT (Extensible Stylesheet Language Transformations), which is used to show different

views of the same XML-encoded catalogue.

The EAD converter we have made is used in the MALVINE project whenever a conversion

from one format to another is needed. For instance, some of the collections available in the

MALVINE cluster do not support the Z39.50 protocol. In order to make these catalogues

available through the MALVINE Search Engine, they are converted, using our EAD

converter, from the locally kept catalogue into a new catalogue hosted by someone who has

the Z39.50 protocol available.



[Diagram labels: Local Catalogue without Z39.50; The EAD Converter; EAD File; XSLT; XSLT Processor; Copy of Catalogue; Local Catalogues with Z39.50; MALVINE search engine]


Diagram 4. The EAD converter used in the MALVINE project.

XML in LEAF

In the upcoming LEAF project, we at the HIT Centre at the University of Bergen will develop

and use a general DTD for the encoding of biographical information in XML. This work will

be done together with other groups working in the same field. We hope to work together with

the community behind the development of EAD, especially Daniel V. Pitti at the University of

Virginia, who is the main architect of EAD. He pointed out the need for a new DTD to cover

biographical information in his paper “Encoded Archival Description, An Introduction and

Overview” [Pitti 1999]. This DTD will be based on ICA’s International Standard Archival

Authority Record for Corporate Bodies, Persons, and Families (ISAAR(CPF))

[ISAAR(CPF)].

A DTD like this will be a grammar for a general language to express the information used in

any kind of biographical record. This DTD will not replace the various local formats; it will

only make communication between the various systems easier.

A DTD in this field will make it easier to interchange and harvest biographical information

from various sources, which is one of the main goals of the LEAF project. In the same way as

we have used XML to interchange archival records in the MALVINE project, we will use

XML to interchange information about persons, families and corporate bodies in the LEAF

project.
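
The idea of a common format sitting between local systems can be sketched as follows. Everything here is hypothetical: the shared element names (records, record, name, dates), the two local export formats and the sample entries are invented, since the DTD described above does not yet exist. The sketch only shows that once each system can map its own format to one shared XML form, harvesting from several sources becomes a simple merge.

```python
import xml.etree.ElementTree as ET


def shared_record(entity_type, name, dates=""):
    """Build one record in a hypothetical shared format (not a real DTD)."""
    record = ET.Element("record", type=entity_type)
    ET.SubElement(record, "name").text = name
    if dates:
        ET.SubElement(record, "dates").text = dates
    return record


def from_system_a(row):
    """System A (invented) exports persons as dicts with 'navn' and 'aar' keys."""
    return shared_record("person", row["navn"], row.get("aar", ""))


def from_system_b(line):
    """System B (invented) exports 'type|name|dates' text lines."""
    entity_type, name, dates = line.split("|")
    return shared_record(entity_type, name, dates)


def harvest(records):
    """Collect shared records from any source under one root element."""
    root = ET.Element("records")
    root.extend(records)
    return root


harvested = harvest([
    from_system_a({"navn": "Example, Anna", "aar": "1850-1920"}),
    from_system_b("family|Example family|"),
    from_system_b("corporate body|Example Publishing House|1870-1930"),
])

names = [record.findtext("name") for record in harvested.iter("record")]
persons = harvested.findall("record[@type='person']")
print(names)
print(ET.tostring(harvested, encoding="unicode"))
```

Neither local system has to change its own format; each needs only one mapping to the shared form, rather than one mapping per partner system.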



Conclusion

The use of XML might look very technical. In one way it is, and in one way not. Although

documents encoded in XML might look very technical, they are really very, very simple

compared to, for example, the document format used in Microsoft Word! XML is about having

control over your own documents, independent of the software provider.



References



[TEI P3] TEI Guidelines for Electronic Text Encoding and Interchange, edited by C. M.

Sperberg-McQueen and Lou Burnard, Chicago, Oxford 1994. Also available at

http://www.tei-c.org/Guidelines2/



[ISO 8876] International Organization for Standardization, ISO 8879: Information

processing - Text and office systems - Standard Generalized Markup Language

(SGML), 1986.

[XSLT] XSL Transformations (XSLT) Version 1.0, W3C Recommendation 16 November

1999, http://www.w3.org/TR/1999/REC-xslt-19991116

[XML 2000] Extensible Markup Language (XML) 1.0 (Second Edition),

http://www.w3.org/TR/2000/REC-xml-20001006

[Pitti 1999] Pitti, Daniel V.: “Encoded Archival Description, An Introduction and Overview”,

D-Lib Magazine, Vol. 5 No. 11, November 1999, available at

http://www.dlib.org/dlib/november99/11pitti.html

[EAD 1998] EAD Encoded Archival Description Tag Library, Version 1.0, The Society of

American Archivists, Chicago 1998. Also available at http://lcweb.loc.gov/ead/eadtlweb.html

[ISAAR(CPF)] International Council on Archives, International Standard Archival Authority

Record for Corporate Bodies, Persons, and Families (ISAAR(CPF)), Ottawa

1996. Available at http://dobc.unipv.it/obc/add/infap/archdes/isaar_e.html


