Fedora: a Digital Object Repository

Fedora
Selecting and Implementing an Open Source Software Digital Repository

Jon Dunn
Digital Library Program, Indiana University
RLG Members’ Forum, December 12, 2003

Outline
- What is a repository and why do we need it?
- Background on IU environment
- Background on Fedora
- Fedora Digital Object Model
- The Fedora Architecture
- Fedora use at IU: EVIADA
- Future Fedora use

Why a repository?
- Isn’t what we have good enough?
  - Web servers, delivery systems
  - File servers
  - Databases
  - Hierarchical storage systems
- Why do libraries need repositories?

A digital object is more than just a file!
Example: Electronic Book
- Metadata
- Delivery page image files (JPEG)
- Hi-res page image files (TIFF)
- Text file (TEI/XML)

A digital object is more than just a file!
Example: Archival Collection
- EAD Finding Aid
- DL Objects

Digital library “objects” have many parts
- Metadata: descriptive, administrative, structural, preservation, …
- Preservation/archival files (several)
- Delivery files (several)

How do we keep them connected and organized?
- Now: good practice in file naming, directory organization, project documentation (not scalable!)
- Future: digital object repository

Repository Purposes
- Access
  - Web access to digital files and metadata
  - Services/applications for searching, browsing, transformation, etc.
- Preservation
  - Secure storage for digital files and metadata
  - Services for integrity checking, migration, conversion, etc.
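
As an illustration of what an "integrity checking" preservation service might involve, here is a minimal Python sketch of a fixity check. It is not taken from the talk or from Fedora: the manifest layout, the choice of SHA-256, the function names, and the file names (`page001.tif`, `page001.jpg`, `book.tei.xml`) are all assumptions made for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(manifest: dict, root: Path) -> dict:
    """Compare each file under root with the digest recorded for it.

    manifest maps a relative file name to its expected hex digest
    (a hypothetical layout chosen for this sketch).
    Returns a mapping of file name to "ok", "changed", or "missing".
    """
    report = {}
    for relative_path, expected in manifest.items():
        target = root / relative_path
        if not target.is_file():
            report[relative_path] = "missing"
        elif sha256_of(target) == expected:
            report[relative_path] = "ok"
        else:
            report[relative_path] = "changed"
    return report


def demo() -> dict:
    """Build a tiny e-book object on disk, damage it, and run the check."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "page001.tif").write_bytes(b"hi-res page image")
        (root / "page001.jpg").write_bytes(b"delivery page image")
        manifest = {
            name: sha256_of(root / name)
            for name in ("page001.tif", "page001.jpg")
        }
        # A file that was recorded in the manifest but never stored.
        manifest["book.tei.xml"] = "0" * 64
        # Simulate silent corruption of the delivery image.
        (root / "page001.jpg").write_bytes(b"silently altered")
        return check_fixity(manifest, root)


if __name__ == "__main__":
    for name, status in demo().items():
        print(name, status)
```

A check like this only works if the digests are recorded when the object enters the repository, which is one reason the files and their metadata need to be managed together.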

Data Persistence
- Key is migration
- Keeping the bits alive
  - Physical media
  - Logical media format
- Keeping the bits understandable
  - File format
  - Metadata
- Small “pockets” of digital content pose a problem for migration

DL Object Repository
[Diagram. Labels: Users and Applications: Access and Management; Repository System; Preservation version in MDSS; Delivery version(s) on web server; Metadata records]

Motivation for a Digital Repository at Indiana University
- Many pockets of digital content and metadata
- Difficult to sustain
  - Variable tech support, replacement funding
  - Harder to preserve, migrate data forward to new software and hardware
  - Harder to budget for
- Difficult to build common services and applications
  - Cross-collection search
  - Standard interfaces for viewing and playing content
  - Interfaces to course management and other IT services
  - OAI data providers
  - Preservation services (integrity checks, etc.)

Not a New Model…
- Digital Repository
  - Common system for storing, managing, and providing access to digital content and metadata
- Integrated Library System
  - Common system for storing, managing, and providing access to MARC records

“Digital Repository” vs. “Institutional Repository”
- Digital repository
  - Common storage for digital content and metadata
  - Basic infrastructure component: “plumbing”
- Institutional repository
  - Often implies focus on one application: institutional content, research output
  - e.g. MIT DSpace: “capture, store, index, preserve, and redistribute the intellectual output of a university’s research faculty in digital formats”

Background: IU Digital Library Program
- Mission: “…dedicated to the production, maintenance, distribution, and preservation of a wide range of high quality networked information resources for scholars and students at Indiana University and elsewhere”

IU Digital Library Program
- Established in 1997
- Collaborative venture:
  - University Libraries (IUL)
  - University Information Technology Services (UITS)
  - School of Library and Information Science (SLIS)
  - School of Informatics
- Funding provided by Libraries and UITS
- University-wide responsibility: 8 campuses
- Responsibility beyond just the Libraries

IU Digital Library Program: Areas of Responsibility
- Digital conversion
- Metadata
- Usability / UI design
- Infrastructure
- Software development
- DL research
- Both direct involvement and consulting roles

IU Digital Library Program: Staff
- 12.5 full-time equivalent (FTE) permanent staff
  - 3 librarians
  - 9 professional staff: IT, digital conversion, UI/usability
  - 1 support staff (.5 FTE)
- 10 grant-funded IT staff
- Student staff, including graduate assistants and interns from the School of Library and Information Science and Computer Science

Object Types at IU
- Books
- Manuscripts
- Photographs
- Art images
- Music audio
- Video
- Sheet music
- Musical score images
- Music notation files
- …and more

Questions In Repository Planning at IU
- Scope
  - Just library?
  - Museums and archives?
  - All campuses?
  - Other digital content
    - Instructional (e.g. faculty materials in OnCourse)
    - Business (PR, Athletics, etc.)
- Funding model
- Standards
  - Minimum requirements for content formats and metadata
- Tools/services/applications
  - What else is needed to make a repository useful/usable for preservation and access?

Repository Evaluation Criteria
- Flexibility
  - Not a rigid data model
  - Support for many media types, complex digital objects
  - Not locked into one technology platform (OS, database)
- Extensibility
  - Use of modern technologies
  - Easy integration with other systems/tools
  - Means of extension/modification
  - Support for DL standards, particularly metadata
- Sustainability
- Supportability
- Cost

Fedora
- FEDORA: Flexible Extensible Digital Object and Repository Architecture

Fedora - Background
- Began as CS research project at Cornell (1997-98)
  - Architecture
  - Reference implementation
- UVa Libraries became interested (2000)
  - Trying to create a DL architecture
  - No commercial solutions found
- Mellon-funded project (2001-2003)
  - Joint UVa/Cornell project
  - Update technologies
  - Make use of relational database
  - Make more production-ready
  - IU member of “deployment group” engaged in testing

Fedora - Technical Environment
- Open Source software
- Written in Java
- OS Platforms:
  - Windows
  - Linux / Unix
  - Mac OS X (not yet officially supported)
- Database support:
  - MySQL
  - McKoi
  - Oracle8i, Oracle9i

What does Fedora do?
- Manages files or references to files that make up digital objects
- Manages associations between objects and interfaces
- Invokes behaviors of objects
- Basic DL “plumbing”

What does Fedora not do?
- Searching/browsing of metadata and content
- End-user UI for display/navigation of metadata and content
- Cataloging tools
- Preservation services
- …
- Fedora is DL “plumbing”… Not an out-of-the-box complete DL system

Fedora 1.2 Software Feature Set
- Open Fedora APIs
  - Repository as web services
- Flexible Digital Object Model
  - Content View: objects as bundle of items (content and metadata)
  - Service View: objects as a set of service methods (“behaviors”)
  - Extensible functionality by associating services with objects
- Repository System
  - Core Services: Management, Access/Search, OAI-PMH
  - Storage: XML object store; relational db object cache; relational db object registry
  - Mediation: auto-dispatching to distributed web services for content transformation
  - Auto-Indexing: system metadata and DC record of each object
  - HTTP Basic Authentication and Access Control
  - Built-in disseminator services: XSLT transform, image manipulation, XML-to-PDF
- Content Versioning
  - Automatic version control (saves version of content/metadata when modified)
  - Enables date-time stamped API requests (see object as it looked at a point in time)
- Clients
  - Fedora Administrator: GUI client to create/maintain objects
  - Default Web browser interface: search; access objects via default disseminator
  - Command line utilities (batch load, ingest, purge, others)
  - Migration Utility: mass export/ingest

The Fedora Object Model
[Diagram. Labels: Persistent ID (PID); Disseminators; System Metadata; Datastreams; Behavior Definition; Behavior Mechanism]
- PID: persistent unique identifier
- Datastreams: represent content or metadata
- System Metadata: manage and track the object in the system
- Disseminator(s): a service for transforming or presenting the object

Object Model Example: Image Objects
- Two File Image Object
  - Data
    - Hi Resolution Version: tif
    - Low Resolution Version: jpg
- MrSID File Image Object
  - Data
    - MrSID File

Basic Image Interface: Behavior Definitions
- getHighResolutionTIF
- getLowResolutionJPG

Implementations: Behavior Mechanisms
- Two File Image Object
  - getHighResolutionTIF: returns high resolution TIF
  - getLowResolutionJPG: returns low resolution JPG
- MrSID Image Object
  - getHighResolutionTIF: processes the MrSID file to return a high resolution TIF file of the image
  - getLowResolutionJPG: processes the MrSID file to return a low resolution JPG of the image

FEDORA’s Interface Implementation
[Diagram of three object types]
- Behavior Definition Object: Persistent ID (PID); System Metadata; Datastreams (Behavior Definition Metadata)
- Data Object: Persistent ID (PID); Disseminators; System Metadata; Datastreams
- Behavior Mechanism Object: Persistent ID (PID); System Metadata; Datastreams (Service Binding Metadata (WSDL))

Fedora Architecture
[Diagram. Labels recovered from the figure:]
- Clients: Client Application, Web Browser, Batch Program, Server Application (connecting over HTTP and SOAP)
- Web Service Exposure Layer: Manage, Access, Search, OAI Provider
- Session Management; User Authentication
- Management Subsystem: Object Mgmt, Component Mgmt, Object Validation
- Security Subsystem: Policy Mgmt, Policy Enforcement
- Access Subsystem: Object Reflection, Object Dissemination
- Storage Subsystem
- Other labels: Remote Service (HTTP, SOAP); Local Service; Users/Groups; PID Generation; Policies; Digital Objects; Datastreams; XML Files; Content; Relational DB
- External Content Retriever (HTTP, FTP); External Content Source (two)

Client and Web Service Interactions
[Diagram. Labels: user (three); web browser; Client application (two); Server application; Fedora Service APIs; Fedora Repository System; External Service Dispatch; Content Transform Service (two), each with an API]

Current Fedora Use at IU: EVIADA
- EVIADA: Ethnomusicological Video for Instruction and Analysis Digital Archive (!)
- Goals
  - Digital archive of ethnomusicology field video
  - Instructional tool
- Partnership with University of Michigan
- Funding from Andrew W. Mellon Foundation

Current Fedora Use at IU: EVIADA
- Complex objects
- Many versions of content
  - Original analog video
  - Digital Betacam tape
  - Digital file master: 50 Mbps MPEG-2
  - Derivative files: MPEG-1, QuickTime, Real, ???
- Many types of metadata
  - Collection-level descriptive metadata
  - Annotations: event, scene, action
  - Technical, preservation, digital provenance
  - Using METS+MODS+MARC

Current Fedora Use at IU: EVIADA
- Fedora used to manage content and metadata
  - Streaming video files will be “redirected content”
- Web application built with Java, Struts framework, Oracle9i XDB
- Web-based annotation tool
  - Creates METS structmap and MODS records

Future Fedora Software Releases: December 2003 to December 2004
- Fedora Object XML (FOXML)
  - Internal storage format; direct expression of Fedora object model
  - Better support for relationships (“kinship” metadata)
  - Better support for audit trail (event history)
  - Format identifiers for dynamic service binding
- Shibboleth authentication
- Policy Enforcement
  - XACML expression language
  - Fedora policy enforcement module
- Web interface for easy content submission
- Batch object modification utility
- Administrative Reporting
- Object Event History (ABC/RDF disseminations)
- Better support for “collections”
- New ingest and export formats (METS 1.3, DIDL)

Future Fedora Development Proposals
- Digital Library in a Box
  - Full-featured DL application with “Fedora inside”
  - Optimized for common set of content types
- Fedora Power Server
  - Integrity Management Tools
  - Service and link liveness checker
  - Fault Tolerance
  - Mirroring and Replication
  - Peer-to-peer interoperability features
  - Repository clustering
  - Load balancing
- Object Creation Tools
  - Workflow applications based on content models
  - Web interface for document/content submission

Implementing Fedora at IU beyond EVIADA: Next Steps
- Define scope
- Define content, metadata standards
- Import existing content into Fedora
  - Initial focus on images?
- Define and implement applications
  - Example: Common image search service
- Ongoing process

Who should use Fedora?
- Now
  - Willingness to do programming, development
  - Willingness to be on the “bleeding edge”
  - Sufficient IT / DL staff
  - Interested in cooperating with others to define best practices
- Future: Lower barriers to entry

Thanks to:
- Corey Keith, Library of Congress
- Sandy Payette, Cornell University

More information on Fedora:
- www.fedora.info

My contact information:
- Jon Dunn, jwd@indiana.edu, 812-855-0953
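
Supplementary sketch: the image-object slides describe one behavior definition (getHighResolutionTIF, getLowResolutionJPG) implemented by two different behavior mechanisms. The Python below is only an analogy for that split, assuming an abstract class stands in for the behavior definition and concrete classes stand in for the mechanisms. It is not Fedora code and calls no Fedora API; the class names, the `convert` stand-in for MrSID processing, the `demo:` PIDs, and the byte strings used as "files" are all hypothetical.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class BasicImageInterface(ABC):
    """Behavior definition: the methods every image object must offer."""

    pid: str  # every object carries a persistent ID

    @abstractmethod
    def get_high_resolution_tif(self) -> bytes: ...

    @abstractmethod
    def get_low_resolution_jpg(self) -> bytes: ...


@dataclass
class TwoFileImageObject(BasicImageInterface):
    """Mechanism 1: both versions are stored, so each is returned as-is."""

    pid: str
    tif: bytes
    jpg: bytes

    def get_high_resolution_tif(self) -> bytes:
        return self.tif

    def get_low_resolution_jpg(self) -> bytes:
        return self.jpg


def convert(source: bytes, target_format: str) -> bytes:
    """Hypothetical stand-in for a real MrSID conversion service."""
    return target_format.encode() + b":" + source


@dataclass
class MrSidImageObject(BasicImageInterface):
    """Mechanism 2: one stored file, with versions derived on request."""

    pid: str
    sid: bytes

    def get_high_resolution_tif(self) -> bytes:
        return convert(self.sid, "tif")

    def get_low_resolution_jpg(self) -> bytes:
        return convert(self.sid, "jpg")


def thumbnails(objects: list) -> dict:
    """A client that knows only the behavior definition, not the storage."""
    return {obj.pid: obj.get_low_resolution_jpg() for obj in objects}


if __name__ == "__main__":
    collection = [
        TwoFileImageObject("demo:1", tif=b"TIF", jpg=b"JPG"),
        MrSidImageObject("demo:2", sid=b"SID"),
    ]
    print(thumbnails(collection))
```

The point of the split is visible in `thumbnails`: the caller asks every object for a low resolution JPG the same way, whether the object stores that file or derives it from a MrSID master.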
