Slide 1 - Massachusetts Historical Society by chenmeixiu


Choosing an Open Source Digital Collection Software Solution
Joseph Fisher
Database Management Librarian @ UMass Lowell

Digital Commonwealth Executive Committee

Student in the University of Arizona SIRLS
Graduate Certificate Program in Digital Information
  Management (DigIn)
Five stages of implementation:
1.   Installation (DSpace)
2.   Branding
3.   Collection setup
4.   Item ingest
5.   User Interface
EPrints, Omeka, Drupal:
•   Linux
•   Apache
•   MySQL
•   PHP/Perl
DSpace:
•   Linux
•   Tomcat
•   PostgreSQL
•   Java
• EPrints
• DSpace
• Drupal
• Omeka
•   University of Southampton, UK
•   First released (v 2.0) 2002
•   Current version 3.2.7 (v3 2006)
•   Linux / Windows (Perl)
•   “EPrints is … the easiest and fastest way
    to set up repositories of open access
    research literature, scientific data, theses,
    reports and multimedia.”
           EPrints : Key Features
• software links to the SHERPA/RoMEO database so
  authors can easily verify their publisher policies
• LCSH framework included that can be manually
• Supports internal and external authority files or
  auto-fill from previous database entries.
• Multiple item import and export options

• Thumbnails and preview images of search items
EPrints: Branding
Step 1: /usr/share/eprints3/archives/archivename/cfg/subjects
Step 2:

Step 3: $ bin/import_subjects archivename filename
Step 1: /usr/share/eprints3/archives/archivename/cfg/subjects
Step 2:

Step 3: $ /bin/import_subjects <archive id> -xml
Collection “Division” Setup
EPrints: Item Ingest
             EPrints Conclusion
• Easy to install, easy to brand, best for a single
  subject-specific repository
• Designed for documents, though now supports
  multiple file types
• Easy embargo, default thumbnail and preview
• Some preservation support – History module for
  tracking changes, METS export plugin
• V3 does enable easier development of plugins
  (20-30 listed per year)
           EPrints: Challenges
• Not all Dublin Core fields provided (relation,
  source, contributor, coverage)
• Lack of theme options
• Possible with Views to create custom
  collection space within a repository?
• Statistics? Permissions? Plugins?
• Documentation not well developed
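
The Dublin Core gap noted above can be checked mechanically. The sketch below compares a list of available fields against the 15 elements of the Dublin Core Metadata Element Set. The `provided` list is a hypothetical stand-in built from the four gaps named on this slide; it is not read from an actual EPrints configuration.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = [
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
]


def missing_dc_elements(provided):
    """Return the DC elements absent from `provided`, in standard order."""
    have = {name.lower() for name in provided}
    return [element for element in DC_ELEMENTS if element not in have]


# Hypothetical field list (assumption): every element except the four
# gaps named on the slide.
provided = [
    element for element in DC_ELEMENTS
    if element not in ("relation", "source", "contributor", "coverage")
]

print(missing_dc_elements(provided))
```

The same function could be pointed at the field list of any candidate platform when comparing metadata flexibility.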
• EPrints
• DSpace
• Drupal
• Omeka
•   HP-MIT Libraries Alliance (2002)
•   DuraSpace (2009)
•   Current version 1.7.1 (2.0 due in Oct.)
•   Linux / Windows (Java)
•   “DSpace preserves and enables easy and open
    access to all types of digital content including
    text, images, moving images, mpegs and data
            DSpace : Key Features
• Preservation support: Bit integrity checker, format
  registry, item history, handles, DuraCloud backup
• Discovery faceted search/browse interface
• Qualified Dublin Core base or MODS through XMLUI

• Embargo capability
• OAI-PMH harvesting, as both data provider and service provider
• Manakin XMLUI provides flexible though challenging
  customization capability (multiple custom archives)
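
The idea behind a bit integrity (fixity) check can be sketched in a few lines: record a checksum when a file is ingested, then recompute it later and compare. This is only an illustration of the general technique, not DSpace's own checker. The function names and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
import os
import tempfile


def checksum(path, algorithm="sha256", chunk_size=65536):
    """Compute a hex digest of the file at `path`, reading in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected, algorithm="sha256"):
    """Return True if the file still matches the checksum stored at ingest."""
    return checksum(path, algorithm) == expected


# Demonstration with a throwaway file (hypothetical content).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"archival bitstream")
    demo_path = tmp.name

stored = checksum(demo_path)          # recorded at "ingest"
unchanged = verify(demo_path, stored)

with open(demo_path, "ab") as handle:  # simulate silent corruption
    handle.write(b"!")
corrupted = not verify(demo_path, stored)

os.remove(demo_path)
print(unchanged, corrupted)
```

A production checker would also log when each check ran and alert on mismatches; those parts are omitted here.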
             DSpace: Installation
Prerequisite Software :
•   Linux or Windows
•   Oracle Java JDK
•   Maven (Java build tool for stage 1)
•   Ant (Java build tool for stage 2)
•   PostgreSQL or Oracle
•   Tomcat
•   Perl
JSPUI: JavaServer Pages User Interface
XMLUI : Manakin (default theme)

Stop / Start Tomcat
Kubrick theme
  @Mire Mirage Theme
    Changing home page XMLUI
New Preservation Tool: 1.7
OAI Harvesting
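
An OAI-PMH harvest is a plain HTTP request with a `verb` parameter, answered with XML. The sketch below builds a `ListRecords` request URL and pulls identifiers and Dublin Core titles out of a response. The base URL and the sample response are hypothetical placeholders, not taken from a real repository, and no network call is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def identifiers_and_titles(response_xml):
    """Extract record identifiers and dc:title values from a response."""
    root = ET.fromstring(response_xml)
    ids = [e.text for e in root.iter("{%s}identifier" % OAI_NS)]
    titles = [e.text for e in root.iter("{%s}title" % DC_NS)]
    return ids, titles


# Hypothetical response fragment (assumption), trimmed to one record.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample item</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

# Hypothetical endpoint (assumption); each platform exposes its own path.
print(list_records_url("https://repository.example.org/oai"))
print(identifiers_and_titles(SAMPLE))
```

A full harvester would also follow `resumptionToken` values to page through large result sets.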
            DSpace: Conclusion
• Very functional out of the box with easy branding
• Increasing preservation management support
  (PREMIS support coming in v2.0)
• Good documentation, training materials, and
  broad support base
• Platform is versatile and flexible with ability to
  customize collection interfaces
• OAI-PMH/ORE/SWORD (postponed)
• Solr-based statistics engine
• Metadata registry, option to add new schemas
           DSpace: Challenges
• Research Projects:
  – Discovery faceted search
  – Embargo
  – DSpace Statistics
  – Thumbnails
  – XMLUI Customization
• EPrints
• DSpace
• Drupal
• Omeka
• Became Open Source project (2001)
• Current version 7 (Jan. 2011)
• Content Management Framework (CMF)
• LAMP / Windows (PHP)
• “Drupal … allows anyone to easily publish,
  manage and organize a wide variety of
  content on a website.”
          Drupal: Key Features
• Numerous themes
• Extensive module availability:
  – Content Construction Kit (CCK) – allows creation of
    fields for input and display
  – Faceted Search
  – Image administration
  – Advanced Search
  – Views – create customized lists and queries
  – OAI-PMH module
           Drupal: Conclusions
• Flexible and versatile but takes time and effort
  to put all the pieces together
• Large, varied, and active community of users
  and developers with good documentation
• Limited only by module availability, of which
  there are nearly 8,000 (back to version 4.x)
• Challenges: modules are managed individually
  and the entire structure manually constructed
• EPrints
• DSpace
• Drupal
• Omeka
• Center for History and New Media at George
  Mason University (2008)

• LAMP / Windows (PHP)
• V 1.3.2 (2011)
• “Omeka is a free, flexible, and open source web-
  publishing platform for the display of
  library, museum, archives, and scholarly
  collections and exhibitions.”
            Omeka: Key Features
• “Designed with non-IT specialists in mind”
• “as easy as launching a blog”
• Over a dozen themes
• Over 30 Plugins:
  –   OAI-PMH Harvester
  –   Exhibit Builder
  –   LC Subject Headings
  –   Comments
  –   Social Bookmarking
  –   Item Relationships
• Cloud web hosting
• Five tiers:
   – 500MB for free with 1 site, 4 plugins, 4
   – Tier 1 = $49 for 1GB, 2 sites, 6 plugins
   – Tier 4 = $999 for 25GB, unlimited sites and
           Omeka: Conclusion
• Extremely easy plug and play functionality
• Exhibit plugin enables easy customized
  collection interfaces
• Immediate metadata field additions
• Limited user and development base, but still
  new and growing
• Very easy OAI-PMH capability
            Why Open Source?
Open source is a development method for software
 that harnesses the power of distributed peer
 review and transparency of process. The promise
 of open source is better quality, higher reliability,
 more flexibility, lower cost, and an end to
  predatory vendor lock-in.
                  The Open Source Initiative (OSI)
        Lower Cost !!??!!
•   Server
•   Technical-savvy staff
•   Programmer?
•   Time
                Choice Criteria
•   Institutional Repository or Digital Archive
•   Ease of use
•   Metadata flexibility
•   OAI Harvesting
•   File format support
•   Multiple collection interface customization
•   Popularity / Reputation
•   Size of active community
•   Development commitment
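
One way to apply criteria like these is a simple weighted scoring matrix. The sketch below is an illustration only: the weights, the 1-5 scores, and the platform names "Platform A" and "Platform B" are made-up placeholders, not an evaluation of any of the four systems discussed here.

```python
def weighted_score(scores, weights):
    """Weighted average of per-criterion scores (same keys as `weights`)."""
    total_weight = sum(weights.values())
    return sum(scores[name] * w for name, w in weights.items()) / total_weight


# Placeholder weights (assumption): higher means more important locally.
weights = {
    "ease_of_use": 3,
    "metadata_flexibility": 2,
    "oai_harvesting": 2,
    "community_size": 1,
}

# Placeholder 1-5 scores (assumption) for two hypothetical candidates.
candidates = {
    "Platform A": {"ease_of_use": 5, "metadata_flexibility": 2,
                   "oai_harvesting": 4, "community_size": 3},
    "Platform B": {"ease_of_use": 3, "metadata_flexibility": 5,
                   "oai_harvesting": 4, "community_size": 4},
}

ranked = sorted(
    candidates,
    key=lambda name: weighted_score(candidates[name], weights),
    reverse=True,
)

for name in ranked:
    print(name, round(weighted_score(candidates[name], weights), 2))
```

The ranking depends entirely on the weights chosen, so an institutional repository and a digital archive may reasonably reach different answers from the same scores.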
        Insource / Outsource
“…two major problems …
First, the core mission of for-profit service providers is not to
preserve and provide access to significant digital objects. It is,
however, to generate a profit and stay in business….

Second, if an outsourcing trend takes hold and gains
momentum, then cultural institutions are at great risk of losing
their own value proposition and viability as institutions in the
digital age.”
 Tyler O. Walters, Katherine Skinner.
 “Economics, sustainability, and the cooperative model in digital preservation.”
 Library Hi Tech (28.2) 2010
 Librarians of the World Unite!!

   We have nothing to lose!!
Control the means of Production!!
