Free and Open Source Software for librarians and libraries

Derek Keats
Executive Director, Information and Communication Services, University of the Western Cape

Free and Open Source Software (FOSS) is a national priority in South Africa, and is increasingly recognized as a means to achieve quality, lower costs, create agility and foster innovation. All of these are things libraries need to accomplish, so a survey of the FOSS tools available to support library functions or the role of librarians is provided. There are FOSS tools to support nearly everything that happens in a library, from personal productivity to library information management. Many of them have versions that run on proprietary operating systems, or are web-based and so cross-platform by nature, making them easy to experiment with even on proprietary operating systems. This paper provides a brief overview of some of the FOSS tools that are available to libraries.

Free and Open Source Software (FOSS) is a national priority in South Africa, and is increasingly recognized as a means to achieve quality, lower costs, create agility and foster innovation. During 2007, the South African government approved a policy and strategy to implement FOSS in government. The strategy was developed by the Government Information Technology Officers' Council (GITOC) to ensure that Government exploits the benefits that FOSS can offer more systematically, both by using available FOSS and by contributing to further FOSS development. In this way, government recognizes the importance of being part of a broader FOSS ecosystem that it helps to create and sustain. According to a report on Tectonic1, a cabinet statement issued when the strategy was approved included the following:


Innovation, No.36, June 2008

...all new software developed for or by the government will be based on open standards and government will itself migrate current software to FOSS. This strategy will, among other things, lower administration costs and enhance local IT skills.

Government is one of the largest customers of the IT industry, so this strategy is likely to have a significant impact on the availability and quality of FOSS tools. Libraries that are able to align themselves to this strategy are likely to be able to benefit from participation in the FOSS ecosystem. This paper provides a review of FOSS tools from the perspective of how they may benefit libraries and librarians. FOSS in libraries has been used extensively for many years (Schlumpf 1999), so this report is not defining a new concept. Nor is it a comprehensive review of all software that could be used in libraries (see Anon 2007), but rather provides an overview and gives examples of software and the ways in which libraries can participate in a FOSS ecosystem. It is not a comprehensive literature review, but rather the grounded perspective of someone who is an institutional “chief information officer” and who is active in many aspects of FOSS.

Free and Open Source Software?
To understand the nature of FOSS and how it can benefit libraries, it is useful to review its origins and history. The concept of free software is the creation of Richard Stallman, founder of the Free Software Foundation (FSF), based on the notion of the freedom of users to run, copy, distribute, study, change and improve the software. According to the FSF website, there are four kinds of freedom that free software ensures its users:
• The freedom to run the program, for any purpose (freedom 0).
• The freedom to study how the program works, and adapt it to your needs (freedom 1). Access to the source code is a precondition for this.
• The freedom to redistribute copies so you can help your neighbour (freedom 2).
• The freedom to improve the program, and release your improvements to the public, so that the whole community benefits (freedom 3). Access to the source code is a precondition for this (Free Software Foundation 2007).

Thus “free” here refers to freedom and not to price, although a useful side benefit to users is that Free Software is usually available for download at no cost. There are many commercial free software products whose revenue model derives from sources other than the sale of licensed permissions, so the opposite of free software is not commercial software but rather proprietary software. These four freedoms are

Keats : Free and Open Source Software …


lacking in proprietary software.

In the late 1990s, a group of people who were unhappy with the single-minded emphasis of Richard Stallman and the FSF on freedom, and who had personal differences with Stallman, developed the concept of open source and formed their own movement to promote it. The focus of this parallel movement was on the shorter-term business benefits, particularly software quality, rather than on the freedoms of users. The main argument for the term open source software is that the concept of “free” is ambiguous and that business people are somehow afraid of, or uneasy about, the idea of freedom. Since the free software and open source movements overlap in much of what they do, those of us who are active in this area and who use English often refer to them collectively as FOSS (Free and Open Source Software).

FOSS gets created by a number of mechanisms, from individual developers who create something for their own use, through organizations and individuals collaborating to address a common need or opportunity, to corporations into whose business model FOSS is able to fit and add value. The approaches that a library may take with respect to FOSS in its own operations can be grouped along a continuum into five categories.

Approaches to FOSS relevant to libraries
First and foremost, FOSS is a way of working within an organization to derive benefit for the organization from technology choices that promote freedom, collaboration, co-operation and openness. To maximize benefit from a FOSS strategy, an organization must develop its FOSS ecosystem, whether this consists of internal capacity or partnerships with strategic business partners who have developed their own FOSS ecosystem. There are specific technologies for specific applications, but to think of the strategic implementation of FOSS as only a technology choice is to completely misunderstand the FOSS ecosystem.

For UWC, for example, FOSS is a strategic business decision, based on where the institution wishes to position itself for now and the future. While it is perfectly possible to simply use existing FOSS, a strong FOSS ecosystem will have a mix of the approaches listed in Table 1.



Table 1. Business models whereby libraries may take advantage of FOSS software, and some examples from our own experience at UWC.

Business model: Use existing
Key features: The institution uses existing FOSS tools, such as GNU/Linux, and does not contribute to their development.
Example at UWC: The use of GNU/Linux, MySQL database, Apache web server, Firefox web browser, Gimp image editor, etc.

Business model: Adapt existing
Key features: The institution makes minor adaptations of existing tools to serve its own peculiar business needs.
Example at UWC: Adapting of phpBB to our Novell Directory Services environment, now succeeded by other FOSS tools.

Business model: Contribute to a project
Key features: The institution puts resources, either money or a software developer, into an existing project or contributes code back to a project.
Example at UWC: We have contributed to the PEAR PHP library, as well as the VLC media player and streaming framework.

Business model: Sponsor a project
Key features: The institution sponsors an external agency to create a tool on its behalf, and may assist that agency to locate other sponsors who could join the project.
Example at UWC: We have not yet sponsored a project per se.

Business model: Create a project
Key features: The institution creates a new project, puts its own developers onto writing the software and seeks other sponsors or others who may join the project.
Example at UWC: The AVOIR2 project and the Free Software Innovation Unit3 at UWC have created the Chisimba application framework and a large number of systems built on top of it, including the KEWL e-learning system, the Portal, the AHERO digital publication repository, the ETD thesis and dissertation system, etc.

Areas of FOSS in libraries
There are a number of areas within libraries where FOSS tools are often used, and can be deployed or adapted to improve the quality of experience of librarians and patrons alike. These include the operating systems that run our computers, the desktop productivity applications that we use, our institutional repositories, library information systems, and various web and distributed applications. FOSS tools are of a particularly high quality in the areas of content management, blogging, wikis and other Web 2.0 and social software applications.

Operating systems
An operating system is the software that makes the computer work, and presents the user with an interface with which to interact with the computer. In the proprietary world, the operating system and the user interface are the same, but in the world of FOSS, one operating system may have several different interfaces. For example, GNU/Linux has Gnome, KDE, Enlightenment and several other desktop environments and window managers that provide choice in the user interface. Table 2 lists three FOSS operating systems and their key features, and provides examples from our use of them at UWC. Figure 1 shows the Gnome desktop running four virtual desktops, with the rotating cube provided by the Compiz Fusion 3D enhancements on Ubuntu 7.10 used to switch between them.

Figure 1: Gnome desktop with Compiz Fusion 3D enhancements. This document is being edited on the right face, while a video is playing on the left face; pressing Control+arrow switches between them.



Table 2. Operating systems, their key features, and some examples of use at UWC.

Operating system: GNU/Linux
Key features: GNU/Linux is the product of a very large community that spans the globe and consists of both corporate and individual programmers. It is suitable for both server and desktop environments. Linux comes in 'flavours' called distributions, with one of the most popular being Ubuntu, created by Mark Shuttleworth and maintained by his company, Canonical.
UWC examples: Most of our student labs are dual boot, both GNU/Linux and Windows. We have some labs that are GNU/Linux only. Many of the servers in our datacentre run GNU/Linux. I use GNU/Linux for all my own computers, both at work and at home, and have found it to be relatively crash-free, free of virus problems, and easy to use.

Operating system: BSD Unix
Key features: BSD is the UNIX derivative distributed by the University of California, Berkeley starting in the 1970s. It comes in various descendants of the original BSD, including FreeBSD, NetBSD and OpenBSD. BSD is often used in the data centre environment, but is sometimes used by geeks to run their desktop environment. Apple's Mac OS X is derived in part from BSD, something made possible by the BSD license, which allows derived code to be proprietary.
UWC examples: BSD is used to run some of the most important components of our infrastructure, including mail spools, all domain name servers (DNS), firewalls, and in the library for hosting ezproxy for off campus journal and database access.

Operating system: Sun OpenSolaris
Key features: OpenSolaris is an open source project created by Sun Microsystems to build a developer community around the Solaris Unix system. Combined with Sunray thin clients, OpenSolaris is particularly good for running computer laboratories.
UWC examples: Most of our larger computer labs for students at UWC run Sunray thin clients and OpenSolaris. OpenSolaris and Sunrays are ideal for running general purpose labs and terminals, such as you often find in libraries. The desktop environment is provided by Gnome, which is also popular as the desktop environment on GNU/Linux and BSD.

Many GNU/Linux distributions, including Ubuntu, come on a bootable CD, allowing you to get a taste of GNU/Linux without replacing your other operating system. In addition, computers can be configured for dual boot, both Windows and Linux (or more than two operating systems), or to run one operating system inside the other via virtualization.



Desktop software
Desktop or productivity software is used to carry out routine office operations, and includes office suites, email clients, web browsers, graphics applications and desktop publishing systems. Given the rising prominence of audio and video applications, particularly in the higher education context, we can also include video and audio editing applications.

Area: Office suite
Package: OpenOffice.org
Notes: Comprehensive office suite that conforms to the requirements of the SA government minimum interoperability standards for document exchange. Available for all major desktop operating systems, including GNU/Linux, Mac OS X, and Microsoft Windows.
UWC experience: We have a legacy of Microsoft dominance in this area. However, we provide training and support for OpenOffice and usage is increasing. As a GNU/Linux user, OpenOffice is all I use, and I find it to be a superb office package that offers a number of features that cost extra in the Microsoft world, including export of content to PDF and Flash.

Area: Office suite
Package: KOffice
Notes: Comprehensive office suite that conforms to the requirements of the SA government minimum interoperability standards for document exchange. Available for GNU/Linux, designed for the KDE desktop.
UWC experience: Used by a few staff who run KDE as their desktop on GNU/Linux.

Area: E-mail client
Package: Evolution
Notes: It combines e-mail, calendar, address book, and task list management functions similar to Microsoft Outlook and Novell Groupwise. Evolution development is sponsored primarily by Novell, and it is the official package for the Gnome desktop.
UWC experience: Several individuals who are GNU/Linux users make use of Evolution as an email client, including myself. It also serves as a task manager and calendar/diary.

Area: E-mail client
Package: Thunderbird
Notes: An e-mail and news client developed by the Mozilla Foundation. Additional features are often available via extensions.
UWC experience: Several individuals who use GNU/Linux at UWC make use of Thunderbird as their preferred email client.



Area: Graphics applications
Package: The Gimp
Notes: The GNU Image Manipulation Program, or GIMP, is a raster graphics editor used to process digital graphics and photographs, and is often used as a replacement for Adobe Photoshop.
UWC experience: At UWC we do training on the use of GIMP, and encourage its use where people need to manipulate bitmap graphics. I do a lot of graphics work for presentations, and as an experienced Photoshop user, I found switching to GIMP fairly straightforward.

Area: Graphics applications
Package: Inkscape
Notes: Inkscape is a vector graphics editor similar to tools such as Adobe Illustrator, whose goal is to be a powerful graphics tool while being fully compliant with the XML, SVG and CSS standards.
UWC experience: We use Inkscape to make the icons used in Chisimba. I use Inkscape to create graphics for use in presentations. We hope to introduce Inkscape training during 2008.

Area: Desktop publishing
Package: Scribus
Notes: Scribus is a cross-platform desktop publishing (DTP) application that is available for GNU/Linux, Microsoft Windows and others. Equivalent proprietary applications include Microsoft Publisher, Adobe PageMaker, QuarkXPress and Adobe InDesign.
UWC experience: Unknown. In researching this article, I installed Scribus and used it to create a one page flier. If you are familiar with DTP, this application can be very powerful. It took me about half an hour to do the flier, and that included learning the application.

Area: Web browsing
Package: Mozilla Firefox
Notes: The Mozilla Firefox web browser is derived from the Mozilla application suite, managed by the Mozilla Corporation. Firefox is the second-most-popular browser in current use worldwide. It has a plugin architecture and over 2000 plugins are available.
UWC experience: Firefox is the recommended browser across all platforms at UWC. It is the only browser I use, as I find it superior to any other.




Area: Instant messaging
Package: Kopete
Notes: Kopete is a multi-protocol instant messaging client designed to integrate with the KDE desktop environment, but it runs well under Gnome. It can talk to most of the popular messaging services, such as Yahoo, Windows Live Messenger, ICQ, Jabber, Novell Groupwise, IRC, etc.
UWC experience: Primarily used by computer enthusiasts, but I use it extensively for both personal and work-related messaging. It has MSN and Yahoo! messenger webcam support, but no sound, forcing a switch to Skype for voice conversations.

Area: Instant messaging
Package: Pidgin
Notes: Pidgin is very similar to Kopete, but has support for multiple operating systems, including Microsoft Windows.
UWC experience: Unknown. However, I use it on a personal desktop at home that runs the latest Ubuntu. I find that it integrates well with my dual monitor Gnome desktop.

Area: Video editing
Package: Kino
Notes: Kino is a non-linear digital video editing package for Linux. It claims the vision: "Easy and reliable DV editing for the Linux desktop with export to many usable formats." Kino supports many basic digital video editing and production tasks.
UWC experience: Use is still limited. I use it to edit videos for home use, as well as for videocasting. My son used it very effectively to produce a video for a school project. I hope that we will be able to offer basic Kino training in 2008.

Area: Audio editing
Package: Audacity
Notes: Audacity is a digital audio editor that runs on multiple platforms, including Microsoft Windows. It is popular among the podcasting community, where it is the tool of choice for editing podcast productions. As a general sound editor, it is as good as it gets. So popular is it that by mid 2007, it had achieved 24 million downloads from Sourceforge.
UWC experience: This is the recommended editor for editing audio files, especially podcasts. We offer training for staff and students in the use of Audacity and the production of podcasts. I use it for my own podcasting as well as for slidecasting (synchronization of slides and podcast audio via the Slideshare online tool).




In addition to the FOSS tools listed above, there are several proprietary applications that will run on GNU/Linux, including Skype, but in most cases, the FOSS tool is more than adequate to do the job. Many of the tools listed above are cross platform, so if you are a Windows or Mac user, you can explore the world of FOSS without leaving the comfort and familiarity of your own computing environment.

Other helpful tools
Although not software per se, the Open Clipart Library4 can help to enhance your applications where graphics are needed.

Institutional repositories
An institutional repository can be thought of as an online system for collecting, preserving, and disseminating the research and other output of an institution in digital form. Assets lodged in such repositories include preprints of research articles, post publication prints, electronic theses and dissertations, as well as other digital materials produced in the institution. FOSS tools to support institutional repositories are available, and they excel over any proprietary alternatives.

EPrints is a package for building open access repositories that are, as with all the others listed here, compliant with the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). It is primarily used for institutional repositories and scientific journals.

DSpace is fairly popular for the establishment of institutional repositories, with about 240 installations worldwide. It is written in Java and makes use of Java server pages (JSP) for the web interface. It supports the use of PostgreSQL and Oracle for its database. DSpace content is available primarily via a web interface, but it also supports OAI-PMH v2.0 and the export of Metadata Encoding and Transmission Standard (METS) packages.

Fedora5 provides a digital asset management architecture upon which many types of digital library, institutional repository and digital archive systems can be constructed. The software provides the underlying architecture for a digital repository, and is not a complete management, indexing, discovery, and delivery application. Because of its modular architecture, built to provide for interoperability and extensibility, complete systems are built on top of the Fedora base.

Our own software application framework, Chisimba, includes two modules that provide repository functionality: AHERO for publications, and ETD for electronic theses and dissertations. We developed this functionality on the Chisimba system in order to have an integrated system that we can maintain and extend, eventually including full postgraduate student management and other functionality. In addition, we are in the process of implementing an archiving project, in which we will build web and social networking capability using Chisimba on top of Fedora.
Package: DSpace
Notes: DSpace provides tools for the storage and management of digital assets or objects. It supports almost any kind of content that can be stored digitally. It can also serve as a platform for digital preservation activities, although it has some limitations for this use.
UWC experience: None

Package: EPrints
Notes: EPrints is a Perl application that is specialized for institutional repositories and research journals.
UWC experience: None

Package: Fedora
Notes: Fedora provides tools for the storage and management of digital objects. It supports any kind of content that can be stored digitally. It is flexible, and other applications are usually built on top of it.
UWC experience: We are just beginning with Fedora, which we will be using for a digital archive project in conjunction with Chisimba and other tools.

Package: Greenstone
Notes: Greenstone is a suite of software for building and distributing digital library collections via the Internet or on CD-ROM in the form of a fully-searchable, metadata-driven digital library.
UWC experience: We have used Greenstone to build CD-ROM based collections, and have been exploring integrating Greenstone's web services into Chisimba. However, there has been no demand for this so we have paused it for the time being.




Package: Chisimba (AHERO and ETD)
Notes: AHERO was created for a project called African Higher Education Research Online. It supports the OAI protocol for resource harvesters, and is designed for creating a repository of published or preprint articles. ETD was created for the library as a means to publish UWC theses in an Electronic Thesis and Dissertation system. The tools are compatible with recommended standards, and there is a plan to incorporate ETD into a full postgraduate research, supervision and learning system.
UWC experience: We developed the AHERO module on Chisimba and host it on behalf of the Centre for the Study of Higher Education, and we developed and host ETD for the library.
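Because all of the repository packages above expose their metadata through OAI-PMH, a harvester can treat them alike. The following is a minimal sketch in Python of what a harvest involves: building a ListRecords request and reading record identifiers and Dublin Core titles out of the response. The base URL, the sample response and the function names are illustrative assumptions, not taken from any of the packages discussed; only the verb, the metadataPrefix parameter and the XML namespaces come from the OAI-PMH 2.0 protocol. The sketch parses a canned response so that it runs without network access.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and by Dublin Core.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_request(base_url, verb, **params):
    """Return an OAI-PMH request URL, e.g. ...?verb=ListRecords&metadataPrefix=oai_dc."""
    query = {"verb": verb}
    query.update(params)
    return base_url + "?" + urlencode(query)


def titles_from_list_records(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for record in root.iter("{%s}record" % OAI_NS):
        identifier = record.findtext(
            "{%s}header/{%s}identifier" % (OAI_NS, OAI_NS)
        )
        # Take the first dc:title found in the record's metadata, if any.
        title = None
        for element in record.iter("{%s}title" % DC_NS):
            title = element.text
            break
        pairs.append((identifier, title))
    return pairs


# Hypothetical response, trimmed to one record for illustration.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1</identifier>
        <datestamp>2008-01-15</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example thesis</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    # The base URL is a placeholder, not a real repository.
    print(build_request("http://repository.example.org/oai",
                        "ListRecords", metadataPrefix="oai_dc"))
    for identifier, title in titles_from_list_records(SAMPLE_RESPONSE):
        print(identifier, "-", title)
```

A real harvester would fetch the URL over HTTP and follow the resumption tokens that the protocol uses for paging; both are left out here to keep the sketch short.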

Since DSpace and Fedora are popular systems, it is worth noting their strengths and weaknesses. DSpace is best suited to creating an institutional repository of text-based items. It has a well designed web front end, and provides a well researched business model that makes it easy to justify use of the tools within an organization. However, its web environment is geared to single item entry and there are minimal tools for batch import of items. In addition, its archival model assumes that all objects are static, so there is no support for versioning, and objects are difficult to retrieve from archived items. DSpace stores objects but does not store behavior, particularly not the behavior the objects had before entry. There are many anecdotal reports of the system not being able to deal with objects bigger than a few MB in size. DSpace has no real way to store complex relationships, and there is only a single metadata stream, creating an “all or nothing” metadata environment.

Where Fedora beats the more fully functional DSpace is in enabling the storage of both a data object and its behavior, with full versioning and multiple metadata streams. This is vital to projects that require long term digital preservation. The downside of Fedora is that it requires more expertise to implement, and you have to either build, buy or adapt a web front end. At UWC we have a digital repository project that needs to cater for multiple metadata streams, versioning, and long term digital preservation. For this reason we have chosen Fedora for the repository, and are building on top of it using the Chisimba framework.

Integrated Library Systems
Integrated library systems (ILS) are enterprise resource planning systems for libraries and are used to track information such as the items owned, orders made, bills paid, patrons and borrowings. As such they are the bread-and-butter of the business of running a library. There are five FOSS ILS packages that I have looked at for this article: Gnuteca, Koha, PhpMyBibli, PhpMyLibrary and Evergreen.

Gnuteca is an ILS package developed initially in Brazil, and in use in a number of Latin American institutions. The package is developed in PHP, on top of a FOSS stack using the Miolo application development framework. We have worked a little with the developers of Miolo and Gnuteca, and we have the capacity to interoperate Gnuteca and our own Chisimba framework. Commercial support is available from the Solis6 Software Development Co-operative in Brazil.
Package: Gnuteca
Overview: A full-featured Integrated Library System (ILS) created in accordance with criteria defined and validated by librarians. It was developed and tested using the library of the Univates University in Brazil, where it has been in operation since February 2002. Mostly used in Latin America. The software follows known standards such as ISIS and MARC21, and uses a web interface. Maintained by the Solis Software Development Co-operative.

Package: Koha
Overview: A full-featured ILS, developed initially in New Zealand by Katipo Communications Ltd and first deployed in January 2000 for Horowhenua Library Trust. In use worldwide, its development is steered by a growing community of libraries collaborating to achieve their technology goals. Maintained by a team of software providers and library technology staff from around the globe. (English and French demos available on request.)

Package: PMB (PhpMyBibli)
Overview: A full-featured ILS, initiated in October 2002 by François Lemarchand, Director of the Public Library of Agneaux, and now maintained by PMB Services.

Package: PhpMyLibrary
Overview: An ILS with cataloging, circulation, and webpac modules.

Package: Evergreen
Overview: A partially-featured ILS, developed by the Georgia (USA) Public Library Service and used in the integrated statewide library system of c. 250 libraries known as PINES (Public Information Network for Electronic Services).

Koha claims to have been the first open source ILS and was created in 1999 for the Horowhenua Library Trust in New Zealand. It went live in 2000, making it an old and stable application in this space. Koha is developed in Perl on top of a FOSS stack. Hosting and commercial support are provided by a company called LibLime7, which makes its services available around the globe.

PhpMyBibli (PMB) development was started in 2002, with the latest 3.0 version launched in September 2006. Since then it has enjoyed frequent release updates that include the addition of new features.

Another PHP package is PhpMyLibrary, which consists of cataloging, circulation, and webpac modules. The download was last updated on April 30, 2006, so there does not seem to be much activity lately.

Evergreen is an ILS developed by the Georgia Public Library Service and used in the integrated statewide Public Information Network for Electronic Services (PINES) at roughly 250 libraries. The core of the business logic is primarily written in Perl, and much of the underlying infrastructure is written in C, on top of a FOSS stack. The OpenSRF framework, however, is language agnostic: there are Python bindings for both client and server, and Java bindings are planned according to the project FAQ. Evergreen does not currently have acquisitions or serials modules, which limits its functionality somewhat.
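To make concrete what an ILS keeps track of at its core (items, patrons and borrowings), here is a minimal sketch of a circulation record in Python. All class, field and method names are hypothetical and are not drawn from Koha, Evergreen, Gnuteca or any other package mentioned here; real systems add cataloguing standards such as MARC21, acquisitions, fines and much more. The 14-day loan period is likewise an arbitrary assumption.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Dict, Optional


@dataclass
class Item:
    barcode: str
    title: str


@dataclass
class Patron:
    card_number: str
    name: str


@dataclass
class Loan:
    barcode: str
    card_number: str
    due: date


@dataclass
class Circulation:
    """Hypothetical in-memory circulation desk: who has borrowed what."""
    items: Dict[str, Item] = field(default_factory=dict)
    patrons: Dict[str, Patron] = field(default_factory=dict)
    loans: Dict[str, Loan] = field(default_factory=dict)  # keyed by item barcode

    def check_out(self, barcode: str, card_number: str,
                  today: date, loan_days: int = 14) -> Loan:
        """Record a loan, refusing unknown records and items already out."""
        if barcode not in self.items or card_number not in self.patrons:
            raise KeyError("unknown item or patron")
        if barcode in self.loans:
            raise ValueError("item is already on loan")
        loan = Loan(barcode, card_number, today + timedelta(days=loan_days))
        self.loans[barcode] = loan
        return loan

    def check_in(self, barcode: str) -> Optional[Loan]:
        """Close the loan for an item; returns None if it was not on loan."""
        return self.loans.pop(barcode, None)


if __name__ == "__main__":
    desk = Circulation()
    desk.items["b1"] = Item("b1", "Example title")
    desk.patrons["p1"] = Patron("p1", "Example patron")
    print(desk.check_out("b1", "p1", date(2008, 6, 1)))
```

Keying loans by item barcode is a deliberate simplification: it enforces that a physical item can be on loan to only one patron at a time.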

Web 2.0 and FOSS in libraries
The concept of Web 2.0 applies to the relatively recent evolution of the World Wide Web in which content creation has become two way (read, write), social networking services and technologies have emerged, and the web browser is no longer the only medium of interaction. Interfaces have become simpler, and technologies such as blogs (weblogs), podcasts and wikis have become popular. Socially networked content sharing sites such as Flickr, YouTube, and Slideshare make it possible to share digital assets and build communities around them. Pure social networking sites such as Facebook and MySpace are also a prominent feature of Web 2.0. Hardware and Software as a Service (HAAS and SAAS) have recently emerged as compelling business models where bandwidth is cheap.

There are a number of software applications available that enable libraries to create communities of practice for librarians and patrons alike. The number of FOSS applications available in this space could fill a book. WordPress is one of the most popular blogging applications, while MediaWiki probably dominates in the wiki space. Content management systems are also often used to create interfaces to various library facilities, one of the most popular being Joomla.



Figure 2. The Koha OPAC interface at the Horowhenua Library Trust.

At UWC we have a collaborative project, involving 13 African universities together with Kabul University in Afghanistan, that involves capacity building in Software Engineering. One of the outcomes of that project is a toolbox (framework) that integrates much of the Web 2.0 functionality into a single application that can behave differently depending on how it is configured using an easy web-based user interface. This includes the article repository (AHERO), the ETD system, an e-learning system (KEWL 3.0) as well as a large array of Web 2.0 enabled functionality that may be relevant to the activities of a library, including blogging, wiki, podcast, video, etc.

Discussion and conclusions
There is a large array of Free and Open Source Software available to support the activities of libraries, and to help librarians improve their own work and build a better experience for patrons. Much of the software that is available can work with GNU/Linux, Windows and Apple Macs, so one does not have to go FOSS across the whole application stack to benefit from it, although of course the benefits are larger the broader the FOSS ecosystem is within your library and the institution of which it is part. Aside from the software reviewed here, there are many other applications that can be used to enhance library work. I will maintain a section of my blog called Free Software in Libraries; on the blog, the section FOSS in Libraries will be visible on the right panel, under Categories.



I thank Simon Tanner of King's College London for sharing his thoughts and experience with DSpace and Fedora. To create this document in the time available, I drew on software descriptions from Wikipedia8 as the most accessible and succinct summary.

Anon. 2007. oss4lib: open source systems for libraries. Accessed: 21/11/2007.
Free Software Foundation. 2007. Accessed: 21/11/2007.
Schlumpf, P. 1999. Open source library systems. The future of the automated library catalogue 18 (4): 323-326.

1 2 3 4 5 Flexible Extensible Digital Object Repository Architecture 6 7 8
