Tartu University Library - entering the world of
digital collections and archiving
Good afternoon, ladies and gentlemen. My name is Andres Didrik, I come from Tartu
University Library, and it is my pleasure to introduce to you the growing Tartu University
Film Archive. We are both an institutional library and a public library, and being an
institutional library means that we hold original content produced by ourselves, that is, by
Tartu University: audio and video recordings and even films. This means we also have to
perform some archival functions, to preserve and give access to our
But let me first tell you about the background of our current situation. Then I will show
you some of the films, and then I will discuss some of our solutions to archiving audio-visual
OLD FILM VAULTS
In 2004 Tartu University decided to start managing its vast collection of old 16 mm and
35 mm celluloid and nitrate films properly. The films had been lying on the shelves for
years without any proper use and had therefore been largely forgotten. In addition, with the
development of video technology in the 80s and 90s, old film projectors were becoming more
and more obsolete, to the point where there was only one working 16 mm film projector in
Tartu University, and even that one scratched and damaged the films at each viewing. It was
not possible to watch 35 mm film at all.
In the process of reorganizing and restructuring various departments of the University, the
film collection was rediscovered and the question of its place in the world was asked again.
However, no clear answers were given, partly because it was hard for everyone to relate to
something as obsolete as film technology in this age of digital cameras and desktop
filmmaking. The films were also not kept in conditions suitable for long-term preservation:
some of the metal boxes holding films had almost totally rusted away, seriously damaging
the films inside (as you can see in this picture). Other boxes were too dirty to want to touch
at all and lacked any coherent information about what they contained. The films had also
been moved several times, from bad conditions to even worse, and the existing partial call
number system had been destroyed.
Starting from around 2005, several professional video production companies began asking
about Tartu University's old film collection: whether it was possible to watch the films and use
them in professional filmmaking. After some consideration it was decided to lend out some of
the films, which were eventually digitized and used in a documentary feature about the
renowned Estonian/Russian scientist Yuri Lotman.
So the need for proper handling and cataloguing of the films was urgent. It was decided to
move the films over to the Estonian Film Archive, where proper archival conditions were met.
The whole collection consists of more than 2500 metal boxes carrying about 1500 different
films. Many of the boxes contain positive or negative copies of the same film, with the audio
track either in Russian or in Estonian, hence the big difference between the two numbers.
Today around 400 films have been moved over to the Estonian Film Archive, where they have
been described, digitized and stored in proper conditions. It will probably take another 4 or 5
years to finish this moving and digitization process.
The earliest film in the collection dates back to 1932. It was made by the renowned Estonian
filmmaker Theodor Luts on the occasion of the 300th anniversary of Tartu University, and
features views and sights of Tartu as well as various social events and even lectures given at
the University. Naturally it is a silent film, as are the majority of the films in the collection.
As you can see, the quality of this movie is not very good, mainly because it was digitized
from a copy of the original film, while the original film reel is presumably lost. The story is
that it was hidden among other films and left unmarked on purpose so that the Soviets
would not be able to find it and take it away. In that our staff at the Cinema and Photo
Laboratory of Tartu University succeeded, the only drawback being that the original reel is
Also, the digitization equipment at the Estonian Film Archive is old and not very good. This
actually puts us in a somewhat awkward position, because we already want to archive the films,
but these digitized films are mostly of preview quality. At least we can see the films and decide
whether we want to have some of them digitized in Finland, for example.
The main bulk of the old film collection consists of films made between 1950 and 1987.
About half of them are educational films in Russian, received from major cinema production
studios in Russia and shown by lecturers and teachers during the initial courses at the
University to help students make their career choices. These are popular films of no great
scientific value, but they are interesting to watch anyway, given that you can read and
The other half, and the more important part of the collection, consists of films shot locally
at the University by the Department of Cinema and Photo Laboratory. These are mainly
educational films used in everyday teaching in many faculties; for example, many are sports
films and many concern medical issues. It is our understanding that these were matters
politically harmless enough to be shown freely in classes during the Soviet era. There are also
films depicting various events and everyday life at the University.
Videotape archives and video.ut.ee server
Since the end of the 80s and the beginning of the 90s, local events in Tartu and at Tartu
University have been recorded on videotape by our multimedia department, resulting in a
collection of hundreds of films on VHS, SVHS, miniDV and other magnetic media. The main
emphasis has been on cultural events such as music festivals and drama festivals, various
student-related activities and so on. Starting from 2004 most of the recordings have been
made digitally, with lectures and conferences recorded straight to hard disk, and a lot of them
are available for watching on the Internet at http://video.ut.ee.
In 2004 the digitizing of old VHS videotapes to DVD began. It was already a critical time by
then, because the picture quality of the VHS recordings was deteriorating rapidly; colors in
particular were fading on the older tapes, which is typical behaviour. These tapes are currently
being transferred to DVD, and because of the lack of technical staff this process will move
slowly, taking another 4 or 5 years to complete.
The most interesting are the recordings of Estonia regaining its freedom and of our national
flag being publicly displayed on the streets of Tartu again in 1988, for the first time in nearly
50 years. These recordings are now digitized and available for local viewing at the library on
DVD, and possibly on the Internet soon.
Sound recordings have also been made at the University's multimedia studio since 1957,
featuring interviews and talk shows with renowned scientists and professors, who reminisce
about past times and talk generally about their work. In a sense this is even more vital and
interesting information about those times than the films from the same era.
In 2001 a whole film collection was donated to Tartu University by Chronos Film Studios in
Germany, a very well known producer of documentary films owned by the von zur Mühlen
family, who were originally from Estonia and were forced to leave Estonia in the 1940s.
The von zur Mühlen collection consists of 65 Betacam cassettes and around 80 VHS cassettes
with footage of Estonia regaining its freedom at the end of the 1980s, as well as
documentaries with Lennart Meri, Jaan Kross and other renowned Estonians, and so on. These
films were all transferred to DVD professionally by Orbital Vox Studios in Tallinn, and are
also available for local use at the Tartu University Library.
Together with the recordings made by Tartu University at the same time, they form an
outstanding collection of contemporary Estonian history captured on video.
Currently we practice analogue video digitization to DVDs. Why is that?
Today in 2007 it still seems that archiving video to DVD is a fairly good idea. After all, the
MPEG-2 codec gives good enough picture quality, especially when converting from older
formats like VHS or even film. While new codecs and formats emerge (Blu-ray, HD DVD),
older ones such as MPEG-2 DVDs are still usable and supported by all hardware and software
players. A typical recordable DVD today holds about 4.5 GB of data, or 2 hours of
standard-quality video, which is enough for digitizing VHS and film. The newest regulations in
Estonia suggest the MPEG-4 codec as the standard for archiving digital video, but this codec
has yet to become a universal standard as widely supported as MPEG-2 and MPEG-1.
Let’s also keep in mind that DVDs offer more than just storage for moving pictures: you can
design menus, include photo slideshows and also author subtitles and subpictures. So we can
say that DVD has a specific kind of functionality which may itself need long-term preservation
in the future. In our case the menus are mostly machine-generated, and it is not important to
include them in streaming. However, it is important to include menus and bonus materials in
the preservation process.
While digitizing analogue video to DVD has benefits for usability (ease of use) and maybe
even from the preservation point of view, those DVDs are intended for local use at the library
only. However, it is fairly easy to convert the content of these DVDs into a streaming media
format and offer it over the Internet as an additional value-added service.
The existence of all these film and video collections led us at the Tartu University
Library to the decision to start managing them properly, as well as digitizing the old
collections and making them available to the general public again. That means proper
cataloguing of all the items, digitizing them, and storing the original materials in proper
archival conditions suitable for long-term preservation. It was decided that server copies must
also be made in a streaming media format for easy access on the Internet.
Now let me show you our streaming video server video.ut.ee.
On this server we collect streaming video of many activities taking place at the
University. These have been recorded digitally to hard disk and also to magnetic media such
as miniDV, and later authored to DVD, so there are superior-quality copies on DVD.
Again, you can see that we are archiving to DVD.
This is a Microsoft Windows Server 2003 machine running Windows Media Services, which
means the files are in Windows Media Video format and of fairly average picture quality.
However, since the Windows Media codec and Windows Media applications are so well
integrated into the Microsoft Windows operating system, this video server is able to stream
Windows Media Video at very high volumes of file requests. (HERE YOU CAN SEE SOME OF
THE BENEFITS OF MICROSOFT WINDOWS MEDIA SERVER - SLIDE)
There are around 1500 Windows Media files on this server right now, totalling around 130
While Windows Media Server is good for streaming media, it also has a major disadvantage
built into it. Because the Windows Media Server system stores its files on a regular Microsoft
NTFS file system, the files are not protected on these hard disks. Information can be
accidentally deleted, or it can get corrupted in various ways even without the human factor.
Further, corrupted files can get backed up without anyone noticing, and by the time the
corrupted or missing files are discovered it may be too late: all the good backups may already
have been overwritten, and valuable data lost. This is why we started looking for a better
solution to safeguard our data.
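The failure described here, where a damaged file is backed up before anyone notices, is what a periodic checksum comparison is meant to catch. The following is only a minimal sketch of that idea and not something we run on the server: the function names are made up for illustration, and it assumes the video files sit under a single directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map every file under root (by relative path) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Report files that vanished or whose content changed since `old`."""
    return {
        "missing": sorted(set(old) - set(new)),
        "changed": sorted(k for k in old if k in new and old[k] != new[k]),
    }
```

One way to use it would be to store the manifest after each ingest, rebuild it before every backup run, and hold the backup rotation whenever `compare` reports anything, so that a good older backup is not overwritten by a damaged copy.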
The issue of long-term preservation of digitized information on dedicated digital repository
servers has been in the air for quite some time already. Recent years have seen the
development of such software by many European and American institutions, and many more
institutions like universities, research centers, libraries and even archives have started using
their solutions. There are many instances of digital repository software around the world that
you can download and start to use instantly. Here you can see some of them. (SLIDE - DR)
There are different ways, or different philosophies, of how digital preservation works. We are
all familiar with how rapidly technology changes, including our own home computers and cell
phones. Older software may not work on a new computer, and new file formats hardly play on
older computer systems. This has led the developers of digital repository software to look for
ways to guarantee access to your data, no matter how old it is.
(SLIDE – strategies)
For instance, the BBC announced several months ago that they are using an old Windows 3.1
version running on a Windows Vista PC to get access to their old files. This is one way to go:
you have to preserve software or even hardware to be able to read your files some 15 years from
Here are some of the other methods or strategies. Of course you can combine them or use
different strategies at once.
Tartu University Library chose DSpace as its digital repository software in 2006. DSpace is
being developed by the Massachusetts Institute of Technology in cooperation with
Hewlett-Packard Laboratories, and these institutions have taken on the responsibility of
fulfilling the long-term preservation goal, although I am not sure exactly how they are doing
it. From what I have read, they plan to combine different preservation strategies or even run
different sets of strategies at once, thus being able to engineer and backward engineer all the
DSpace is probably the most developed digital repository software right now, with
its main emphasis on long-term preservation of digital content and scalability
of services. The software has been described as Information-Centric: the developers wish to
construct a system that will enable the information it deals with to outlive the
system itself. This means that the historic so-called “silo-oriented” view of information
living inside an asset management system will not be sufficient anymore. Systems and services
in the future will flow around existing information assets, and standards-based mechanisms
for access to those assets will need to be a part of the infrastructure. Of course, in many cases
there are no clear standards, or the existing standards are still being developed. In these
cases the developers of DSpace have taken on the duty to test, inform, and influence the
development of useful and appropriate standards to guarantee the survival of information
through changes of technology.
DSpace also has Low Adoption Barriers, meaning that maintenance and development costs
are both low: the software is relatively easy to install, administer and configure to your
specific needs. Hewlett-Packard and MIT have agreed to license all software produced within
the joint project under an open-source licence, meaning that the software is free for
everyone to use and all costs of administering DSpace relate to hardware and infrastructure
only (which are very important, of course).
After the initial experiences with DSpace I can say that it is great for archiving various
text-based research materials like Adobe PDF files, Microsoft documents and so on, and also
pictures: photos, maps, digitized books, autographs and so on. Since DSpace’s main goal is
preservation, it is actually very hard to delete something from the DSpace hard disks. The
files just stay there: you can remove the links to the files, but the information is always there
and you can always restore the links. So you have to be careful in selecting what
to submit, and you have to make sure that the information you submit is indeed finished work
and not a work in progress. You cannot make changes to the archived files later on.
Audio-video technology, however, is not that well supported. As of now, DSpace is not yet able
to properly stream A/V, and there are limits on the size of uploaded files, which can
get very large in the case of video. You can store regular MPEG files or other common video
files, but to watch them you have to download the entire file, although progressive download
seems to work well in the case of Windows Media Video. Hopefully proper video streaming
mechanisms will be built into the DSpace software soon.
So it has turned out that what we started as a digital video repository server has become a
digital repository server for electronic documents other than audio-video, which is also a good
thing of course. As you can see, there are already over a thousand dissertations and theses
submitted, and all of these are fully accessible and searchable not only by metadata but also
by full text. Because the server supports the Open Archives Initiative Protocol for Metadata
Harvesting (OAI-PMH), these electronic documents are indexed by big search engines such as
Google and by OAIster, a search service built on OAI harvesting.
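To give an idea of what a harvester actually asks for, here is a minimal sketch that only builds an OAI-PMH request URL. The verb `ListRecords` and the arguments `metadataPrefix`, `set` and `from` are part of the protocol, and `oai_dc` is the Dublin Core format every OAI-PMH repository must offer. The base URL is a made-up placeholder, not the address of our server.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only.
BASE_URL = "http://repository.example.org/oai/request"


def list_records_url(base_url, metadata_prefix="oai_dc",
                     set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL.

    set_spec limits the harvest to one set (collection);
    from_date (YYYY-MM-DD) limits it to records changed since that day.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    if from_date:
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


print(list_records_url(BASE_URL, from_date="2007-01-01"))
```

The response to such a request is an XML document with one record per item, which is what the harvesting services then index.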
Regarding video, our current plan is to archive disk image files of the DVDs as ISO-formatted
files on the DSpace server, and offer streaming media previews via the Microsoft Windows Media
server, so our DVDs will be safely stored on hard disks and preserved long-term. The ISO disc
image format is also a very standardized file format, allowing access to the content of DVDs or
CDs even without converting or writing the files to optical media; as of now there are many
software players capable of handling ISO files as if they were regular DVDs. The disc image
format is a great solution in that it makes the disc image files media-independent, while at the
same time acting as if the files were on an optical disc. Even if physical DVDs become
obsolete in the future, you can still use the ISO files.
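Because the layout of an ISO 9660 image is fixed by the standard, even a few lines of code can tell whether a file really is a disc image before it is submitted to the repository. The sketch below is an illustration only: it assumes the image carries ISO 9660 volume descriptors (DVD-Video images normally do, alongside UDF), and it looks only at the first descriptor, in sector 16. The function name is hypothetical.

```python
SECTOR_SIZE = 2048   # logical sector size of an ISO 9660 image
FIRST_DESCRIPTOR = 16  # volume descriptors start at sector 16


def read_volume_id(path):
    """Return the volume identifier of an ISO 9660 image, or None.

    None means the file has no primary volume descriptor at sector 16,
    so it is either truncated or not an ISO 9660 image at all.
    """
    with open(path, "rb") as image:
        image.seek(FIRST_DESCRIPTOR * SECTOR_SIZE)
        descriptor = image.read(SECTOR_SIZE)
    if len(descriptor) < SECTOR_SIZE:
        return None
    # Byte 0 is the descriptor type (1 = primary), bytes 1-5 the magic string.
    if descriptor[0] != 1 or descriptor[1:6] != b"CD001":
        return None
    # Bytes 40-71 hold the space-padded volume identifier.
    return descriptor[40:72].decode("ascii", errors="replace").strip()
```

A check like this does not prove the whole image is intact, but it is a cheap first test that the file can still be recognised for what it is.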
So it seems to be safe to preserve ISO disc image files long-term, and it seems pretty likely
that further development will be done in this direction by big corporations soon. After all, the
Moving Picture Experts Group, the developer of the new MPEG-A standard, has specified an ISO
file format for storage of information.
(SLIDE – three sites)
Regardless of the repository server handling archival tasks, proper metadata will be generated
in MARC format on yet another server running Innovative Millennium software, and
appropriate links will be generated between the two or even three servers. Initial metadata will
be generated on the DSpace server for automated metadata harvesting, while more thorough
descriptions of archival items will be generated on the Innovative Millennium server.
So our current plan is to have 3 servers: WMV files on the Windows Media Services
server, archived ISO files on the DSpace server, and proper metadata on the Innovative
Millennium server, all linked to each other by URLs. This of course means a lot of manual work,
and maybe it is best to create a web site to bring all the information together and present all
the films and related materials in one place.
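The three-server arrangement can be thought of as one small record per film holding three URLs. The following is only a sketch of that idea: the class, the field names and the example addresses are invented for illustration, and nothing like this exists on our servers yet.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class FilmLinks:
    """One film and the URL of each of its three representations."""
    title: str
    stream_url: str     # WMV preview on the Windows Media Services server
    archive_url: str    # ISO image stored on the DSpace server
    catalogue_url: str  # MARC record on the Innovative Millennium server

    def missing(self) -> list[str]:
        """Names of the link fields that are still empty."""
        return [f.name for f in fields(self)
                if f.name.endswith("_url") and not getattr(self, f.name)]


def render(entry: FilmLinks) -> str:
    """Plain-text listing for a single film, one link per line."""
    lines = [entry.title]
    for f in fields(entry):
        if f.name.endswith("_url"):
            lines.append(f"  {f.name}: {getattr(entry, f.name) or '(not linked yet)'}")
    return "\n".join(lines)
```

Keeping the links in one place like this would make it possible to generate the web site pages automatically and to list the films whose links are still incomplete, instead of checking each record by hand.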
There are some possibilities for crosslinking the contents, as illustrated in this case (WEB -
MATI UNT). Here the video is on the Windows Media Server, fully streaming, and you can
skip to whatever position you like. It is actually an unofficial solution, or simply an
experiment: by inserting a special kind of code into a MARC bibliographic record we
discovered that we can have a fully working video window inside the MARC record. It seems
to work well on all Windows computers and also on Mac computers, but it seems you have to
install the necessary software to get it working on Linux computers. Linux operating systems do
not support Microsoft formats by default.
(HERE YOU CAN SEE A MAC SCREENSHOT – the embedded Windows Media
Player has simply been substituted by Apple’s QuickTime player, and it works just fine)
This is an unofficial solution and may not work after the next Millennium software update,
but at least it is possible to do it that way now.
Creative Commons licence - It will offer the ability to release audio-visual content for
viewing, copying and sharing, but with some rights reserved to the author, such as commercial
Future developments THE END