					  Indexing and Accessing Electronic Theses and Dissertations:

                           Some Concerns for Users

                             Ilene Frank and Walter C. Rowe

Once a dissertation or thesis is completed and the degree granted, how do others know it
exists? Libraries have a long history of housing and cataloging the intellectual output of
the graduate degree process. Catalogers have worked in an online environment since the
early 1970s, first sharing cataloging records, then placing them into online “card”
catalogs using an agreed-upon standard for input: the MARC (Machine Readable
Cataloging) format. The transition from drawers of printed catalog cards to online
databases is part of an evolutionary process based upon existing library practices.

How then does the electronic format for theses and dissertations affect the effort to
describe these items? This paper will focus on a general overview of issues involved in
providing access through the cataloging process for electronic theses and dissertations
(ETDs). We will discuss cataloging issues from the point of view of the user rather than
the cataloger. This paper is based in part on discussions held at the University of South
Florida (USF) during 1997 as the librarians developed a proposal for managing ETDs.
We will use the term “dissertations” to include theses as well.

A key goal of the movement toward ETDs is to improve awareness and increase
distribution of student research. The Networked Digital Library of Theses and
Dissertations (NDLTD), which has been under discussion since 1987, has provided a focus
for these efforts. This project is international in scope. A “Federated Search System” has
been developed that allows for searching for dissertations from diverse databases from a
number of countries including Canada, the United States, and Germany. With the goal of
increasing awareness of ETDs in mind, libraries are taking a number of approaches to
improve points of access. One method is to integrate records for ETDs into the library
catalog. As users search the catalog for books and documents, they will find dissertations
as well. This approach is useful for researchers who want the whole range of material
available on a given topic. It is not unusual for a search in a Web-based library catalog to
include references to books, videos, other multimedia, and references that include
clickable URLs. The records for ETDs will include clickable links to the material itself.

The other approach to dealing with documents like ETDs is to segregate them into a
special database, perhaps as part of a “Virtual Library.” This second approach is useful
for more sophisticated researchers who view dissertations as a type of resource. They
seek out ETDs as a special class of materials. This solution can be useful for public
relations purposes. It becomes clear to users that there is an extra effort being made to
highlight the intellectual efforts of the school’s graduates.

Fortunately, it is reasonable to have both strategies coexist. For example, the University
of South Florida will maintain records for ETDs in the online catalog
(WebLUIS/USF) and also develop a second search tool using OCLC’s SiteSearch software
for those researchers who would like to head directly to the ETD collection. The record
for the dissertation used in the online catalog will serve as part of the record for the
SiteSearch database. Virginia Tech has used the same approach. Dissertations are
included in the campus online catalog (that is supported with VTLS software and the
Addison front-end). The material is also searchable as a discrete database via a search
engine, OpenText LiveLink.

One might think that having multiple points of access would be a straightforward and
foregone conclusion. After all, librarians have been including dissertations in their
collections for more than a hundred years. Many libraries create cataloging records for
their school’s dissertations to add to their library catalogs. Subject, author, title, local
notes, and, at times, table of contents have all been basic staples in the construction of a
bibliographic record for a thesis or dissertation. Some include abstracts of dissertations
as well. Local notes can include major professor, department for which the dissertation
was completed, etc.

The proposal to set up a system to distribute theses and dissertations electronically gave
the catalogers an opportunity to re-examine their practices. Should the records for ETDs
be enhanced in any way? For example, should abstracts be added to the University of
South Florida cataloging record? What about subject headings? Subject headings take a
lot of time to construct. With improved searching capabilities, would it be possible to
save staff time by providing fewer access points since the material itself would be online
and searchable? Adding subject headings has always proven to be a time-consuming task
for dissertations which deal with very specialized topics. Would a traditional-style record
describing the dissertations be necessary at all? Other questions emerged.

It is important to review the ways that the existence of dissertations is brought to the
attention of users:

* The institution’s library catalog. Now that so many library catalogs are accessible
electronically, users can visit catalogs via Telnet or the Web to determine what
dissertations might be available at a given institution. Users need to be alert to the
differing search capabilities at each institution.

* OCLC’s WorldCat. The database is available to users at many libraries as one of the
FirstSearch databases. This source covers material in many libraries. Using keyword
features available through FirstSearch, users can locate a wide array of theses and
dissertations. A sampling study done by OCLC in 1998 suggests that there are several
million such records.

* Indexing services provided by professional associations. Example: the American
Psychological Association lists dissertations in its versions of “Psychological Abstracts”
including the electronic form PsycINFO. These services only cite dissertations in certain

* UMI’s “Dissertation Abstracts International” (DAI), which is available in many
libraries in hardcopy, CD-ROM, or online formats. UMI is offering free guest
accounts which allow for searching authors, titles, and keywords for the last two years of
dissertations through their “ProQuest Digital Dissertations” products
( (The complete database is accessible to
institutional subscribers.) DAI has always been a prime source for locating doctoral
dissertations since the indexing goes back to 1861 and provides coverage of dissertations
from the U.S.

* Notices in professional journals or via websites that compile dissertation listings that
are discipline specific. Example: The American Musicological Society provides a
searchable database of dissertations in musicology called “Doctoral Dissertations in
Musicology Online” ( This was formerly
a feature of the print publications of the Society. This source includes works in progress
as well as listings for completed dissertations.

* The Networked Digital Library of Theses and Dissertations (NDLTD) has developed
the ability to search across databases of collections of theses and dissertations via a
system of Federated Searches (

Each one of these access tools provides differing interfaces and differing searchable
fields. As noted above, ProQuest Digital Dissertations provides guest users with the
capabilities of searching for author, title, and keyword of dissertations. Since only two
years of dissertations are accessible in this abbreviated guest version of the database,
these fields will provide many users with enough capabilities to find recent dissertations.
Other subscription controlled versions of the DAI database allow for more sophisticated
searching techniques. How many searchable fields are enough?
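
The difficulty of searching across tools with differing fields can be illustrated with a minimal sketch. Everything in it is hypothetical: the two in-memory "sources", their field names, the placeholder records, and the `federated_search` function are assumptions made for illustration and do not describe how the NDLTD Federated Search or any other named service actually works.

```python
# Hypothetical sketch: each source uses different field names, so a
# federated search maps them onto a common set before matching.

# Two made-up sources with placeholder records (not real catalog data).
SOURCE_A = [
    {"author": "Author One", "title": "A study of vocal pitch"},
    {"author": "Author Two", "title": "Coping strategies in children"},
]
SOURCE_B = [
    {"creator": "Author Three", "name": "Roof pitch and wind loading"},
]

# Per-source mapping from local field names to common field names.
FIELD_MAPS = {
    "A": {"author": "author", "title": "title"},
    "B": {"creator": "author", "name": "title"},
}


def normalize(record, field_map):
    """Rename a record's local fields to the common field names."""
    return {common: record[local] for local, common in field_map.items()}


def federated_search(keyword, sources):
    """Return normalized records whose title contains the keyword."""
    hits = []
    for source_id, records in sources.items():
        for record in records:
            common = normalize(record, FIELD_MAPS[source_id])
            if keyword.lower() in common["title"].lower():
                hits.append({"source": source_id, **common})
    return hits


results = federated_search("pitch", {"A": SOURCE_A, "B": SOURCE_B})
```

The sketch searches only the two fields both sources can be mapped onto. A field that exists in just one source, such as an abstract, cannot be searched across all of them, which is one reason the number of shared searchable fields matters.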

Virginia Tech’s ETD database is powered by OpenText and the initial search screen
( provides a simple keyword access to title
pages and abstracts. This system accommodates the information queries mentioned above
since the data elements are available on the title pages of dissertations. Major professor,
department, and other pertinent information are searchable. In addition, Virginia Tech
includes a cataloging record of each dissertation and thesis in its online catalog. This
ensures that the dissertations can be found along with all of Virginia Tech’s other books
and periodical titles. The specialized, separate database devoted to ETDs serves as a
means of showcasing the material. Users outside the institution are likely to seek out this
specialized database.

As mentioned above, the plan for the University of South Florida ETD project is similar.
USF is using OCLC’s SiteSearch software to provide searching capabilities for the
database that will be devoted to ETDs. SiteSearch is a commercially available search
suite from OCLC ( Current plans allow for
search capabilities much like Virginia Tech’s. Users of the ETD database will be able to
search the cataloging record for each dissertation and the abstract as well. USF will be
adding searchable abstracts provided by the graduate student as part of the submission
process. This is the first time these abstracts will be included in the USF cataloging
records. This approach should prove helpful to the average users of library systems.
Abstracts provide a valuable, additional searchable field which expands the vocabulary
found in the titles of dissertations.

Users are seeking information beyond author and title. Kay E. Lowell’s article (Lowell
1998) on “Added Access Points in Thesis Cataloging” points out many local needs -
some of which are still poorly accommodated by national cataloging standards. Students
at a given institution want to see what other graduates have produced. They want
examples of other students’ writing and research. They may be less interested in the
specific research presented in the dissertation. For example, they may want a list of recent
dissertations done for the department of psychology. They may want to know which
professor chaired which dissertation committee. They may want similar information for
other schools as well. What kinds of dissertations are being done for the PhD program in
art education at New York University? Since schools like Virginia Tech and the
University of South Florida are providing searches run against the title pages of
dissertations, names of major professors and other data can be retrieved.

In our Position Paper for Electronic Theses and Dissertations (University of South
Florida, 1998) USF librarians address the issue of cataloging as follows: “Full level
cataloging will be continued and records will continue to be added to OCLC. Access
points will include author, title, keyword, LCSH (Library of Congress Subject Headings)
subject headings, level, Department, and Major Professor. It is anticipated that current
levels of staffing will be maintained.”

The decision to add full-blown subject headings is an important topic of discussion. The
most time-consuming part of theses and dissertations cataloging is adding subject
headings. This is the kind of work that necessitates a well-trained, professional cataloger.
Why add subject headings to a cataloging record when keywords for the entire cataloging
record and even full text searching of entire documents are available? As noted above,
one way that users discover dissertations is within the confines of the library’s online
catalog. Without the additional subject headings added to the cataloging record in Figure
1, a researcher using the library's online catalog is not likely to find this thesis on self-
esteem. The title mentions “coping strategies” but neither the term “stress” nor the term
“self-esteem” appears in the title of the thesis. The inclusion of the approved Library of
Congress subject heading (labeled “Subjects, General” in Figure 2) “self-esteem in
children” ensures that users will find this item along with other books on the topic.

Figure 1: Example of Cataloging Record

        Santa-Lucia, Raymond C.
A situational investigation of hassles, uplifts, coping strategies, and adjustment in 3rd-
through 5th-grade children / by Raymond C. Santa-Lucia.
v, 35 leaves : ill.; 29 cm.
Thesis (M.A.)--University of South Florida, 1998.
Includes bibliographical references (leaves 33-35).
Self-esteem in children.
Stress (Psychology) in children.
Adjustment (Psychology) in children.
Dissertations, Academic--USF--Psychology--Masters.
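
The effect of the subject headings on retrieval can be shown with a small sketch. The title and headings are transcribed from Figure 1; the dictionary layout and the `matches` function are assumptions made for illustration, not a description of how WebLUIS performs keyword searches.

```python
# Record transcribed from Figure 1; the dictionary layout is an assumption.
record = {
    "title": ("A situational investigation of hassles, uplifts, coping "
              "strategies, and adjustment in 3rd- through 5th-grade children"),
    "subjects": [
        "Self-esteem in children.",
        "Stress (Psychology) in children.",
        "Adjustment (Psychology) in children.",
        "Dissertations, Academic--USF--Psychology--Masters.",
    ],
}


def matches(record, term, fields):
    """Return True if the term occurs in any of the named fields."""
    term = term.lower()
    for field in fields:
        value = record[field]
        # A field may hold one string or a list of strings.
        values = value if isinstance(value, list) else [value]
        if any(term in v.lower() for v in values):
            return True
    return False


terms = ("self-esteem", "stress")
title_only = {t: matches(record, t, ["title"]) for t in terms}
with_subjects = {t: matches(record, t, ["title", "subjects"]) for t in terms}
```

Searching the title alone finds neither term, while searching the title together with the subject headings finds both.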


Another reason to add subject headings is that English is a notoriously imprecise
language. The word “pitch” is a good example. The term means different things to
musicians, reading teachers, vocal coaches, architects, engineers, etc. Is it “the pitch of
the roof” or “the pitch of the actor’s voice”? Here are some examples from the University
of South Florida’s WebLUIS online catalog using the keyword search “pitch and

Figure 2: Results of searches using the keywords “pitch and dissertations” – in the
University of South Florida online catalog, WebLUIS

Author, etc.: Goodwin, Mark A.
Title: The effectiveness of Pitch Master compared to traditional classroom methods in
teaching sight singing to college music students / by Mark A. Goodwin.
Published: 1990.
Description: xi, 167 leaves: ill., music ; 28 cm.
Notes: Thesis (Ph. D.)--University of South Florida, 1990.
Includes bibliographical references (leaves 135-141).
Subjects, general: Sight-singing--Instruction and study.
Music--Programmed instruction.
Dissertations, Academic--USF--Music education--Doctoral.

Author, etc.: Shaw, Jill D. K.
Title: The relationships in the usage of oral contraceptives and their effects on vocal pitch
and vocal quality: a short term study / by Jill D. K. Shaw.
Published: 1979.
Description: viii, 42 leaves; 29 cm.
Notes: Thesis (M.S.)--University of South Florida, 1979.
Bibliography: leaves 34-36.
Subjects, general: Oral contraceptives--Side effects.
Dissertations, Academic--USF--Speech-Language Pathology--Masters.

Author, etc.: Lutfi, Robert A.
Title: The effects of uncertain mask intensity and frequency on
pitch judgments in the backward recognition masking paradigm / by
Robert A. Lutfi.
Published: 1977.
Description: iv, 54 leaves; 29 cm.
Notes: Thesis (M.A.)--University of South Florida, 1977.
Bibliography: leaves 52-54.
Subjects, general: Psychology, Experimental.
Human information processing.
Dissertations, Academic--USF--Psychology--Masters.

The likelihood that these theses will reach interested readers depends in part on the
addition of subject headings. “Subjects, general” seen in the bibliographic records in
Figure 2 is a designation for a field for approved Library of Congress subject headings.
In spite of the benefits, some libraries choose not to add subject headings due to staffing
constraints - a very real concern in libraries faced with increasing workloads. The
application of authoritative subject headings can be a daunting task.
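
One way a search interface might use such headings to separate the different senses of “pitch” is to group keyword hits by the department named in the local heading. The sketch below is only illustrative: the authors and headings are transcribed from Figure 2, but the `parse_local_heading` function and the assumption that every local heading follows the same four-part pattern are ours. Other institutions' in-house conventions may differ.

```python
from collections import defaultdict

# Authors and local headings transcribed from Figure 2.
hits = [
    ("Goodwin, Mark A.",
     "Dissertations, Academic--USF--Music education--Doctoral"),
    ("Shaw, Jill D. K.",
     "Dissertations, Academic--USF--Speech-Language Pathology--Masters."),
    ("Lutfi, Robert A.",
     "Dissertations, Academic--USF--Psychology--Masters."),
]


def parse_local_heading(heading):
    """Split a local heading into (department, level).

    Assumes the four-part pattern seen in the figures:
    'Dissertations, Academic--<institution>--<department>--<level>'.
    """
    parts = heading.rstrip(".").split("--")
    if len(parts) != 4 or parts[0] != "Dissertations, Academic":
        raise ValueError(f"unexpected heading: {heading!r}")
    return parts[2], parts[3]


# Group the keyword hits by department.
by_department = defaultdict(list)
for author, heading in hits:
    department, level = parse_local_heading(heading)
    by_department[department].append((author, level))
```

The same grouping would also serve students who want a list of recent dissertations from a single department.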

However, without the addition of subject headings, users may find that they need to
exercise some inventiveness in order to uncover everything a given institution has to
offer. The addition of abstracts as searchable elements of the cataloging record is also
helpful. Some institutions may decide that the addition of abstracts provides enough
additional data for successful searching.

This important access point of subject headings has consistently consumed the most
discussion among librarians. One method for supplying subject headings depends upon
the input of the author as part of the submission process rather than the librarian's
evaluation of the content of the material. This practice came about partially as a method
to save librarians' time and thereby save costs. The theory seemed like an excellent
strategy, but in practice problems arose. The cataloger learned that subjects submitted by
the author were not consistent with standard subject headings. A student in the throes of
submitting a dissertation is not a good candidate for a course of study on Library of
Congress Subject Headings. (For a time, the University of South Florida catalogers tried
setting up meeting times with authors. Even in the face-to-face interview, it was difficult
for the students to contribute useful subject headings. Also, since meetings between
authors and catalogers were deemed necessary, this approach was not a time-saver for the

Since the rules for cataloging dissertations do not provide guidance, this may be one of
the reasons that “local notes” have become a necessary addition for the expedient input of
the bibliographic record for a thesis or dissertation. Catalogers have developed in-house
standards for adding local information. This kind of practice makes searching across
databases problematic. The record becomes searchable but users may need to learn local
rules for search strategies in order to retrieve local notes.

The abstract is provided by the student. As part of the ETD submission process, students
are asked to input information which is used as part of the cataloging record. The student
submits the title of the dissertation, the name of the department, the committee members,
abstract, and other pertinent information. Students are asked to assign keywords to their
dissertation. These keywords enhance the verbiage used in the title and abstract of the
dissertation. Since the students themselves have expert knowledge of the work in hand,
they can best provide synonyms and/or alternate terms that describe the topic of their
dissertation. Thoughtful keyword selection can ensure that the dissertation finds its
reader. Librarians can transfer this information to the record destined for the online

In fact, much of the work of developing the cataloging record can be done automatically
as part of the submission process. The ETD projects can provide catalogers with
important information for descriptive, subject heading and information notes. For
example, at Virginia Tech, catalogers 'copy and paste' the student's abstract into the
cataloging record. Virginia Tech is also developing a Perl script which will map the
submission information to a MARC record. This kind of work can be handled by
assistants, which will free up the librarian's time to handle other matters such as the
application of subject headings.
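
A mapping of this kind might look roughly like the following sketch. It is not Virginia Tech's script: it is written in Python rather than Perl, the submission field names and sample values are hypothetical, and the output is a simplified tag-and-subfield listing rather than a true machine-readable MARC record. The tags follow common MARC 21 usage (100 author, 245 title, 502 dissertation note, 520 summary, 653 uncontrolled index term, 856 electronic location).

```python
# Hypothetical submission form data; field names and values are assumptions.
submission = {
    "author": "Doe, Jane",
    "title": "An example thesis title",
    "degree": "M.A.",
    "institution": "Example University",
    "year": "1998",
    "abstract": "A short abstract supplied by the student.",
    "keywords": ["example keyword", "another keyword"],
    "url": "http://example.edu/etd/doe.pdf",
}


def submission_to_marc_fields(sub):
    """Map submission data to a list of (tag, subfields) pairs."""
    note = f"Thesis ({sub['degree']})--{sub['institution']}, {sub['year']}."
    fields = [
        ("100", {"a": sub["author"]}),    # main entry: personal name
        ("245", {"a": sub["title"]}),     # title statement
        ("502", {"a": note}),             # dissertation note
        ("520", {"a": sub["abstract"]}),  # summary (student's abstract)
    ]
    # One 653 field per student-supplied keyword (uncontrolled terms).
    fields += [("653", {"a": kw}) for kw in sub["keywords"]]
    fields.append(("856", {"u": sub["url"]}))  # link to the ETD itself
    return fields


def render(fields):
    """Render fields in a simple human-readable 'tag $code value' layout."""
    lines = []
    for tag, subfields in fields:
        body = " ".join(f"${code} {value}" for code, value in subfields.items())
        lines.append(f"{tag} {body}")
    return "\n".join(lines)


marc_text = render(submission_to_marc_fields(submission))
```

The sketch deliberately produces no 650 fields. Controlled Library of Congress subject headings are left for the cataloger to add, while the student's keywords go into the uncontrolled 653 fields.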

Since some libraries will continue to add cataloging records to their online catalog, we
have already mentioned the MARC record as a standard tool for constructing
bibliographic records. There are also other ways to construct a bibliographic record in an
electronic environment. These are mentioned in other chapters of this book. One is
Dublin Core, a metadata (data about data) element set intended to facilitate access of
electronic resources. This is based upon many of the elements used in a MARC formatted
record but constructed for use specifically in a web based environment. Another is the
Text Encoding Initiative (TEI), an international project to develop guidelines for the
preparation and interchange of electronic texts for scholarly research. A TEI header is the SGML
(Standard Generalized Markup Language) approach to a bibliographic record. It contains
many of the elements of MARC and the Dublin Core but was from its inception designed
for electronic dissemination of information. As librarians seek to improve access to
electronic material, some of these tools will supplement or eventually supplant the more
traditional cataloging record.
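
To give a sense of what a Dublin Core description looks like, the sketch below expresses the first record from Figure 2 as Dublin Core elements in XML. The element names (creator, title, date, type, subject) and the namespace belong to the Dublin Core element set. The wrapping `record` element, the choice of which catalog field feeds which element, and the `to_dublin_core` function are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def to_dublin_core(record):
    """Build a <record> element holding one dc:* child per value."""
    root = ET.Element("record")  # wrapper element name is an assumption
    for element, values in record.items():
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
            child.text = value
    return root


# Values transcribed from the first record in Figure 2.
record = {
    "creator": ["Goodwin, Mark A."],
    "title": ["The effectiveness of Pitch Master compared to traditional "
              "classroom methods in teaching sight singing to college "
              "music students"],
    "date": ["1990"],
    "type": ["Thesis (Ph. D.)"],
    "subject": ["Sight-singing--Instruction and study.",
                "Music--Programmed instruction."],
}

xml_text = ET.tostring(to_dublin_core(record), encoding="unicode")
```

Every element is optional and repeatable, so the two subject headings simply become two `dc:subject` elements.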

To non-catalogers this attention to details, the discussion of national and international
standards in terms of record elements, subject headings, etc. might appear to be overly
complex. However, all this foundation-laying activity benefits the user. As we move
toward constructing the “database of databases,” articulation of record elements becomes

One of the biggest questions to be faced in cataloging and identifying ETDs is likely
to be the changing nature of a dissertation. As graduate schools decide to accept
multimedia dissertations, librarians will have to develop a language for describing these
formats. Virginia Tech has already seen an influx of dissertations which include
QuickTime movies, audio files, and a variety of graphics files. Until access to all these
formats is transparent to the user, catalogers will need to carefully describe all these file
types. Catalogers have taken this challenge head on. They have developed notes fields to
indicate what software is needed to view all the parts of a particular thesis or dissertation.
These notes are included in the bibliographic record for the ETD.

File types are bound to change. It is inevitable that software and hardware will continue
to evolve. Adobe Acrobat's PDF format seems to be a current favorite due to its
capability to produce an electronic version that maintains the look and feel of a printed
dissertation. If the nature of the dissertation changes, as seems likely, other file types will
emerge. If older materials are transferred to new media, cataloging records will have to
change. Why change the record if the intellectual content has not changed? Cataloging
records provide information about the media in which the material is presented. A long-
playing record provides a different quality of sound than a CD recording even though
both items may be recordings of the same performance of Beethoven’s Ninth Symphony.
Also, the two formats need different devices in order to play back. One could argue that if
both paper copy of a dissertation and its electronic equivalent are available in the same
library collection, then these items need two cataloging records.

This seemingly sensible arrangement can be confusing for users. Combining the
information about the two formats into one cataloging record allows the user to select the
one most suitable at a given moment. The University of South Florida has gone one step
further. If a paper and an electronic copy of an item exist, the item is given a single
cataloging record. Both the paper and the electronic format are indicated. A separate,
additional record for the electronic version is added to the catalog as well so that users
can easily identify material available in electronic format. The ETDs will be among those
items readily identifiable as part of the “electronic partition.”

Cataloging records should indicate the file formats in which the material is stored. Users
will need to know what software and hardware are needed in order to retrieve ETDs.
Catalogers will need to develop routines to change cataloging records as ETDs migrate to
new formats.


In conclusion, there are some points we want to review:

* Strategies that integrate ETD cataloging records into the library's main online catalog
aid the average user who may overlook useful dissertations if only listed in a separate

* Decisions about the level of cataloging provided should be user-driven. This may
necessitate clever use of national cataloging standards to accommodate user needs. Also,
as noted by many frustrated web searchers, full text keyword searching can result in
many hits, few of which are useful.

* Adroit cataloging (which may be embedded in the electronic file itself) generally means
fewer false hits.

* Graduating students are well-advised to think carefully about writing abstracts and
suggesting keywords keeping retrieval in mind. Good keywords can ensure that a
dissertation finds its reader.

* Subject headings are time-consuming to add, but help users develop precise
searches. Today these are generally added by librarians. Research may lead to automatic
aids or means to add subject headings with less human intervention.

* Librarians should be alert to emerging tools and strategies for describing electronic
documents. This too will ensure that useful material falls into the hands of interested


Lowell, Kay E. (1998).
“Added access points in thesis cataloging: enhancing public service
without running athwart input standards.”
Cataloging & Classification Quarterly, vol. 26, no. 2, pp. 57-71.

McMillan, Gail (1996).
“Electronic Theses and Dissertations: Merging Perspectives.”
Cataloging & Classification Quarterly, vol. 22, no. 3/4, pp. 105-125.

University of South Florida Libraries. Electronic Theses and Dissertations Team. (1997).
Draft Electronic Dissertations and Theses Position Paper.

University of Virginia Library. Cataloging Services Department. Ad Hoc Committee on
Digital Access. (1998). “Final Report.” June 15, 1998.
(, November 21, 1998.
