                             Oregon State University (OSU)
                          CONTENTdm Controlled Vocabulary Analyzer
                                   Modified: April 26, 2004
                                         Terry Reese
                                        541-737-6384
                                terry.reese@oregonstate.edu




Background:
The Controlled Vocabulary Analyzer grew out of a need for some form of authority control in
our CONTENTdm collections. One of the difficulties with CONTENTdm is maintaining a single
master vocabulary list for specific collections, particularly when multiple users are creating
and adding metadata. The Controlled Vocabulary Analyzer lets me quickly generate a series of
reports showing which metadata terms are currently set up as controlled vocabulary, and for
which project, as well as which metadata terms are actually in use within a project or projects.

The script can be run in two modes: list mode and report mode. In list mode, the script
generates two files. The first file is a master list of all available controlled vocabulary terms
for a particular metadata element. The second file is a master list of all controlled vocabulary
terms actually used within the project for a particular metadata element. The second mode,
report mode, generates nine separate reports covering both the controlled vocabulary lists and
the actual usage of metadata elements.

Video Examples:

    • Analyzing Selected Collections -- Size: 880 K
    • Analyzing all collections -- Size: 398 K

Potential Uses:

    • Maintaining a master list of controlled vocabulary elements for particular Dublin Core
      fields. This is probably most useful for Creator, Contributor, Coverage.Spatial and
      Subject.
    • Identifying conflicting terms or names between projects (which would affect cross-project
      searching) or within an individual project.

System Requirements:
    • Perl

Arguments:

    • server: Name of the CONTENTdm server that you want to query.
    • projects: (OPTIONAL) Limits the analyzer to a specific list of projects. The projects
      argument should be constructed as a colon-delimited list, e.g.
      dna:archives:streamsurvey. This argument is checked against the server's catalog file,
      and so long as the collection is present, it will be analyzed.
    • dc: Name of the Dublin Core element that you wish to query. The element can be passed
      in CONTENTdm shorthand or in Dublin Core notation, including the dc: or dcterms:
      namespace (example: dcterms:spatial).
    • nicks: Specifies a project and a CONTENTdm nickname. If the nicks value is present, the
      data in projects and dc are ignored.
    • debug: Set to 1 to turn on report mode. Set to 0 (or omit) to activate list mode.
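
The argument rules above can be sketched in a few lines. This is only an illustration in Python, not the analyzer's own Perl code: the function names, the settings dictionary and the server name cdm.example.edu are all hypothetical, and the format of the nicks value is not described here, so it is passed through untouched.

```python
def split_dc(dc):
    """Split 'dcterms:spatial' into ('dcterms', 'spatial').

    A bare name such as 'subject' has no namespace and yields (None, 'subject').
    How the analyzer maps CONTENTdm shorthand is not covered by this sketch.
    """
    namespace, _, name = dc.rpartition(":")
    return (namespace or None, name)


def resolve_arguments(args):
    """Turn raw argument strings into a settings dict (illustrative only)."""
    settings = {
        "server": args["server"],
        # debug=1 selects report mode; 0 or absent selects list mode.
        "mode": "report" if str(args.get("debug", "0")) == "1" else "list",
    }
    nicks = args.get("nicks")
    if nicks:
        # When nicks is present, projects and dc are ignored.
        settings["nicks"] = nicks
        return settings
    # projects is a colon-delimited list; an empty list is taken here to mean
    # "analyze every collection on the server" (an assumption of this sketch).
    settings["projects"] = [p for p in args.get("projects", "").split(":") if p]
    settings["dc"] = args.get("dc")
    return settings


example = resolve_arguments({
    "server": "cdm.example.edu",  # hypothetical server name
    "projects": "dna:archives:streamsurvey",
    "dc": "dcterms:spatial",
})
```

Here example["projects"] holds the three project names and example["mode"] is "list", because debug was not supplied.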

Generated Files (List Mode):

    • Master Controlled Vocab. list: This is a complete list of all Controlled Vocabulary Terms
      from all projects on the designated server.

    • Master Controlled Vocab. list for actual in use items: This is a complete list of all
      Controlled Vocabulary Terms for all projects on the designated server that are currently
      used in a project’s metadata.
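
As a rough illustration of the two list-mode files, the sketch below builds both master lists from in-memory data. It is written in Python rather than the analyzer's Perl, the function name and sample data are hypothetical, and it assumes that any term appearing in a project's metadata counts as "in use". How the real script reads terms from the server and names its output files is not shown.

```python
def master_lists(defined, in_use):
    """Return (all defined terms, all terms in use), each sorted and de-duplicated.

    defined: {project: set of controlled vocabulary terms set up for the element}
    in_use:  {project: set of terms that actually appear in the metadata}
    """
    all_defined = sorted(set().union(*defined.values()))
    all_used = sorted(set().union(*in_use.values()))
    return all_defined, all_used


# Hypothetical sample data for a Subject element in two projects.
defined = {
    "project1": {"Rivers", "Lakes"},
    "project2": {"Rivers", "Streams"},
}
in_use = {
    "project1": {"Rivers"},
    "project2": {"Rivers", "Streams"},
}

all_defined, all_used = master_lists(defined, in_use)
```

With this data, Lakes appears only in the first list, since it is defined in project1 but never used.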


Generated Files (Report Mode):

    • Tab Delimited Term/Count list: A tab-delimited list of each defined controlled
      vocabulary term, along with the number of times the term has been defined within one’s
      CONTENTdm projects.

    • Tab Delimited Cross Term usage list: A tab-delimited list of each controlled vocabulary
      term and the projects in which it is defined. This file includes only terms that are
      defined in multiple projects. Note that this reflects definitions, not usage: a term being
      defined as a controlled vocabulary element does not mean that it has actually been used
      within the project.

    • HTML Info list -- Displays Term and Projects: An HTML-generated table containing the
      same information as the Tab Delimited Term/Project list, that is, which terms are
      defined in which projects.

    • Controlled Vocabulary Usage Table: An HTML report noting whether a project uses a
      controlled vocabulary for a particular Dublin Core element, and the number of elements
      that do or do not use a controlled vocabulary.

    • Tab Delimited Term/Project list: A tab-delimited list of which terms are defined in
      which projects.

    • In Use Tab Delimited Term/Count List: The terms in this list are those actually in use
      (appearing in the metadata) in a project. The count is the number of projects in which a
      term has been used, not the frequency of its use within a single project.

    • In Use Tab Delimited Cross Term usage list: A tab-delimited list of cross term usage
      between collections. For example, if you have used Rivers within the subject element of
      your metadata in Project1 and in Project2, but in Project3 have only defined Rivers as
      an available controlled vocabulary term, then the term Rivers would be output along
      with Project1 and Project2.

    • In Use HTML Info List -- Displays Term and Projects: Displays the same information as
      the In Use Tab Delimited Term/Project list. It is an HTML representation of a report
      that notes each controlled vocabulary term currently in use and the projects in which
      the term appears (i.e., is used within the project metadata).

    • In Use Tab Delimited Term/Project list: A tab-delimited representation of a report that
      notes each controlled vocabulary term currently in use and the projects in which the
      term appears (i.e., is used within the project metadata).
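
Most of the tab-delimited reports above come down to the same term-to-projects index, built either from the defined terms or from the terms in use. The Python sketch below shows one way that index, the per-term project counts and the cross term filter could be computed. The function names and the column layout (term, a tab, then a comma-separated project list) are assumptions for illustration and may not match the files the Perl script actually writes.

```python
from collections import defaultdict


def term_projects(terms_by_project):
    """Invert {project: terms} into {term: sorted list of projects}."""
    index = defaultdict(set)
    for project, terms in terms_by_project.items():
        for term in terms:
            index[term].add(project)
    return {term: sorted(projects) for term, projects in sorted(index.items())}


def term_counts(index):
    """Number of projects each term appears in (not frequency within a project)."""
    return {term: len(projects) for term, projects in index.items()}


def cross_terms(index):
    """Keep only the terms that appear in more than one project."""
    return {term: projects for term, projects in index.items() if len(projects) > 1}


def to_tab_delimited(index):
    """One line per term: the term, a tab, then its projects (assumed layout)."""
    return "\n".join(
        f"{term}\t{', '.join(projects)}" for term, projects in index.items()
    )


# The Rivers example: used in Project1 and Project2, only defined in Project3.
in_use = {"Project1": {"Rivers"}, "Project2": {"Rivers"}, "Project3": set()}
in_use_index = term_projects(in_use)
report = to_tab_delimited(cross_terms(in_use_index))
```

Passing the defined terms instead of the terms in use would give the definition-based versions of the same reports.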

								