

           Harmonised Access to Music and Music Information in Libraries
                Libraries Project: PROLIB/HARMONICA 10453
                    Commission of the European Communities
                           LIBRARIES PROGRAMME

            Remote Access and Transfer of Audio Recordings

                          Deliverable Number: D3.6.3

Version:                 1.0
Date:                    30. 11. 1999
Authors:                 Werner A. Deutsch, Siegbert Herla, Werner Kriechbaum
Confidentiality:         Public
Status:                  Final

This document consists of <35> pages plus this cover
                                                             PAGE      DOCUMENT                                VERSION
                                                               1       D3.6.3 Access & Transfer                1.0


1.      INTRODUCTION
     1.1 Purpose and Scope
     1.2 Applicability
     1.3 Acronyms and Abbreviations

2.      ANALOGUE AUDIO
     2.1 Analogue Tape-Formats
     2.2 Digital Audio Formats
     2.3 DVD - A Breakthrough in Digital Audio?

3.      DIGITISATION
     3.1 General
     3.2 Digital Audio Workstations (DAW)
        3.2.1     Analogue to Digital Conversion – Dynamic Range
        3.2.2     Sample Rates and the 24 – 20 –16 bit Story
        3.2.3     Linear PCM Audio 16-bit is not obsolete!
     3.3 New Digital Audio Formats
        3.3.1     Archive File Formats
        3.3.2     Broadcast WAVE Format (BWF)
        3.3.3     ‘Unique’ Source Identifier (USID)
     3.4 BWF - A Music Library Audio Format?
     3.5 De-Facto/Industry Digital Audio Compression Standards
     3.6 Digitisation Procedure Quality Control
     3.7 Signal Enhancement and Signal Restoration

4.      AUDIO SEGMENTATION AND CONTENT DESCRIPTION
     4.1 Query by Audio Content (QBAC)
        4.1.1     Sound File Segmentation
        4.1.2     Links Referring to Segments
     4.2 Content Description
     4.3 Visualisation of Music Signals
     4.4 Future Development of Content Driven Approaches: Audio Description Schemes
        4.4.1     The MPEG-7 Audio Descriptor Scheme (tentative)

5.      A MODULAR ARCHIVE MODEL
     5.1 Acquisition of documents and information
     5.2 Archival Storage
     5.3 Data Management
     5.4 Administration
     5.5 Access

6.      AUDIO-NETWORKING
     6.1 The Role of Libraries and Archives on the Internet
        6.1.1     General
        6.1.2     A Glimpse on Copyrights
     6.2 Library and Archive Services on the Internet
        6.2.1     Connectivity
        6.2.2     Delivery Model
     6.3 MP3 – An Evolving Digital Music Delivery Sector
     6.4 Frequently Used Bit Rates

7.      APPENDIX
     7.1 A Sample List of Audio Players

1.           Introduction

1.1          Purpose and Scope
             The purpose of this document is to provide an overview of, and references
             to, information on the transfer of analogue audio recordings into digital
             formats (digitisation), on local and remote storage of data, and on access
             and retrieval in a library or archive environment. It describes current
             state-of-the-art data acquisition workstations, quality control and archive
             reference models. New developments in network capabilities that provide
             better service to selected network traffic over various technologies (QoS:
             Quality of Service networking) are also discussed. The main topics are:

                  · Analogue Audio: formats still in use

                  · Digital Audio Workstations (DAW): data acquisition, local storage,
                    segmentation, technical metadata generation, access issues

                  · Archive Reference Model: ingest station, archival storage, data
                    management, administration, access

                  · Local area network (LAN) configuration: Gigabit, ATM, QoS, tailored
                    services, special requirements of streaming media

                  · Internet connectivity: FTP, MP3, Real Audio

1.2          Applicability

             The issues described in this document may be applicable to any sound
             storage, archive or collection. They are applicable to organisations with the
             responsibility of providing information on a temporary basis as well as for
             the long term. When taking the rapid pace of technology changes or possi-
             ble changes in a Designated Community into consideration, there is the
             likelihood that facilities thought to be holding information on a temporary
             basis will in fact find that some or a lot of their holdings will need the same
             kind of attention as that given by permanent archives.

             The deliverable is not intended to set a de facto standard for digital sound
             document engineering, but rather to help with digitisation, editing, tagging,
             indexing and storage. It is aimed at library and archive institutions
             that already have, or are building up, the equipment and expertise to digitise
             sound documents in-house. It addresses the more standard sound formats.
             If the plan is to digitise primarily historic, fragile or rare sound
             documents, or materials in non-standard formats and sizes, outsourcing the
             digitisation to specially equipped laboratories or institutions should be
             considered (see D3.6.4 for the pros and cons of in-house and outsourced
             digitisation).

1.3    Acronyms and Abbreviations

       A/D             - Analogue/Digital
       AES             - Audio Engineering Society
       AIC             - Archival Information Collection
       AIP             - Archival Information Package
       AIU             - Archival Information Unit
       ASCII           - American Standard Code for Information Interchange
       BEXT            - Broadcast Extension Chunk (BWF)
       BWF             - Broadcast WAVE file format
       CAD             - Computer-Aided Design
       CAR             - Computer-Aided Radio
       CCSDS           - Consultative Committee for Space Data Systems
       CD-ROM          - Compact Disk - Read Only Memory
       CIP             - Catalog Inter-operability Protocol
       CRC             - Cyclic Redundancy Check
       D/A             - Digital/Analogue
       DAPA            - Digital Audio Production and Archiving (EBU working group)
       dB FS           - Decibel (relative to full scale value)
       dB m            - Decibel (relative to 1 mW)
       dB r            - Decibel (relative to an absolute reference)
       dB SPL          - Decibel (relative to 20 μPa)
       dB u            - Decibel (relative to 0.7746 V) equiv. to dB m at 600 Ω
       dB v            - Decibel (relative to 1 V)
       dB              - Decibel (1/10 Bel)
       DBMS            - Data Base Management System
       DDL             - Data Description Language
       DED             - Data Entity Dictionary
       DFAS            - Distributed Finding Aid Server
       DIP             - Dissemination Information Package
       DLI             - Digital Libraries Initiative
       DR              - Dynamic Range
       DSD             - Direct Stream Digital (1-bit delta sigma technology: SACD)
       DTD             - Document Type Definition
       DVD             - Digital Versatile Disk
       DVD-Audio       - DVD working group specification audio
       EAD             - Encoded Archival Description
       EBCDIC          - Extended Binary Coded Decimal Interchange Code
       EBU             - European Broadcasting Union
       ERL             - Electronic Reference Library
       FITS            - Flexible Image Transport System
       GIF             - Graphics Interchange Format
       HDCD            - High Definition Compatible Digital format (Pacific Microsonics)
       HFMS            - Hierarchical File Management System
       HFS             - Hierarchical File Server
       HTML            - Hypertext Markup Language
       ICS             - Interoperable Catalogue System
       IEEE            - Institute of Electrical and Electronics Engineers
       IMS             - Information Management System
       ISBN            - International Standard Book Number

       ISO             - International Organization for Standardization
       LSB             - Least Significant Bit
       MPEG            - Moving Picture Experts Group
       MPEG 1          - Coding for Moving Pictures and Associated Audio for Digital
                         Storage Media up to 1.5 MBit/s
       MPEG 2          - Generic Coding for Moving Pictures and Associated Audio
                         (multichannel audio)
       MPEG 1 Audio Layer 1 - Audio Coding Scheme (compression ratio: 1:4)
       MPEG 1 Audio Layer 2 - Audio Coding Scheme (compatible to Layer 1)
       MPEG 1 Audio Layer 3 - Audio Coding Scheme (compatible to Layer 1 & 2)
       MP3             - syn. MPEG-1 Audio Layer 3 (compression ratio 1:10...12)
       NARA            - National Archives and Records Administration
       NASA            - National Aeronautics and Space Administration
       NSF             - National Science Foundation
       OAIS            - Open Archival Information System
       OCR             - Optical Character Recognition
       ODL             - Object Description Language
       ODLS            - Oxford Digital Library Services
       OPAC            - On-Line Public Access Catalogue
       PCI             - Periodicals Contents Index
       PDI             - Preservation Description Information
       PDMP            - Project Data Management Plan
       QBAC            - Query By Audio Content
       QBIC            - Query By Image Content
       QoS             - Quality of Service
       RIFF            - Resource Interchange File Format
       RLG             - Research Libraries Group
       SACD            - Super Audio CD (Sony, Philips format, uses DSD)
       SGML            - Standard Generalised Markup Language
       SIP             - Submission Information Package
       Super CD        - DVD-based audio formats, superior to audio CDs.
       TEI             - Text Encoding Initiative
       UML             - Unified Modelling Language
       UNICODE         - Universal Code
       WAV             - Windows Wave File
       WWW             - World-Wide Web

2.     Analogue Audio
       The majority of historic sound documents and many recent digital remakes
       are still based on analogue audio formats. Analogue audio technology has
       matured to a very high level of sound quality, from the early, non-linear
       horn recordings of Thomas Alva Edison to the almost linear transmission
       chain of today. Generations of audio engineers have optimised the linear
       transfer function of tape recorders, amplifiers and disc cutting machines
       in order to produce noiseless and distortion-free recordings. It was left
       to our generation to re-introduce non-linearity by applying perceptual
       coders such as MP3 in broadcasting and network environments. Perceptual
       coders perform lossy coding, which rules out reconstruction of the original
       signal. Nevertheless, MP3 sounds acceptable, works over low-bandwidth
       connections and has already become one of the most frequently used formats
       for audio transmission over the Internet.
       The European Broadcasting Union (EBU) lists considerably fewer analogue
       audio tape formats than digital ones in use in the broadcasting area.
       This (4/1997) list is remarkably conservative, and several additional
       digital formats have been introduced in the meantime (see also Section 3 of
       this document):

2.1    Analogue Tape-Formats
         A01    6.3 mm              analogue audio    Full track
         A02    6.3 mm              analogue audio    2 channel
         A08    12.5 mm             analogue audio    8 channel
         A16    25.4 mm             analogue audio    16 channel
         A32    25.4 mm             analogue audio    32 channel
         AS2    6.3 mm              analogue audio    2 channel stereo
         AT2    6.3 mm              analogue audio    2 channel stereo & TC
         CCA    Compact Cassette    audio

2.2    Digital Audio Formats

           CDA                     Compact Disc Audio
           D24     25.4 mm         digital audio DASH 24 track
           D32     25.4 mm         digital audio PD 32 channel
           D48     25.4 mm         digital audio DASH 48 track
           DA2     DAT format      digital audio 2 channel
           DAT     DAT format      digital audio Stereo
           DD2     6.3 mm          digital audio DASH 2 channel
           DP2     6.3 mm          digital audio PD 2 channel
           3.5"    data diskette - FD5
           5.25"   data diskette - FD8
           8"      data diskette - H8A
           Hi-8    digital audio 8 channel
           MO      disk            600 MBytes capacity
           M12     MO disk         1 200 Mbytes capacity
           M13     MO disk         1 300 Mbytes capacity
           NAB     NAB             audio cartridge -S16
           A-DAT   digital audio   8 channel

2.3   DVD - A Breakthrough in Digital Audio?

      Although listeners have long expected improved sound quality from higher
      sample rates and greater resolution, technical limitations have made it
      impractical to implement a new standard until now. The arrival of DVD, the
      Digital Versatile Disk, has the potential to change this. A DVD offers over
      seven times the storage capacity of a CD. This additional capacity allows
      for an improvement in the resolution of the audio signal as well as in
      sample rates. The DVD specification allows for many different resolution
      levels, and manufacturers of audio equipment and programme material use
      them (originally a group of 10 members held more than 4000 DVD-related
      patents); but the dominant standard for high-quality audio samples at 96
      kHz with a resolution of 24 bits. This translates to 16,777,216 (2^24)
      possible amplitude levels and a theoretical dynamic range of 144 dB.
      Combined with a more than doubled sampling frequency, DVD audio offers over
      500 times the resolution available from CD. For the first time in audio
      engineering, DVD audio provides the technical potential to produce and
      distribute music at a sound quality considerably better than human
      listeners can hear.
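
      The figures quoted for DVD audio can be checked with a few lines of
      arithmetic. The sketch below assumes that "resolution" means the ratio of
      amplitude levels multiplied by the ratio of sample rates; this reading and
      the function names are illustrative assumptions, not part of the DVD
      specification.

```python
# Word lengths and sample rates as quoted in the text.
CD_BITS, CD_RATE_HZ = 16, 44_100
DVD_BITS, DVD_RATE_HZ = 24, 96_000


def amplitude_levels(bits: int) -> int:
    """Number of distinct amplitude levels of a linear PCM word."""
    return 2 ** bits


def resolution_gain() -> float:
    """Assumed reading of 'resolution': level ratio times sample-rate ratio."""
    level_ratio = amplitude_levels(DVD_BITS) / amplitude_levels(CD_BITS)
    rate_ratio = DVD_RATE_HZ / CD_RATE_HZ
    return level_ratio * rate_ratio


print(amplitude_levels(DVD_BITS))  # 16777216
print(round(resolution_gain()))    # 557
```

      Under this reading the gain is 256 (levels) times about 2.18 (sample
      rate), which agrees with the "over 500 times" stated above.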

      To probe further:

      For a survey of historic sound formats and preservation issues see
      HARMONICA deliverable D3.1, “Analogue Documents, Carriers and Formats”.

3.     Digitisation

3.1    General
       Although many digitisation projects are undertaken primarily to improve
       access to sound archives and library collections, preservation is often
       a natural by-product. Digitisation should therefore be performed with a
       "preservation mindset." This mindset implies2:

          · Performing analogue to digital (A/D) conversion and digital to analogue
            (D/A) conversion at the highest sample rate appropriate to the nature
            and the informative content of the originals

          · Performing analogue to digital conversion at an appropriate dynamic
            range and sound quality to avoid re-doing the transfer and re-handling
            the originals in the future (digitise once only)

          · Creating and storing a linear coded master sound file that can be used
            to produce derivative and compressed sound files in order to serve a
            variety of current and future user needs (e.g. data-reduced and
            perceptually coded copies for browsing and Internet access)

          · Using system components that are non-proprietary

          · Using sound file formats, editing systems and data compression
            techniques that conform with industry standards

          · Creating backup copies of all files on a stable medium

          · Creating meaningful metadata for sound files and associated documents
            including cataloguing issues (if appropriate)

          · Monitoring and recopying data if necessary

          · Outlining a migration strategy for transferring data across generations of
            archive and access technology (plan for obsolescence of current hard-
            and software technology)

          · Anticipating and planning for future usage and technological developments

       This document occasionally suggests minimum hardware standards as well as
       high-end professional audio equipment, but libraries and archives should
       not just "do the minimum." Analogue to digital conversion at a higher
       sample rate and resolution than the minimum required is encouraged. One
       plausible argument in favour of adopting a higher technical standard from
       the beginning is the development of costs during a digitisation project:
       the cost ratio of equipment to staff is usually about 1:1 at the start of
       a digitisation project and drops to 1:4 after a four-year period.

 which correspondingly apply for sound documents.

3.2     Digital Audio Workstations (DAW)

3.2.1   Analogue to Digital Conversion – Dynamic Range

        The main function of a digital audio workstation is performing a high quality
        analogue to digital conversion of a continuous audio stream generated from
        an analogue signal source as well as converting the digital audio stream
        back to an analogue signal. The quality of the conversion is determined by
        the resolution available from the analogue to digital converter (A/D) and
        digital to analogue converter (D/A). The resolution of an A/D – D/A con-
        verter system is given by:

        $$\text{resolution} = \frac{\text{signal range (V)}}{2^{n}}, \qquad n = \text{number of bits}$$

        The dynamic range of an A/D - D/A subsystem can conveniently be expressed
        in dB. Relative to a resolution of one quantisation step x (V), an n-bit
        system provides

        $$L_x = 20 \cdot \lg\left(2^{n}\right)\ \text{dB}$$

        i.e. ≈ 96 dB at n=16 bits, ≈ 108 dB at n=18 bits, ≈ 120 dB at n=20 bits and ≈
        144 dB at n=24 bits.
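
        The two formulas above can be evaluated directly. The following is a
        minimal sketch; the function names and the 2 V signal range in the
        example are assumptions chosen for illustration only.

```python
import math


def step_size(signal_range_v: float, bits: int) -> float:
    """Resolution (quantisation step in volts) of an n-bit converter."""
    return signal_range_v / 2 ** bits


def dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range L_x = 20 * lg(2^n) in dB."""
    return 20 * math.log10(2 ** bits)


for n in (16, 18, 20, 24):
    print(n, round(dynamic_range_db(n), 1))
# 16 96.3
# 18 108.4
# 20 120.4
# 24 144.5

# Hypothetical 2 V signal range digitised at 16 bits:
print(step_size(2.0, 16))  # about 30.5 microvolts per step
```

        Each additional bit adds about 6.02 dB, which is why the values rise in
        steps of roughly 12 dB per two bits.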

3.2.2   Sample Rates and the 24 – 20 –16 bit Story

        Sample rates in professional audio and video environments range from 32
        kHz to 96 kHz and 192 kHz (DVD audio), with 44.1 kHz and 48 kHz being used
        most frequently. Other sample rates occasionally used are 44.056 kHz,
        47.952 kHz, 64 kHz, 88.112 kHz, 88.2 kHz, 95.904 kHz and 176.4 kHz.

        It is advisable for sound archives and collections to maintain audio
        workstations that can select sample rates within ±10% of the nominal
        values. At least one unit serving arbitrary sample rates between 5 kHz
        and 48 kHz should be available in order to handle non-standard recordings
        from various sources; otherwise a sample rate converter is needed as a
        separate functional unit. An overview of popular sample rates and sound
        file formats is given in HARMONICA deliverable D3.2.

        High end DAWs use accurate, discrete, multi-bit A/D converters. The A/D
        converters operate at a sample frequency of 192 or 176.4 kHz and employ
        sophisticated digitally subtracted dither to produce both low noise and dis-
        tortion components below -120 dB FS, or less than one part per million.

        The 192/176.4 kHz signal is decimated to 96 or 88.2 kHz, 24 bits using op-
        timised filtering.
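
        The equivalence between -120 dB FS and "one part per million" follows
        from the decibel definition for amplitudes. The sketch below assumes
        amplitude (not power) ratios, i.e. 20 dB per decade; the function names
        are illustrative.

```python
import math


def dbfs_to_ratio(level_db_fs: float) -> float:
    """Convert a level in dB FS to an amplitude ratio relative to full scale."""
    return 10 ** (level_db_fs / 20)


def ratio_to_dbfs(ratio: float) -> float:
    """Inverse conversion: amplitude ratio relative to full scale to dB FS."""
    return 20 * math.log10(ratio)


print(dbfs_to_ratio(-120))  # about 1e-06, i.e. one part per million
```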

        Despite all technological progress, CDs still carry 16-bit audio. Several
        solutions for handling both the 24-bit and the 16-bit domain have been
        proposed. In order to reduce the 44.1 kHz, 24-bit signal to 16 bits while
        retaining many of the benefits of 24-bit audio, soft limiters are applied
        which allow the peak signal level to be raised by up to 6 dB without
        overloading. The peaks are reconstructed on decoding, which increases the
        dynamic range by 6 dB. For undecoded playback the units work as a standard
        limiter. Some DAW units provide a low-level range extension which gradually
        increases the gain on low-level signals (starting at approx. -45 dB FS) by
        4 dB over a 20 dB range.

        One of several possibilities for the final step in the reduction to 16 bits
        is to add high-frequency weighted dither and round the signal to 16-bit
        precision. The dither can be confined to the frequency range of 16 kHz to
        22.05 kHz, which leaves the noise floor flat below 16 kHz and does not
        affect the frequency range that is psychoacoustically relevant for the
        perception of tonal signals.
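
        The principle of "add dither, then round to 16 bits" can be sketched as
        follows. Note that this example swaps in plain triangular (TPDF) dither
        with a flat spectrum, not the high-frequency weighted dither described
        above, and that the function name and the choice of dither amplitude
        are assumptions made for illustration.

```python
import random

SHIFT = 8              # a 24-bit word has 8 more bits than a 16-bit word
LSB_16 = 1 << SHIFT    # one 16-bit step expressed in 24-bit units (256)


def requantise_24_to_16(samples_24, rng=None):
    """Reduce signed 24-bit integer samples to signed 16-bit with TPDF dither."""
    rng = rng or random.Random()
    out = []
    for s in samples_24:
        # Triangular dither: sum of two uniform values, spanning +/- 1 LSB
        # of the 16-bit grid (flat spectrum, NOT high-frequency weighted).
        dither = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
        q = round(s / LSB_16 + dither)
        # Clip to the signed 16-bit range to avoid wrap-around.
        out.append(max(-32768, min(32767, q)))
    return out


print(requantise_24_to_16([0, 1000, -1000, 123456], random.Random(1)))
```

        A production system would shape the dither spectrum with a filter so
        that its energy falls into the 16 kHz to 22.05 kHz band; that step is
        omitted here.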

        Psychoacoustically designed noise shaping filters are controlled by the
        spectral range of the time varying audio signal. Some audio systems intro-
        duce, as part of the final quantisation, a pseudo-random noise hidden code
        as needed into the LSB of the audio data. The hidden code carries the
        decimation filter selection and peak detection and low level range parame-
        ters. The hidden code is completely inaudible and is only inserted 2-5% of
        the time, effectively producing 16-bit undecoded playback resolution. The
        result is an industry standard 44.1 kHz, 16-bit recording which should be
        compatible with all CD replication equipment and consumer CD players.

        Although DAW producers advertise their signal processing as compatible
        with CD standards, the format must be verified carefully against the CD
        Red Book specification, which calls for linear PCM audio at a 16-bit word
        length and 44.1 kHz sample rate. Special care has to be taken if sound
        documents from unknown sources are not accompanied by the appropriate
        side information on their digitisation. Chaining different noise shaping
        or dithering concepts can produce distortions well above the hearing
        threshold, although each of them is inaudible when applied alone.
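
        A first, purely formal part of such a verification can be automated. The
        sketch below uses the `wave` module of the Python standard library to
        check the header parameters of a WAVE file; the function name is
        hypothetical, the two-channel requirement is added here because CD audio
        is stereo, and the check says nothing about the dithering or noise
        shaping history of the signal.

```python
import wave


def is_cd_compatible(path: str) -> bool:
    """True if the WAV file holds 2-channel, 16-bit linear PCM at 44.1 kHz."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getnchannels() == 2
            and wav.getsampwidth() == 2       # bytes per sample, i.e. 16 bit
            and wav.getframerate() == 44_100
            and wav.getcomptype() == "NONE"   # uncompressed linear PCM
        )
```

        Calling `is_cd_compatible("transfer.wav")` on a hypothetical file
        returns `False` for a 48 kHz or 24-bit master, which then needs sample
        rate conversion or word-length reduction before CD production.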

3.2.3   Linear PCM Audio 16-bit is not obsolete!

        Audio technology today has to be seen as being in transition. With both
        DVD-Audio and Super Audio CD on the horizon, libraries and archives will
        face new challenges before classic digital audio formats have been acquired
        sufficiently. A psychological turning point will probably come when the
        public no longer sees the CD as the format with the best available sound
        quality. Just as the CD did relative to the LP and the cassette during its
        rise, the new formats will gain consumer acceptance rapidly. Nevertheless,
        the CD format (linear PCM audio at a 16-bit word length and 44.1 kHz
        sample rate) is not obsolete. It will remain the only viable consumer
        digital delivery format and the mainstream of consumer audio electronics
        for quite a while. The millions of CD recorders estimated to be sold per
        year represent a lot of hardware supporting the survival of the format.
        Different formats (and sizes: 12 cm, 8 cm?) for recordable DVD are still
        under discussion; the prospect that one DVD may not play on two different
        laptops confuses salespeople and users alike. DVD as Digital Video Disk
        will even depend on the manufacturer's choice to make it playable in more
        than one geographical region.

3.3     New Digital Audio Formats

3.3.1   Archive File Formats

        In the early 90s, computer-aided radio (CAR) systems became digital audio
        islands in broadcasting houses. These CAR systems used proprietary file
        systems, and the exchange of music and radio programmes between different
        islands took a lot of processing time just for file conversion. Out of a
        variety of sound file formats, two evolved as de facto standards: AIFF in
        the MAC/UNIX world and RIFF/WAVE in the PC domain. This was the situation
        found by the EBU project group P/DAPA (Digital Audio Production and
        Archiving) when it negotiated with industry in order to propose a common
        file format for linear audio quality serving the AES/EBU hardware
        interface standard. So that descriptive information can be generated and
        processed conveniently, metadata should also be included in the file
        format. The group decided to propose the widespread RIFF/WAVE format as a
        standard. This exploits one major advantage of the WAVE file format: WAVE
        files are native files on all PC platforms worldwide, and every PC is able
        to play and edit them. WAVE files are also used for audio data import and
        export on several other computer platforms. In order to enable
        standardised audio programme exchange the group developed the so-called
        Broadcast Wave Format. The main issue was the agreement on a specially
        designed 'Broadcast Extension Chunk' (BEXT Chunk) for the storage of
        additional metadata and descriptive information in sound files.

3.3.2   Broadcast WAVE Format (BWF)

        As libraries and archives receive music documents from quite different
        sources, and their users are occasionally located in a broadcasting
        environment, many sound files stored in BWF may appear in future.
        Although libraries could consider BWF a useful library standard, it seems
        questionable to convert all digital audio holdings into BWF as long as
        audio file transfer with broadcasters is not needed often. BWF offers a
        comprehensive method for including metadata and links to additional
        descriptive information in sound files, but alternative solutions that
        take advantage of the possibility of defining user chunks in the
        RIFF/WAVE format can be expected to arise. It should be emphasised that
        the sound data of standard linear coded WAVE files remain playable with
        any available wave player whether or not additional chunks are included,
        whereas the content of user chunks needs to be managed by special
        application software components.
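
        The reason why additional chunks do not disturb ordinary players lies in
        the RIFF container structure: every chunk starts with a four-character
        identifier and a 32-bit little-endian length, so software can skip any
        chunk it does not understand. The following sketch lists the top-level
        chunks of a RIFF/WAVE byte string; the function name is hypothetical and
        the code is an illustration, not a complete BWF reader.

```python
import struct


def list_riff_chunks(data: bytes):
    """Return (chunk_id, size) pairs for the top-level chunks of a RIFF/WAVE file."""
    riff, _total_size, form = struct.unpack_from("<4sI4s", data, 0)
    if riff != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    chunks = []
    pos = 12  # first chunk follows the 12-byte RIFF header
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        chunks.append((chunk_id.decode("ascii", "replace"), size))
        # Skip the payload; chunks are padded to an even number of bytes.
        pos += 8 + size + (size & 1)
    return chunks
```

        For a BWF file the result would typically contain the identifiers
        `fmt `, `bext` and `data`; a plain wave player reads `fmt ` and `data`
        and steps over everything else.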

       Fig. 1: the Broadcast WAVE Format (from EBU Technical document 3285).

       The Broadcast Wave Format (BWF) has been defined in EBU standard
       N22-1997. The full specification of BWF, a description of the Broadcast
       Extension Chunk and basic information on the Microsoft RIFF format are
       given in EBU standard document Tech. 3285. BWF incorporates ISO/MPEG-2
       Layer II, which is intended to be used for browse quality in sound
       archives. A description of MPEG support in BWF is given in Supplement 1 of
       Tech. 3285. Further information on BWF as well as the documents mentioned
       can be obtained from the EBU (http://www.ebu.ch/pmc_bwf_ug.html) and from
       Swedish Radio Corporation (http://www.sr.se/rd/bwf/). In addition to the
       BWF specification, the project group (P/DAPA) published the following
       recommendation R85 for programme exchange of audio data files:

              · Sample rate: 48 kHz
              · Resolution: minimum 16 bit / linear
              · Alignment level: according to EBU R68 (headroom of 9 dB)
              · Preemphasis: none
              · Channel formats: mono, 2 channel stereo
              · Signal formats: multi channel >> MPEG, linear PCM
              · BEXT chunk: transfer ahead of the audio data
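
As an illustration, the numeric items of this list can be checked automatically for linear PCM files. The following Python sketch uses the standard `wave` module; the function names `r85_issues` and `make_wav` are hypothetical, and only sample rate, resolution and channel count are tested (alignment level, preemphasis and chunk order cannot be read from the header in this way).

```python
import io
import wave

def r85_issues(wav_file) -> list:
    """List deviations from the sample rate, resolution and channel values above."""
    issues = []
    with wave.open(wav_file, "rb") as w:
        if w.getframerate() != 48000:
            issues.append(f"sample rate {w.getframerate()} Hz, expected 48000 Hz")
        if w.getsampwidth() * 8 < 16:
            issues.append(f"resolution {w.getsampwidth() * 8} bit, expected at least 16 bit")
        if w.getnchannels() not in (1, 2):
            issues.append(f"{w.getnchannels()} channels, expected mono or 2 channel stereo")
    return issues

def make_wav(rate: int, width_bytes: int, channels: int) -> io.BytesIO:
    """Create a short silent PCM file in memory for demonstration purposes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(width_bytes)
        w.setframerate(rate)
        w.writeframes(b"\x00" * (width_bytes * channels * 10))
    buffer.seek(0)
    return buffer

conforming = r85_issues(make_wav(48000, 2, 2))
deviating = r85_issues(make_wav(44100, 1, 2))
```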

       BWF does not support all types of RIFF chunks; nevertheless it is
       compatible with the ISO OSI layer model for information interchange. BWF
       files can be used independently of the transport layer for data exchange
       in real time and file transfer over networks as well as for signal
       storage on data carriers such as disks or tapes. A single BWF file can
       hold about 4 Gbytes of data, corresponding to approximately 6 hours of
       linear stereo sound (16 bit / 48 kHz) or 4 hours (24 bit / 48 kHz). This
       data volume is sufficient for the storage and reproduction of almost all
       commonly used analogue sound carriers.
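
The quoted playing times follow from simple arithmetic, sketched here in Python. The sketch assumes that "4 Gbytes" means 2^32 bytes (the largest value of the 32-bit size field of a RIFF chunk) and ignores the few hundred bytes needed for headers and metadata.

```python
def max_duration_hours(bits: int, channels: int = 2, rate_hz: int = 48000,
                       max_bytes: int = 2 ** 32) -> float:
    """Longest linear PCM recording, in hours, that fits into max_bytes."""
    bytes_per_second = rate_hz * channels * bits // 8
    return max_bytes / bytes_per_second / 3600

hours_16_bit = max_duration_hours(16)   # about 6.2 hours
hours_24_bit = max_duration_hours(24)   # about 4.1 hours
```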

        The following BWF file structure is currently supported:

           The BEXT Chunk contains general metadata such as title, originator,
           archive number etc.

           A Coding History (part of the BEXT chunk) describes the transmission
           chain of the current sound signal, providing information such as the
           sound carrier material, recording and playback equipment,
           analogue-to-digital converter and digital I/O interface card of the PC.

           The Format Chunk is used to specify format information such as linear
           PCM, stereo, sample rate and resolution (16...24 bits).

           The Quality Chunk contains information obtained from the digitisation
           procedure, such as a protocol of defects in the analogue recording and
           transmission chain, tape drop-outs, clicks, thumps, hiss, print-through
           and additional notes (in preparation).

           The Cue Sheet Chunk provides cue points, tags and segment data such as
           offset, start time and duration of a specific content in the file (in
           preparation; see also Audio Segmentation and Content Description).

           The Wave Chunk contains the audio samples of the digitised sound.
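
How the BEXT chunk might be read is sketched below in Python. The field layout (256 bytes description, 32 bytes originator, 32 bytes originator reference, 10 bytes date, 8 bytes time, a 64-bit time reference, a version word, and a fixed part of 602 bytes followed by the coding history) reflects our reading of EBU Tech. 3285 and should be checked against the specification before use. The sample values, including the coding history string, are purely illustrative.

```python
import struct

# Leading fixed-size fields of the bext payload (assumed layout, see above).
BEXT_HEAD = struct.Struct("<256s32s32s10s8sIIH")
BEXT_FIXED_SIZE = 602   # assumed size of the fixed part; coding history follows

def _text(raw: bytes) -> str:
    """Decode a fixed-width ASCII field, dropping everything after the first NUL."""
    return raw.split(b"\x00", 1)[0].decode("ascii", errors="replace")

def parse_bext(payload: bytes) -> dict:
    """Extract the general metadata and the coding history from a bext payload."""
    (description, originator, reference, date, time,
     time_low, time_high, version) = BEXT_HEAD.unpack_from(payload)
    return {
        "description": _text(description),
        "originator": _text(originator),
        "originator_reference": _text(reference),
        "origination_date": _text(date),
        "origination_time": _text(time),
        "time_reference": (time_high << 32) | time_low,  # samples since midnight
        "version": version,
        "coding_history": _text(payload[BEXT_FIXED_SIZE:]),
    }

# Illustrative payload; all values are made up for this example.
payload = (
    BEXT_HEAD.pack(b"Example title", b"Example archive",
                   b"ITRAIDA88396FG347125324098748726",
                   b"1999-11-30", b"12:53:24", 48000, 0, 0)
    + bytes(BEXT_FIXED_SIZE - BEXT_HEAD.size)
    + b"A=PCM,F=48000,W=16,M=stereo\r\n"
)
info = parse_bext(payload)
```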

3.3.3   ‘Unique’ Source Identifier (USID)
         In order to identify BWF source sound files, a unique identifier has
         been proposed which serves as a prime link between sound files and
         associated data in a database system. Applications can use the identifier
         instead of the file name for reference.

           The EBU proposed the following structure:
           The <OriginatorReference> field is a sequence of 32 ASCII
           characters (not a string) provided in the BWF to contain a unique
           identifier of the file. The organisation originating the BWF file is
           responsible for the allocation of the USID.
           Country code (2 characters): based on the ISO 3166 standard.
           Company code (3 characters): based on the EBU Technical
           Information I30-1996.
           Serial number (12 characters, extracted): this should identify the
           machine’s type and serial number.
           OriginationTime (6 characters, HHMMSS): this should be sufficient to
           identify a particular recording in a human-useful form in conjunction
           with other sources of information, formal and informal.
           Random number (9 characters, 0-9): generated locally by the recorder
           using some reasonably random algorithm. This number serves to
           separate files made at the same time, such as stereo channels, or
           tracks within multitrack recordings.

           Example of a USID
           Generated by a Tascam DA88, S/N 396FG347A, operated by
           Radiotelevisione Italiana, at time 12:53:24:

                   USID example: ITRAIDA88396FG347125324098748726
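
The field widths given above (2 + 3 + 12 + 6 + 9 = 32 characters) allow an identifier of this kind to be split mechanically. The following Python sketch is only an illustration of that structure; the names `Usid` and `parse_usid` are hypothetical and the validation is deliberately minimal.

```python
from typing import NamedTuple

class Usid(NamedTuple):
    country: str           # 2 characters, ISO 3166
    organisation: str      # 3 characters, company code
    serial: str            # 12 characters, machine type and serial number
    origination_time: str  # 6 characters, HHMMSS
    random: str            # 9 digits

def parse_usid(usid: str) -> Usid:
    """Split a 32-character identifier into the five fields described above."""
    if len(usid) != 32:
        raise ValueError("a USID consists of exactly 32 characters")
    fields = Usid(usid[0:2], usid[2:5], usid[5:17], usid[17:23], usid[23:32])
    if not (fields.origination_time.isdigit() and fields.random.isdigit()):
        raise ValueError("time and random number must consist of digits")
    return fields

example = parse_usid("ITRAIDA88396FG347125324098748726")
```

For the example above this yields the country code `IT`, the company code `RAI`, the serial field `DA88396FG347`, the time `125324` and the random number `098748726`.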

3.4    BWF - A Music Library Audio Format?

       As has been pointed out, BWF is an advanced real-life example of
       combining sound and (technical) metadata in a comprehensive file format
       for broadcasting storage and retrieval applications as well as for
       programme exchange between different partners. One of the foremost
       advantages of transporting all necessary metadata within the sound files
       themselves is the easy management of sound files and metadata in the LAN
       and across different hardware and software platforms. An archive or
       library starting from scratch can develop a medium-size digitisation
       project through learning by doing, simply by implementing the minimum
       standard that BWF requires. For up to several tens of thousands of sound
       files of a homogeneous collection, general purpose computer systems and
       file archive servers provide convenient file management and browsing
       tools.

       Larger collections will not succeed without additional database
       management and a specified file system structure. Links from (or to) an
       existing catalogue have to be updated at regular intervals, and sound and
       metadata should be “frozen”. In any case, before digitising, libraries
       and archives should think very carefully about implementing appropriate
       file naming conventions for sound files and associated documents along
       content-related classification or indexing schemes in order to support
       effective data retrieval later on.

       The current BWF structure is certainly superior when the amount of
       metadata associated with the sound is small in comparison to the data
       size of the sound itself. It is applicable when searching and browsing
       are primarily done by listening to sound data and not by cruising
       through large volumes of content-related metadata. One very useful
       extension of the BWF concept could be to provide appropriate links to
       recently evolving content description standards, such as MPEG-7
       promises to become.

3.5    De-Facto/Industry Digital Audio Compression Standards
       Although sound archives and music libraries should not even consider
       selecting any perceptually (lossy) coded sound format as an internal
       archive standard, they will certainly have to deal with many sound
       documents encoded in different industry audio compression schemes.
       Archives and libraries should therefore be prepared to read all formats
       in which relevant content will be stored, and they could provide a user
       service to convert nonstandard formats into generally readable ones.
       Today audio formats are dictated by the music content the consumer
       appreciates rather than by technical quality criteria which would
       support technically optimised solutions at reasonable cost. Any list of
       audio coders and links to useful pages on this issue is incomplete at
       the time it is written (for further information see Section 6 of this
       deliverable and, among others,
       http://www.research.att.com/~gjm/imusic/links.html).

3.6   Digitisation Procedure Quality Control

      Real time sound analysis of the analogue and digitized audio stream is per-
      formed during digitisation in order to control the analogue to digital conver-
      sion process. For this reason, digital audio workstations provide several
      useful features such as:
          · automatic detection of start and end position of the audio signal
          · automatic detection of pauses during the recording
          · automatic detection of the noise floor level
          · automatic detection of clicks and impulsive distortions
          · automatic detection of analogue media drop-outs
          · average value of signal to noise ratio of the audio signal
          · average value of frequency bandwidth of the signal
          · average value of stereo correlation
          · average value of level dynamics

      Signal parameters extracted by means of digital signal processing as listed
      above are used to control the sound quality of the digitised waveform. Any
      errors occurring during digitisation of analogue recordings should be
      detected on the fly. Standard quality criteria obtained from long-term
      statistics are matched against the values actually measured. A transfer
      (quality) protocol is usually added to the technical metadata set.
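
A much simplified Python sketch of such a check is given below. It measures block levels, takes the quietest block as an estimate of the noise floor, flags abrupt sample-to-sample jumps as click candidates and compares the noise floor with a criterion. The thresholds, the block size and all function names are hypothetical; real workstations use considerably more elaborate detectors.

```python
import math

def block_levels_db(samples, block=480):
    """RMS level in dB relative to full scale (1.0) for consecutive blocks."""
    levels = []
    for start in range(0, len(samples) - block + 1, block):
        chunk = samples[start:start + block]
        rms = math.sqrt(sum(x * x for x in chunk) / block)
        levels.append(20 * math.log10(max(rms, 1e-9)))
    return levels

def detect_clicks(samples, jump=0.5):
    """Indices at which the sample-to-sample jump exceeds the given threshold."""
    return [i for i in range(1, len(samples))
            if abs(samples[i] - samples[i - 1]) > jump]

def quality_protocol(samples, rate=48000, max_noise_floor_db=-50.0):
    """Match measured values against a (hypothetical) quality criterion."""
    levels = block_levels_db(samples)
    noise_floor = min(levels)
    return {
        "noise_floor_db": round(noise_floor, 1),
        "max_block_level_db": round(max(levels), 1),
        "click_positions_s": [i / rate for i in detect_clicks(samples)],
        "noise_floor_ok": noise_floor <= max_noise_floor_db,
    }

# Synthetic test signal: quiet lead-in, then a 440 Hz tone with one click.
rate = 48000
signal = [0.001 * math.sin(2 * math.pi * 50 * n / rate) for n in range(4800)]
signal += [0.25 * math.sin(2 * math.pi * 440 * n / rate) for n in range(4800)]
signal[7200] = 0.9   # artificial click at 0.15 s

protocol = quality_protocol(signal, rate)
```

The resulting dictionary plays the role of a small transfer protocol that could be stored with the technical metadata.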

3.7   Signal Enhancement and Signal Restoration

      Although Quality Control may report signal degradations which could be
      repaired by means of digital signal enhancement algorithms, a strictly
      linear archive copy should be produced in any case. This advice is
      justified because any signal enhancement under consideration can be
      performed on digital copies of the linear archive document with no loss
      of generality. As digital copies are exact replicas of the original, no
      information loss takes place, whereas signal filtering or any other
      processing carried out at the time of digitisation would inevitably
      introduce nonlinear transfer functions which frequently cannot be
      inverted later on. As a general rule, nonlinear manipulations of the
      audio signal, such as lossy data reduction, compression algorithms,
      filtering and signal restoration, should be carried out at the end of
      the transmission chain only and never on archive copies.

       Some manufacturers of digital audio workstations and audio software
       packages provide typical audio signal processing algorithms, working either
       in the time or frequency domain, such as:

              · DeNoiser for the reduction of broadband noise (hiss)
              · RepairFilter for elimination of quasi-static noise (mains hum, hum
                from dimmers, stereo pilot tones)
              · DeScratcher for elimination of scratches on vinyl and shellac records
              · DeClicker for automatic elimination of clicks
              · DeCrackler for automatic elimination of crackles
              · DeClipper for automatic elimination of digital clipping
              · DropOuter for automatic drop-out restoration
              · VPIs for remastering and sweetening
                (parametric EQ, 1/3 octave EQ, linear phase EQ)

       The list above can be completed by additional sound conditioning functions
       which belong to the standard equipment of a sound recording studio and are
       usually appreciated by the professional sound engineer; these are, among
       others, Compressors, Limiter, Loudness Maximizer, De-esser, Free Shaper
       (redithering / noise shaping), stereo basis manipulator etc. For exact
       metering (level control), tools such as PhaseScopes (stereoscope), FFT
       Analyzers, 1/3 octave Analysers, Real Time Spectrograms, MatrixScopes
       etc. are applied.

4.        Audio Segmentation and Content Description

4.1       Query by Audio Content (QBAC) [3]

          With the increasing acceptance of digital libraries and archives as
          storage systems for music and multimedia data, efficient architectures
          for context-oriented search of non-textual data become a pressing need.
          Whereas both automatic shot detection and query by image content (QBIC)
          have opened inroads to characterising and searching for visual
          information, no equivalent methods exist for streaming audio data. Many
          audio data have a common property which still images and video material
          do not have: they can be expressed in two corresponding forms, either as
          a 'textual' representation (music score, transcript) or as a
          realisation (sound recording).

          Working with streaming media raises new issues related to collecting,
          storing, annotating, indexing, browsing, use of metadata, and retrieval
          interfaces for libraries and audio/video archives. While in some cases
          linking an entire audio file to a score or a text file might be
          sufficient for simply listening to it, finding the sound that
          corresponds to a search result in the score or text file requires a
          fine-grained segmentation of the audio data. Once such a fine-grained
          linkage between textual representation (narrow transcription) and
          acoustic realisation is established, the textual representation can be
          used to facilitate QBAC (Query By Audio Content). A language-aware
          search engine locates the desired elements in the score or text. The
          results of the query are segments of audio material (audio objects),
          identified by fine-grained links between score or text and the sound
          recording.
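
The principle can be sketched in a few lines of Python. The transcript, the file name and the sample positions are invented for illustration, and a simple word comparison stands in for the language-aware search engine; the names `AudioSegment` and `query_by_audio_content` are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AudioSegment:
    sound_file: str
    offset: int     # first sample of the segment
    duration: int   # length of the segment in samples

# Hypothetical textual representation and its fine-grained links to the sound:
# position of a word in the transcript -> segment of the recording.
transcript = ["kyrie", "eleison", "christe", "eleison"]
links = {
    0: AudioSegment("test.wav", 0, 48000),
    1: AudioSegment("test.wav", 48000, 96000),
    2: AudioSegment("test.wav", 144000, 48000),
    3: AudioSegment("test.wav", 192000, 96000),
}

def query_by_audio_content(term: str):
    """Search the text and return the linked audio segments as the result."""
    wanted = term.lower()
    return [links[i] for i, word in enumerate(transcript)
            if word == wanted and i in links]

hits = query_by_audio_content("Eleison")
```

The query is formulated against the text, but what is returned are audio objects that can be played back directly.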

4.1.1     Sound File Segmentation
          Sound segments (audio objects) are addressed sample by sample from
          the beginning of a recording. Usually, the sample number offset and
          the duration of the segment are referenced. Segment identifiers are
          created by automatic segmentation procedures or by programme-supported
          manual editing. Cue-in and cue-out points, edit decision lists and
          play lists can be linked to segment identifiers and collected in sound
          file directory tables. The segment structure should not be limited to
          a single segmentation layer, and relative addressing of segments
          should be supported. In order to facilitate context-related queries,
          overlapping segmentation should be implemented. Sound file directories
          may grow to considerable size (several thousand entries for each file)
          in the course of cumulative segmentation work sessions. Furthermore, as
          one and the same audio signal has to be segmented differently
          according to special user requirements, it seems appropriate to store
          the segmentation metadata in separate files which serve as input to an
          audio content related database.

          [3] AC308 ACTS-DICEMAN: Distributed Internet Content Exchange using
          MPEG-7 and Agent Negotiations.
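
A sound file directory with several layers, relative addressing and overlapping segments might be organised as in the following Python sketch. The class and field names are hypothetical, the sample values are invented, and JSON is used only as one possible format for the separate metadata file.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class Segment:
    segment_id: str
    offset: int                    # in samples; relative to the parent, if any
    duration: int                  # in samples
    layer: str                     # name of the segmentation layer
    parent: Optional[str] = None   # enables relative addressing
    description: str = ""

class SoundFileDirectory:
    """Segment table kept apart from the (unchanged) sound file."""

    def __init__(self, sound_file: str):
        self.sound_file = sound_file
        self.segments = {}

    def add(self, segment: Segment) -> None:
        self.segments[segment.segment_id] = segment

    def absolute_offset(self, segment_id: str) -> int:
        """Resolve relative addressing to a sample number in the file."""
        segment = self.segments[segment_id]
        if segment.parent is None:
            return segment.offset
        return self.absolute_offset(segment.parent) + segment.offset

    def covering(self, sample: int):
        """All segments, on any layer, that contain the given sample."""
        hits = []
        for segment in self.segments.values():
            start = self.absolute_offset(segment.segment_id)
            if start <= sample < start + segment.duration:
                hits.append(segment.segment_id)
        return hits

    def to_json(self) -> str:
        return json.dumps({"sound_file": self.sound_file,
                           "segments": [asdict(s) for s in self.segments.values()]},
                          indent=2)

directory = SoundFileDirectory("test.wav")
directory.add(Segment("movement-1", 1000, 5000, layer="form"))
directory.add(Segment("phrase-1", 200, 800, layer="phrase", parent="movement-1"))
directory.add(Segment("note-1", 1500, 1000, layer="annotation"))

overlapping = directory.covering(1600)
exported = directory.to_json()
```

Because the directory is stored outside the sound file, new layers can be added in later work sessions without touching the archived audio data.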

            Archived sound data usually do not change after digitising. What
            has to be updated at regular intervals are the metadata and
            metadata links, the location of suitable cue-in points, segment
            sequence procedures for rapid browsing, the creation of clips and
            several further functions accessible to archive staff and users.
            Managing the metadata separately from the sound files enables fast
            and easy access to, and virtual (non-destructive) processing of,
            sound [4].

            [Figure: audio streams of sound files 1 to 3, each addressed from
            sample 0; directory entries hold offset / duration / segment ID /
            optional links / content description; links connect segments of
            different sound files.]

            Fig. 2: Segmentation of sound files in multiple layers. Sound segment
            addresses, segment identifiers, optional links and content descriptions
            are stored in a sound file directory which is separated from the sound.

4.1.2       Links Referring to Segments

            Several official, industry and proprietary standards can be used to
            express the links needed to refer to segments. However, since in many
            cases neither the audio recording nor the description or other audio
            segment linked to it can or should be changed, a simple hyperlink
            scheme as in HTML is not sufficient. Instead it is necessary to use
            so-called ‘independent’ hyperlinks, which are external to the files
            they link. In addition, these hyperlinks must be bi-directional
            (description to audio and vice versa) to allow applications such as
            querying the text and playing back the speech, as well as playing the
            audio and switching, e.g., to the display of the score at an arbitrary
            point in time.

    Currently the most advanced linking mechanism available is the HyTime
    ilink [HyTime97]. Similar functionality can be provided by other
    implementations and is likely to be provided by XML linking mechanisms. In
    the following discussion SGML and HyTime are used as an example. An
    independent link consists of three components:

    Anchors, which are regions or points in a text or audio document.
    Locators, which locate or address anchors.
    Links, which link or connect locators.
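
The interplay of the three components can be illustrated with a small Python sketch. It is not an implementation of HyTime; the class names are hypothetical and the addresses are sample values. The essential points are that the link table lives outside the linked files and that a link can be followed from either end.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Locator:
    loc_id: str
    document: str   # file containing the anchor
    address: str    # how the anchor is addressed within that file

@dataclass(frozen=True)
class Link:
    linkends: tuple  # ids of the locators connected by this link

class LinkBase:
    """Independent links: stored apart from the text and audio files."""

    def __init__(self):
        self.locators = {}
        self.links = []

    def add_locator(self, locator: Locator) -> None:
        self.locators[locator.loc_id] = locator

    def add_link(self, *linkends: str) -> None:
        self.links.append(Link(tuple(linkends)))

    def traverse(self, loc_id: str):
        """Follow every link that has loc_id as one end to its other ends."""
        targets = []
        for link in self.links:
            if loc_id in link.linkends:
                targets += [self.locators[end] for end in link.linkends
                            if end != loc_id]
        return targets

base = LinkBase()
base.add_locator(Locator("text", "test.txt", "1 1 2 1 1 1"))
base.add_locator(Locator("audio", "test.wav", "start=588 end=24703"))
base.add_link("text", "audio")

from_text = base.traverse("text")    # text passage -> sound segment
from_audio = base.traverse("audio")  # sound segment -> text passage
```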




         Figure 3: The canonical bi-directional independent link from HyTime.

    The ISO/IEC HyTime standard offers locators to uniquely address elements
    (sub-trees) within SGML documents. This mechanism assigns a list of
    integer values to each node of an SGML tree. The list of integer values is
    the 'road map' to get from the root of the SGML document (tree locator '1')
    to the specific SGML element, using several 'traffic rules' to generate
    the tree locator integer list. The rules are:

            1. The 'journey' starts at the root element and adds one integer for
               each horizontal level below on the way down the SGML tree.
            2. The root element of the SGML tree has the tree locator '1'.
            3. Each integer stands for one horizontal level of the SGML tree.
            4. Each integer value is generated by counting the number of
               nodes from left to right. Only the children of the node above are
               taken into account.
            5. The leftmost node (leftmost child) of a node above is assigned
               the integer value '1'.

    [HyTime97] ISO/IEC JTC 1/SC18 WG8 N1920rev, “Information-Processing -
    Hypermedia/Time-based Structuring Language (HyTime) 2nd edition,” ed.
    Charles F. Goldfarb, Steven R. Newcomb, W. Eliot Kimber, Peter J. Newcomb.
    May 1997.

       Starting at the root element (tree locator '1') and taking all the above
       rules for generating the tree locator into account, the nodes (elements)
       of the following abstract tree are addressed by the tree locators listed
       in Table 1.

                                    Element         Tree locator
                                    A               1
                                    B               11
                                    C               12
                                    D               111
                                    E               112
                                    F               121
                                    G               122

           Table 1: Tree locators for the abstract SGML tree.
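
The rules can be turned into a short recursive procedure, sketched here in Python. The shape of the abstract tree (A with children B and C, B with children D and E, C with children F and G) is inferred from the locators in Table 1 and is therefore an assumption; the function name `tree_locators` is hypothetical.

```python
def tree_locators(tree: dict, root: str) -> dict:
    """Map every node to its tree locator, a tuple of integers.

    `tree` maps a node to the ordered list of its children (left to right).
    """
    result = {}

    def visit(node, path):
        result[node] = path
        # Children are counted from left to right, starting with 1.
        for position, child in enumerate(tree.get(node, []), start=1):
            visit(child, path + (position,))

    visit(root, (1,))   # the root element has the tree locator '1'
    return result

abstract_tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}
locators = tree_locators(abstract_tree, "A")

# Written without separators, as in Table 1.
as_in_table = {node: "".join(str(i) for i in path)
               for node, path in locators.items()}
```

Keeping the locator as a list of integers rather than a digit string avoids ambiguity as soon as a node has more than nine children.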

           In SGML, the DTD fragments needed for a link between audio and text
           would be expressed in a form similar to:

                   <!ELEMENT      audioloc    - -         (#PCDATA) >
                   <!ATTLIST      audioloc
                     id           ID          #REQUIRED
                     HyTime       NAME        #FIXED "queryloc" >
                   <!ELEMENT      textloc     - O         (#PCDATA) >
                   <!ATTLIST      textloc
                     HyTime       NAME        treeloc
                     id           ID          #REQUIRED
                     locsrc       CDATA       #IMPLIED >
                   <!ATTLIST      audiolink
                     HyTime       NAME        ilink
                     id           ID          #IMPLIED
                     linkends     IDREFS      #REQUIRED >

            Using the above definitions, the link itself would be expressed in a
            form similar to:

               <audiolink linkends="text audio">
               <audioloc id="audio">file=test.wav start=588 end=24703</audioloc>
               <textloc id="text" locsrc="test.txt">1 1 2 1 1 1</textloc>

4.2         Content Description
            In the previous section it has been shown that content description of
            sound is closely related to narrow segmentation of sound files.
            Whereas video objects [6] have already been created successfully with
            semiautomatic analysis tools, nothing comparable exists for audio.
            The limitations experienced in the content description of streaming
            video apply equally, and even to a larger extent, to audio data.
            Further development is needed for:

                · segmentation tools requiring fully automatic, real-time analysis
                · applications which may allow some convenient level of user
                  guidance
                · means to exploit the complementary skills of user and machine in
                  solving complex analysis and interpretation problems
                · tools that assist in the integration of the different solution
                  fragments
                · means to establish consistency of the final description,
                  especially in correspondence to video sequences in case of
                  multimedia
                · applications for the integration of audio content description in
                  an existing catalogue framework

            Some open issues have been seriously addressed by the CUIDAD [7]
            group, which contributes to MPEG-7 and builds a bridge between
            low-level audio descriptors based on the signal (amplitude spectra and
            parameters extracted from the acoustic waveform), music descriptors
            (music scores on a symbolic level) and semantic descriptors on the
            perceptual level.

            [6] ACTS-MoMuSys (Mobile Multimedia Systems)
            [7] CUIDAD is a European Working Group coordinated by Ircam - Centre
            Georges Pompidou in order to gather all institutions, industrial
            partners and users interested in the . ESPRIT project

       Fig. 4: Functional diagram of audio content processing (from CUIDAD).

       A working model of a description scheme for sound clips and sound effects
       has been developed by the American company Musclefish.

       The most comprehensive attempt to standardise – among others – the
       description of audio content is the MPEG-7 effort of the ISO/IEC
       JTC1/SC29/WG11 working group. MPEG-7 is still an ongoing process and
       the first version of the standard is expected late in 2001. The following
       list gives an overview of the most significant milestones met and the
       timeline to the completion of the standard:

            · October 16, 1998: Issued formal Call For Proposals
            · February 1, 1999: Deadline for submission of MPEG-7 proposals
            · February 15-19, 1999: MPEG-7 Evaluation Ad Hoc group meeting
            · March 15-19, 1999: Developed the first Experimental Model (XM
              1.0) and Core Experiments
            · July 6-10, 1999: The first Audio Core Experiments, in Speech
              Recognition, Sound Effects, Instrument Timbre, and Melody, were
              initiated
            · December, 1999: MPEG-7 Working Draft established
            · October, 2000: MPEG-7 Committee Draft
            · February, 2001: MPEG-7 Final Committee Draft
            · July, 2001: MPEG-7 Draft International Standard
            · November, 2001: MPEG-7 International Standard

       Additional information about MPEG in general and the upcoming
       MPEG-7 standard can be found at http://drogo.cselt.stet.it/mpeg/
       and http://www.darmstadt.gmd.de/mobile/MPEG7/

      Although certain details proposed by MPEG-7 may not yet be applicable for
      libraries and archives without appropriate applications ready for use,
      general guidelines for audio document classification can already be
      derived. Among them are:

                  · Speech, Speech Recognition Systems
                  · Singing voice
                  · Timbre or Instrument
                  · Instrument Description
                  · Melody, Melody Description
                  · Pitch or note (spectrum description)
                  · Tempo or rhythm (temporal description)
                  · Surround sound
                  · Sound Effect Classification

      Worth mentioning are classification and content description systems origi-
      nating from musicology as well as from the music industry, the latter being
      progressively present in the commercial download business on the Internet.

4.3   Visualisation of Music Signals
      Visualisation of audio signals by so-called spectrograms is employed
      whenever music signals cannot be represented in 'textual' formats such as
      music score or transcription. This happens frequently in ethnomusicology
      or when acoustic and perceptual differences between individual
      interpretations of the same piece have to be documented. Visualisation of
      music is performed in real time so that the visual and the audio
      representation of the music can be observed synchronously. Spectrograms
      can be read in a similar way to piano rolls, with the running time axis on
      the abscissa and the frequency scale on the ordinate. The strength (level)
      of spectral components is coded in an appropriate colour scale.
      Spectrogram icons (sound thumbnails) are used in order to provide fast
      access to a large number of sound files and segments stored in a database.
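
How such a time/frequency representation is obtained can be sketched in plain Python with a short-time Fourier analysis. The frame length, hop size, sampling rate and test signal are arbitrary choices for this illustration, and a direct (slow) DFT is used instead of an FFT to keep the sketch self-contained.

```python
import cmath
import math

def spectrogram(samples, frame=64, hop=32):
    """Magnitude spectra of overlapping, Hann-windowed frames.

    Returns one row per frame (time axis); each row holds frame // 2 + 1
    magnitudes (frequency axis, from 0 Hz up to half the sampling rate).
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame - 1))
              for n in range(frame)]
    rows = []
    for start in range(0, len(samples) - frame + 1, hop):
        segment = [samples[start + n] * window[n] for n in range(frame)]
        rows.append([
            abs(sum(segment[n] * cmath.exp(-2j * math.pi * k * n / frame)
                    for n in range(frame)))
            for k in range(frame // 2 + 1)
        ])
    return rows

# Test signal at 8 kHz sampling rate: 1000 Hz followed by 2000 Hz.
rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / rate) for n in range(256)]
tone += [math.sin(2 * math.pi * 2000 * n / rate) for n in range(256, 512)]

rows = spectrogram(tone)
# Strongest frequency bin per frame; the bin spacing is rate / frame = 125 Hz.
peak_bins = [max(range(len(row)), key=row.__getitem__) for row in rows]
```

Mapping the magnitudes of each row to a colour scale and drawing the rows side by side yields the picture described above; for the test signal the strongest component moves from bin 8 (1000 Hz) to bin 16 (2000 Hz).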

          Fig. 5: Structuring of audio by visualisation (narrow band spectrogram) of
          an extract (11 min) from Bruckner's 8th Symphony.

4.4       Future Development of Content Driven Approaches: Audio Description
          Schemes [8]

          In so far as MPEG-7 has already created a structure for “Obvious Audio
          Ds” (descriptors), four types of audio Ds have been considered:

                  1.) media based Ds,
                  2.) non-perceptual low-level audio characteristics,
                  3.) perceptual low-level audio characteristics and
                  4.) high-level audio characteristics.

          In order best to meet the requirements of key applications for the
          description of audio, the group evidently abandoned the demand for
          generality and concentrated on the following areas and audio content
          sets:

                  1.) pure music,
                  2.) pure speech,
                  3.) pure sound effects and
                  4.) arbitrary soundtrack applications.

          [8] This section refers to: Obvious Audio Descriptors / Description
          Schemes; Source: MPEG-7 Audio Reflector; Nov. 1999.

        For each of the four application areas the following has been provided:

            · a typical application scenario demonstrating its relevance,
            · a statement of the effort required to implement it,
            · a statement whether there is a chance of determining the values
              under consideration automatically.

        Currently MPEG-7 is considering the following Audio Content Set:

           Radio          A1      Radio news broadcast
           Music          A2      "Two Ton Shoe" rock album
                          A3      Bruckner's Te Deum, and Mozart's Requiem
                          A4      Original composition, a cappella, voice only
           Audio          A5      Short sequences of solo instrument and other
                          A6      Pop song based on an A-A-C motif

        Following this pragmatic point of view, a step-by-step creation of useful
        tools for automatic segmentation and tagging of sound is expected for a
        large number of audio applications. The descriptors currently considered
        include, among many other possible ones, the following:

4.4.1   The MPEG-7 Audio Descriptor Scheme (tentative)

        Descriptors for pure music:
                  · Archiving music
                  · Descriptors for musical genres
                  · Descriptors for a composer
                  · Descriptors for an artist
                  · Descriptors for an artist group (e.g. band, ensemble, orchestra)
                  · Descriptors for single pieces of music
                  · Searching music collections
                  · Structuring music, descriptors to capture musical structures
                  · Filtering music broadcasts
                  · Music education / teaching
                  · Music editing
                  · Manipulating musical content
                  · Music production

        Descriptors for pure speech data:
                  · Searching speech collections
                  · Structuring speech collections
                  · Rhetoric education

        Descriptors for sound effects:
                  · Searching sound effects collections
                  · Movie synchronisation

        Descriptors for arbitrary soundtracks:
                  · Searching radio programme collections
                  · Searching TV programme or movie collections
                  · Filtering radio programmes
                  · Filtering TV programmes
                  · Film production / editing
                  · Film education

5.        A Modular Archive Model

          Long-term preservation of digital data involves issues of physical
          storage, software and data standards as well as migration plans and
          disaster management. In addition, the digital archive involves the
          technology required for global, multimedia, object-oriented databases
          with emphasis on adding value along dimensions such as real-time
          operation, fault tolerance, security, and Quality of Service (QoS). The
          technologies and standards to be applied for the archiving,
          preservation and retrieval of digital documents are currently a major
          concern in the archives community. The lifetime of digital storage
          media and systems is extremely short in comparison to the analogue
          sound and multimedia carriers that libraries and archives have been
          used to for many years. The question is whether the computer industry
          will actually provide storage solutions tailored to small and medium
          volumes for individual libraries in the near future. As an
          alternative, backup and archive systems located at large computer
          centers and data farms frequently provide secure digital storage
          containers and rental storage which could also be used for small and
          medium volumes of digital data. Librarians and archivists have to force
          themselves to overcome the psychological barrier of no longer seeing a
          digital document, no longer handling it physically, not even having it
          in-house, and yet accept that it is still available precisely because
          it is stored in a secure digital container.

          In any case, whether the digital storage system is located and managed
          in-house by the institution or remotely by a professional computer
          center, the following key functions have to be provided by the archive
          system (for a Reference Model for an Open Archival Information System
          see OAIS, footnote 9):

5.1       Acquisition of documents and information
          An entity which provides the services and functions to accept new
          documents and adjunct information from external or internal
          acquisition units under Administration control and to prepare the
          contents for storage and management within the archive. Acquisition
          functions include receiving sound documents and adjunct information,
          performing quality assurance on the document package, generating an
          Archival Information Package (AIP) which complies with the archive’s
          data formatting and documentation standards, extracting Descriptive
          Information from the AIPs for inclusion in the archive database, and
          coordinating updates to Archival Storage and Data Management. Different
          collections may have different description schemes that fit into the
          archive’s data formatting and documentation standards.
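
          The acquisition steps named above (receive, quality-check, generate an
          AIP, extract Descriptive Information) can be illustrated with a minimal
          Python sketch. It is not taken from the OAIS model or from this
          project: the class layout, the field names and the list of required
          fields are assumptions chosen only for illustration.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ArchivalInformationPackage:
    """Hypothetical, much simplified AIP: content plus descriptive data."""
    identifier: str
    audio: bytes
    descriptive_info: dict
    sha256: str = field(init=False)

    def __post_init__(self):
        # Fixity value computed once, at acquisition time.
        self.sha256 = hashlib.sha256(self.audio).hexdigest()


# Assumed stand-in for the archive's documentation standard.
REQUIRED_FIELDS = ("title", "collection")


def acquire(identifier, audio, adjunct_info):
    """Quality-check a submission and wrap it into an AIP."""
    if not audio:
        raise ValueError("empty sound document")
    missing = [name for name in REQUIRED_FIELDS if name not in adjunct_info]
    if missing:
        raise ValueError(f"missing adjunct information: {missing}")
    return ArchivalInformationPackage(identifier, audio, dict(adjunct_info))


def extract_descriptive_info(aip):
    """Return the record that would be loaded into the archive database."""
    return {"identifier": aip.identifier, "sha256": aip.sha256,
            **aip.descriptive_info}
```

          A submission that lacks a required field is rejected before an AIP is
          created, which corresponds to the quality assurance step; different
          collections could use different sets of required fields.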

       9 Reference Model for an Open Archival Information System (OAIS),
         Consultative Committee for Space Data Systems, CCSDS 650.0-R-1,
         Red Book, May 1999.

 5.2          Archival Storage
              An entity which provides the services and functions for the storage,
              maintenance and retrieval of AIPs. Archival Storage functions include
              receiving AIPs from the acquisition unit and adding them to permanent
              storage, managing the storage hierarchy, refreshing the media on which
              archive holdings are stored, performing routine and special error checking,
              providing disaster recovery capabilities, and providing AIPs to Access to
              fulfill user requests.
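
              Routine error checking and media refreshing are commonly based on
              stored checksums. The following Python sketch shows the idea under
              strong simplifications: a dictionary stands in for a storage
              medium, and the function names are hypothetical rather than part of
              any archive product.

```python
import hashlib


def checksum(data: bytes) -> str:
    """SHA-256 digest used as the fixity value of one holding."""
    return hashlib.sha256(data).hexdigest()


def verify_holdings(storage: dict, manifest: dict) -> list:
    """Return identifiers that are missing or no longer match the manifest."""
    damaged = []
    for identifier, expected in manifest.items():
        data = storage.get(identifier)
        if data is None or checksum(data) != expected:
            damaged.append(identifier)
    return damaged


def refresh(storage: dict, manifest: dict) -> dict:
    """Copy all holdings to a new medium, verifying each item on the way."""
    new_medium = {}
    for identifier, data in storage.items():
        if checksum(data) != manifest[identifier]:
            raise IOError(f"{identifier} failed verification; restore from backup")
        new_medium[identifier] = data
    return new_medium
```

              Running verify_holdings at regular intervals corresponds to routine
              error checking; a non-empty result would trigger recovery from the
              backup media.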

 5.3          Data Management
              An entity which provides the services and functions for populating,
              maintaining, and accessing both Descriptive Information - which identifies
              and documents archive holdings - and administrative data used to manage
              the archive. Data Management functions include administering the archive
              database functions (maintaining schema and view definitions, and
              referential integrity), performing database updates (loading new descriptive
              information or archive administrative data), performing queries on the data
              management data to generate result and query sets, and producing reports
              from these sets.
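
              As an illustration of these functions, the following sketch uses
              Python's built-in sqlite3 module with an in-memory database. The
              two-table schema and the sample records are invented for the
              example and do not describe any particular archive database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce referential integrity

# Schema definition (database administration).
conn.executescript("""
    CREATE TABLE collection (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE recording (
        id            INTEGER PRIMARY KEY,
        collection_id INTEGER NOT NULL REFERENCES collection(id),
        title         TEXT NOT NULL,
        composer      TEXT
    );
""")

# Database update: load new descriptive information.
conn.execute("INSERT INTO collection (id, name) VALUES (1, 'Demo collection')")
conn.executemany(
    "INSERT INTO recording (collection_id, title, composer) VALUES (?, ?, ?)",
    [(1, "Sonata in A", "Composer X"), (1, "Field recording 12", None)],
)

# Query producing a result set, and a one-line report derived from it.
rows = conn.execute(
    "SELECT title FROM recording WHERE collection_id = ? ORDER BY title", (1,)
).fetchall()
report = f"{len(rows)} recordings in collection 1"
```

              With foreign keys switched on, an attempt to insert a recording
              that refers to a non-existent collection is refused by the
              database, which is one simple form of referential integrity.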

       [Fig. 6 is a block diagram. Its blocks and their labels, from left to
       right, are:]

       Music and Music Information Producer

       ACQUISITION:      Receiving Information, Quality Control, Descriptive
                         Information, Digital Audio Workstations, Coordination
                         of Updates
       DATA MANAGEMENT:  Database Updates, Database Administration, Query
                         Management & Reports, Cataloguing, Database
       ARCHIVAL STORAGE: Receive Data, Provide Data, Migrate Media, Management
                         of Storage, Disaster Recovery, Storage Media, Backup
                         Media
       ACCESS:           Dissemination, Delivery, Data Transfer Control
                         & Reports, User Interface

       User Community

       Bottom block:     Acquisition and Access Agreements, IPR Management,
                         Archive Standards, System Configuration, Physical
                         Access Control, Archival Information Updates, User
                         Support, User Monitoring.

       Fig. 6: Outline of a functional diagram of a modular archive system.

5.4   Administration
      An entity which manages the overall operation of the archive system.
      Administration functions include soliciting and negotiating acquisition and
      access agreements with document providers and IPR owners, auditing
      acquired material in order to ensure that it meets archive standards,
      maintaining configuration management of system hardware and software,
      evaluating the contents of the archive and periodically requesting archival
      information updates, providing system engineering functions to monitor and
      improve archive operations, developing and maintaining archive standards
      and policies, providing user support, monitoring changes in the Designated
      User Communities, interacting with library and archive Management, and
      activating stored requests.

5.5   Access
      This entity supports users in determining the existence, description,
      location and availability of information stored in the archive and allows
      users to request and receive documents and information products. Access
      functions include communicating with users in order to receive requests,
      applying controls to limit access to specially protected data and information,
      coordinating the execution of requests to successful completion, generating
      responses (Dissemination Information Packages, result sets, reports) and
      delivering the responses to users.

6.      Audio-Networking

6.1     The Role of Libraries and Archives on the Internet

6.1.1   General

        The Internet has developed into a mass communications system. By the
        end of 1998, the Internet had more than 100 million users located through-
        out the world, and that number is growing rapidly. More than 100 countries
        are linked into exchanges of data, news and opinions and more than 1 mil-
        lion servers are sending information within the net. Obtaining access to in-
        formation from the net is open to all users who have a personal computer
        or other access device, the appropriate software and the ability to gain ac-
        cess to the system (referred to as “obtaining connectivity”), usually provided
        by an Internet Service Provider (ISP).

        Increased availability of bandwidth, faster modems, and improved and
        scalable audio coding schemes are supporting the development of library
        music information services. These include the implementation of
        technology that allows the digital conversion (digitisation) and storage
        of large amounts of data as described in previous sections. Future
        developments might not include a significant change of network structure
        but rather a substantial increase in the capability of access devices to
        download large quantities of data, and the development of higher-bandwidth
        distribution systems for real-time access and streaming media. The latest
        research on building, understanding and using digital archives and
        digital libraries points to the development of sophisticated routers that
        transmit information, and to the advent of user-friendly software
        allowing access to information stored on any connected computer (search
        machines, intelligent assets, etc.) in the near future.

        Libraries and archives have to decide - depending on their regional policies
        - whether or not they will use the capability to implement
               online services for streaming media,
               services for file transfer or
               services for traditional catalogue access only.
        What are the benefits for the library user and what will the consequences
        be for the library organisation and its management? Which are the neces-
        sary tools and protocols on the Internet and what are the provisions to pro-
        tect the Intellectual Property Rights (IPRs)?

6.1.2   A Glimpse at Copyright

        As libraries and archives can play many different roles for various types
        of transmissions over the Internet, it is important to examine these
        roles separately in determining which activities may give rise to which
        liability. The question whether an Internet transmission is to be
        classified as

              a communication to the public,
              a communication by telecommunication, or
              a process involving reproduction of data, which takes place between
               two identifiable partners,

        might develop into a central legal issue. Several legislative bodies in
        different countries have raised a number of issues, including: Is there a
        communication by telecommunication to the public as soon as a musical
        sound document or music information is electronically transmitted, made
        available, uploaded, downloaded or browsed? Is a communication over a
        network to which access is restricted a communication to the public? IPR
        organisations will certainly argue that a communication to the public
        already occurs as soon as the end user can access a library document from
        a computer connected to a network.

        The role of a library or an archive as content provider is given as soon as

              content is assembled and placed as a collection of files
               on a server to allow the files to be accessed.

        Usually, the library or archive organisation which has overall
        responsibility for the content of the site (the site owner) also operates
        and maintains the server on which the site is located. This model is
        normally followed by large and medium-sized libraries and archives (for a
        detailed description of the role of content providers on the Internet,
        associated legal issues and business arrangements see e.g. Public
        Performance of Musical Works, published by the Copyright Board Canada).
        For small volumes of digital data it is recommended to rent a server as
        well as storage capacity in a secure digital storage container provided
        by large computer centers.

6.2     Library and Archive Services on the Internet

6.2.1   Connectivity

        Digital library services use the Internet as a network of local and remote
        computers and computer networks designed to receive and forward bytes
        of data grouped into packets between end nodes (the source and destina-
        tion computers). The basic communication service of the Internet consists
        of two components:

              the Internet addressing structure and
              the Internet delivery model.

        The addressing of computers in the Internet is managed by unique Internet
        Protocol addresses (IP addresses). An IP address is written as a
        combination of four integer numbers, each in the range 0 to 255,
        separated by dots.

        The unique IP address allows World Wide Web traffic to be controlled.
        Host names can be allocated according to their geographical location and
        access patterns, provided that geographical as well as temporal data are
        obtained from each connection. To make addressing easier, more
        user-friendly domain names are generally used instead of IP addresses.
        These names are translated automatically back to their associated IP
        addresses by means of the Domain Name System (DNS), operated by all IAPs
        for use by their subscribers. Changes of domain names and IP addresses
        have to be carried out in co-ordination with the IAP. The domain names
        together constitute the Internet’s addressing structure. Once the
        connection is established, the service can be initiated, provided the
        appropriate software is running on the participating computers.
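
        The address format can be made concrete with a short Python sketch that
        checks a dotted address field by field. The function names are
        hypothetical, and 192.0.2.1 is an address reserved for documentation, not
        one used by this project.

```python
def parse_ipv4(text: str) -> tuple:
    """Split a dotted-decimal address into four integers, each 0..255."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields, got {len(parts)}")
    octets = []
    for part in parts:
        if not (part.isascii() and part.isdigit()):
            raise ValueError(f"not a decimal number: {part!r}")
        value = int(part)
        if value > 255:
            raise ValueError(f"field out of range 0..255: {value}")
        octets.append(value)
    return tuple(octets)


def to_integer(octets: tuple) -> int:
    """Pack the four 8-bit fields into a single 32-bit number."""
    a, b, c, d = octets
    return (a << 24) | (b << 16) | (c << 8) | d


example = parse_ipv4("192.0.2.1")
```

        The translation of a domain name into such an address is performed by
        the DNS; in Python it is available through the standard library call
        socket.gethostbyname, which needs a working network connection and is
        therefore not shown here.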

        Fig. 7: installing an IP address on a PC under MS Windows. This is not a
        valid IP address!

6.2.2   Delivery Model

        The delivery model contains several delivery modes for the transmission of
        music and music information with the aim of sending and requesting infor-
        mation over the Internet. Originally, the network providers tried their best to
        deliver data but would not provide commitments as to the quality of the
        service (QoS, e.g., commitments as to bandwidth or reliability). Generally,
        the user requests the document required in a unicast pull mode during a
        connection. Although it is now possible to request a minimum of required
        bandwidth and a maximum delay (e.g., that packets will be transferred
        within a specified period of time, see Harmonica deliverable 3.6.1 & 3.6.2,
        RSVP Reservation Protocol), participants in the professional music busi-
        ness still consider the Internet as inadequate in performance. Moving from
        streaming analogue-based audio transmission to full audio bandwidth
        packetized digital delivery systems, it is evident that in spite of significant
        progress in network technology there are many difficulties yet to overcome.
        Libraries and archives have to decide which of the possibilities they
        accept as applicable for their service.

        Alternative delivery modes of providing music and music information over
        the Internet involve streaming media as well as multicasting. Real Audio,
           currently the market leader for delivering real-time audio and video
           (including music, see footnote 10), provides servers capable of
           broadcasting multiple streams; nevertheless, the delivery system for
           multicast is almost the same as for unicast because each recipient
           still receives its individual copy. Further key players in the
           real-time and music download business are, among many others, Liquid
           Audio, WinAmp and the MP3 community. The delivery mode selected for a
           music information service depends on its nature: whether a direct
           relationship between the library or archive and the end user can be
           established or not and, in particular, whether the end user or the
           library organisation is paying for the usage of the information or
           the work provided.
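
           Because each recipient receives an individual copy, the load on the
           server connection grows linearly with the number of listeners. The
           following Python sketch works this out; the stream and link rates in
           the usage lines are example values (a T1 line and two compressed
           audio rates), not measurements.

```python
def unicast_server_load_bit_s(listeners: int, stream_bit_s: float) -> float:
    """Total outgoing rate when every listener gets an individual copy."""
    return listeners * stream_bit_s


def max_listeners(uplink_bit_s: float, stream_bit_s: float) -> int:
    """Largest number of simultaneous streams an uplink can carry (ideal case)."""
    return int(uplink_bit_s // stream_bit_s)


t1_bit_s = 1_500_000
listeners_at_128k = max_listeners(t1_bit_s, 128_000)
listeners_at_20k = max_listeners(t1_bit_s, 20_000)
```

           Protocol overhead and other traffic are ignored, so real figures
           would be lower.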

6.3        MP3 – An Evolving Digital Music Delivery Sector

           WinAmp, Sonique, MusicMatch, RealJukebox, MP3.com, Listen.com,
           Liquid Audio, Lycos MP3, 2look4, FileQuest.com, Audiofind, MP3friend
           and Emusic are just a small selection of the sites, search machines
           and tools for playing, encoding, searching, browsing and downloading
           a huge amount of music and music information over the Internet.

           A couple of years ago, MP3 was just an audio compression format,
           originally developed by Fraunhofer IIS-A mainly for broadcasters'
           programme transfer. Today, MPEG Layer-3 is, like MPEG-2 AAC (Advanced
           Audio Coding), one of the most advanced audio coding schemes. In the
           meantime, MP3 has become a growing technology standard for storing
           and distributing audio on a much broader basis, and is revolutionizing
           the way audio is transmitted over the Internet. Several manufacturers
           of audio processing tools and signal processing workstations support
           MP3 by implementing user-friendly export options. These feature
           high-quality audio transmission and support constant bit rate (CBR)
           as well as variable bit rate (VBR) encoding at bit rates of up to
           320 kbps.
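
           What a given bit rate means in practice can be estimated with simple
           arithmetic. The Python sketch below compares the rate of CD audio
           with a constant bit rate stream; the 128 kbit/s setting and the
           four-minute track are example values, and file headers and tags are
           ignored.

```python
def pcm_bit_rate_kbps(sample_rate_hz: int, bits: int, channels: int) -> float:
    """Bit rate of uncompressed PCM audio in kbit/s."""
    return sample_rate_hz * bits * channels / 1000


def cbr_file_size_bytes(bit_rate_kbps: float, duration_s: float) -> float:
    """Approximate size of a constant-bit-rate stream of the given length."""
    return bit_rate_kbps * 1000 * duration_s / 8


cd_rate = pcm_bit_rate_kbps(44_100, 16, 2)       # 1411.2 kbit/s
ratio_at_128 = cd_rate / 128                     # roughly 11 : 1
size_4min_at_128 = cbr_file_size_bytes(128, 240) # 3 840 000 bytes
size_4min_cd = cbr_file_size_bytes(cd_rate, 240)
```

           At the maximum rate of 320 kbit/s the same calculation gives a
           reduction of about 4.4 : 1 relative to CD audio.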

           MP3 has become a Net phenomenon that is currently at the center of an
           enormous controversy, because MP3 allows people with an Internet
           connection to bypass record stores (and cashiers) and download
           CD-quality music by their favorite artists for free. MP3 is not
           welcomed by musicians and record companies, who expect their sales
           figures to drop. However, record companies and music publishing
           houses themselves are adopting this format for promotion purposes and
           music-on-demand over the Internet. The music industry is still
           discussing which of the partners in the game will be the losers,
           which the winners, and who will dominate the market.

           The home production of MP3 files is easily performed by means of CD
           rippers. CD rippers are programmes that extract (or rip) music tracks
           from a CD and save them onto the hard drive (e.g. Audiograbber). Once
           the tracks are located on the hard drive, they are converted to the
           MP3 format. Many CD rippers have MP3 encoders built in (such as
           MusicMatch Jukebox); otherwise a separate encoder utility, such as
           MP3Enc, is needed. Rippers are used to store the music programme of
           the user's current choice on convenient hard drives.

        10 For a demonstrator system of a digital sound archive see
           http://www.kfs.oeaw.ac.at/DLI/home.html or
           http://www.sb.aau.dk/Jukebox/edit-report-1.html

       For further interesting music sites see: DMX - Digital Music Express; TCI
       cable service: 95 music programs.

6.4    Frequently Used Bit Rates

       Service                                   Bit rate         Remarks
       CA*net3 CANARIE–Bell Canada               40 Gbit/s        optical Internet, Oct. 1998
       Internet2 US new national backbone        9.6 Gbit/s       network for end of 1999
       Internet2 US new national backbone        2.4 Gbit/s       network for end of 1998
       Internet2 US national backbone            22 Mbit/s        infrastructure, April 1998
       ATM OC-12                                 622 Mbit/s
       ATM OC-3                                  155 Mbit/s
       100-BaseT / FDDI LAN                      100 Mbit/s
       T3                                        45 Mbit/s
       10-BaseT Ethernet LAN                     10 Mbit/s
       T1                                        1.5 Mbit/s
       Digital HDTV                              40-60 Mbit/s     5.1 audio uncompressed
       Next generation DVD (Blue Laser)          23 Mbit/s
       DVD-ROM                                   11.08 Mbit/s
       Digital DVD-Audio (uncompr.)              9.6 Mbit/s       6 channels max., 96 kHz, 24 bit
       Digital TV, DVD-Video (NTSC)              6-10 Mbit/s      5.1 audio, NTSC video compr.
       Multichannel audio compressed             224-640 kbit/s   5.1 channels, Dolby Digital
       Compact disc                              1.41 Mbit/s      44.1 kHz, 16 bit, stereo
       Stereo audio uncompressed                 1.536 Mbit/s     48 kHz, 16 bit, stereo
       Stereo audio compressed ("MP3")           20-128 kbit/s    MPEG-2 Layer 3
       Normal telephone channel                  64 kbit/s        mono, limited bandwidth
       Telephone modem                           14.4-56 kbit/s   ITU V.90 modem 56 kbit/s
       Cable modem (with Ethernet card)          50-200 kbit/s    up to 10 Mbit/s theoretical
       ISDN                                      64-128 kbit/s    FM stereo quality
       ISDB (Integr. Services Digital Broadc.)   150 Mbit/s       NHK trans. 21 GHz/ch
       ADSL (a new telephone service)            512 kbit/s       uses standard wires
       ADSL high-speed modem                     1 Mbit/s
       Program data                              10 kbit/s
       Facsimile (fax)                           20 kbit/s
       Still picture                             70 kbit/s
       Teletext                                  100-200 kbit/s
       Audio graphics                            800 kbit/s/ch
       MIDI                                      31.25 kbit/s     per 16 channels

       Table 2: bit rates currently in use for different audio and multimedia
       services (from AES WP-1001 Technology Report TC-NAS 98/1: Networking
       Audio and Music Using Internet2 and Next Generation Internet Capabilities).
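
       The practical meaning of these rates for an archive can be estimated by
       dividing the amount of data by the link rate. The Python sketch below
       does this for one hour of uncompressed stereo audio (48 kHz, 16 bit) over
       a few of the links listed in Table 2. It assumes an ideal link without
       protocol overhead, and the choice of links is only an example.

```python
LINK_RATES_BIT_S = {          # example links, rates as listed in Table 2
    "V.90 modem": 56_000,
    "ISDN (upper value)": 128_000,
    "T1": 1_500_000,
    "10-BaseT Ethernet": 10_000_000,
}


def transfer_time_s(size_bytes: float, link_bit_s: float) -> float:
    """Ideal transfer time: no protocol overhead, link fully available."""
    return size_bytes * 8 / link_bit_s


def can_stream(stream_bit_s: float, link_bit_s: float) -> bool:
    """Real-time delivery needs a link at least as fast as the stream."""
    return link_bit_s >= stream_bit_s


pcm_bit_s = 48_000 * 16 * 2           # 1 536 000 bit/s
one_hour_pcm = pcm_bit_s * 3600 / 8   # bytes in one hour of audio

for name, rate in LINK_RATES_BIT_S.items():
    hours = transfer_time_s(one_hour_pcm, rate) / 3600
    print(f"{name:20s} {hours:8.2f} h   real time: {can_stream(pcm_bit_s, rate)}")
```

       Under these assumptions only the Ethernet LAN carries the uncompressed
       signal in real time; over the slower links the material would have to be
       compressed or transferred as a file.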

7.    Appendix

7.1   A Sample List of Audio Players

      1.00 for Macintosh

      a2b Music Player 1.00b8 for Macintosh

      a2b Music Player 2.0 for Windows 95/98/NT

      ARIES Breathe MP3 player 0.91 for Windows 95

      Aries Mod Player 1.0 for Windows 95, NT 4.0

      Audio CD Player for Windows 3.1

      Audioactive Player 1.2a for Macintosh PPC

      Audioactive Player 1.3 for Windows 3.1

      Audioactive Player 1.9 Beta for Windows 95/NT

      Beatnik Player 2.03 for Macintosh

      Beatnik Player 2.03 for Windows 95/98/NT

      CD player Maximus 3.3 for Windows 95/98/NT

      Cubic Player for Dos

      Destiny Media Player 1.31 for Windows 95/98/NT

      Digital Music Player for OS/2

      DSM Player 1.04 for Macintosh

      Dual Module Player for OS/2

       Hyperprism H-PPC Player for Macintosh

       iNERTiA PLAYER for Dos
       http://www.hitsquad.com/smm/programs/iNERTiA_PLAYER/

       Liquid Player 5.0 Preview for Macintosh

       Liquid Player 5.0 Preview for Windows 95/98/NT

       Melody Player 2.0 for Windows 95

       Microsoft Windows Media Player 6.4 for Windows 95/98/NT

       MIDI Player V1.55 for Windows 95

       Midi Synthi Player 5.8 for Windows 95/98/3.1

       Midisoft Internet Media Player v3.08 for Windows 95/98/NT

       Mikey Player 98 4.1 beta for Windows 95

       Mikeys mp3 Player for Windows 95, 98

       Mini MIDI Player v1.12 for Windows 95

       MM Player 4.02 for Windows 95/98/NT

       MM Player Pro 4.02 for Windows 95/98/NT

       MODPlug Player 1.40 for Windows 95/98/NT

       Mpeg Audio Player 1.21 for Macintosh

       Multi Module Music Player 1.00b4a for Windows 95

       Musician's CD Player for Windows 95

       NoteWorthy Player 1.50 for Windows 3.1

NoteWorthy Player 1.55a 16 bit for Win 3.x

NoteWorthy Player 1.55a for Windows 95/NT

PH Player for Atari

Player PRO Direct-To-Disk 0.1b for Macintosh

PROcessu CD Player 2.02 for Windows 95/98/NT

Real Player G2 Update 1 for Windows 95

RealPlayer 5.0 for Windows 3.1

RealPlayer G2 for Windows 95

Shockwave 6 Flash Player for 68K for Macintosh

Simple CD Player 2.3 for Windows 95/98/NT

SSEYO Koan File Player V2.2 for Windows 95/98/NT

Streaming Audio Player 0.8 beta for Windows 95/98/NT

ThrottleBox Player 1.2 for Windows 95/98/NT

TitleTrack CD Player v2.1 for Macintosh PPC

True Speech Audio Player for Mac (PPC) for Macintosh

True Speech Audio Player for Mac 68k for Macintosh

Ugly CD Player 2.1 for Macintosh

Unreal Player Max 2.02 for Windows 95/98/NT

Upscale Pro MIDI Player for Windows 95

       Variable Speed CD Player for Windows 95

       Wired Planet player for Windows 95/98/NT

       Xing MP3Player

       ya cd player 2.5 for Macintosh

       Yo!MPEG Player v1.0.2.79 for Windows 95/98
