New collaborations in the e-humanities

PARADISEC (Pacific and Regional Archive for Digital Sources in Endangered Cultures):
background, current structures, and thoughts on international collaborations


                Linda Barwick, University of Sydney
DELAMAN workshop, MPI Nijmegen, 29 November 2004
PARADISEC structure
• CIs: Cliff Goddard, Hugh de Ferranti
• CIs: William Foley, Allan Marett, Jane Simpson
• CIs: Andrew Pawley, John Bowden, Malcolm Ross, Alan Rumsey
• CIs: Steve Bird, Nick Evans, Cathy Falk, Janet Fletcher, John Hajek
• Audio Archiving Unit
  • Director: Linda Barwick
  • Audio: Frank Davey
  • Project Liaison: Amanda Ha
• Store account - web interface: Stuart Hungerford
• Project Manager (Metadata guru): Nick Thieberger
PARADISEC rationale
• prioritises Asia-Pacific region materials not
  otherwise catered for;
• provides a rational framework for prioritising and
  managing University research recordings using
  international archival formats and standards;
• implements IP arrangements tailored to
  University needs and practices;
• involves researchers in specialist description of
  resources;
• streamlines consortium processes to salvage
  important recordings and make them available for
  research in a timely and cost-effective way
Research applications
• Making Australian research available internationally
• Fieldwork - use for elicitation and documentation, and for
  language learning in preparation for fieldwork
• Return of materials to communities
• Digital tools for optimal transcription and analysis
• Comparative studies - historical recordings give time
  depth for area language and music studies
• Better understanding of diversity - data from some
  languages only in older recordings
• Incorporation of primary data in presentations and,
  ultimately, publications
   Staged approach
• Metadata - 1623 records, to make
 resources discoverable even if not yet
 digitised
• PIs (persistent identifiers) and content metadata
 need to be assigned before digitisation (some
 refinement during process)
• Repository - 807 items digitised to date,
 some complex e.g. fieldnotes (page
 images) or transcripts accompanying
 tapes
      Metadata November 2004
• 1623 records in the metadata repository
 with data from 24 countries in Asia-Pacific
 (Australia, Chile, Cook Islands, Fiji, French
 Polynesia, Hong Kong, Indonesia, India, Japan,
 Korea, Lao, Malaysia, Federated States of
 Micronesia, Myanmar (Burma), New Zealand,
 Palau, Papua New Guinea, Reunion, Singapore,
 Solomon Islands, Taiwan, Tonga, Vanuatu,
 Vietnam)
Metadata OLAC harvest
   Repository contents
• Repository totals 26 November 2004
  • total files: 2582
  • total items: 807
  • total size: 1.0TB
  • total hours audio: 627.3 hours
  • file types: .wav, .mp3 (1040); .tif (179),
    .jpg (46), .pdf (34), .txt (3), .rtf (8), .xml
    (32)
Repository Collections
Bradley (5hr)        McIntyre (10hr)
Capell (9hr)*        Margetts (17hr)
Corris (6hr)         Rumsey (17hr)*
Crowther (2hr)       San Roque (1hr)
Donohue (3hr)        Sam (4hr)*
Dutton (266hr)       Tepano (19hr)
Fedden (7hr)         Thieberger (39hr)
Foley (23hr)         Toulmin (35hr)
Gardner (56hr)       Voorhoeve (33hr)*
Kartomi (2hr)*       Wurm (2)*
Laycock (29hr)       Evans (Hons thesis)
Lawton (3hr)         Thieberger (PhD thesis)
McElhanon (41hr)
* Ingestion ongoing November 2004

Collection identifiers: AC1, AM2, AM3, AM4, AR1, BE1, CLV1, DB1, DG3, DL1,
KM1, LS1, LSR1, MC1, MC2, MD1, MK2, MT1, NT1, NT2, NT3, NT4, RL1, SAW2,
SF1, TD1, TT1, WF1, WS1
PARADISEC Repository Languages November 2004

INDIA: Rajbangsi

INDONESIA: Asmat, Brat, Hatam, Inanwatan, Manikion, Moi, Ningrum, Sahu,
Sebyar, Tinam, Todahe, Tok Pisin, Yahadian

PALAU: Palauan

PAPUA N. GUINEA: Abau, Ambonese Pidgin, Angoram (Kanduanuin),
Angoram (Moim dialect), Aomie, Arapesh, Arifama, Aunalei, Auwim, Awomo, Ba,
Balawaia, Barai, Baruga, Barupu (Warapu), Be'anivia, Biage, Bibo, Binandere,
Bodinumu, Boera, Boine, Boku, Boridi, Bouxula, BratMomire, Buin, Burum,
Chimba, Chirima, Daga, Darava, Dawawa, Dedua, Dima, Dimadima, Dina, Doga,
Domu, Doromu, Doura, Efogi, Efogi Dialects, Emo, Enivilogo, Fore, Fuyugey,
Gabadi, Ginuman, Gwedena, Herei, Hiae Motu, Hiri Motu, Hube, Hula, I'ai,
Ikega, Ioma, Isaka (Krisa), Kaipi, Kairi, Kambot, Kanga, Karama,
Karawari Lg (Ambinwari), Karukaru, Kâte, Kinalaknga, Kimi, Kiriwina, Koiari,
Koita, Koitabu, Kokila, Kokoro, Komba, Kopar, Koriki, Koriko, Kosorong,
Kovai, Kovio, Kubuirubu, Kuman, Kumukio, Kuni, Kunimaipa, Kwale, Laimodo,
Mada'a, Magi, Mâgobineng, Magore, Maisin, Maiwa, Managalas, Manam, Manubara,
Manumu, Mapei, Mapena, Mari, Maria, Mekeo, Melpa, Mian, Mid-Wahgi, Migabac,
Mindik, Miniafa, Mogoni, Mom, Mor, Motu, Muhiang Arapesh, Nabak, Naga,
Namanadza, Naoro, Nara, New Ireland Pidgin, Ngala, Nomu, Notu, Ondoro,
One (Onne), Onjab, Ono, Opao, Orokaiva, Orokolo, Ouma, Paiwa, Police Motu,
Porome, Qld Pidgin, Rabuka, Raepa Tati, Saliba, Samo, Sene, Sepik Tok Pisin,
Sialum, Sinaugoro, Sona, Suau, Suku, Surai, Taboro, Tairuma, Tauade, Tobo,
Tok Pisin, Tolai, Uberi, Ubir, Ubir Gonjoe, Vesilogo, Vioribaiwa, Wamora,
Wangun, Wiga, Wosera, Yele, Yewudu, Yimas, Yoba

SOLOMONS: Babatana, Ririo, Ruviana, Varese, Lau, Santa Cruz

VANUATU: South Efate, Bislama, Lelepa

NEW CALEDONIA: Dehu

FIJI: Lauan

TONGA: Tongan

COOK ISLANDS: Rarotongan, Pukapuka

FRENCH POLYNESIA: Tahitian

CHILE: Rapa Nui
         Regional links
• Institute of Papua New Guinea Studies
• Vanuatu Kaljoral Senta
• Archive of Maori and Pacific Music, U.
  Auckland
• University of Hawai’i
• New Caledonia - Tjibaou Cultural Centre
• Indonesia - UIN, Jakarta
• Malaysia - Universiti Malaya
• Rapa Nui - Museo antropologico P.
  Sebastian Englert
• Micronesia - Historical Preservation Office,
         Audio Ingest
• Initially ingested as raw WAV on AudioCube
 5 Dell 670 workstations running Wavelab
 (2005 will add remote Pyramix
 workstations)
  • Masters 24-bit 96 kHz Broadcast WAV
    Format (uncompressed audio with
    encapsulated metadata)
  • Some lower rate if digital original (e.g.
    16-bit 48 kHz from DAT)
• WAV > BWF by Quadriga software
  • derivatives produced by batch processing
    - CD-audio quality (16-bit, 44.1 kHz) and
    mp3 quality (128 kbps)
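
The storage cost of the formats listed above can be estimated with simple arithmetic. The following is a minimal sketch, not part of the PARADISEC toolchain. It assumes stereo (2-channel) audio, which the slides do not state, and it reads the mp3 figure as a constant 128 kbit/s. The function name is hypothetical.

```python
def pcm_bytes_per_hour(sample_rate_hz, bit_depth, channels):
    """Uncompressed PCM payload for one hour of audio, in bytes (headers ignored)."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * 3600


# Assumption: 2 channels (stereo); the slides do not give a channel count.
master = pcm_bytes_per_hour(96_000, 24, 2)         # 24-bit 96 kHz master
dat_original = pcm_bytes_per_hour(48_000, 16, 2)   # 16-bit 48 kHz from DAT
cd_derivative = pcm_bytes_per_hour(44_100, 16, 2)  # CD-audio quality derivative
# Assumption: mp3 at a constant 128 kbit/s (bits per second / 8 * seconds per hour).
mp3_derivative = 128_000 // 8 * 3600

for label, size in [("master", master), ("DAT original", dat_original),
                    ("CD derivative", cd_derivative), ("mp3 derivative", mp3_derivative)]:
    print(f"{label}: {size / 10**9:.2f} GB per hour")
```

Under these assumptions a stereo master takes roughly 2 GB per hour, about 36 times the size of the mp3 derivative.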
 Digital preservation
• “Azoulay” server partitioned for working files
  and archive partition for sealed masters -
  current capacity 750GB (>3TB in 2005)
• Sealed masters archived to 100GB data
  tapes on University of Sydney LTO Mass
  Data Storage System (high-low watermark
  script) - duplicate data tapes kept at 2
  locations on campus
• Sealed masters mirrored to APAC national
  Store facility (Canberra) nightly - nearline
  storage
• Password-protected online access to Store
  facility
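
The slide names a "high-low watermark script" without describing it. As a generic illustration of that kind of policy (not the University of Sydney script itself), the sketch below picks files to move to tape once disk usage passes a high watermark, oldest first, until usage falls to a low watermark. The thresholds, the tuple layout, and the function name are all assumptions made for this example.

```python
def select_for_migration(files, capacity_bytes, high=0.90, low=0.70):
    """Choose files to migrate off disk under a high-low watermark policy.

    files: list of (name, size_bytes, modified_time) tuples currently on disk.
    Returns names to migrate, oldest first; empty if usage is at or below
    the high watermark.
    """
    used = sum(size for _, size, _ in files)
    if used <= high * capacity_bytes:
        return []  # below the high watermark: nothing to do
    to_migrate = []
    for name, size, _ in sorted(files, key=lambda f: f[2]):
        if used <= low * capacity_bytes:
            break  # usage has dropped to the low watermark
        to_migrate.append(name)
        used -= size
    return to_migrate


# Hypothetical example: a 100-unit disk holding three sealed masters.
on_disk = [("a.wav", 40, 1), ("b.wav", 30, 2), ("c.wav", 25, 3)]
chosen = select_for_migration(on_disk, capacity_bytes=100)
```

Here usage starts at 95% of capacity, so the oldest file is selected. Removing it brings usage to 55%, below the low watermark, and the loop stops.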
PDSC data flow
          Networking
• Main campuses (University of Sydney,
 University of Melbourne, Australian National
 University) connected by GrangeNet (next
 generation research network, 10Gbps
 connections)
  • Pay subscription, not traffic costs
• Satellite campus UNE connected by AARNet
 (Australian research and education network
 - currently billed traffic cost, 155Mbps
 connection)
• Both with connections to APAN community
 (Asia Pacific Advanced Networks) - potential
 for linking to regional and international R&E
 networks - potential traffic costs an issue
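
To give a sense of what the two link speeds mean for moving archive data, the sketch below estimates the best-case transfer time for 1 TB (about the size of the repository in November 2004). It assumes the full nominal bandwidth is available and ignores protocol overhead, so real transfers would be slower. The function name and the `efficiency` parameter are hypothetical.

```python
def transfer_hours(size_bytes, link_bits_per_second, efficiency=1.0):
    """Best-case hours needed to move size_bytes over a link.

    efficiency is an assumed fraction of nominal bandwidth actually achieved.
    """
    seconds = size_bytes * 8 / (link_bits_per_second * efficiency)
    return seconds / 3600


one_tb = 10**12  # decimal terabyte
grangenet_hours = transfer_hours(one_tb, 10 * 10**9)  # 10Gbps link
aarnet_hours = transfer_hours(one_tb, 155 * 10**6)    # 155Mbps link

print(f"10Gbps:  {grangenet_hours:.2f} hours")
print(f"155Mbps: {aarnet_hours:.2f} hours")
```

At nominal rates the 10Gbps link moves 1 TB in under a quarter of an hour, while the 155Mbps link needs more than 14 hours.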
              Storage
• Australian Partnership for Advanced Computing
 National Facility Mass Data Storage System -
 Hierarchical Storage Manager system
  • Funded by consortium of Australian higher
    education bodies
• Tape robot system - can handle 1.2PB
  • PARADISEC will add 2-3TB per year once
    satellite ingest commissioned
  • Current horizon of facility 2008 - project
    PARADISEC collection up to 9TB by then
  • Will need to apply to host material/share data
    from other DELAMAN collections
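
The growth figures above can be turned into a simple linear projection. The sketch below is illustrative only. It assumes a starting size of 1.0 TB (the repository total reported for November 2004) and four full years of ingest (2005 to 2008); the slides do not say how the 9TB figure was derived. The function name is hypothetical.

```python
def project_collection_tb(start_tb, growth_tb_per_year, years):
    """Linear projection of collection size, one value per year (year 0 = start)."""
    return [start_tb + growth_tb_per_year * n for n in range(years + 1)]


# Assumptions: 1.0 TB at the end of 2004, four years of ingest to 2008.
low_estimate = project_collection_tb(1.0, 2.0, 4)   # 2 TB added per year
high_estimate = project_collection_tb(1.0, 3.0, 4)  # 3 TB added per year

print("2 TB/year:", low_estimate)
print("3 TB/year:", high_estimate)
```

Under these assumptions the lower rate reaches 9 TB in 2008 and the higher rate 13 TB. Either figure is well under 2% of the 1.2PB the tape robot system can handle.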
                Streaming
• GrangeNet streaming server currently in trial mode
 - only available within network
• Soon to have automatic copying of main collection
 to streaming server
• Foresee higher demand for access when scaled
 streaming access to excerpts available; but also
 greater resources needed to mount and manage
  • Will depend on researchers’ provision of
    timecoded transcripts/glosses
  • Access and authentication protocols yet to be
    developed
  • Testbed for citation/integration into e-publications
             Software
• Initial metadata database in Filemaker Pro 6
  with periodic XML dumps for OLAC static
  harvesting
• Currently being ported to MySQL/PHP to allow
  dynamic harvesting and other functionality
• Python software for managing repository and
  website (Stuart Hungerford, ANU)
• Developing Java-based geographic search
  interface (TimeMap)
• All based on Open Source tools
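
A periodic XML dump of catalogue records, as mentioned above, might look roughly like the sketch below. It is a simplified illustration using Dublin Core element names, not the PARADISEC export code and not a complete OLAC static repository file. The record fields, the identifier `XX1-001`, and the bare `record` wrapper element are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Dublin Core element namespace, which OLAC metadata builds on.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def record_to_xml(record):
    """Serialise one catalogue record (a dict) as a simple Dublin Core fragment."""
    root = ET.Element("record")  # hypothetical wrapper element
    for field in ("identifier", "title", "language", "date"):
        value = record.get(field)
        if value:  # skip fields that are missing or empty
            element = ET.SubElement(root, f"{{{DC_NS}}}{field}")
            element.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical record; the values are invented for illustration.
example = {
    "identifier": "XX1-001",
    "title": "Hypothetical field recording",
    "language": "en",
    "date": "2004-11-26",
}
xml_text = record_to_xml(example)
print(xml_text)
```

A static dump like this has to be regenerated whenever the database changes. A dynamic harvest avoids that by answering each request from the live database.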
             Implications
• Implementations will change over time - foundation
  for cooperation must be agreements and alignment
  of strategic objectives
  • Minimal shared standards needed on formats,
    ethics, description, rights - what else?
• Possibility of staged modular approach
  • federated discovery platform
  • proof-of-concept pilot studies/trials
    • targeted data sets for exchange
    • dark hosting/mirroring
    • tools development and testing
                    Issues
• Transnational projects - how to identify and
  coordinate international funding opportunities?
• Projections of international traffic & storage
  charges - funding implications
• Sustainability of our collections - how to cost
  overheads and source long-term funding
  commitments
• DELAMAN governance and administration
  structures? How to resource and support without
  duplication/reinventing the wheel, adding to
  administrative burden?
• How to involve all stakeholders (including
   APAN Bangkok 2005
•E-science workshop: Toward a semantic web for digital data
archives (convenor V. Balaji, Princeton)
•Immense quantities of digital data and images are now archived and publicly
available through the web. These include domain-specific data archives,
covering such domains as weather and climate, seismology and geophysics,
astronomy and particle physics, as well as images and digital copies of non-
textual human cultural production. Describing, cataloguing, searching and
locating information within digital data and image archives is one of the grand
technological challenges of the semantic web era. This session will draw
together participants from diverse fields of science and the humanities to
share their experience on metadata, standards and techniques for access to
large digital archives.

•Tentative Titles of presentations:
      • 1) The Hierarchical Data Format for EOS (HDF-EOS), Richard
           Ullman, NASA Goddard Space Flight Center (Invited)

       •   2) Metadata Requirements for Global Climate Models, V. Balaji,
           NOAA Geophysical Fluid Dynamics Laboratory

       •   3) DELAMAN?? Remote presentation…
PARADISEC gratefully acknowledges support from:
• Partner Universities (Sydney, Melbourne, ANU,
 UNE)
• Australian Research Council LIEF scheme
• Australian Partnership for Sustainable
 Repositories (SORRT testbed)
• Australian Partnership for Advanced Computing
• GrangeNet
• ANU Internet Futures
           Contact us
•         http://www.paradisec.org.au


• Linda.Barwick@paradisec.org.au
    (Director)


• Nicholas.Thieberger@paradisec.org.au
    (Project Manager)
      Relevant URLs
• PARADISEC website http://paradisec.org.au/
• PARADISEC repository login
  http://store.apac.edu.au/cgi-bin/pdsc-v3.0.cgi/login
• PARADISEC streaming trial
  http://paradisec.org.au/streamingtrial.html
• Transcript page image trial
  http://www.austehc.unimelb.edu.au/~gavan/lana/hdms.htm
• TimeMap digitiser tool proof of concept
  http://acl.art.usyd.edu.au/TMDigitiser/

				