
DISTRIBUTED ENCODING ENVIRONMENT
BASED ON GRIDS AND IBP INFRASTRUCTURE

           Petr Holub*‡ and Lukáš Hejtmánek*

*Faculty of Informatics and ‡Institute of Computer Science,
Masaryk University, Brno
and ‡CESNET, Prague
Czech Republic

Motivation
• Huge production of multimedia content, especially video
      – education (lectures, educational movies), science, entertainment, etc.
• Need for transformation (transcoding) from source
  formats to formats suitable for downloading and
  streaming
      – very computationally demanding
• Problems with storage capacity

• BUT: We have great Grid infrastructure! :-)

TERENA Networking Conference 2004, Rhodes, Greece

Used Infrastructure
• MetaCenter Grid infrastructure in the Czech Republic
      – PC clusters
         • more than 80 dual-processor (PIII and P4) nodes with 2
           GB RAM and a fast scratch disk
         • Gigabit Ethernet (GE) and Myrinet interconnects
         • Scheduling system: PBSPro
         • clusters are cheap and grow fast!
      – SGI machines, Alphas...
• Distributed Data Storage (DiDaS)
      – 15 TB of distributed storage based on IBP (Internet Backplane Protocol)

MetaCenter, DiDaS, CESNET Network
[Figure]

IBP Overview
• exNode: XML-serialized metadata
      – a collection of capabilities for the allocated IBP byte arrays
      – essential for file access
• We store exNodes in AFS
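
To make the role of the exNode more concrete, the sketch below models it as a list of mappings from byte ranges of a logical file to IBP capabilities, and shows how a read request could be resolved to the allocations that cover it. The names `Mapping`, `ExNode`, `read_cap`, and `resolve`, as well as the capability strings, are hypothetical; the sketch does not reproduce the actual exNode XML layout.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Mapping:
    """One extent of the logical file held in one IBP allocation (hypothetical model)."""
    offset: int     # start of the extent within the logical file, in bytes
    length: int     # extent length in bytes
    read_cap: str   # opaque read capability for the allocation (placeholder string)


@dataclass
class ExNode:
    """Illustrative stand-in for an exNode: a collection of mappings."""
    mappings: List[Mapping] = field(default_factory=list)

    def resolve(self, offset: int, length: int) -> List[Tuple[str, int, int]]:
        """Return (capability, offset_in_allocation, byte_count) pieces covering the request."""
        end = offset + length
        pieces = []
        for m in sorted(self.mappings, key=lambda m: m.offset):
            lo = max(offset, m.offset)
            hi = min(end, m.offset + m.length)
            if lo < hi:  # this mapping overlaps the requested range
                pieces.append((m.read_cap, lo - m.offset, hi - lo))
        return pieces


# Example: a 300-byte file spread over two allocations.
exnode = ExNode([
    Mapping(offset=0, length=200, read_cap="cap-A"),
    Mapping(offset=200, length=100, read_cap="cap-B"),
])
pieces = exnode.resolve(offset=150, length=100)
```

A read that crosses an allocation boundary is split into one piece per allocation, which is why the metadata has to be available before any data can be accessed.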

Scheduling Model
• Selection of best hosts
      – based on Completion Time Estimate (CTE)
• Data location optimization
      – selection of best storage depots
      – prefetch support
• Simplified CTE


      – problem with the network performance estimate b_{D,p}(t)
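
The CTE formula itself is not preserved in this text, so the following is only a sketch of the idea under stated assumptions: the estimate is taken to be queue wait plus transfer time plus encoding time, and the bandwidth between depot D and processor p is a single measured snapshot rather than a time-dependent prediction. All function and parameter names are hypothetical.

```python
from typing import Dict, Tuple


def completion_time_estimate(size_mb: float,
                             bandwidth_mb_s: float,
                             encode_rate_mb_s: float,
                             queue_wait_s: float = 0.0) -> float:
    """Assumed CTE: wait in the queue, fetch the data, then transcode it."""
    if bandwidth_mb_s <= 0 or encode_rate_mb_s <= 0:
        raise ValueError("bandwidth and encode rate must be positive")
    return queue_wait_s + size_mb / bandwidth_mb_s + size_mb / encode_rate_mb_s


def best_placement(size_mb: float,
                   bandwidth: Dict[Tuple[str, str], float],
                   encode_rate: Dict[str, float],
                   queue_wait: Dict[str, float]) -> Tuple[str, str]:
    """Pick the (depot, host) pair with the smallest estimate."""
    return min(
        bandwidth,
        key=lambda dp: completion_time_estimate(
            size_mb, bandwidth[dp], encode_rate[dp[1]], queue_wait.get(dp[1], 0.0)),
    )


# Made-up numbers for illustration only.
bandwidth = {
    ("depot1", "hostA"): 10.0, ("depot2", "hostA"): 40.0,
    ("depot1", "hostB"): 50.0, ("depot2", "hostB"): 5.0,
}
encode_rate = {"hostA": 2.0, "hostB": 1.0}
queue_wait = {"hostA": 0.0, "hostB": 30.0}
best = best_placement(100.0, bandwidth, encode_rate, queue_wait)
```

With these inputs the pair (depot2, hostA) wins with an estimate of 52.5 s. The result is only as good as the bandwidth figures fed into it, which is the estimation problem noted above.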


Scheduling Algorithm (1/2)

• General scheduling belongs to the NPO class
      – for uniform processors and jobs of different sizes
• Our greedy algorithm is in the PO class when processors and
  depots are connected via a complete graph
      – takes advantage of uniform task size
      – formal proof of correctness
      – for a general graph, the scheduling belongs to the NPO class
        again, as the greedy algorithm might prevent maximum utilization
        of depots
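
As an illustration of why uniform task size helps, here is a minimal greedy sketch: equally sized chunks are handed out one at a time to whichever processor would finish the next chunk soonest. It is not the algorithm from the talk and carries no correctness proof; it ignores depots and data transfer entirely, and the names `greedy_schedule`, `makespan`, and `chunk_time` are hypothetical.

```python
from typing import Dict, List


def greedy_schedule(num_chunks: int, chunk_time: Dict[str, float]) -> Dict[str, List[int]]:
    """Assign equally sized chunks to the processor that would finish each one soonest.

    chunk_time maps a processor name to the (assumed constant) time it needs per chunk.
    """
    ready_at = {p: 0.0 for p in chunk_time}            # when each processor becomes free
    plan: Dict[str, List[int]] = {p: [] for p in chunk_time}
    for chunk in range(num_chunks):
        # Greedy step: earliest completion time; ties are broken by processor name.
        best = min(chunk_time, key=lambda p: (ready_at[p] + chunk_time[p], p))
        ready_at[best] += chunk_time[best]
        plan[best].append(chunk)
    return plan


def makespan(plan: Dict[str, List[int]], chunk_time: Dict[str, float]) -> float:
    """Time at which the last processor finishes its assigned chunks."""
    return max(len(chunks) * chunk_time[p] for p, chunks in plan.items())


speeds = {"fast": 1.0, "slow": 2.0}   # seconds per chunk, made-up values
plan = greedy_schedule(6, speeds)
total = makespan(plan, speeds)
```

Because every chunk costs the same on a given processor, each greedy decision depends only on when a processor becomes free, which keeps the per-chunk choice simple.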


Scheduling Algorithm (2/2)
[Figure]

Implementation
• Distributed Encoding Environment
      – for steering transcoding process
• libxio library
      – for enabling IBP in applications
• relies on transcode and HelixProducer for the actual
  data transcoding
      – many input/output formats are built into transcode: MPEG-1,
        MPEG-2, MPEG-4 (DivX, MS MPEG...), DV, RAW, etc.
      – RealMedia and others through external compression software
        (e.g. HelixProducer)

libxio library

• Provides equivalents of the standard UNIX I/O functions
      – open, close, read, write, ftruncate, lseek, stat, fstat, and lstat
• IBP URI format


      – without the lors:// prefix, a local file is accessed
      – local_path/file specifies the serialized metadata
      – the short form lors:///local_path/file is available
        for reading
• IBP-enabled transcode is built on libxio
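
The dispatch rule described above (no prefix means a local file, a `lors://` prefix means an IBP-backed file) can be sketched as follows. This is not libxio code: `classify` and `Target` are hypothetical names, only the short form is interpreted, and keeping the leading slash of the metadata path is an assumption.

```python
from typing import NamedTuple, Optional

LORS_PREFIX = "lors://"


class Target(NamedTuple):
    kind: str             # "local" or "ibp"
    path: Optional[str]   # local file path, or metadata path when it can be determined
    raw: str              # the original string, untouched


def classify(name: str) -> Target:
    """Decide whether a name refers to a local file or to an IBP-backed file."""
    if not name.startswith(LORS_PREFIX):
        return Target("local", name, name)   # no prefix: ordinary local file
    rest = name[len(LORS_PREFIX):]
    if rest.startswith("/"):
        # Short form lors:///local_path/file: the remainder is treated as the
        # path of the serialized metadata (assumption: leading slash is kept).
        return Target("ibp", rest, name)
    # Long form: its exact layout is not covered by this sketch, so it stays unparsed.
    return Target("ibp", None, name)


local = classify("video.dv")
remote = classify("lors:///local_path/file")
```

Keeping the decision in one place is what lets an application such as transcode switch between local and IBP storage without changes to its own I/O calls.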

Distributed Encoding Environment: Overview
[Figure]

Distributed Encoding Environment (1/3)
• lors tools are used for uploading from editing stations
  (Win32, MacOS X)
• remultiplexing for proper video/sound interleaving


Distributed Encoding Environment (2/3)
• image and audio transformations are performed using transcode
      – image size reduction, de-interlacing, noise reduction, color
        corrections, audio resampling and cleaning


Distributed Encoding Environment (3/3)
• IBP-enabled servers
• IBP-enabled client applications

Pilot User Groups (1/2)
• Lecture recording @ Faculty of Informatics, MU
      – 20 hrs/week; new lecture halls with automatic video
        acquisition
         • HW conversion of analog signals to DV using Canopus
           ADVC-100 boxes
      – several target formats
         • high quality RealMedia (768×576 @ 25 fps, 3 Mbps)
         • low quality RealMedia (384×288 @ 15 fps, 56-768 kbps)
         • DivX (384×288 @ 25 fps, 1 CD)



Pilot User Groups (2/2)
• Neurosurgery department at St. Anna University
  Hospital in Brno
      – large archives of operation recordings
      – the department is willing to make them available to medical
        students
      – some editing is necessary: selecting only the interesting
        segments and anonymizing the patients
      – publishing to the CESNET RealMedia streaming server





Future Work
• Deployment of new scheduling systems
      – DataGrid/EGEE, GridLab, or something else?
• Network traffic prediction service
      – suitable for distributed data storage
      – support for regularly running jobs
      – support for in-advance bandwidth allocations
• GUI for DEE




Acknowledgements
• CESNET Development Foundation projects 017/2002
  (DEE) and 018/2002 (DiDaS)
• CESNET Research Intent MSM 6383917201
• Miloš Liška, Luděk Matyska, Eva Hladká and
  MetaCenter staff




Thank you for your attention!

            Q/A?

								