Data Handling at Fermilab and Plans for Worldwide Analysis




Vicky White
Computing Division and D0 Experiment, Fermilab
                                          Outline

    (I)  -- Solutions already implemented
                 in use today - HEP experiments, Sloan Digital Sky Survey, theorists'
                  Lattice Gauge computation
                 operational experience with the Mass Storage Component
    (II) -- Solutions being implemented for Collider Run II
          with upgraded detectors (March 2001)
                 Building and testing data handling solutions for CDF and D0
    (III) -- Moving onwards - to the future
                 SDSS and NSF KDI
                 SANs
                 Particle Physics Data Grid
                 MONARC and planning for CMS
    (IV) -- Conclusions

  (I) Solutions already implemented

(the Hierarchical Mass Storage Component of them)




              The 'Old' central mass storage system




              FMSS quota usage updated at Sat Feb 5 01:00:00 CST 2000

              Group    Exp      FMSS Quota (KB)     Used (KB)
              ================================================
              g022     g022     40,000,000.0        11,057,801.7
              ktev     ktev     6,000,000,000.0     5,615,922,181.0
              sdss     sdss     2,000,000,000.0     1,985,392,386.0
              canopy   canopy   5,200,000,000.0     5,033,291,488.0
              mssg     mssg     1,000,000,000.0     4,800,833.0
              e781     e781     3,000,000,000.0     2,911,541,573.0
              e831     e831     2,000,000,000.0     1,712,487,988.0
              minos    minos    250,000,000.0       23,601,888.0
              cosmos   cosmos   100,000,000.0       0.0
              e740     e740     4,000,000,000.0     4,637,814,378.0
              cms      cms      800,000,000.0       662,489,480.0
              auger    auger    150,000,000.0       80,381,914.0
              btev     btev     200,000,000.0       116,132,322.0
              e791     e791     300,000,000.0       260,439,481.0
              e866     e866     200,000,000.0       4,801,515.0
              e815     e815     400,000,000.0       373,635,826.0
              hppc     hppc     914,400,000.0       170,923,724.0
              e811     e811     50,000,000.0        19,215,485.0
              e872     e872     75,000,000.0        64,752,717.0
              theory   theory   102,400,000.0       55,218,237.0
              e665     e665     20,480,000.0        4,737,660.0
(II) Building and testing data handling
       solutions for CDF and D0

                           the Problem
                   the Solutions - what and how
              dealing with a worldwide collaboration




                  Run II - Petabyte Storage and Data
                            Access Problem

              Category       Parameter                  D0                    CDF
              DAQ rates      Peak rate                  53 Hz                 75 Hz
                             Avg. event size            250 KB                250 KB
                             Level 2 output             1000 Hz               300 Hz
                             Max can log                Scalable              80 MB/s
              Data storage   # of events                600 M/year            900 M/year
                             RAW data                   150 TB/year           250 TB/year
                             Reconstructed data tier    75 TB/year            135 TB/year
                             Physics analysis
                             summary tier               50 TB/year            79 TB/year
                             Micro summary              3 TB/year             -
              CPU            Reconstruction/event       1000-2500 MIPS.s/ev   1200 MIPS.s/ev
                             Reconstruction             34,000-83,000 MIPS    56,000 MIPS
                             Analysis                   60,000-80,000 MIPS    90,000 MIPS
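
  As a cross-check using only the numbers above, the yearly RAW data volume follows
  directly from the event count and average event size; for D0:

      $600 \times 10^{6}\ \mathrm{events/year} \times 250\ \mathrm{KB/event}
        = 1.5 \times 10^{14}\ \mathrm{B/year} \approx 150\ \mathrm{TB/year}$

  in agreement with the RAW data row of the table.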
                         The CDF Detector




              Key Elements of Run II Data Handling

   What is our overall strategy for the DH system?
   How do we physically organize the data?
   On what do we store it - where?
   How do we migrate between parts of the storage
      hierarchy?
   How do we provide intelligent and controlled access
      for large numbers of scientists?
     … and track all the processing steps
   How do we make it scalable, robust, available?
   How do we work with the data at remote sites?
   What are we learning for the next-generation experiments?

                  Past and Present Strategies for data
                       processing/data handling
   Use 'Commodity' components where possible
                 inexpensive CPUs in 'farms' for reconstruction processing,
                  e.g. PCs
                 inexpensive (if somewhat unreliable) tape drives and media
   Multi-vendor
                 IBM, SGI, DEC, SUN, Intel PCs
   Use much Open Source Software (Linux, GNU, tcl/tk,
       python, apache, CORBA implementations…)
   Hierarchy of active data stores
                 Disk, Tape in Robot, Tape on Shelf
   Careful placement and categorization of data on
      physical medium
                 optimize for future access patterns
                          Processing Farm




                 Several Processing Farms




                        D0 Data Access (Read and Write)
                                 Abstraction

  [Diagram: online data acquisition computers, database servers, reconstruction
  processing farms and analysis computers connected by network fabric(s) to data
  movers, tape robot(s) and tape shelves]

  Key factors:
  a) Organization of data on tape to match access
  b) Understanding and controlling access patterns
  c) Disk caches for most frequently accessed data
  d) Management of pass-through data disk buffers
  e) Rate-adapting disk buffers where necessary
  f) Scalability and robustness
  g) Bookkeeping and more bookkeeping…
  h) Distributed client/servers ---> worldwide solutions
              Designing for 200MB/s in/out Robot




                      D0 - the Real System




                                 CDF Run II Data Flow

  [Diagram: data flows at 75 Hz / 20 MB/s from the DAQ over a Fiber Channel
  connection into ~30 Terabytes of disk and to tape; reconstruction reads RAW
  data at 20 MB/s and writes ~50 secondary datasets at 10 MB/s; analysis reads
  primary datasets at 150 MB/s, with total read traffic of 1.5 GB/s.]
                      Run II Data Access - strategies

   data content for an event from different processing
       stages stored in different physical collections
                  'tiers' of data of different sizes and content - RAW, fully
                  reconstructed, summary reconstructed, highly condensed
                  summary, ntuples and meta-data
   primarily file-oriented access mechanisms
                 fetch a whole collection of event data (i.e. 1 file ~ 1 GB)
                 read through and process it sequentially
   optimize traversal of data & control access based on
       physics & user - not on file system
   use relational databases (Oracle centrally) for file
       and event catalogs and other 'detector
       conditions' and calibration data (0.5 - 1 TB)
   import simulated data (files and tapes) from MC
                    Data Tiers for a single Event

              ~200 B       Data Catalog entry
              5-15 KB      Condensed summary physics data
              50-100 KB    Summary Physics Objects
              ~350 KB      Reconstructed Data - Hits, Tracks, Clusters, Particles
              250 KB       RAW detector measurements
               Data Streams and Data Tiers




               Streaming the Data - optimize for data
                         access traversal

              D0 approach to streaming the data:
                 Up-front physical data organization and clustering
                 Multiple streams written and read in parallel
                 Streams are physics based, unlike disk striping
                        CDF Data Streaming

   also separate data into many physical streams
   not 'exclusive' streams - data may be written to
       multiple physical streams
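
  A minimal sketch of this non-exclusive routing is given below; the stream
  names and trigger flags are invented for illustration and are not CDF's
  actual streams.

      # Illustrative only: route each event to every physics stream whose
      # trigger condition it satisfies, so one event may land in several streams.
      STREAMS = {
          "jet":      lambda ev: ev["jet_trigger"],
          "electron": lambda ev: ev["em_trigger"],
          "muon":     lambda ev: ev["mu_trigger"],
      }

      def route(event):
          """Return the list of streams this event should be written to."""
          return [name for name, passes in STREAMS.items() if passes(event)]

      # An event firing both the jet and electron triggers is written twice:
      event = {"jet_trigger": True, "em_trigger": True, "mu_trigger": False}
      print(route(event))       # -> ['jet', 'electron']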




                      Access to Objects - OO design
   C++ Reconstruction and Analysis Programs
                 Fully object oriented design - STL, Templates
                 reference-counted pointers (D0)
                 OO data model
                 like OODBMS, persistent objects inherit from a persistent class
   Objects and Collections of Objects stored
      persistently to disk and tape
                 'flattened' out to files in special HEP formats
   d0om persistency package for D0
                     supports various external 'flattened' formats, including
                      relational database
                     allows for possibility of storing some 'tiers' of the data in OO
                      database if proven useful

   ROOT (HEP analysis package) file format for CDF
   Schema evolution can be tailored to need
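
  A minimal sketch of the persistency pattern described above (persistent
  objects derive from a common base class and are 'flattened' to a file-based
  store) is given below; it is illustrative only and is not the d0om or ROOT
  interface.

      # Illustrative only: OODBMS-style persistence via a persistent base class.
      import json

      class Persistent:
          def flatten(self):
              """Flatten the object to a plain dictionary for storage."""
              return {"class": type(self).__name__, **vars(self)}

      class Track(Persistent):
          def __init__(self, pt, eta, phi):
              self.pt, self.eta, self.phi = pt, eta, phi

      class Cluster(Persistent):
          def __init__(self, energy):
              self.energy = energy

      def store(objects, path):
          # One JSON record per line stands in for the 'special HEP formats'.
          with open(path, "w") as f:
              for obj in objects:
                  f.write(json.dumps(obj.flatten()) + "\n")

      store([Track(45.2, 0.8, 1.2), Cluster(33.1)], "event0001.dat")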
                     Object Databases/Strategies and Choices

 An OODB more or less adopts the following ideas:

 Objects represent entities and concepts from the application domain. Their
 behavior is defined by their associated methods, which can also be stored in
 the DB, thus making them 'universal' and available.

 Hierarchies of classes inherit behavior - to avoid storage of redundant
 information and improve simplicity, similar objects are grouped together.

 GOAL --- have full database capability available to any object which can be
 created in any (supported) language.

 DREAM -- minimum of work to store an object + DB provides query, security,
 integrity, backup, concurrency control, redundancy + has the performance of a
 hand-tuned object manager for your particular application.

 The "Natural Laws" of data storage and retrieval:
 •The more I know about the data, the more likely and the faster it can be found
 •The sooner I know what you want, the faster you will get it
 •The less variety in the data you have, the more opportunities for optimization
 •The less often you restructure the data, the less overhead in keeping track of it
 •The more people, from more places, who want access to the data, the tougher
  the problem of serving them
 •The more often you want to ask the same questions, the easier it will be to
  optimize for those 'queries'
 •It will be much faster to "give you what you stored" than to find some new
  pattern contained in several "things" that you stored
 •The more complicated the pattern you search for, the longer the search will take
                Object Lessons - Tom Love

   characterization and “Natural Laws” from the book
       Object Lessons - Lessons Learned in Object
      Oriented Development Projects by Tom Love

   “You can never achieve maximum performance with
      a system designed for maximum flexibility”

   CDF and D0 both chose performance over flexibility -
      at least for the bulk of the data




                      Serial Media Working Group Report

              Technology              $/drive   $/media   Size (GB)   $/GB    MB/s (str/ran)
              Redwood (StorageTek)    80k       78        46.6        1.67    10/4
              DD-2 (Ampex)            72k       72        46.6        1.54    14/8
              3590 (IBM)              30k       54        10          5.4     8/5
              DTF (Sony)              30k       80        42          1.9     12/6
              Eliant (Exabyte)        1.7k      5.4       6.5         0.83    1/0.9
              DLT7000 (Quantum)       5.5k      80        32.6        2.45    5/2
              EXB-8900 (Exabyte)      3.5k      72        20          3.6     3/2
              AIT-1 (Sony)            3.0k      72        25          2.88    3/2.7

               Conclusions for Run II: decide in 1999, maintain options and
               flexibility, purchase multi-drive capable robot => Grau/EMASS
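
  (The $/GB column is simply the media cost divided by the cartridge capacity;
  for Redwood, for example, $\$78 / 46.6\ \mathrm{GB} \approx \$1.67/\mathrm{GB}$.)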
              EMASS AML2 Robot - flexible media -
                up to 5000 cartridges per tower




                                                                              One for each -
                                                                               CDF and D0



                    Storage Management Software Requirements

 Robot Tape Library is not an archive but rather an active
    store. We therefore need to:
          control placement of data on tapes
          write RAW data from DAQ reliably and with absolute priority
          exchange tapes frequently between robot and shelf
          use open tape format and provide packages to read/write tapes
          mark files and groups of files as read only
          control robot arm and tape bandwidth according to access mode,
           project, user, etc. Keep the system up 24x7.
          access files from many different vendor machines, including
           PC/Linux, without software licensing issues

 Because we were unable to assure ourselves that the necessary HPSS
    modifications and enhancements would be available for Fall '99,
    Fermilab decided to build a more agile and flexible system
    modeled on that of DESY.
                     ENSTORE storage management for Run II

  [Diagram: a client issues "encp [options] <source> <destination>"; the data
  path runs through Mover nodes, which use the ftt tape I/O package to drive
  the media, under the control of the Enstore servers; the namespace is
  provided by a PNFS server host (PNFS, "Perfectly Normal File System", from
  DESY), mounted as /pnfs with admin and usr branches. ENSTORE is a
  replacement for OSM.]
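
  Using the command form shown above, files are copied to and from the robot
  through the /pnfs namespace; for example (the file paths here are invented
  for illustration):

      encp /data/raw/run0001_001.dat /pnfs/sam/d0/raw/run0001_001.dat
      encp /pnfs/sam/d0/raw/run0001_001.dat /scratch/run0001_001.dat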
              ENSTORE system in operation today

         Data Catalog support for 1 million files and 16,000 volumes
          tested
         Integrated Fermilab-written tape I/O package - ftt - for tape
          handling - supports error handling and statistics
         Scalability looks good -- achieved 20 MB/sec into Origin 2000,
          using just one Gbit Ethernet, also into Farms. Would have
          produced graphs of 50 MB/sec with 3 Gbit Ethernets if a Cisco
          switch had not broken
         Working on robustness -- mainly of the hardware

   Because of their overall strategy for tape and disk - planning for
       Storage Area Networks for disk, and preferring directly
       connected, separate, tape drives for their Farms and Central
       Analysis Server - CDF do not use Enstore.

   CDF has built its own tape staging package, built on mt_tools and the
       same underlying ftt tape I/O package.
                          GB/day - 3494 Robot + HPSS

  [Chart: Fermilab Central Mass Storage System Utilization - gigabytes
  transferred per day (writes, reads and I/O rate), March 16th - August 24th,
  1998. Average rate is 3 MB/sec; maximum sustained rate is 23 MB/sec.]
                Some recent Enstore statistics from
                             the web

  http://www-d0en.fnal.gov/enstore




              All the Enstore mover nodes kept busy
                by the D0 SAM data access system




                               CDF disk to tape

We have invented a "poor man's" SAN for
      read-only disk in a heterogeneous
      environment
Suitable for static datasets that change
      infrequently
Use the ISO-9660 file system used by
      CD-ROMs.
We have verified that the UNIX systems of
      interest (SGI, SUN) are able to format a
      disk using the ISO-9660 format, put data
      on it and read the data from multiple
      systems
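
  A rough sketch of this recipe on a Linux host using standard tools (the
  dataset path, image name and device below are invented; the verification
  quoted above was done on SGI and SUN systems with their native utilities):

      # Build an ISO-9660 image from a staged dataset, copy it onto a raw
      # disk, and mount it read-only; any host that understands ISO-9660
      # can then read the same disk.
      mkisofs -R -o dataset.iso /staging/dataset
      dd if=dataset.iso of=/dev/sdb bs=1M
      mount -t iso9660 -o ro /dev/sdb /mnt/dataset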




                    Experiment Data Access Software

 Define collection of data to be processed:
      Specify by Data Tier, Data Stream, Triggers, Run Ranges, Specific Files
      or Events, etc.

 Resolve to list of files:
      Use Oracle Relational Database Query Engine

 Intelligent movement of data:
      Optimize traversal of data
      Regulate use of robot for different purposes and access modes
      Implement disk cache retention policies

         •SAM (Sequential Access Model) System for D0
         •CDF's smallest unit is a Fileset, and they use only this to optimize
          tape access and minimize robot arm use
               D0 SAM - CORBA based framework

  [Diagram: networked clients and servers organized into "stations" - logical
  or physical groupings of resources - with a global optimizer for robot file
  fetching that regulates robot/tape access according to access pattern, on
  top of ENSTORE (robot, tape drives and movers).]
                                     SAM in use

   simple command line interface (+ some GUIs and
      web browsers) e.g.
                 sam define project --defname=myproject -- …
                 sam store --filename=xxxx --descrip=metadata-file ….
   transparently integrated into D0 framework and
       d0om file name expanders
                 one consumer can have many processes all helping
                  'consume' delivered files -- supports Farm production
                  processing without additional bookkeeping
   distributed disk caches and various 'physics group
       driven' caching policies
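
  The "many processes helping one consumer" idea can be sketched as below: a
  shared queue of delivered files feeds several worker processes, so each file
  is handled exactly once without extra bookkeeping. This is a toy illustration
  of the pattern only, not the SAM consumer interface.

      # Toy sketch: worker processes share one queue of delivered files.
      from multiprocessing import Process, Queue

      def worker(wid, files):
          while True:
              name = files.get()
              if name is None:              # sentinel: no more files
                  break
              print(f"worker {wid} processing {name}")

      if __name__ == "__main__":
          delivered = Queue()
          for i in range(8):
              delivered.put(f"file_{i:04d}.raw")      # hypothetical file names
          workers = [Process(target=worker, args=(w, delivered)) for w in range(3)]
          for p in workers:
              p.start()
          for _ in workers:
              delivered.put(None)
          for p in workers:
              p.join()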




              1000+ Monte Carlo Files stored using
                   SAM - reading them back




               CDF Data Access - Stagers and Disk Inventory Manager

  [Diagram: file caching handled by stagers and the Disk Inventory Manager;
  resource management uses the batch system and a static number of tape drives.]
                   Example File and Event Catalog for Run II

   Oracle 8 database ==> 0.5 - 1 TB for D0, including
      detector run conditions and calibration data
                 1.8 x 10^9 event metadata entries, bit indexes, own data types
                 several million file entries
   Oracle Network sitewide licence - now on Linux too
   SAM system using a CORBA interface between
      components, including to database servers
   CDF user processes consult the database directly
   Data Files catalogued and related to
                 Runs and Run conditions
                 Luminosity information about the accelerator
                 The processes which produced (and consumed) the data
                 Detector geometry, alignment and calibration data
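
  The relations listed above can be pictured with a minimal schema sketch; the
  table and column names are invented for illustration (using SQLite as a
  stand-in), whereas the production catalog is an Oracle 8 design.

      # Minimal illustration of the file/event catalog relations; names invented.
      import sqlite3

      db = sqlite3.connect(":memory:")
      db.executescript("""
      CREATE TABLE runs      (run_id INTEGER PRIMARY KEY, conditions TEXT, luminosity REAL);
      CREATE TABLE processes (proc_id INTEGER PRIMARY KEY, description TEXT);
      CREATE TABLE files     (file_id INTEGER PRIMARY KEY, name TEXT, data_tier TEXT,
                              run_id INTEGER REFERENCES runs,
                              produced_by INTEGER REFERENCES processes);
      """)
      db.execute("INSERT INTO runs VALUES (105001, 'normal physics', 1.7)")
      db.execute("INSERT INTO processes VALUES (1, 'reconstruction pass 1')")
      db.execute("INSERT INTO files VALUES (1, 'reco_105001_001.dat', 'reconstructed', 105001, 1)")

      # Resolve 'all reconstructed files for run 105001' to a list of file names:
      for (name,) in db.execute(
              "SELECT name FROM files WHERE data_tier='reconstructed' AND run_id=105001"):
          print(name)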

              Persistent Data for all the behaviors of
                  the system and the data itself




                   Scalability, Robustness, Availability

   SAM is now starting serious stress testing for
                 high throughput
                 high availability and good error handling
   Database used to store context for recovery in SAM
   We are learning!

   "Oracle 24X7 - Real World Approaches to Ensuring
      Database Availability"
     -- we need to start to think like this




                      Data From Remote Sites -- IN2P3,
                          Nikhef, Prague, Texas…

  [Diagram: tape import and export paths between remote sites and Fermilab.
  File import carries data plus SAM metadata and Enstore metadata; Enstore and
  SAM keep volume and file information in sync; an access/project request to
  SAM triggers a file request to Enstore and a data transfer, with SAM metadata
  exported for use at the remote site.]
                         Access from Remote Sites

   SAM designed for distributed caching system -
                 File can have multiple locations in the database
                 can use central Fermilab database -- or extracts in local
                  Linux Oracle Server
   CDF expects to have local versions of their DH
      system running at non-Fermilab institutions




              Lessons for future experiments?

   Draw your own conclusions so far
   We will tell you next year!




   (III) Moving onwards - to the future

                SDSS and NSF/KDI proposal
                   Storage Area Networks?
                  Particle Physics Data Grid
              CMS and Worldwide Collaboration
              Next generation Storage Systems?




                     The Sloan Digital Sky Survey

              A project run by the Astrophysical Research Consortium (ARC)


                           The University of Chicago
                           Princeton University
                           The Johns Hopkins University
                           The University of Washington
                           Fermi National Accelerator Laboratory
                           US Naval Observatory
                           The Japanese Participation Group
                           The Institute for Advanced Study
                           Max Planck Inst, Heidelberg
                           SLOAN Foundation, NSF, DOE, NASA



              Goal: To create a detailed multicolor map of the Northern Sky
                        over 5 years, with a budget of approximately $80M
              Data Size: 40 TB raw, 1 TB processed

                          SDSS Data Flow




                               Geometric Indexing

               "Divide and Conquer" partitioning

               Attributes              Number
               Sky Position            3
               Multiband Fluxes        N = 5+
               Other                   M = 100+

               Sky position: Hierarchical Triangular Mesh
               Fluxes: split as a k-d tree, stored as an r-tree of bounding boxes
               Other attributes: using regular indexing techniques
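
  As a toy illustration of 'divide and conquer' indexing of the sky-position
  attributes (not the SDSS Hierarchical Triangular Mesh code), positions can be
  mapped to unit vectors and indexed with a k-d tree for cone searches:

      # Toy sketch of spatial indexing of sky positions with a k-d tree.
      import numpy as np
      from scipy.spatial import cKDTree

      def unit_vectors(ra_deg, dec_deg):
          ra, dec = np.radians(ra_deg), np.radians(dec_deg)
          return np.column_stack([np.cos(dec) * np.cos(ra),
                                  np.cos(dec) * np.sin(ra),
                                  np.sin(dec)])

      rng = np.random.default_rng(0)
      ra = rng.uniform(0.0, 360.0, 100000)                        # fake catalog
      dec = np.degrees(np.arcsin(rng.uniform(-1.0, 1.0, 100000)))
      tree = cKDTree(unit_vectors(ra, dec))

      # Cone search: objects within 0.5 degrees of (ra, dec) = (180, 30),
      # using the chord length corresponding to that angular radius.
      radius = 2.0 * np.sin(np.radians(0.5) / 2.0)
      matches = tree.query_ball_point(unit_vectors([180.0], [30.0])[0], radius)
      print(len(matches), "objects in the cone")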


                              SDSS Data Products

                Object catalog                                 400 GB
                 parameters of >10^8 objects
                Redshift Catalog                                 1 GB
                 parameters of 10^6 objects
                Atlas Images                                   1.5 TB
                 5 color cutouts of >10^8 objects
                Spectra                                         60 GB
                 in a one-dimensional form
                Derived Catalogs                                20 GB
                 - clusters
                 - QSO absorption lines
                4x4 Pixel All-Sky Map                           60 GB
                 heavily compressed

                    All raw data saved in a tape vault at Fermilab
                  SDSS Distributed Collaboration

  [Map: collaborating sites - Fermilab, U. Chicago, U. Washington, Institute
  for Advanced Study, Princeton U., JHU, USNO, NMSU, Apache Point Observatory
  and Japan - connected via ESNET and VBNS.]
                    NSF/KDI -- Analysis Data Grid

Collaboration with the Analysis Data Grid:
         proposal to the NSF KDI program by
         JHU, Fermilab and Caltech (H. Newman, J. Bunn) +
         Objectivity, Intel and Microsoft (Jim Gray)
         Involves computer scientists, astronomers and particle physicists


Accessing Large Distributed Archives in Astronomy and Particle
     Physics
        experiment with scalable server architectures,
        create middleware of intelligent query agents,
        apply to both particle physics and astrophysics data sets
Status:
              3 year proposal just funded



                      http://grid.fnal.gov/ppdg




                            Initial Testbed Applications

     High-Speed Site-to-Site File Replication Service
  [Diagram: a primary site (data acquisition, CPU, disk, tape robot) feeds a
  bulk transfer service of 100 MB/s, 100 TB/year to a (partial) replica site
  with CPU, disk and tape robot.]

     Multi-Site Cached File Access
  [Diagram: a primary site (data acquisition, CPU, disk, tape robot) serves
  several satellite sites (CPU, disk, tape robot), which in turn serve
  university sites (CPU, disk, users).]




               Bulk file transfer testbed -- Focuscopy

  [Diagram: a disk cache at Fermilab, fed by operator-mounted Exabyte tapes,
  transfers files over MREN to the HPSS tape library at Indiana U.; a metadata
  catalog (SAM: file location, statistics, engine status) tracks the transfers.]
              Bulk file transfer testbed -- awaits
              ESNet research network and QOS




                 Distributed Cache - combining SAM and Condor

 "Matchmaking: Distributed Resource Management for High Throughput Computing",
 Proceedings of the Seventh IEEE International Symposium on High Performance
 Distributed Computing, July 28-31, 1998, Chicago, IL.

   next project --- Objectivity database caching with
      Caltech and ANL?



               Storage Area Networks (SANs) - where are we?

Heterogeneous cluster of machines - not locked
     to one vendor, competitive bids for
     computing.
Requires high bandwidth access to shared disk
     storage to work effectively - NFS and AFS
     not sufficiently high performance.
Use Fiber Channel as the physical layer and run
     SCSI over it
Unfortunately read/write to Fiber Channel disks in
     a heterogeneous environment is not
     currently available at an affordable cost




                                                Proposal from Quantum Research -- unfunded

 CHEP2000                                                                             70
                    Data Handling at Fermilab and Plans for Worldwide Analysis
 Vicky White
              1st phase of research was quite
                        successful




CHEP2000                                                                   71
              Data Handling at Fermilab and Plans for Worldwide Analysis
Vicky White
                    LHC and other future experiments:
                     Data Access Architectures
   Scale and complexity
                 number of detector channels
                 number of participants and their geographic dispersion
                 complexity of collisions
   Network bandwidth hopes
                 a distributed store of data, rather than data replication
   Hierarchical Storage Systems evolution
                 HPSS collaboration - Fermilab continues its involvement
                 CERN/DESY/Fermilab/Eurostore?
   Disk availability/price
                 all data on disk? => random access to sub-parts of events,
                  with less attention to clustering of data on the physical medium
   Object-oriented database technology
                 find the right places for it
CHEP2000                                                                           72
                      Data Handling at Fermilab and Plans for Worldwide Analysis
Vicky White
CHEP2000                                                                   73
              Data Handling at Fermilab and Plans for Worldwide Analysis
Vicky White
                           Monarc Analysis Model Baseline:
                           ATLAS or CMS “Typical” Tier1 RC



               CPU Power                    ~100 KSI95
               Disk space                   ~100 TB
               Tape capacity                300 TB, 100 MB/sec
               Link speed to Tier2          10 MB/sec (1/2 of 155 Mbps)
               Raw data                     1%       10-15 TB/year
               ESD data                     100%     100-150 TB/year
               Selected ESD                 25%      5 TB/year        [*]
               Revised ESD                  25%      10 TB/year       [*]
               AOD data                     100%     2 TB/year        [**]
               Revised AOD                  100%     4 TB/year        [**]
               TAG/DPD                      100%     200 GB/year
               Simulated data (repository)  25%      25 TB/year

               [*] Covering five analysis groups, each selecting ~1%

CHEP2000                                                                          74
                     Data Handling at Fermilab and Plans for Worldwide Analysis
Vicky White
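
As a rough sanity check on the table above (ignoring protocol overhead, contention, and downtime), a sustained 10 MB/sec link to a Tier2 center moves about 315 TB in a year, comfortably above the 100-150 TB/year of ESD plus the smaller AOD, TAG/DPD, and revised-sample streams:

    # Rough throughput check for the 10 MB/sec Tier1-to-Tier2 link in the table above.
    SECONDS_PER_YEAR = 365 * 24 * 3600          # ~3.15e7 seconds
    link_mb_per_sec = 10                        # from the MONARC baseline
    tb_per_year = link_mb_per_sec * SECONDS_PER_YEAR / 1e6   # MB -> TB (decimal)
    print(f"~{tb_per_year:.0f} TB/year at a sustained 10 MB/sec")   # ~315 TB/year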
                 MONARC Testbed Systems




CHEP2000                                                                   75
              Data Handling at Fermilab and Plans for Worldwide Analysis
Vicky White
                              Regional Center Architecture
                                 Example by I. Gaines


[Figure: Regional Center Architecture -- data arrives over the network from CERN and from Tier2 and simulation centers (and on tape) into tape mass storage, disk servers, and database servers. Production reconstruction (Raw/Sim -> ESD) and production analysis (ESD -> AOD, AOD -> DPD) run as scheduled, predictable work for the experiment and physics groups; individual analysis (AOD -> DPD and plots) is chaotic and serves physicists at desktops. Output flows to Tier2 centers, local institutes, and CERN, including on tape. Supporting services: physics software development, R&D systems and testbeds, info and code servers, web and telepresence servers, training, consulting, and help desk.]
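
As a minimal sketch of the reduction chain in the figure above (the workload labels come from the figure; everything else is illustrative), each production or analysis stage consumes one data tier and produces the next:

    # Reduction chain at a Tier1 regional center, per the figure (illustrative).
    STAGES = [
        # (input tier, output tier, workload character)
        ("Raw/Sim", "ESD", "scheduled, predictable production reconstruction"),
        ("ESD",     "AOD", "scheduled production analysis"),
        ("AOD",     "DPD", "chaotic individual analysis and plots"),
    ]

    for source, product, character in STAGES:
        print(f"{source:>7} -> {product:<4} [{character}]")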
     CHEP2000                                                                                    76
                      Data Handling at Fermilab and Plans for Worldwide Analysis
     Vicky White
                        UF Equipment Plan for FY00

   R&D and User support, UF Hardware:
                 disk storage: 1 TByte
                     large disk pool for ODBMS testbeds and data analysis
                      (Monte Carlo and test beam data)
                 tape storage: up to 10 TByte
                     provide several TB storage for MC and test beam data
                     setup ODBMS testbed
                     start using Objectivity + mass storage system in analysis
                     provide data import and export facility
                 CPU resources: 30-node Linux cluster
                     upgrade the main server; form a production unit for MC production
                     PC analysis cluster and dedicated special purpose R&D
                      systems
                 Network infrastructure
                     provide sufficient LAN capacity
                     provide WAN connectivity for production and testbed activities

CHEP2000                                                                           77
                      Data Handling at Fermilab and Plans for Worldwide Analysis
Vicky White
                                      Conclusions

   If you have a lot of data to manage and access today,
        you must
                 think carefully about how you store it, how you wish to
                  access it, and how you will control access to it
                 be aware of media costs
                 design a system for robustness and uptime (especially if
                  you use relatively inexpensive tape media)
                 design a system for active and managed access to all
                  hierarchies of storage - disk, tape in robot & tape on shelf
   For the next generation of experiments
                 we hope for better network bandwidth and a truly distributed
                  system
                 we investigate OO databases for their potential to provide
                  random access to sub-parts of event data



CHEP2000                                                                           78
                      Data Handling at Fermilab and Plans for Worldwide Analysis
Vicky White
                                 THE END




CHEP2000                                                                   79
              Data Handling at Fermilab and Plans for Worldwide Analysis
Vicky White

				