					Central Database Support



    Summary of Activities of
    IT/DB Group for LHC
    Computing Review


    http://wwwinfo.cern.ch/db/
Overview

Overview of IT/DB Group
ORACLE activities
Objectivity/DB activities
Espresso
Conclusions Concerning Alternatives
Summary
IT/DB Group
Formed as part of IT restructuring Jan 2K
       • 2 sections: ORACLE (dbr); Objectivity/DB (odb)

Manpower issues:
   urgent staff replacements
       • retirements in pipeline in DBR
       • resignation (already) in ODB
   ensure adequate staffing in 2005+
       • e.g. many staff on short-term contracts

Streamlining working methods
Primary focus is production
Working Methods
Ensure that all activities have both a
 responsible + backup(s)
Identify areas where common solutions can
 be applied
Use appropriate tools to improve response;
 maximize knowledge sharing
  Problem Tracking, Newsgroups, FAQs, Web, …
Leverage many years of experience with
 production ORACLE services
ORACLE Activities
“Mission Critical” activities include:
     Support for EDMS project
     Support & Running of Accelerator Services
     Support & Operations of Central DB Servers
Major new activities include:
  ORACLE for “physics” applications
     e.g. ALEPH / LHCb book-keeping; LHC detectors
  LEP decommissioning; LHC construction
  Windows 2000; Forms to Web + Java;
Everybody at CERN is an ORACLE user!
ORACLE Services
Engineering Data Management System
  manage engineering data for LHC project,
   machine and experiments (+SL, ST, PS, …)
Accelerator Services
  LEP logging, LEP+SPS measurements, …
     Anticipate similar usage for LHC
Central Services
  Network DB (LANDB), and the CERN network...


Physics-related Activities
ORACLE Summary

ORACLE production fundamental to CERN
Increased usage in physics experiments
Should exploit “career building” value of
 ORACLE experience
  attract short term staff & visitors
But must retain core expertise to run
 these critical services
Objectivity Activities
Objy Production Services
Used in production by numerous groups
  CHORUS, COMPASS, CMS, NA45, ALEPH …

Major CMS production in progress
  Expect a few TB of data over several weeks

ATLAS plan to significantly increase use
 of Objy in 2000

COMPASS production - starts May 2000
Objectivity Successes

Standards “influenced” ODBMS
 successfully deployed for numerous HEP
 experiments at several sites
Major milestones, including 170MB/s data
 rate (CMS), 35+TB total data (BaBar) met
Important enhancement requests (MSS
 interface, security hooks, other AMS
 extensions) delivered and in production
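
To put the 170MB/s milestone in perspective, the following is a rough back-of-envelope conversion into daily volume. It assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes) and a rate sustained for a full 24 hours; neither assumption is stated on the slide, and the function name is purely illustrative.

```python
# Back-of-envelope conversion of a sustained data rate into a daily volume.
# Assumptions (not from the slides): decimal units, rate held for 24 hours.
MB = 10**6                      # bytes per megabyte (decimal)
TB = 10**12                     # bytes per terabyte (decimal)
SECONDS_PER_DAY = 24 * 60 * 60


def tb_per_day(rate_mb_per_s: float) -> float:
    """Volume in TB written in one day at a sustained rate given in MB/s."""
    return rate_mb_per_s * MB * SECONDS_PER_DAY / TB


if __name__ == "__main__":
    # 170 MB/s is the CMS data-rate milestone quoted above.
    print(f"{tb_per_day(170):.1f} TB/day")
```

Under these assumptions, 170MB/s corresponds to roughly 14.7 TB per day.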
Objectivity Problems
Concerns over market size / company
 stability
  Plans to go public this year; sink or swim?

 Pending enhancement requests (VLDB)
  Planned for end-2000 release

Support issues need to be addressed
  Better response to problems; improved
   information flow; support for RCs etc.
Objectivity Issues
Recent visit to Objy to pursue major issues
     Classified as eXpress, Short, Medium and Long
      term
        • X - build for Red Hat 6.1; AMS instabilities
        • S - DBID API
        • M - VLDB support
        • L - “Multi-FD” issues
     Status
        • X - V5.2.1 for RH6.1 now available; AMS being worked on
        • S - DBID in V6 (~summer 2000?)
        • M - draft spec ~May; final spec ~August
        • L - enhancements in V6, V6++; revisit later

Cautiously confident that technical issues
Objectivity Summary
Usable and used in production
  Still need some enhancements to meet LHC
   baseline requirements
Baseline assumption for ATLAS and CMS
If market takes off (successful IPO), then
 growth of company, local support, etc. will
 follow
  But will we be able to influence the product?
A fallback strategy is mandatory
Risk Analysis: Issues

Choice of Technology
  ODBMS, ORDBMS, RDBMS,
     “light-weight” Persistency, files + meta-data,
   ...
Choice of Vendor (historically)
  #1 Objectivity, #2 Versant
Size of market
  Did not take off as anticipated; unlikely to grow
   significantly in short-medium term
Persistency: Conclusions
 Objectivity/DB is viable technically
   No viable alternative commercial ODBMS
Other possibilities include:
   “Open Source” (?) ODBMS solution
   ORDBMS-based solution (also for event
   data)
   Meta-data + files
RD45 investigating  &  directly
 based on experience at FNAL / BNL ...
RDBMS Investigations
ORDBMS Questions

To what extent can ORDBMSs scale?
What would be the impact on
  DBA; developer; user
Oracle being used to store meta-data
Project in CMS to study extended RDBMS
 for event data (Informix)
Possible studies also with Oracle
Espresso

Espresso is a proof-of-concept prototype
 built to answer questions from Risk Analysis
  Could we build an alternative to Objectivity/DB?
  How much manpower would be required?
  Can we overcome limitations of Objy’s current
   architecture?
     Support for VLDBs, multi-FD work-arounds etc.
  Test / validate important architectural choices
Espresso - Current Status

A working prototype has been produced,
 implementing the ODMG C++ binding
  on which HepODBMS is layered
LHC++ Histograms (HTL), tags, and other
 applications have been successfully ported
  plans to port G4 examples, Iguana, ORCA, ...
Successfully demonstrates feasibility, but
 more work on scalability / performance /
 robustness required
Espresso - Next Steps
Start detailed requirement discussion with
 experiments and other interested institutes
Continue Scalability & Performance Test
     Storage Manager:           larger files (>100GB)
     Page Server:               connections > 500
     Lock Server:               number of locks > 20k
     C++ Binding & Schema Manager: port Geant4 persistency
      examples and Conditions-DB
By summer this year
     Written Architectural Overview of the Prototype
     Development Plan with detailed manpower estimates
     Single user evaluation system
Espresso - Summary

Initial prototype suggests that it is
 technically feasible

Discussions with other sites suggest that
 interest goes well beyond HEP

Manpower estimates / possible resources
 indicate “project” would have to start “soon”
Persistency - Summary

“ODBMS-like” solution is still preferred
Functional & support requirements should
 be available by ~October 2000
Investigations of other possibilities will
 proceed in parallel
Information on all approaches should be
 available in time for “2001 decision”
IT/DB Summary

Production Database Services are the
 “raison d’être” of the IT/DB group



Production services based on ORACLE and
 Objectivity/DB must & will continue


         http://wwwinfo.cern.ch/db/
End of Presentation




    Background slides follow...
RD45 Activities
Proposed activities presented at LCB
 Review (November 1999) and CHEP 2K
Basically consist of:
   Production activities
    IT/DB Group
   Preparation for “2001 choice”
    Requirements WGs, Risk Analysis, Customer /
     HEP visits etc.
Some slides from LCB / CHEP follow…
  Guiding Principles                                  CMS

 “In particular, the data should be presented in as consistent a
  way as possible. The data themselves may be stored in a variety
  of formats but this should be hidden from the user…”
 “The ODMG ... binding is based on one fundamental principle:
  the programmer should perceive the binding as a single
  language for expressing both database and programming
  operations, not two separate languages with arbitrary
  boundaries between them.”
 Capability of scaling to LHC
       data volumes & rates
 Capable of satisfying wide
       variety of HEP needs
    DAQ, SIM, REC, Analysis, ...
 Use of “standard”, widely-used
       solutions if applicable
  Database Production Service - What is missing?
   Transparent non-blocking interface with MSS
   User capability to:                     Objy V5.2
    export, extract, replicate data and schema
    manipulate data and schema outside production
     database and while accessing data and schema from
     production database
   Fully functional, reliable high-quality database
     system including
        VLDB support (>>1PB)                           Objy V6?
        management tools
   (side label on slide: BaBar)
From L. Silvestris: Review of application software services for the LHC era, FOCUS 07/10/99
O(R)DBMS Evolution
From CMS Computing Technical Proposal:

 “If the ODBMS industry flourishes it is very likely that by
  2005 CMS will be able to obtain products, embodying
  thousands of man-years of work, that are well matched to
  its worldwide data management and access needs. The cost
  of such products to CMS will be equivalent to at most a few
  man-years. We believe that the ODBMS industry and the
  corresponding market are likely to flourish. However, if this
  is not the case, a decision will have to be made in
  approximately the year 2000 to devote some tens of man-
  years of effort to the development of a less satisfactory
  data management system for the LHC experiments.”
ODBMS / RDBMS / ORDBMS

Complex Data
Performance, scalability
Tight Language Binding
   OQL - SQL3 query subset
Growth similar to RDBMS in ’80s
~$1B market by 2001 ~$100M?

RDBMS + “object extensions”
   Can store ADTs
   “Methods” on server
Complex Data with Queries
$8B in 1996
Likely to become dominant DBMS technology
Risk Analysis: Issues

Choice of Technology
  ODBMS, ORDBMS, RDBMS, light-weight
   POM, files + meta-data etc.
Choice of Vendor
  #1 Objectivity, #2 Versant
The Home-Grown approach
  Estimate resources required
  Implies proof-of-concept prototype
Risk Analysis:
Summary of Options
  Evaluate C++ binding to e.g. ORACLE
  Add ESCROW clause to Objectivity contract
  Pursue possibility of source license
  Visit key Objectivity customers
  Produce new requirements list
  Estimate manpower to support Objy in house
  Estimate manpower for “clean-sheet” solution
  Continue to monitor alternatives
The LCB agrees with the other suggested steps to mitigate risk,
with the addition of trying to ensure that user code in reconstruction
and analysis programs is kept as standards compliant as possible.
Risk Analysis: Conclusions

A solution is certainly possible!
 How much should we align ourselves with
  industry trends / standards?
ODBMS unlikely to dominate DBMS market
  Likely to survive foreseeable future - market!
Need to complete current prototype to
 make meaningful manpower estimates
 Future Activities
Production Services
  Considered essential by several experiments
  Tools, documentation, regular releases, …
    general production level support
  Push for VLDB and other enhancements
“2001” milestone
  Revise requirements
  Visit other HEP labs (BNL, FNAL, SLAC, …)
  Provide ODBMS-independent s/w layer
  Estimate man-power for alternative POM
  Evaluate ORDBMS technology
Summary (+)
We have a good understanding of ODBMS
 technology & Objectivity/DB in particular
System has been demonstrated to work in
 production up to level of today’s (BaBar)
 experiments
Many enhancements have been delivered,
 others in pipeline
Production experience will be invaluable
 for LHC (product enhancements, tools, etc.)
Summary (-)

The ODBMS market has not taken off as
 was previously predicted
We need to assure ourselves that there is
 sufficient non-HEP demand (and $$$)
We need to (in any case) understand how
 an eventual migration could be handled
We need to develop at least one realistic
 fallback scenario
Conclusions

R&D phase of RD45 has now led to
 production ODBMS services
Risks of current strategy well understood
 - risk management must continue
We are well placed to prepare for
     “2001 milestone”
Future focus:
  Production
  Road-map to 2001 and beyond
RD45 - Future Activities
Revise requirements
    establish WGs, together with experiments
Visit other HEP labs (BNL, FNAL, SLAC, …)
    Recent SLAC visit; BNL ~Sep 2K; FNAL 2001?
Provide ODBMS-independent s/w layer
    Extension of existing HepODBMS
Estimate man-power for alternative POM
    Preliminary estimates available: ~15MY
Evaluate ORDBMS technology
    ORACLE meeting Oct 2K; work in CMS with Informix
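
The idea of an ODBMS-independent software layer can be illustrated with a minimal sketch. This is not the HepODBMS interface or the ODMG binding: the names `ObjectStore`, `InMemoryStore` and `save_event` are hypothetical, and the in-memory backend only stands in for a real ODBMS, ORDBMS or files + meta-data implementation.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class ObjectStore(ABC):
    """Hypothetical backend-neutral interface (not the HepODBMS API)."""

    @abstractmethod
    def store(self, obj: Any) -> int:
        """Persist an object and return an opaque identifier."""

    @abstractmethod
    def load(self, oid: int) -> Any:
        """Return the object previously stored under oid."""


class InMemoryStore(ObjectStore):
    """Stand-in backend; a real one would wrap a database product."""

    def __init__(self) -> None:
        self._objects: Dict[int, Any] = {}
        self._next_oid = 1

    def store(self, obj: Any) -> int:
        oid = self._next_oid
        self._objects[oid] = obj
        self._next_oid += 1
        return oid

    def load(self, oid: int) -> Any:
        return self._objects[oid]


def save_event(store: ObjectStore, event: dict) -> int:
    """User code depends only on the interface, not on the backend."""
    return store.store(event)


if __name__ == "__main__":
    backend = InMemoryStore()
    oid = save_event(backend, {"run": 1, "event": 42})
    print(backend.load(oid))
```

Because `save_event` sees only the abstract interface, replacing the backend would not require changes to user code, which is the property the layer is meant to provide.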
 Requirements WGs

Functional
   e.g. scalability to LHC data volumes & rates
   platform / language heterogeneity
   transactional safety and crash recovery
   navigational access at ~disk / network speed
Support / Release
   e.g. notification of new & withdrawn features;
   support for new “platforms” within X months;
   advance notice of release schedule
   automatic acknowledgement of PRs, change of state, etc.



Examples of possible functional / support requirements
 RD45 Summary

Experiments have requested continuation of:
  Meetings; Workshops; White-papers;
     Workshop prior to CHEP 2K; next July 4-5; Oct-Nov?

In addition, proposed R&D items are:
  Support for the choice of database system
  Manpower estimate for an Alternative Persistent Object Manager
  A database independent software layer based largely on the ODMG
   interface standard
  The analysis and revision of LHC database requirements
  The potential use of mainstream ORDBMS products, such as
   ORACLE 8i

				