Physics Services Support
Relational Databases for the LHC Computing Grid
The LCG Distributed Database Deployment (3D) and Conditions Database (COOL) Projects
Andrea Valassi (CERN IT-PSS-DP)
NEC2007, Varna, Bulgaria 14th September 2007
CERN - IT Department CH-1211 Genève 23 Switzerland
www.cern.ch/it
Acknowledgements
• Several people have “lent” me their slides or contributed useful suggestions for this talk
– Dirk Duellmann and the 3D team
– Maria Girone and the CERN Physics DB team
– The COOL and CORAL teams
– Several users in the experiments
• Many thanks to all of them!
Andrea Valassi – NEC 2007, Varna
Outline
• Relational databases for LHC computing
– Reliable services at CERN and other LCG sites
– The 3D project: distributed database deployment
• COOL and other conditions data
– COOL development and deployment status
• Conclusions
Relational databases for LHC
• In LHC computing, relational databases will be crucial to store metadata of both physics applications and grid services
– Detector conditions (calibration, geometry…)
– Experiment data production bookkeeping
– Core grid services for cataloguing, monitoring and distributing LHC data (e.g. LFC file catalog)
• Key features of relational db services
– High availability, backup and recovery, performance and scalability, security…
The 3D Project
• Distributed Database Deployment
– The LCG initially provided tools for distributed access and replication of file-based data
– The aim of the 3D project is to provide a similar infrastructure for data stored in RDBMS services
• CERN and several other LCG sites already have long experience in running RDBMS services
• Goals of the 3D project as part of LCG
– Increase database availability and scalability
– Allow applications to access databases in a consistent and location-independent way
– Provide database replication between sites
– Coordinate the setup and deployment of the database and replication infrastructure
3D Service Architecture
Building block – db cluster
• CERN db services use Oracle 10g RAC
– High availability – redundant storage and network
– Scalability – for CPUs and storage independently
– Cost reduction – commodity hardware on Linux
• Homogeneous h/w and s/w setup for all physics DBs
– Similar setup is used by most T1 sites as well
Physics DB services at CERN
• Size of Oracle services for physics
– 110 mid-range servers, 110 disk arrays
• i.e. 220 CPUs, 440 GB RAM, 300 TB disk space
• Several production clusters
– One offline RAC per LHC experiment (up to 8 nodes), Atlas online RAC, COMPASS RAC
• Development and validation services too
– Application release cycle: from the development service through the validation service to the production service
Frontier and CMS
• Read-only access to Oracle data via http
– Oracle server at T0
– Tomcat server at T0
– Squid web cache at T0/T1/T2
• Frontier used in CMS
– Under evaluation in Atlas (integrated in Coral/Cool)
– Successfully tested in CMS CSA'06, many improvements in 2007
– CMS are confident that they have ways to avoid stale-cache issues
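Why stale caches matter for read-only HTTP access through a web cache can be illustrated with a minimal sketch of a read-through cache with a time-to-live. This is not Frontier or Squid code: the class `TtlCache`, the `fetch` callback and the 60-second lifetime are hypothetical names and values chosen only for illustration.

```python
import time


class TtlCache:
    """Toy read-through cache: entries expire after `ttl` seconds (hypothetical)."""

    def __init__(self, fetch, ttl, clock=time.monotonic):
        self._fetch = fetch    # function that queries the origin (e.g. the database)
        self._ttl = ttl        # lifetime of a cached answer, in seconds
        self._clock = clock    # injectable clock, so the behaviour can be tested
        self._entries = {}     # query -> (expiry time, cached result)

    def get(self, query):
        now = self._clock()
        hit = self._entries.get(query)
        if hit is not None and now < hit[0]:
            # Served from the cache: the origin may have changed in the meantime.
            return hit[1]
        result = self._fetch(query)
        self._entries[query] = (now + self._ttl, result)
        return result


# Simulated origin and clock (both are stand-ins, not real services).
origin = {"calibration": "v1"}
now = [0.0]
cache = TtlCache(lambda q: origin[q], ttl=60, clock=lambda: now[0])
```

In this toy model, a value updated at the origin stays invisible to clients until the cached entry expires, which is the kind of stale-cache situation that has to be avoided.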
Replication – Oracle Streams
(Capture, Propagation, Apply)
Barbara Martelli, INFN T1/T2 Workshop, Nov. 2006
Replication – T0 to T1
• CERN data are replicated to ten T1 sites
– Streams used by Atlas (10 T1) and LHCb (6 T1)
• More details in the slides about COOL deployment
– The present setup can sustain 2 GB/day to T1
• This is the Atlas requirement for COOL user data
Streams downstream capture
• This technology provides isolation of the source database against problems with the network or with the destination databases
• In 3D, this shields the CERN T0 services from problems in the replication to T1 sites
– The redo log retention on the downstream database is optimized (e.g. 5 days) to allow for re-synchronisation without recall from tape
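The role of the retention window can be sketched with a toy model: a destination can catch up from the downstream database only if every change it is missing is still retained there. This is a conceptual illustration, not Oracle Streams code. The class `DownstreamLog`, its methods and the day-based bookkeeping are assumptions made for the example, and changes are assumed to be captured in chronological order.

```python
from collections import deque


class DownstreamLog:
    """Toy model of a downstream log that keeps changes for a few days."""

    def __init__(self, retention_days):
        self.retention_days = retention_days
        self._changes = deque()        # (day, change), oldest first
        self._last_purged_day = None   # day of the newest change already purged

    def capture(self, day, change):
        """Record a change and purge everything older than the retention window."""
        self._changes.append((day, change))
        while self._changes and self._changes[0][0] < day - self.retention_days:
            self._last_purged_day = self._changes.popleft()[0]

    def changes_after(self, last_applied_day):
        """Return the changes a destination still needs, or None if some of
        them were already purged (in which case older logs must be restored)."""
        if self._last_purged_day is not None and last_applied_day < self._last_purged_day:
            return None
        return [change for day, change in self._changes if day > last_applied_day]


log = DownstreamLog(retention_days=5)
for day in range(1, 11):
    log.capture(day, f"change-{day}")
```

With a 5-day window and changes captured on days 1 to 10, a destination that last applied day 6 can resynchronise from the retained log, while one that stopped at day 2 cannot.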
Replication – online to offline
• Streams used by Atlas, LHCb and CMS
– For LHCb offline to online too (see COOL slides)
• Work in progress with Atlas to test replication of the full PVSS archive
– Allow detector expert analysis without impacting the performance of the online production server
– Data rates (6 GB/day) much higher than COOL
• Tests over the last two months are promising
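To put the daily volumes in perspective, they can be converted into average sustained rates. The short calculation below assumes decimal units (1 GB = 1e9 bytes) and a uniform load over 24 hours; real traffic is likely to be burstier, so these are averages only.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate_kb_per_s(gb_per_day):
    """Average throughput in kB/s for a daily volume given in GB.

    Assumes decimal units (1 GB = 1e9 bytes, 1 kB = 1e3 bytes).
    """
    return gb_per_day * 1e9 / SECONDS_PER_DAY / 1e3


pvss_rate = average_rate_kb_per_s(6)   # PVSS archive replication, 6 GB/day
cool_rate = average_rate_kb_per_s(2)   # COOL T0-to-T1 figure, 2 GB/day
print(f"PVSS: {pvss_rate:.1f} kB/s, COOL: {cool_rate:.1f} kB/s")
```

Under these assumptions the PVSS archive corresponds to about 69 kB/s on average, three times the roughly 23 kB/s of the COOL T0-to-T1 stream.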
3D service operation
• DB service level according to WLCG MoU
– At T0: piquet service being set up to replace current 24x7 best-effort operation
• Streams interventions 8x5 for now
– At T1: need more experience to confirm coverage
• Some policies proposed by CERN T0 have been accepted also by the T1 sites
– Backup and recovery (Oracle RMAN)
– Security patch application (frequency, procedure)
– Database and Streams monitoring, usage reports
• Integration with WLCG procedures
– GGUS tickets, intervention announcement
What are conditions data?
• Non-event detector data that vary with time
– And may also exist in different versions
• Data produced both online and offline
– Geometry, detector control, alignment, calibration...
• Data used for event processing and more
– Detector experts
– Alignment and calibration
– Event reconstruction and analysis
CondDB in the 4 experiments
• ALICE
– Alice-specific software for time/version handling
– ROOT files with AliEn file catalog
• ALICE-managed deployment (AliEn MySQL at T0)
• CMS
– CMS-specific software for time/version handling
– Oracle (via POOL-ORA) with Frontier web cache
• 3D/CMS deployment: Oracle/Frontier (T0), Squid (T1/T2)
• ATLAS and LHCb
– COOL common software for time/version handling
• Common development of Atlas, LHCb and CERN IT
– Oracle, MySQL, SQLite, Frontier (via COOL API)
• 3D/Atlas/LHCb deployment: Oracle (T0/T1) with Streams
COOL software overview
• Consistent approach to many use cases
– Single-version (DCS) and multi-version (calib/align)
• Technology-neutral C++ API
– API is not relational - no direct SQL user access
– Same user code can be used on all backends
• Maximize reuse of other LCG AA software
– CORAL and SEAL for C++ implementation
– ROOT/Reflex for python bindings (PyCool)
• Single relational implementation via Coral
– Same code for Oracle, MySQL, SQLite, Frontier
– Same relational schema for all backends
• Emphasis on read and write performance
– Best practices (bulk operations, bind variables)
– Detailed performance studies and optimizations
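The two best practices named in the COOL software overview, bulk operations and bind variables, can be illustrated with Python's standard `sqlite3` module. This is only a sketch of the general technique: COOL itself is implemented in C++ on top of CORAL, and the `readings` table and the helper functions here are hypothetical.

```python
import sqlite3


def bulk_insert(conn, rows):
    """Insert many (channel, value) rows in one call.

    The SQL text is fixed; values are passed separately as bind variables.
    """
    conn.executemany("INSERT INTO readings (channel, value) VALUES (?, ?)", rows)
    conn.commit()


def read_channel(conn, channel):
    """Read back all values of one channel, again with a bind variable."""
    cursor = conn.execute(
        "SELECT value FROM readings WHERE channel = ? ORDER BY rowid", (channel,)
    )
    return [value for (value,) in cursor]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (channel INTEGER, value REAL)")
bulk_insert(conn, [(1, 0.5), (1, 0.7), (2, 1.2)])
```

Keeping the statement text constant lets the database reuse it across calls, and sending rows in batches avoids one round trip per row.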
COOL relational implementation
• Modeling of condition data “objects”
– System-managed common “metadata”
• Data items: many tables, each with many “channels”
• Interval of validity - IOV: since, until
• Versioning information with handling of interval overlaps
– User-defined schema for “data payload”
• Support for simple C++ types
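The interval-of-validity idea can be sketched as a small relational table queried by channel and time. This is not the actual COOL schema or API: the table `iovs`, its column names and the helper functions are hypothetical, intervals are assumed to be half-open (`since` inclusive, `until` exclusive), and versioning with overlap handling is left out.

```python
import sqlite3


def create_folder(conn):
    """Create a toy single-version table: metadata columns plus one payload column."""
    conn.execute(
        "CREATE TABLE iovs ("
        "channel INTEGER, iov_since INTEGER, iov_until INTEGER, payload REAL)"
    )


def store(conn, channel, since, until, payload):
    """Store a payload valid for since <= time < until on the given channel."""
    conn.execute(
        "INSERT INTO iovs VALUES (?, ?, ?, ?)", (channel, since, until, payload)
    )


def find(conn, channel, time):
    """Return the payload valid at `time` for `channel`, or None if there is none."""
    row = conn.execute(
        "SELECT payload FROM iovs "
        "WHERE channel = ? AND iov_since <= ? AND ? < iov_until",
        (channel, time, time),
    ).fetchone()
    return None if row is None else row[0]


conn = sqlite3.connect(":memory:")
create_folder(conn)
store(conn, channel=1, since=0, until=100, payload=1.5)
store(conn, channel=1, since=100, until=200, payload=2.5)
```

Here `find(conn, 1, 50)` returns the first payload and `find(conn, 1, 100)` the second, because the boundary time belongs to the later interval.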
Development summary
• Milestones
– COOL 1.0 released in April 2005
• Basic functionality (development started in Nov. 2004)
– COOL 2.0 released in January 2007
• Major backward-incompatible API and schema changes
• Current focus is performance optimization
– Separate optimizations for different use cases
• Several performance issues solved in 2007 • Feedback from and for Atlas/LHCb stress tests
– Work in progress also on support for new platforms and a few functional enhancements
COOL data distribution
• Replication at the database backend level
– Oracle Streams (see next slides)
– Cross-technology replication is possible (same schema for all backends), not really attempted yet
• Oracle remote access via Frontier
– Under evaluation in Atlas
• Replication tools based on the COOL API
– Static (copy once) or dynamic (copy then update)
• Data slicing/selection is also possible
– Cross-technology replication is possible
• Many use cases for SQLite files in Atlas and LHCb
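The difference between static (copy once) and dynamic (copy then update) replication, with optional data slicing, can be sketched with two SQLite databases. This does not use the COOL API or its replication tools: the `conditions` table, the `copy_new_rows` helper and the assumption of an append-only source with increasing ids are all made up for the illustration.

```python
import sqlite3

SCHEMA = (
    "CREATE TABLE IF NOT EXISTS conditions "
    "(id INTEGER PRIMARY KEY, channel INTEGER, payload REAL)"
)


def copy_new_rows(source, target, channels=None):
    """Copy rows that the target does not have yet; return how many were copied.

    Calling this once is a static copy; calling it again later is a dynamic update.
    `channels` optionally restricts the copy to a slice of the data.
    """
    target.execute(SCHEMA)
    last_id = target.execute(
        "SELECT COALESCE(MAX(id), 0) FROM conditions"
    ).fetchone()[0]
    rows = source.execute(
        "SELECT id, channel, payload FROM conditions WHERE id > ? ORDER BY id",
        (last_id,),
    ).fetchall()
    if channels is not None:
        rows = [row for row in rows if row[1] in channels]
    target.executemany("INSERT INTO conditions VALUES (?, ?, ?)", rows)
    target.commit()
    return len(rows)


source = sqlite3.connect(":memory:")
source.execute(SCHEMA)
source.executemany(
    "INSERT INTO conditions (channel, payload) VALUES (?, ?)",
    [(1, 0.1), (2, 0.2), (1, 0.3)],
)
replica = sqlite3.connect(":memory:")
first_copy = copy_new_rows(source, replica)      # static copy of all rows
source.execute("INSERT INTO conditions (channel, payload) VALUES (2, 0.4)")
update = copy_new_rows(source, replica)          # dynamic update: only the new row
sliced = sqlite3.connect(":memory:")
slice_copy = copy_new_rows(source, sliced, channels={1})  # slice: channel 1 only
```

Because the copy goes through plain SQL on an identical table definition, the same logic would apply between different database technologies, as long as both ends share the schema.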
Deployment in LHCb
• Computing model
– Reconstruction at T0/T1
– Only MC prod at T2
• COOL stores only conditions data for event reconstruction
– Oracle at PIT, T0, T1 with replication via Streams
– Geometry and conditions for MC sent to T2 as SQLite file
• Online db master at PIT
– Data from PVSS processes
– Replicated forward to T0 and T1 via Streams
• Offline db master at T0
– Data computed in offline calibration/alignment jobs
– Replicated back to PIT and forward to T1 via Streams
(Marco Clemencic, COOL meeting 3 July 2006)
Deployment in Atlas
• Largest COOL data set comes from DCS
– For offline reconstruction and detector experts
• From the online RAC in the T0 computer centre
– Via the PVSS2COOL data transfer (1.5 GB/day)
• Many options open for T2 replication
– Many use cases (simulation, calibration, analysis)
– Static/dynamic replication to sqlite/mysql, Frontier
(Florbela Viegas, CHEP 2007)
COOL deployment status
• The T0 setup is (almost) complete
– The LHCb online server is being set up these days
• Atlas and LHCb T1 sites are all connected
– SARA, RAL, PIC, IN2P3, Gridka, CNAF (both)
– Plus Nordugrid, Triumf, BNL, Taiwan (Atlas only)
– Distributed tests underway in both experiments
Much larger data rates in ATLAS!
Atlas scalability tests (1)
Atlas scalability tests (2)
Conclusions
• The 3D project has set up a world-wide distributed database infrastructure for LHC
– This is one of the largest distributed deployments of the Oracle database worldwide (over 100 nodes at CERN and a few nodes at each of ten T1 sites)
– T0/T1 are ready for ramp-up to LHC production
• The COOL software is used by both Atlas and LHCb to store their conditions data
– COOL deployment is one of the largest users of 3D
– First results from Atlas scalability tests confirm that the allocated resources should match the required number of jobs per hour
For more information
• Physics database services at CERN
– http://cern.ch/phydb
• The 3D project
– https://twiki.cern.ch/twiki/bin/view/PSSGroup/LCG3DWiki
• The COOL project
– http://cern.ch/cool
• The CORAL project
– http://pool.cern.ch/coral