ATLAS software workshop at Brookhaven National Laboratory

ATLAS software workshop at BNL, 23-28 May 2004

Summary of notes taken during talks and some slides taken from interesting talks.
Ricardo, 29 May 2004
Outline
• Overall notes and plans
• Core software, distribution kit, etc.
• Simulation and detector description
• Reconstruction and event selection
• Analysis tools and Event Data Model
• Combined Test Beam
Overall notes and plans
There were many different sessions and working group meetings; I will try to summarize, focusing on some subjects more than others.
Also, most of this talk is a transcription of my notes, so not all of it makes sense or is very explicit.
•   Software plenary sessions I & II
•   Event selection, reconstruction and analysis tools
•   Software infrastructure
•   Simulation and detector description
•   Data challenge
•   Framework working group
•   Grid and distributed analysis
•   Event Data Model (EDM) working group
•   SUSY working group
•   Reconstruction working group
•   Physics coordination
•   Physics Event Selection Algorithms (PESA) working group
•   Grid working group
•   Physics validation
•   Detector description working group
•   Distributed analysis working group
•   Database and calibration/alignment
•   Combined performance
•   International computing board
•   Analysis tools working group
•   Database working group
•   Software distribution and deployment working group
•   Calibration/alignment working group
•   Software plenary session III
• Major production activities soon: DC2, Combined Test
  Beam
• Also: HLT testbed and physics studies leading up to
  Physics Workshop 2005
• Next software week 20-24 September: after DC2 and at
  end of beam test
• Documentation is only in the minds of a few people –
  some plans to document parts of software during the next
  year
• Several buzzwords in this meeting:
    Geant4
    Pileup
    Digitization
    event mixing
    EDM: ESD/AOD (what information should be in AOD? 100kB max)
    Use of Python job options (no more athena.exe in > 8.2.0; use
     jobOptions_XXX.py; see the sketch after this list)
    Grid
    Calibration etc….
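
Since the move to Python job options is stressed several times in these notes, here is a minimal sketch of what an old-style jobOptions_XXX.py of that era could look like; the library, algorithm and property names are invented for illustration.

```python
# jobOptions_MyAnalysis.py -- illustrative only; component and property names are made up
theApp.Dlls   += [ "MyAnalysisAlgs" ]        # load a (hypothetical) component library
theApp.TopAlg += [ "MyAnalysisAlg/myAlg" ]   # schedule an instance of the algorithm

myAlg = Algorithm( "myAlg" )                 # configure its properties
myAlg.OutputLevel  = 3                       # INFO
myAlg.ContainerKey = "AllCalo"               # e.g. the merged calorimeter cell container

theApp.EvtMax = 10                           # process ten events
# run with:  athena.py jobOptions_MyAnalysis.py
```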
       ATLAS Computing Timeline
2003
       • POOL/SEAL release (done)
       • ATLAS release 7 (with POOL persistency) (done)
       • LCG-1 deployment (done)
2004   • ATLAS complete Geant4 validation (done)
NOW    • ATLAS release 8 (done)
       • DC2 Phase 1: simulation production
2005   • DC2 Phase 2: intensive reconstruction (the real challenge!)
       • Combined test beams (barrel wedge)
       • Computing Model paper
       • Computing Memorandum of Understanding
2006
       • ATLAS Computing TDR and LCG TDR
       • DC3: produce data for PRR and test LCG-n
       • Physics Readiness Report
2007   • Start commissioning run
       • GO!
Planning (T.LeCompte)
• For the first time, the delay in the schedule is consistent with zero
• Several project reviews expected soon
• Inner Detector and Core software seem to need lots of
  work and people
Core software, distribution, validation, etc.

Release plans: 8.2.0 – 22 May; 8.3.0 – 9 June; 9.0.0 – 30 June (DC2 production)

Core software (C.Leggett)
• Gaudi:
• Current version is v14r5+ATLAS-specific patches (ATLAS version
  0.14.6.1)
• Changes for v14:
    – Uses SEAL, POOL, PI (?)
    – AIDA histogram Svc replaced with ROOT
    – GaudiPython (see the interactive sketch after this list)
    – Event merging: can now control exactly which events from two files are
      merged
    – Pileup: see Davide's talk Tuesday
    – Interval of Validity (IoV) Svc improved
    – v8.2.0: new version of CLHEP causes wrong version of HepMC to load
      when using athena.exe (not athena.py) -> segfault
• Athena and Gaudi: heading towards rel. 9, Gaudi v15 (try to stay in step
  with LHCb); extend installArea to gaudi
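
As a rough illustration of what the GaudiPython bindings provide, here is a sketch of interactive use; the exact module and method names vary with the Gaudi version, so treat the AppMgr wrapper and its run()/evtsvc() calls as assumptions.

```python
# Interactive use of the Gaudi application manager from Python -- a sketch only.
from GaudiPython import AppMgr   # module name depends on the Gaudi version

theApp = AppMgr()          # wraps the Gaudi ApplicationMgr
theApp.initialize()
theApp.run(5)              # process five events
evt = theApp.evtsvc()      # Python handle to the transient event store
```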
                    Distribution kit
• Development platform vs. deployment platform
• Kit philosophy:
   – Address multiple clients: running binaries/doing development
   – Possibility of downloading partial releases, binaries only
     (Gregory working on this one), source code…
   – Multiple platforms (not yet)
• Until recently only RedHat7.3 + gcc 3.2 supported – in
  future, CERN Enterprise Linux (CEL) 3 + gcc 3.2.3 /
  icc8 (64-bit compiler) / Windows / MacOSX
• Complication in development platforms from external
  software – need to clean up packages and think about
  dependencies
• Runtime environment has some problems – not easy to
  set up for partial distributions
• Pacman/Kit tutorial to be organized soon
             Validation (D.Costanzo)
• 3 activities:
   - reconstruction of DC1 events from old ZEBRA data
   - validation of generators for DC2
   - reconstruction of G4 events for DC2
• Plan for next year: generate more events and prepare for
  physics workshop 2005
• v8.2.0 should be next usable release (moore/muid?),
  exercise reconstruction
• Various details:
   – Some degradation of reconstruction performance was found w.r.t. 7.0.2
   – Atlfast needed to validate generated events
   – Several problems found e.g. xKalman in 8.0.3
   – A lot of activity in electron reconstruction
   – Jet/ETmiss: jetRec under restructuring, H1-style weights need to
     be recalculated for Geant4
   – DC2 going into production mode
Grid infrastructure (Luc Goosens, et al.)
•   production system architecture (2)
     – executors
         • one for each facility flavour
               – LCG (lexor), NG (dulcinea), GRID3 (Capone), PBS, LSF, BQS, Condor?, …
         • translates facility neutral job definition into facility specific language
               – XRSL, JDL, wrapper scripts, …
         • implements facility neutral interface (see the sketch after this list)
               – usual methods: submit, getStatus, kill, …


     – data management system
         • allows global cataloguing of files
               – we have opted to interface to existing replica catalog flavours
         • allows global file movement
               – an atlas job can get/put a file anywhere
         • presents a uniform interface on top of all the facility native data management tools
         • implementation -> Don Quijote
               – see separate talk
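
A minimal sketch of the facility-neutral executor interface described above; the method names (submit, getStatus, kill) come from the talk, everything else is illustrative.

```python
# Illustrative executor skeleton: one subclass per facility flavour translates a
# facility-neutral job definition into the facility-specific language (JDL, XRSL, ...).
class Executor(object):
    def submit(self, job):
        raise NotImplementedError
    def getStatus(self, job_id):
        raise NotImplementedError
    def kill(self, job_id):
        raise NotImplementedError

class LCGExecutor(Executor):                 # cf. "lexor", the LCG flavour
    def submit(self, job):
        jdl = self._to_jdl(job)              # neutral definition -> JDL
        # hand the JDL to the LCG workload management system here (not shown)
        return "job-id-placeholder"
    def _to_jdl(self, job):
        return 'Executable = "%s";' % job["transformation"]
```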
  Data Management (database): Don Quijote
  Supervisor: Windmill
  Executors: NG (dulcinea), LCG (lexor), GRID3 (Capone), batch

[Diagram: DC2 production system. The Windmill supervisors talk to the production
 database (prodDB) and the Don Quijote data management system (dms), and
 communicate via Jabber with one executor per facility flavour (LCG/lexor,
 NG/dulcinea, Grid3/Capone, LSF batch); each grid flavour sits on top of its own
 replica location service (RLS).]
            DC2 running (S.Albrand)
• DC2 will have a much more sophisticated production system
  than DC1 and uses POOL instead of ZEBRA
• Only one database for all data types; data type is stored in
  database
• Users add properties to their datasets (collection of logical
  files) from a predefined list of possible properties
• Allows database searches for datasets
• Dataset names proposed to the user (modifications possible)
• User interface to submit tasks
• Datasets not owned by one physics group alone anymore;
  notion of principal physics group
• Possibility to extract parent and children datasets: parent
  may be generator level dataset and children may be
  reconstructed dataset
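
An illustrative sketch of the dataset bookkeeping just described; the field names and values are invented for the example, not the actual DC2 production database schema.

```python
# Hypothetical dataset record: properties from a predefined list, a principal
# physics group (not an exclusive owner), and parent/child links between datasets.
dataset = {
    "name":            "dc2.003000.gen.h_bb",       # hypothetical proposed name
    "principal_group": "higgs",                     # principal physics group
    "data_type":       "generated",                 # data type stored with the dataset
    "properties":      {"generator": "Pythia", "energy": "14TeV"},
    "parent":          None,                        # e.g. a generator-level parent dataset
    "logical_files":   ["dc2.003000.gen.h_bb._00001.pool.root"],
}
# a child (reconstructed) dataset would point back via "parent": "dc2.003000.gen.h_bb"
```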
Simulation and Detector Description
        Detector description (J.Boudreau)
• The detector description in
  “GeoModel” will be used in both
  simulation and reconstruction
• LAr is last detector missing in
  GeoModel
• TileCal has finished GeoModel
  description but will not be
  available for DC2
• Volumes know their sub-
  volumes: calculate dead material
  etc in transparent way
• Relational database (Oracle) will
  contain versions of whole
  detector: no hardwired detector
  description numbers in
  reconstruction code
• Changing things such as
  alignment will be dealt with
  through different versions of
  detector description in database
• Great live demo of GeoModel!
Simulation (A.Rimoldi)
• Generators in advanced stage in both Atlfast and
  Atlsim: added “Cascade” and “Jimmy”; some
  differences between G3 and G4 Herwig events
• G4atlas being debugged: use versions > 6.0; new
  Geant4 release 6.1 (25/3/2004)
• Digitization: most infrastructure software now in
  place, but work to do for each subdetector
• Pileup: under development and test; different
  subdetectors can be affected by different bunch
  crossings
• Combined test beam: general infrastructure is
  ready
     G4ATLAS status (A.Dell'Acqua)
• Concentrating on DC2 production
• G4 v6.1 gave some problems, went back to v6.0
• 6.0 has some physics problems (which don't seem to be
  serious, judging from his tone), but aiming for robustness at the
  cost of some accuracy
• MC truth: fully deployed with rel.8.0.2, full support in
  v8.0.3
• Need guinea pigs to test MC truth and for simulation and
  reconstruction
• G4 truth uses same HepMC format as generated event
• Migration to python job options
  Digitization and Pileup (D.Costanzo)
• Sub-detector breakdown: ID getting there, calorimetry
  very mature since DC1, muons less so
• Pileup: each detector reads pileup from different bunch
  crossings (see the sketch after this list)
• Number of files in POOL should be kept low: many files
  lead to memory leaks!
• Disk usage explodes with the use of pileup; most of this is
  MC truth for the background events, which does not have to
  be written and will be eliminated
• Memory leak problems
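
To make the "different bunch crossings per detector" point concrete, here is a purely conceptual sketch: the window sizes, event interface and the flat random draw are all placeholders (the real digitization uses a proper Poisson distribution per crossing and the true detector time windows).

```python
# Conceptual pile-up overlay: each subdetector reads minimum-bias events from its
# own range of bunch crossings. Window sizes below are placeholders, not real values.
import random

PILEUP_WINDOW = {                 # bunch crossings relative to the triggered one
    "Pixel": range(-2, 2),
    "LAr":   range(-10, 5),
    "MDT":   range(-20, 10),
}

def overlay_pileup(signal_event, minbias_pool, mean_interactions=4):
    for detector, window in PILEUP_WINDOW.items():
        for bc in window:
            n = random.randint(0, 2 * mean_interactions)   # crude stand-in for Poisson
            for _ in range(n):
                mb = random.choice(minbias_pool)
                signal_event.add_hits(detector, mb.hits(detector), bunch_crossing=bc)
```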
             Detector Simulation Conclusions
•   G4atlas (A.Rimoldi):
•   Work on digitization and pileup
•   Full deployment of MCtruth
•   Migration to Python job options
•   Major changes in AtlasG4Sim to converge with ATLAS
•   Human resources problem (Armin)

•   Digitization and Pileup (D.Costanzo)
•   Detector folders → each detector reads hits from different bunch crossings
•   Emphasis moved to detector description in GeoModel
•   Inner Detector: Noise was implemented with no hits from pixel/SCT
•   LAr calorimeter: Improvements wrt G3; use of GeoModel
    Still bugs and calibration issues in G3/G4
•   Muon spectrometer: migration to GeoModel done in the last few days
    Effort put in validation of hit positions
    Detailed simulation of MDT digitization and response
•   Realistic pileup procedure still needs work
•   Setup for combined test beam in place (or almost)
Analysis Tools
             Analysis tools (K.Assamagan)
•   RTF recommendations: looked at modularity, granularity and design of
    reconstruction software
•   AnalysisTools: span the gap between reconstruction and ntuple analysis
•   Tools: Artemis analysis framework prototype – seems to diverge from the
    EDM, may not be supported for long
•   PID: prototype to handle particle identification

    [Diagram: data flow between data objects and algorithms]
•   Workshop in April at UCL:
    http://www.usatlas.bnl.gov/PAT/ucl_workshop_sumary.pdf
•   Topics included PyROOT and PyLCGDict
•   Physicist Interface (PI) project:
     – extends AIDA; provides services for
       batch analysis in C++, fitting and
       minimization, storage in
       HBOOK/ROOT/XML
     – plotting in
       ROOT/Hippodraw/OpenScientist,
       etc.
 Python scripting language (W.Lavrijsen)


• GaudiPython: provides binding to Athena core objects;
  basis for job options after release 8.2.0
• PyROOT: bridge between Python and ROOT; distributed
  with ROOT 4; http://root.cern.ch/root/HowtoPyROOT.html
• PyBus: software bus (modules can be “plugged in” to bus)
  implemented in Python; http://cern.ch/wlav/pybus
• ASK (Athena Startup Kit): DC2-inspired tutorial online; full
  chain of algorithms (generators => ... => simulation => ... =>
  analysis); http://cern.ch/wlav/athena/athask/tutorials.html
   Analysis with PyROOT (S.Snyder)
• Why? Because ROOT C++ (CINT) is not reliable enough;
  Python is much better and at least as capable
• No speed problems found
• PyROOT now part of ROOT release
• Can use PyROOT to interface your own code (or any
  code that CINT could handle)
• PyLCGDict does the same thing with a different
  data dictionary (data definition). When to use which?
• If you have external code that already has a
  ROOT/LCG dictionary, that helps to decide
• PyROOT has fewer dependencies
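
A minimal PyROOT example of the kind of analysis discussed here; the file, tree and branch names are hypothetical.

```python
# Histogram one quantity from a (hypothetical) combined ntuple using PyROOT.
import ROOT

f    = ROOT.TFile.Open("cbnt_output.root")      # hypothetical CBNT output file
tree = f.Get("CollectionTree")                  # hypothetical tree name
hist = ROOT.TH1F("h_eta", "electron eta", 60, -3.0, 3.0)

for i in range(tree.GetEntries()):
    tree.GetEntry(i)
    hist.Fill(tree.Ele_eta)                     # hypothetical branch holding one eta per event

hist.Draw()
```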
Reconstruction, Trigger and Event Data Model
              Reconstruction (D.Rousseau)
• RTF recommendation: Detectors
  (e.g.TileCal and LAr) should share
  code to facilitate downstream
  algorithms
• Combined beam test: analyse CBT
  data with offline code, as little
  dedicated code as possible; big effort
  e.g. to integrate conditions database
  for various detectors
• DC2 reconstruction (release 9.x.x )
• Run on Geant4 data with new
  detector description; validate G4
• Persistence:
    – ESD (event summary data), EDM
      output: issues with ~200k cal cells;
      target size 100kB/ev
    – Ongoing discussions on AOD
      (analysis object data) definition: aim
      for 10kB/ev
        Reconstruction (D.Rousseau)

• Work model:
  Reconstruction → Combined ntuple → ROOT/PAW analysis
  – Changes to:
  Reconstruction → ESD/AOD → analysis in Athena → small ntuple → ROOT
  – CBNT remains as a debugging tool but will not be
    produced on a large scale for DC2
• Status:
  – Python job options (no more jobOptions_xxx.txt)
  – People needed for transverse tasks: documentation,
    offline/CBNT reconstruction integration, AOD/ESD
    definition
    Calorimeter reconstruction (P.Loch)

• CALO EDM has navigable classes CaloCell, CaloTower and
  CaloCluster using consistent 4-momentum representations
  (INavigable4Momentum, v8.1.0); can now be used directly by
  JetRec

• Container class CaloCellContainer holds both LAr and TileCal
  CaloCells and persistifies them in StoreGate (8.2.0, key “AllCalo”);
  used by LArCellRec, LArClusterRec, TileRecAlgs explicitly

• Clusters produced by CaloClusterMaker (Sven Menke, v8.1.0,
  topological and sliding window clusters) have full 3D neighbours
  option, crossing boundaries between calorimeters (8.2.0)

• Cluster splitter with 3D clusters spanning different calorimeters
  under test (aim for 8.3.0) – finds individual showers (peaks) in large
  connected cluster
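
To illustrate what such a splitter has to do (this is not the actual ATLAS algorithm): find local maxima among the cells of a connected cluster and build one sub-cluster per peak. Energy sharing between peaks and calorimeter-specific details are ignored, and all thresholds and data layouts are assumptions.

```python
# Toy cluster splitter: a "peak" is a cell above threshold that is more energetic
# than all of its neighbours; every cell is then attached to the closest peak.
def split_cluster(cells, neighbours, e_min_peak=0.5):
    """cells: {cell_id: (energy, eta, phi)}; neighbours: {cell_id: [adjacent ids]}."""
    peaks = [c for c, (e, _, _) in cells.items()
             if e > e_min_peak
             and all(e >= cells[n][0] for n in neighbours.get(c, []) if n in cells)]
    if not peaks:                                   # nothing to split
        return {None: list(cells)}
    subclusters = {p: [] for p in peaks}
    for c, (e, eta, phi) in cells.items():
        nearest = min(peaks, key=lambda p: (eta - cells[p][1]) ** 2
                                         + (phi - cells[p][2]) ** 2)
        subclusters[nearest].append(c)
    return subclusters
```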
    Calorimeter reconstruction (P.Loch)

•   New structure for algorithm class CaloTowerAlgorithm – calls different
    builders for towers according to calorimeter – makes older FCAL minicells
    obsolete

•   CALO algorithm structure slightly behind the EDM: new CaloCellMaker (David
    Rousseau) to be tested – makes cells and also calls cell-type corrections (>10
    corrections for LAr) – aim for 8.3.0, needed in 9.0.0

•   No hardwired numbers in code anymore, detector description/job options
    (Database? Job options?)

•   Implement relations between cells and clusters for 9.0.0 (using STL
    maps?) – Proposal to have classes such as “particleWithTrack” to
    implement relations was rejected

•   Asked for volunteers for design, implementation and testing of both the
    calorimeter EDM and the reconstruction algorithms
                 Tracking (E.Moyse)

Many recent developments:
• new tracking class; converters from old formats

• very good Doxygen documentation available from the ID software
  homepage

• A lot of reorganization and new packages recently
• track extrapolation for ID - Dmitri Emeliyakov

• DC2 will be based on iPatRec and xKalman

• there will be a manual for the EDM and utility packages for v9.0.0
                 Track: Overview

[Class diagram of the new Track interface not reproduced here]

• New interface for Track, shown in the diagram.
• TrackStateOnSurface - provides a way of iterating through hits and
  scatterers on the track. It contains pointers to:
   –   RIO_OnTrack
   –   TrackParameter
   –   FitQualityOnSurface
   –   ScatteringAngleOnSurface
• Summary
   – “old” summary object still in Track in 8.2.0. Could not be removed
     without changing interface (see later slide)
     TrackParticle: Overview




• Why do we need TrackParticle?
  • Need lightweight object for analysis, providing
    momentum
  • Need to transform parameters from detector to
    physics frame
  • Provides navigation
    Muon Reconstruction (S.Goldfarb)

• Packages Moore and Muonboy (became MuonBox)
• Moore: moved to GeoModel, reconstructs G4 data, DC2
  development
• MuonBoy: G4 reco expected shortly, not using
  GeoModel, development for testbeam
• Common features: unit migration now validated
• Efficiency in eta is now perfect (the features reported in the
  SUSY full-simulation paper are gone)
• Combined reconstruction:
     Staco: now ported to Athena, will accept all types of tracks,
     being prepared for the new EDM
     MuID: low-pT muon development using TileCal, etc.
• Track task force
           Discussions on EDM/ESD/AOD
•   Data flow is:
      Reconstruction → ESD (100kB) → AOD (10kB) → User code → ntuples
•   Meeting at UCL in April: document with conclusions at:
                    http://www.usatlas.bnl.gov/PAT/ucl_workshop_sumary.pdf


•   Discussion:
•   Proposal for a class of “IdentifiedParticle” which could be lepton, tagged jet etc -
    Proposal was rejected, it seemed to need either a very complicated or a
    redundant implementation to be sufficiently general
•   Discussion on e.g. e/gamma ID:
     – egammaBuilder - high-pT electron ID
     – softeGammaBuilder: better for low pT/non-isol. electrons but much overlap
•   Both collections must be kept, but balance must be found in similar matters due
    to AOD size restrictions (aim for 10kB/ev.)
•   CaloCells (200,000 of them – cannot all be kept! see the estimate after this
    slide): can keep cells above noise plus the sum of all cells (critical for ETmiss)
•    Similar issues for tracking, muons, etc

•   Conclusions:
•   Keep more than one collection of, e.g., electrons; introduce an "event view" to
    help choose between candidates
•   CaloCellContainer: a technical solution in sight but worries about cell removal,
    needs study
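
A back-of-envelope estimate of why the full cell list cannot be kept: the bytes-per-cell figure below is my assumption for illustration, the talk only quotes ~200k cells and the 100 kB/event ESD target.

```python
# Rough size estimate; 10 bytes/cell is an assumed compact encoding (energy + id).
n_cells          = 200000
bytes_per_cell   = 10
esd_target_bytes = 100 * 1024

all_cells_bytes = n_cells * bytes_per_cell          # 2,000,000 bytes, about 2 MB
print(all_cells_bytes / float(esd_target_bytes))    # ~20: twenty times the whole ESD budget
```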
   Reference frames (Richard Hawkins)
Need for more than one frame ?
• Global frame is defined in ATL-GE-QA-2041
• Easier to do reconstruction if we have several subdetector-specific
  frames?
• Boosted frames? For example, such that the sum of pT_beam = 0, to correct
  for the beam tilt (1e-4 rad, but p_beam = 7 TeV; see the estimate after this slide)

• Q&A
• Markus Elsing - one global frame must be used for reconstruction. Also
  global frame should be determined by inner detector frame, if possible.
  Problem otherwise when using lookup tables for subdetector position
• Various - beam tilt should be corrected for if possible/necessary and size
  of effect estimated; may be done in Monte Carlo

• Conclusions:
• Strive to use only one global frame
• Beam tilt should be taken care of in the simulation, same as the vertex
  smearing
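
For scale, a one-line estimate using the numbers quoted above (the small-angle approximation pT ≈ p * theta):

```python
# A 1e-4 rad tilt on a 7 TeV beam biases the transverse momentum by roughly p * theta.
p_beam    = 7000.0      # GeV
tilt      = 1e-4        # rad
pt_offset = p_beam * tilt
print(pt_offset)        # 0.7 GeV: the summed pT is shifted by ~700 MeV per beam
```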
                   PESA (Simon)
• Several technical matters, software automatic testing;
  move from development to production phase (stable,
  user-friendly code etc)

• Discussion on forced acceptance: a fraction of the
  events must be kept regardless of trigger acceptance ->
  studies of noise, background, trigger efficiency etc

• Discussion on how this should be implemented: fine-
  grained (according to LVL1 flags), global (a global fraction of
  bunch crossings), etc. (see the sketch below)

• Rather technical reports on status of e/gamma, LAr and
  muon slices
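
A sketch of the two implementation options being discussed, global versus fine-grained per LVL1 item; the fractions and item names are placeholders, not real menu entries.

```python
# Forced acceptance: keep some events regardless of the trigger decision, either as
# a global fraction of bunch crossings or as a per-LVL1-item fraction.
import random

GLOBAL_FRACTION   = 0.001                               # 0.1% of all events (made up)
PER_ITEM_FRACTION = {"EM25i": 0.01, "MU20": 0.05}       # hypothetical LVL1 items

def forced_accept(lvl1_items):
    if random.random() < GLOBAL_FRACTION:               # global scheme
        return True
    return any(random.random() < PER_ITEM_FRACTION.get(item, 0.0)
               for item in lvl1_items)                  # fine-grained scheme
```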
   Example: e/gamma slice

[Plots: tracking efficiency vs. eta for single electrons with pile-up at 2x10^33 and
 10^34, for TrigIdScan and SiTrack. 7.0.2 had an efficiency problem in both
 TrigIdScan and SiTrack, now solved: mean efficiencies of 94% and 91% in 7.0.2
 rise to 96% and 95% in 8.0.0.]
                         egamma Workshop
Several topics have come up which need more discussion:
• Combined testbeam
    • How to integrate what we will learn there?
• Reconstruction
    • what is done in clustering, what in egamma & electron/photon id?
• Calibration/Corrections
    • what is specific to e and gamma, and how should it be done?
    • G4/G3
    • Geometry use
• Physics Studies
    • Z→ee, etc.
• Validation of Releases

Difficult to find time to discuss all of this in the usual weeks
Date : Mon-Tue, June 28-29 (tbc)
Place: LPNHE Paris (Fred Derue)
Several combined performance studies…

Electron Study
• Discrepancy found between G3 and G4 (plot for E = 20 GeV electrons) – see
  K.Benslama, Electron Study
• Apparently already explained by the difference between G3 and G4 (the
  "E-field effect" is not simulated in G3) – see G.Unal, LAr calibration
• Below: 50 GeV electrons vs. eta

[Plot: energy (MeV) vs. eta for 50 GeV electrons, showing a transition at
 eta = 0.8 at the barrel/end-cap crack]
                     JetRec (P.Loch)

• First look at new jets
• Kt for towers seems to be slower than it used to be: found some
  long-standing bugs
• Most jets in the forward direction are extremely low in transverse energy
  → apply cuts on the jet finder input (towers, so far; see the sketch after
  this slide)
• Number of jets in calorimeter and MC truth is very comparable if no cut,
  or only a very low cut, is applied to the Et of the input cells:
  Et > 100/200/500 MeV (very low Et cuts!)
• As soon as cuts are applied, basically all distributions (multiplicity,
  jet shapes, radius, etc.) become different from truth (see next slide)

•   Verify hadronic calibration
•   Invited contributions of new jet algorithms if needed
•   Physics groups should use jet finders and give feedback
•   Preparing extensive documentation for 8.3.0
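
As a trivial illustration of the input cut mentioned above; the threshold value and the data layout are placeholders, and the jet finder call is hypothetical.

```python
# Remove very soft towers before they are handed to the Kt jet finder.
ET_MIN = 0.5   # GeV; the talk scans thresholds from 100 MeV up to 1 GeV

def select_towers(towers, et_min=ET_MIN):
    """towers: list of (et, eta, phi) tuples; keep only those above threshold."""
    return [t for t in towers if t[0] > et_min]

# jets = run_kt_finder(select_towers(all_towers))   # hypothetical jet finder call
```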
     Number of Kt jets in calorimeter and MC truth:

[Plots: number of Kt jets per event and jet multiplicity vs. eta, comparing MC
 truth and tower jets for input cuts of no selection and Etin above 100, 300,
 500, 700 MeV and 1 GeV. Agreement is good with no selection ("ok!") and
 degrades as the input Et cut is raised.]
           Update on vertexing II
• Vertexing tools work quite stably by now
  - Billoir FastFit method heavily in use
  - SlowFit method: first use case by B-physics
  people in Artemis (J. Catmore)

• InDetPriVxFinder package is the standard primary
  vertex finder in Athena reconstruction

• Several clients for VxBilloirTools already (B-
  physics group, several people from Bonn
  University, b-tagging, …)
                                         Andreas Wildauer
     Results of InDetPriVxFinder
H→uu and H→bb with vertex constraint, Geant4 H→bb
  (0., 0., 0.) ± (0.015, 0.015, 56) mm, 4800 events in total

[Plots: primary vertex resolutions of roughly 12-13 µm for one sample and
 32-36 µm for the other]

                                          Andreas Wildauer
                     Results on QCD di-jet events [from D. Cavalli]

[Plots: old H1 calibration (Athens results) vs. new H1 calibration from
 MissingET; the new calibration improves the proportionality curve]
Combined Test Beam
      Combined Test Beam (A.Farilla)
• Immediate plans:
• Finalize and validate first
  version of simulation package
• Finalize first version of
  reconstruction with ESD and
  CBNT output
• Start plugging Combined
  Reconstruction algorithms into
  RecExTB
• Finalize code access to
  Conditions Database
• Develop basic analysis
  framework for reconstructed
  data
• Work in progress for a
  combined event display in
  Atlantis

Looking for new people from the CTB community to add algorithms from
combined reconstruction into RecExTB
That’s it
                        Acronyms
•   PESA – Physics event selection algorithms
•   ESRAT – Event Selection, Reconstruction and Analysis Tools
•   EDM – Event Data Model
•   ID – Inner Detector
•   CBT (CTB) – Combined Beam Test
•   STL – Standard Template Library
•   SEAL – Shared Environment for Applications at LHC
    (http://cern.ch/seal/)
•   AIDA – Abstract Interface for Data Analysis
•   LCG – LHC Computing Grid (http://lcg.web.cern.ch)
•   PI – Physicist Interface (extends AIDA, part of LCG)
•   POOL – Pool Of persistent Objects for LHC (part of LCG)