
ppt - CERN Computing Seminars


Spring 2002 CMS Monte Carlo Production:
 What ? How ? What Next ?


  Véronique Lefébure (CERN-HIP)
        CERN-IT Seminar
   The 25th of September 2002
Content:
 •What ?
    -Physics Applications
        -Production steps
        -Data products
    -Resource Constraints
        -CPU, RAM
        -Persistency
    -How Much Data
        -Number of events, TB of data
        -Delivery deadline

 •How ?
    -World-Wide Distributed Production
        -Where, who
        -Coordination
    -Production Tools
        -RefDB, IMPALA, DAR, BOSS
        -Data Transfer
        -Data Storage
        -Data Validation
    -Success and Difficulties

 •What next ?
    -Possible Improvements
    -Coming Major Production
        -2004 Data Challenge




                  “Spring 2002 CMS MonteCarlo Production: What ? How ? What next ?”                    2
                                        Véronique Lefébure
                    Introduction: CMS
On-line System
• Multi-level trigger
• Filter out background
• Reduce data volume




           Data Simulation Needs

• Spring 2002 Production for the CMS Physics
  Community:
  – need a large amount of simulated data in order to
    prepare the CMS DAQ TDR document:
    “Data Acquisition Technical Design Report”
    due for end of 2002
  – need the most up-to-date physics software to be used
  – need the data before June 2002 CMS week




    Monte Carlo Production Steps

•   The full Production Chain consists of 4 steps:

    •   3 Logical Monte Carlo Simulation Steps:
        1) Generation
        2) Simulation
        3) Digitisation -> RAW data, as produced by the real detector,
           stored in Objectivity/DB
    • 1 Reconstruction and Analysis Step
•   Production was performed step by step
    for many different p-p physics channels


               Monte Carlo Production Steps
                      1) Generation

       Primary interactions
       in vacuum of beam-pipe



•Generation of one p-p interaction at a time
  for a selected physics channel
• In reality:
~4 or 20 interactions per beam-crossing
depending on the beam luminosity
(2x10^33 or 10^34 cm^-2 s^-1)
i.e. interactions are superimposed: “pile-up”


            Monte Carlo Production Steps
                   2) Simulation
 Secondary interactions
 in detector material and
 magnetic field

•Individual Hits:
      •Crossing points
      •Energy deposition
      •Time of flight
•In reality:
one beam-crossing every 25 ns
<< time of flight and electric
signal development
i.e. superimposition of signals
from particles from different
beam-crossings: “pile-up”

         Monte Carlo Production Steps
               3) Digitisation
    Response of Sensitive
    detector elements, taking
    into account the two sources
    of Pile-Up
•4 or 20 interactions
per beam-crossings

•Particle time of flight &
Electrical signal development :




  •Beam-crossings: [-5,+3]
  •For 1 Signal p-p event of 1 MB
  we have 70 MB of Pile-up events @ 10^34 cm^-2 s^-1
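
A rough sketch of the pile-up count implied by the numbers on this slide, assuming every beam-crossing in the [-5,+3] window contributes the average number of interactions. The function name is illustrative only. The result for 10^34 is of the same order as the "200 PU events" quoted for ORCA.

```python
# Illustrative arithmetic only: crossing window and interaction counts are
# taken from the slides, everything else is an assumption of this sketch.
FIRST_CROSSING, LAST_CROSSING = -5, 3  # beam-crossing window [-5,+3]


def pileup_events(interactions_per_crossing: int) -> int:
    """Pile-up interactions superimposed on one signal event."""
    n_crossings = LAST_CROSSING - FIRST_CROSSING + 1  # -5..+3 inclusive
    return n_crossings * interactions_per_crossing


low_lumi = pileup_events(4)    # 2x10^33 cm^-2 s^-1
high_lumi = pileup_events(20)  # 10^34 cm^-2 s^-1
print(low_lumi, high_lumi)
```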
      Monte Carlo Production Steps
      4) Reconstruction and Analysis
Higher level physics
Reconstruction and
Histogramming




•Level-1 trigger Filtering
•Track, clusters, vertices Reconstruction
•First-pass physics Analysis
•Histogramming




                                 Physics Applications
(Output sizes are for jobs of 500 events)

“Generation”
  Application: CMKIN/PYTHIA (ISAJET, COMPHEP)
      •Fortran77
      •Very fast: 5 sec/event
  Input:  ascii file
  Output: PAW ntuple (size: 30 MB)

“Simulation”
  Application: CMSIM/GEANT3
      •Fortran77
      •Very slow: 1 to >10 min/event
  Input:  ascii file
          + Geometry and magnetic field ZEBRA file (size: 14 MB)
          + PAW ntuple
  Output: ZEBRA file (size: 0.5 GB)

ORCA-COBRA (used for the three steps below)
      •C++ Object-Oriented
      •ooHit Formatting: very fast (I/O)
      •10^34 PU (200 PU events): 1 min/event
      •Executable size: ~<200 MB
      •Multi-threaded

“ooHit formatting”
  Input:  ascii file
          + Geometry and magnetic field ZEBRA file
          + ZEBRA file
  Output: Objectivity/DB data & metadata ooHit files (size: 0.5 GB)

“Digitisation”
  Input:  ascii file
          + Objectivity/DB data & metadata ooHit files
            for “Signal” and “Pile-up” events
  Output: Objectivity/DB data & metadata Digis files (size: 2 GB)

“Reconstruction and Analysis”
  Input:  ascii file
          + Objectivity/DB data & metadata ooHit & Digi files
            for “Signal” and “Pile-up” events
  Output: Objectivity/DB data & metadata files
          or PAW ntuple or ROOT files

           More Production Steps

• Filtering (Level-1 trigger, …)
• Add digits (e.g. first calorimeter digits, then
  Tracker after filtering)
• Cloning of ooHits and/or Digis (smaller collection
  of data to handle, less staging at analysis time)
• Re-digitisation with different algorithms or
  parameters



                Resource Constraints
•   Long CMSIM jobs: can take 2 days and more
•   RAM : > 512 MB for dual processors (ORCA)
•   Redhat 6.1(.1) for Objectivity/DB license
•   Data server:
    – ~80 GB of Pile-Up events (re-used, otherwise 300TB!)
    – Typically 1 server per 12 CPUs
• Disk space: size of one typical dataset @ 10^34:
  50K events x (1 MB fz + 1 MB ooHits + 4 MB digis)/event
  = 300 GB
• Lockserver, AMS server:
  number of file handles may reach ~3000
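
The disk-space estimate above can be reproduced in a few lines. This is only a sketch: the function name is made up, and decimal units (1 GB = 1000 MB) are an assumption suggested by the round numbers on the slide.

```python
MB_PER_GB = 1000  # assumption: decimal units


def dataset_size_gb(n_events, fz_mb=1.0, oohits_mb=1.0, digis_mb=4.0):
    """Disk space for one dataset, summing the three per-event products."""
    per_event_mb = fz_mb + oohits_mb + digis_mb
    return n_events * per_event_mb / MB_PER_GB


print(dataset_size_gb(50_000))  # 300.0
```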

                       Job Complexity
• Generation and Simulation jobs: easy part
• ORCA-COBRA jobs: more tricky :
   – Closely-coupled jobs
       •   Shared federation/lockserver, output server, AMS
       •   ~ 5 jobs write in parallel to 1 DB
       •   1 job may populate many DBs (~10)
       •   One stale lock can bring everything to a halt
   – Massive I/O system @ 10^34
       • ~100 jobs in parallel
       • Input = ~70 MB pile-up events per 1 MB signal event,
                     at 1 event/minute = ~1 MB/sec/job
       • Output = 4 MB/event, i.e. 4 MB/minute/job
   – Not yet fully robust physics software: need to recover from
     crashes and to spot infinite loops
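
To get a feeling for the aggregate load on the data servers, the per-job figures above can be scaled to the whole farm. A minimal sketch, assuming all ~100 jobs run steadily at 1 event/minute; names and defaults are illustrative.

```python
def farm_io_mb_per_s(n_jobs=100, pileup_mb_per_event=70.0,
                     output_mb_per_event=4.0, seconds_per_event=60.0):
    """Return (input, output) rates in MB/s summed over all parallel jobs."""
    input_per_job = pileup_mb_per_event / seconds_per_event
    output_per_job = output_mb_per_event / seconds_per_event
    return n_jobs * input_per_job, n_jobs * output_per_job


total_in, total_out = farm_io_mb_per_s()
# Roughly 117 MB/s of pile-up input and 6.7 MB/s of output for 100 jobs.
print(round(total_in), round(total_out, 1))
```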


                    How Much Data ?
• Generation/Simulation:
   – 4 months
   – 6 M events = 150 Physics channels
• ORCA production:
   – 2 months
   – 19000 files = 500 Collections = 20 TB
      NoPU: 2.5M, 2x10^33 PU: 4.4M, 10^34 PU: 3.8M, filter: 2.9M
   – 300 TB of pile-up movement on the LAN
• 100 000 jobs, 45 years CPU (wall-clock)
• More than 10 TB traveled on the WAN
• Production completed just on time
     Successful Production at a regular global rate !
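
Two quick consistency checks on these totals, as a sketch with made-up function names. The second one assumes ~70 MB of pile-up read per 10^34 event, as quoted earlier in the talk, and covers the 10^34 sample only; it accounts for most of the 300 TB moved on the LAN.

```python
HOURS_PER_YEAR = 365.25 * 24


def average_job_hours(cpu_years, n_jobs):
    """Mean wall-clock time per job."""
    return cpu_years * HOURS_PER_YEAR / n_jobs


def pileup_traffic_tb(n_events, pileup_mb_per_event):
    """Pile-up data read over the LAN, in TB (decimal units assumed)."""
    return n_events * pileup_mb_per_event / 1_000_000


print(round(average_job_hours(45, 100_000), 1))  # about 3.9 hours per job
print(pileup_traffic_tb(3_800_000, 70))          # 266.0 TB for 10^34 PU alone
```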
[Plot: CMSIM, 6 million events, Feb. 8th to June 6th]
[Plot: 2x10^33 PU, 4 million events, April 12th to June 6th]
[Plot: 10^34 PU, 3.5 million events, April 10th to June 6th]
                                                Physics Results
  Data is used for physics studies,
  not only for computing performance studies

  [Embedded slide (Sarah Eno): Jet Resolutions,
   HLT jet resolution at low and high luminosity]




                      How ?


• Production
   – Distribution
   – Coordination
• Production Tool Suite
• Success and Difficulties




World-wide Distributed Production
[Map: CMS Production Regional Centres]
    World-wide Distributed Production
• 11 Regional Centres (RC)
  > 20 sites in USA, Europe, Russia
  ~ 1000 CPUs
  Bristol/RAL (UK), Caltech, CERN, Fermilab, Imperial College (UK), IN2P3-Lyon,
  INFN (Bari, Catania, Bologna, Firenze, Legnaro, Padova, Perugia, Pisa, Roma,
  Torino), Moscow (ITEP, JINR, SINP MSU, IHEP), UCSD(San Diego), UFL (Florida),
  Wisconsin;
  Note: Still more sites joining (RICE, Korea, Karlsruhe, Pakistan, Spain, Greece, … )

• > 30 Production Operators
  Maria Damato, Alessandra Fanfani, Daniele Bonacorsi, Catherine MacKay, Dave Newbold, Suresh
  Singh, Vladimir Litvine, Salavatore Costa, Julia Andreeva, Tony Wildish, Veronique Lefebure, Greg
  Graham, Shafqat Aziz, Nicolo Magini, Olga Kodolova, David Colling, Philip Lewis, Claude
  Charlot, Philippe Mine, Giovanni Organtini, Nicola Amapane, Victor Kolosov, Elena Tikhonenko,
  Massimo Biasotto, Stefano Lacaprara, Alexander Kryukov, Nikolai Kruglov, Leonello Servoli, Livio
  Fano, Simone Gennai, Ian Fisk, Dimitri Bourilkov, Jorge Rodriguez, Pamela Chimney, Shridara
  Dasu, Iyer Radhakrishna, Wesley Smith,
  plus probably many more persons in ‘the shadow’ !

• > 20 Physicists as Production “Requestors”

                  Coordination Issues

• Physicists side
   –   Handle four Physics groups
   –   Check uniqueness of requests
   –   Check number of requested events is reasonable
   –   Take care of request priorities
• Producers side
   – Deploy and support production tools
   – Distribute physics executables
   – Distribute requests appropriately to RCs
   – Ensure uniqueness of produced data
   – Track progress of data production and transfer


                Coordination Means
• Physicists side:
   – 1 Coordinator per Physics group
   – 1 Coordinator for the 4 Physics groups
   – Meetings
   – Use of MySQL CMS DB for recording and managing the
     production requests (“RefDB”)
• Producers side:
   – 1 Production Manager
   – 1 Production Coordinator in contact with the Physics Coordinators
   – 1 or 2 Contact Persons per Regional Centre
   – Meetings and mailing list
   – Use of MySQL CMS DB for assigning production requests to
     Regional Centres and progress tracking (“RefDB”)
   – Pre-allocation of run numbers, random seeds, DBIDs
   – Automatic file naming provided by “RefDB”
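
Pre-allocation of run numbers can be pictured as handing each Regional Centre its own non-overlapping block, so that data produced at different sites can never collide. The class below is a hypothetical sketch of that idea, not the RefDB implementation.

```python
class RunAllocator:
    """Hands out non-overlapping run-number blocks (hypothetical scheme)."""

    def __init__(self, first_run=1):
        self._next = first_run
        self.assigned = {}  # centre name -> list of allocated ranges

    def allocate(self, centre, n_runs):
        """Reserve the next n_runs run numbers for one centre."""
        block = range(self._next, self._next + n_runs)
        self._next += n_runs
        self.assigned.setdefault(centre, []).append(block)
        return block


allocator = RunAllocator()
cern_runs = allocator.allocate("CERN", 100)
fnal_runs = allocator.allocate("FNAL", 50)
print(cern_runs, fnal_runs)
```

The same counter-based approach would extend to random seeds and DBIDs.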
                                          RefDB
                 Central Reference Database
Production Requests:
• Submission forms for each
   production step
• List of recorded Requests
• Modification/Correction of
   submitted Requests


Production Assignments:
• Selection of a set of Requests
   for Assignment to an RC
• Re-assignment of a Request to
   another RC or production site
• List and Status of Assignments

                                           RefDB
                  Central Reference Database

Meta Data catalogue :
• Browse Datasets
  according to :
   – Physics Channel
   – Software Version
   – …

• Get Production Status
• Get Data Location
• Get Input Parameters



       Production Tools: Spring02 Components
[Diagram: components, linked to each other through interfaces]
• “RefDB”:
   – Web Interface for Production Requests
   – Central Input Parameters DB
   – Web Interface for Browsing of Metadata & Data Location
   – Central Output Metadata DB
• Monitoring Schema & Scripts
• “IMPALA”: Job Scripts Generator
• “BOSS”: Local Job Monitoring DB
• Job Scheduler

Plus: Executables Distribution = “DAR”; Data Transfer Tools = “Tony’s scripts”; Data Storage




                     DAR
         “Distribution After Release”

• CMS software distribution tool
• Allows creating and installing the binaries
• Distribution tar files published at FNAL and at CERN
• Local installation:
  dar -i Distribution_Tar_File Installation_Directory
• Used for distribution of ALL physics executables and
  Geometry file



                BOSS
   “Batch Object Submission System”
• tool for job monitoring and book-keeping
  developed by CMS
• not a job scheduler,
  but can be interfaced with any scheduler :
   –   LSF (CERN, INFN)
   –   PBS (Bristol, Caltech, UFL, Imperial College, INFN)
   –   FBSNG (Fermilab)
   –   Condor (INFN, Wisconsin)
• Uses a database (MySQL)




                                     BOSS

• User registers a scheduler:
   – Scripts for job submission, deletion and query (DB blobs)
• User registers a job type:
   – Schema for the information to be monitored (new DB table)
   – Algorithms to retrieve the information from the job (DB blobs)
• User submits jobs of a defined type:
   – A new entry is created for the job in the BOSS database tables
   – The running job fetches the user monitoring programs and
     updates the BOSS database
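
The register/submit/update cycle can be sketched with an in-memory SQLite database standing in for the MySQL one. Table layout, column names and function names are assumptions made for illustration; they are not the real BOSS schema or API.

```python
import sqlite3

db = sqlite3.connect(":memory:")  # stand-in for the MySQL database
db.execute(
    "CREATE TABLE job (job_id INTEGER PRIMARY KEY, job_type TEXT, status TEXT)"
)


def register_job_type(name, columns):
    """Create one table per job type for its monitored quantities.

    Names are interpolated directly into SQL, so they must be trusted.
    """
    cols = ", ".join(f"{col} {sql_type}" for col, sql_type in columns.items())
    db.execute(f"CREATE TABLE {name} (job_id INTEGER PRIMARY KEY, {cols})")


def submit(job_type):
    """Record a new job and an empty row in its job-type table."""
    cur = db.execute(
        "INSERT INTO job (job_type, status) VALUES (?, 'submitted')", (job_type,)
    )
    db.execute(f"INSERT INTO {job_type} (job_id) VALUES (?)", (cur.lastrowid,))
    return cur.lastrowid


def update(job_type, job_id, **values):
    """What the running job would do when a monitored value changes."""
    sets = ", ".join(f"{key} = ?" for key in values)
    db.execute(
        f"UPDATE {job_type} SET {sets} WHERE job_id = ?",
        (*values.values(), job_id),
    )


register_job_type("kin", {"run_number": "INTEGER", "events_done": "INTEGER"})
job_id = submit("kin")
update("kin", job_id, run_number=7, events_done=500)
```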




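                        BOSS
[Diagram: data flow around a running job. Elements shown: USER job with
 STDIN; OUT pipe and ERR pipe, each with a TEE to STDOUT / STDERR and
 to a LOG; jobExecuter; TEE pipe and Filter pipe feeding the
 Monitoring Algorithm; BOSS DB.]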
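
The tee-and-filter idea in this diagram can be sketched as follows: the job's output is copied unchanged to a log while a filter extracts the values to be monitored. The log-line format and all names are assumptions for illustration only; the real BOSS filters are registered per job type.

```python
import io
import re
import subprocess
import sys

EVENT_LINE = re.compile(r"processed event (\d+)")  # hypothetical log format


def run_and_monitor(argv, log):
    """Run a job, tee its stdout into `log`, and extract monitored values."""
    monitored = {"last_event": None}
    with subprocess.Popen(argv, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            log.write(line)                   # "tee" branch: keep full output
            match = EVENT_LINE.search(line)   # "filter" branch
            if match:
                monitored["last_event"] = int(match.group(1))
    monitored["exit_code"] = proc.returncode  # set once the process has ended
    return monitored


# A tiny child process plays the role of the physics job.
fake_job = [sys.executable, "-c",
            "for i in range(3): print('processed event', i)"]
log_buffer = io.StringIO()
print(run_and_monitor(fake_job, log_buffer))
```

In a real setup the extracted values would be written to the monitoring database instead of being returned.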


        BOSS for Spring02 Production

Job Type Table   Production step   BOSS Job Type Registration components
KIN              Generation        cmkin.schema, preprocess, runtimeprocess, postprocess
SIM              Simulation        cmsim.schema, preprocess, runtimeprocess, postprocess
OOHit            OOHit             oohit.schema, preprocess, runtimeprocess, postprocess
OODigi           Digitisation      oodigi.schema, preprocess, runtimeprocess, postprocess

          From BOSS to RefDB:
            “Summary scripts”
• Updating RefDB with current status of
  assignment progress
• Book-keeping of the monitored values
• Checking of uniqueness of generation and
  simulation run numbers and random seeds
• Warning for duplicate runs
• Warning for missing or incomplete runs
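
The checks listed above could look roughly like this. The record layout (run number, seed, events produced) and the function name are assumptions; the real summary scripts read these values from the BOSS database.

```python
from collections import Counter


def check_runs(produced, expected):
    """Return warnings for duplicate, missing or incomplete runs.

    produced: list of (run_number, random_seed, n_events) tuples
    expected: dict mapping run_number -> requested number of events
    """
    warnings = []
    run_counts = Counter(run for run, _, _ in produced)
    seed_counts = Counter(seed for _, seed, _ in produced)
    for run, n in run_counts.items():
        if n > 1:
            warnings.append(f"duplicate run {run}")
    for seed, n in seed_counts.items():
        if n > 1:
            warnings.append(f"duplicate seed {seed}")
    done = {run: n_events for run, _, n_events in produced}
    for run, wanted in expected.items():
        if run not in done:
            warnings.append(f"missing run {run}")
        elif done[run] < wanted:
            warnings.append(f"incomplete run {run}: {done[run]}/{wanted} events")
    return warnings


for message in check_runs([(1, 11, 500), (2, 11, 300)], {1: 500, 2: 500, 3: 500}):
    print(message)
```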


            Data Validation Scripts
• After storage of the data:
  Final Validation at the Meta Data level
   – Basically, checks that warnings given by the ‘summary’
     scripts have been corrected
      • Correct number of events
      • No duplicates
   – Closure of DB files (COBRA sense of it: no more data
     will be written to that DB file)
   – All DB files of a Collection are attached to the
     Federation



                                IMPALA
    “Intelligent Monte Carlo Production Local Actuator”


•   Automated script generation tool developed by CMS
    for MC Production
•   Job splitting: 50 000 events = 100 jobs of 500 events
•   Interfaces defined for
    – Parameter Handling
    – Input source discovery and enumeration
    – Tracking (“declared, created, submitted, running, done, problems,
      logs”)
    – Job Submission
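
Job splitting as described above is simple to sketch: a request is cut into fixed-size jobs, each starting in the first tracking state. The dictionary layout and function name are illustrative assumptions, not IMPALA's actual file-based tracking.

```python
# Tracking states as listed on the slide.
TRACKING_STATES = ("declared", "created", "submitted", "running",
                   "done", "problems", "logs")


def split_jobs(total_events, events_per_job=500):
    """Cut a request into jobs of at most events_per_job events."""
    jobs = []
    for first in range(0, total_events, events_per_job):
        n_events = min(events_per_job, total_events - first)
        jobs.append({"first_event": first,
                     "n_events": n_events,
                     "state": TRACKING_STATES[0]})
    return jobs


jobs = split_jobs(50_000)
print(len(jobs))  # 100
```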




                                                IMPALA
[Diagram: production workflow]
1. Physicist’s Brain: Send Request -> RefDB Request
2. Assign Request to RC -> RefDB Assignment
3. RC: IMPALA “Create” -> IMPALA Tracking/Production files
4. RC: IMPALA “Submit” -> IMPALA Tracking/Batch files
5. Farm: Data production, with BOSS pre/runtime/post process -> BOSS DB
6. IMPALA Summary scripts -> RefDB Run table, Assignment Done
7. Data and MetaData: Close DBs & Invalidate Bad runs (MDeamon)
8. Export Data

              IMPALA: Configuration

• Executable location (DAR file)
• Output data location
  (Boot file for the Objectivity/DB federation, output disk, …)
• BOSS (or Scheduler) installation location
• Local functions
  (CopyLogFiles, StageIn, StageOut, …)




                      Data Transfer
                          (“Tony’s scripts”)

• Transfer tool developed by CMS: “Tony’s scripts”
   – For CERN/Europe
   – Many US sites use GDMP (Grid) and globus-url-copy
• Simple HTTP server publishes list of files
   – Files on disk (‘find’) or on tape (flat list)
• Client searches list for new files
   – Compares to list of files already retrieved, selects by
     pattern-matching (to select datasets)
• Client asks server to push n files
   – ‘DBServer’ pushes files in m parallel streams
       • using designated copy agent: scp, bbcp, rfcp
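
The client-side selection step can be sketched in a few lines: compare the published list with what has already been retrieved, keep only the files matching a dataset pattern, and request at most n of them. File names and the function name are hypothetical; this is not the code of the actual scripts.

```python
from fnmatch import fnmatch


def select_new_files(published, retrieved, pattern, n):
    """Pick up to n published files that match pattern and are not yet here."""
    already = set(retrieved)
    wanted = [name for name in published
              if name not in already and fnmatch(name, pattern)]
    return wanted[:n]


published = ["jets/a.db", "jets/b.db", "jets/c.db", "muons/d.db"]
retrieved = ["jets/a.db"]
print(select_new_files(published, retrieved, "jets/*", n=2))
```

The selected names would then be sent to the server, which pushes them in parallel streams with the configured copy agent.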

           Spring02 Transferred Data
•To CERN: 3 or 4 exporters in parallel, 7 TB in total
•To FNAL: 5TB
•Sustained rate network-to-disk higher than sustained rate disk-to-tape

        From                                 To      Rate
        Bristol, RAL, IC, IN2P3, INFN,       CERN    ~200 GB/day
        Caltech, FNAL, UFL, Wisconsin                (disk 150 GB limit)
        Moscow                               CERN    Slow
        Caltech, UFL, Wisconsin,             FNAL    ~1 TB/day
        UCSD, Moscow
        Bristol                              RAL     ~1 TB/day
        INFN                                 INFN    ~300 GB/day


                   Data Storage


• CASTOR (CERN)
• ENSTORE (Fermilab)
• Basic tape system (RAL)




    Success and Difficulties

•   Coordination
•   Farm Setup
•   Running Jobs
•   Data Transfer
•   Data Storage and Publication




    Success and Difficulties: Coordination
• Use of a Central Reference DB “RefDB”
    –   Uniform format of input parameter files                                              NEW
    –   Storage and index of parameter files                                                 NEW
    –   Automatic retrieval of the parameters by IMPALA                                      NEW
    –   Tracking of the global CMS production rate                                           NEW
•   Test-assignments for validation of software installation NEW

• Where GRID tools can help us:
    – Assignment of Requests to RCs is still done by hand
         • Need of a CMS-wide Resource Monitoring System
    – Update of RefDB has to be done by hand
         • Should be automated and incorporated in the Job Monitoring System


  Success and Difficulties: Farm Setup
• We have a Production Tool Suite                                                         NEW
   – But a lot to learn the first time
       • At system level (MySQL, Disk servers configuration for Pile-Up,
         AMS & Lockserver , …)
       • At the software level (test-assignments to play)
   – Heavy support task: rapidly evolving production software: new
     releases, bug fixes (but excellent team spirit)
   – Different Farm configurations: not possible to test the tools for all
     (different job schedulers, MSS or not, distributed or central disks,
     shared or dedicated CPUs, firewalls or not, data servers on CPU
     nodes or not, …)


• Where GRID tools can help us:
   – ‘installation in one command’ toolkit


 Success and Difficulties: Running Jobs
• ORCA Digitisation Job Resume System                                                      NEW
   – Very helpful (~10% of jobs fail, but they can now be easily resumed)
   – Still need more robustness in the user analysis part of ORCA
   – Invalidation of bad runs to be automated
• Objectivity/DB “readonly” option                                                         NEW
   – Far fewer locking problems than before
• System problem recovery:
   –   Cleaning of stale Objectivity/DB locks
   –   2GB file size limit to be controlled on Solaris disk (CERN)
   –   Network failure (no more disk failure)
   –   Disk space
   –   Scaling problems in the way we use BOSS
• Where GRID tools can help us:
   – Farm Monitoring System, with detection of the crash reason and
     recovery action
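
The 2GB file-size limit listed above could be watched with a small check run on the output area before jobs append more data. This is only a sketch, not part of the CMS production tools: it assumes the limit is the largest signed 32-bit file offset (2**31 - 1 bytes), and the function name and the 90% warning threshold are hypothetical.

```python
import os

# Assumption: the "2GB limit" is the largest signed 32-bit file offset.
LIMIT_BYTES = 2**31 - 1


def files_near_limit(directory, fraction=0.9, limit=LIMIT_BYTES):
    """Return sorted (path, size) pairs for files of at least fraction * limit bytes."""
    threshold = fraction * limit
    flagged = []
    for root, _dirs, names in os.walk(directory):
        for name in names:
            path = os.path.join(root, name)
            size = os.path.getsize(path)
            if size >= threshold:
                flagged.append((path, size))
    return sorted(flagged)
```

Calling files_near_limit on a database directory would list the files that should be closed or rotated before the next job writes to them.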
  Success and Difficulties: Data Transfer
• We have transfer tools: “Tony’s scripts” and GDMP
   – Much more data movement than before:
     over half the data has travelled over the WAN
   – Some problems still have to be handled by hand:
      •   Transfer interrupted (time limit)
      •   Data corruption
      •   Disk space limitation
      •   Missing files: a dataset is spread over up to 500 files for one
          collection (typically 100 files), and every file must be present
          before analysis can start safely

• Where GRID tools can help us:
   – Replica Manager
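
The "missing files" problem above amounts to a completeness check: a collection can be released for analysis only when every expected file has arrived. A minimal sketch follows, assuming the expected file list is available as plain names; the file names and helper functions are hypothetical and are not taken from Tony's scripts or GDMP.

```python
def missing_files(expected, present):
    """Return the sorted list of expected file names that have not arrived."""
    return sorted(set(expected) - set(present))


def collection_is_complete(expected, present):
    """A collection is safe to analyse only if every expected file is present."""
    return not missing_files(expected, present)


# Hypothetical collection of 100 files, two of which are still in transit.
expected = [f"collection_{i:03d}.db" for i in range(100)]
in_transit = {"collection_007.db", "collection_042.db"}
present = [name for name in expected if name not in in_transit]

print(missing_files(expected, present))           # the two files still missing
print(collection_is_complete(expected, present))  # False until they arrive
```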

             Success and Difficulties:
              Storage & Publication
• Validation scripts for Dataset integrity check                                         NEW
   – Should be part of the data transfer tool
• Tape failures (RAL)
• Archive failure in Castor: rare but difficult to spot
• Stage-in time to Castor can be very long for a few files
  (> 1 hour)

• Interaction between Castor and (multiple) analyses not
  well understood → needs studying
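
One way a dataset integrity check could be folded into the transfer step is to compare checksums recorded at the source site with checksums recomputed after arrival. The slides do not say how the actual validation scripts work, so the sketch below is an assumption throughout: the manifest format (path mapped to SHA-256 digest) and both function names are hypothetical.

```python
import hashlib


def file_checksum(path, chunk_size=1 << 20):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def corrupted_files(manifest):
    """Return the sorted paths that are absent or whose checksum differs.

    manifest maps each local path to the checksum recorded at the source
    site (hypothetical format).
    """
    bad = []
    for path, expected in manifest.items():
        try:
            if file_checksum(path) != expected:
                bad.append(path)
        except FileNotFoundError:
            bad.append(path)
    return sorted(bad)
```

An empty result from corrupted_files would mean the transferred files match the source; anything else is a list of files to re-transfer.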


     Success and Difficulties: Summary
• Major improvements in the physics code and in the
  production machinery with respect to previous years
    – ORCA Resume System
    – Use of RefDB and BOSS:
      made better automation and book-keeping possible
    – Our CMS production tools can be improved: more automation
•   GRID tools may help make it even better:
    –   Tool for Installation/Configuration of Production Tools
    –   Resource Monitoring System
    –   Replica Manager
    –   Anything that can help reduce manpower needs
• Data access for user analysis has to be improved
• Problems have been addressed by the Production team and
  the Production Tools Review team
                                         More and Faster
•   1999:        1TB – 1 month – 1 person
•   2000-2001: 27 TB – 12 months – 30 persons
•   2002:      20 TB – 2 months – 30 persons
•   2003:     175 TB – 6 months – <30 persons
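
The "more and faster" trend can be made explicit by converting each line into a production rate. The sketch below only reuses the numbers on the slide; it takes "<30 persons" for 2003 as 30, so the per-person figure for that year is a lower bound.

```python
# (label, terabytes, months, persons) copied from the slide.
# Assumption: "<30 persons" for 2003 is entered as 30 (an upper bound).
productions = [
    ("1999", 1, 1, 1),
    ("2000-2001", 27, 12, 30),
    ("2002", 20, 2, 30),
    ("2003", 175, 6, 30),
]


def rate_tb_per_month(terabytes, months):
    """Average data volume produced per month."""
    return terabytes / months


for label, terabytes, months, persons in productions:
    rate = rate_tb_per_month(terabytes, months)
    print(f"{label:>9}: {rate:6.2f} TB/month, {rate / persons:5.3f} TB/month/person")
```

On these numbers the monthly rate grows from 1 TB in 1999 to 10 TB in 2002, and the 2003 plan requires roughly 29 TB per month.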
[Figure: CMS computing capacity in kSI95.Months (log scale, 1 to 100000) versus year (2002 to 2009), shown for CERN and OFFSITE. Labels on the plot: DAQTDR, DC04, Physics TDR, LCG TDR, DC05, DC06, Readiness, LHC 2E33, LHC 1E34. Average slope = x2.5/year.]


          Coming Data Challenge
• 2004 Data Challenge “DC04”
   – Analysis of data produced by
     25% of LHC startup luminosity (2×10^33 cm^-2 s^-1)
     @ a data-taking rate of 25 Hz during 1 month
     ≈ 5×10^7 events
   – = 5% of LHC final luminosity (10^34 cm^-2 s^-1)
   – To validate the software baseline:
      • new LCG persistency framework (POOL, ROOT)
      • new simulation software (OSCAR/Geant4)
      • new GRID tools and resources
• 2003 pre-challenge:
  production of the 5×10^7 events @ 2×10^33 cm^-2 s^-1
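
The DC04 figures can be cross-checked with a few lines of arithmetic. Two points in the sketch are assumptions rather than statements from the slide: "1 month" is taken as 30 days, and the 5% is read as 25% of the startup rate multiplied by the ratio of startup to final luminosity.

```python
RATE_HZ = 25              # data-taking rate from the slide
SECONDS_PER_DAY = 86_400
DAYS = 30                 # assumption: "1 month" taken as 30 days


def events_recorded(rate_hz, days, live_fraction=1.0):
    """Events written at rate_hz over the given days and live-time fraction."""
    return rate_hz * days * SECONDS_PER_DAY * live_fraction


# Continuous running gives an upper bound of 6.48e7 events.
upper_bound = events_recorded(RATE_HZ, DAYS)

# The quoted ~5e7 events then corresponds to a live fraction of about 0.77.
implied_live_fraction = 5e7 / upper_bound

# One reading of the "5%": 25% of the startup rate, with startup
# luminosity (2e33) being one fifth of final luminosity (1e34).
startup_over_final = 2e33 / 1e34
dc04_fraction_of_final = 0.25 * startup_over_final

print(f"{upper_bound:.3g} events, live fraction {implied_live_fraction:.2f}, "
      f"{dc04_fraction_of_final:.0%} of final luminosity")
```

The result is consistent with the slide: about 5×10^7 events is reachable in one month at 25 Hz with realistic live time.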

                                Two Phases
• Pre-Challenge                  (2003, Q3,Q4)                                           (Must be successful)
   –   Large scale simulation and digitization
   –   Will prepare the samples for the challenge
   –   Will prepare the samples for the Physics TDR
   –   Progressive shakedown of tools and centers
        • All centers taking part in the challenge should participate in the pre-challenge
   – The Physics TDR and the Challenge depend on its successful completion



• Challenge                        (2004, Q1,Q2)                      (May fail, i.e. not be completed on schedule)
   – Reconstruction at “T0” (CERN)
   – Distribution to “T1s”
        • Subsequent distribution to “T2s”


       Pre-challenge Resource Needs
• Simulation: 100 TB – 5 months – 1000 CPUs
• Digitisation: 75 TB – 2 months – 150 CPUs
  – An 800 MHz P3 is rated at 33 SI95
  – Working assumption: most farms will be at 50 SI95/CPU in late 2003



           Challenge Resource Needs
• Reconstruction at CERN: 25 TB – 1 month – 460 CPUs @ 50 SI95/CPU

• World-wide distributed analysis
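
The CPU counts above can be converted into kSI95.Months, the capacity unit used elsewhere in this talk, and into equivalent 800 MHz P3 machines. The sketch uses only the numbers on the slide; applying the 50 SI95/CPU working assumption to the simulation and digitisation steps as well as to reconstruction is an assumption made here.

```python
SI95_PER_CPU = 50   # working assumption from the slide for late 2003
SI95_PER_P3 = 33    # 800 MHz P3 rating from the slide


def ksi95_months(cpus, months, si95_per_cpu=SI95_PER_CPU):
    """Integrated CPU capacity in kSI95.Months."""
    return cpus * si95_per_cpu * months / 1000.0


def equivalent_p3_cpus(cpus, si95_per_cpu=SI95_PER_CPU):
    """Number of 800 MHz P3 CPUs delivering the same SI95 rating."""
    return cpus * si95_per_cpu / SI95_PER_P3


# step -> (CPUs, months), copied from the slide
steps = {
    "Simulation": (1000, 5),
    "Digitisation": (150, 2),
    "Reconstruction": (460, 1),
}

for name, (cpus, months) in steps.items():
    print(f"{name:>14}: {ksi95_months(cpus, months):6.1f} kSI95.Months, "
          f"about {equivalent_p3_cpus(cpus):.0f} P3-equivalent CPUs")
```

Simulation dominates at 250 kSI95.Months, against 15 for digitisation and 23 for reconstruction.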

         Summary and Conclusions

• Very successful MC Production
  – 20 TB of data delivered on time to the Physicists
  – Smooth production over 4 months
  – 20 production sites, 30 persons
• More automation for next Data Challenge
  – Improvements of our CMS tools
  – Expecting help from GRID tools




                  More Information


• GRID/production Workshop (June 2002)
   http://documents.cern.ch/age?a02826
• “The Spring02 DAQ TDR Production” CMS Note
   CMS-IN 2002/034
• CMS MC Production web page
  – RefDB, BOSS, IMPALA, DAR
  http://cmsdoc.cern.ch/cms/production/www/html/general/index.html




               Acknowledgements

• Thanks to the CERN-IT division for the invitation to give
  this talk
• Thanks to David Stickland and Tony Wildish for letting me
  present it
• Thanks to the whole CMS Production Team for achieving
  these nice results, and to everyone on the CERN
  CASTOR, Tape, Objectivity, LSF, AFS and CMS support
  lists!





								