                          CMS at CNAF Tier-1:
                       Pre-Challenge Production
                                and
                          Data Challenge 04

            D.Bonacorsi (CNAF/INFN and CMS Bologna)
                                         Outline
    Introductory overview:

         CMS Pre-Challenge Production (PCP)
         Data Challenge (DC04)
          ideas, set-up, results: main focus on INFN contribution



    DC04: the lessons learned
          post-mortem analysis in progress: preliminary results


    the CNAF Tier-1 role and experience in PCP-DC04

        on the above items, discussion on successes/failures,
         problems/solutions/workarounds, still open issues…


III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004   D.Bonacorsi   2
                  CMS PCP-DC04 overview
  Validation of CMS computing model
   large scale test of the computing/analysis models on a sufficient number of T0, T1, T2 sites

 • Pre-Challenge Production: PCP (Jul. 03 - Feb. 04)
      – Simulation and digitization of the data samples needed as input for the DC
      – Transfer to CERN: ~1 TB/day for several months
      – PCP strategy:
          • "it could not fail", so mainly non-grid production…
          • …but use grid prototypes (CMS/LCG-0, LCG-1, Grid3)
   ~70M Monte Carlo events (20M with Geant4) produced,
    750K jobs, 3500 KSI2000 months, 80 TB of data
   Digitization still going on "in background"
   [side diagram: PCP = Generation → Simulation → Digitization; DC04 = Reconstruction → Analysis]

 • Data Challenge: DC04 (Mar. - Apr. 04)
      – Reconstruction and analysis on CMS data sustained over 2 months,
        at 5% of the LHC rate at full luminosity (~25% of the start-up luminosity rate)
      – Data distribution to Tier-1, Tier-2 sites
      – DC strategy:
          •   sustain a 25 Hz reconstruction rate in the Tier-0 farm
          •   register data and metadata to a world-readable catalogue
          •   transfer reconstructed data from Tier-0 to Tier-1 centers
          •   analyze reconstructed data at the Tier-1's as they arrive
          •   publicize to the community the data produced at Tier-1's
          •   monitor and archive resources and process info

 Not a CPU challenge, but aimed at demonstrating the feasibility of the full chain
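
 A back-of-the-envelope check of the scales quoted above (simple arithmetic on the round numbers from this slide, so the derived figures are only approximate):

    # Rough PCP/DC04 scale estimates from the numbers quoted on this slide.
    pcp_events  = 70e6     # ~70M Monte Carlo events produced in PCP
    pcp_jobs    = 750e3    # ~750K jobs
    pcp_data_tb = 80.0     # ~80 TB of data

    print("events per PCP job: ~%.0f" % (pcp_events / pcp_jobs))              # ~93
    print("data per event:     ~%.1f MB" % (pcp_data_tb * 1e6 / pcp_events))  # ~1.1 MB

    dc04_rate_hz = 25.0    # sustained Tier-0 reconstruction rate
    print("events/day at 25 Hz: ~%.1fM" % (dc04_rate_hz * 86400 / 1e6))       # ~2.2M/day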
III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004        D.Bonacorsi               3
               CMS ‘permanent’ production


• since 2003, higher scale w.r.t. the past
• ~10 months of continuous running

 → no more time for significant development processes
   between production cycles

 → fixing an always-running engine..

[Plot: cumulative production vs. time, with the 'Spring02 prod' and 'Summer02 prod' cycles,
 the CMKIN / CMSIM + OSCAR / Digitisation phases, and the PCP and DC04 start dates marked]

   The system is evolving into a permanent production effort…

 Strong contribution of INFN and the CNAF Tier-1 to CMS past & future productions:
 252 assignments (assid's) in PCP-DC04, covering all production steps, both local and (when possible) on the Grid

III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004               D.Bonacorsi           4
                  PCP set-up: a hybrid model
                                                                           by C.Grandi
  [Diagram of the hybrid PCP production model:
   - a Physics Group asks for a new dataset; the Production Manager defines assignments in RefDB;
   - a Site Manager starts an assignment: McRunjob + plug-in (CMSProd) prepares the jobs;
   - jobs are submitted either via shell scripts to the local batch manager (computer farm),
     via JDL to the Grid (LCG) scheduler (CMS/LCG-0/1), or via DAG/DAGMan (MOP) to Grid3
     (with Chimera VDL feeding a Virtual Data Catalogue and Planner);
   - BOSS DB serves job-level queries, RefDB serves data-level queries; dataset and job metadata go to RLS;
   - arrows distinguish "push data or info" from "pull info".
   A dispatch sketch follows below.]
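
 A minimal sketch of the 'hybrid' dispatch idea shown in the diagram: the same prepared job wrapper can be handed to a local batch manager, to the LCG scheduler, or to Grid3. The submission commands (in angle brackets) and the JDL fields are illustrative assumptions, not the actual McRunjob/CMSProd plug-in code:

    # Sketch of the hybrid dispatch: one prepared job, several back-ends.
    def make_jdl(wrapper, in_files):
        # Minimal JDL-like text for submission to an LCG Resource Broker.
        sandbox = ", ".join('"%s"' % f for f in in_files)
        lines = [
            'Executable    = "%s";' % wrapper,
            'StdOutput     = "job.out";',
            'StdError      = "job.err";',
            'InputSandbox  = {%s};' % sandbox,
            'OutputSandbox = {"job.out", "job.err"};',
        ]
        return "\n".join(lines) + "\n"

    def submit(wrapper, backend, in_files=()):
        # Return the submission command for the chosen back-end (commands in
        # angle brackets are placeholders for the site- or grid-specific CLI).
        if backend == "local":
            return ["<local-batch-submit>", wrapper]      # shell script to the local batch manager
        if backend == "lcg":
            with open("job.jdl", "w") as f:
                f.write(make_jdl(wrapper, list(in_files)))
            return ["<lcg-rb-submit>", "job.jdl"]         # JDL handed to the LCG scheduler
        if backend == "grid3":
            return ["<dagman-submit>", wrapper]           # DAG handed to DAGMan (MOP)
        raise ValueError("unknown back-end: %s" % backend)

    print(submit("cmsim_run.sh", "lcg", ["cmsim_run.sh", "params.txt"]))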
III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004                D.Bonacorsi              5
              PCP @ INFN statistics (1/4)
                                                                    CMS production steps:
                                                                    Generation
                                                                    Simulation
                                                                    ooHitformatting
                                                                    Digitisation

   [Plots: Generation step, all CMS and INFN only; the INFN activity of Jun – mid-Aug 03
    contributes to this slope]

  ~79 Mevts in CMS
  ~9.9 Mevts (~13%) done by INFN (strong contribution by LNL)
III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004    D.Bonacorsi            6
              PCP @ INFN statistics (2/4)
                                                                    CMS production steps:
                                                                    Generation
                                                                    Simulation
                                                                    ooHitformatting
                                                                    Digitisation

   [Plots: Simulation step [CMSIM+OSCAR], all CMS and INFN only; Jul – Sep 03]


 ~75 Mevts in CMS
 ~10.4 Mevts (~14%) done by INFN (strong contribution by CNAF T1+LNL)

III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004    D.Bonacorsi            7
              PCP @ INFN statistics (3/4)
                                                                    CMS production steps:
                                                                    Generation
                                                                    Simulation
                                                                    ooHitformatting
                                                                    Digitisation

   [Plots: ooHitformatting step, all CMS and INFN only; Dec 03 – end-Feb 04]


  ~37 Mevts in CMS
  ~7.8 Mevts (~21%) done by INFN

III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004    D.Bonacorsi            8
              PCP @ INFN statistics (4/4)
                                                                    CMS production steps:
                                                                    Generation
                                                                    Simulation
                                                                    ooHitformatting
                                                                    Digitisation (continued through the DC!)

                                                                    Note:
                                                                    strong contribution to all steps by CNAF T1,
                                                                    but only outside DC04 (during the DC it was
                                                                    too hard for CNAF T1 to also act as a Regional Centre!!)

   [Plots: 2x10^33 digitisation step, all CMS and INFN only; Feb 04 – May 04, with the DC04 window marked]

  ~43 Mevts in CMS
  ~7.8 Mevts (~18%) done by INFN
III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004              D.Bonacorsi                9
                  PCP grid-based prototypes
   Constant work of integration in CMS between:
              CMS software and production tools
              the evolving EDG-X → LCG-Y middleware

   in several phases:
      CMS “Stress Test” stressing EDG < 1.4, then:
      PCP on the CMS/LCG-0 testbed
      PCP on LCG-1
   … towards DC04 with LCG-2


 EU-CMS: submit to LCG scheduler
  CMS-LCG “virtual” Regional Center
 0.5 Mevts Generation [“heavy” pythia]
  (~2000 jobs of ~8 hours* each, ~10 KSI2000 months)
 ~2.1 Mevts Simulation [CMSIM+OSCAR]
  (~8500 jobs of ~10 hours* each, ~130 KSI2000 months)
  (CMSIM: ~1.5 Mevts on CMS/LCG-0; OSCAR: ~0.6 Mevts on LCG-1)
 ~2 TB data
                                            * on a PIII 1 GHz

III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004         D.Bonacorsi        10
                         Global DC04 layout
                                                                        by C.Grandi


   [Diagram of the global DC04 layout:
    - Tier-0: fake on-line process feeding Castor, RefDB, IB and GDB, ORCA RECO jobs,
      Export Buffers (EB), TMDB, POOL RLS catalogue;
    - data distribution agents and LCG-2 services connect the Tier-0 to the Tier-1's;
    - Tier-1's: local agents, MSS and T1 storage, ORCA local jobs, ORCA analysis Grid jobs;
    - Tier-2's: T2 storage, ORCA analysis Grid jobs, physicists.]

III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004              D.Bonacorsi          11
                    DC04 key points (1/2)
                                                                   Adapted from a talk by C.Grandi,
    • Maximize reconstruction efficiency                           EGEE Cork Conf, April 19th, 2004
          no interactions of Tier-0 jobs with outside components

    • Automatic registration and distribution of data
          via a set of loosely coupled agents

    • Support a (reasonable) variety of different data transfer tools and set-up
          SRB (RAL, GridKA, Lyon, with Castor, HPSS and Tivoli SE)
          LCG-2 Replica Manager (CNAF, PIC, with Castor-SE)       see later
          SRM (FNAL, with dCache/Enstore)
        and this is reflected in 3 different Export Buffers at CERN (T0 → T1's)

    • Use a single global file catalogue (accessible from all T1’s)
          RLS used for data and metadata (POOL) by all transfer tools

    • Key role of the Transfer Management DB (TMDB)
          a context for agent inter-communication (a minimal agent sketch follows after this list)

    • Failover systems and automatic recovery

III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004    D.Bonacorsi              12
                    DC04 key points (2/2)

  • Redundant monitor/archive of info on resources and processes:
         MonaLisa used on almost all resources

         GridICE used on all LCG resources

         LEMON on all IT resources

         Ad-hoc monitoring of TMDB information



  • Strategy for job submission at RCs left to their choice
        e.g. LCG-2 in Italy/Spain                                    see later




  • Data availability for user analysis: work in progress…
        grant data access to users
        prototyping user analysis models on LCG…


III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004   D.Bonacorsi    13
                  LCG-2 components in DC
  Used at the CNAF/PIC T1s for the full DC04 chain (except the T0 reconstruction)              (see ‘DC strategy’ on slide 3)

  register all data and metadata to a world-readable catalogue
     • official RLS service provided by CERN/IT with Oracle backend
     • ~570k files registered (each with 5-10 PFN’s and 9 metadata attributes)
     • Good performance as a global file catalogue:
           Registration of files by both production and all transfer tools
           Fast enough if using the appropriate tools (e.g. LRC C++ API)
     • Bad performance as a global metadata catalogue (up to ~1 KB of metadata per file)
     • RLS replica at CNAF (Oracle multi-master mirroring): may be tested in the near future..

  transfer the reconstructed data from Tier-0 to Tier-1 centers
      • full data transfer chain implemented using LCG-2 Replica Manager between LCG-2 SEs
         (Castor-SEs at CNAF/PIC, classic disk-SEs at CERN (Export Buffer), CNAF, LNL, PIC)

  analyze the reconstructed data at the Tier-1’s as data arrive
     • dedicated bdII (EIS+LCG) → CMS may add resources and/or remove problematic sites
     • real-time analysis with dedicated RB (using above bdII) + standard RBs (CNAF,LNL,PIC)
     • 2 UIs at CNAF, UIs also at LNL,PIC
     • CE+WNs at CNAF, LNL, PIC+CIEMAT

  publicize to the community the data produced at Tier-1’s
     • in progress but straightforward using the usual Replica Manager tools
  end-user analysis at the Tier-2’s (not really a DC04 milestone)
     • first attempts succeeded, more in progress
  monitor and archive resource and process information
    • dedicated GridICE server deployed at CNAF and monitoring all involved LCG-2 resources
III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004               D.Bonacorsi                    14
                 DC04: preliminary results
 Post-mortem analysis is in progress (see later for CNAF T1), nevertheless in general:

 •   the full chain was demonstrated, but only for a limited amount of time
       – reconstruction/data-transfer/analysis may run at 25 Hz
       – at T0: 2200 running jobs/day (on ~500 CPU’s), 4 MB/s produced and
          distributed to each Tier-1, 0.4 files/s registered to RLS (with POOL metadata)

 •   T1 performance differed, related to architectural choices on the EB
       – the data transfer infrastructure (LCG-2 system with a classical SE-EB)
          used by CNAF and PIC showed very good performance

 •   main areas for future improvements have been identified
       – Reduce number of files (i.e. increase <#events>/<#files>)
            • more efficient use of bandwidth
            • fixed time to “start-up” dominates command execution times (e.g. java in replicas..)
            • address scalability of MSS systems
       – Improve handling of file metadata in catalogues
            • RLS too slow both inserting and extracting full file records
            • introduce the concept of “file-set” to support bulk operations

 •   real-time analysis during DC04 was demonstrated to be possible!
       – But: need a clean environment!
       – ~ 15k jobs submitted via LCG-2 ran through the system
             • problems, e.g. filling up the RB disk space with sandboxes of about 20 MB
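
 The “fewer, larger files” point above can be made quantitative with a toy model: if every file pays a fixed per-command start-up cost (e.g. a java CLI invocation) on top of the actual transfer time, the total time drops sharply as events are packed into larger files or handled through bulk (“file-set”) operations. All numbers below are illustrative assumptions, not DC04 measurements:

    # Toy model: time to handle N events split into files of n events each, with a
    # fixed per-command start-up cost and a per-MB transfer cost (assumed values).
    def total_time(n_events, events_per_file, startup_s=5.0,
                   mb_per_event=1.1, mb_per_s=40.0):
        n_files = n_events / float(events_per_file)
        per_file = startup_s + (events_per_file * mb_per_event) / mb_per_s
        return n_files * per_file

    for epf in (1, 10, 100, 1000):
        print("%5d events/file -> ~%6.0f hours" % (epf, total_time(1e6, epf) / 3600))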



III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004               D.Bonacorsi           15
                                                part 2:


       CNAF Tier-1 role in DC04
       (and experience gained)




III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004   D.Bonacorsi   16
          CMS resources at CNAF T1 in DC04

  CNAF Tier-1 is young, new resources in the brand new Tier-1 hall in Bologna
      set-up and tests during PCP digitization!
      CMS DC04 was an important Data Challenge experience at CNAF Tier-1


 •    Storage: a Castor SE (disk buffer: 4 TB) and a classical disk-only SE (4 TB)
        more could be made available, but there was no need

 •    Network: CNAF interconnected to the GARR-B backbone at 1 Gbps, e.g. from the
       Export Buffer to the T1 SEs (Fast Ethernet only on the CE/WN LAN)

 •    CPUs: 43 boxes allocated to CMS + ~25 shared; SMicro motherboards, dual Xeon
       2.4 GHz, hyper-threading on, 2x60 GB IDE disks, 2 GB RAM
        1. CPU power for PCP and real-time analysis in DC04
         2. services needed for the DC

 •    First experience with the new tape library STK L5500 with IBM LTO2 drives



                        ( all that follows is adapted from 2 talks by D.Bonacorsi, CMS-CPT week, May 12th, 2004: see references)

III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004                      D.Bonacorsi                  17
                  DC04 workflow for INFN
   [Diagram of the DC04 workflow for INFN:
    - disk-SE Export Buffers at the T0 feed the CNAF T1 Castor SE via the TRA-Agent;
    - the Castor SE (with the tape library behind it) is replicated by the REP-Agent
      to the T1 disk-SE and to the Legnaro T2 disk-SE;
    - the SAFE-Agent checks tape migration;
    - the agents query/update the Transfer Management DB and a local MySQL DB
      (legend: data flow / query db / update db).]
   TRA-Agent: T0 (SE-EB) → T1 (Castor) Transfer Agent
   REP-Agent: T1 (Castor) → T1 (disk) / T2 (disk) Replica Agent
   SAFE-Agent: migration-check procedure
III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004     D.Bonacorsi        18
             CNAF T1 data transfer agents
      use of the LCG-2 Replica Manager for data transfer
      modular infrastructure → versatile management of the data flow (see ‘…’)


     TRA-Agent:          T0 (SE-EB) → T1 (Castor) Transfer Agent

     • task: manage the main T1 data flow from the SE-EB, supposed to be up & running 24/7
     • real-time reconfiguration of transfer parameters possible
     • implementation: C++, ran on the “shared” LCG-2 UI

     REP-Agent:          T1 (Castor) → T1 (disk) / T2 (disk) Replica Agent

     • task: replicate files to disk-SEs for fake analysis; the target can be T1/T2/both
     • agent code able to:
          guarantee automatic file organization at the destination
          run an advertising procedure to inform the “fake-analysis people” about new files
     • agent “cloned” to deal with Castor stager scalability problems
     • implementation: C++, ran on the “CMS-only” LCG-2 UI (the same used for analysis at CNAF)

    SAFE-Agent:          migration-check procedure

     • task: check whether migration to tape occurred and label the files as SAFE in the TMDB
     • rfio/nsls commands are less verbose in the tapeserver logs w.r.t. the edg-gridftp-X family,
       so the procedure was completely re-engineered in the middle of DC04 (as a separate agent)
    • implementation: Perl cron job on LCG2 UI “shared”
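
 A minimal sketch of one SAFE-Agent pass (assuming the CASTOR convention that `nsls -l` prints an 'm' as the first character for files that already have a tape copy; the TMDB update is left as a placeholder callback):

    import subprocess

    def is_migrated(castor_path):
        # Assumption: CASTOR's `nsls -l` starts the line with 'm' once the file
        # has a copy on tape, and with '-' while it is still disk-only.
        out = subprocess.run(["nsls", "-l", castor_path],
                             capture_output=True, text=True, check=True).stdout
        return out.lstrip().startswith("m")

    def safe_agent_pass(files_at_t1, mark_safe):
        # One cron pass: label as SAFE (in the TMDB) every file already on tape.
        for castor_path in files_at_t1:
            try:
                if is_migrated(castor_path):
                    mark_safe(castor_path)        # placeholder for the TMDB update
            except subprocess.CalledProcessError:
                pass                              # name server busy: retry on the next run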
III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004     D.Bonacorsi                19
              DC data transfer to CNAF T1
    • first Validation Data arrived at CNAF Tier-1 on Saturday, March 6th, 17:23 CERN time
     • last files arrived at CNAF Tier-1 on Sunday, May 2nd, 06:43 CERN time


   [Plot: daily number of files and data volume transferred to CNAF T1.
    DC04 data time window: 51 (+3) days, March 11th – May 3rd;
    the final spike is the exercise with ‘big’ files (see later for details)]


      A total of >500k files and ~6 TB of data transferred T0 → CNAF T1
            • max number of files per day: ~45000, on March 31st
            • max size per day: ~400 GB, on March 13th (>700 GB considering the “Zips”)

III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004            D.Bonacorsi                    20
          An example:         T0 ramp-up to 200 jobs


   The story: the T0 ramped up production, and the increased rate was predicted
   to show up in the EBs late at night. Here you see it for both the SE-EB
   and e.g. the CNAF Castor SE.

   [Plots: SE-EB; Castor SE at CNAF T1: eth I/O and CPU load]
III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004          D.Bonacorsi           21
            An example:         increasing the T0 rate

    Just an example of how the data transfer system kept pace:
      analysis of the chain SE-EB → CNAF Castor SE during the last week of DC04

   [Plots: SE-EB and CNAF Castor SE activity, Apr 25 – May 2]


III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004            D.Bonacorsi   22
             An example:        Replicas to disk-SEs

   [Plots, for a single day (Apr 19th):
    - CNAF T1 Castor SE: eth I/O input from the SE-EB, TCP connections, RAM memory;
    - CNAF T1 disk-SE: eth I/O input from the Castor SE (green);
    - Legnaro T2 disk-SE: eth I/O input from the Castor SE]
III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004        D.Bonacorsi      23
           Data management in the chain
             disk-SE EB → CNAF/PIC T1’s
  Data transfer between LCG-2 SEs
  Set-up:          Export Buffer at Tier-0 with classical disk-based SE
                         3 SE machines with 1 TB each
                   a Castor-SE at CNAF T1 and at PIC T1
                         but different underlying MSS solution
  Data transfer tool:
  CNAF: Replica Manager CLI (+ LRC C++ API for listReplicas only)
         copy a file and inherently register it in the RLS, with file-size info stored in the
          PFN ‘size’ attribute (i.e. in the LRC): used to check the success of each replica
             overhead introduced by the CLI java processes
             repetition of failed replicas is needed → effect on transfer efficiency

  PIC: globus-url-copy + LRC C++ API
        copy a file then register to the RLS (‘add PFN’), no file-size check
            faster
            less safe regarding the quality-check of replica operations?
    RM looked “safer-but-slower”
      i.e. offers more warranty against failed replicas, but this has a price..

   Anyway, both the CNAF and PIC strategies offered good performance (both are sketched below)
         throughput was mainly limited by the size of the files, not by the transfer tools
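
 A sketch of the two strategies, with the transfer and catalogue calls kept abstract (the point illustrated is the size check against the PFN ‘size’ attribute and the retry, not the exact CLI syntax):

    # Two transfer strategies, as used in DC04 (abstract callables, illustrative only).
    def transfer_cnaf(src, dst, lfn, copy_and_register, get_catalogue_size, local_size):
        # CNAF: copy + register in one step, then verify via the 'size' attribute
        # stored with the PFN in the LRC; repeat the replica on mismatch/failure.
        for attempt in range(3):
            if copy_and_register(src, dst, lfn) and get_catalogue_size(dst) == local_size(src):
                return True                      # replica verified
        return False                             # give up, leave for manual recovery

    def transfer_pic(src, dst, lfn, gridftp_copy, add_pfn):
        # PIC: plain gridftp copy, then an 'add PFN' registration, no size check:
        # faster per file, but a truncated replica would go unnoticed at this stage.
        if gridftp_copy(src, dst):
            add_pfn(lfn, dst)
            return True
        return False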
III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004         D.Bonacorsi            24
            CNAF “autopsy” of DC04:                                          1. Lethal injuries

   Agents drain data from the SE-EB down to CNAF T1, and the files ‘land’
   directly on a Castor SE buffer
   → it so happened that in DC04 these files were many and small

   So: for any file on the Castor SE filesystem, a tape migration is
   foreseen with a given policy, regardless of file size/number..

   → this strongly affected data transfer at CNAF T1
          (the MSS below is the STK L5500 tape library with IBM LTO-2 tapes)

    Castor stager scalability issues               (more info: Castor ticket CT204339)
       many small files (mostly 500 B - 50 kB) → bad performance of the stager db for
       >300-400k entries (may need more RAM?)
       • CNAF fast set-up of an additional stager in DC04: basically worked
       • REP-Agent cloned to transparently continue replication to disk-SEs

    tape library LTO-2 issues             (more info: Castor ticket CT206668)
       high number of segments on tape → bad tape read/write performance, LTO-2 SCSI errors,
       repositioning failures, slow migration to tape and delays in the TMDB “SAFE”-labelling,
       tapes often labelled READONLY → inefficient tape space usage

  A-posteriori question: maybe a disk-based Import Buffer in front of the MSS?
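
 One way to read the Import Buffer question: small files could be collected on disk and packed into larger bundles before they ever reach the Castor stager and the tape drives. A minimal sketch of such grouping (the ~1 GB target and the naming scheme are assumptions):

    # Sketch: pack many small files into ~1 GB tar bundles before tape migration,
    # so the stager and the LTO-2 drives see few large files instead of many tiny ones.
    import os, tarfile

    def make_bundles(import_buffer_dir, out_dir, target_bytes=1024**3):
        bundle, size, idx = [], 0, 0
        for name in sorted(os.listdir(import_buffer_dir)):
            path = os.path.join(import_buffer_dir, name)
            bundle.append(path)
            size += os.path.getsize(path)
            if size >= target_bytes:
                _write(bundle, os.path.join(out_dir, "bundle_%04d.tar" % idx))
                bundle, size, idx = [], 0, idx + 1
        if bundle:                                   # flush the last partial bundle
            _write(bundle, os.path.join(out_dir, "bundle_%04d.tar" % idx))

    def _write(paths, tar_path):
        with tarfile.open(tar_path, "w") as tar:     # no compression: data is already binary
            for p in paths:
                tar.add(p, arcname=os.path.basename(p))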

III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004                D.Bonacorsi     25
            CNAF “autopsy” of DC04:                                   2. Non-lethal injuries


    minor (?) Castor/tape-library issues
          Castor filename length (more info: Castor ticket CT196717)
          ext3 file-system corruption on a partition of the old stager
          tapes blocked in the library
       (at this scale the debugging is forced to adopt a statistical approach..)

    several crashes/hangs of the TRA-Agent               (rate: ~3 times per week)
          created some backlogs from time to time, nevertheless fast to recover
            post-mortem analysis in progress

    experience with the Replica Manager interface
         e.g. files of size 0 created at the destination when trying to replicate, from the Castor SE,
         data which are temporarily not accessible because of stager (or other) problems on the Castor side
           needs further tests to achieve reproducibility, and then Savannah reports

   Globus-MDS Information System instabilities                 (rate: ~ once per week)
        some temporary stop of data transfer (i.e. ‘no SE found’ means ‘no replicas’)

   RLS instabilities    (rate: ~ once per week)
         some temporary stops of data transfer (could neither list replicas nor (de)register files)



III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004         D.Bonacorsi          26
          Network: transferring ‘big’ files
                >3k files for >750 GB just at the end of DC04

   [Plot: CNAF T1 SE eth I/O]

  Link utilization above 80% during the whole final DC04 data transfer exercise.

   [Plot: global CNAF network activity monitored by NOC-GARR, May 1st – May 2nd:
    ~340 Mbps (>42 MB/s) sustained for ~5 hours (max was 383.8 Mbps)]
• the network is not a problem: the throughput was limited by the small size of files
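
 A quick consistency check of the quoted figures:

    # 340 Mbps sustained for ~5 hours, compared with the >750 GB moved in the
    # 'big files' exercise (numbers as quoted on this slide).
    mb_per_s = 340.0 / 8.0                     # ~42.5 MB/s, i.e. the ">42 MB/s" above
    gb_moved = mb_per_s * 5 * 3600 / 1024
    print("~%.0f MB/s for ~5 h -> ~%.0f GB" % (mb_per_s, gb_moved))   # ~747 GB

 so the ~5-hour plateau alone accounts for essentially all of the >750 GB moved in the exercise.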
III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004       D.Bonacorsi            27
                Summary on CNAF T1 role
                    in CMS DC04
  •   CNAF-INFN is a very young T1, nevertheless worked basically fine and
      enjoyed the race all through DC04, from the first Validation Data file received
      on Saturday, March 6th, 17:23 CERN time, until the successful end of the “big”
      files exercise on 1st-3rd May.

            much effort and much learning, many key items raised for the future

 •    Positive know-how gained in agent coding, configuration and optimization,
      and in the overall data transfer infrastructure set-up and management
 •    Useful experience about CMS interfacing to Tier-1 structure at CNAF
 •    Good results in the LCG2-based data transfer, in the real-time analysis effort
      and in the network tests with big files in the final rush

 •    Typically, problems were of two types:

            things one may expect in a DC (stretch a system, something breaks)
            Castor SEs operational at CNAF, but data transfer affected by Castor issues
             related to the underlying MSS
                → what should sit in front of the MSS in the future?

   Towards user analysis! Data access and end-user analysis are the hot topics now..
    → all RCs involved!
III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004   D.Bonacorsi      28
                        General conclusions

   CMS production world-wide evolving into a ‘permanent’ trend

   Lessons from DC04 (so far):
      full chain demonstrated, but only for a limited amount of time
         several exercises successful and gave important feedback
         know-how raised and areas for future improvements identified
        DC04 post-mortem goes on…
         “is what we think happened really what happened?”
               collate and correlate information from logs
               generate statistical models of components

   very high level of ‘baby-sitting’ of activity..
         thanks to CNAF T1 personnel, LCG, EIS, IT DB @ Cern !


   Thanks for the “stolen-and-revised” slides to:
  P.Capiluppi, C.Grandi, A.Fanfani, T.Wildish, D.Stickland

  References:       Agenda + slides from CMS CPT Week (“DC04-review day”):
                    http://agenda.cern.ch/fullAgenda.php?ida=a041651

III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004   D.Bonacorsi   29
                           Back-up slides




III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004   D.Bonacorsi   30
                       CMS production tools

 • CMS production tools (OCTOPUS)

       – RefDB
             • Contains production requests with all needed parameters to
               produce the dataset and the details about the production process


       – MCRunJob
             • Tool/framework for job preparation and job submission
             • Evolution of IMPALA: more modular (plug-in approach)


       – BOSS
              • Real-time job-dependent parameter tracking. The running job’s
                standard output/error are intercepted, and the filtered information is
                stored in the BOSS database. The remote updater is based on MySQL.
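
 A minimal sketch of the “intercept and filter” idea (the regular expressions and the storage callback are placeholders; the real BOSS filters are job-type specific):

    # Sketch of BOSS-style output filtering: scan a job's stdout line by line,
    # extract a few job-dependent parameters and hand them to a DB updater.
    import re

    FILTERS = {
        "events_done": re.compile(r"processed\s+(\d+)\s+events"),
        "output_file": re.compile(r"output file:\s+(\S+)"),
    }

    def filter_stdout(lines, store):
        for line in lines:
            for key, rx in FILTERS.items():
                m = rx.search(line)
                if m:
                    store(key, m.group(1))   # e.g. an UPDATE on the BOSS MySQL DB

    filter_stdout(["... processed 250 events ...",
                   "output file: hits_0042.root"],
                  lambda k, v: print(k, "=", v))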


III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004   D.Bonacorsi   31
                Only CMSIM INFN                                    Only OSCAR INFN




III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004     D.Bonacorsi     32
             Only CMSIM CMS-LCG                                    Only OSCAR CMS-LCG




III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004       D.Bonacorsi      33
                        CMS/LCG-0 testbed

   Joint project CMS-LCG-EDT

    a CMS-wide testbed based on the LCG pilot distribution, owned by CMS, with
       additional components installed with the help of LCG EIS

   Based on LCG pilot distribution
   (using GLUE, VOMS, GridICE, RLS)
   About 170 CPU’s and 4 TB disk
    Sites: Bari, Bologna, Bristol, CERN, CNAF, Ecole Polytechnique, Legnaro,
    NCU-Taiwan, Padova

    Milano, U.Iowa, ISLAMABAD-NCP, IC, Brunel only in the deployment phase




III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004   D.Bonacorsi   34
                 Description of RLS usage

   [Diagram of RLS usage in DC04:
    1. an XML Publication Agent registers files in the POOL RLS catalogue;
    2. a Configuration agent finds the Tier-1 location based on metadata;
    3. RM/SRM/SRB EB agents copy/delete files to/from the Export Buffers;
    4. Tier-1 transfer agents copy files to the Tier-1's (via Replica Manager or
       SRB/GMCAT, updating the TMDB and optionally a local POOL catalogue);
    5. analysis jobs are submitted through the Resource Broker;
    6. ORCA analysis jobs on LCG process the DSTs and register their private data.
    The Oracle backend is mirrored to an RLS replica at CNAF.]

    Specific client tools: POOL CLI, Replica Manager CLI, C++ LRC API based
    programs, LRC java API tools (SRB/GMCAT), Resource Broker

III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004               D.Bonacorsi               35
    RLS used as a POOL catalogue
          –    Register files produced at Tier-0 with their POOL metadata
               (converting POOL XML catalogue into the RLS)
          –    Query metadata to determine which Tier-1 to send files to
          –    Register/delete physical location of files on Tier-0 Export
               Buffers
          –    Transfer tools use catalogue to replicate files to Tier-1’s
                • Local POOL catalogues at Tier-1’s are optionally populated
          –    Analysis jobs on LCG use the RLS through the Resource Broker
               to submit jobs close to the data
          –    Analysis jobs on LCG register their private data
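
 A schematic of the first step above, walking a POOL XML catalogue and registering each entry; the XML layout assumed here is a simplified version of a POOL catalogue, and `register` stands in for the real client tool:

    # Schematic: walk a POOL-style XML file catalogue and register each entry
    # (GUID, PFNs, LFNs, metadata) into the RLS via a caller-supplied function.
    import xml.etree.ElementTree as ET

    def publish_catalogue(xml_path, register):
        root = ET.parse(xml_path).getroot()
        for f in root.iter("File"):
            guid = f.get("ID")
            pfns = [p.get("name") for p in f.iter("pfn")]
            lfns = [l.get("name") for l in f.iter("lfn")]
            meta = {m.get("att_name"): m.get("att_value") for m in f.iter("metadata")}
            register(guid, pfns, lfns, meta)

    # usage: publish_catalogue("PoolFileCatalog.xml",
    #                          lambda g, p, l, m: print(g, p, l, m))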




III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004   D.Bonacorsi   36
            A disk Import Buffer in front of a MSS in the future?

   [Diagram: SE-EB at the T0 → (TRA-Agent) → a T1 disk SE-IB (Import Buffer), where
    file manipulation and grouping would take place → T1 Castor buffer + LTO-2 tape
    library (checked by the SAFE-Agent); the REP-Agent replicates to the disk-SEs;
    all agents query/update the TMDB (legend: data flow / query db / update db).]

III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004          D.Bonacorsi       37
               CNAF filename length
                     problem

  Castor filename length (actually Castor-Murphy’s law…)
        an old issue… CNAF had replies to CT196717 on Mar 11th
     the CNAF Castor installation is standard, but the staging-area configuration resulted in a
     maximum Castor filename length shorter than the length of the EVD* DC04 filenames
       → a special patched ‘longname’ version of the stager code was installed later for the new stager
     But the DC04 show must go on! → my workaround: store files on the Castor SE with GUIDs as
     filenames, then restore the original filenames when replicating data to the disk-SEs for fake-analysis
       → it worked (a lot of work on the agents’ logic..) and did not affect data transfer performance
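
 The logic of the workaround, sketched with hypothetical helpers and path layouts (the point is only that the GUID is short and fixed-length, while the original basename can be recovered from the LFN at replication time):

    # Sketch of the filename-length workaround: write to the Castor SE using the
    # (short, fixed-length) GUID as the filename, then restore the original name
    # when replicating to a disk-SE.  Helper functions and paths are hypothetical.
    import os

    def castor_pfn(castor_prefix, guid):
        # On Castor the basename is just the GUID, safely below the length limit.
        return "%s/%s" % (castor_prefix, guid)

    def disk_pfn(disk_prefix, lfn):
        # On the disk-SE the original (long) filename from the LFN is restored,
        # so fake-analysis jobs see the names they expect.
        return "%s/%s" % (disk_prefix, os.path.basename(lfn))

    def replicate(guid, lfn, copy, castor_prefix, disk_prefix):
        copy(castor_pfn(castor_prefix, guid), disk_pfn(disk_prefix, lfn))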




III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004      D.Bonacorsi            38
   [Diagrams: TMDB file states for the PIC and CNAF transfer chains:
    NEW, IN_BUFFER, AT_T1, SAFE, CLEANED]
III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004             D.Bonacorsi   39
           Different EB scenarios in DC04

   Classic disk-based SE-EB and LCG-2 Replica Manager
   Basically working. See main talk for details

   SRM+dCache EB at CERN and SRM+dCache interface to the FNAL Enstore MSS
   quite a few sw/hw component failures during DC04:
       • problems at DC start with system failures
       • MSS needed optimization to handle the challenge load
               • needed to increase the number of tape drives to handle the number of files
               • needed to replace the name space server to handle the number of files
         • srmcp client hung for unknown reasons
         • cannot handle bad-file situations at both the source and destination SRM
             • file does not exist..? file with different sizes..?
         • many non-automatic interventions

    SRB+MCat interface with different underlying MSS
    Severe failures for most of the DC time.. mainly related to MCat unreliability
         • needs much investigation and subsequent work..


III Workshop Calcolo-Reti INFN, Castiadas (CA) - May 24-28, 2004   D.Bonacorsi   40

				