
ATLAS Data Challenge on NorduGrid

CHEP2003 – UCSD

Anders Wäänänen <waananen@nbi.dk>

                  NorduGrid project

• Launched in spring of 2001, with the aim of creating a Grid
  infrastructure in the Nordic countries

• Idea to have a MONARC-style architecture with a common Tier-1 center

• Partners from Denmark, Norway, Sweden, and Finland

• Initially meant to be the Nordic branch of the EU DataGrid (EDG) project

• 3 full-time researchers, plus a few externally funded staff




                         Motivations

• NorduGrid was initially meant to be a pure deployment project

• One goal was to have the ATLAS Data Challenge running by May 2002

• Should be based on the Globus Toolkit™

• Available Grid middleware:
  – The Globus Toolkit™
      A toolbox – not a complete solution
  – European DataGrid software
      Not mature for production in the beginning of 2002
      Architecture problems




                                      A Job Submission Example

[Diagram: EDG-style job submission. The User Interface (UI) sends the JDL job
description and an input "sandbox" to the Resource Broker (after
authentication and authorization); the broker consults the Replica Catalogue
and the Information Service, then hands the job to the Job Submission Service
for execution on a Compute Element, with data staged via a Storage Element
and broker information passed along. Job status is tracked by the Logging &
Book-keeping service, which the UI can query, and the output "sandbox" is
returned to the UI.]
                       Architecture requirements

• No single point of failure

• Should be scalable

• Resource owners should have full control over their resources

• As few site requirements as possible:
  – Local cluster installation details should not be dictated
      Method, OS version, configuration, etc.
  – Compute nodes should not be required to be on the public network
  – Clusters need not be dedicated to the Grid



                      User interface

• The NorduGrid user interface provides a set of commands for interacting
  with the Grid (a short usage sketch follows the list):
  – ngsub – submit jobs
  – ngstat – show the status of jobs and clusters
  – ngcat – see stdout/stderr of running jobs
  – ngget – retrieve the results of finished jobs
  – ngkill – kill running jobs
  – ngclean – delete finished jobs from the system
  – ngcopy – copy files to, from, and between file servers and replica catalogs
  – ngremove – delete files from file servers and RCs
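
  As a minimal illustration (not from the original slides; <jobid> stands for
  the job identifier printed by ngsub, and the trivial inline job is the same
  one used on the quick-start slide), a typical session might look like:

      grid-proxy-init                                            # create a Grid proxy from the personal certificate
      ngsub '&(executable=/bin/echo)(arguments="Hello World")'   # submit a trivial job; prints <jobid>
      ngstat  <jobid>     # query the job state
      ngcat   <jobid>     # look at stdout/stderr while it runs
      ngget   <jobid>     # download the results once finished
      ngclean <jobid>     # remove the finished job from the system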



                   ATLAS Data Challenges

• A series of computing challenges within ATLAS of increasing size and
  complexity

• Preparing for data-taking and analysis at the LHC

• Thorough validation of the complete ATLAS software suite

• Introduction and use of Grid middleware as fast and as much as possible




                           Data Challenge 1

• Main goals:
  – Need to produce data for the High Level Trigger & Physics groups
      Study performance of the Athena framework and algorithms for use in the HLT
      High statistics needed
        A few samples of up to 10^7 events in 10-20 days, O(1000) CPUs
        Simulation & pile-up
  – Reconstruction & analysis on a large scale
      Learn about the data model and I/O performance; identify bottlenecks, etc.
  – Data management
      Use/evaluate persistency technology (AthenaRoot I/O)
      Learn about distributed analysis
  – Involvement of sites outside CERN
  – Use of Grid as and when possible and appropriate




                                DC1, phase 1: Task Flow

• Example: one sample of di-jet events
  – PYTHIA event generation: 1.5 × 10^7 events split into partitions (read: ROOT files)
  – Detector simulation: 20 jobs per partition, ZEBRA output


[Task-flow diagram: Pythia6 generates di-jet events, written as Athena-Root I/O
partitions of 10^5 events in HepMC format; each partition feeds several
Atlsim/Geant3 + Filter simulation jobs (about 5000 input events each, of which
roughly 450 pass the filter), producing ZEBRA output with Hits/Digits and
MCTruth. Left side: event generation; right side: detector simulation.]
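
(For scale: 1.5 × 10^7 generated events in partitions of 10^5 events means
150 partitions; at 20 simulation jobs per partition this is roughly 3000 jobs,
each reading about 5000 generated events of which ~450 pass the filter.)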


                       DC1, phase 1: Summary

• July-August 2002

• 39 institutes in 18 countries

• 3200 CPUs, approx. 110 kSI95 – 71 000 CPU-days

• 5 × 10^7 events generated

• 1 × 10^7 events simulated

• 30 TB produced

• 35 000 output files




                         DC1, phase 1 for NorduGrid
       Simulation

• Datasets 2000 & 2003 (different event generation) assigned to NorduGrid

• Total number of fully simulated events: 287 296 (1.15 × 10^7 input events)

• Total output size: 762 GB

• All files uploaded to a Storage Element (University of Oslo) and
  registered in the Replica Catalog




                        Job xRSL script
        &
        (executable="ds2000.sh")
        (arguments="1244")
        (stdout="dc1.002000.simul.01244.hlt.pythia_jet_17.log")
        (join="yes")
        (inputfiles=("ds2000.sh"
          "http://www.nordugrid.org/applications/dc1/2000/dc1.002000.simul.NG.sh"))
        (outputfiles=
         ("atlas.01244.zebra"
          "rc://dc1.uio.no/2000/log/dc1.002000.simul.01244.hlt.pythia_jet_17.zebra")
         ("atlas.01244.his"
          "rc://dc1.uio.no/2000/log/dc1.002000.simul.01244.hlt.pythia_jet_17.his")
         ("dc1.002000.simul.01244.hlt.pythia_jet_17.log"
          "rc://dc1.uio.no/2000/log/dc1.002000.simul.01244.hlt.pythia_jet_17.log")
         ("dc1.002000.simul.01244.hlt.pythia_jet_17.AMI"
          "rc://dc1.uio.no/2000/log/dc1.002000.simul.01244.hlt.pythia_jet_17.AMI")
         ("dc1.002000.simul.01244.hlt.pythia_jet_17.MAG"
          "rc://dc1.uio.no/2000/log/dc1.002000.simul.01244.hlt.pythia_jet_17.MAG"))
        (jobname="dc1.002000.simul.01244.hlt.pythia_jet_17")
        (runtimeEnvironment="DC1-ATLAS")
        (replicacollection="ldap://grid.uio.no:389/lc=ATLAS,rc=NorduGrid,dc=nordugrid,dc=org")
        (maxCPUTime=2000)(maxDisk=1200)
        (notify="e waananen@nbi.dk")


                     NorduGrid job submission

• The user submits an xRSL file specifying the job options

• The xRSL file is processed by the User Interface

• The User Interface queries the NG Information System for resources and the
  NorduGrid Replica Catalog for the location of input files, then submits the
  job to the selected resource

• There the job is processed by the Grid Manager, which downloads or links
  input files into the local session directory

• The Grid Manager submits the job to the local resource management system

• After the simulation finishes, the Grid Manager moves the requested output
  to Storage Elements and registers it in the NorduGrid Replica Catalog
  (a command-line sketch of this flow follows)
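
  For illustration only (the xRSL file name is hypothetical, and reading the
  job description from a file with -f is assumed here; the xRSL can also be
  passed inline as on the quick-start slide):

      ngsub -f dc1.002000.simul.01244.hlt.pythia_jet_17.xrsl   # broker picks a cluster, returns <jobid>
      ngstat <jobid>    # follow the job through the Grid Manager and the local batch system
      ngcat  <jobid>    # inspect the simulation log while the job runs
      # the files listed under (outputfiles=...) are uploaded to the Storage Element
      # and registered in the Replica Catalog by the Grid Manager, not by the user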


                    NorduGrid job submission

[Diagram: the RSL job description goes from the user interface to the
Gatekeeper/GridFTP front end of the selected cluster; the MDS information
system and the Replica Catalog (RC) are consulted for brokering, and on the
cluster the Grid Manager receives the RSL and handles the job locally.]
                  NorduGrid Production sites




[Figure: NorduGrid production sites]
                       NorduGrid Pileup
• DC1, pile-up:
  – Low-luminosity pile-up for the phase 1 events

• Number of jobs: 1300
  – dataset 2000: 300
  – dataset 2003: 1000

• Total output size: 1083 GB
  – dataset 2000: 463 GB
  – dataset 2003: 620 GB




                        Pileup procedure

• Each job downloaded one ZEBRA file from dc1.uio.no, of approximately
  – 900 MB for dataset 2000
  – 400 MB for dataset 2003

• Use locally present minimum-bias ZEBRA files to "pile up" events on top of
  the original simulated ones present in the downloaded file. The output size
  of each file was about 50% bigger than the original downloaded file, i.e.:
  – 1.5 GB for dataset 2000
  – 600 MB for dataset 2003

• Upload output files to the dc1.uio.no and dc2.uio.no SEs
• Register them in the RC (an illustrative command follows below)
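
  As an illustration of the upload step (the local file and rc:// paths below
  are hypothetical, modeled on the URL style of the xRSL slide; in production
  the Grid Manager normally did this via the outputfiles list), a manual
  transfer and registration could use ngcopy:

      ngcopy file://$PWD/dc1.002000.lumi02.01244.zebra \
             rc://dc1.uio.no/2000/lumi02/dc1.002000.lumi02.01244.zebra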




                        Other details

• At peak production, up to 200 jobs were managed by NorduGrid at the same time

• Most of the Scandinavian production clusters participate (2 of them are in
  the Top 500)

• However, not all of them allow installation of ATLAS software

• The ATLAS job manager Atlas Commander supports the NorduGrid toolkit

• Issues:
  – Replica Catalog scalability problems
  – MDS / OpenLDAP hangs – solved
  – Software threading problems – partly solved
      Problems partly in Globus libraries




                       NorduGrid DC1 timeline

• April 5th, 2002
  – First ATLAS job submitted (Athena Hello World)

• May 10th, 2002
  – First pre-DC1 validation job submitted
    (ATLSIM test using ATLAS release 3.0.1)

• End of May 2002
  – By now it was clear that NorduGrid was mature enough to handle real production

• Spring 2003 (now)
  – Keep running Data Challenges and improve the toolkit




                        Quick client installation/job run

• As a normal user (no system privileges required):
  – Retrieve nordugrid-standalone-0.3.17.rh72.i386.tgz
      tar xfz nordugrid-standalone-0.3.17.rh72.i386.tgz
      cd nordugrid-standalone-0.3.17
      source ./setup.sh
  – Get a personal certificate:
      grid-cert-request
  – Install the certificate per instructions
  – Get authorized on a cluster
  – Run a job:
      grid-proxy-init
      ngsub '&(executable=/bin/echo)(arguments="Hello World")'

                        Resources

• Documentation and source code are available for download

• Main Web site:
  – http://www.nordugrid.org/

• ATLAS DC1 with NorduGrid:
  – http://www.nordugrid.org/applications/dc1/

• Software repository:
  – ftp://ftp.nordugrid.org/pub/nordugrid/




                       The NorduGrid core group

• Александр Константинов (Aleksandr Konstantinov)

• Balázs Kónya

• Mattias Ellert

• Оксана Смирнова (Oxana Smirnova)

• Jakob Langgaard Nielsen

• Trond Myklebust

• Anders Wäänänen




