Enabling Grids for E-sciencE

ARDA status

Massimo Lamanna / LCG ARDA

www.eu-egee.org
INFSO-RI-508833
Table of Contents

• Introduction
   – Support material
   – Questions (Template from Matthias)
• Activity
   – Middleware
   – Metadata
   – Prototypes
       ALICE
       ATLAS
       CMS
       LHCb
• Other points
   – Personnel effort
   – Milestones
   – Outlook
Existing material

• Last SC2 presentation:
   – http://lcg.web.cern.ch/lcg/PEB/arda/public_docs/ARDAatSC2.ppt
• SC2: very constructive discussions (T. Doyle and J. Shank)
• Recent relevant LCG presentations:
   – LCG PEB (March 05):
     http://lcg.web.cern.ch/LCG/activities/arda/public_docs/2005/Q1/ARDA-PEBMarch05.ppt
   – LHCC demo (May 05):
     http://lcg.web.cern.ch/LCG/activities/arda/public_docs/2005/Q2/LHCCdemo.ppt
   – OSG meeting (June 05):
     http://lcg.web.cern.ch/LCG/activities/arda/public_docs/2005/Q2/ARDAatOSG_jun05.ppt
• ARDA document page:
   – http://lcg.web.cern.ch/LCG/activities/arda/documents.html
Template

• Presentation of workplan and milestones
   – SC2 requested more detailed and meaningful milestones
• Baseline services:
   – What is available in the current gLite version
   – Development and release plan for future versions
• Evolution of the experiments' grid applications:
   – Fraction of gLite services used, or intended to be used
   – Situation and plans for testing and certification of new versions
   – Use of gLite in SC3/SC4
   – Use of ARDA applications in SC3/SC4
• Presentation of the manpower situation
• Open questions
• Achievements
• Concerns and risks
Middleware

• Mandate
   – Use the gLite middleware proactively in order to provide feedback to the developers and support the migration of the experiments' systems
• Implementation
   – Access to the development test bed
   – Contribution to the testing effort (gLite team, non-HEP EGEE resources)
   – Contribution to the set-up of the preproduction service
   – Use of special installations for detailed tests (e.g. Taipei ARDA test bed, ATLAS-Milano resource broker)
   – Involvement of the team and the experiments' people in meetings/reviews of the gLite middleware and in general discussions (ARDA workshops)
• ARDA metadata activity (AMGA)
Access to the development test bed

• Very positive experience overall
   – Very early access to new software
   – Nice feedback loop
   – All 4 experiments used the system to set up their early prototypes (till the beginning of 2005)
   – Now used to "play" with the middleware (this was always the main goal behind the development test bed: it is not a service!)
• Examples
   – Key components tested within days of becoming available on the development test bed
   – Watching the system since February 05: web results
Contribution to the testing effort

• The ARDA team developed "tests" to investigate basic functionality; in several cases these were passed on (the ideas, sometimes the actual code) as starting examples for the gLite team
• Examples:
   – FiReMan measurements (and comparison with LFC)
       Presented at ACAT and at LCG workshops
   – Data corruption tests using gLiteIO
Contribution to the set-up of the preproduction service

• In the early phase of setting up the PPS, we detected that considerable EGEE application resources were being used for the testing/certification effort without real coordination
   – We therefore provided this coordination:
     http://egee-na4.ct.infn.it/genapps/wiki/index.php/TestsOfGlite
   – A number of certification tests ("job storms") have been ported with ARDA resources
   – 2 tutorials were run by ARDA to speed up the transition of a number of EGEE NA4 people to gLite (they provided more tests for gLite and the PPS)
• The PPS is just becoming available (October 2005)
Use of special installations for detailed tests

• Some tests require full access to the machines where the middleware runs
   – Access to install programs to "spy" on CPU usage, I/O traffic, etc.
   – Access to given hardware to make comparisons
       E.g. LCG2 RB vs gLite RB
   – Trying to crash the system without impacting other users

• Main installations
   – Taipei
   – ATLAS-Milano installation (ATLAS Task Force activity)
       → next slide
Use of special installations for detailed tests: first measurements on gLite 1.4

• First gLite 1.4 WMS performance measurements in Milan
   – gLite 1.4 WMS (4 CPUs, 3 GB memory)
   – 300 simple hello-world jobs submitted by 3 parallel threads
   – Submission rate ~4.1 jobs/sec
   – Dispatching rate ~0.08 jobs/sec
   – All jobs in the bulk are submitted to the same CE: atlasce.lnf.infn.it
• Thanks to Elisabetta Molinari for setting up the gLite WMS in Milan

(LCG/EGEE Taskforce Meeting, 30 Sept. 2005)
Involvement of the team and the experiments' people in meetings/reviews of gLite middleware and general discussions

• 3 ARDA workshops
   – Last in March
• A few ARDA meetings
   – Informal presentations
       Not very successful (but the few seminars were very nice)
• EGEE link
   – ARDA circulated the architecture and design documents to several experiments' experts
   – ARDA suggested that experiments' experts be invited to the relevant EGEE technical fora
• Other activities
   – Participation/presentations at GAG and the UK Metadata Group
   – In addition, ARDA is involved in the Baseline Services working group and the 3D working group
Metadata

• ARDA was built on the assumption that all middleware would be provided by gLite
• Possible exception: the metadata system
   – Key element of every experiment's system (e.g. production systems)
   – ARDA studied both the technology (tests of the experiments' systems) and the interface (interaction with gLite and the HEP community)
       Presentations at GAG, GridPP UK Metadata…
       Good input from gLite
       Eventually the resulting interface was accepted into gLite
       Key role of the working prototype (AMGA)
         • Used by LHCb (bookkeeping system)
         • Presented at the GridPP UK Metadata group
         • Used for technology research (SOAP)
   – Recently the ARDA implementation has been integrated into gLite
       It should appear in gLite 1.5
         • Integration OK; some activity on security started; future integration with the catalogues is possible but not discussed in any detail yet
       Used in other ARDA products providing database functionality
         • Notably in GANGA
         • By many EGEE and other non-HEP collaborators (ESR, Biomed, GILDA, UNOSAT)
Metadata: ARDA implementation

• Prototype
   – Validate our ideas and expose a concrete example to interested parties
• Multiple back ends
   – Currently: Oracle, PostgreSQL, SQLite, MySQL
• Dual front ends
   – TCP streaming
       Chosen for performance
   – SOAP
       Formal requirement of EGEE
       Compare SOAP with TCP streaming
• Also implemented as a standalone Python library (a usage sketch follows below)
   – Data stored on the file system

[Diagram: clients talk SOAP or TCP streaming to the metadata server, which maps requests onto the Oracle, PostgreSQL or SQLite back ends; alternatively, a Metadata Python API inside the Python interpreter stores data directly on the filesystem.]
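To make the model concrete, here is a minimal, self-contained sketch of the kind of filesystem-like metadata catalogue described above; the class and method names are hypothetical stand-ins, not the actual AMGA client API.

```python
# Hypothetical metadata-catalogue sketch: class and method names are
# illustrative, not the real AMGA Python API.

class MetadataCatalogue:
    """Toy catalogue with a filesystem-like namespace and per-entry attributes."""

    def __init__(self):
        self.entries = {}  # path -> {attribute: value}

    def add_entry(self, path, **attributes):
        self.entries[path] = dict(attributes)

    def query(self, directory, predicate):
        """Return paths under 'directory' whose attributes satisfy 'predicate'."""
        return [path for path, attrs in self.entries.items()
                if path.startswith(directory) and predicate(attrs)]


catalogue = MetadataCatalogue()
catalogue.add_entry("/lhcb/bookkeeping/run1042", run=1042, events=250000)
catalogue.add_entry("/lhcb/bookkeeping/run1043", run=1043, events=180000)

# A bookkeeping-style query: runs with more than 200k events.
print(catalogue.query("/lhcb/bookkeeping", lambda a: a["events"] > 200000))
```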
Metadata: ARDA implementation, dual front end

• Text-based protocol (TCP streaming)
   – Data streamed to the client in a single connection
   – [Sequence diagram: client sends an <operation>, the server creates a DB cursor and streams the [data] chunks back over the same connection.]
   – Implementations
       Server: C++, multiprocess
       Clients: C++, Java, Python, Perl, Ruby
• SOAP front end: most operations are SOAP calls
   – Based on iterators (sketched below)
       Session created
       Initial chunk of data returned together with a session token
       Subsequent requests: the client calls nextQuery() using the session token
       Session closed when:
           end of data is reached
           the client calls endQuery()
           the client times out
   – [Sequence diagram: client sends a query, the server creates a DB cursor and returns the first chunk; each nextQuery() call fetches the next chunk.]
   – Implementations
       Server: gSOAP (C++)
       Clients: WSDL tested with gSOAP, ZSI (Python), AXIS (Java)
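The iterator-based interaction can be sketched in a few lines of Python. The nextQuery()/endQuery() names follow the slide; the client and server classes around them are illustrative stand-ins, not the gLite/AMGA WSDL.

```python
# Sketch of the iterator-style SOAP interaction described above.
# nextQuery/endQuery follow the slide; the classes are illustrative.

class FakeServer:
    """In-memory stand-in so the sketch runs without a real SOAP service."""

    def __init__(self, rows, chunk_size=2):
        self.rows, self.chunk_size = rows, chunk_size
        self.sessions = {}

    def query(self, _query):
        # Session created; return the first chunk plus a session token.
        self.sessions["t1"] = iter(range(0, len(self.rows), self.chunk_size))
        return self.nextQuery("t1"), "t1"

    def nextQuery(self, token):
        start = next(self.sessions[token], None)
        return [] if start is None else self.rows[start:start + self.chunk_size]

    def endQuery(self, token):
        self.sessions.pop(token, None)


class SoapMetadataClient:
    def __init__(self, server):
        self.server = server

    def query_all(self, query):
        """Fetch a full result set chunk by chunk via a session token."""
        chunk, token = self.server.query(query)      # session created
        try:
            while chunk:                             # empty chunk = end of data
                for row in chunk:
                    yield row
                chunk = self.server.nextQuery(token)
        finally:
            self.server.endQuery(token)              # explicit endQuery()


client = SoapMetadataClient(FakeServer(rows=list(range(5))))
print(list(client.query_all("find /lhcb ...")))      # -> [0, 1, 2, 3, 4]
```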
AMGA at ACAT: SOAP toolkit performance

• Test protocol performance
   – No work done on the back end
   – Switched 100 Mbit/s LAN
• Language comparison
   – TCP-S shows similar performance in all languages
   – SOAP performance varies strongly with the toolkit
   – With the Java and Python toolkits, SOAP is several times slower than TCP-S
• Protocol comparison
   – Keepalive improves performance significantly (a benchmark sketch follows below)
• Scalability with multiple clients (pings)
   – Measures the scalability of the protocols (switched 100 Mbit/s LAN, 1000 pings)
   – TCP-S 3x faster than gSOAP (with keepalive)
   – Poor performance without keepalive: around 1,000 ops/sec (both gSOAP and TCP-S)
   – At the high end of the client range, the client ran out of sockets

[Plots: execution time of 1000 pings for C++ (gSOAP), Java (Axis) and Python (ZSI), TCP-S and SOAP with and without keepalive; average throughput (calls/sec) vs number of clients (1 to 100) for TCP-S and gSOAP, with and without keepalive.]

(University of Coimbra)
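For flavour, the sketch below reproduces the shape of such a ping benchmark in pure Python, comparing a reused connection (keepalive) with one connection per call against a local echo server; it illustrates the methodology only and is not the actual AMGA measurement code.

```python
# Sketch of a keepalive vs no-keepalive ping benchmark (illustrative only;
# the real measurements exercised the AMGA front ends, not this echo server).
import socket, socketserver, threading, time

class EchoHandler(socketserver.StreamRequestHandler):
    def handle(self):
        for line in self.rfile:          # echo every line back to the client
            self.wfile.write(line)
            self.wfile.flush()

server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), EchoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address
N = 1000

def ping(sock):
    sock.sendall(b"ping\n")
    sock.recv(64)

# Keepalive: one connection reused for all calls.
start = time.perf_counter()
with socket.create_connection((host, port)) as s:
    for _ in range(N):
        ping(s)
t_keepalive = time.perf_counter() - start

# No keepalive: a fresh connection per call.
start = time.perf_counter()
for _ in range(N):
    with socket.create_connection((host, port)) as s:
        ping(s)
t_no_keepalive = time.perf_counter() - start

print(f"keepalive:    {N / t_keepalive:8.0f} calls/sec")
print(f"no keepalive: {N / t_no_keepalive:8.0f} calls/sec")
server.shutdown()
```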
Metadata: ARDA implementation, security concepts

• Security is very important for BioMed (more than for HEP)
   – They need confidentiality, not only basic access control
   – Trade-off: security ↔ speed
• The standalone catalogue has:
   – ACLs for directories and Unix permissions for directories/entries
   – Built-in group management, as in AFS
• AMGA + LFC back end:
   – Posix ACLs + Unix permissions for directories/entries (ACLs currently not checked: slow!)
   – Users/groups via VOMS
• Currently no security on a per-attribute basis
   – AMGA allows views to be created: safer, faster, similar to an RDBMS
• The extra effort is largely counterbalanced by having more active users
• Security tested by the GILDA team for the standalone catalogue; they liked the built-in group management and ACLs, but we need feedback from BioMed!
ARDA prototypes: starting point

LHC experiment   Main focus                                    Basic prototype component/framework
LHCb             GUI to Grid                                   GANGA/DaVinci
ALICE            Interactive analysis                          PROOF/AliROOT
ATLAS            High-level services and integration           DIAL/Athena
CMS              Explore/exploit native gLite
                 functionality & integration                   ORCA
GANGA

What is Ganga?

[Diagram: the user prepares and configures jobs from applications (Athena, Gaudi, scripts); Ganga4 stores & retrieves job definitions, submits and kills jobs, gets output and updates status on the back ends (LSF, localhost, gLite, LCG2, DIRAC, DIAL, AtlasPROD); plus split, merge, monitor and dataset selection.]

(slide by Jakub Moscicki)
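The core idea, one job object with interchangeable back ends, is easy to sketch. The toy below is illustrative only and does not reproduce the Ganga4 API; class names and attributes are invented for the example.

```python
# Minimal toy illustrating the Ganga idea of one job object with
# interchangeable back ends; names are illustrative, not the Ganga4 API.

class LocalBackend:
    def submit(self, job):
        print(f"running {job.application} on localhost")

class BatchBackend:
    def __init__(self, name):
        self.name = name
    def submit(self, job):
        print(f"submitting {job.application} to {self.name}")

class Job:
    def __init__(self, application, backend):
        self.application, self.backend = application, backend
        self.status = "new"
    def submit(self):
        self.backend.submit(self)      # back end chosen at run time
        self.status = "submitted"

# The same job definition can target any back end: localhost, LSF, gLite...
job = Job(application="Athena analysis", backend=LocalBackend())
job.submit()
job.backend = BatchBackend("gLite")    # switch back end, resubmit unchanged
job.submit()
```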
Ganga4

• Major rewrite
   – End 2004, beginning of 2005
   – Key contribution of the ARDA team
   – Hands-on activity at CERN
• GANGA workshop (London, June 2005)
   – http://agenda.cern.ch/fullAgenda.php?ida=a052763

Internal architecture
• Ganga4 is decomposed into 4 functional components: Client, Application Manager, Job Repository & Workspace, Job Manager & Monitoring
• These components also describe the components in a distributed model
• Strategy: design each component so that it could be a separate service
• But allow two or more components to be combined into a single service

(slide by Jakub Moscicki)
GANGA (ATLAS and LHCb)

• Common project (ATLAS and LHCb)
• Cornerstone of the ARDA-LHCb activity from the beginning
• More and more at the centre of the ATLAS strategy
   – Tutorials + presentation in the User Task Force (led by F. Gianotti)
• ATLAS specific:
   – Good perspective of integration with the production system
   – D. Liko (CERN/ARDA) is the new distributed analysis coordinator

GANGA ATLAS team:
   – Karl Harrison
   – C.L. Tan
   – Dietrich Liko ("They did all the work… I joined much later…")

Further resources (ARDA-LHCb):
   – Coordinator: Ulrik Egede
   – GridPP: A. Soroko
   – ARDA: Jakub Moscicki, Andrew Maier
   – ARDA Metadata (AMGA): Birger Koblitz, Nuno Santos

ARDA defined the interface for the gLite metadata, and recently the implementation has become part of gLite itself.
ALICE: interactive parallel analysis

PROOF@GRID multitier hierarchical setup with xrootd

[Diagram: per-site stacks of proofd and xrootd with read-cache and MSS; a PROOF master with a local file catalogue coordinates submasters at Site 1 and Site 2 via a storage index catalogue.]

Depending on the catalogue model, LFNs can be resolved either by the PROOF master using a centralized file catalogue, or only MSS-indexed and resolved by the submasters and local file catalogues.

(GridKA Schule, 30 September 2005)

Evolution of last year's activity (SuperComputing 2004). Key ARDA contributions: integration with the underlying grid services and improvements in the PROOF sector (connectivity, etc.)
ALICE

Access to grid services via the C-Access Library (ARDA shell + C/C++ API)

• A C++ access library for gLite has been developed by ARDA
   – High performance
   – Protocol quite proprietary…
• Essential for the ALICE prototype
• Generic enough for general use
• Using this API, grid commands have been added seamlessly to the standard shell

[Diagram: the client application calls the C-API (POSIX); on both client and server, a GSI/SSL security wrapper around gSOAP carries UUEnc-encoded text between the client and the server application/service.]
Catalogue inspection

[Screenshot: inspecting the file catalogue from the ARDA shell]
CMS

• ARDA developed a successful prototype
   – Tool for concrete investigation (ASAP)
   – Used to demo gLite for the LHCC
   – Very important (and positive) users' feedback
• Future (present)
   – Prototype → convergence on the CMS system
   – EGEE2 → not only analysis: production is important!
       In the framework of the CMS structure (integration and development activities and the CMS LCG task force):
         • Key components of ASAP refactored and contributed to the CMS framework (CRAB)
         • Contributions to the CMS dashboard (aggregation of monitoring information and a high-level view of all CMS computing activities)
         • Informal set of milestones; close interaction with the other contributors within the CMS task force
CMS dashboard

[Plots: CMS user jobs; CMS jobs & I/O vs time; SC3 jobs.]

The system collects information from R-GMA (middleware), MonALISA (CMS real-time) and the submission tools (Production, CRAB, ASAP). This is essential for CMS and very instructive for the middleware (we are using it to study the "efficiency" of one system at a time…)

http://www-asap.cern.ch/dashboard/
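The aggregation principle behind the dashboard can be sketched as follows; the source names come from the slide, while the record layout and function names are invented for illustration.

```python
# Sketch of dashboard-style aggregation: merge job records from several
# monitoring sources into one view (record layout is illustrative).
from collections import Counter

def normalise(source, record):
    """Map a source-specific record onto a common (source, job_id, state) schema."""
    return {"source": source,
            "job_id": record["id"],
            "state": record.get("state", "unknown").lower()}

feeds = {
    "R-GMA":    [{"id": "j1", "state": "Running"}, {"id": "j2", "state": "Done"}],
    "MonALISA": [{"id": "j3", "state": "Pending"}],
    "CRAB":     [{"id": "j4", "state": "Done"}, {"id": "j5"}],
}

jobs = [normalise(src, rec) for src, recs in feeds.items() for rec in recs]

# High-level view: job counts per state across all sources.
print(Counter(job["state"] for job in jobs))
```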
Effort

• Original envelope:
   – 4 FTEs from EGEE
   – Matching funds from LCG: 4 FTEs → ~6 FTEs
• More people interested/attracted
   – 2 PhD students
       CMS thesis (Brunel University)
       Dependability thesis (Coimbra University)
   – Collaboration with LCG Russia (coordinated visits at CERN)
       Very successful
   – Collaboration with ASCC Taipei
       Very successful (2 FTEs at CERN + 2 FTEs at ASCC)
   – Within EGEE
       Very important role
         • Especially in "testing": other colleagues were enabled/coordinated to contribute to the gLite testing and certification
• Very positive working environment
   – So far, people were primarily attached to one experiment, but being in a single team they interact and augment their efficiency
Milestones

• No major problem so far
   – But note that the 2005 (internal) milestones are basically about using gLite and giving feedback

   – Template: use the gLite middleware (version n) on the extended prototype (eventually the pre-production service) and provide feedback (technical issues; collect high-level comments and experience from the experiments)

   – During the year, we decided to accept "delays" (LCG Quarterly Reports):
       Waiting for gLite 1.0 to arrive and stabilise (mainly Q1), we focused on middleware evaluation with one experiment (ATLAS): the results at that stage were de facto valid for all the other experiments (basic functionality), while putting all the effort on the prototype and development activity (e.g. Ganga4, ASAP…)
       Within the ALICE activity (now the ALICE task force), a lot of studies of the new middleware were done by non-ARDA people, and we agreed to focus on the specific ARDA contribution

I think there is a problem (typo) in the table on the web
Main 2005 milestone

WBS      Description                                 Due        Done
1.6.18   E2E prototype for each experiment           31-12-04   31-12-04
         (4 prototypes), capable of analysis
         (or advanced production)
1.6.19   E2E prototype for each experiment           31-12-04
         (4 prototypes), capable of analysis
         and production

• ALICE: our contribution focuses almost 100% on advanced analysis features. Contributions on xrootd and monitoring will be common with the production usage
• ATLAS: experience in both submitting to the Grid and to the production system. We hope that Ganga will be integrated as a common "front end"
• CMS: working very closely in the task force. The monitoring part does receive data from both analysis and production jobs. Hopefully some part of the task manager will also be used in the future production system
• LHCb: GANGA submits to multiple back ends (including the Grid) and DIRAC
Milestones evolution

• Since July, the LCG task forces have been set up
   – Under the experiments' lead
   – Experiments' milestones will (should) become our milestones
• EGEE2 perspectives
   – Invitation to include production in the future
       As a matter of fact, already contained in our milestones…
         • Clearly our mandate was more "user-analysis biased"
       Already happening
         • For some time already: e.g. submission of analysis jobs to the production system (cf. the Ganga workshop in June)
         • In the task forces: e.g. the CMS dashboard
• Not yet formalised
   – Wait for April 2006?
SC3/SC4

• Main interest: SC4
   – Document + discussions (from arda.cern.ch):
       http://lcg.web.cern.ch/LCG/activities/arda/public_docs/2005/Q3/SC4.doc
       http://lcg.web.cern.ch/LCG/activities/arda/public_docs/2005/Q3/ARDAatSC4-preGDB.ppt
       Working document exposing several use cases
• On one side, we expect to be involved "via the experiments"
   – This is natural, due to the increasing integration in the experiments' plans
       Service challenges flowing into the pilot services
• On the other side, it is ARDA's role to prepare for the analysis challenges using the present understanding and experience
   – It was a successful approach, for example, in studying the metadata problem
SC3/SC4

• What does "preparing for the analysis" mean concretely?
   – "Batch" analysis:
       Low-latency jobs
       Worker nodes see a mixture of production jobs (~10 h) and analysis jobs (<1 h)
       Users dispatching large numbers of jobs
       Job resubmission (using the experiments' policies); a sketch follows this list
   – "Interactive" analysis:
       Low-latency access to partial results
       Low-latency interaction with grid services
       Use our experience with the C-Access Library, xrootd, PROOF, GANGA, DIANE…
   – Efficient use of experiment-specific services
       CMS MyFriend layer // VOBOX // Edge Services
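As a minimal illustration of policy-driven resubmission, here is a sketch in which the "policy" is simply a per-job retry cap; that cap, and the failure model, are assumptions for the example, not an experiment's actual policy.

```python
# Minimal sketch of policy-driven job resubmission (the policy shown,
# a per-job retry cap, is an assumption for illustration).
import random

def run_job(job_id):
    """Stand-in for a grid submission that sometimes fails."""
    return random.random() > 0.3          # True = job succeeded

def dispatch(jobs, max_attempts=3):
    """Submit every job, resubmitting failures up to the policy limit."""
    failed = []
    for job_id in jobs:
        for _ in range(max_attempts):
            if run_job(job_id):
                break                     # success: stop resubmitting
        else:
            failed.append(job_id)         # retry budget exhausted
    return failed

print("permanently failed:", dispatch(range(100)))
```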
Outlook

• Achievements
   – Positive contributions to the experiments
   – Positive partnership with gLite
       As a "side effect" of our EGEE contribution: good contacts and exchange of ideas and experience with other scientific communities in EGEE

• Concerns and risks
   – Coherence between the middleware development, the deployment and the experiments is a very delicate plant…