					R-GMA:
First results after deployment

Steve Fisher (EDG - WP3)

s.m.fisher@rl.ac.uk




https://edms.cern.ch/document/376535/
DataGrid is a project funded by the European Union   CHEP 2003   24-28 March 2003   R-GMA   1
Who we are
                Heriot-Watt, Edinburgh
                     Andrew Cooke, Werner Nutt

                IBM-UK
                     James Magowan, (Manfred Oevers), Paul Taylor

                INFN
                     Roberto Barbera, Giuseppe Save, Gennaro Tortone

                Queen Mary, University of London
                     Roney Cordenonsi, (Ari Datta)

                CCLRC
                     Linda Cornwall, Abdeslem Djaoui, Steve Fisher, Robin Middleton

                PPARC
                     Rob Byrom, Laurence Field, Steve Hicks, Manish Soni,
                     Antony Wilson, (Xiaomei Zhu), Jason Leake

                SZTAKI, Hungary
                     Peter Kacsuk, Norbert Podhorszki

                Trinity College Dublin
                     Brian Coghlan, Stuart Kenny, David O'Callaghan, (John Ryan)

R-GMA
   Uses the Grid Monitoring Architecture from Global Grid Forum

   R-GMA is a relational implementation

   Applied to both information and monitoring

   Creates impression that you have one RDBMS per Virtual Organisation

   [Figure: Producer, Registry and Consumer, with arrows marking information flow and meta-data flow]


Relational Approach


   Not a general distributed RDBMS system, but a way to use the
    relational model in a distributed environment where global
    consistency is not important.
   Producers announce: SQL “CREATE TABLE”
   Producers publish:  SQL “INSERT”
   Consumers collect:  SQL “SELECT”
   Some producers, the Registry and Schema make use of RDBMS as
    appropriate – but what is central is the relational model.
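The three operations above map onto ordinary SQL. A minimal sketch, using Python's built-in sqlite3 module as a stand-in for one producer's local storage: it is not the R-GMA API, the column types are assumptions, and the table and sample rows are borrowed from the CPULoad example in this talk.

```python
import sqlite3

# Stand-in for one producer's local store; R-GMA itself is not used here.
conn = sqlite3.connect(":memory:")

# Producer announces: the table it will publish into (column types assumed).
conn.execute(
    "CREATE TABLE CPULoad ("
    "Country TEXT, Site TEXT, Facility TEXT, Load REAL, Timestamp TEXT)"
)

# Producer publishes: one tuple per measurement.
rows = [
    ("UK", "RAL", "CDF", 0.3, "19055711022002"),
    ("UK", "RAL", "ATLAS", 1.6, "19055611022002"),
]
conn.executemany("INSERT INTO CPULoad VALUES (?, ?, ?, ?, ?)", rows)

# Consumer collects: an ordinary SELECT.
busy = conn.execute(
    "SELECT Facility, Load FROM CPULoad WHERE Load > 1.0"
).fetchall()
print(busy)
```

In R-GMA the SELECT would be answered from whichever producers hold matching tuples, not from a single database as here.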




Producers
   DataBaseProducer – Supports History Queries
       Information not lost
       Supports joins
       Clean up strategy

   StreamProducer – Supports Continuous Queries
       In memory data structure
       Can define minimum retention period

   ResilientStreamProducer – Supports Continuous Queries
       Like the StreamProducer but won't lose data if system crashes
       So slightly slower

   LatestProducer – Supports Latest Queries
       Just holds the latest information for any “primaryish” key
       Supports joins

   CanonicalProducer – Supports anything
       Offers anything as relations
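The difference between "latest" and "stream" storage can be illustrated with two small in-memory classes. This is a sketch only: LatestStore and StreamStore are hypothetical names, not R-GMA classes, and the retention rule shown (drop a tuple once it is older than the retention period) is an assumed reading of "minimum retention period".

```python
from collections import deque


class LatestStore:
    """Keeps only the newest tuple for each key (LatestProducer-like)."""

    def __init__(self, key_columns):
        self.key_columns = key_columns
        self.rows = {}

    def insert(self, row):
        key = tuple(row[c] for c in self.key_columns)
        self.rows[key] = row  # overwrite any older tuple with the same key

    def latest(self):
        return list(self.rows.values())


class StreamStore:
    """In-memory buffer that keeps tuples for `retention` seconds."""

    def __init__(self, retention):
        self.retention = retention
        self.buffer = deque()  # (arrival_time, row), oldest first

    def insert(self, row, now):
        self.buffer.append((now, row))
        # Drop tuples that have been held longer than the retention period.
        while self.buffer and now - self.buffer[0][0] > self.retention:
            self.buffer.popleft()

    def rows(self):
        return [row for _, row in self.buffer]


latest = LatestStore(["Site", "Facility"])
latest.insert({"Site": "RAL", "Facility": "CDF", "Load": 0.3})
latest.insert({"Site": "RAL", "Facility": "CDF", "Load": 0.7})

stream = StreamStore(retention=60)
stream.insert({"Load": 0.3}, now=0)
stream.insert({"Load": 0.7}, now=30)
stream.insert({"Load": 0.9}, now=100)
```

After these calls the latest store holds one row for (RAL, CDF) and the stream store holds only the tuple that arrived at time 100.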


Archiver (Re-publisher)
 It is a combined Consumer-Producer

 You just have to tell it what to collect and it does so on your behalf

 Re-publishes to any kind of “Insertable” (i.e. not to the CanonicalProducer)
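The consumer-plus-producer idea can be sketched in a few lines. All names here (ListStore, Archiver, on_row) are hypothetical and chosen for illustration; the sketch assumes an "Insertable" is anything that accepts inserted tuples.

```python
class ListStore:
    """Hypothetical 'Insertable' target: anything with an insert() method."""

    def __init__(self):
        self.rows = []

    def insert(self, row):
        self.rows.append(row)


class Archiver:
    """Consumes rows that match a predicate and re-publishes them."""

    def __init__(self, wanted, target):
        self.wanted = wanted  # what to collect
        self.target = target  # where to re-publish

    def on_row(self, row):
        # Consumer side receives the row; producer side re-inserts it.
        if self.wanted(row):
            self.target.insert(row)


archive = ListStore()
archiver = Archiver(lambda row: row["Site"] == "RAL", archive)
for row in [{"Site": "RAL", "Load": 0.3}, {"Site": "GLA", "Load": 0.4}]:
    archiver.on_row(row)
```

Only the RAL row ends up in the archive; the caller never handles the data itself.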




Schema & Contributions
            CPULoad (Global Schema)

            Country   Site   Facility   Load   Timestamp
            UK        RAL    CDF        0.3    19055711022002
            UK        RAL    ATLAS      1.6    19055611022002
            UK        GLA    CDF        0.4    19055811022002
            UK        GLA    ALICE      0.5    19055611022002
            CH        CERN   ALICE      0.9    19055611022002
            CH        CERN   CDF        0.6    19055511022002

            CPULoad (Producer 1)

            UK        RAL    CDF        0.3    19055711022002
            UK        RAL    ATLAS      1.6    19055611022002

            CPULoad (Producer 2)

            UK        GLA    CDF        0.4    19055811022002
            UK        GLA    ALICE      0.5    19055611022002

            CPULoad (Producer 3)

            CH        CERN   ATLAS      1.6    19055611022002
            CH        CERN   CDF        0.6    19055511022002




The Mediator
   Producers are associated with views on a virtual data base.

   Queries posed against the virtual data base

   The Mediator must:
       find the right Producers
       combine information from them

   Can now merge information from several producers

   The final mediator will take “any” SQL statement and do the right
    thing
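The two duties of the Mediator (find the right Producers, combine their information) can be sketched as below. The registry dictionary, the mediate function and the use of Python callables as producers are all assumptions made for illustration; the real Mediator works from SQL queries and the R-GMA Registry.

```python
# Hypothetical registry: table name -> producers that publish into it.
registry = {
    "CPULoad": [
        lambda: [("UK", "RAL", "CDF", 0.3), ("UK", "RAL", "ATLAS", 1.6)],
        lambda: [("UK", "GLA", "CDF", 0.4), ("UK", "GLA", "ALICE", 0.5)],
    ]
}


def mediate(table, predicate):
    """Find the producers for `table`, then merge the rows that match."""
    merged = []
    for producer in registry.get(table, []):
        merged.extend(row for row in producer() if predicate(row))
    return merged


# Roughly: SELECT * FROM CPULoad WHERE Facility = 'CDF'
cdf_rows = mediate("CPULoad", lambda row: row[2] == "CDF")
print(cdf_rows)
```

The consumer sees one merged result and does not need to know that two producers contributed to it.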




R-GMA Tools

     R-GMA CLI
         Command Line Interface (similar to MySQL)
         Supports single query and interactive modes

     R-GMA Browser
         JSP application dynamically generating web pages
         Supports pre-defined and user-defined queries

     Pulse
         R-GMA Java client-based GUI
         Supports streaming and simple graphical displays




A user application: CMS
 BOSS for job tracking on local farm
     It currently forks the executable and parses stdout to publish info
      directly to an SQL DB
     They publish to one table per job type and one table which is common
      to all job types

 They are now ready to publish via R-GMA instead
     Providing a scalable Grid solution




GIN and GOUT
(Gadget IN and Gadget OUT)

 [Figure: GIN and GOUT data path. Components shown: LDAP InfoProviders, GIN, CircularBuffer Producers, R-GMA, Consumers (CE), (SE) and (SiteInfo), Archiver, DataBase Producer, RDBMS, Consumer API, GOUT with R-GMA Consumers, LDAP Server]




CE and SE Tables

 ComputingElement     CloseStorageElement     StorageElementStatus
 dn                   dn                      dn
 CEId                 CEId                    SEId
 TotalCPUs            CloseSE                 SEfreespace
 FreeCPUs             ……                      ……
 TotalJobs
 RunningJobs
 ……

 “Select a ComputingElement with at least 1 free CPU that also has a
 CloseStorageElement with at least 1000 MB of free space”
 SELECT DISTINCT ComputingElement.CEId FROM
 ComputingElement, CloseStorageElement,StorageElementStatus WHERE
 ComputingElement.FreeCPUs > 0 AND
 (ComputingElement.CEId = CloseStorageElement.CEId AND
 CloseStorageElement.CloseSE = StorageElementStatus.SEId AND
 StorageElementStatus.SEfreespace > 1000)
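The query above can be tried against a local database. A sketch using Python's sqlite3 module: only the columns the query touches are created, the column types are assumptions, and the sample rows (ce1, se1 and so on) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ComputingElement (CEId TEXT, FreeCPUs INTEGER);
CREATE TABLE CloseStorageElement (CEId TEXT, CloseSE TEXT);
CREATE TABLE StorageElementStatus (SEId TEXT, SEfreespace INTEGER);
INSERT INTO ComputingElement VALUES ('ce1', 4), ('ce2', 0), ('ce3', 2);
INSERT INTO CloseStorageElement
    VALUES ('ce1', 'se1'), ('ce2', 'se1'), ('ce3', 'se2');
INSERT INTO StorageElementStatus VALUES ('se1', 5000), ('se2', 200);
""")

# Same conditions as the query on the slide.
query = """
SELECT DISTINCT ComputingElement.CEId
FROM ComputingElement, CloseStorageElement, StorageElementStatus
WHERE ComputingElement.FreeCPUs > 0
  AND ComputingElement.CEId = CloseStorageElement.CEId
  AND CloseStorageElement.CloseSE = StorageElementStatus.SEId
  AND StorageElementStatus.SEfreespace > 1000
"""
matches = [row[0] for row in conn.execute(query)]
print(matches)
```

With this sample data only ce1 qualifies: ce2 has no free CPUs and the storage element close to ce3 has too little free space.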

OGSIfied R-GMA
   All Grid Services
   OGSA Factories, GSH, GSR
   Registry includes HandleMapper
   SQL as Service Data Element Query Language

   [Figure: Application with Consumer API; Sensor with Producer API; Consumer Factory and Consumer Instance; Producer Factory and Producer Instance; Registry; Schema]




Other technicalities – no time today
 Soft-state Registration and the Registry
      Registry records existence of Producers and Consumers
      Registry holds last contact time and 'expiry' time
      Producers and Consumers periodically refresh their time stamps
      Scheduled removal of entries that have timed-out

 Registry & schema distribution
      Will have one logical registry and schema per VO
      Each logical registry will have multiple physical “copies”
      Self healing algorithm

 Security

 etc. …
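Soft-state registration as listed above can be sketched as a table of time stamps with a periodic purge. SoftStateRegistry and its method names are hypothetical, and times are plain numbers passed in by the caller so the behaviour is easy to follow.

```python
class SoftStateRegistry:
    """Entries vanish unless their owner refreshes them before expiry."""

    def __init__(self):
        self.entries = {}  # name -> (last_contact, expiry)

    def register(self, name, now, lifetime):
        self.entries[name] = (now, now + lifetime)

    # A refresh is just a re-registration with a new time stamp.
    refresh = register

    def purge(self, now):
        """Scheduled clean-up: remove every entry whose expiry has passed."""
        expired = [n for n, (_, expiry) in self.entries.items() if expiry <= now]
        for name in expired:
            del self.entries[name]
        return expired


registry = SoftStateRegistry()
registry.register("producer-A", now=0, lifetime=30)
registry.register("consumer-B", now=0, lifetime=30)
registry.refresh("producer-A", now=20, lifetime=30)  # pushes expiry to 50
removed = registry.purge(now=40)
```

Here consumer-B never refreshed, so the purge at time 40 removes it, while producer-A stays registered.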


Performance
 By design:
      Very flexible - to avoid bottlenecks
      Powerful queries allow a single query to be made

 Performance and Optimisation
      Use NetLogger and profiling tools to identify possible bottlenecks




Results
 It has only just been deployed in the EDG development testbed and
 we do not yet have the results which the title of this talk implied.




Summary and the future
 R-GMA is a combined Grid information and monitoring system

 Just deployed in the EDG development testbed

 Focusing on reliability, stability and performance for the rest of the
 project (9 months)




                                    Thanks to the EU and our
                                    national funding agencies for
                                    their support of this work
