
                UK Testbed Status
                       and
                 EDG Testbed Two.
                  Steve Traylen
                 GridPP 7, Oxford
                    Outline

•   Status of the UK Sites.
•   Release of EDG 2.
•   UK Certificates.
•   Grid monitoring in the UK.




                     28th April 2003   Steve Traylen, s.traylen@rl.ac.uk
                                       PPD
                            Manchester
  EDG Testbed                    BaBar Farm                      DZero Farm
   EDG 1.4                        EDG 1.4                         EDG 1.4

 CE          SE                 CE       SE(1.5TB)             CE          SE(5TB)



   9xWN                           80xWN                           60xWN


•GridPP and BaBar VO Servers.
•User Interface
•Plan that DZero farm will join LCG.
•SRIF bid in place for significant HEP resources for the end of the year.


                  UCL
EDG Testbed      •Network Monitors for WP7 development.
 EDG 1.4
                 •SRIF bid in place for  200 cpus for the
CE          SE   end of the year to join LCG1.



     1xWN




                           RAL PPD
EDG Testbed                     RGMA Testbed
 EDG 1.4                          EDG 2.0
CE         SE                   CE               SE         MON



 9xWN                                       1xWN



•User Interface
•Plan to be part of the Southern Tier2 Centre within LCG1.
•50 cpus and 5TB of disk expected for the end of year.



                   Birmingham
EDG Testbed
 EDG 1.4         •Expansion to 60 cpus and 4TBs.

CE          SE   •Expect to participate within LCG1/EDG2



     1xWN


                   Liverpool
EDG Testbed
 EDG 1.4         •Currently unmaintained.
CE          SE   •Plan to follow EDG 2, possibly integrating BaBar farm.


     1xWN
                               RAL
 EDG Testbed              Tier1/a                     RGMA Testbed
     EDG 1.4              EDG 1.4                        EDG 2.0
CE             SE           CE                        CE              SE            MON

                                                      LCG0 Testbed
      5xWN                 230xWN
                                                       CE             SE            1xWN

EDG Dev Testbed                                •UI within CSF.
         EDG 2.0                               •NM for EDG2.
CE             SE   MON         SE             •Top level MDS for EDG.
                                               •Various WP3 and WP5 dev nodes.
          1xWN                ADS              •VOMS for DEV TB.
                                               •http://ganglia.gridpp.rl.ac.uk/
                 Cambridge
EDG Testbed
                •Farm shared with local NA-48, GANGA users.
 EDG 1.4
                •Some RH73 WNs for ongoing Atlas challenge.
CE         SE
                •3TB GridFTP-SE.
                •Plan to join LCG1/EDG2 later this year with an
 15xWN          extra 50 cpus.
                •EDG jobs will soon be fed into the local E-Science
                farm.
                •http://farm002.hep.phy.cam.ac.uk/cavendish/




                   Bristol
 EDG Testbed              RGMA Testbed
  EDG 1.4              EDG 2.0
CE           SE   CE               SE      MON


      1xWN                      1xWN


CMS/LHCb Farm      BaBar Farm             •GridPP RC.
  CMS-LCG0         EDG 1.4                •Plan to join EDG2 and LCG1
 CE          SE   CE              SE



     24xWN         78xWN
                Imperial College
EDG Testbed        BaBar Farm
                                         •RB and BD-II for EDG 1.4.
 EDG 1.4           EDG 1.4
                                         •RB and BD-II for EDG 2.0.
CE         SE          CE
                                         •Plan to be in LCG1 and other
                                         testbeds.

 WNs                   WNs


 CMS-LCG0              RGMA Testbed
 CMS-LCG0          EDG 2.0

CE         SE     CE           SE        MON



 WN                          1xWN

                   Queen Mary
EDG Testbed
                    • CE also feeds EDG jobs to 32 node E-Science
 EDG 1.4
                    farm.
CE         SE       •Plan to have LCG1/EDG2 running for the end of
                    the year.
                    •Expansion with SRIF grants (64 WN + 2TB in Jan
1xWN       32xWN    2004, 100 WN + 8TB in Dec 2004).
                    •http://194.36.10.1/ganglia-webfrontend




                 Oxford
EDG Testbed      •Plan to join EDG2/LCG1.
 EDG 1.4
                 •Nagios monitoring has been set up.
CE          SE   •(RAL is also evaluating Nagios.)
                 •Planning to send EDG jobs into 10 WN
                 CDF farm.
     2xWN
                 •128 node cluster being ordered now.




                      Glasgow
 ScotGRID
                      •    New hardware expected soon.
 EDG 1.4
                      •    WNs on a private network with
CE         SE              outbound NAT in place.
                      •    As ScotGRID grows plans to be part
                           of LCG.
 59xWN                •    Various WP2 development boxes.


RGMA Testbed
 EDG 2.0

CE         SE   MON


              UK Overview

• Now significant resources within EDG.
• Integrating EDG with existing farms has been
  done many times, but it remains difficult.
• Sites are keen to take part in LCG1 or
  EDG2.
• By the end of the year many HEP farms
  plan to be contributing to LCG1 resources.

                EDG 2.0

• Now in a permanent state of imminent
  release.
• Since 27th May:
   – 25 pre releases.
   – 295 configuration changes.
   – Range from a typo to a new resource
     broker.

         Criteria for cutting EDG 2.0
• For EDG 2.0 to be cut, the following must be satisfied.
   – 50 sequential jobs. 98% success.
   – 250 jobs being run by 1 RB. 80% success.
   – 5 jobs with 2GB i/o sandbox. 80% success.
   – 25 jobs which require two proxy renewals. 80% success.
   – Upload and register 1GB file to an SE, replicate
     to a mass storage device.
   – Register 1000 files in less than 1000s.
   – Match a job against three files on an SE.
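
The pass-rate criteria above can be written down as a simple check. This is a minimal sketch, not part of any EDG release tooling: the `Criterion` class, the criterion names and the `ready_to_cut` helper are hypothetical, and only the job counts and percentages are taken from the list. The 80% figure for the proxy renewal test is read here as a success rate.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    """One pass-rate criterion: a job count and a required success percentage."""
    name: str
    jobs: int
    required_percent: int

    def passed(self, successes: int) -> bool:
        # Integer arithmetic avoids floating point rounding at the threshold.
        return successes * 100 >= self.jobs * self.required_percent


# Job counts and thresholds from the slide; the names are illustrative only.
CRITERIA = [
    Criterion("sequential", 50, 98),
    Criterion("single_rb", 250, 80),
    Criterion("large_sandbox", 5, 80),
    Criterion("proxy_renewal", 25, 80),
]


def ready_to_cut(successes: dict) -> bool:
    """Return True only if every criterion meets its threshold."""
    return all(c.passed(successes.get(c.name, 0)) for c in CRITERIA)
```

With these thresholds at most one of the 50 sequential jobs may fail. The file registration criterion likewise implies an average rate above one file per second.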
    Installation of an EDG2 Testbed
• LCFGng is the recommended installation method; there are
  no manual install instructions yet.
   – Significantly better than LCFG.
• Configuration (site-cfg.h) is less cryptic.
• Less hand installation required.
   – Install host certificates.
   – PBS server.
   – MySQL tables.
   – mkgridmap.conf.
          Integration after 2.0.

• Use gcc3.2.2 throughout.
   – Currently used by RB and the APIs the
     RB uses.
• GridFTP access to castor.
• Integration of VOMS.
   – Currently ongoing in parallel.
   – Has no impact on existing software.
• This will be EDG 2.1
             Required Nodes

• CE: gatekeeper, MDS, gin, …
• SE: GridFTP, WP5-SE, gin, …
• WN: PBS batch worker + client tools.
• MON: Servlets for a site, GOut for the RB.
  Also collects fabric monitoring
  information…
   – On small sites can be moved to the CE.
• Generally configuration is more modular.
             LCG1 or EDG2

• Which testbed should I join?
  – Significant resources best suited to
    LCG1.
  – Small dynamic testbeds can contribute
    to continued development of testbed
    two.


               UK Certificates

• UK EScience CA was added to production
  EDG testbed 3 weeks ago.
• UK Hep CA will stop issuing certificates.
  – Existing certificates will still be valid for
    the remainder of their lifetime.



Ratio of UKHep to EScience Certs




EScience Certs by OU.




     VO Membership + EDG Guidelines

[Bar chart: Members and UK Members per VO (WP6, Atlas, CMS, BioMe, Alice, LHCb, Iteam, Eobs, BaBar); horizontal axis from 0 to 120.]

                    Ganglia
• Ganglia provides time plots of system metrics.
• In use at RAL, Cambridge and QMUL.
• Default metrics: load, network i/o, memory.
• Trivial to add new metrics, e.g. active MySQL
  connections for CMS.
• Expansion to the UK possible via LCFG objects
  and instructions, however WP4 tools might be
  used instead.
• Data could be collected centrally for a UK view.
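
To illustrate adding a metric, the sketch below builds a command line for Ganglia's `gmetric` tool, which publishes a user-defined value. The metric name `mysql_active_connections`, the helper functions and the sample value are assumptions made for illustration, and how the connection count is obtained is left out. The flags should be checked against the installed Ganglia version.

```python
import subprocess


def gmetric_command(name: str, value: int, units: str) -> list:
    """Build a gmetric invocation that publishes one unsigned integer metric."""
    return [
        "gmetric",
        f"--name={name}",
        f"--value={value}",
        "--type=uint32",
        f"--units={units}",
    ]


def publish(name: str, value: int, units: str) -> None:
    """Run gmetric; this needs Ganglia's gmetric installed on the host."""
    subprocess.run(gmetric_command(name, value, units), check=True)


# Hypothetical metric name and value, for illustration only.
cmd = gmetric_command("mysql_active_connections", 12, "connections")
```

Calling `publish` from a periodic job would give a time plot of the value alongside the default metrics.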
              GridPP Map

• Checks HEP sites every 6(?) hours for:
   – Ping
   – Globus Submission
   – EDG Job Submission via Imperial RB.
   – EDG Job Submission via LYON RB.
• http://www.gridpp.ac.uk/map/
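
The check cycle can be sketched as a table of sites against checks. This is a hypothetical outline, not the GridPP Map implementation: the function names, the stub checks and the hostnames are assumptions. Real checks would wrap ping, a Globus submission and EDG job submissions through each RB.

```python
from typing import Callable, Dict, List

Check = Callable[[str], bool]


def run_checks(sites: List[str], checks: Dict[str, Check]) -> Dict[str, Dict[str, bool]]:
    """Run every check against every site; a check that raises counts as a failure."""
    results: Dict[str, Dict[str, bool]] = {}
    for site in sites:
        results[site] = {}
        for check_name, check in checks.items():
            try:
                results[site][check_name] = bool(check(site))
            except Exception:
                results[site][check_name] = False
    return results


# Stub checks standing in for the real tests; the hostnames are made up.
def always_ok(host: str) -> bool:
    return True


def always_fail(host: str) -> bool:
    raise RuntimeError("unreachable")


status = run_checks(
    ["ce.site-a.example", "ce.site-b.example"],
    {"ping": always_ok, "globus_submission": always_fail},
)
```

Passing the checks in as callables keeps the loop independent of how each test is carried out.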


      GridPP RB Monitoring @ Imperial

• Publishes service status.
• Publishes times for LDAP queries of resources.
• http://www.hep.ph.ic.ac.uk/~dguser/diagnostics.html
• Imperial also submits test jobs, more sophisticated
  jobs than the map, e.g. check for the existence of a
  CloseSE.
• http://www.hep.ph.ic.ac.uk/~dguser/Qstatus.html


                Monitoring

• Currently lots of monitoring but no central
  location.
• Most monitoring currently only shows the
  current state.
• The Grid operations centre can coordinate
  much of this.

