Site Validation Session Report


    Co-Chairs:
    Piotr Nyczyk, CERN IT/GD
    Leigh Grundhoefer, IU / OSG
    Notes from Judy Novak
    WLCG-OSG-EGEE Workshop
    CERN, June 19-20th 2006
  Service Availability Monitoring (SAM) - “extension” of SFT:
   • generalized framework to monitor all
     LCG/EGEE services and not only CE: BDII,
     RB, LFC, FTS, etc.
   • most of the sensors run remotely (from
     central machine)
   • no installation needed on service machines
   • moved from MySQL to Oracle, optimized
     data schema
Available at:
https://lcg-sam.cern.ch:8443/sam/sam.cgi
•   SAM sensors:
    – currently: BDII (Taiwan), RB (RAL), CE, SRM, LFC, FTS, SE
      (CERN)
•   Release updates + SAM (SFT)
    – certify current tests with each new release
    – create updated tests as necessary
    – CA certificate releases are special
• Availability views
    – current, daily, weekly, monthly
    – For CE, SE, SRM, siteBDII
    – displayed with GridView
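
As an illustration of the availability views listed above, the sketch below derives current, daily, weekly and monthly figures from a list of test results. The record layout, the sample data, the 30-day month and the "fraction of passed tests" definition are all assumptions made for this example; they are not taken from the SAM schema or from GridView.

```python
from datetime import datetime, timedelta

# Hypothetical test results: (timestamp, service type, passed?).
# This layout is an assumption, not the SAM database schema.
RESULTS = [
    (datetime(2006, 6, 15, 12, 0), "CE", False),
    (datetime(2006, 6, 19, 1, 0), "CE", True),
    (datetime(2006, 6, 19, 6, 0), "CE", False),
    (datetime(2006, 6, 19, 12, 0), "CE", True),
    (datetime(2006, 6, 19, 18, 0), "CE", True),
]

# Window lengths for the aggregated views (a month is taken as 30 days).
WINDOWS = {
    "daily": timedelta(days=1),
    "weekly": timedelta(days=7),
    "monthly": timedelta(days=30),
}


def availability(results, service, now, window):
    """Fraction of passed tests for `service` in the interval (now - window, now]."""
    start = now - window
    outcomes = [ok for ts, svc, ok in results
                if svc == service and start < ts <= now]
    if not outcomes:
        return None  # no test ran in this window
    return sum(outcomes) / len(outcomes)


def current_status(results, service, now):
    """Outcome of the most recent test for `service` at or before `now`."""
    recent = [(ts, ok) for ts, svc, ok in results
              if svc == service and ts <= now]
    return max(recent)[1] if recent else None


def availability_views(results, service, now):
    """Build the current/daily/weekly/monthly views for one service type."""
    views = {"current": current_status(results, service, now)}
    for name, window in WINDOWS.items():
        views[name] = availability(results, service, now, window)
    return views


if __name__ == "__main__":
    print(availability_views(RESULTS, "CE", datetime(2006, 6, 20, 0, 0)))
```

A service with no results in a window yields None rather than 0, so that "not tested" is not confused with "unavailable".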


          http://glite.cvs.cern.ch/cgi-bin/glite.cgi/sft2/tests/
      OSG Validation services
• CE/SE validation aggregation: VORS - site scanner, BDII info
   – http://vors.grid.iu.edu/
• OSG VOs’ VOMS validation
   – http://voms-monitor.grid.iu.edu/
• GridEX - application validation (pilot job submissions)
   – http://www.cs.wisc.edu/condor/tools/exerciser/
• Site Policy template and publication
   – http://vors.grid.iu.edu/site_policies.html
• GIP Validation
   – http://grow.its.uiowa.edu/osg-gip/Production.shtml
• Monitoring validation: MonALISA client status (VO Jobs I/O)
   – http://grid02.uits.indiana.edu:8080/stats?page=summary
• GridCat and the MIS-CI client
   – http://osg-cat.grid.iu.edu/ - Production instance
   – Client software:
      http://software.grid.iu.edu/pacman/tarballs/misci-0.4.1.tar.gz
             Summary
• It seems impossible to avoid
  cross-monitoring (OSG monitoring
  doesn't include LCG-specific services,
  and vice versa)
• We should synchronize at the VO level, but
  LCG/EGEE also uses a regional
  structure
  OSG and EGEE Validation
      Interoperability
• Site discovery - discovering sites using
  BDII
  – Ops VO - supported only on OSG sites which are
    interoperable (fully deployed in July)
  – How can we determine if an EGEE site is
    interoperable? Review certain BDII information
• Cross installation of necessary tools and
  libraries for site validation
  – LCG tools - added as optionally installed package
    for OSG sites
  – OSG environment variables - ? (GIP)
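
The site-discovery question above (which sites found through BDII support the Ops VO) can be sketched as a filter over LDIF-style records. This is only an illustration: the host names are invented, the use of the GLUE 1.x attribute GlueCEAccessControlBaseRule as the interoperability criterion is an assumption, and the parser is a simplification that ignores LDIF line folding and base64 values.

```python
# Sample text shaped like the LDIF output of a BDII query. Host names are
# made up; the attribute names follow the GLUE 1.x schema as an assumption.
SAMPLE_LDIF = """\
dn: GlueCEUniqueID=ce1.example.org:2119/jobmanager-pbs-ops,mds-vo-name=local,o=grid
GlueCEUniqueID: ce1.example.org:2119/jobmanager-pbs-ops
GlueCEAccessControlBaseRule: VO:ops
GlueCEAccessControlBaseRule: VO:cms

dn: GlueCEUniqueID=ce2.example.org:2119/jobmanager-condor-cms,mds-vo-name=local,o=grid
GlueCEUniqueID: ce2.example.org:2119/jobmanager-condor-cms
GlueCEAccessControlBaseRule: VO:cms
"""


def parse_entries(ldif_text):
    """Split LDIF-like text into entries: dict of attribute -> list of values."""
    entries = []
    for block in ldif_text.strip().split("\n\n"):
        entry = {}
        for line in block.splitlines():
            if ": " not in line:
                continue  # skip anything that is not "attribute: value"
            key, value = line.split(": ", 1)
            entry.setdefault(key, []).append(value)
        if entry:
            entries.append(entry)
    return entries


def ces_supporting_vo(entries, vo):
    """Return the CE identifiers whose access rules include the given VO."""
    rule = "VO:" + vo
    return [
        entry["GlueCEUniqueID"][0]
        for entry in entries
        if "GlueCEUniqueID" in entry
        and rule in entry.get("GlueCEAccessControlBaseRule", [])
    ]


if __name__ == "__main__":
    print(ces_supporting_vo(parse_entries(SAMPLE_LDIF), "ops"))
```

In practice the text would come from an LDAP query against a BDII rather than from a string constant; that step is left out here.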
   OSG and EGEE Validation
     Interoperability (cont)
• Use of existing GGUS - OSG GOC ticket exchange
  for error reporting
   – SAM database to use contact information for OSG
      GOC
• Issue of coordinating scheduled downtime
   – OSG GOC will maintain a web page with
      downtimes
• Propose a review of the effort needed to add OSG-specific
  validations to the SAM framework.
• Testing and iterative development will be
  accomplished using Pre-Production sites and OSG
  ITB
DB monitoring in SAM for Tier 1s (Dirk Duellmann)
• Jobs connect to the DB either over HTTP (VO lib) or directly to
  Oracle (Instant Client)
• Should be completed by October when experiments will start
  using DBs
•   CMS + ALICE don't need them, only 'squid'
• existing DB monitoring is too detailed for SAM/SFT, but SAM
  could provide high-level monitoring of the DB service
• some DB services (like LFC) are already tested by SAM, BUT
  only the functionality is tested, not the DB! The tests could be:
   – threshold for connection between T0 -> T1
   – user access (squid)
   – client latency (?)
• Oracle client will be installed on the Worker Nodes
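
A threshold test of the kind proposed above could look roughly like the sketch below. It times a single connection attempt and compares the result with a limit. The function name, the status labels ("ok", "warn", "error") and the injectable clock are assumptions made for illustration; the `connect` callable stands in for whatever HTTP or Oracle client call a real sensor would use.

```python
import time


def check_latency(connect, threshold_s, clock=time.perf_counter):
    """Time one call to `connect` and compare it with a threshold.

    Returns (status, elapsed_seconds), where status is:
      "ok"    - connected within the threshold
      "warn"  - connected, but slower than the threshold
      "error" - the connection attempt raised an exception
    """
    start = clock()
    try:
        connect()
    except Exception:
        return "error", clock() - start
    elapsed = clock() - start
    status = "ok" if elapsed <= threshold_s else "warn"
    return status, elapsed


if __name__ == "__main__":
    # Placeholder connection: a real sensor would open a DB or squid
    # connection here instead of sleeping.
    status, elapsed = check_latency(lambda: time.sleep(0.05), threshold_s=1.0)
    print(status, round(elapsed, 3))
```

Passing the clock as a parameter keeps the check testable without real network delays.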
Comments/Discussion