Grid Operations Issues and Difficulties
       Leigh Grundhoefer
       Indiana University
       leighg@indiana.edu


       Open Science Grid Operations
       Workshop
       December 1, 2004
    Operations
   Many sites still experience installation and resource certification
    problems.
      Lack of step-by-step installation instructions for most clusters.

      Effort on “generalized” rather than “specialized” install process.

      Many configuration steps are required after “pacman” command completes.

   Need more tests of real-world usage. A common complaint is “The site status is
    green but I can’t use it…”
      More enhancements for operational verification software (site_verify)

      Lack of sufficient monitoring and publication of available storage

      No policy publication and little policy enforcement

   No redundancy for “single point of failure” services, e.g., VOMS server,
    documentation services, installation caches
      Avoid centrality and hopefully avoid catastrophic grid failures
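
The “green but unusable” complaint above suggests verifying a site by what a user actually needs to do, not only by whether a service answers. A minimal sketch of that idea follows. It is not how site_verify works; the function name `verify_site` and the three check names are hypothetical, and the lambdas are stand-ins for real probes.

```python
from typing import Callable, Dict, List, Tuple

# Each check is a (name, callable) pair; the callable returns True on success.
Check = Tuple[str, Callable[[], bool]]


def verify_site(checks: List[Check]) -> Dict[str, object]:
    """Run every check and report the site usable only if all of them pass."""
    results: Dict[str, bool] = {}
    for name, check in checks:
        try:
            results[name] = bool(check())
        except Exception:
            # A probe that raises counts as a failed check, not a verifier crash.
            results[name] = False
    # An empty check list proves nothing, so it is not reported as usable.
    usable = bool(results) and all(results.values())
    return {"usable": usable, "results": results}


# Hypothetical stand-ins: a real probe would contact the site, run a test
# job, and fetch its output. Here the last step is made to fail on purpose.
example_checks: List[Check] = [
    ("service_responds", lambda: True),
    ("test_job_completes", lambda: True),
    ("output_file_retrievable", lambda: False),
]

report = verify_site(example_checks)
print(report["usable"])   # False: the service answers, but a user step fails
print(report["results"])
```

With checks ordered like a user workflow, the per-check results show which step breaks even when the basic service check is green.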




    Operations
   Problems keeping communication channels open with resource
    administrators
      Problem solving with resource providers stalls.

      Infrastructure updates and configuration management are done as “best effort”.

       As a result, simple problems persist for extended periods of time.
   Inconsistent error reporting and inaccessible log data increase
    complexity.
   The lack of documented remote diagnostics procedures makes problem solving harder.
   No bug-tracking procedure exists to feed problems back to software
    developers.
   Lack of required personnel to research and correct all problems
      What is the correct ratio of sites to support personnel?

      How much should be invested?
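
One way to picture the consistent error reporting called for above is a single record format that every site emits. The sketch below is an illustration only; the class `ErrorReport`, its field names, and the example values are all assumptions, not an existing grid format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ErrorReport:
    """Hypothetical common record for a problem seen at a site."""
    site: str
    service: str
    severity: str   # e.g. "warning" or "error"
    message: str
    timestamp: str  # ISO 8601 string, assumed to be UTC


def to_json(report: ErrorReport) -> str:
    """Serialize a report to one JSON line with a stable key order."""
    return json.dumps(asdict(report), sort_keys=True)


# Example values are invented for illustration.
example = ErrorReport(
    site="example-site",
    service="job-submission",
    severity="error",
    message="test job did not complete",
    timestamp="2004-12-01T00:00:00Z",
)

line = to_json(example)
print(line)
```

Because every record carries the same fields, reports from different sites can be collected in one place and passed on to software developers without per-site parsing.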




    Support
   Support has thus far focused almost exclusively on the resource
    providers, which is typical of early adoption of Internet
    software.
   Much planning and focus on reacting to critical
    problems. Other problems go unreported or
    unnoticed.
   No training activities. An important part of good
    support?
   Informal channels of support abound.



