The CERN Computer Centres

  October 14th 2005


    Tony.Cass@CERN.ch
Talk Outline
 Where
 Why
 What
 How
 Who




Where
 Where
  – B513
    » Main Computer Room, ~1,500m2 & 1.5kW/m2, built for mainframes
      in 1970, upgraded for LHC PC clusters 2003-2005.
    » Second ~1,200m2 room created in the basement in 2003 as
      additional space for LHC clusters and to allow ongoing operations
      during the main room upgrade. Cooling limited to 500W/m2.
  – Tape Robot building ~50m from B513
    » Constructed in 2001 so that an incident in B513 could not destroy
      all CERN data.
 Why
 What
 How
 Who
Why
 Where
 Why
  – Support
    » Laboratory computing infrastructure
            Campus networks—general purpose and technical
            Home directory, email & web servers (10k+ users)
            Administrative computing servers
    » Physics computing services
            Interactive cluster
            Batch computing
            Data recording, storage and management
            Grid computing infrastructure

 What
 How
 Who


Physics Computing Requirements
 25,000k SI2K in 2008, rising to 56,000k in 2010
  – 2,500-3,000 boxes
  – 500kW-600kW @ 200W/box (see the check below)
      2.5MW @ 0.1W/SI2K
 6,800TB online disk in 2008, 11,800TB in 2010
  – 1,200-1,500 boxes
  – 600kW-750kW
 15PB of data per year
  – 30,000 500GB cartridges/year
  – Five 6,000 slot robots/year
 Sustained data recording at up to 2GB/s
  – Over 250 tape drives and associated servers

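A quick back-of-the-envelope check of the CPU figures above, sketched in Python. The
200W/box and 0.1W/SI2K values and the 2,500-3,000 box range are the ones quoted on the
slide; everything else is simple arithmetic:

    # Check of the 2008 CPU capacity and power figures quoted above.
    si2k_total = 25_000_000      # 25,000k SI2K required in 2008
    watts_per_si2k = 0.1         # power efficiency figure from the slide
    watts_per_box = 200          # per-box power from the slide

    power_at_quoted_efficiency = si2k_total * watts_per_si2k          # 2,500,000 W = 2.5 MW
    power_from_boxes = (2500 * watts_per_box, 3000 * watts_per_box)   # 500 kW - 600 kW

    print(f"Power at 0.1 W/SI2K: {power_at_quoted_efficiency / 1e6:.1f} MW")
    print(f"Power for 2,500-3,000 boxes: "
          f"{power_from_boxes[0] / 1e3:.0f}-{power_from_boxes[1] / 1e3:.0f} kW")
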
What are the major issues
 Where
 Why
 What   are the major issues
  – Commodity equipment from multiple vendors
  – Large scale clusters
  – Infrastructure issues
    » Power and cooling
    » Limited budget
 How
 Who




Commodity equipment & many vendors
   Given the requirements, significant pressure to limit cost
    per SI2K and cost per TB.
   Open tender purchase process
    – Requirements in terms of box performance
    – Reliability criteria seen as subjective and so difficult to incorporate
      in process.
        » Also, as internal components are similar, are branded boxes intrinsically
          more reliable?
    Cost requirements and the tender process lead to "white box"
     equipment, not branded.
    The tender purchase process leads to frequent changes of
     bidder.
      – Good in that there is competition and we aren't reliant on a single
        supplier.
      – Bad as we must deal with many companies, most of whom are remote
        and subcontract maintenance services.
Large Scale Clusters
 The large number of boxes leads to problems in terms of
  – Maintaining software homogeneity across the clusters
  – Maintaining services despite the inevitable failures
  – Logistics
     » Boxes arrive in batches of O(500)
     » Are vendors respecting the contractual warranty times?
           (Have they returned the box we sent them last week…)
     » How to manage service upgrades
           especially as not all boxes for a service will be up at the time of upgrade

  – …




Infrastructure Issues
 Cooling capacity limits the equipment we can install
  – Maximum cooling of 1.5kW/m2
  – 40x1U servers @ 200W/box = 8kW/m2 (see the sketch after this slide)


 We cannot provide diesel backup for the full
  computer centre load.
  – Swiss/French auto-transfer covers most failures.
  – Dedicated zone for "critical equipment" with diesel
    backup and dual power supplies.
       » Limited to 250kW for networks and laboratory computing
         infrastructure.
       » … and physics services such as Grid and data management servers
             but not all the physics network, so careful planning needed in terms of
              switch/router allocations and the power connections.

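A minimal sketch of the power-density arithmetic behind the cooling problem above. The
200W/box figure and the 1.5kW/m2 limit come from the slide; the ~1m2 rack footprint is
an assumption for illustration only:

    # Heat density of a fully loaded rack of 1U servers vs. the room's cooling limit.
    servers_per_rack = 40        # 40 x 1U servers in one rack
    watts_per_server = 200       # per-box power from the slide
    rack_footprint_m2 = 1.0      # assumed footprint, for illustration

    rack_density = servers_per_rack * watts_per_server / rack_footprint_m2   # 8000 W/m2
    cooling_limit = 1500                                                      # 1.5 kW/m2

    print(f"Rack heat density: {rack_density / 1000:.1f} kW/m2")
    print(f"Room cooling limit: {cooling_limit / 1000:.1f} kW/m2")
    print(f"Fraction of a full rack the room can absorb: {cooling_limit / rack_density:.0%}")
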
How
 Where
 Why
 What
 How
  – Rigorous, centralised control
 Who




ELFms
 Extremely Large Farm management system
  – tools to:
     » deliver the required configuration
     » monitor performance and any deviation from the required state
     » track nodes through hardware and software state changes
  [Diagram: the node, surrounded by Configuration Management and Node Management]

 Three components:
  – quattor for configuration, installation and node
    management
  – Lemon for system and service monitoring
  – Leaf for managing state changes—both hardware (HMS)
    and software (SMS)

quattor
 quattor takes care of the configuration, installation
 and management of nodes.
  – A Configuration Database holds the 'desired state' of all
    fabric elements
     » Node setup (CPU, HD, memory, software RPMs/PKGs, network,
       system services, location, audit info…)
     » Cluster (name and type, batch system, load balancing info…)
     » Defined in templates arranged in hierarchies – common
       properties set only once
  – Autonomous management agents running on the node
    take care of
     » Base installation
     » Service (re-)configuration
     » Software installation and management

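The CDB itself is written in the Pan template language; purely as an illustration of the
"common properties set only once" idea, here is a small Python sketch of hierarchical
template composition. All names (clusters, packages, helpers) are hypothetical, not the
actual quattor API:

    # Hypothetical illustration of hierarchical configuration templates:
    # common properties are defined once and overridden per cluster and per node.
    from copy import deepcopy

    def merge(base: dict, overrides: dict) -> dict:
        """Recursively overlay 'overrides' on top of 'base'."""
        result = deepcopy(base)
        for key, value in overrides.items():
            if isinstance(value, dict) and isinstance(result.get(key), dict):
                result[key] = merge(result[key], value)
            else:
                result[key] = value
        return result

    # Hierarchy: site-wide defaults -> cluster template -> per-node profile.
    common = {"rpms": ["openssh", "ntp"], "services": {"monitoring_agent": True}}
    batch_cluster = {"rpms": ["openssh", "ntp", "batch-client"], "batch": {"enabled": True}}
    node = {"network": {"hostname": "lxb0001"}, "hardware": {"memory_mb": 2048}}

    desired_state = merge(merge(common, batch_cluster), node)
    print(desired_state)
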
quattor architecture
 [Architecture diagram: the CDB configuration server, with SQL and XML backends, is fed
  via CLI, GUI and scripts over SOAP; XML configuration profiles are published over HTTP
  to the managed nodes, where the Node Configuration Manager (NCM) runs configuration
  components (CompA, CompB, CompC) and the SW Package Manager (SPMA) installs RPMs/PKGs
  from the SW repositories; the install server drives base OS installation via HTTP/PXE.]

Lemon
 Lemon (LHC Era Monitoring) is a client-server tool
 suite for monitoring status and performance
 comprising
  – sensors to measure the values of various metrics
      » Several sensors exist for node performance, process, hardware and
        software monitoring, database monitoring, security and alarms
      » "External" sensors for metrics such as hardware errors and
        computer centre power consumption.
  – a monitoring agent running on each node. This manages
    the sensors and sends data to the central repository
  – a central repository to store the full monitoring history
     » two implementations, Oracle or flat file based
  – an RRD based display framework
     » Pre-processes data into rrd files and creates cluster summaries
            Including "virtual" clusters such as the set of nodes being used by a given
             experiment.

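As a rough sketch of the sensor/agent split described above (hypothetical names and
transport; the real Lemon agent ships samples to the central repository over its own
TCP/UDP protocol):

    # Sketch of a monitoring agent managing sensors and forwarding samples.
    import os
    import time

    def load_average_sensor():
        """1-minute load average, a typical node performance metric."""
        return os.getloadavg()[0]

    def root_disk_sensor():
        """Fraction of the root filesystem in use."""
        stat = os.statvfs("/")
        return 1.0 - stat.f_bavail / stat.f_blocks

    SENSORS = {"loadavg_1min": load_average_sensor, "root_disk_used": root_disk_sensor}

    def send_to_repository(timestamp, samples):
        """Stand-in for the transport to the central repository."""
        print(timestamp, samples)

    def agent_loop(interval_s=60, iterations=3):
        for _ in range(iterations):
            samples = {name: sensor() for name, sensor in SENSORS.items()}
            send_to_repository(int(time.time()), samples)
            time.sleep(interval_s)

    agent_loop(interval_s=1)
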
Lemon architecture
 [Architecture diagram: the monitoring agent and its sensors on each node send samples
  over TCP/UDP to the central monitoring repository (SQL backend); correlation engines
  and the Lemon CLI query the repository via SOAP, and an RRDTool/PHP front end behind
  apache serves the displays over HTTP to web browsers on user workstations.]

Leaf
   LEAF (LHC Era Automated Fabric) is a collection of
    workflows for high level node hardware and software
    state management, built on top of quattor and Lemon.
    – HMS (Hardware Management System)
       » Track systems through all physical steps in the lifecycle, e.g. installation,
         moves, vendor calls, retirement
       » Automatically issues install, retirement etc. requests to technicians
       » GUI to locate equipment physically
       » HMS implementation is CERN specific, but concepts and design should be
         generic
    – SMS (State Management System)
       » Automated handling (and tracking of) high-level configuration steps
              Reconfigure and reboot all LXPLUS nodes for new kernel and/or physical move
              Drain and reconfig nodes for diagnosis / repair operations
       » Issues all necessary (re)configuration commands via quattor
       » extensible framework – plug-ins for site-specific operations possible
    – CCTracker (in development)
       » shows location of equipment in room
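A rough sketch of the state-tracking idea behind HMS/SMS. The states and allowed
transitions below are illustrative only, not the actual CERN workflow definitions:

    # Illustrative node state machine: only whitelisted transitions are allowed,
    # and the full lifecycle history is kept, as HMS/SMS do for real nodes.
    ALLOWED = {
        "on_order":      {"installing"},
        "installing":    {"production"},
        "production":    {"draining", "standby"},
        "draining":      {"standby"},
        "standby":       {"production", "vendor_repair", "retired"},
        "vendor_repair": {"standby"},
    }

    class Node:
        def __init__(self, name, state="on_order"):
            self.name = name
            self.state = state
            self.history = [state]

        def set_state(self, new_state):
            if new_state not in ALLOWED.get(self.state, set()):
                raise ValueError(f"{self.name}: illegal transition {self.state} -> {new_state}")
            self.state = new_state
            self.history.append(new_state)

    node = Node("lxb0001")
    for step in ("installing", "production", "draining", "standby", "production"):
        node.set_state(step)
    print(node.history)
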
Use Case: Move rack of machines
 [Workflow diagram: Operations, HMS, SMS, CDB, the LAN DB, the sysadmins and the node
  exchange the numbered steps 1. Import; 2. Set to standby; 3. Update; 4. Refresh;
  5. Take out of production; 6. Shutdown work order; 7. Request move; 8. Update;
  9. Update; 10. Install work order; 11. Set to production; 12. Update; 13. Refresh;
  14. Put into production.]

Who
 Where
 Why
 What
 How
 Who
  – Contract Shift Operators: 1 person 24x7
  – Technician level System Administration Team
     » 10 team members plus 3 people for machine room operations plus
       engineer level manager
  – Engineer level teams for Physics computing
     » System & Hardware support: approx 10FTE
     » Service support: approx 10FTE
     » ELFms software: 3FTE plus students and collaborators.
            ~30FTE-years total investment since 2001
Summary
 Physics requirements, budget and tendering
  process lead to large scale clusters of commodity
  hardware.
 We have developed and deployed tools to install,
  configure and monitor nodes and to automate hardware
  and software lifecycle steps.
 Services must cope with individual node failures
  – already the case for simple services such as batch
  – new data management software being introduced to
    reduce reliance on individual servers
  – focussing now on grid level services
 We believe we are well prepared for LHC computing
  – but expect managing the large scale, complex
    environment to be an exciting adventure