CERN Computing Seminars



A Flexible Fabric and Application
     Information Recorder
 Tome Anticic (1), Ruzica Piskac (2), Vedran Sego (2)

            For the ALICE collaboration

1) Rudjer Boskovic Institute, Zagreb, Croatia
2) PMF, Zagreb, Croatia

   What is/will be monitored
   Requirements
   Tools used in AFFAIR
   AFFAIR components
        DIM/SMI
        Round robin databases
        ROOT
        AFFAIR Collectors
        AFFAIR Monitor
        AFFAIR web

 Status of AFFAIR/ performances achieved
 Future work

             AFFAIR                  2      tome.anticic@cern.ch   June 2002
                          ALICE DAQ
 [Figure: ALICE DAQ data flow; labels recovered from the diagram]
    Detectors: Inner tracking system, TPC, TRD, Particle identification, Muon, Trigger detectors
    Readout chain: detector buffers, DDL, RORC, LDC, Gigabit SWITCH, GDC, SWITCH, PDS (CASTOR/ROOT)
    Multiplicity labels: x 435 DDL, x 334 RORC, x 278 LDC, x 50 GDC; 216 appears twice
    Rate labels: 100 Mb/s (twice), 60 Mb/s, 40 Mb/sec, 1.25 GB/s
    Trigger system: Trigger data; L0 trigger (1.2 msec), L1 trigger (5.5 msec), L2 trigger (88.0 msec), L3 trigger

    DDL  - Detector Data Link
    RORC - Read-Out Receiver Card
    LDC  - Local Data Concentrator
    GDC  - Global Data Collector
    EDM  - Event Destination Manager
    PDS  - Permanent Data Storage

                  Why a monitoring program?
 Long-term: Monitoring for final ALICE DAQ
 Now: ALICE Data Challenges

    Testing of ALICE DAQ and mass storage to be ready for LHC
    Need to monitor system performance of 100s (1000s) of nodes
                  CPU             Disk used
                  Disk IO         RAM free
                  Network IO      Disk free

    Need to monitor the ALICE DAQ software (DATE) and hardware performance
                  LDC             Switches
                  GDC             Storage
                  RORC            DDL


   Need updates every 10 sec (or even less)
   Reliable and robust
   Easy to maintain, easy to set up/install
   Should be as “invisible” as possible
      No growing logfiles (or better yet, none) on monitored nodes
      Not CPU intensive
      Not network intensive

   Flexible: new monitoring parameters/software should be easy to add/adjust
   Web access to processed, real-time data in the form of graphs, histograms, ...
   Scalable – should work as well for 1000 computers as for 10
   All monitored data should be permanently stored for offline analysis
   Wide area transparency

                          AFFAIR architecture
 [Figure: labels recovered from the diagram: station, Data storage, web, DATA,
  Data Collector (shown three times)]

 Previous ADC monitoring used DATE’s info loggers to gather and process
  data (P. Saiz, K. Schossmaier)
         • Worked, but a more general and flexible tool is needed

 Analysis of open source tools showed none completely fulfilled the
  requirements

 Combined a number of tools and added our own
       DIM
       SMI
       ROOT
       rrdtool
       Sysstat
       Apache/php

                         AFFAIR overview at ADC
 [Figure: AFFAIR components at the ADC; labels recovered from the diagram]
    Run Control (finite state machine)
    Monitored nodes (~100), each with:
        AFFAIR Collector-DATE (LDC, GDC, …)
        AFFAIR Collector-System (CPU, IO, …)
    Links between collectors and monitor: Control, parameters, configuration; DATA
    AFFAIR monitor, with its database (rrdtool, ROOT)
    AFFAIR plotter (ROOT)
    AFFAIR web

                              Why not SNMP?
 Not simple at all!
 Need root intervention to get started
 At least 2 times the number of calls, and the application is busy/wasting time during
  calls
     [Figure: the client waits while the server is busy; every Data transfer needs its own Request]

 Each variable monitored (CPU, network IO,…) requires separate calls, with
  all the overhead
 Use of SNMP limits one to just standard monitored parameters, so any
  specialized ones can’t be monitored
     Especially true for applications
                         How does DIM work

 Data transferred asynchronously, interrupt driven
     Half as many calls
     Parallelism – client can do other things while server busy

     [Figure: the Client sends one Request at startup; the Server then sends Data repeatedly,
      while Client and Server each stay busy doing their own job]
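
The difference in call counts between request/response polling and a subscribe-once scheme can be sketched with simple arithmetic. This is a minimal illustration, not DIM or SNMP code: the function names and the example figures (360 updates, 4 variables) are assumptions chosen for the sketch.

```python
def polling_messages(updates: int, variables: int) -> int:
    """Request/response polling: each update of each variable costs a request plus a reply."""
    return 2 * updates * variables


def subscription_messages(updates: int, variables: int) -> int:
    """Subscribe once: one request per variable at startup, then one data message per update."""
    return variables + updates * variables


if __name__ == "__main__":
    # Assumed example: one hour of 10-second updates (360) for 4 monitored variables.
    updates, variables = 360, 4
    print("polling:     ", polling_messages(updates, variables))
    print("subscription:", subscription_messages(updates, variables))
```

For long runs the startup requests become negligible, so the subscription scheme approaches half the messages of polling.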

                                DIM in practice
 Client/Server, with a Name server in between
      Server registers its service with the Name server (e.g. LDC data for tbed0001)
      Client requests services from the Name server
      Servers then send data to the Client

 If a client or server goes down and comes back up, the connection is re-established automatically
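
The register/request/reconnect pattern described above can be sketched in a few lines. This is an in-process toy under stated assumptions, not the DIM API: the classes `NameServer` and `Publisher`, their methods, and the service name `tbed0001/LDC` are all hypothetical.

```python
from typing import Callable, Dict, List

Callback = Callable[[List[float]], None]


class NameServer:
    """Hypothetical stand-in for a name server: maps service names to publishers."""

    def __init__(self) -> None:
        self._services: Dict[str, "Publisher"] = {}
        self._subscribers: Dict[str, List[Callback]] = {}

    def register(self, name: str, publisher: "Publisher") -> None:
        self._services[name] = publisher
        # Attach every client that asked for this service, including clients
        # that subscribed before the server came up or while it was down.
        for callback in self._subscribers.get(name, []):
            publisher.attach(callback)

    def unregister(self, name: str) -> None:
        self._services.pop(name, None)

    def subscribe(self, name: str, callback: Callback) -> None:
        self._subscribers.setdefault(name, []).append(callback)
        if name in self._services:
            self._services[name].attach(callback)


class Publisher:
    """Hypothetical server side: registers a named service, then pushes data to clients."""

    def __init__(self, name: str, name_server: NameServer) -> None:
        self._callbacks: List[Callback] = []
        name_server.register(name, self)

    def attach(self, callback: Callback) -> None:
        self._callbacks.append(callback)

    def publish(self, values: List[float]) -> None:
        for callback in self._callbacks:
            callback(values)


if __name__ == "__main__":
    received: List[List[float]] = []
    ns = NameServer()
    ns.subscribe("tbed0001/LDC", received.append)  # client asks first
    server = Publisher("tbed0001/LDC", ns)         # server comes up and registers
    server.publish([1.0, 2.0])
    ns.unregister("tbed0001/LDC")                  # server goes down
    restarted = Publisher("tbed0001/LDC", ns)      # and up again: client is re-attached
    restarted.publish([3.0])
    print(received)
```

The client subscribes only once; the name server keeps the subscription and re-attaches it whenever a server registers the same service name again.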

                 AFFAIR Collector program: DATE
 [Figure: AFFAIR Collector for DATE; labels recovered from the diagram]
    AFFAIR Collector, using the DIM/SMI library
        Registers its service with the Name server
        SMI (finite state machine): start/stop at beginning/end of each run
        Endless loop with 10 sec period
    Read from DATE (LDC, GDC) through shared memory:
        - Bytes recorded
        - Bytes injected
        - Events recorded
        - Run status
        - …
    Links: Default parameters/status; DATA (array of real numbers)

             AFFAIR Collector program: System
 [Figure: AFFAIR Collector for system data; labels recovered from the diagram]
    AFFAIR Collector, using the AFFAIR DIM/SMI library
        Registers its service with the Name server
        Endless loop with 10 sec period
    From /proc:
        - CPU
        - Net IO
        - Disk IO
        - RAM available
        - …
    System commands:
        - login sessions
        - Disk free
        - CPU number
        - Kernel version
        - /tmp size
    Links: Configuration/start; Default parameters/status; DATA (array of real numbers)
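
As an illustration of how a CPU figure can be derived from /proc, the sketch below computes the busy fraction between two snapshots of the aggregate `cpu` line of `/proc/stat`. It is not the AFFAIR collector code: the function names are hypothetical, and the snapshots are inline sample text so the example runs anywhere (on Linux the text would come from reading `/proc/stat` twice, one period apart).

```python
from typing import List


def parse_cpu_line(stat_text: str) -> List[int]:
    """Return the counters of the aggregate 'cpu' line (user, nice, system, idle, ...)."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def cpu_busy_fraction(before: List[int], after: List[int]) -> float:
    """Fraction of time spent non-idle between two snapshots; the fourth counter is idle."""
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if total == 0:
        return 0.0
    return (total - deltas[3]) / total


if __name__ == "__main__":
    # Sample snapshots (made-up numbers) standing in for two reads of /proc/stat.
    first = "cpu  100 0 50 850\ncpu0 100 0 50 850\n"
    second = "cpu  160 0 90 950\ncpu0 160 0 90 950\n"
    print(cpu_busy_fraction(parse_cpu_line(first), parse_cpu_line(second)))
```

Because only counter differences are used, the collector needs no state beyond the previous snapshot, which fits a fixed-period loop.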

                                           AFFAIR monitor
 [Figure: AFFAIR monitor; labels recovered from the diagram]
    AFFAIR monitor, using the DIM library
        Requests services from the Name server, using configuration file
        Links: Configuration/start; Default parameters/status; (array of real numbers)
        Writes to rrd 1, rrd 2, rrd 3 (create if not existing)
    Conf file listing all possible computers:
        tbed0002
        ….
        xxshare005d
        …
    Conf file for parameters, like period monitored, partition monitored, …:
        tbed0001 GDC 10
        tbed0001 LDC 10
        tbed0001 SYSMON 10 hda1 hda2
        …
        Default values automatically generated, but can be changed by hand
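
A parser for parameter lines of the kind shown on this slide (`tbed0001 SYSMON 10 hda1 hda2`) could look like the sketch below. The field meanings are assumptions read off the slide: host, data set, period in seconds, then optional extras such as partitions. The `MonitorEntry` type, the function name and the `#` comment syntax are hypothetical, not part of AFFAIR.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class MonitorEntry:
    host: str                # e.g. tbed0001
    dataset: str             # e.g. GDC, LDC, SYSMON
    period_sec: int          # assumed meaning of the third field
    extras: Tuple[str, ...]  # assumed: extra parameters such as partitions


def parse_monitor_conf(text: str) -> List[MonitorEntry]:
    """Parse whitespace-separated lines; blank lines and '#' comments are skipped."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        host, dataset, period, *extras = line.split()
        entries.append(MonitorEntry(host, dataset, int(period), tuple(extras)))
    return entries


if __name__ == "__main__":
    sample = """
    tbed0001 GDC 10
    tbed0001 LDC 10
    tbed0001 SYSMON 10 hda1 hda2
    """
    for entry in parse_monitor_conf(sample):
        print(entry)
```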

                        Round robin databases
 Very fast and efficient data storage mechanism
 Works with fixed amount of data (fixed time depth)
 Works with pointer to current element
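     [Figure: circular buffer with slots Time = 1 to Time = 5 and a pointer to the current
      element; the new entry Time = 6 takes the slot of the oldest entry, Time = 1]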

 Each data set (LDC, GDC, system info) for each machine has its own rrd
 Each is created so that it holds:
       10 sec info for last 1 hour – 360 rows deep
       1 min for last 6 hours      – 360 rows deep
       4 min for last 24 hours     – 360 rows deep
       … for last 6 hours, 1 month, 3 months, 1 year
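
The fixed-depth storage with a pointer to the current element can be sketched as a small ring buffer. This is an illustration of the idea only, not rrdtool's file format; the class name and methods are hypothetical. The depth arithmetic matches the rows listed above: 360 rows at 10 sec cover 1 hour, at 1 min cover 6 hours, and at 4 min cover 24 hours.

```python
from typing import List, Optional


class RoundRobinArchive:
    """Fixed number of rows; new values overwrite the oldest ones."""

    def __init__(self, rows: int, step_sec: int) -> None:
        self.rows = rows
        self.step_sec = step_sec
        self._values: List[Optional[float]] = [None] * rows
        self._current = -1  # index of the most recently written row
        self._count = 0     # number of rows filled so far

    def append(self, value: float) -> None:
        # Advance the pointer, wrapping around at the end of the buffer.
        self._current = (self._current + 1) % self.rows
        self._values[self._current] = value
        self._count = min(self._count + 1, self.rows)

    def depth_sec(self) -> int:
        """Time span covered by a full archive."""
        return self.rows * self.step_sec

    def latest(self, n: int) -> List[Optional[float]]:
        """The last n stored values, oldest first."""
        n = min(n, self._count)
        return [self._values[(self._current - i) % self.rows] for i in range(n - 1, -1, -1)]


if __name__ == "__main__":
    for step in (10, 60, 240):
        print(step, "sec step ->", RoundRobinArchive(360, step).depth_sec() / 3600, "hours")
    small = RoundRobinArchive(rows=5, step_sec=10)
    for t in range(1, 7):  # the sixth value overwrites the first
        small.append(float(t))
    print(small.latest(5))
```

Storage size is fixed at creation time, so the database never grows no matter how long monitoring runs.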
                     Round robin databases II

 For each time bin the data is resampled (interpolated) so that values stay on
  fixed intervals

     e.g. if the rrd stores in 10 second intervals:
     T = 10, value = 100 is stored as 100 at time = 10
     T = 21, value = 111 is stored as 110 at time = 20
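
The numbers in this example are consistent with linear interpolation between the two samples: 100 + (111 - 100) * (20 - 10) / (21 - 10) = 110. The sketch below reproduces that arithmetic; it only illustrates the slide's example and is not rrdtool's internal algorithm. The function name is hypothetical, and sample times are assumed to be strictly increasing.

```python
import math
from typing import Dict, List, Tuple


def resample_to_grid(samples: List[Tuple[float, float]], step: int) -> Dict[int, float]:
    """Linearly interpolate (time, value) samples onto multiples of `step`."""
    grid: Dict[int, float] = {}
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        # First grid time at or after the start of this segment.
        g = math.ceil(t0 / step) * step
        while g <= t1:
            grid[g] = v0 + (v1 - v0) * (g - t0) / (t1 - t0)
            g += step
    return grid


if __name__ == "__main__":
    # The two samples from the slide, resampled to a 10 second grid.
    print(resample_to_grid([(10, 100), (21, 111)], step=10))
```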

 Data Consolidation

     For each time period rrdtool finds the average, maximum and minimum (each requires
      a separate row)
     These are very useful for graphs
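
Consolidation of fine-grained rows into coarser ones can be sketched as follows, for instance six 10-second values into one 1-minute row. This is a plain-Python illustration of average/maximum/minimum consolidation, not rrdtool itself; the function name and sample values are made up.

```python
from statistics import mean
from typing import Dict, List


def consolidate(values: List[float], per_row: int) -> List[Dict[str, float]]:
    """Group consecutive values and keep average, max and min of each complete group."""
    rows = []
    for start in range(0, len(values) - per_row + 1, per_row):
        chunk = values[start:start + per_row]
        rows.append({"average": mean(chunk), "max": max(chunk), "min": min(chunk)})
    return rows


if __name__ == "__main__":
    # Two minutes of made-up 10-second samples, consolidated to 1-minute rows.
    ten_sec = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0]
    for row in consolidate(ten_sec, per_row=6):
        print(row)
```

Keeping the maximum next to the average is what lets a long-range graph still show short peaks that averaging alone would flatten.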

                                     AFFAIR plotter
 [Figure: rrd 1, rrd 2, rrd 3 read by two AFFAIR ROOT processes, one producing graphs and one
  writing permanent storage; example plot with In and Out curves]

 ROOT is used to generate, for each of the last hour/6 hours/day/…:
 - eps graphs for each node
 - aggregate plots (e.g. total throughput in/out for …)
 - superimposed plots (e.g. GDC throughput for all machines on one plot)

 Data is put in permanent storage as ROOT files, for later detailed analysis

                         Graph configuration

 All graphs created using one configuration file
     Completely defines units, labels, and whether graphs are aggregated or superimposed
     Thus no code intervention needed to create the plots
        New monitored variables can be added and configured very easily

                    Graph examples
 GDC performance
                          Rates (kB/sec) for last 24 hours for some GDC nodes

                                                          • Full lines average
                                                          • Dashed lines max

                           Rates (kB/sec) for last 7 days for some GDC nodes

                    Graph examples II
 GDC performance
                                 Total Rates (kB/sec) for last hour for all GDC nodes

                                                      Aggregate plot, calculated by

                                 Recorded events for last 7 days for some GDC nodes

                        Graph examples III
 System performance for an individual computer
                                              CPU usage for last day for one computer (tbed0005)

 This format also defined using the configuration file

                                Web interface
   Web interface written using PHP/JavaScript
       Completely automatically generated
       New variables, monitored sets automatically reflected in plots


                                                                    • Click for last hour/6 hours/day …
                                                                    • On click, converts eps to png

                   Web interface II

         A click will generate
         plots for the machine
         and lead you to its page

                       AFFAIR performance

 Successfully monitoring ~100 computers at 10 sec intervals during the ADC for the
  last 2 months

 Delay between data received and generated graphs down to 1-5 minutes
     ~10000s of plots generated every few minutes

 Monitoring is done using two dual-CPU 1 GHz machines, connected with NFS

 No showstoppers encountered:
     Many problems found and solved during run
     Some sporadic small problems remaining – occasional improper shutdown of
      Collector nodes, graphs occasionally garbled, processes occasionally die.

                     AFFAIR performance II

 In the testing phase, monitored all lxshare machines at 1 second intervals

 Proved very useful in developing DATE, measuring DATE performance, and
  finding the source and solution of problems

 No reason to believe system cannot scale to ~1000s of nodes

     All CPU/disk intensive operations can trivially be spread across a number of nodes

     Main possible limitation is continuous generation of graphs for all individual
      computers (~10000 every few minutes), but is being taken care of (see next slide)

 AFFAIR quite general: easy to add additional applications/variables to monitor

             To do/near future (weeks/~month)

   Make graph generation more efficient – factor 2-5 with some coding changes
      Graphs will have a delay under 1 minute

   Much better web interface
        Buttons for type of plots, not just time periods (also automatic)
        Page with table with latest (last 10 seconds) numerical performance values
        More graphs, better graphs, prettier graphs
        …

   Enable graph generation for individual computers only when clicked

   Have a releasable version of code
      Documentation, user manual…

   Signals/ status of computers/applications
      E.g. disk full/CPU too high/events not incrementing…
      Color code the status for each computer/application on web page

                                   To do/long term

   Have an “affair control” interface to manage it (Kylix?)

   Option to store data not in fixed intervals, where data consolidation not wanted:
      e.g. event size/ trigger type

   Add a lot more graph options

   Option to store as SQL database

   Add option to have varying number of variables monitored
      e.g. free space in all partitions

   Detailed offline analysis code

   Add more AFFAIR Collectors for DATE
      Detector Data Optical Link
      Switches (might need to incorporate SNMP)
      Mass storage