					LHCb-INFN Computing for 2001-2002


Castel Gandolfo, September 13, 2001
Domenico Galli & Umberto Marconi
Present Status
•  LHCb-Italy has built up a Linux PC farm consisting of 18 CPUs and a 1 TB NAS disk.
•  Farm basic features:
   -  Diskless dual-processor (866 MHz PIII) motherboards
   -  Network boot (Intel PXE)
   -  Rack mounted
   -  Array of 14 × 80 GB EIDE disks configured in RAID-5 (the usable capacity is checked in the sketch after this list)
•  Technical details are available in our LHCb note:
   -  First Implementation of a Diskless Computing Farm for the LHCb Tier-1 Prototype
•  LHCb software and production tools installed.
•  Funding was obtained at the end of February 2001.
•  In production since the end of May 2001.
•  Integrated in the LHCb distributed production system since August 2001.
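A quick capacity check (an illustrative sketch; the disk count and size come from the bullet above, and the result assumes one disk's worth of capacity goes to RAID-5 parity):

# Usable capacity of the NAS array: RAID-5 over n disks keeps (n - 1)
# disks' worth of data, one disk equivalent being used for parity.
n_disks = 14
disk_size_gb = 80

usable_gb = (n_disks - 1) * disk_size_gb   # 13 * 80 = 1040 GB
print(f"usable capacity: {usable_gb} GB (~1 TB)")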
LHCb-Italy Computing Farm

[Picture of the LHCb-Italy computing farm]

    Diskless Operation
•  No jobs have died due to out-of-memory conditions.
•  The absence of swap space has shown no drawbacks.
•  The LHCb-INFN farm is working at 100% CPU load.
•  Disk I/O is about 7 Mbps for 18 simultaneous jobs (maximum 140 Mbps); the headroom is worked out in the sketch after this list.
•  No wait states due to I/O; the NFS-based Monte Carlo I/O shows no penalty.
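A back-of-the-envelope check of the I/O headroom (illustrative only; it simply restates the figures quoted above):

# Aggregate Monte Carlo disk I/O versus the quoted maximum throughput.
jobs = 18
aggregate_mbps = 7.0      # measured aggregate disk I/O for 18 jobs
max_mbps = 140.0          # quoted maximum

per_job_mbps = aggregate_mbps / jobs     # ~0.4 Mbps per job
headroom = max_mbps / aggregate_mbps     # ~20x below the maximum
print(f"per-job I/O: {per_job_mbps:.2f} Mbps, headroom: {headroom:.0f}x")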




Network Boot Operation
•  No problems in the simultaneous bootstrap of 10 PCs.
•  The use of Multicast TFTP (no additional network load when adding nodes to the farm) and of DHCP proxies (CPU nodes can be distributed across different VLANs) makes us confident that the system can scale.
•  OS disk accesses during normal operation are negligible.
•  Network boot can also be used by farms with local disks:
   -  Easy administration of large farms (1 logical disk contains the whole configuration of the farm).
   -  Reliability, thanks to the RAID redundancy in the boot server.
   -  CPUs can be easily moved from one experiment to another:
      -  It is enough to run a script that reconfigures the DHCP server and then reboot the PCs (a sketch follows this list).
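A minimal sketch of such a reassignment script, assuming a standard ISC-style dhcpd configuration; the node table, file locations and boot image names below are illustrative assumptions, not the actual LHCb-INFN setup:

#!/usr/bin/env python3
"""Reassign diskless nodes to another experiment by regenerating their
DHCP host entries and rebooting them (illustrative sketch only)."""

import subprocess

# Assumed inventory: MAC address -> (hostname, PXE boot image of the experiment).
NODES = {
    "00:30:48:aa:bb:01": ("node01", "lhcb/pxelinux.0"),
    "00:30:48:aa:bb:02": ("node02", "lhcb/pxelinux.0"),
}

HOSTS_FILE = "/etc/dhcpd.hosts.conf"   # assumed include file referenced by dhcpd.conf

def write_host_entries() -> None:
    """Regenerate one 'host' block per farm node with its boot image."""
    with open(HOSTS_FILE, "w") as conf:
        for mac, (host, image) in NODES.items():
            conf.write(
                f'host {host} {{\n'
                f'  hardware ethernet {mac};\n'
                f'  filename "{image}";\n'
                f'}}\n'
            )

def restart_dhcp_and_reboot_nodes() -> None:
    """Restart the DHCP server, then reboot every node so it picks up the new image."""
    subprocess.run(["/etc/init.d/dhcpd", "restart"], check=True)
    for _, (host, _) in NODES.items():
        subprocess.run(["ssh", host, "/sbin/reboot"], check=False)

if __name__ == "__main__":
    write_host_entries()
    restart_dhcp_and_reboot_nodes()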
Web Monitor of the Farm
http://lhcbpc1.cnaf.infn.it/monitor
•  An MRTG-compatible Web monitoring system has been developed (Java based; it transfers data rather than pre-rendered graphics over the network, and supports large plots and multiple plots overlapped or stacked). A sketch of the data-serving idea follows this list.
•  The monitoring system has been interfaced with the SMBus to read the CPU and motherboard temperatures and the fan speeds.
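A minimal sketch of the "ship data, not graphics" idea (not the actual Java implementation described above): a tiny HTTP handler serves raw sensor samples as plain text and plotting happens on the client. The sensor file path and the port are illustrative assumptions.

"""Serve the current sensor reading as plain text; the client draws the plot."""

from http.server import BaseHTTPRequestHandler, HTTPServer

HWMON_TEMP = "/sys/class/hwmon/hwmon0/temp1_input"  # assumed sensor file

def read_cpu_temperature() -> float:
    """Return the CPU temperature in degrees Celsius (sysfs reports milli-degrees)."""
    with open(HWMON_TEMP) as f:
        return int(f.read().strip()) / 1000.0

class MonitorHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Ship the raw sample, not a rendered image; plotting is client-side.
        body = f"cpu_temp_celsius {read_cpu_temperature():.1f}\n".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8080), MonitorHandler).serve_forever()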




    Produced events
•  June and July 2001:
   -  Produced 5 Mevents (2.5 TB) of minimum bias RAWH in 10 weeks, running stand-alone.
•  Since August 2001:
   -  The farm is running with the LHCb distributed production tools and is producing 9900 events/day of b-bbar inclusive DST2 (35 GB/day of RAWH + DST2).
LHCb Distributed MC Production

[Workflow diagram; a schematic sketch of the loop follows below]
•  Submit jobs remotely via Web
•  Execute on the farm
•  Monitor the performance of the farm via Web
•  Update the bookkeeping database (Oracle at CERN)
•  Transfer data to the CASTOR mass store at CERN
•  Data quality check
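A schematic of how the diagram's steps fit together, written as a Python sketch; every function here is a hypothetical stand-in for the corresponding LHCb production tool, not its real interface:

"""Sketch of one distributed-production cycle; stub bodies only print what
the real step would do."""

def submit_job_via_web(request: str) -> str:
    print(f"submit via Web interface: {request}")
    return request

def execute_on_farm(job: str) -> list[str]:
    print(f"run {job} on the farm")
    return [f"{job}.rawh", f"{job}.dst2"]

def copy_to_castor(dataset: str) -> None:
    print(f"transfer {dataset} to CASTOR at CERN")

def update_bookkeeping_db(dataset: str) -> None:
    print(f"update Oracle bookkeeping at CERN for {dataset}")

def check_data_quality(dataset: str) -> None:
    print(f"data quality check on {dataset}")

def run_production_cycle(request: str) -> None:
    # Mirrors the boxes of the workflow diagram in a plausible order;
    # Web monitoring of the farm runs alongside and is omitted here.
    job = submit_job_via_web(request)
    for dataset in execute_on_farm(job):
        copy_to_castor(dataset)
        update_bookkeeping_db(dataset)
        check_data_quality(dataset)

if __name__ == "__main__":
    run_production_cycle("bbbar_inclusive_500evts")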
How GRID Will Improve the Distributed
Production System in 2002
•  Job submission:
   -  The WP1 job submission tools allow jobs to be submitted securely through local GRID commands rather than by remote logon.
   -  Use of JDL (Job Description Language) to choose the "right" resource to which the job is submitted (a sketch follows this list).

•  Data copying:
   -  The WP2 GDMP tool, via its command line interface, to transfer Zebra format files (controlled from LHCb scripts). Globus authentication fixes the security problems.

•  Mass storage:
   -  WP5 interface to CASTOR.
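A minimal sketch of what script-driven submission could look like: build a JDL description and hand it to the WP1 submission command. The JDL attribute names follow EU DataGrid conventions, but the executable, file names, the Requirements expression and the exact command name (dg-job-submit is assumed here) are illustrative, not the LHCb production scripts themselves:

import subprocess

# Illustrative JDL for one Monte Carlo job (attribute values are made up).
JDL = '''\
Executable    = "run_sicbmc.sh";
Arguments     = "bbbar_inclusive 500";
StdOutput     = "sicbmc.out";
StdError      = "sicbmc.err";
InputSandbox  = {"run_sicbmc.sh", "production.cards"};
OutputSandbox = {"sicbmc.out", "sicbmc.err"};
Requirements  = other.OpSys == "Linux";
'''

def submit(jdl_text: str, jdl_file: str = "mcjob.jdl") -> None:
    """Write the JDL to a file and pass it to the Grid submission command."""
    with open(jdl_file, "w") as f:
        f.write(jdl_text)
    # The command name depends on the WP1 release in use; assumed here.
    subprocess.run(["dg-job-submit", jdl_file], check=True)

if __name__ == "__main__":
    submit(JDL)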


Milestones for 2001-2002
•  12.2001  Integration of the Grid tools (job scheduling + data transfer) into the LHCb distributed production system.
•  12.2001  Complete the installation of the farm for MC production (total: 2 TB disk + 50 CPUs).
•  01.2002  Production of 20% of the simulated events for the LHCb collaboration.
•  01.2002  [L0/L1 trigger TDR].
•  06.2002  Installation of a test farm prototype for analysis.
•  06.2002  Make it possible to run analysis jobs on an entire data set distributed among several centres.
•  07.2002  First LHCb Data Challenge.
•  12.2002  [Computing TDR].
•  12.2002  Test of OO software functionality (report to the collaboration by July '02).
•  2003     [High Level Trigger TDR].
Additional Request for the Year 2001
•  Complete the MC farm configuration, reaching the planned size (2300 SI95 + 2 TB disk, as requested in September 2000):
   -  15 dual-processor PCs: 30 k€ + VAT
   -  1 NAS server, 1 TB: 30 k€ + VAT
   -  Upgraded link for the NAS and the switch: 5 k€ + VAT
   -  Accessories: 5 k€ + VAT (rack-mounted console, VGA switch + thin-client workstation to be installed at CNAF)

•  Total: 70 k€ + VAT = 85 k€



LHCb Trigger Fundamentals

Suppression factors applied by the LHCb triggers to minimum bias events:

   Level   Input rate   Output rate   Suppression factor
   L0      16 MHz       1 MHz         16
   L1      1 MHz        40 kHz        25
   L23     40 kHz       200 Hz        200

Preliminary estimate of the event quark content at the various trigger levels:

   Events     Light (%)   Charm (%)   Beauty (%)
   Produced   90          10          0.6
   After L0   86          13          1
   After L1   79          15          6
   After L2   44          23          33
   After L3   0           0           100

MC Background Data for L23 Trigger
•  The L23 trigger bandwidth has to be shared among several (~10) physics channels.
•  An L23 total suppression of 200 therefore implies a minimum bias suppression of ~2000 in each physics channel.
•  A ~10% uncertainty on the MB suppression factor requires N ~ 2 × 10^5 events entering L23.
•  These events have to go through L0 and L1 (because of the trigger correlation), so that N0 ~ 8 × 10^7 events should be tracked throughout the whole detector, digitised and filtered (the arithmetic is sketched after this list).
•  The production will be shared among CERN, Lyon, Liverpool, Rutherford and INFN.
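The numbers above follow from simple counting statistics; a sketch of the arithmetic, assuming Poisson fluctuations on the events surviving L23:

# Events surviving L23 in one channel for a ~10% statistical error:
# relative error ~ 1/sqrt(n), so n ~ 1/0.10**2 = 100.
target_error = 0.10
n_surviving = round(1 / target_error**2)              # ~100 events after L23

channel_suppression = 2000                            # ~200 (L23 total) x ~10 channels
n_entering_l23 = n_surviving * channel_suppression    # ~2e5 events entering L23

# The trigger correlation means these events must first pass L0 (x16) and L1 (x25).
n_to_simulate = n_entering_l23 * 16 * 25              # ~8e7 events to track and digitise
print(f"entering L23: {n_entering_l23:.0e}, to simulate: {n_to_simulate:.0e}")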
L0/L1 Trigger Correlation: MB Events Passing L0 Are More Likely to Also Pass L1
[Plot: L1 retention efficiency (log scale, with the 1/25 level marked) versus L1 probability cut, from 0 to 0.14]

L1 retention efficiency versus L1 probability cut. Red line: minimum bias events. Blue line: minimum bias events passing the L0 trigger.
Request for the Year 2002
•  Maintain the CPU capacity acquired during 2001 (assuming that the full complement of ~50 CPUs will be in operation).
•  Test a prototype computing configuration for analysis, to serve the Italian LHCb groups:
   -  Testing and developing tools to integrate the national centres into a distributed/integrated analysis centre.
   -  Adding a disk server (1 TB) best suited for chaotic access:
      -  4 dual-processor PCs with Fibre Channel interface + FC switch: 12.5 k€ + VAT
      -  1 rack: 2.5 k€ + VAT
      -  1 SAN disk server, 1 TB: 50 k€ + VAT

•  Total: 65 k€ + VAT = 80 k€

				