ATLAS Applications
Frontier physics on the Grid
T.Doherty1, S.George2, K.Harrison2, R.W.L.Jones3, D.Liko4, C.Nicholson1, A.Soroko6, G.Rybkine2,C.L.Tan7
1University of Glasgow, 2RHUL, 3University of Cambridge, 4Lancaster University,
5CERN, 6University of Oxford, 7University of Birmingham
ATLAS is a general-purpose particle physics experiment which
will study topics including the origin of mass, the processes
that allowed an excess of matter over antimatter in the
universe, and evidence for Supersymmetry and other new physics,
including even micro black hole production! The experiment
is being constructed by some 1600 scientists in ~150
institutes on six continents. It will be located at
the 27 km circumference Large Hadron Collider at CERN in
Geneva.
Despite highly efficient filters acting on the raw data
read by the detector, the 'good' events will still
correspond to several Petabytes of data per year, which
will require millions of SpecInt2k to process and analyse.
Even now, to design the detector and to understand the
physics, many millions of simulated events also have to
be produced.
A simulated Higgs boson decay in the ATLAS detector
Only a Grid can satisfy our requirements. ATLAS is a global collaboration with Grid testbeds already deployed worldwide.
While building on generic middleware, we are required to develop several components, which may be reusable. We are
also required to build tools that can run tasks in a coherent way across several Grid deployments.
These are being exercised and developed
in Data Challenges of increasing size and
complexity.
These have now been performed using
three Grid deployments at 85 sites on six
continents. They are a proof of principle
for Grid-based production. We are
merging these activities with a series of
Service Challenges to establish the
required system.
The GANGA project provides an interface between the
user, the Grid middleware and the experimental software
framework. It is being developed jointly with the LHCb
experiment and, as it uses component technologies, it will
allow reuse elsewhere.
[Figure: GANGA diagram; labels: GANGA, GUI, Grappa (?), JobOptions, Algorithms, Virtual Data, Histograms, Monitoring, Results, GRID Services, Athena/GAUDI Application]
The GANGA framework provides command-line and graphical
interfaces; job preparation templates and tools, including
tools to locate datasets; job splitting; and back-ends into
various systems including Fork, LSF, PBS, Condor, LCG, gLite
and PANDA.
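The combination of job splitting and interchangeable back-ends can be illustrated with a minimal sketch. This is not the GANGA API: the names `SubJob`, `split_by_files`, `BACKENDS` and `submit` are hypothetical, and the two back-end handlers only return a description of what would be submitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SubJob:
    """One piece of a split job: a name plus the input files it reads."""
    name: str
    input_files: List[str]


def split_by_files(job_name: str, files: List[str],
                   files_per_subjob: int) -> List[SubJob]:
    """Split the input files into sub-jobs of at most files_per_subjob files."""
    if files_per_subjob < 1:
        raise ValueError("files_per_subjob must be at least 1")
    subjobs: List[SubJob] = []
    for start in range(0, len(files), files_per_subjob):
        chunk = files[start:start + files_per_subjob]
        subjobs.append(SubJob(f"{job_name}.{len(subjobs)}", chunk))
    return subjobs


# Hypothetical back-end registry. A real back-end would call the batch
# system or Grid middleware; these handlers only describe the submission.
BACKENDS: Dict[str, Callable[[SubJob], str]] = {
    "Fork": lambda sj: f"run {sj.name} locally on {len(sj.input_files)} file(s)",
    "LSF": lambda sj: f"queue {sj.name} on LSF with {len(sj.input_files)} file(s)",
}


def submit(subjobs: List[SubJob], backend: str) -> List[str]:
    """Send every sub-job to the chosen back-end and collect the responses."""
    if backend not in BACKENDS:
        raise KeyError(f"unknown back-end: {backend}")
    handler = BACKENDS[backend]
    return [handler(sj) for sj in subjobs]
```

Because the splitting step knows nothing about the back-end, the same sub-jobs could be sent to a local process for testing and to a Grid back-end for production by changing only the `backend` argument.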
The ATLAS Distributed Analysis system supports distributed users,
data and processing. This includes Monte Carlo production, data
reconstruction and the extraction of summary data. A prototype
system with a GANGA user interface is being rolled out.
The large number of Grid sites requires automated and scalable installation tools. Coherent RPMs and tar files are
created from CMT, exposing the package dependencies as PACMAN cache files. PACMAN can pull or push complete
installations to remote sites. Scripts have been developed to make the process semi-automatic. It is important to
note that these tools not only install code but also establish the working environment.
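Exposing package dependencies makes it possible to derive an installation order automatically. The sketch below shows the idea with a topological sort from the Python standard library; it does not use CMT or PACMAN, and the package names and the `install_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Dict, List, Set


def install_order(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Return the packages ordered so that dependencies come first.

    `dependencies` maps each package to the set of packages it needs.
    graphlib raises CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical packages: Analysis needs Reconstruction and Core,
# Reconstruction needs Core, Core needs nothing.
example = {
    "Analysis": {"Reconstruction", "Core"},
    "Reconstruction": {"Core"},
    "Core": set(),
}

if __name__ == "__main__":
    for package in install_order(example):
        print("install", package)
```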
ATLAS UK integrates the EGEE/LCG middleware. The
most recent data challenge ran over 570,000 production
jobs at 84 sites using Grid tools. Even analysis can now
be run this way, and we are now in urgent need of
monitoring and accounting tools to manage production
and individual users on the same Grid.
[Pie chart: Number of Jobs by Grid deployment: LCG 34%, LCG-CG 31%, Grid3 24%, NorduGrid 11%]
Job submission rates are being improved to avoid
bottlenecks.
The production system has been redesigned to
increase performance.
A new data handling system has been written and is
being rolled out across the sites. This will be essential
for handling the expected real data volumes.
An essential element in the Distributed Analysis system is access to the metadata describing the events. GridPP is
providing effort in the metadata handling for the ATLAS data in two ways. First, it is working on the production
metadata tool, AMI, and provides the plug-in that allows GANGA to browse AMI. Secondly, it is working on the
TAG data. This contains a record for each event in a sample, with simple descriptive keys that allow physics and
trigger signatures to be selected, and also contains information on the file in which the event occurs and a
pointer to the event within the file. Preliminary studies have shown that this can allow fast selection of and
access to many datasets that are relatively sparsely distributed within the data files.
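The TAG idea (a small record per event, holding selection keys plus a file name and a pointer into that file) can be sketched as follows. The field names `n_muons`, `missing_et` and `trigger_passed` are hypothetical examples of descriptive keys, not the actual ATLAS TAG schema, and `select_events` is an illustrative helper.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class TagRecord:
    """One TAG entry: selection keys plus the location of the full event."""
    n_muons: int          # hypothetical physics key
    missing_et: float     # hypothetical physics key, in GeV
    trigger_passed: bool  # hypothetical trigger key
    file_name: str        # file in which the event occurs
    event_index: int      # pointer to the event within that file


def select_events(tags: Iterable[TagRecord],
                  predicate: Callable[[TagRecord], bool]) -> Dict[str, List[int]]:
    """Return, per file, the sorted indices of events passing the predicate.

    Only the small TAG records are scanned; the event files themselves
    are not opened at this stage.
    """
    by_file: Dict[str, List[int]] = defaultdict(list)
    for tag in tags:
        if predicate(tag):
            by_file[tag.file_name].append(tag.event_index)
    return {name: sorted(indices) for name, indices in by_file.items()}
```

A selection such as `select_events(tags, lambda t: t.trigger_passed and t.n_muons >= 2)` yields a map from file name to event pointers, so a later job would read only the files and events that matter, which is where sparsely distributed samples stand to gain.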