              Automated software packaging and installation
                        for the ATLAS experiment
              Simon George
   Royal Holloway, University of London
      Christian Arnault, LAL Orsay; Michael Gardner, RHUL; Roger Jones,
            University of Lancaster; Saul Youssef, Boston University



                       e-Science All Hands Meeting
                               Nottingham
                           2-4 September 2003
S.George@rhul.ac.uk                                     ATLASexperiment.org
                     Introduction
   This talk is about packaging, distribution and
    installation for a large software project
   It is essential because
       The project computing resources are widely distributed
        around 140 institutes, who all want to use the software
       We want to be able to use Grid resources that do not
        have locally managed installations of the software
       Our working model also requires the ability to deploy
        user code that is not part of an official distribution
   I’ll describe the process developed and the tools
    used.

Wed 03Sep03                Simon George RHUL                      2
                  Contents
 ATLAS and its software
 Requirements
 Tools and formats
 Meta data
 Naming conventions
 Creating and installing the kits
 Conclusions and outlook




          The ATLAS Experiment
       A Particle Physics experiment at
        the Large Hadron Collider, CERN
       1600 physicists, 140 institutes,
        6 continents
       Studies include
         • search for the origin of mass
         • excess of matter over antimatter in
           the universe
         • evidence for Supersymmetry
         • other new physics


              ATLAS software suite
   Simulation, data processing and analysis
   500 “packages”, 50 external, inter-dependent.
   100s of developers and 1000s of users in 140 institutes
   One release build is 2.5 GB of files
   It takes 10 hours to build
   Build types and frequencies
       Production release 3-4 times per year
       Developer release every 2-3 weeks
       Nightly build of snapshot
   Build configuration permutations
       Optimised, debug and sometimes also profile builds.
       Two platforms (RedHat 7.3 on Intel x86, Solaris 8 on SPARC)
       One or more compilers (gcc 3.2)
   Config. management, build and install handled by CMT
   So not a trivial task to package, distribute and install
                              CMT (www.cmtsite.org)
   Configuration management tool
       Concerned with setting up user’s environment to build
        and run software
       Needs help of tools for a large project
   CMT helps to define and impose conventions
       For naming packages, files, directories
       For describing their relationships
       In other words, package metadata
       This is the key feature exploited for this project.
   Useful features to manage sub projects,
    dependencies
   A broad user base, especially in Particle Physics
    and Astronomy experiments.

        Packaging Requirements
   Three types of kit required
       Binary kit
         • Pre-built executables, libraries and configuration files needed to
           run the software
         • Used for data challenges, production, basic users
       Developer’s kit
         • Binary kit plus
         • Headers, libraries and configuration needed to build against it
         • For developers and most users
       Full source kit
         • To rebuild from scratch on binary-incompatible platforms
         • When local source code browsing is required
   For each permutation of platform, config, compiler
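
To give a feel for how many kits this implies per package, the sketch below enumerates one kit per permutation of kit type, platform, build configuration and compiler. The platform and compiler values come from the earlier slide; the label spellings, the tag format and the function name are hypothetical, and it is an assumption that every combination is actually built.

```python
from itertools import product

# Values taken from the slides; the label spellings are illustrative only.
KIT_TYPES = ["binary", "developer", "source"]
PLATFORMS = ["redhat73-x86", "solaris8-sparc"]
CONFIGS = ["opt", "dbg"]
COMPILERS = ["gcc32"]


def kit_variants(kit_types=KIT_TYPES, platforms=PLATFORMS,
                 configs=CONFIGS, compilers=COMPILERS):
    """Return one hypothetical tag per (kit type, platform, config, compiler)."""
    return [
        f"{kit}-{platform}-{compiler}-{config}"
        for kit, platform, config, compiler in product(
            kit_types, platforms, configs, compilers)
    ]


if __name__ == "__main__":
    for tag in kit_variants():
        print(tag)
```

Adding a profile build or a second compiler multiplies the count again, which is one reason the kit creation has to be automated.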
        Installation requirements
   For large facilities: unattended, push button deployment
   For normal user: relocatable, no root access
   Automatic configuration
   Updates, multiple versions
   Avoid duplication and unnecessary downloads
   Possibility to take subset of software
   Self contained, apart from …
   Prerequisite software: modest list and automatic check
   Set up user’s environment (e.g. LD_LIBRARY_PATH)
   Reversible: uninstall
   Install and work disconnected from network,
    e.g. install onto a laptop from CDs
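
A relocatable installation means the user's environment has to be derived from whatever directory was chosen at install time. The sketch below shows one way this could be done; the function names are hypothetical and the bin and lib sub-directories under the install root are an assumption made for illustration.

```python
import os


def prepend_path(env, name, directory):
    """Return a copy of env with directory placed first in the path list `name`."""
    updated = dict(env)
    current = [p for p in updated.get(name, "").split(os.pathsep) if p]
    # Drop an existing entry so the directory is not listed twice.
    current = [p for p in current if p != directory]
    updated[name] = os.pathsep.join([directory] + current)
    return updated


def runtime_environment(install_root, env=None):
    """Add bin and lib under install_root to PATH and LD_LIBRARY_PATH.

    The bin/lib layout is assumed here, not taken from the real kits.
    """
    env = dict(os.environ if env is None else env)
    env = prepend_path(env, "PATH", os.path.join(install_root, "bin"))
    env = prepend_path(env, "LD_LIBRARY_PATH", os.path.join(install_root, "lib"))
    return env
```

Because nothing is written outside the returned dictionary, this kind of set-up needs no root access and is trivially reversible.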


                        Constraints
   ATLAS software is divided into sub-projects
       Currently ATLAS and Gaudi
       Could be more in the future, e.g. split ATLAS into
        simulation and reconstruction
       Each sub-project consists of many packages
   External/Internal package distinction
       Internal packages are developed and managed within
        the ATLAS software project
       External packages are the opposite, e.g. software from
        the Particle Physics community, public domain software
        or commercial products.
       Interface packages for externals
         • Pure metadata package
         • Actual external sw can be installed anywhere, any way.
         • Gives it the outside appearance of an internal package

              Constraints, continued
   Existing use of CMT
       Package structure already in place
       Meta data provided by packages or implied by default
        policies is already enough for automated packaging.
   Problems
       ATLAS software is written by large communities with a
        mixed level of experience
       All such software projects will have small flaws
        introduced in each release
       These must be worked around when they impact on the
        packaging.
       For example, one problem of particular relevance to
        packaging & installation is cyclic dependencies
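
To illustrate what a cyclic dependency looks like in the package metadata, here is a minimal sketch of cycle detection by depth-first search. It is not how CMT does it; the function name and the dictionary representation of the "uses" relation are assumptions for illustration.

```python
def find_cycle(uses):
    """Return one dependency cycle as a list of package names, or None.

    `uses` maps each package to the list of packages it depends on.
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    colour = {}
    stack = []

    def visit(pkg):
        colour[pkg] = GREY
        stack.append(pkg)
        for dep in uses.get(pkg, ()):
            state = colour.get(dep, WHITE)
            if state == GREY:
                # dep is on the current path, so the path from dep
                # back round to dep is a cycle.
                return stack[stack.index(dep):] + [dep]
            if state == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        stack.pop()
        colour[pkg] = BLACK
        return None

    for pkg in uses:
        if colour.get(pkg, WHITE) == WHITE:
            cycle = visit(pkg)
            if cycle:
                return cycle
    return None
```

A cycle matters for installation because an installer that follows dependencies naively would never finish resolving them.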

          Packaging: starting point
   One kit per package
       Follow existing granularity
   Separate metadata and payload
       Two parts to each kit
   Performed by librarian as integral part of
    release procedure
   Distribution by web or distributed filesystem
    (e.g. AFS)


                      Tools used
   CMT
       Define and impose conventions on packages
       Query the metadata needed for packaging
   Pacman
       Metadata format
       Tool used to manage kit installation
   Tar and RPM
       Payload format – the package itself
   “Deployment tools” shell scripts
       Construct the kits using CMT
       Control location of Pacman cache and distribution
       Post-installation configuration
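              Overview of process and tools
   [Diagram: the librarian uses CMT and the deployment tools to create kits,
    which are published on a web server or AFS. The local s/w manager uses
    Pacman to install the kits on local computers. The developer uses CMT to
    run the software.]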
                     Pacman
   http://physics.bu.edu/~youssef/pacman
   A package manager
   Packager defines how the software should be
    fetched, installed, configured, updated, in a
    “Pacman” file. The package itself can be in any
    format as that file is separate.
   A directory of these files is known as a cache,
    usually available on the web.
   Pacman tool is used to install the software
   Pacman’s feature list is a good match to the
    requirements for installation.
   Already used by several Particle Physics and
    GRID projects.
    Package distribution format
   Tar vs. RPM
   Both can be made relocatable
   Feature set
       Tar has a simple feature set but is complementary to CMT and
        Pacman
       RPM overlaps with CMT and Pacman
         • e.g. RPM also handles dependencies and prerequisites
   Platforms
       RPM is only widely used on Linux, while tar is standard on pretty
        much any Unix
   Annoyances
       Default RPM database needs root access to write to it
         • There are workarounds for this, but they are not pretty
   Conclusion
       Decided to use tar
       but retained RPM as an option
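
What makes a tar payload relocatable is simply that its member names are relative, so it can be unpacked under any directory. The sketch below shows this with Python's standard tarfile module; the function names are hypothetical and this is not the actual deployment script.

```python
import os
import tarfile


def make_relocatable_tar(source_dir, tar_path):
    """Pack source_dir into a gzip tar using paths relative to source_dir."""
    with tarfile.open(tar_path, "w:gz") as tar:
        for root, _dirs, files in os.walk(source_dir):
            for name in sorted(files):
                full = os.path.join(root, name)
                # Relative member names let the archive unpack under any prefix.
                tar.add(full, arcname=os.path.relpath(full, source_dir))
    return tar_path


def member_names(tar_path):
    """List the member names stored in the archive."""
    with tarfile.open(tar_path, "r:gz") as tar:
        return sorted(tar.getnames())
```

Checking that no member name is absolute is a cheap sanity test to run before a kit is published.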

                               Meta data
   For each package
       Other packages it uses (dependencies)
       Location of constituents
         •    Applications and libraries
         •    Header files
         •    Run time/config files
         •    CMT requirements file
   External packages
       Pure meta data “glue” packages
       Just define paths to export
   All defined in CMT requirements files
       or implied by default conventions of ATLAS
   Can be queried through cmt
       cmt show uses
       cmt show macro <package>_export_paths

               Naming and structure
   Package naming convention
       Packages in a sub-project
         • <package name>-<sub-project release id>
       External packages
         • <package name>-<version id>
       These names are used when expressing the inter-package
        dependencies
   Directory structure within each kit
       <sub-project>/<release-id>/InstallArea/
         • contains the sub-directories bin, lib, include, share.
       <sub-project>/<release-id>/<package>/<version>/cmt/
         • Contains the configuration management files
       <external-package>/
         • Assumed to have their own internal structure for versions & builds
   This is designed to support coexistence of:
       Different versions of every piece of software
       Different binary versions (platform and build config)
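
The conventions above can be written down as a few small helper functions. This is only a sketch: the function names are hypothetical, and the sub-project name used in the usage note is an assumed value.

```python
import posixpath


def internal_kit_name(package, release_id):
    """Kit name for a package in a sub-project: <package name>-<release id>."""
    return f"{package}-{release_id}"


def external_kit_name(package, version_id):
    """Kit name for an external package: <package name>-<version id>."""
    return f"{package}-{version_id}"


def install_area(sub_project, release_id):
    """Directory holding the bin, lib, include and share sub-directories."""
    return posixpath.join(sub_project, release_id, "InstallArea")


def cmt_dir(sub_project, release_id, package, version):
    """Directory holding a package's configuration management files."""
    return posixpath.join(sub_project, release_id, package, version, "cmt")
```

For example, install_area("atlas", "6.5.0") and install_area("atlas", "6.6.0") give different directories, which is what lets two releases coexist in one installation.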

                     Examples
CMT requirements file:
# Package name and author
package ExamplePkgA
author A. Person <ap@cern.ch>
# Inter-package dependencies
use ExamplePkgB
use ExampleExtPkg
# Instruction to build a library from source files
library ExamplePkgA *.cxx
# Type of library to build, implies library file names
apply_pattern component_library
# Default location implied
apply_pattern declare_runtime



Pacman file:
description='Package ExamplePkgA-01-07-02 in release 6.5.0'
url='http://atlas.web.cern.ch/Atlas/GROUPS/SOFTWARE/OO'
source='../dist'
download = { '*':'ExamplePkgA-6.5.0.tar.gz' }
depends = [ 'ExamplePkgB-6.5.0', 'ExampleExtPkg-v1' ]
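
Since the metadata is already available from CMT, a file like the one above can be generated rather than written by hand. The sketch below reproduces only the five fields shown in this example; the function name and its arguments are hypothetical, and it makes no claim about what else the Pacman format supports.

```python
def pacman_file(package, version, release, url, depends, source="../dist"):
    """Return the text of a Pacman file laid out like the example on this slide.

    depends is a list of kit names that already follow the naming convention.
    """
    tarball = f"{package}-{release}.tar.gz"
    lines = [
        f"description='Package {package}-{version} in release {release}'",
        f"url='{url}'",
        f"source='{source}'",
        f"download = {{ '*':'{tarball}' }}",
        "depends = [ " + ", ".join(f"'{d}'" for d in depends) + " ]",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(pacman_file(
        "ExamplePkgA", "01-07-02", "6.5.0",
        "http://atlas.web.cern.ch/Atlas/GROUPS/SOFTWARE/OO",
        ["ExamplePkgB-6.5.0", "ExampleExtPkg-v1"],
    ), end="")
```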


                     Creating the kits
   First, build a release
   Discover cycles in the dependencies
       Use a feature of CMT to discover cycles in the dependencies, as
        these must not be propagated to the kits. Record the output in a
        file.
   Then, use a feature of CMT to visit every package in a
    dependency tree and apply a command there
       cmt broadcast <command>
   Usage of the script to create a kit:
       create_kit.sh -release <release-id> -cycles <file>        [-rpm]
        <target distribution directory>
       Creates a pacman file and tar file, optional RPM file
   Finally, there are often a few things to fix by hand specific
    to each release.
   Note that CMT itself is included as a kit
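
The idea of visiting every package in a dependency tree and applying a command there can be sketched as a depth-first walk. This is an illustration only, not CMT's implementation: the function name, the dictionary form of the "uses" relation, the skip_edges argument and the dependencies-first visiting order are all assumptions.

```python
def broadcast(root, uses, action, skip_edges=frozenset()):
    """Apply action to root and everything it uses, dependencies first.

    uses maps each package to the list of packages it depends on.
    skip_edges holds (package, dependency) pairs to ignore, for example
    edges recorded earlier as closing a cycle.
    Returns the packages in the order they were processed.
    """
    visited = set()
    order = []

    def visit(pkg):
        if pkg in visited:
            return
        # Mark before recursing so a leftover cycle cannot loop forever.
        visited.add(pkg)
        for dep in uses.get(pkg, ()):
            if (pkg, dep) not in skip_edges:
                visit(dep)
        action(pkg)
        order.append(pkg)

    visit(root)
    return order
```

In the real procedure the action would be the kit creation step for one package; here it can be any callable, such as a function that prints the package name.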
                           Installation
   Performed by site software manager or end user on
    desktop or laptop
   Straightforward procedure:
       Install Pacman, if not already done
       Install prerequisite software
         • Currently just RedHat 7.3 o/s, gcc-3.2 and Java SDK 1.4.1
       Choose directory for the installation
         • Probably the same as before
       Choose which release to install
         • Available releases are listed on a web page
       Use Pacman to download, install and configure it, e.g.
        pacman -get ATLAS:AtlasRelease-6.5.0
         • Dependencies followed automatically to get everything you need
       Optionally, run script to set up a user environment and run a test
   User configures software in the usual way
       Just choose release and private working area as normal
       Run a setup script provided by CMT
                          Conclusions
   Procedures and tools have been developed for the
    packaging, distribution and installation of ATLAS software
   Based on Pacman, CMT, tar/rpm and some shell scripts
   The basic principles could be applied more generally
       Using some or all of the same tools
   It satisfies most of the requirements for run-time and
    developers’ kits and for installation.
       Full source kit still to be done.
   Early adopters have given useful feedback and it is now
    being imported into Grid production systems
   Must now move to its use as part of the standard release
    procedure in ATLAS
       by December 2003, for our global “Data Challenge 2”


              Future developments
   Better handling of prerequisite software and
    platform compatibility checks
       EDG WP4 configuration management task
   Potential to work with an installation on
    demand mechanism for GRID farms
   LCG/EDG/iVDGL GLUE
       Meta packaging proposal for Grid middleware
        and applications, O. Barring et al.
   Pacman version 3
