Initializing a National Grid Infrastructure:
Lessons Learned from the Swiss National Grid Association Seed Project

Seed Working Group

Swiss National Grid Association (SwiNG)

seed-wg@swing-grid.ch







May 20, 2008 CCGrid 2008, Lyon, France 1

Members of the Seed Working Group

Nabil Abdennadher, Haute École Spécialisée de Suisse Occidentale (HES-SO)

Peter Engel, University of Bern (UniBE)

Derek Feichtinger, Paul Scherrer Institute (PSI)

Dean Flanders, Friedrich Miescher Institute (FMI)

Placi Flury, SWITCH

Sigve Haug, University of Bern (UniBE)

Pascal Jermini, École Polytechnique Fédérale de Lausanne (EPFL)

Sergio Maffioletti, Swiss National Supercomputing Centre (CSCS)

Cesare Pautasso, University of Lugano (USI)

Heinz Stockinger, Swiss Institute of Bioinformatics (SIB)

Wibke Sudholt, University of Zurich (UZH) – Chair

Michela Thiemard, École Polytechnique Fédérale de Lausanne (EPFL)

Nadya Williams, University of Zurich (UZH)

Christoph Witzig, SWITCH




Outline



Background

• Grid projects

• SwiNG





Seed Project

• Introduction

• Middleware

• Applications





Conclusions and outlook




Grid Projects and Infrastructures

International Grid projects

• EGEE (Enabling Grids for E-sciencE): 91 partners, PRAGMA (Pacific Rim Applications and Grid Middleware Assembly): 29 partners, etc.



National Grid projects

• Open Science Grid (USA), ChinaGrid, NAREGI (Japan), e-Science Programme (UK), D-Grid (Germany), Austrian Grid, etc.



Domain-specific Grid projects

• LCG, Chemomentum, GRIDCHEM, Swiss Bio Grid, EMBRACE, DEGREE, etc.



Local Grid projects

• XtremWeb-CH, JOpera, etc.



Homogeneous Grid middleware

• gLite, UNICORE, Globus, ARC, etc.




Situation in Europe

Funding for Grid projects by the EU

• Within FP5 / FP6 / FP7

• Collaboration projects





National Grid Initiatives (NGIs)

• In most European countries

• Some with considerable funding





European Grid Initiative (EGI)

• http://web.eu-egi.eu/

• Design study under way

• Following the model of the National Research and Education Networks (NRENs)






National Grid Initiative (NGI)

“Coordinating body” for Grid activities within a nation





Must
• Have a mandate to represent researchers and institutions in Grid-related matters towards
— International bodies (e.g., EU)
— Funding agencies
— Federal government (SBF, BBT)
• Have only one NGI per country

May
• Involve only coordination
• Develop and operate national Grid infrastructure(s)
• Be a legal entity on its own
• Be limited to academic or research institutions
• Also involve participation by the industry










Grid in Switzerland before SwiNG

Various, somewhat isolated efforts in the Swiss higher education sector
• Some projects within individual research groups
• Some projects between a limited number of Swiss partners
• Participation in EU-sponsored projects by some institutions
• Participation in international projects by some institutions

No national coordination
No dedicated funding
No homogeneous Grid middleware or infrastructure






Swiss National Grid Association (SwiNG)

Mission

• Ensure competitiveness of Swiss science, education and industry by creating value through resource sharing.
• Establish and coordinate a sustainable Swiss Grid infrastructure, which is a dynamic network of resources across different locations and administrative domains.
• Provide a platform for interdisciplinary collaboration to leverage the Swiss Grid activities, supporting end-users, researchers, industry, education centres, resource providers.
• Represent the interests of the national Grid community towards other national and international bodies.



History

• Initialized in September 2006

• Founded as an association in May 2007

• Operational since January 2008






Organisational Structure










Institutional Members

ETH domain
• École Polytechnique Fédérale de Lausanne (EPFL)
• Eidgenössische Technische Hochschule Zürich (ETHZ)
• ETH Research Institutions (EAWAG, EMPA, PSI, WSL)
• Swiss National Supercomputing Centre (CSCS)

Cantonal universities
• Universität Basel (UniBas)
• Universität Bern (UniBE)
• Université de Genève (UniGE)
• Université de Neuchâtel (UniNE)
• Université de Lausanne (UNIL)
• Università della Svizzera Italiana (USI)
• Universität Zürich (UZH)

Universities of Applied Sciences
• Berner Fachhochschule (BFH)
• Fachhochschule Nordwestschweiz (FHNW)
• Haute École Spécialisée de Suisse Occidentale (HES-SO)
• Hochschule Luzern (HSLU)
• Scuola Universitaria Professionale della Svizzera Italiana (SUPSI)

Specialized institutions
• Friedrich Miescher Institute (FMI)
• Swiss Institute of Bioinformatics (SIB)
• Swiss Academic and Research Network (SWITCH)




Working Groups

Initial WGs
• Mandate Letter
• Seed Project
— Founded in November 2006
— Finished in November 2007

WGs in planning
• AAA/SWITCH projects
• Grid Workflows
• Industry Relations





Currently active WGs

• ATLAS: High energy physics

• Proteomics: Bioinformatics

• Infrastructure & Basic Grid Services

— Grid Architecture Team (GAT)

— Grid Operations Team (GOT)

— Data Management Team (DMT)

• Education & Training




Seed Project Working Group Goals



1. Identify which resources (people, hardware, middleware, applications, ideas) are readily available and represent strong interest among the current SwiNG partners.

2. Based on available resources, propose one or more Seed Projects that will help to initialize, test, and demonstrate the SwiNG collaboration. The Seed Project should be realizable in a fast, easy and inexpensive manner (“low hanging fruit”).

3. Help with the coordination and realization of the defined Seed Project.


Seed Project Survey

Informal inventory of resources available for the Seed Project
• Member groups
• Available personnel
• Computer hardware
• Lower-level grid middleware
• Higher-level grid middleware
• Scientific application software and data
• Seed project ideas
→ 12 answers in December 2006

Results
• There is a lot of interest and expertise.
• There is enough hardware available, but no direct funding for people.
• Middleware and applications are diverse, but some are more common.
• Main interest is in specific tools and Grid interoperability.
→ Build a cross-product/matrix infrastructure of selected Grid middleware and applications by gridifying each application on each middleware pool in a non-intrusive manner
→ Avoid “chicken-and-egg” dilemma in bootstrapping a Grid infrastructure by using known tools and addressing early adopters


Selection Process

Middleware
• Criteria
— Already deployed at partner sites
— Sufficient expertise and manpower
— Supported within existing larger Grid efforts
— Not too complex requirements
— Must be diverse and provide sufficient set of capabilities
• Initial focus
— EGEE gLite (deployed at CSCS, PSI, SIB, SWITCH, UniBas)
— NorduGrid ARC (deployed at CSCS, SIB, UniBas, UniBE, UZH)
— XtremWeb-CH (developed and deployed at HES-SO)
— Condor (deployed at EPFL)

Applications
• Criteria
— Need from the Swiss scientific user community
— Computational demand warrants Grid execution
— Sufficient expertise and manpower
— Not too complex requirements
— Simple gridification, without changing the source code if possible
— Should be diverse and cover sufficient set of requirements
— Reuse of existing Grid-enabled applications
• Initial focus
— Cones (mathematical crystallography, individual code)
— GAMESS (quantum chemistry, standard free open source code)
— Huygens (remote deconvolution for imaging, standard commercial code)
— PHYLIP (bioinformatics, standard free open source code)
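
The cross-product idea behind this selection can be sketched as a small status matrix with one cell per application and middleware pairing. This is only an illustrative sketch: the function names `build_matrix` and `render` and the status strings are assumptions, not part of the Seed Project tooling. The single status set below reflects the Cones production use on the ARC pool reported later in the talk.

```python
from itertools import product

# Initial-focus middleware pools and applications named on the slide.
MIDDLEWARE = ["EGEE gLite", "NorduGrid ARC", "XtremWeb-CH", "Condor"]
APPLICATIONS = ["Cones", "GAMESS", "Huygens", "PHYLIP"]


def build_matrix(middleware, applications, default="not started"):
    """Return {(application, middleware): status} for every pairing."""
    return {(app, mw): default for app, mw in product(applications, middleware)}


def render(matrix, middleware, applications):
    """Render the matrix as plain text, one application per row."""
    lines = []
    for app in applications:
        cells = ", ".join(f"{mw}: {matrix[(app, mw)]}" for mw in middleware)
        lines.append(f"{app} -> {cells}")
    return "\n".join(lines)


matrix = build_matrix(MIDDLEWARE, APPLICATIONS)
# Status string is illustrative; the pairing itself is reported in the talk.
matrix[("Cones", "NorduGrid ARC")] = "in production"
print(render(matrix, MIDDLEWARE, APPLICATIONS))
```

With four middleware pools and four applications the matrix has 16 cells, which shows how quickly the effort grows when every application is to be gridified on every pool.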


Seed Project Definition

Scientific application software
• Cones, GAMESS, Huygens, PHYLIP, Mascot, Swiss BioGrid, Physics, …
• A subset is marked as first focus

Meta-middleware and grid interoperability
• Requirements, Standards, Security, Management, Implementation, Testing, Production, …
• A subset is marked as first focus

Lower-level middleware systems
• gLite (Globus 2), ARC, XtremWeb-CH, Condor, United Devices, UNICORE, Globus 4 (WSRF), …
• A subset is marked as first focus


Grid Security: SWITCHslcs

SWITCH Short Lived Credential Service (SLCS)
• http://www.switch.ch/grid/slcs/
• Ad-hoc generated X.509 certificates
• Based on SWITCHaai (Authentication and Authorization Infrastructure)
• EUGridPMA accredited
• Valid 1’000’000 seconds (ca. 11 days)
• Java-based client software

Advantages
• The user does not have to keep track of where he/she copied his/her certificates between hosts.
• He/she only needs to use his/her SWITCHaai federation account to obtain a certificate, thus he/she has to maintain one credential less.
• He/she does not have to take care of the expiration or renewal of the certificates. He/she simply requests a new one.
• Identity management becomes simpler since the central Certification Authority (CA) is not required to keep a separate master user database.
• As the SLCS is accredited by the International Grid Trust Federation (IGTF), the certificate is recognized by all Grid resources where the IGTF certificate bundle is installed.

Achievements in Seed Project
• Testing on gLite, ARC and Condor pools
• MyProxy server for automatic renewal of expiring proxy certificates to bridge long-running jobs
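
The stated validity of 1’000’000 seconds can be checked with a short calculation. The sketch below also estimates how many credential renewals a long-running job would need. It is a simplification under stated assumptions: the helper names are hypothetical, and only the certificate lifetime is considered, not the lifetime of any proxy certificates derived from it.

```python
from datetime import timedelta

SLCS_LIFETIME = timedelta(seconds=1_000_000)  # validity stated on the slide


def lifetime_in_days(lifetime=SLCS_LIFETIME):
    """Convert the certificate lifetime to fractional days."""
    return lifetime.total_seconds() / 86_400


def renewals_needed(job_duration, lifetime=SLCS_LIFETIME):
    """Number of renewals a job of the given duration would need."""
    if job_duration <= lifetime:
        return 0
    job_s = int(job_duration.total_seconds())
    life_s = int(lifetime.total_seconds())
    # Ceiling division gives the credentials used; subtract the initial one.
    return -(-job_s // life_s) - 1


print(f"{lifetime_in_days():.2f} days")      # 11.57 days
print(renewals_needed(timedelta(days=7)))    # 0
print(renewals_needed(timedelta(days=30)))   # 2
```

A hypothetical 30-day job would outlive the first credential twice, which is the kind of gap the automatic renewal is meant to bridge.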




Middleware: EGEE gLite

Middleware of the world’s largest Grid infrastructure
• http://www.glite.org/
• Grid middleware developed and deployed in the EGEE project, installed in most European countries, and used for CERN’s LCG project
• Based on Globus, adding VO support and extending data management functionalities
• Computing Elements (CEs) interfacing worker nodes to LRMS, Storage Elements (SEs) providing standardized data access and transfer services, information system and resource management, User Interface (UI) client
• Offers many Data Grid components (e.g., file catalogues, storage management)
• Security based on GSI with VOMS (VO Membership Service) support, including hierarchical subgroups
• Several different flavours, running on Scientific Linux

Deployment status in Switzerland
• In production in LCG and DILIGENT projects since 2002
• SWITCH (Zurich): Resource Broker, VOMS, CE/UI locally behind firewall
• CSCS (Manno): CE, SE, UI
• SIB (Lausanne): UI
• UZH (Zurich): UI

Achievements in Seed Project
• Small test-bed with all essential services, but no big computer resources
• Creation of new Virtual Organization (VO)
• Working with SWITCHslcs
• Testing with Cones application
• Access to EGEE resources possible

Disadvantages
• Complex due to rich functionality, many vendors, and partly competing implementations
• Installing and running the UI is straightforward, but large efforts and manpower required for installing and running service components
• Comparatively intrusive on resources (e.g., requires Scientific Linux, workers on compute nodes)




Middleware: NorduGrid ARC

Advanced Resource Connector
• http://www.nordugrid.org/middleware/
• Grid middleware developed and deployed in the NorduGrid project of the Nordic countries
• Enables production-quality grids, including information services, resource, job and data management
• Uses replacements and extensions of Globus pre-WS services (e.g., GridFTP)
• Cluster-of-clusters model, Computing and Storage Elements (CEs and SEs), application Runtime Environments (REs)
• Security based on GSI with VOMS support
• Open source under GPL license, supports up to 22 different Unix distributions

Deployment status in Switzerland
• Originally deployed as part of the LHC and Swiss Bio Grid projects
• CSCS (Manno): GIIS, CE, SE
• UZH (Zurich): CE (10 node cluster)
• EPFL (Lausanne): CE (Condor pool)
• SIB (Lausanne): CE

Achievements in Seed Project
• Configuration of resources at CSCS and UZH, interfacing of Condor pool at EPFL
• Working with SWITCHslcs
• Deployment and testing of Cones and GAMESS applications, running of Cones in production by scientific user
→ Most successful middleware pool, non-intrusive solution

Disadvantages
• Only limited support for complex data management (e.g., no notion of data proximity)
• Compute nodes usually expected to have shared file system
• Information service limited and not very scalable
• Coordination among sites necessary for stable configuration, application REs, and error tracking




Middleware: XtremWeb-CH

High-performance Desktop Grid / volunteer computing / P2P middleware
• http://www.xtremwebch.net/
• Developed by Nabil Abdennadher et al. at HES-SO
• For deployment and execution on public, non-dedicated platforms via user participation
• Symmetric model of providers and consumers
• Supports direct communication of jobs between compute nodes, also across firewalls
• Can fix the granularity of the application according to the state of the platform

Functionalities
• Four modules: Coordinator, worker, warehouse, and broker
• Volatility of workers
• Automatic execution of parallel and distributed applications
• Direct communication between workers, pull model
• Load balancing

Deployment status in Switzerland
• Ca. 200 workers (mainly Windows, few Linux platforms)
• Sites: EIG (Geneva), HEIG-VD (Yverdon)

Achievements in Seed Project
• Test installation at UZH
• PHYLIP application deployed previously
• Integration of GAMESS application

Disadvantages
• Security limited and based on central user database, not compatible with GSI, VOs, and SWITCHslcs
• Porting of applications needs some effort
• No special data management features






Middleware: Condor

High-throughput computing environment
• http://www.cs.wisc.edu/condor/
• Provides infrastructure for volatile Desktop Grid resources, also cross-institutional
• Several authentication and authorisation mechanisms (e.g., GSI, Kerberos)
• Job queue and resource management for fair and optimized assignment and sharing
• Shared file system or input/output file transfer to/from the compute nodes
• Multi-platform (Linux, Windows, Mac OS X, some other Unix variants), open source
• Can be interfaced with other middleware (e.g., UNICORE, Globus, ARC) as LRMS

Deployment status in Switzerland
• Existing production pool at EPFL: http://greedy.epfl.ch/
• Ca. 200 desktop CPUs (60% Windows, 40% Linux or Mac OS X machines), behind firewall
• Computing power available only during nights and weekends, machine owner has priority
• One submit server and one central manager
• No access to compute nodes for third-party software installation (Condor installed by node owners, not Grid managers), thus built-in file transfer protocol required to transport application binaries along with input data
• Due to desktop nature relatively short jobs advised (6 h max., not enforced)

Achievements in Seed Project
• Interfacing to ARC pool
• Working with SWITCHslcs
• Testing of Cones and GAMESS applications


Middleware Interoperability

Despite existing OGF standards such as JSDL (Job Submission Description Language), most middleware systems have their own mechanisms for resource and data management, information representation, or job submission.

Solutions for interoperability
1. Meta-middleware: Complex due to different interfaces and missing standards, thus out of scope for the Seed Project
2. One-to-one wrappers: Some middleware as entry point and bridge, transforming one format to another





Achievements in Seed Project

• Integration of Condor pool in ARC pool based on existing wrapper

• ARC installed on gateway machine

• Modified to allow transparent appending of required binaries
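
A one-to-one wrapper of this kind can be sketched as a function that maps job attributes from one middleware's description onto another's. The sketch below is not the wrapper used in the Seed Project. It assumes the ARC-side job has already been parsed into a dictionary with xRSL-style attribute names, and the function name `xrsl_to_condor`, the `extra_binaries` parameter, and the file names are hypothetical. The emitted lines use standard Condor submit description commands.

```python
def xrsl_to_condor(attrs, extra_binaries=()):
    """Map parsed xRSL-style attributes onto a Condor submit description."""
    # inputFiles is assumed to be a list of (name, source) pairs.
    transfers = [name for name, _source in attrs.get("inputFiles", [])]
    # Append binaries the desktop nodes cannot be expected to have installed.
    transfers += [b for b in extra_binaries if b not in transfers]
    lines = [
        "universe = vanilla",
        "executable = " + attrs["executable"],
    ]
    if attrs.get("arguments"):
        lines.append("arguments = " + " ".join(attrs["arguments"]))
    if "stdout" in attrs:
        lines.append("output = " + attrs["stdout"])
    lines += [
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
    ]
    if transfers:
        lines.append("transfer_input_files = " + ", ".join(transfers))
    lines.append("queue")
    return "\n".join(lines)


# Hypothetical job; file names are placeholders for illustration.
job = {
    "executable": "cones",
    "arguments": ["input.txt"],
    "inputFiles": [("input.txt", "")],
    "stdout": "cones.out",
}
submit_text = xrsl_to_condor(job, extra_binaries=["libhelper.so"])
print(submit_text)
```

Appending the extra binaries inside the wrapper keeps the step invisible to the user, which is one plausible reading of the transparent appending mentioned above.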



Application: Cones

Mathematical crystallography program
• For given representative quadratic form, calculates its subcone of equivalent combinatorial types of parallelohedra
• For dimension d = 6, number expected to be greater than 200’000’000 (currently 161’299’100)

Code properties
• Single-threaded C program
• Developed by Peter Engel, UniBE
• Several text input files, one execution command, several text output files

Possibilities for Grid distribution
• Running of several jobs off the same input file
• Cutting of input file into pieces

Achievements in Seed Project
• Refactoring of source code
• Creation of configure and make files
• Testing on gLite, ARC and Condor pools
• Running in production on ARC pool with first scientific user
• Ca. 50’000 new combinatorial types of primitive parallelohedra identified
• Still a lot of room for improvement regarding ease and efficiency of use
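
Cutting an input file into pieces can be sketched as splitting its records into contiguous, nearly equal chunks, one per Grid job. The sketch assumes that the input consists of independent records, which the slides do not specify for Cones. The function name `split_records` and the sample records are hypothetical.

```python
def split_records(records, n_pieces):
    """Split a list of records into at most n_pieces contiguous chunks."""
    if n_pieces < 1:
        raise ValueError("n_pieces must be at least 1")
    base, extra = divmod(len(records), n_pieces)
    chunks, start = [], 0
    for i in range(n_pieces):
        # The first `extra` chunks take one additional record.
        size = base + (1 if i < extra else 0)
        if size == 0:
            break
        chunks.append(records[start:start + size])
        start += size
    return chunks


records = [f"form-{i}" for i in range(10)]  # hypothetical input records
pieces = split_records(records, 4)
print([len(p) for p in pieces])  # [3, 3, 2, 2]
```

Each chunk would then be written to its own input file and submitted as a separate job, with the outputs merged afterwards.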


Application: GAMESS

General Atomic and Molecular Electronic Structure System
• http://www.msg.chem.iastate.edu/gamess/
• Program package for ab initio molecular quantum chemistry
• Computing of molecular systems and reactions in gas phase and solution (properties, energies, structures, spectra, etc.)
• Wide range of methods for approximate solutions of the Schrödinger equation from quantum mechanics
• Standard free open source code developed and used by many groups (e.g., at UZH)

Code properties
• Mainly Fortran 77 and C code and shell scripts
• Available for large variety of hardware architectures and operating systems
• Usually one keyword-driven text input file, one execution command, several text output files
• Well parallelized by its own implementation, called Distributed Data Interface (DDI)
• Comes with more than 40 functional test cases

Possibilities for Grid distribution
• External: Embarrassingly parallel parameter scans in input file
• Internal: Component distribution based on current DDI parallelization implementation

Achievements in Seed Project
• Deployment and testing on ARC and Condor pools at CSCS, EPFL, and UZH
• Integration into XtremWeb-CH
• Simple corannulene DFT functional scan test case provided by Laura Zoppi, UZH
• 223 small molecule MP2 calculations test case provided by Kim Baldridge, UZH
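
An external, embarrassingly parallel parameter scan can be sketched as generating one input file per parameter value from a shared template, so that each file becomes an independent Grid job. The template below is only an illustrative fragment and not a complete or validated GAMESS input. The placeholder convention `@FUNCTIONAL@`, the function name `scan_inputs`, and the list of functionals are assumptions.

```python
# Illustrative fragment only; a real input needs further groups and data.
TEMPLATE = " $CONTRL RUNTYP=ENERGY DFTTYP=@FUNCTIONAL@ $END\n"


def scan_inputs(template, placeholder, values, stem="scan"):
    """Return {file name: input text}, one entry per scanned value."""
    if placeholder not in template:
        raise ValueError("placeholder not found in template")
    return {
        f"{stem}_{value.lower()}.inp": template.replace(placeholder, value)
        for value in values
    }


inputs = scan_inputs(TEMPLATE, "@FUNCTIONAL@", ["B3LYP", "BLYP", "PBE"])
for name in sorted(inputs):
    print(name, "->", inputs[name].strip())
```

Because the generated inputs share nothing at run time, they can be spread over any of the pools without coordination between jobs.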




Application: PHYLIP

PHYLogeny Inference Package
• http://evolution.genetics.washington.edu/phylip.html
• Used to generate “life trees” (evolutionary trees, inferring phylogenies)
• Most widely distributed phylogeny package, in development since the 1980s, 15’000 users
• Tree composed of several branches, subbranches, and leaves (sequences), which are complex and CPU-intensive to construct, compare, and select

Code properties
• Package of ca. 34 program modules
• C source code and executables (Windows, Mac OS X, Linux) available
• Input data read from text files, data processed, output data written onto text files
• Data types: DNA sequences, protein sequences, etc.
• Methods available: Parsimony, distance matrix, likelihood methods, bootstrapping, and consensus trees

Possibilities for Grid distribution
• Workflows constructed by users
• Distribution following Single Program Multiple Data (SPMD) model

Achievements in Seed Project
• Integration and deployment on XtremWeb-CH as independent project (Seqboot, Dnadist, Fitch-Margoliash, Neighbor-Joining and Consensus modules)
• Parallel version of Fitch module
• Execution of HIV sequences-related test case on XtremWeb-CH pool
• Web service for dynamic configuration of “Life tree” application platform and parameters
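
The SPMD model can be sketched as one function applied to many data partitions in parallel. In the toy example below, local threads stand in for Grid workers, and a simple Hamming distance stands in for a real distance computation. It does not reproduce the algorithm of PHYLIP's Dnadist module, and all names and sequences are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def hamming(a, b):
    """Count differing positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b))


def distance_row(task):
    """The single program: compute one row of the distance matrix."""
    index, sequences = task
    return [hamming(sequences[index], other) for other in sequences]


def distance_matrix(sequences, workers=4):
    """Apply distance_row to every row index; map preserves row order."""
    tasks = [(i, sequences) for i in range(len(sequences))]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(distance_row, tasks))


seqs = ["ACGT", "ACGA", "TCGA"]  # toy aligned sequences, not real data
print(distance_matrix(seqs))  # [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
```

Each row is independent of the others, so on a Grid the same program could run on different workers with only the row index changing.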




Overview of Achievements

Pool established
• EGEE gLite: CSCS, SIB, and SWITCH, VO created, UI at UZH
• NorduGrid ARC: CSCS, SIB, and UZH, from Swiss Bio Grid
• XtremWeb-CH: HES-SO and UZH
• Condor: EPFL, coupling with ARC

SLCS security
• EGEE gLite: Tested at CSCS, SWITCH, and SIB
• NorduGrid ARC: Tested at UniBE and UZH
• XtremWeb-CH: Needs changes in or interface to middleware
• Condor: Tested at EPFL, via ARC

Cones application
• EGEE gLite: Tested at SIB
• NorduGrid ARC: Scientific usage from UniBE
• XtremWeb-CH: Not started yet
• Condor: Tested at EPFL

GAMESS application
• EGEE gLite: Not started yet
• NorduGrid ARC: Test usage from UZH
• XtremWeb-CH: Work in progress at HES-SO
• Condor: Tested at EPFL

Huygens application
• EGEE gLite: No personnel or license
• NorduGrid ARC: No personnel or license
• XtremWeb-CH: No personnel or license
• Condor: No personnel or license

PHYLIP application
• EGEE gLite: Not started yet
• NorduGrid ARC: Not started yet
• XtremWeb-CH: Preexisting at HES-SO
• Condor: Work in progress at EPFL




Lessons Learned

Project

• Approach based on heterogeneous set of Grid middleware and applications turned out to be useful to initialize technical collaboration.
• There is considerable interest and expertise in Grid resource and knowledge sharing in Switzerland, which previously has been directed mostly towards external projects.
• Dedicated partners, clear responsibilities, continuous communication, detailed documentation, and active project management are required.
• Funding, in particular for people, has to be properly secured for production setup.

Middleware
• Middleware is still demanding to install, maintain, and use, mainly due to its complexity and insufficient documentation.
• Middleware architectures and interfaces differ considerably, and require efforts in interoperability.
• There is strong need for simplification and standardization, both technically and regarding procedures (e.g., for policy, development, deployment, and execution mechanisms).

Applications
• Applications are diverse and can be put onto the Grid in different ways, and therefore need direct cooperation among scientific developers and Grid experts, that is, interdisciplinary work.
• Scientists are mainly interested in the implementation of new methods, thus standards for software development and packaging are often ignored, leading to poor installation, documentation, and sometimes performance.
• To run applications on the Grid, applications need to provide well documented and packaged distributions including standardized installation, configuration, testing and use procedures.




Outlook

Future topics

• Setup of a production infrastructure

• Inclusion of additional resources

• Extension to new applications

• Evaluation of further middleware

• Expansion to data management

• Connection to other Grid infrastructures



Continuation of work in Switzerland

• SwiNG Working Groups

• SwiNG-related, funded projects

• Securing of dedicated funding for SwiNG



General focus

• Standardisation and interoperation of middleware

• Professionalisation and standardisation of applications






Thank you!



Questions?







