EGEE and the future of Grid Infrastructures

Enabling Grids for E-sciencE
EGEE and the future of Grid Infrastructures
International Symposium on Grid Computing 2007
Academia Sinica, Taipei, 26-29 March 2007
Bob Jones, EGEE-II Project Director, CERN
www.eu-egee.org
INFSO-RI-508833

eScience
• Science is becoming increasingly digital and has to deal with growing amounts of data and growing computational needs
• Simulations get ever more detailed
  – Nanotechnology: design of new materials from the molecular scale
  – Modelling and predicting complex systems (weather forecasting, river floods, earthquakes)
  – Decoding the human genome
• Experimental science uses ever more sophisticated sensors to make precise measurements
  – Need high statistics
  – Huge amounts of data
  – Serves user communities around the world

High Energy Physics
Large Hadron Collider (LHC):
• One of the most powerful instruments ever built to investigate matter
• 40 million particle collisions per second
• 4 experiments: ALICE, ATLAS, CMS, LHCb
• ~15 petabytes/year from the 4 experiments
• First beams in 2007
[Aerial photo labels: Mont Blanc (4810 m), Downtown Geneva]
HEP track today

In silico drug discovery
• Diseases such as HIV/AIDS, SARS, bird flu etc. are a threat to public health due to worldwide exchanges and circulation of persons
• Grids open new perspectives for in silico drug discovery
  – Reduced cost and an accelerating factor in the search for new drugs
International collaboration is required for:
• Early detection
• Epidemiological watch
• Prevention
• Search for new drugs
• Search for vaccines
[Image caption: avian influenza, bird casualties]
Presentation by Ying-Ta WU & Hurng-Chun LEE in the life sciences track on Wednesday

WISDOM
http://wisdom.healthgrid.org/
Mini workshop on Thursday

Medical image processing: analysing tumours
• Pharmacokinetics: contrast agent diffusion study
  – Co-registration of a time series of volumetric medical images to analyse the evolution of the diffusion of contrast agents
• Computational costs
  – 20 patients, sequential: 2623 hours (co-registration + parametric image)
  – Using a 20-processor computing farm (HPC): 146 hours
  – Using the Grid: <20 hours, if you have enough resources (20 x 12 = 240 computers; EGEE has >30,000)

Example: Determining earthquake mechanisms
• Seismic software application determines epicentre, magnitude and mechanism
• Analysis of the Indonesian earthquake (28 March 2005)
  – Seismic data available within 12 hours after the earthquake
  – Analysis performed within 30 hours after the earthquake occurred
  – Results
    * Not an aftershock of the December 2004 earthquake
    * Different location (different part of the fault line, further south)
    * Different mechanism
• Rapid analysis of earthquakes is important for relief efforts
[Image captions: Peru, June 23, 2001, Mw=8.4; Sumatra, March 28, 2005, Mw=8.5]
Earth Science & Astronomy track today

Bioinformatics
GPS@: bioinformatics portal
  – http://gpsa.ibcp.fr/ web portal
  – Access to up-to-date sequence and 3D-structure databanks (EMBL, GenBank, SWISSPROT etc.)
  – Tens of legacy bioinformatics codes
• Convenient, easy-to-use interface with access to well-known databanks
• Uses grid resources to analyse the sequences

Data, Data, Data
Slide by Carole Goble

Main trend
The amount of data an organization owns, manages and depends on is increasing dramatically:
  – Ownership cost of storage capacity goes down
  – Data generated and consumed goes up
  – Network capacity goes up
  – Distributed computing technology matures and is more widely adopted

How e-Infrastructures help e-Science
• e-Infrastructures provide easier access for
  – Small research groups
  – Scientists from many different fields
  – Remote and still developing countries
• To new technologies
  – Produce and store massive amounts of data
  – Transparent access to millions of files across different administrative domains
  – Low-cost access to resources
    * Mobilise large amounts of CPU & storage at short notice (PC clusters)
  – High-end facilities (supercomputers)
• And help to find new ways to collaborate
  – Develop applications using distributed complex workflows
  – Ease distributed collaborations
  – Provide new ways of community building
  – Give easier access to higher education

EGEE
Flagship grid infrastructure project co-funded by the European Commission
Now in its 2nd phase, with 91 partners in 32 countries
Objectives
• Large-scale, production-quality grid infrastructure for e-Science
• Attract new resources and users from industry as well as science
• Maintain and further improve the gLite grid middleware

Applications on EGEE
• Multitude of applications from a growing number of domains
  – Astrophysics
  – Computational Chemistry
  – Earth Sciences
  – Financial Simulation
  – Fusion
  – Geophysics
  – High Energy Physics
  – Life Sciences
  – Multimedia
  – Material Sciences
  – …
Keynote by Luigi Fusco on Wednesday
Book of abstracts: http://doc.cern.ch//archive/electronic/egee/tr/egee-tr-2006-005.pdf

Production Usage Status
[Charts: number of sites (axis 0 to 250) and number of CPUs (axis 0 to 40,000), April 2004 to December 2006]
• ~17.5 million jobs run (6450 CPU-years) in 2006
• The workload of the "non-HEP" VOs is now significant: approaching 8-10K jobs per day and 1000 CPU-months/month
  – One year ago this was the overall scale of work for all VOs
Grid operations & management track on Thursday

EGEE Middleware Distribution
• gLite
  – Exploit experience and existing components from VDT (Condor, Globus), EDG/LCG and others
  – Develop a lightweight stack of generic middleware useful to EGEE applications (HEP and Life Sciences are pilot applications)
    * Pluggable components, to cater for different implementations
    * Follow the SOA approach, WS-I compliant where possible
  – Focus is on re-engineering and hardening
  – Business-friendly open source license
    * Moving to Apache-2
Tutorial held yesterday

Grid Middleware
[Diagram, top to bottom:
  Applications
  Higher-Level Grid Services: Workload Management, Replica Management, Visualization, Workflow, Grid Economies, ...
  Foundation Grid Middleware: Security model and infrastructure; Computing (CE) and Storage Elements (SE); Accounting; Information and Monitoring]
• Applications have access both to Higher-Level Grid Services and to Foundation Grid Middleware
• Higher-Level Grid Services are meant to help users build their computing infrastructure but should not be mandatory
• Foundation Grid Middleware will be deployed on the EGEE infrastructure
  – Must be complete and robust
  – Should allow interoperation with other major grid infrastructures
  – Should not assume the use of Higher-Level Grid Services

gLite Grid Middleware Services
[Diagram of service groups:
  Access: CLI, API
  Security: Authorization, Auditing, Authentication
  Information & Monitoring: Information & Monitoring, Application Monitoring
  Data Management: Metadata Catalog, Storage Element, File & Replica Catalog, Data Movement
  Workload Management: Accounting, Job Provenance, Computing Element, Package Manager, Workload Management, Site Proxy]
Overview paper: http://doc.cern.ch//archive/electronic/egee/tr/egee-tr-2006-001.pdf

Grid of Grids - from Local to Global
[Diagram labels: National, Campus, Community]

OSG sites
Keynote by Ruth Pordes on Wednesday

32 Virtual Organizations - participating Groups
• 3 with >1000 jobs max. (all particle physics)
• 3 with 500-1000 max. (all outside physics)
• 5 with 100-500 max. (particle, nuclear and astro physics)

The DEISA supercomputing environment
(21,900 processors and 145 Tf in 2006, more than 190 Tf in 2007)
• IBM AIX super-cluster
  – FZJ-Jülich, 1312 processors, 8.9 teraflops peak
  – RZG-Garching, 748 processors, 3.8 teraflops peak
  – IDRIS, 1024 processors, 6.7 teraflops peak
  – CINECA, 512 processors, 2.6 teraflops peak
  – CSC, 512 processors, 2.6 teraflops peak
  – ECMWF, 2 systems of 2276 processors each, 33 teraflops peak
  – HPCx, 1600 processors, 12 teraflops peak
• BSC, IBM PowerPC Linux system (MareNostrum), 4864 processors, 40 teraflops peak
• SARA, SGI ALTIX Linux system, 416 processors, 2.2 teraflops peak
• LRZ, Linux cluster (2.7 teraflops) moving to SGI ALTIX system (5120 processors and 33 teraflops peak in 2006, 70 teraflops peak in 2007)
• HLRS, NEC SX8 vector system, 646 processors, 12.7 teraflops peak
• Systems interconnected with a dedicated 1 Gb/s network, currently being upgraded to 10 Gb/s, provided by GEANT and NRENs
V. Alessandrini, IDRIS-CNRS (EGEE Workshop on Management of Rights in Production Grids, Paris, June 19th, 2006)

National Research Grid Infrastructure (NAREGI) 2003-2007
• Petascale grid infrastructure R&D for future deployment
  – $45 mil (US) + $16 mil x 5 (2003-2007) = $125 mil total
  – Hosted by the National Institute of Informatics (NII) and the Institute of Molecular Science (IMS)
  – PL: Ken Miura (Fujitsu/NII)
    * Sekiguchi (AIST), Matsuoka (Titech), Shimojo (Osaka-U), Aoyagi (Kyushu-U)…
  – Participation by multiple (>= 3) vendors: Fujitsu, NEC, Hitachi, NTT, etc.
  – Follow and contribute to GGF standardization, esp. OGSA
[Diagram: focused "Grand Challenge" grid application areas (Nanotech Grid Apps, "NanoGrid", IMS, ~10 TF; Biotech Grid Apps, BioGrid, RIKEN; other apps, other institutions) on top of National Research Grid Middleware R&D (grid middleware, grid and network management; NEC, Osaka-U, Titech, AIST, Fujitsu, U-Tokyo, U-Kyushu, Hitachi) and grid R&D infrastructure (15 TF-100 TF) over SuperSINET]
Keynote by Satoshi Matsuoka on Thursday

Interoperability
• Interoperability between e-Infrastructures is essential to provide services to global user communities
• The "Grid-Interoperability-Now" (GIN) group within the Open Grid Forum is providing a good environment for practical developments
• Experience shows this work is most successful when it is driven by the needs of user communities
Middleware & interoperability track on Wednesday & Thursday

Collaborating e-Infrastructures
[Map label: TWGRID]
Potential for linking ~80 countries

Middleware Standards
Slide by Dave Snelling

Middleware Concepts
Slide by Dave Snelling

Co-located with OGF 20
www.eu-egee.org

The Future of Grids
• Increasing the number of infrastructure users by increasing awareness
  – Dissemination and outreach
  – Training and education
  – Grids offer new opportunities for collaborative work
• Increasing the number of applications by improving application support and middleware functionality
  – Increase stability, scalability and usability
    * Major efforts needed, particularly on VO management, security infrastructure, data management and job management
  – High-level grid middleware extensions
• Increasing the grid infrastructure
  – Increase manageability of grid services
  – Reduce the cost of operation
  – Ensure interoperability between infrastructures
• Protecting user investments
  – Better involvement of industry
  – Move towards a sustainable grid infrastructure
Education track on Wednesday
Industry & Government track today

Sustainability: Beyond EGEE-II
• Need to prepare for a permanent grid infrastructure
  – Ensure reliable and adaptive support for all sciences
  – Independent of short project funding cycles
  – Infrastructure managed in collaboration with national grid initiatives
Presentation by Dieter Kranzlmueller today

EGEE'07 Conference
Building Bridges…
• Between science and business
• Between users and infrastructures
• Between countries
• Between scientific disciplines
• Between projects
• Etc.
http://www.eu-egee.org/egee07

Summary
• Grids are all about sharing: they are a means of working with groups around the world
  – Today we have a window of opportunity to move grids from research prototypes to permanent production systems (as networks did a few years ago)
• Interoperability is key to providing the level of support required for our user communities
• EGEE operates the world's largest multi-disciplinary grid infrastructure for scientific research
  – In constant and significant production use
• Need to prepare for the long term
  – EGEE, collaborating projects, national grid initiatives and user communities are working to define a model for a sustainable grid infrastructure that is independent of short project cycles
www.eu-egee.org
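
Worked example (not part of the original slides): the medical image processing slide quotes 2623 hours sequentially, 146 hours on a 20-processor farm and under 20 hours on the Grid with 20 x 12 = 240 computers. The sketch below turns those figures into speedup and parallel efficiency. It assumes that the 240 computers are the resources behind the grid figure, which the slide only implies, and because the slide says "<20 hours" the grid numbers are lower bounds. All function and variable names are illustrative.

```python
# Figures quoted on the "Medical image processing" slide.
SEQUENTIAL_HOURS = 2623.0    # 20 patients on a single processor
FARM_HOURS = 146.0           # 20-processor computing farm
FARM_PROCESSORS = 20
GRID_HOURS_BOUND = 20.0      # slide says "<20 hours": an upper bound on run time
GRID_COMPUTERS = 20 * 12     # 240 computers (assumed to be what the grid run used)


def speedup(sequential_hours: float, parallel_hours: float) -> float:
    """Ratio of sequential run time to parallel run time."""
    return sequential_hours / parallel_hours


def efficiency(sequential_hours: float, parallel_hours: float, workers: int) -> float:
    """Speedup divided by the number of workers (1.0 would be ideal scaling)."""
    return speedup(sequential_hours, parallel_hours) / workers


farm_speedup = speedup(SEQUENTIAL_HOURS, FARM_HOURS)
farm_efficiency = efficiency(SEQUENTIAL_HOURS, FARM_HOURS, FARM_PROCESSORS)

# The grid run finished in under 20 hours, so these are minimum values.
grid_speedup_min = speedup(SEQUENTIAL_HOURS, GRID_HOURS_BOUND)
grid_efficiency_min = efficiency(SEQUENTIAL_HOURS, GRID_HOURS_BOUND, GRID_COMPUTERS)

if __name__ == "__main__":
    print(f"Farm: speedup {farm_speedup:.1f}x, efficiency {farm_efficiency:.0%}")
    print(f"Grid: speedup at least {grid_speedup_min:.1f}x, "
          f"efficiency at least {grid_efficiency_min:.0%}")
```

Under these assumptions the farm runs close to ideal scaling, while the grid trades some per-machine efficiency for a much shorter turnaround, which is the point the slide makes.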
