High Performance Computing at EPA Presentations

Shared by: EPADocs
posted:
5/15/2008
language:
English
High Performance Computing at EPA
National Environmental Scientific Computing Center
June 3, 2004
John B. Smith
NTSD/OEI/OTOP

Agenda
• Emerald – EPA’s New HPC System
• EPA’s HPC Timeline
• EPA’s Data Growth
• Grid
  – Description
  – Data vs. Compute Grid
• EPA’s Grid Project
  – Phase 1
  – Phase 2
• Futures
  – Top 500 Comparisons

System name: emerald
IBM eServer Cluster 1600
  2 interactive nodes
  14 batch nodes
  High Performance Switch
  AIX 5L operating system
Each node:
  8 Power4+ processors, 1.5 GHz (6 Gflop/s peak)
  16 GB memory
128 processors = 768 Gflop/s peak
IBM FAStT900 Storage System
  Fibre Channel (10K) disks
  Managed under GPFS
  16 TB total disk
  11.4 TB usable space

Top 500 List
1. Earth Simulator Center – 36 Tflop/s, NEC
2. ASCI Q, LANL – 13.8, HP AlphaServer
3. X, Virginia Tech. – 10.3, Apple G5
4. Tungsten, NCSA – 9.8, Dell PowerEdge
5. Mpp2, PNNL – 8.6, HP Integrity

NESCC Mass Data Storage
[Chart: Total Data (TeraBytes), 0 to 900, by Fiscal Year, 1993 to 2006. Series: Actual Data, Projected Data, Data - 20%, Data + 20%.]

Grid
• Grid computing is a form of distributed computing that involves coordinating and sharing computing, application, data, storage, or network resources across dynamic and geographically dispersed organizations. Grid technologies promise to change the way organizations tackle complex computational problems. However, the vision of large scale resource sharing is not yet a reality in many areas. Grid computing is an evolving area of computing, where standards and technology are still being developed to enable this new paradigm.

Types of Grid Technology
• Data
  – Integrating Distributed Data
  – Streamline Data Access
  – Access to Multiple Data Sources
• Compute
  – Sharing of Resources
  – Secure Access (Policy, Authentication, Authorization)
  – Efficient Use of Resources

Grid Phase 1
• Phase 1 - Project Objective - Deploy a Data and Computational GRID across no more than 2 locations. (Note: Locations need not be distributed over a WAN)
• Achieved By:
  – Partnering with another EPA or industry partner that is committed to project success and whose existing network connectivity can facilitate data transfers as required by trial applications.
  – Limiting pioneer applications – previously identified as CMAQ.
  – Leveraging COTS and OpenSource GRID technologies for Phase 1 deployment. Eliminating custom development as much as feasible.
  – Leveraging the collective talents of Government and Contractor organizations as much as practical within existing constraints.

Phase 1 – Cont.
• Success measured by:
  – Demonstrating CMAQ, and the ability for this application to leverage distributed data and computational resources.
  – Demonstrating a Data GRID, and how distributed data sources can be leveraged among GRID participants.

Grid - Phase 2
• Address Management aspects of Grid
  – RAS
  – Performance
  – Security – certificate authority
  – Funding after August
  – MOU’s/IAG’s
• Identify other grid-worthy applications
• Extend GRID beyond the EPA Firewall to external partners.
• Provide access to the GRID through Science Portal

Futures
EPA Top 500 Performance Comparison
[Chart: Performance (in Gflop/s), 0 to 40000, against Time (in 6 month intervals), 1993/06 to 2004/06. Series: #1, #100, #500, EPA.]

Questions?
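The peak figures quoted on the emerald slide can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the node counts, clock rate, and disk sizes come from the slides, while the value of 4 floating-point operations per cycle per processor is an assumption (two fused multiply-add units per Power4+ core) chosen because it reproduces the slide's 6 Gflop/s figure. All variable and function names are hypothetical.

```python
# Figures from the emerald slide.
INTERACTIVE_NODES = 2
BATCH_NODES = 14
PROCESSORS_PER_NODE = 8
CLOCK_GHZ = 1.5
TOTAL_DISK_TB = 16.0
USABLE_DISK_TB = 11.4

# Assumption (not stated in the slides): each processor retires
# 4 floating-point operations per cycle (2 fused multiply-add units).
FLOPS_PER_CYCLE = 4


def peak_gflops_per_processor(clock_ghz, flops_per_cycle):
    """Theoretical peak of one processor in Gflop/s."""
    return clock_ghz * flops_per_cycle


def system_peak_gflops(nodes, processors_per_node, per_processor_gflops):
    """Theoretical peak of the whole cluster in Gflop/s."""
    return nodes * processors_per_node * per_processor_gflops


total_nodes = INTERACTIVE_NODES + BATCH_NODES
total_processors = total_nodes * PROCESSORS_PER_NODE
per_processor = peak_gflops_per_processor(CLOCK_GHZ, FLOPS_PER_CYCLE)
system_peak = system_peak_gflops(total_nodes, PROCESSORS_PER_NODE, per_processor)

# Share of raw disk capacity that remains usable under GPFS.
usable_fraction = USABLE_DISK_TB / TOTAL_DISK_TB

if __name__ == "__main__":
    print(f"processors:      {total_processors}")
    print(f"per processor:   {per_processor} Gflop/s peak")
    print(f"system peak:     {system_peak} Gflop/s")
    print(f"usable disk:     {usable_fraction:.1%} of raw capacity")
```

Under these assumptions the script arrives at the slide's own totals of 128 processors and 768 Gflop/s peak. These are theoretical peaks, not measured benchmark results such as those used to rank the Top 500 list.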
