High Performance Computing
at EPA
National Environmental Scientific
Computing Center
June 3, 2004
John B. Smith NTSD/OEI/OTOP
Agenda
• Emerald – EPA’s New HPC System
• EPA’s HPC Timeline
• EPA’s Data Growth
• Grid
– Description
– Data vs. Compute Grid
• EPA’s Grid Project
– Phase 1
– Phase 2
• Futures
– Top 500 Comparisons
System name: emerald
IBM eServer Cluster 1600
2 interactive nodes
14 batch nodes
High Performance Switch
AIX 5L operating system
Each node:
– 8 Power4+ processors
– 1.5 GHz (6 Gflop/s peak)
– 16 GB memory
128 processors = 768 Gflop/s peak
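The peak figure follows from the node counts above. A minimal sketch of the arithmetic, assuming 4 floating-point operations per cycle per processor (an assumption implied by the quoted 6 Gflop/s at 1.5 GHz, not stated on the slide); all variable names are illustrative.

```python
# Peak-performance check for emerald, using the figures on this slide.
INTERACTIVE_NODES = 2
BATCH_NODES = 14
PROCS_PER_NODE = 8
CLOCK_GHZ = 1.5
FLOPS_PER_CYCLE = 4  # assumption: implied by 6 Gflop/s peak at 1.5 GHz

nodes = INTERACTIVE_NODES + BATCH_NODES
total_procs = nodes * PROCS_PER_NODE
gflops_per_proc = CLOCK_GHZ * FLOPS_PER_CYCLE
peak_gflops = total_procs * gflops_per_proc

print(total_procs, peak_gflops)  # 128 768.0
```

This is a theoretical peak; sustained application performance is normally lower.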
IBM FAStT900 Storage System
Fibre Channel (10K) disks
Managed under GPFS
16 TB total disk
11.4 TB usable space
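The gap between total and usable space can be expressed as a fraction. A minimal sketch using only the two figures above; the slide does not say where the difference goes (RAID parity, spares, and file system overhead are typical causes, but that is an assumption).

```python
# Usable fraction of the FAStT900 storage, from the slide's two figures.
TOTAL_TB = 16.0
USABLE_TB = 11.4

usable_fraction = USABLE_TB / TOTAL_TB
overhead_tb = TOTAL_TB - USABLE_TB

print(f"usable fraction: {usable_fraction:.3f}")
print(f"overhead: {overhead_tb:.1f} TB")
```

Roughly 71% of the raw capacity is available to users.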
Top 500 List
1. Earth Simulator Center – 36 Tflop/s, NEC
2. ASCI Q, LANL – 13.8 Tflop/s, HP AlphaServer
3. X, Virginia Tech. – 10.3 Tflop/s, Apple G5
4. Tungsten, NCSA – 9.8 Tflop/s, Dell PowerEdge
5. Mpp2, PNNL – 8.6 Tflop/s, HP Integrity
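For scale, these figures can be set against emerald's 768 Gflop/s peak. A rough sketch only: the Top 500 list ranks systems by measured Linpack performance, while the emerald number is a theoretical peak, so the ratios below understate the real gap. The dictionary and variable names are illustrative.

```python
# Ratio of each top-5 system to emerald's theoretical peak (0.768 Tflop/s).
EMERALD_PEAK_TFLOPS = 0.768

top5_tflops = {
    "Earth Simulator": 36.0,
    "ASCI Q": 13.8,
    "X": 10.3,
    "Tungsten": 9.8,
    "Mpp2": 8.6,
}

ratios = {name: tflops / EMERALD_PEAK_TFLOPS for name, tflops in top5_tflops.items()}

for name, ratio in ratios.items():
    print(f"{name}: about {ratio:.0f}x emerald's peak")
```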
NESCC Mass Data Storage
[Chart: Total Data (TeraBytes), 0 to 900, by Fiscal Year, 1993 to 2006. Series: Actual Data, Projected Data, Data - 20%, Data + 20%.]
Grid
• Grid computing is a form of distributed computing that involves coordinating and sharing computing, application, data, storage, or network resources across dynamic and geographically dispersed organizations. Grid technologies promise to change the way organizations tackle complex computational problems. However, the vision of large scale resource sharing is not yet a reality in many areas — Grid computing is an evolving area of computing, where standards and technology are still being developed to enable this new paradigm.
Types of Grid Technology
• Data
– Integrating Distributed Data
– Streamline Data Access
– Access to Multiple Data Sources
• Compute
– Sharing of Resources
– Secure Access (Policy, Authentication, Authorization)
– Efficient Use of Resources
Grid Phase 1
• Phase 1 - Project Objective - Deploy a Data and Computational GRID across no more than 2 locations. (Note: Locations need not be distributed over a WAN.)
• Achieved By:
– Partnering with another EPA or industry partner that is committed to project success and whose existing network connectivity can facilitate data transfers as required by trial applications.
– Limiting pioneer applications – previously identified as CMAQ.
– Leveraging COTS and OpenSource GRID technologies for Phase 1 deployment. Eliminating custom development as much as feasible.
– Leveraging the collective talents of Government and Contractor organizations as much as practical within existing constraints.
Phase 1 – Cont.
• Success measured by:
– Demonstrating CMAQ, and the ability for this application to leverage distributed data and computational resources.
– Demonstrating a Data GRID, and how distributed data sources can be leveraged among GRID participants.
Grid - Phase 2
• Address Management aspects of Grid
– RAS
– Performance
– Security – certificate authority
– Funding after August
– MOU’s/IAG’s
• Identify other grid-worthy applications
• Extend GRID beyond the EPA Firewall to external partners.
• Provide access to the GRID through Science Portal
Futures
EPA Top 500 Performance Comparison
[Chart: Performance (in Gflop/s), 0 to 40000, vs. Time (in 6 month intervals), 1993/06 to 2004/06. Series: #1, #100, #500, EPA.]
Questions?