The race with MPRACE: On GRAPE, FPGA, Petaflop/s
Application Driven Reconfigurable Computing for Astrophysics and other Fields
Rainer Spurzem, Astronomisches Rechen-Institut
Zentrum für Astronomie Univ.Heidelberg, Germany
spurzem@ari.uni-heidelberg.de
http://www.ari.uni-heidelberg.de/mitarbeiter/spurzem/
(ARI)
Foundation Document of ARI
May 10, 1700
Calendar Patent of Duke of Brandenburg
Collaboration:
Sverre Aarseth (IoA Cambridge UK),
David Merritt (RIT, USA),
Naohito Nakasato, Tsuyoshi Hamada
(RIKEN Japan),
Simon Portegies Zwart, Alessia Gualandris
(U Amsterdam),
Dez. 06 COEHT 2007
The GRACE Project = GRAPE + MPRACE
Astrophysical Computer Simulations using Programmable Hardware
R. Spurzem, R. Männer, A. Burkert with
G. Lienhart, G. Marcus, A. Kugel
P. Berczik, I. Berentzen, M. Wetzstein, T. Naab…
Interdisciplinary: Computer Science and Astrophysics
Univ. Heidelberg (ARI-ZAH), Munich (USM)
Univ. Mannheim (Techn. Informatik)
MWK Baden-Württemberg
Astrophysics
Computer Simulation of Dense Star Clusters
Example 1: Galactic Globular Clusters
Gravitational star-star interaction
Complexity N^2 (N: number of stars)
[Figures: ground-based view of the globular star cluster ω Centauri; central region with the Hubble Space Telescope]
Astrophysics
Example 2: Motion of supermassive black holes (SMBH) in central galactic star clusters (not shown here), gravitational wave emission, relativistic dynamics
Left: Orbits of a triple SMBH in a central star cluster (not shown here), simulated with NBODY6++
Right: SMBH coalescence, gravitational wave detection with the space antenna LISA (2015). Source: ESA
LISA: Binary Black Holes in the Universe
Terrestrial detectors (VIRGO, GEO600, LIGO): galactic compact objects (black holes, neutron stars…), higher frequencies
[Figure: astrophysical sources. Terrestrial detectors: GEO600 Hannover, VIRGO, LIGO, TAMA, AIGO. Space detectors: LISA]
Hardware - GRAPE
~128 Gflops for a price ~5K USD; Memory for up to 128K particles
GRAPE6a PCI board
GRAPE6a, -BL - PCI ASIC Board for PC-Clusters
PROGRAPE-4, FPGA based board from RIKEN (Hamada)
GRAPE7 – new FPGA based board from Tokyo Univ. (Fukushige)
GRAPE-DR – new board from Makino et al. NAOJ
MPRACE1,2 – FPGA boards from Univ. Mannheim/GRACE (Kugel et al.)
Basic idea of any GRAPE N-body code:
~N  ~N^2
\[ \vec{a}_i = \sum_{j=1;\, j \neq i}^{N} \vec{f}_{ij}, \qquad \vec{f}_{ij} = -\frac{G\, m_j\, \vec{r}_{ij}}{\left(r_{ij}^2 + \varepsilon^2\right)^{3/2}} \]
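The softened pairwise sum on this slide can be sketched in a few lines of Python. This is only an illustration of the ~N^2 direct summation, not the GRAPE or NBODY6++ implementation; the function name `accelerations`, the defaults `G=1.0` and `eps=0.0`, and the convention r_ij = r_i - r_j are assumptions made for the example.

```python
import math

def accelerations(masses, positions, eps=0.0, G=1.0):
    """Softened direct-summation accelerations; costs ~N^2 pair evaluations."""
    n = len(masses)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if j == i:
                continue  # skip the self-interaction (j != i in the sum)
            # Assumed convention: r_ij = r_i - r_j, so the minus sign
            # below makes the force on i point towards j (attraction).
            rij = [positions[i][k] - positions[j][k] for k in range(3)]
            s2 = sum(c * c for c in rij) + eps * eps  # r_ij^2 + eps^2
            inv_s3 = 1.0 / (s2 * math.sqrt(s2))       # (r_ij^2 + eps^2)^(-3/2)
            for k in range(3):
                acc[i][k] -= G * masses[j] * rij[k] * inv_s3
    return acc
```

Calling `accelerations([1.0, 1.0], [[0, 0, 0], [1, 0, 0]])` gives two equal and opposite accelerations along x; the double loop is exactly the part a GRAPE board evaluates in hardware pipelines.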
GRAPE = GRAvity PipE – more detail…
\( m_i;\ \vec{r}_i;\ \vec{v}_i;\ t_i \)
\( m_j;\ \vec{r}_j;\ \vec{v}_j;\ t_j \)
\( \phi_i;\ \vec{a}_i;\ \dot{\vec{a}}_i \)
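The three quantities listed last on this slide (potential, acceleration and its time derivative, the "jerk") can be written out for a single pair as follows. This is a hedged sketch of the pairwise terms a Hermite scheme needs, assuming softened point masses and the convention r_ij = r_i - r_j, v_ij = v_i - v_j; the function name `pair_terms` and its defaults are illustrative, not the GRAPE interface.

```python
import math

def pair_terms(mj, ri, vi, rj, vj, eps=0.0, G=1.0):
    """Contribution of particle j to potential, acceleration and jerk of i."""
    rij = [a - b for a, b in zip(ri, rj)]      # r_ij = r_i - r_j (assumed)
    vij = [a - b for a, b in zip(vi, vj)]      # v_ij = v_i - v_j (assumed)
    s2 = sum(c * c for c in rij) + eps * eps   # softened squared distance
    s = math.sqrt(s2)
    rv = sum(a * b for a, b in zip(rij, vij))  # r_ij . v_ij
    phi = -G * mj / s
    acc = [-G * mj * c / (s2 * s) for c in rij]
    # Time derivative of acc: d/dt of r_ij / s^3 with d(s^2)/dt = 2 r_ij . v_ij
    jerk = [-G * mj * (v / (s2 * s) - 3.0 * rv * c / (s2 * s2 * s))
            for c, v in zip(rij, vij)]
    return phi, acc, jerk
```

Summing these terms over all j gives the per-particle values; for purely radial motion along x the jerk reduces to 2 G m_j v / x^3, which is a convenient sanity check.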
ARI-ZAH + RIT 32 node GRAPE6a clusters
Performance Analysis (3.2 Tflop/s):
Harfst et al. 2006, New Astron., in press, astro-ph/0608125
Hardware - GRAPE
[Figure: sustained speed (Gflop/s, 10^-1 to 10^4) versus particle number N (10^3 to 10^6); curves labelled 01, 02, 04, 08, 16, 32, with reference labels GRAPE6 and GRAPE6a]
ARI-ZAH GRAPE cluster: 32 x GRAPE6a, ~3.2 Tflop/s sustained
Up to 4 million stars! World record in this class (direct N-body)!
Harfst, Gualandris, Merritt, Spurzem, Portegies Zwart, Berczik 2006, New Astron. in press, astro-ph/0608125
Software, NBODY6++
O(N p) + O(N^2/p) [ + O(N N_n/p) ]
1. O(N p): communication
2. O(N^2/p): long range, regular force
3. O(N N_n/p): short range, irregular force
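A short worked estimate shows why the first two terms limit the useful number of processors p. The constants a and b below are hypothetical prefactors introduced only for this sketch, and the bracketed neighbour term is ignored:

```latex
T(p) \approx a\,N\,p + b\,\frac{N^2}{p},
\qquad
\frac{dT}{dp} = a\,N - b\,\frac{N^2}{p^2} = 0
\;\Rightarrow\;
p_{\mathrm{opt}} = \sqrt{\frac{b\,N}{a}},
\qquad
T(p_{\mathrm{opt}}) = 2\sqrt{a\,b}\;N^{3/2}
```

Under these assumptions the optimal processor count grows only as the square root of N, so larger particle numbers are needed before more nodes pay off.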
Original code by S.J. Aarseth, S. Mikkola (ca. 20,000 lines):
• Hierarchical block time steps, 4th order predictor/corrector scheme
• Ahmad-Cohen neighbour scheme
• Kustaanheimo-Stiefel and chain regularization for close encounters (quaternions!)
• 4th order Hermite scheme (predictor/corrector)
• Parallelization (Spurzem 1999)
• Implementation on GRAPE cluster (Harfst et al. 2006)
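The hierarchical block time steps in the list above can be illustrated with a small sketch: each particle's desired step is rounded down to a power-of-two fraction of a maximum step, so that groups of particles become due at exactly the same time. The helper names `block_step` and `active_particles` and the default `dt_max=1.0` are assumptions for this example, not routines from NBODY6++.

```python
import math

def block_step(dt_wanted, dt_max=1.0):
    """Largest step dt_max / 2**k (k >= 0) that does not exceed dt_wanted."""
    if dt_wanted <= 0.0:
        raise ValueError("dt_wanted must be positive")
    k = max(0, math.ceil(math.log2(dt_max / dt_wanted)))
    return dt_max / 2 ** k

def active_particles(t_last, dt, t_now):
    """Indices of particles whose next update time t_last + dt equals t_now.

    Exact comparison is safe here because power-of-two steps and their
    sums are exactly representable in binary floating point.
    """
    return [i for i, (t0, h) in enumerate(zip(t_last, dt)) if t0 + h == t_now]
```

For example, a desired step of 0.3 is quantized to 0.25; all particles sharing a block are then advanced together, which is what makes the scheme suitable for parallel and pipelined force evaluation.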
Hardware
Pipeline Generation on FPGA I
(see talk by Gerhard Lienhart)
Pipeline Generation on FPGA II
(see talk by Gerhard Lienhart)
Hardware FPGA
MPRACE
GRAPE
• GRAPE moves the bottleneck to short range (neighbour) forces
• Use FPGA-platform for accelerating neighbour algorithm
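The split between long-range and short-range (neighbour) forces can be sketched as follows. This is a minimal illustration, assuming a fixed neighbour radius `r_neigh` and softened point masses; the names `pair_acc` and `split_acceleration` are hypothetical and do not describe the MPRACE or NBODY6++ code.

```python
import math

def pair_acc(mj, ri, rj, eps, G):
    """Softened acceleration on i due to j, with r_ij = r_i - r_j."""
    rij = [a - b for a, b in zip(ri, rj)]
    s2 = sum(c * c for c in rij) + eps * eps
    return [-G * mj * c / (s2 * math.sqrt(s2)) for c in rij]

def split_acceleration(i, masses, positions, r_neigh, eps=0.0, G=1.0):
    """Return (neighbour list, irregular force, regular force) for particle i."""
    neighbours = []
    irregular = [0.0, 0.0, 0.0]  # short range: particles inside r_neigh
    regular = [0.0, 0.0, 0.0]    # long range: everything else
    for j in range(len(masses)):
        if j == i:
            continue
        d2 = sum((a - b) ** 2 for a, b in zip(positions[i], positions[j]))
        a_ij = pair_acc(masses[j], positions[i], positions[j], eps, G)
        if d2 < r_neigh * r_neigh:
            neighbours.append(j)
            target = irregular
        else:
            target = regular
        for k in range(3):
            target[k] += a_ij[k]
    return neighbours, irregular, regular
```

The regular part is the expensive ~N sum per particle that suits a GRAPE pipeline, while the irregular part touches only a short neighbour list but must be updated far more often, which is the bottleneck the slide proposes to move to the FPGA.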
Hardware - GRACE
The GRACE architecture (GRAPE+MPRACE)
Univ. Heidelberg (ARI), Univ. Mannheim (LIV), Univ. Munich (USM), RIKEN Institute Tokyo
Infiniband, Dual PCIe, 20 Gb/s
32 Hosts
4 Tflops, 128 CPUs, 128 GB Memory
(64 P4 Xeon, 32 GRAPE, 32 Xilinx FPGA-MPRACE)
Preliminary, ongoing work
[Figure labels: Xeon 3.6 GHz, FPGA, 1 Pipeline, GRAPE, 12 Pipelines]
Prototype
Testing
Production
Summer 2007
Other Applications
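Smoothed Particle Hydrodynamics (SPH)
Hydrodynamic equation of motion, gravity:
\[ \frac{d\vec{v}_i}{dt} = -\frac{1}{\rho_i} \nabla P_i + \vec{a}_i^{\,visc} \]
SPH formulation:
\[ \frac{d\vec{v}_i}{dt} = -\sum_j m_j \left( \frac{p_i}{\rho_i^2} + \frac{p_j}{\rho_j^2} + \Pi_{ij} \right) \nabla_i W(r_{ij}, h_{ij}) \]
\[ \rho_i = \sum_{j=1}^{N} m_j W(r_{ij}, h_{ij}), \qquad p_i = P(\rho_i) \]
\[ \Pi_{ij} = \begin{cases} \dfrac{-\alpha\, c_{ij}\, \mu_{ij} + \beta\, \mu_{ij}^2}{\rho_{ij}} & \text{for } \vec{v}_{ij} \cdot \vec{r}_{ij} \le 0 \\ 0 & \text{for } \vec{v}_{ij} \cdot \vec{r}_{ij} > 0 \end{cases} \]
\[ \mu_{ij} = \frac{h_{ij}\, \vec{v}_{ij} \cdot \vec{r}_{ij}}{r_{ij}^2 + \eta^2 h_{ij}^2}\, f_{ij} \]
\[ \rho_{ij} = \frac{\rho_i + \rho_j}{2}, \quad f_{ij} = \frac{f_i + f_j}{2}, \quad \vec{r}_{ij} = \vec{r}_i - \vec{r}_j \]
\[ c_{ij} = \frac{c_i + c_j}{2}, \quad h_{ij} = \frac{h_i + h_j}{2}, \quad \vec{v}_{ij} = \vec{v}_i - \vec{v}_j \]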
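The artificial viscosity term Π_ij of the SPH formulation is a compact pairwise kernel of the kind that maps well onto a hardware pipeline. The sketch below evaluates it for one particle pair from already averaged pair quantities; the function name and the default values `alpha=1.0`, `beta=2.0`, `eta=0.1` are assumptions for illustration and are not taken from the slides.

```python
def artificial_viscosity(r_ij, v_ij, rho_ij, c_ij, h_ij,
                         f_ij=1.0, alpha=1.0, beta=2.0, eta=0.1):
    """Pi_ij for one pair; r_ij = r_i - r_j and v_ij = v_i - v_j are 3-vectors.

    rho_ij, c_ij, h_ij, f_ij are the arithmetic means of the per-particle
    density, sound speed, smoothing length and switch factor.
    """
    vr = sum(a * b for a, b in zip(v_ij, r_ij))  # v_ij . r_ij
    if vr > 0.0:
        return 0.0  # receding pair: no viscosity
    r2 = sum(c * c for c in r_ij)
    mu = h_ij * vr / (r2 + eta * eta * h_ij * h_ij) * f_ij
    return (-alpha * c_ij * mu + beta * mu * mu) / rho_ij
```

The term is non-zero only for approaching pairs (v_ij · r_ij ≤ 0), so it acts as a pressure-like contribution in compressions and vanishes in expanding flow.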
Other Applications
Molecular Dynamics
Protein Interactions, with Nanotubes, Ligands, Water
Cellular Signaling
Long Range Force: Fast TREE or direct GRAPE
Intermediate Range: FPGA
Prospective Partners:
* G. Sutmann, A. Schiller, NIC, FZ Jülich (using PROGRAPE FPGA Board, RIKEN Inst. Japan)
* EML Research Institute Heidelberg, S. Richter, R. Wade
How to build a super-GRACE…
… 50 Tflop/s machine for < 5 % of gen. purpose cost ?
•200 standard nodes, AMD Opteron or Pentium Xeon
•200 super-GRAPEs (250 Gflop/s) MPRACE-2, GRAPE-DR, PROGRAPE
•Super-Network (e.g. AMD Hypertransport, Xtoll-Connection Custom Network)
(AMD excellence centre with Univ. of Mannheim, U. Brüning)
Such a computer competes with general-purpose supercomputers on the Petaflop/s scale.
Used: performance model of Harfst et al. 06
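The headline figure follows from simple aggregation of the accelerator peak speeds quoted on the slides. The sketch below only multiplies node count by per-board speed; the function name `aggregate_tflops` is hypothetical, and the calculation says nothing about sustained performance, which the slide takes from the performance model of Harfst et al.

```python
def aggregate_tflops(nodes, gflops_per_accelerator):
    """Aggregate accelerator speed in Tflop/s, one accelerator per node."""
    return nodes * gflops_per_accelerator / 1000.0

# Proposed super-GRACE: 200 nodes with one 250 Gflop/s accelerator each.
super_grace = aggregate_tflops(200, 250.0)

# Existing GRAPE cluster: 32 nodes with one ~128 Gflop/s GRAPE6a each.
grape_cluster = aggregate_tflops(32, 128.0)
```

The first value reproduces the 50 Tflop/s quoted for the proposed machine; the second gives about 4.1 Tflop/s, in line with the 4 Tflops stated for the current 32-host system.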
Other Applications
Astrophysical Excellence Cluster, Univ. of Heidelberg (admitted for 2nd round), projected cooperation with informatics:
Co-ordination: Prof. J. Wambsganss (Director ZAH)
Co-I's: Prof. R. Klessen, Prof. R. Spurzem …
Information Science: Prof. Brüning, Prof. Männer ….
Further partners?