Introduction to the NERSC J90 Cluster
David Turner & Tom DeBoni
NERSC User Services Group
April 1999
11/20/2008
Hardware Overview
J90se CPU
100 MHz, 200 MFlop, 64-bit vector processor
"Scalar enhanced"
J90 System
Multiple J90se CPUs
1 GWord shared memory
Large, fast RAID filesystem (/tmp)
13 April, 1999
Intro to the J90 Systems
Hardware Overview (cont.)
NERSC J90 Cluster
Machine    CPUs   /tmp     Use
killeen    20     161 GB   Interactive system (ssh, telnet, cqsub)
bhaskara   28     368 GB   Batch system (cqsub)
franklin   28     371 GB   Batch system (cqsub)
seymour    28     334 GB   Batch system (cqsub)
fcrick     28     229 GB   Batch system (cqsub)
jwatson    28     229 GB   Batch system (cqsub)
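The per-machine figures in the cluster table can be totalled with a few lines of shell arithmetic. This is only a sketch: the CPU counts and /tmp sizes are copied from the table, the 200 MFlop per-CPU peak comes from the J90se CPU slide, and the helper name add_all is made up for this example.

```shell
#!/bin/sh
# CPU counts and /tmp sizes (GB) copied from the cluster table.
cpus="20 28 28 28 28 28"
tmp_gb="161 368 371 334 229 229"

# add_all is a hypothetical helper: it sums its integer arguments.
add_all() {
    total=0
    for n in "$@"; do
        total=$((total + n))
    done
    echo "$total"
}

# Word splitting of the unquoted lists is intentional here.
total_cpus=$(add_all $cpus)
total_tmp=$(add_all $tmp_gb)

# 200 MFlop peak per CPU, as quoted for the J90se.
peak_mflop=$((total_cpus * 200))

echo "CPUs: $total_cpus"
echo "/tmp: $total_tmp GB"
echo "Peak: $peak_mflop MFlop"
```

With the table's numbers this reports 160 CPUs, 1692 GB of /tmp, and a combined theoretical peak of 32000 MFlop across the six machines.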
File Systems
$HOME
“permanent” (but not archival)
5 GB quota, regular backups, file migration
93.7 GB total
local to killeen, NFS-mounted on batch systems
poor performance for batch jobs
path forms: /u/repo/user, /Un/user
example: /u/ccc/dpturner, /U0/dpturner
File Systems (cont.)
$TMPDIR
temporary (created/destroyed each session)
no quota (but NQS limits 4 GB - 32 GB)
no backups, no migration
local to each machine
high-performance RAID arrays
system manages this for you
A.K.A. $BIG
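Because $HOME is NFS-mounted on the batch systems and $TMPDIR sits on local RAID, a common pattern is to stage files into $TMPDIR, run there, and copy results back before the session ends. The sketch below illustrates that pattern only; the function name run_in_tmpdir and the file names input.dat and output.dat are hypothetical, not NERSC conventions.

```shell
#!/bin/sh
# Sketch: stage input into fast scratch space, run there, copy results back.
# run_in_tmpdir, input.dat and output.dat are hypothetical names.
run_in_tmpdir() {
    src_dir=$1      # directory holding input.dat (e.g. under $HOME)
    work_dir=$2     # fast scratch directory (e.g. $TMPDIR)
    shift 2         # remaining arguments are the command to run

    # Copy the input to scratch space.
    cp "$src_dir/input.dat" "$work_dir/" || return 1

    # Run the command from inside the scratch directory.
    ( cd "$work_dir" && "$@" ) || return 1

    # $TMPDIR is destroyed at the end of the session, so save the results.
    cp "$work_dir/output.dat" "$src_dir/"
}

# In a job this might be called as (paths are assumptions):
#   run_in_tmpdir "$HOME/myrun" "$TMPDIR" "$HOME/myrun/a.out"
```

The copy back at the end matters because $TMPDIR is created and destroyed with each session and is never backed up.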
File Systems (cont.)
/tmp
location of $TMPDIR
14-day lifetime
A.K.A. /big
you manage this for yourself
HPSS
archival tape storage (and file migration)
no quota (but tracked by “SRU”)
access with hsi, pftp, or ftp
Environment
Shells
Supported
  sh
  csh
  ksh (same as sh)
Unsupported
  tcsh (module load tcsh)
  bash (module load tools)
Environment (cont.)
Modules
Found on many Unix systems
Sets environment variables, aliases, executable search paths, man search paths, header file include paths, and/or library load paths
Exercise care modifying startup files!
Useful options
module list
module avail
module load modfile
module switch modfile modfile.rev
module display modfile
module help modfile
Compiling and linking
Programming
Fortran 90 - f90
  No Fortran 77 compiler
C/C++ - cc, CC
Assembler - as
Cray Message Passing Toolkit
Use compiler (f90, cc, CC) for linking also
f90 file naming conventions
  Use for Fortran 77-style code:
    filename.f - fixed form
    filename.F - fixed form, run preprocessor first
  Use for Fortran 90-style code:
    filename.f90 - free form
    filename.F90 - free form, run preprocessor first
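The suffix rules above can be expressed as a small shell lookup, which may be handy when sorting out a directory of mixed old and new sources. This is an illustrative sketch; source_form is a hypothetical helper name, not a NERSC or Cray tool, and the file names are examples.

```shell
#!/bin/sh
# Map a Fortran source file name to the source form f90 assumes for it,
# following the suffix conventions listed above.
# source_form is a hypothetical helper name.
source_form() {
    case $1 in
        *.f90) echo "free form" ;;
        *.F90) echo "free form, preprocessed" ;;
        *.f)   echo "fixed form" ;;
        *.F)   echo "fixed form, preprocessed" ;;
        *)     echo "unknown" ;;
    esac
}

source_form solver.f90
source_form legacy.F
```

The two calls print "free form" and "fixed form, preprocessed". The match is case-sensitive, which is what distinguishes the preprocessed suffixes from the plain ones.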
Compiling and linking (cont.)
Useful compiler options
-dp  Disable double precision
-rlistop  Controls content of listing file
-Gn  Debugging level
  0  Full debugging (same as -g)
  1  Block by block debugging
-Rrunop  Run-time checking
  a  Argument number and type
  b  Array bounds
Compiling and linking (cont.)
Useful compiler options (cont.)
-ev Static storage
-On  Optimization level
  0  None
  1  Conservative: global scalar optimization
  2  Moderate: loop nest restructuring
  3  Aggressive: autotasking
Compiler Options Comparison
Feature          cf77        f90
Static storage   -a          -ev
Autotasking      -Zp         -Otask3
Vectorization    -Zv         -Otask0,scalar3,vector3
Optimization     -Zp -Zv     -Otask3,scalar3,vector3
Overindexing     (default)   -Ooverindex
Compiling and linking (cont.)
Use make for large projects
setenv NPROC 2
Fortran 90 modules
Don't confuse with the module command
Each Fortran 90 module must be compiled before any routine that uses it
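The ordering rule can be sketched as a build sequence. The file names constants.f90 and main.f90 are hypothetical, with main.f90 assumed to use a module defined in constants.f90. By default the sketch only prints the commands; setting DRYRUN to an empty string would run f90 for real on a system that has it.

```shell
#!/bin/sh
# Hypothetical layout: constants.f90 defines a module used by main.f90.
# By default DRYRUN=echo, so the commands are printed rather than executed.
DRYRUN=${DRYRUN-echo}

build() {
    # 1. Compile the file that defines the module first.
    $DRYRUN f90 -c constants.f90 &&
    # 2. Then compile the code that uses it.
    $DRYRUN f90 -c main.f90 &&
    # 3. Link with the compiler, as recommended for linking.
    $DRYRUN f90 -o prog constants.o main.o
}

build
```

In a makefile the same ordering would be expressed as a dependency of main.o on the object file of the module it uses.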
Useful linker options
-Mopt Load map options
Execution
Multiprocessing
setenv NCPUS 4    (csh)
export NCPUS=4    (ksh)
"a.out: Command not found."
The current directory is not in the command search path; run the program as:
./a.out ...
Execution (cont.)
Interactive
killeen, during “business hours”
Small (.le. 80MW), short (.le. 10 hours) jobs
Batch
killeen, nights and weekends
  Small (.le. 40MW), short (.le. 6 hours) jobs
bhaskara, franklin, seymour, fcrick, jwatson
  Big (.le. 512MW), long (.le. 168 hours) jobs
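The batch limits above amount to a simple two-step test on memory and run time. The sketch below encodes the quoted limits; the function names batch_class and mw_to_mb are hypothetical, and the conversion assumes 8 bytes per 64-bit word with MW and MB counted in the same units.

```shell
#!/bin/sh
# Classify a batch job by the limits quoted above.
# Arguments: memory in MW (megawords), run time in hours.
# batch_class is a hypothetical helper name.
batch_class() {
    mem_mw=$1
    hours=$2
    if [ "$mem_mw" -le 40 ] && [ "$hours" -le 6 ]; then
        echo "small"        # fits killeen batch (nights and weekends)
    elif [ "$mem_mw" -le 512 ] && [ "$hours" -le 168 ]; then
        echo "big"          # needs one of the dedicated batch systems
    else
        echo "over limit"   # exceeds the limits listed on the slide
    fi
}

# Assumption: one word is 8 bytes on this 64-bit system.
mw_to_mb() {
    echo $(($1 * 8))
}

batch_class 32 4
batch_class 200 48
mw_to_mb 512
```

The three calls print "small", "big", and 4096, so the 512 MW ceiling corresponds to roughly 4 GB under the stated assumption.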
Debugging
totalview -h
Display brief summary of commands
totalview -L
Line-mode interface (similar to cdbx)
Libraries
Mathematics
default includes nag, imsl, slatec
modules for lsode, harwell
Graphics
ncar
gnuplot
I/O
HDF - module load hdf
netCDF - module load netcdf
Applications
Amber
module load amber41
Ansys
module load ansys54
Basis
module load basis11.8
Gamess
module load gamess
Gaussian
module load g98
Nastran
module load nastran
Batch Computing
• User creates shell script myscript
• Submits to NQE with cqsub myscript
  • Returns NQE task id (e.g., t1234)
• NQE selects machine and forwards to NQS
  • Job remains pending (NPend) until resources are available
• NQS runs the job
  • Assigns NQS job id (e.g., 5678.bhaskara)
  • Runs job in appropriate batch queue
• Job log returned upon completion
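A job therefore carries two identifiers during its life: an NQE task id while it is pending and an NQS job id once it has been forwarded to a machine. The sketch below tells them apart purely by the shape of the two examples given above (t1234 and 5678.bhaskara); id_kind is a hypothetical helper, and the patterns are an assumption drawn from those examples rather than a documented id format.

```shell
#!/bin/sh
# Distinguish the two kinds of id by their shape.
# Patterns are inferred from the examples t1234 and 5678.bhaskara only.
# id_kind is a hypothetical helper name.
id_kind() {
    case $1 in
        t[0-9]*)
            echo "NQE task id" ;;
        [0-9]*.?*)
            # Text after the first dot is taken to be the machine name.
            echo "NQS job id on ${1#*.}" ;;
        *)
            echo "unknown" ;;
    esac
}

id_kind t1234
id_kind 5678.bhaskara
```

The two calls print "NQE task id" and "NQS job id on bhaskara", which shows which machine NQE selected for the job.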
Tools
ja
./name
ja -cst -n name
hpm
prof
flowview
atexpert