NATIONAL ENERGY RESEARCH SCIENTIFIC COMPUTING CENTER
Porting from the Cray T3E to the IBM SP
Jonathan Carter
NERSC User Services
Overview
• Focus is on Fortran programs using MPI for communication
• Outline common pitfalls:
– f90 vs. xlf Fortran compiler
– Cray vs. IBM MPI library
– Math libraries
– System libraries
– I/O
f90 vs. xlf - Main Differences
• f90
– compiles for parallel (MPI) automatically
– accepts file suffix .f90, .F90
– default optimization is -O2
– allows access to full memory on a PE by default
• xlf
– compiler is accessed by several names, each name
“packages” options together
– by default, only file suffix .f and .F allowed
– default is no optimization
– restricted amount of memory available by default
xlf Compiler Options
• Compiler name can have three parts:
– optional prefix “mp” indicates MPI library is automatically linked
– compiler name, xlf, xlf90, or xlf95 indicates language mode
– optional postfix “_r” indicates threads, or OpenMP capability
• Example:
– mpxlf90 - Fortran 90 language compiler with MPI library available
– mpxlf_r - Fortran 77 language compiler with MPI library, threads,
and OpenMP capability available.
• If you want to use MPI I/O, the thread capable compiler
must be used.
xlf Compiler Options
• To use different file suffixes, e.g. .f90 and .F90:
– -qsuffix=f=f90,F=F90
• For optimization we recommend:
– -O3 -qtune=pwr3 -qarch=pwr3 -qstrict
• xlf defaults to 32 Kbytes of stack space and 128 Mbytes of
heap space. To raise these to the maximums of 256 Mbytes for
stack and 2 Gbytes for heap:
– -bmaxstack:0x10000000 -bmaxdata:0x80000000
Default Datatypes
Type                 T3E length (bytes)   SP length (bytes)
Character            1                    1
Complex              2 x 8                2 x 4
Double Complex       2 x 8                2 x 8
Double precision     8                    8
Integer / Logical    8                    4
Real                 8                    4
• Double Complex is a language extension
• The T3E lengths assume the -dp flag for f90
• xlf compiler has -qrealsize=8 to promote all default
reals and real constants to 8 bytes. Also, -qintsize=8 to
promote all integers and logicals.
Available Datatypes
Type                 Kind   T3E length (bytes)   SP length (bytes)
Complex              4      2 x 4                2 x 4
                     8      2 x 8                2 x 8
                     16     NA                   2 x 16
Integer / Logical    1      4                    1
                     2      4                    2
                     4      4                    4
                     8      8                    8
Real                 4      4                    4
                     8      8                    8
                     16     NA                   16
• Fortran 77 “*” syntax is also available to explicitly define a
datatype
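
One way to avoid depending on either compiler's default lengths is to declare variables with an explicit kind or with the "*" syntax. The following is a minimal sketch, not taken from the original slides; the names r8 and i4 are illustrative, and it assumes (as the tables above indicate) that both compilers provide an 8-byte real and a 4-byte integer.

```fortran
program kinds
  implicit none
  ! Kind parameters requested by precision and range rather than
  ! relying on the compiler default (names r8 and i4 are illustrative).
  integer, parameter :: r8 = selected_real_kind(12)  ! at least 12 decimal digits
  integer, parameter :: i4 = selected_int_kind(9)    ! at least 9 decimal digits
  real(kind=r8)    :: x
  integer(kind=i4) :: n
  real*8           :: y   ! Fortran 77 "*" syntax, explicitly 8 bytes
  real             :: d   ! default real: length differs between T3E and SP

  x = 1.0_r8
  y = 1.0d0
  n = 1
  d = 1.0
  ! The default kind printed last is the one that changes between machines.
  print *, 'kind(x) =', kind(x), ' kind(y) =', kind(y)
  print *, 'kind(n) =', kind(n), ' default real kind =', kind(d)
end program kinds
```

Code written this way should not need -qrealsize=8 or -qintsize=8 on the SP, since no declaration depends on the default length.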
MPI Differences
• Different default datatypes between T3E and SP
• More error checking of arguments on the SP
• Default amount of buffering is different
• Different subset of MPI I/O implemented
Default MPI Datatypes
Type                   T3E length (bytes)   SP length (bytes)
MPI_Character          1                    1
MPI_Complex            2 x 8                2 x 4
MPI_Double_Complex     2 x 8                2 x 8
MPI_Double_Precision   8                    8
MPI_Integer            8                    4
MPI_Logical            8                    4
MPI_Real               8                    4
Available MPI Datatypes
Type             T3E length (bytes)   SP length (bytes)
MPI_Complex8     NA                   2 x 4
MPI_Complex16    NA                   2 x 8
MPI_Complex32    NA                   2 x 16
MPI_Integer1     4                    1
MPI_Integer2     4                    2
MPI_Integer4     4                    4
MPI_Integer8     8                    8
MPI_Logical1     NA                   1
MPI_Logical2     NA                   2
MPI_Logical4     NA                   4
MPI_Logical8     NA                   8
MPI_Real4        4                    4
MPI_Real8        8                    8
MPI_Real16       NA                   16
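
Because MPI_Real describes 8 bytes on the T3E but 4 bytes on the SP, a buffer declared with an explicit 8-byte kind can be paired with MPI_Real8, which has the same length on both machines. The sketch below is an illustration under that assumption and is not from the original slides; the array name a, its length n, and the kind name r8 are made up for the example.

```fortran
program bcast_real8
  implicit none
  include 'mpif.h'
  integer, parameter :: r8 = selected_real_kind(12)  ! 8-byte real
  integer, parameter :: n = 100
  real(kind=r8) :: a(n)
  integer :: mype, ierr

  call mpi_init(ierr)
  call mpi_comm_rank(MPI_COMM_WORLD, mype, ierr)

  a = 0.0_r8
  if (mype .eq. 0) a = 1.0_r8

  ! MPI_REAL8 matches real(kind=r8) on both machines, whereas MPI_REAL
  ! would describe only 4 bytes per element on the SP.
  ! Every PE passes the same count n, which also satisfies the
  ! stricter argument checking on the SP.
  call mpi_bcast(a, n, MPI_REAL8, 0, MPI_COMM_WORLD, ierr)

  call mpi_finalize(ierr)
end program bcast_real8
```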
MPI - Argument Checking
• T3E MPI library has several collective routines which do not
check arguments in accordance with the MPI standard. The
SP does check arguments.
• Examples:
– MPI_Bcast “count” argument is not checked for consistency on
T3E
– MPI_Gatherv array of “counts” is not checked for consistency on
T3E
MPI - Buffering
• If your program depends on the buffering of standard MPI
Sends and Receives, you may see different behavior between
the T3E and the SP.
• Classic case:
...
! Both PEs send first, so this completes only if the sends are buffered
if (mype.eq.0) then
   call mpi_send(buf,count,type,1,tag,MPI_COMM_WORLD,ierr)
   call mpi_recv(buf,count,type,1,tag,MPI_COMM_WORLD,status,ierr)
else if (mype.eq.1) then
   call mpi_send(buf,count,type,0,tag,MPI_COMM_WORLD,ierr)
   call mpi_recv(buf,count,type,0,tag,MPI_COMM_WORLD,status,ierr)
end if
...
MPI - Buffering
• On the T3E, messages of up to 4 Kbytes are buffered. This can
be changed by setting the environment variable
MPI_BUFFER_MAX.
• On the SP, the default size depends on the number of
processors:
Processors      Default size (bytes)
1 to 16         4096
17 to 32        2048
33 to 64        1024
65 to 128       512
129 to 256      256
257 and over    128
• This can be changed by setting the environment variable
MP_EAGER_LIMIT.
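
A pairwise exchange can be written so that it does not depend on buffering at all. The sketch below uses MPI_Sendrecv with separate send and receive buffers; this is one common alternative rather than a fix prescribed by the original slides, and the buffer names sbuf and rbuf, the length n, and the tag value are illustrative.

```fortran
program exchange
  implicit none
  include 'mpif.h'
  integer, parameter :: n = 10000
  integer, parameter :: tag = 1
  real*8 :: sbuf(n), rbuf(n)
  integer :: mype, npes, other, ierr
  integer :: status(MPI_STATUS_SIZE)

  call mpi_init(ierr)
  call mpi_comm_rank(MPI_COMM_WORLD, mype, ierr)
  call mpi_comm_size(MPI_COMM_WORLD, npes, ierr)

  ! Only PEs 0 and 1 take part in the exchange.
  if (npes .ge. 2 .and. mype .le. 1) then
     other = 1 - mype
     sbuf = dble(mype)
     ! The combined send and receive lets the library order the two
     ! transfers, so the exchange completes for any message size.
     call mpi_sendrecv(sbuf, n, MPI_REAL8, other, tag, &
                       rbuf, n, MPI_REAL8, other, tag, &
                       MPI_COMM_WORLD, status, ierr)
  end if

  call mpi_finalize(ierr)
end program exchange
```

Reversing the order of send and receive on one of the two PEs, or using nonblocking calls, would serve the same purpose.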
Cray SciLib and IBM ESSL
• Both vendors provide libraries of commonly used Linear
Algebra subroutines
• On the T3E the library is linked by default; on the SP use “-lessl”
• These libraries are faster than the public domain BLAS,
LAPACK, etc.
Using BLAS
• BLAS levels 1 through 3 are completely compatible between
the two machines
• Note which precision of BLAS is being called:
– On the T3E
real*8 a(n), b(n), x
…
x = sdot(n,a,1,b,1)
– On the SP
real*8 a(n), b(n), x
…
x = ddot(n,a,1,b,1)
Using BLAS
• Instead of changing program source, loader options can be
used to map one routine to another
• To resolve a call to sdot by a call to ddot on the SP:
xlf -o a.out -brename:sdot,ddot b.f
• To resolve a call to ddot by a call to sdot on the T3E:
f90 -o a.out -Wl"-Dequiv(DDOT)=SDOT" b.f
LAPACK routines
• Most other linear algebra routines in Cray SciLib and IBM
ESSL are compatible with LAPACK.
• In ESSL there are a few incompatibilities (x may be C, D, S,
Z):
xGEEV
xSPEV
xSPSV
xHPEV
xHPSV
xGEGV
xSYGV
• Use installed LAPACK library for these.
ScaLAPACK library
• Cray SciLib and IBM PESSL support pieces of the standard
ScaLAPACK library.
• Check precision of routines:
– For real*8 on the T3E, routine names start with “PS”
– For real*8 on the SP, routine names start with “PD”
• On the SP, you must call BLACS_GET followed by either
BLACS_GRIDINIT or BLACS_GRIDMAP. On the T3E,
only a call to one of the latter two routines is required.
• Public domain ScaLAPACK is also installed on both
machines.
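
The grid initialization order required on the SP can be sketched as follows. This is an illustrative example using the standard BLACS calling sequences; the 1 x nprocs grid shape is an arbitrary choice, and it assumes that the extra BLACS_GET call is also accepted on the T3E, as it is in the public domain BLACS.

```fortran
program grid_setup
  implicit none
  integer :: ictxt, iam, nprocs
  integer :: nprow, npcol, myrow, mycol

  ! Find out how many processes are available.
  call blacs_pinfo(iam, nprocs)
  nprow = 1
  npcol = nprocs

  ! Required on the SP: obtain the default system context first ...
  call blacs_get(0, 0, ictxt)
  ! ... and only then create the process grid (row-major ordering).
  call blacs_gridinit(ictxt, 'R', nprow, npcol)
  call blacs_gridinfo(ictxt, nprow, npcol, myrow, mycol)

  ! ScaLAPACK or PESSL calls would go here.

  call blacs_gridexit(ictxt)
  call blacs_exit(0)
end program grid_setup
```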
System Libraries
• System libraries are, generally, any routines that interact with the
operating system and provide extensions to the Fortran language.
• Cray provides very many such routines. Some are available
on the SP, for example:
T3E       SP          Function
Abort     Abort       Ends program
Exit      Exit_       Ends program
Flush     Flush_      Flushes Fortran I/O buffer
System    Ishell      Executes a command
Trbk      Xl__trbk    Prints a traceback
System Libraries
• A more comprehensive list is available at:
http://hpcf.nersc.gov/computers/SP/port.html
• Some routines have changed names and slightly different
arguments.
• There are sometimes identically or similarly named routines
on the SP which are designed to be called from C only.
Calling them from Fortran will cause unexpected behavior.
• For example, calling exit instead of exit_ will cause the
program to end without flushing any Fortran I/O buffer.
Fortran I/O
• Unformatted I/O
– The primitive datatypes on the T3E and SP are compatible (provided they
are of the same length), but control words inserted by the Fortran I/O
layer prevent sequential access files from being transferred.
– Direct access files can be freely transferred between the two machines, as
can MPI I/O files.
• Namelist Input/Output
– Users familiar with assign -f77 on the T3E, which causes old-style
namelist input to be written or read, can set the following environment
variable on the SP to obtain the same effect:
setenv XLFRTEOPTS "namelist=old"
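
Since direct access files can be moved between the two machines, data intended for transfer can be written that way. The following is a minimal sketch rather than code from the original slides; the file name a.dat, the unit number, and the array size are illustrative. The record length is obtained with inquire(iolength=...) so that the code does not depend on the units each compiler uses for recl.

```fortran
program direct_write
  implicit none
  integer, parameter :: r8 = selected_real_kind(12)  ! 8-byte real
  integer, parameter :: n = 1000
  real(kind=r8) :: a(n)
  integer :: reclen, i

  do i = 1, n
     a(i) = real(i, r8)
  end do

  ! Ask the compiler for the record length needed to hold the array,
  ! in whatever units it uses for recl.
  inquire(iolength=reclen) a

  ! Direct access, unformatted: fixed-length records, which the slides
  ! above describe as transferable between the T3E and the SP.
  open(unit=10, file='a.dat', form='unformatted', &
       access='direct', recl=reclen)
  write(10, rec=1) a
  close(10)
end program direct_write
```

The reading program on the other machine would open the file the same way, with an array of the same explicit kind and length.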
Further Information
• T3E and SP webpages and software webpages contain further
information and links to vendor documentation:
http://hpcf.nersc.gov/computers
http://hpcf.nersc.gov/software