LAPACK3E – A Fortran 90-enhanced version of LAPACK
Document Sample


LAPACK3E – A Fortran 90-enhanced
version of LAPACK
Edward Anderson
SAIC Technology Solutions
Anderson.Edward@epa.gov
August 7, 2003
LAPACK 3E – A Fortran 90-enhanced 1
version of LAPACK
LAPACK overview
• A Fortran 77 library for numerical linear algebra operations
• Developed 1988-1992 with support from NSF and DOE
• LAPACK Users‟ Guide published by SIAM
• Several updates: LAPACK 2 (1995), LAPACK 3 (1999)
• Follow-on packages: ScaLAPACK (1997), LAPACK95 (2000)
• 20 million hits at www.netlib.org/lapack
• Adopted by NAG, IMSL, and Mathworks
• Incorporated into scientific libraries by Cray, SGI, Sun, HP,
Digital, …
• … but not IBM
LAPACK 3E – A Fortran 90-enhanced 2
version of LAPACK
Why not IBM?
• ESSL has much of the same functionality as LAPACK
• ESSL was developed concurrently and released first
• 23 routines from LAPACK have been adopted into ESSL
• Several unfortunate name space clashes
Example: SGEEV finds the eigenvalues and eigenvectors of a real
general matrix
• Same name in both LAPACK and ESSL
• Same operation
• Different argument lists
LAPACK 3E – A Fortran 90-enhanced 3
version of LAPACK
Why not use LAPACK 3?
• LAPACK 3 has some known bugs.
• The last update was in 2000.
• LAPACK 3 is not thread safe.
• Differences in default data sizes not handled well by the Fortran
77 interface:
– SGESV on Cray T3E becomes DGESV on IBM SP
– But could be SGESV if you compile with –qrealsize=8
• Fortran 90 interfaces in LAPACK95 require features that were
incompletely implemented in LAPACK 3.
LAPACK 3E – A Fortran 90-enhanced 4
version of LAPACK
What is LAPACK3E?
Project to integrate my performance enhancements with
LAPACK 3 using LAPACK95-style generic interfaces
Suggested
LAPACK 3 (1999) performance LAPACK95 (2000)
improvements
Recent bug Thread safety
LAPACK3E (2002)
fixes & other good stuff
LAPACK 3E – A Fortran 90-enhanced 5
version of LAPACK
Key LAPACK3E features
• Thread safety (remove SAVE statements)
• Parameterized constants
• Common source for single and double precision
• Generic interfaces as in LAPACK95
• Fully compatible with LAPACK 3
• Working towards full compatibility with LAPACK95
LAPACK3E must be built using Fortran 90!
LAPACK 3E – A Fortran 90-enhanced 6
version of LAPACK
SAVE statements in LAPACK
Two contexts:
1) Reverse communication
Add arguments to calling list and rename
2) Computed constants, e.g.,
LOGICAL FIRST
DATA FIRST / .TRUE. /
SAVE FIRST, …
IF( FIRST ) THEN
… Compute constants first time
FIRST = .FALSE. only to reduce overhead
END IF
Make the computed constants parameters to reduce overhead
LAPACK 3E – A Fortran 90-enhanced 7
version of LAPACK
Replicated constants
The LAPACK auxiliary routine SLAMCH computes floating point
model parameters that are intrinsics in Fortran 90:
EPS = SLAMCH( „Epsilon‟ ) @ EPSILON( 1.0 )
SAFMIN = SLAMCH( „Safe minimum‟ ) @ TINY( 1.0 )
SAFMAX = SLAMCH( „Overflow‟ ) @ HUGE( 1.0 )
Many scaling parameters are used repeatedly, but not defined
consistently. SMLNUM is variously computed as
SAFMIN SAFMIN*( N / EPS )
SAFMIN*REAL( MAX( 1, N ) ) SQRT( SAFMIN / EPS )
SAFMIN / EPS SQRT( SAFMIN ) / EPS
LAPACK 3E – A Fortran 90-enhanced 8
version of LAPACK
New module: LA_CONSTANTS
(also: LA_CONSTANTS32 for 32-bit constants)
KIND for floating point data:
WP (=8 for LA_CONSTANTS, =4 for LA_CONSTANTS32)
Floating point real and complex constants:
ZERO, ONE, TWO, CZERO, CONE, …
Floating point model parameters:
EPS, SAFMIN, SAFMAX, ULP
Scaling constants derived from model parameters:
SMLNUM, BIGNUM, RTMIN, RTMAX
Character prefixes of type-specific names for error handling:
SPREFIX, CPREFIX
LAPACK 3E – A Fortran 90-enhanced 9
version of LAPACK
Common source for different KINDs
For common source LAPACK subroutines:
1) Include a file of preprocessor name mappings and USE the
LA_CONSTANTS module:
#include “lapacknames.inc” Rename sgetrf.f to
sgetrf.F to use the
SUBROUTINE SGETRF( … )
preprocessor
USE LA_CONSTANTS
2) Parameterize all declarations by KIND:
REAL(WP) instead of “REAL” or “DOUBLE PRECISION”
3) Parameterize all floating-point constants by KIND
Most defined in LA_CONSTANTS, others specified as, e.g.,
REAL(WP), PARAMETER :: FUDGE = 2.8_WP
LAPACK 3E – A Fortran 90-enhanced 10
version of LAPACK
Compiling two versions from one source
On IBM, files with .F extension invoke the preprocessor:
xlf –WF,-DLA_REALSIZE=4 –c sgetrf.F
xlf –o dgetrf.o –c sgetrf.F
On Cray, files with .F or .F90 extension invoke the preprocesor:
f90 –F –DLA_REALSIZE=4 –o hgetrf.o –c sgetrf.F
f90 –c sgetrf.F
If the defined constant LA_REALSIZE is not set, the 64-bit version
is compiled.
LAPACK 3E – A Fortran 90-enhanced 11
version of LAPACK
Common source generic interfaces
Following LAPACK95, create generic interfaces (in a module)
for all BLAS and LAPACK routines:
#include “lapacknames.inc” Preprocessor instructions
MODULE LA_SCFOO
INTERFACE LA_FOO Generic name
SUBROUTINE SFOO( X )
USE LA_CONSTANTS, ONLY: WP
REAL(WP), INTENT(INOUT) :: X(*)
END SUBROUTINE SFOO
Specific interfaces
SUBROUTINE CFOO( X )
USE LA_CONSTANTS, ONLY: WP
COMPLEX(WP), INTENT(INOUT) :: X(*)
END SUBROUTINE CFOO
END INTERFACE ! LA_FOO
END MODULE LA_SCFOO
LAPACK 3E – A Fortran 90-enhanced 12
version of LAPACK
Gluing together type-specific interfaces
Compile la_scfoo.o and la_dzfoo.o from common source.
Compile la_xfoo.f:
MODULE LA_XFOO
USE LA_SCFOO
USE LA_DZFOO
END MODULE LA_XFOO
Use the generic interface:
PROGRAM MAIN PROGRAM MAIN
USE LA_CONSTANTS32
EXTERNAL SFOO USE LA_XFOO
REAL X(100) REAL(WP) :: X(100)
CALL SFOO(X) CALL LA_FOO(X)
LAPACK 3E – A Fortran 90-enhanced 13
version of LAPACK
Pros and cons of generic interface
+ Provides compile time type matching /argument
checking
+ Supports simpler interfaces through overloading
+ Allows same call for different default real sizes
- Interfaces must match all arguments in type, kind,
and rank
- Adds overhead of extra call for non-default interface
LAPACK 3E – A Fortran 90-enhanced 14
version of LAPACK
Mismatched interfaces
The calling site must match one of the interface specs for every
argument exactly in type, kind, and rank.
If it doesn‟t match, you can
a) Match the interface to the call
b) Match the call to the interface
LAPACK3E modules define both the natural interace and a
“point” interface for BLAS and LAPACK generic interfaces.
• Natural interface: just like the subroutine definition
• Point interface: all arrays are indexed (such as A(I,J) or X(1))
• If the calling site doesn‟t match the natural interface, index all
the arrays to use the point interface
• Point interface is default – natural interface is a wrapper to it
LAPACK 3E – A Fortran 90-enhanced 15
version of LAPACK
Point and natural interfaces
Point interfaces allow argument matching by position and type
without rank for use with indexed arrays.
MODULE LA_XCOPY
INTERFACE LA_COPY CONTAINS
! Point interface for xCOPY1 ! Natural interface for xCOPY1
SUBROUTINE SCOPY1( N, X, Y ) SUBROUTINE SCOPY1_NAT( N, X, Y )
USE LA_CONSTANTS32, ONLY: WP USE LA_CONSTANTS32, ONLY: WP
INTEGER, INTENT(IN) :: N INTEGER, INTENT(IN) :: N
REAL(WP), INTENT(IN) :: X REAL(WP), INTENT(IN) :: X(*)
REAL(WP), INTENT(OUT) :: Y REAL(WP), INTENT(OUT) :: Y(*)
END SUBROUTINE SCOPY1 CALL SCOPY1( N, X(1), Y(1) )
END SUBROUTINE SCOPY1_NAT
MODULE PROCEDURE SCOPY1_NAT
END MODULE LA_XCOPY
END INTERFACE ! LA_COPY
PRIVATE SCOPY1_NAT
LAPACK 3E – A Fortran 90-enhanced 16
version of LAPACK
Performance improvements
• Parallel linear system solves with NRHS > 1, using OpenMP
• Vastly better SLASSQ
• Cleaner SLARTG, SLARFG
• Faster SGEBAL
• Faster SSTEIN (using MGS)
• UPLO argument to CPTSV/CTPSVX (only incompatibility with
LAPACK 3)
• Call Level 3 LAPACK routines, not Level 2 directly
Reference: LAPACK Working Note 158, December 2002
www.netlib.org/lapack/lawns/downloads
LAPACK 3E – A Fortran 90-enhanced 17
version of LAPACK
Availability and future work
What‟s available:
• Version 1.1 at http://www.netlib.org/lapack3e
• Compiled libraries for CRAY T3E, IBM SP, and Sun
• Earlier versions of this talk
• Soon: Common source generic interfaces
In progress/Future work:
• Add modernized interfaces of LAPACK95
• Simplify installation process
• Extend Fortran 90 improvements to test/timing code
LAPACK 3E – A Fortran 90-enhanced 18
version of LAPACK
Related docs
Get documents about "