LAPACK3E – A Fortran 90-enhanced version of LAPACK

                                       Edward Anderson
                                    SAIC Technology Solutions
                                    Anderson.Edward@epa.gov

                                         August 7, 2003




LAPACK 3E – A Fortran 90-enhanced                               1
version of LAPACK
                                LAPACK overview
 •    A Fortran 77 library for numerical linear algebra operations
 •    Developed 1988-1992 with support from NSF and DOE
 •    LAPACK Users' Guide published by SIAM
 •    Several updates: LAPACK 2 (1995), LAPACK 3 (1999)
 •    Follow-on packages: ScaLAPACK (1997), LAPACK95 (2000)
 •    20 million hits at www.netlib.org/lapack
 •    Adopted by NAG, IMSL, and Mathworks
 •    Incorporated into scientific libraries by Cray, SGI, Sun, HP,
      Digital, …

 • … but not IBM



                                    Why not IBM?
 •    ESSL has much of the same functionality as LAPACK
 •    ESSL was developed concurrently and released first
 •    23 routines from LAPACK have been adopted into ESSL
 •    Several unfortunate name space clashes

 Example: SGEEV finds the eigenvalues and eigenvectors of a real
   general matrix
 • Same name in both LAPACK and ESSL
 • Same operation
 • Different argument lists




                        Why not use LAPACK 3?
 • LAPACK 3 has some known bugs.
 • The last update was in 2000.
 • LAPACK 3 is not thread safe.
 • Differences in default data sizes not handled well by the Fortran
   77 interface:
    – SGESV on Cray T3E becomes DGESV on IBM SP
    – But could be SGESV if you compile with -qrealsize=8
 • Fortran 90 interfaces in LAPACK95 require features that were
   incompletely implemented in LAPACK 3.




                              What is LAPACK3E?
 Project to integrate my performance enhancements with
 LAPACK 3 using LAPACK95-style generic interfaces


 Inputs combined into LAPACK3E (2002):
 •    LAPACK 3 (1999)
 •    Suggested performance improvements
 •    LAPACK95 (2000)
 •    Recent bug fixes
 •    Thread safety & other good stuff



                         Key LAPACK3E features
 •    Thread safety (remove SAVE statements)
 •    Parameterized constants
 •    Common source for single and double precision
 •    Generic interfaces as in LAPACK95
 •    Fully compatible with LAPACK 3
 •    Working towards full compatibility with LAPACK95

 LAPACK3E must be built using Fortran 90!




                  SAVE statements in LAPACK
 Two contexts:

 1) Reverse communication
     Add arguments to calling list and rename

 2) Computed constants, e.g.,
        LOGICAL FIRST
        DATA FIRST / .TRUE. /
        SAVE FIRST, …
 !      Compute constants first time only, to reduce overhead
        IF( FIRST ) THEN
           …
           FIRST = .FALSE.
        END IF
     Make the computed constants parameters to reduce overhead
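
 A minimal sketch of the second fix, assuming the constants can be written with Fortran 90 numeric inquiry intrinsics. The module name SCALE_DEMO_MOD, the function IS_NEGLIGIBLE, and the particular SMLNUM formula are hypothetical and are not taken from LAPACK or LAPACK3E. Because the constants are PARAMETERs, there is no FIRST flag and no saved state for concurrent callers to race on.

```fortran
! Hypothetical illustration; not a routine from LAPACK or LAPACK3E.
module scale_demo_mod
  implicit none
  integer, parameter :: wp = kind(1.0e0)
  ! Compile-time constants replace the FIRST / DATA / SAVE pattern.
  real(wp), parameter :: eps    = epsilon(1.0_wp)
  real(wp), parameter :: safmin = tiny(1.0_wp)
  real(wp), parameter :: smlnum = safmin / eps   ! assumed definition
contains
  function is_negligible(x) result(neg)
    ! Uses only PARAMETERs, so the function keeps no state between calls.
    real(wp), intent(in) :: x
    logical :: neg
    neg = abs(x) < smlnum
  end function is_negligible
end module scale_demo_mod

program main
  use scale_demo_mod
  implicit none
  ! Zero is below the threshold, one is not.
  print *, is_negligible(0.0_wp), is_negligible(1.0_wp)
end program main
```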
                               Replicated constants
 The LAPACK auxiliary routine SLAMCH computes floating point
 model parameters that are intrinsics in Fortran 90:
   EPS = SLAMCH( 'Epsilon' ) ≈ EPSILON( 1.0 )
   SAFMIN = SLAMCH( 'Safe minimum' ) ≈ TINY( 1.0 )
   SAFMAX = SLAMCH( 'Overflow' ) ≈ HUGE( 1.0 )

 Many scaling parameters are used repeatedly, but not defined
 consistently. SMLNUM is variously computed as
    SAFMIN                              SAFMIN*( N / EPS )
    SAFMIN*REAL( MAX( 1, N ) )          SQRT( SAFMIN / EPS )
    SAFMIN / EPS                        SQRT( SAFMIN ) / EPS



                New module: LA_CONSTANTS
 (also: LA_CONSTANTS32 for 32-bit constants)

 KIND for floating point data:
    WP (=8 for LA_CONSTANTS, =4 for LA_CONSTANTS32)
 Floating point real and complex constants:
    ZERO, ONE, TWO, CZERO, CONE, …
 Floating point model parameters:
    EPS, SAFMIN, SAFMAX, ULP
 Scaling constants derived from model parameters:
    SMLNUM, BIGNUM, RTMIN, RTMAX
 Character prefixes of type-specific names for error handling:
    SPREFIX, CPREFIX
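
 A hedged sketch of what a module of this shape could look like. The module name LA_CONSTANTS_SKETCH and every definition in it are illustrative assumptions, not the LAPACK3E source; in particular, the choices for EPS, SMLNUM, BIGNUM, and the prefix letters are guesses. RTMIN and RTMAX are left out because a square root in a constant expression needs a compiler newer than Fortran 90.

```fortran
! Hypothetical sketch of an LA_CONSTANTS-style module.
! All definitions are assumptions made for illustration.
module la_constants_sketch
  implicit none
  ! Working-precision kind; kind(1.0d0) is 8 on most compilers.
  integer, parameter :: wp = kind(1.0d0)
  ! Floating point real and complex constants
  real(wp), parameter :: zero = 0.0_wp, one = 1.0_wp, two = 2.0_wp
  complex(wp), parameter :: czero = (0.0_wp, 0.0_wp)
  complex(wp), parameter :: cone  = (1.0_wp, 0.0_wp)
  ! Floating point model parameters from Fortran 90 intrinsics
  real(wp), parameter :: eps    = epsilon(one)
  real(wp), parameter :: safmin = tiny(one)
  real(wp), parameter :: safmax = huge(one)
  ! Scaling constants defined once, so every routine agrees on them
  real(wp), parameter :: smlnum = safmin / eps
  real(wp), parameter :: bignum = one / smlnum
  ! Prefixes of the type-specific names (assumed values)
  character(len=1), parameter :: sprefix = 'D', cprefix = 'Z'
end module la_constants_sketch

program show_constants
  use la_constants_sketch
  implicit none
  print *, 'EPS    =', eps
  print *, 'SAFMIN =', safmin
  print *, 'SAFMAX =', safmax
  print *, 'SMLNUM =', smlnum
  print *, 'BIGNUM =', bignum
  print *, 'Prefixes: ', sprefix, ' ', cprefix
end program show_constants
```

 A 32-bit counterpart would differ only in the kind, for example wp = kind(1.0e0), and in the prefix letters.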

        Common source for different KINDs
 For common source LAPACK subroutines:
 1) Include a file of preprocessor name mappings and USE the
     LA_CONSTANTS module (rename sgetrf.f to sgetrf.F to use the
     preprocessor):
             #include "lapacknames.inc"
             SUBROUTINE SGETRF( … )
                USE LA_CONSTANTS

 2) Parameterize all declarations by KIND:
    REAL(WP) instead of “REAL” or “DOUBLE PRECISION”
 3) Parameterize all floating-point constants by KIND
    Most defined in LA_CONSTANTS, others specified as, e.g.,
             REAL(WP), PARAMETER :: FUDGE = 2.8_WP


   Compiling two versions from one source

 On IBM, files with .F extension invoke the preprocessor:
   xlf -WF,-DLA_REALSIZE=4 -c sgetrf.F
   xlf -o dgetrf.o -c sgetrf.F

 On Cray, files with .F or .F90 extension invoke the preprocessor:
   f90 -F -DLA_REALSIZE=4 -o hgetrf.o -c sgetrf.F
   f90 -c sgetrf.F

 If the defined constant LA_REALSIZE is not set, the 64-bit version
     is compiled.



           Common source generic interfaces
  Following LAPACK95, create generic interfaces (in a module)
  for all BLAS and LAPACK routines:
 !       Preprocessor instructions
         #include "lapacknames.inc"
         MODULE LA_SCFOO
 !       Generic name
         INTERFACE LA_FOO
 !       Specific interfaces
         SUBROUTINE SFOO( X )
            USE LA_CONSTANTS, ONLY: WP
            REAL(WP), INTENT(INOUT) :: X(*)
         END SUBROUTINE SFOO

         SUBROUTINE CFOO( X )
            USE LA_CONSTANTS, ONLY: WP
            COMPLEX(WP), INTENT(INOUT) :: X(*)
         END SUBROUTINE CFOO

         END INTERFACE ! LA_FOO
         END MODULE LA_SCFOO

     Gluing together type-specific interfaces
 Compile la_scfoo.o and la_dzfoo.o from common source.

 Compile la_xfoo.f:
       MODULE LA_XFOO
         USE LA_SCFOO
         USE LA_DZFOO
       END MODULE LA_XFOO

 Use the generic interface:
    PROGRAM MAIN                     PROGRAM MAIN
                                     USE LA_CONSTANTS32
       EXTERNAL SFOO                 USE LA_XFOO
       REAL X(100)                   REAL(WP) :: X(100)
       CALL SFOO(X)                  CALL LA_FOO(X)
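
 A self-contained sketch of the same idea, under stated assumptions: SFOO and DFOO are written here as module procedures rather than external routines so that everything compiles from one file, the extra argument N and the doubling operation are invented for illustration, and the module name LA_FOO_SKETCH is hypothetical. The point it shows is that one generic name resolves to the specific routine matching the kind of the actual argument.

```fortran
! Hypothetical stand-in for the LA_FOO example; not LAPACK3E code.
module la_foo_sketch
  implicit none
  integer, parameter :: sp = kind(1.0e0), dp = kind(1.0d0)
  ! Generic name: the compiler picks SFOO or DFOO from the kind of X.
  interface la_foo
    module procedure sfoo, dfoo
  end interface
contains
  subroutine sfoo(n, x)
    integer, intent(in) :: n
    real(sp), intent(inout) :: x(*)
    x(1:n) = 2.0_sp * x(1:n)   ! placeholder operation
  end subroutine sfoo

  subroutine dfoo(n, x)
    integer, intent(in) :: n
    real(dp), intent(inout) :: x(*)
    x(1:n) = 2.0_dp * x(1:n)   ! same placeholder, double precision
  end subroutine dfoo
end module la_foo_sketch

program main
  use la_foo_sketch
  implicit none
  real(sp) :: xs(3)
  real(dp) :: xd(3)
  xs = 1.0_sp
  xd = 1.0_dp
  call la_foo(3, xs)   ! resolves to SFOO
  call la_foo(3, xd)   ! resolves to DFOO
  print *, xs(1), xd(1)
end program main
```

 Changing the declared kind of an array is then the only edit needed at the calling site; the CALL statement itself stays the same.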


             Pros and cons of generic interface
 + Provides compile-time type matching/argument
   checking
 + Supports simpler interfaces through overloading
 + Allows same call for different default real sizes

 - Interfaces must match all arguments in type, kind,
   and rank
 - Adds overhead of extra call for non-default interface




                            Mismatched interfaces
 The calling site must match one of the interface specs for every
 argument exactly in type, kind, and rank.

 If it doesn't match, you can
 a) Match the interface to the call
 b) Match the call to the interface

 LAPACK3E modules define both the natural interface and a
 “point” interface for BLAS and LAPACK generic interfaces.
 • Natural interface: just like the subroutine definition
 • Point interface: all arrays are indexed (such as A(I,J) or X(1))
 • If the calling site doesn't match the natural interface, index all
     the arrays to use the point interface
 • Point interface is default – natural interface is a wrapper to it

                     Point and natural interfaces
 Point interfaces allow argument matching by position and type
 without rank for use with indexed arrays.

 MODULE LA_XCOPY

 INTERFACE LA_COPY

 !    Point interface for xCOPY1

 SUBROUTINE SCOPY1( N, X, Y )
    USE LA_CONSTANTS32, ONLY: WP
    INTEGER, INTENT(IN) :: N
    REAL(WP), INTENT(IN) :: X
    REAL(WP), INTENT(OUT) :: Y
 END SUBROUTINE SCOPY1

 MODULE PROCEDURE SCOPY1_NAT

 END INTERFACE ! LA_COPY
 PRIVATE SCOPY1_NAT

 CONTAINS

 !    Natural interface for xCOPY1

 SUBROUTINE SCOPY1_NAT( N, X, Y )
    USE LA_CONSTANTS32, ONLY: WP
    INTEGER, INTENT(IN) :: N
    REAL(WP), INTENT(IN) :: X(*)
    REAL(WP), INTENT(OUT) :: Y(*)
    CALL SCOPY1( N, X(1), Y(1) )
 END SUBROUTINE SCOPY1_NAT

 END MODULE LA_XCOPY


                     Performance improvements
 • Parallel linear system solves with NRHS > 1, using OpenMP
 • Vastly better SLASSQ
 • Cleaner SLARTG, SLARFG
 • Faster SGEBAL
 • Faster SSTEIN (using MGS)
 • UPLO argument to CPTSV/CPTSVX (only incompatibility with
   LAPACK 3)
 • Call Level 3 LAPACK routines, not Level 2 directly

 Reference: LAPACK Working Note 158, December 2002
    www.netlib.org/lapack/lawns/downloads



                   Availability and future work
 What's available:
 • Version 1.1 at http://www.netlib.org/lapack3e
 • Compiled libraries for CRAY T3E, IBM SP, and Sun
 • Earlier versions of this talk
 • Soon: Common source generic interfaces

 In progress/Future work:
 • Add modernized interfaces of LAPACK95
 • Simplify installation process
 • Extend Fortran 90 improvements to test/timing code
