

    IMSL Fortran Library
 Overview and Usage Seminar
      by Visual Numerics

Lawrence Berkeley National Lab
        May 19, 2004
     Overview of
 Visual Numerics, Inc.

          Tim Leite
Director Education Programs


                          About Visual Numerics

     • Established in 1970; privately held
     • Over 30 years of experience delivering trusted technology
       solutions for understanding data
     • Computational techniques for analyzing data
          – quantitative analysis
     • Visual techniques for spotting trends / anomalies in data
          – qualitative analysis
     • Offices worldwide
          – USA, UK, France, Germany, Mexico, Japan, Korea, Taiwan
          – International distributor network
     • Fortune 100 accounts around the globe and well over
       500,000 users worldwide

                           Flagship Customers by Industry

     Financial        Energy          Telecom       Education        Aerospace          Auto &          DoD & Gov.
     Services                                                                           Manufacturing

     Lehman Brothers  British Petrol. AT&T          MCSR             Aerospatiale       GE              Hughes
     Deutsche Bank    Chevron         Airtouch      Harvard          British Aerospace  Texas Inst.     Raytheon
     JPMorganChase    Exxon           Motorola      Univ. of London  Dornier            HP              Sandia Labs
     Swiss Bank       Shell Oil       Deutsche Tel  Stanford         Airbus Industries  Caterpillar     DERA (UK)
     UBS Warburg      Aramco          Northern Tel  MIT              Boeing             BMW             US Air Force
     Bear Stearns     Tokyo Power     Siemens       Virginia Tech    Lockheed Martin    Chrysler        Center for Disease
     Barclays         Hydro Quebec    Alcatel       ETH              Northrop Grumman   Ford            USGS
     SSB              Manitoba Hydro  Vodafone      Cornell          Aerospace Corp     GM              EPA
     Invesco          Pegasus                       Purdue                                              US Army

                    Visual Numerics provides the foundation for
                    mastering data and leveraging information.
      IMSL Fortran 5.0

      Dr. Edward Stewart
Manager of IMSL Technical Support


                          IMSL Overview: Platform History

     • June 1972
          – IMSL released for IBM MVS
          – First customer was Michigan Tech
     • March 1974 - released for UNIVAC
     • CDC, Data General were next
     • At one time supported 65 platforms
     • 1987 - 9.2 to 10.0 (1.0) major release and
       restructure of IMSL library
     • Today we support approximately 20 platforms,
       reflecting market consolidation

                          IMSL Fortran Overview (cont)

     • Ease of use features
          – Optional arguments implemented throughout the library
          – Standard naming conventions for routine names,
            matrices, variables, parameters (LFTRG, UMING)
          – Support for F90 language features
          – Extensive use of system environment variables (for
            error handlers, command line builds, license
            management, picking up mod files)
          – Mod files for parameter list checking at compile time
          – Fully backwards compatible
          – Single package
               • All F77, F90, and parallel processing features

                          IMSL Fortran Overview (cont)

     • Performance
          – OpenMP/SMP support
             • Linear Systems, matrix manipulation
             • Eigensystem Analysis, Fast Fourier Transforms
          – BLAS (Basic Linear Algebra Subprograms)
             • Embedded or Vendor-supplied (compiler switch)
          – ScaLAPACK utilities
          – MPI Enabled with MPI-Enhanced interface

                          IMSL Fortran Overview (cont)

   • The Library uses Fortran 95 features.
        – Simple, powerful, and flexible interface
             • Short list of required arguments
             • Easier to remember nomenclature and syntax
             • Full depth and control via optional arguments
        – Advanced interface available on all applicable
          routines formerly in the F77 library
             • Maintains full backward compatibility
IMSL Fortran 5.0 Standards


                          IMSL Fortran 5.0 Standards

     • Library Structure
     • Naming Conventions
     • Interfaces
        – Module Units
        – Optional arguments
     • Operators
     • Generic functions
     • IMSL Error Handler system
     • EIAT's (Testing Codes)
     • BLAS
     • Machine Constants

                          IMSL Fortran 5.0 Standards
                          Library Structure

     • Routines are usually nested in layers
     • Top-level Routine (Example NEQNF)
          – Level-two routines used to allocate workspace
            and override defaults, ex: N2QNF
          – Level-four routines used to set or change
            non-default values for the routine, ex: N4QNF,
            used to set values in the IPARAM and RPARAM arrays
          – Newer F90 style routine interfaces use optional
            arguments to set and pass non-default
            parameter settings

                          IMSL Fortran 5.0 Standards
                          Library Structure (cont)

     • Newer F90 routines versus older FNL style
          – F90 has longer, more descriptive names
               • F90: LIN_SOL_GEN
               • F77: LSARG
          – Optional Argument Interface
          – New routines and interfaces written to take
            advantage of F90 language features (e.g., LDA
            can be picked up from the array size)

                          IMSL Fortran 5.0 Standards
                          Library Structure (cont)

     • Shared (DLL in Windows) vs. Static
     • Shared libraries
          – Resolve symbols at run-time
          – May make application maintenance easier,
            don't have to change out DLL on multiple
          – Distributed executables are smaller, but DLL
            must be made available.

                          IMSL Fortran 5.0 Standards
                          Library Structure (cont)

     • Static libraries
          – Resolves symbols at link time
          – Executable is complete, but larger
          – No DLLs or shared images are necessary
          – All applications must be recompiled if change is
            made to a dependent library such as IMSL

                          IMSL Fortran 5.0 Standards
                          Naming Conventions

     • Naming Conventions (e. g. Linear Prog.)
          – F77 interfaces - DLPRS and DDLPRS
          – F90 Interfaces
               • Generic – no prefix (DLPRS)
               • s_ single precision (S_DLPRS)
               • d_ double precision (D_DLPRS)
               • c_ complex, mostly used in newer F90
                 specific routines
               • z_ double complex, mostly used in newer F90
                 specific routines
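
The relationship between a generic name and its prefixed specific names can be pictured with a standard Fortran generic interface. The sketch below only illustrates that language mechanism and is not IMSL source: the module `scale_mod` and the routines `scale_vec`, `s_scale_vec`, and `d_scale_vec` are hypothetical names chosen for this example.

```fortran
module scale_mod
  implicit none
  private
  public :: scale_vec, s_scale_vec, d_scale_vec

  ! Generic name (no prefix): the compiler picks a specific routine
  ! from the types of the actual arguments.
  interface scale_vec
     module procedure s_scale_vec, d_scale_vec
  end interface scale_vec

contains

  subroutine s_scale_vec(alpha, x)      ! single-precision specific
    real(kind(1.0e0)), intent(in)    :: alpha
    real(kind(1.0e0)), intent(inout) :: x(:)
    x = alpha * x
  end subroutine s_scale_vec

  subroutine d_scale_vec(alpha, x)      ! double-precision specific
    real(kind(1.0d0)), intent(in)    :: alpha
    real(kind(1.0d0)), intent(inout) :: x(:)
    x = alpha * x
  end subroutine d_scale_vec

end module scale_mod

program generic_demo
  use scale_mod
  implicit none
  real(kind(1.0e0)) :: xs(3) = [1.0e0, 2.0e0, 3.0e0]
  real(kind(1.0d0)) :: xd(3) = [1.0d0, 2.0d0, 3.0d0]

  call scale_vec(2.0e0, xs)     ! resolves to s_scale_vec
  call scale_vec(2.0d0, xd)     ! resolves to d_scale_vec
  call d_scale_vec(0.5d0, xd)   ! a specific name may also be called directly
  print *, xs
  print *, xd
end program generic_demo
```

Because the interface lives in a module, a call with mismatched argument types is rejected at compile time rather than failing at run time.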

                          IMSL Fortran 5.0 Standards
                          Interfaces: Mod Files

     • Used for parameter list checking
     • Extensive use of Module Files (Mod files):
          – F90 routines - Use imsl_libraries
          – F77 routines - Use numerical_libraries
          – First letter of routine – Use
          – Individual routines (F90 interface) - Use

                          IMSL Fortran 5.0 Standards
                          Interfaces: Optional Arguments

     • Optional Arguments
          – Older-style F77 routines use a fixed parameter list
          – Fortran 5.0 is a combination of F90 routines
            and new interfaces to older F77 routines
          – Takes advantage of F90 language features,
            such as the size() function. For example, LDA no
            longer needs to be set; it is picked up by size(A,1)
          – Besides giving a shorter, cleaner parameter list, user
            errors are reduced if they have fewer variables to

                          F90: Simplify Argument Lists

     • Simple and Flexible User Interface
          – Short list of required arguments
          – Full depth and control via optional arguments
          – Maintains full backward compatibility
     • Example:
          – Original: BCLSJ (FCN, JAC, M, N, XG, IB, XL, XU,
                             XS, FS, IP, RP, X, FV, FJ, LDFJ)

          – New:      BCLSJ (FCN, JAC, M, IB, XL, XU, X)
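
As a rough illustration of how an F90-style interface can shorten an argument list, the sketch below wraps an F77-style routine so that array dimensions are taken from `size()` and a control value becomes an optional argument with a default. Every name here (`arglist_sketch`, `row_sums_f77`, `row_sums`, `scale`) is hypothetical and unrelated to the actual IMSL routines; it only shows the language features involved.

```fortran
module arglist_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains

  ! F77 style: every dimension and control value is a required argument.
  subroutine row_sums_f77(m, n, a, lda, scale, s)
    integer, intent(in)   :: m, n, lda
    real(dp), intent(in)  :: a(lda, *), scale
    real(dp), intent(out) :: s(*)
    integer :: i
    do i = 1, m
       s(i) = scale * sum(a(i, 1:n))
    end do
  end subroutine row_sums_f77

  ! F90 style: dimensions come from size(), and scale is optional.
  subroutine row_sums(a, s, scale)
    real(dp), intent(in)           :: a(:, :)
    real(dp), intent(out)          :: s(:)
    real(dp), intent(in), optional :: scale
    real(dp) :: scale_used
    scale_used = 1.0_dp                      ! default when scale is omitted
    if (present(scale)) scale_used = scale
    call row_sums_f77(size(a, 1), size(a, 2), a, size(a, 1), scale_used, s)
  end subroutine row_sums

end module arglist_sketch

program optional_demo
  use arglist_sketch
  implicit none
  real(dp) :: a(2, 2), s(2)
  ! Column-major fill: the rows of a are (1, 2) and (3, 4).
  a = reshape([1.0_dp, 3.0_dp, 2.0_dp, 4.0_dp], [2, 2])
  call row_sums(a, s)                    ! required arguments only
  print *, s
  call row_sums(a, s, scale=0.5_dp)      ! optional argument by keyword
  print *, s
end program optional_demo
```

The caller never states a leading dimension, which removes one common source of argument-list mistakes.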

                          IMSL Fortran 5.0 Standards
                          Operators and Generic Functions

     • Commonly used Linear Algebra functions
     • Matrix operations and utilities
     • MPI functions and utilities - Used for solving
       certain problems over a heterogeneous
          – Use the Message Passing Interface
          – Box data types are used to broadcast data to
            different nodes

                          IMSL Fortran 5.0 Standards
                          Error Handler

     • IMSL has its own error handler; its errors are
       different from compiler or system errors
     • ERSET – sets behavior of the error handler for
       printing and stopping for different errors
     • IERCD – retrieves last error code, if any. These
       are specific to a routine
     • N1RTY – retrieves last error type. Corresponds to
       the Error Severity Level
     • An environment variable is used to locate IMSL error
       messages for newer F90 routines; others are built in

                          IMSL Fortran 5.0 Standards
                          Error Handler (cont)

     • Error severity Levels (Default action)
          – Level 1        Note (Print:NO, Stop:NO)
          – Level 2        Alert (Print:NO, Stop:NO)
          – Level 3        Warning (Print:YES, Stop:NO)
          – Level 4        Fatal (Print:YES, Stop:YES)
          – Level 5        Terminal (Print:YES, Stop:YES)


     • Basic Linear Algebra Subprograms
     • IMSL supplies its own
     • Vendor BLAS usually provide better
       performance, optimized for compiler and
     • IMSL BLAS in IMSLBLAS.lib and
    IMSL Fortran 5.0 and
High Performance Computing


                          Expanded Support for SMP
                          New for v. 5.0

     • Expanded support for SMP/OpenMP
       multiprocessor environments
          – Expanded number of routines directly enabled
            for SMP
          – Directly enabled computationally intensive
            algorithms for SMP in the areas of:
               • Linear Systems and matrix manipulation
               • Eigensystem Analysis
               • Fast Fourier Transforms

                          Support for DMP

     • Fortran 90 functions
          – Nonnegative Least-Squares $Ax \cong b,\ x \ge 0$
          – Bounded Least-Squares $Ax \cong b,\ \alpha \le x \le \beta$
          – Box Data Type - (Many problems, same size
            and type.)
     • MPI Modules
     • ScaLAPACK Utilities

                          Thread Safe Fortran

     • OpenMP-based implementation

          !$OMP PARALLEL DO
          DO I = 1, NUMBER_INTERVALS
             CALL UVMIF(F, XGUESS(I), BOUND(I), X(I))
             FX(I) = F(X(I))
          END DO
          !$OMP END PARALLEL DO
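
For readers without the library at hand, the same loop pattern can be tried in a self-contained form. In the sketch below a simple golden-section search, `local_min`, stands in for UVMIF; it is a hypothetical substitute and not the IMSL algorithm, and treating `BOUND(I)` as the allowed distance from `XGUESS(I)` is an assumption made for the illustration. The function being minimized is likewise arbitrary.

```fortran
program omp_minimize_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: number_intervals = 4
  real(dp) :: xguess(number_intervals), bound(number_intervals)
  real(dp) :: x(number_intervals), fx(number_intervals)
  integer :: i

  do i = 1, number_intervals
     xguess(i) = real(i, dp)     ! starting point for search i
     bound(i)  = 0.5_dp          ! assumed: max distance from xguess(i)
  end do

  ! Iterations are independent: each writes only x(i) and fx(i), and
  ! local_min keeps all of its working state in local variables.
  !$omp parallel do
  do i = 1, number_intervals
     x(i)  = local_min(xguess(i) - bound(i), xguess(i) + bound(i))
     fx(i) = f(x(i))
  end do
  !$omp end parallel do

  do i = 1, number_intervals
     print '(a,i0,a,f8.4,a,f8.4)', 'interval ', i, ': x = ', x(i), &
           '  f(x) = ', fx(i)
  end do

contains

  pure function f(t) result(y)
    real(dp), intent(in) :: t
    real(dp) :: y
    y = cos(3.0_dp * t)
  end function f

  ! Golden-section search on [a, b]; hypothetical stand-in for UVMIF.
  pure function local_min(a, b) result(xmin)
    real(dp), intent(in) :: a, b
    real(dp) :: xmin, lo, hi, x1, x2
    real(dp), parameter :: g = 0.6180339887498949_dp
    integer :: k
    lo = a
    hi = b
    do k = 1, 60
       x1 = hi - g * (hi - lo)
       x2 = lo + g * (hi - lo)
       if (f(x1) < f(x2)) then
          hi = x2
       else
          lo = x1
       end if
    end do
    xmin = 0.5_dp * (lo + hi)
  end function local_min

end program omp_minimize_sketch
```

Compiled without OpenMP support, the directives are treated as comments and the loop runs serially with the same results, which is a convenient way to check a threaded run.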

                          Thread Safe Fortran (cont)

     • Platform must support OpenMP 2.0
          – IBM AIX, XL Fortran
               • 64-bit
          – Sun Solaris, Sun One Studio 8
               • 32 and 64-bit
     • Multiple copies of existing tests used for
       OpenMP testing - all platforms.

                          Parallel Constrained LSQ

     • Dimension M x N; as N increases
       computational efficiency increases

     • User provides partitioning by blocks of
       columns in array IPART
          – IPART(1:2, 1:max(1,MP_NPROCS))
          – MP_NPROCS is the number of processors
            stored in the communicator
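
One way to fill such a partition array is to give each process a contiguous block of columns of nearly equal size. The sketch below assumes, for illustration only, that `IPART(1,j)` holds the first column and `IPART(2,j)` the last column of block `j`; that convention should be checked against the library documentation. The values of `n` and `nprocs` are made up, with `nprocs` standing in for `MP_NPROCS`.

```fortran
program ipart_sketch
  implicit none
  integer, parameter :: n = 10         ! number of columns (illustrative)
  integer, parameter :: nprocs = 3     ! stand-in for MP_NPROCS (assumed)
  integer :: ipart(2, max(1, nprocs))
  integer :: j, base, extra, first

  base  = n / nprocs           ! columns that every block receives
  extra = mod(n, nprocs)       ! the first "extra" blocks get one more
  first = 1
  do j = 1, nprocs
     ipart(1, j) = first                       ! assumed: first column
     ipart(2, j) = first + base - 1            ! assumed: last column
     if (j <= extra) ipart(2, j) = ipart(2, j) + 1
     first = ipart(2, j) + 1
  end do

  do j = 1, nprocs
     print '(a,i0,a,i0,a,i0)', 'block ', j, ': columns ', &
           ipart(1, j), ' to ', ipart(2, j)
  end do
end program ipart_sketch
```

Spreading the remainder over the leading blocks keeps the block sizes within one column of each other, so no process carries a noticeably larger share of the work.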

                          Parameter list for Nonnegative

     Call parallel_nonnegative_lsq (A, B, X, &
                                    RNORM, W, INDEX, IPART)
     A is the input matrix of dimension MxN
     B is the right-hand side vector
     X is the solution vector, where X >= 0
     RNORM is the Euclidean norm of the residual vector
     W is the dual vector
     INDEX is the vector that contains constraint
     IPART is the array containing the partition information

                          Box Data Type

     Linear operators and generic functions
       designed to compute multiple data sets on
       multiple CPU systems
          – Linear algebra functions (inverse, pseudo-
            inverse, solvers, etc.)
          – SVD, Eigensystem analysis, QR
          – FFTs

                          MPI Modules

     Modules MPI_setup_int, MPI_node_int
     Following a call to MP_SETUP(), MPI_node_int will
     •   Number of processors
     •   Rank of a processor
     •   Communicator information
     •   Usage priority order of the node systems

                          Printing Error Messages

     • MP_SETUP('Final')
          – Print IMSL error messages
          – Messages printed by nodes from largest rank to
          – Also terminates MPI execution

                          ScaLAPACK Interface Support

     • Modules assist in identifying mistakes in
       usage at compile time
          – Argument mismatches
          – Missing arguments
     • ScaLAPACK_Support includes all module
          – ScaLAPACK_int, PBLAS_int, BLACS_int,
            Tools_int, LAPACK_Int

                          ScaLAPACK Communication

          – Contains interfaces for ScaLAPACK_Read and
            ScaLAPACK_Write routines
          – Reads data from a file and transmits data into the 2-d
            block-cyclic form
          – Writes block-cyclic matrix data to a file
     Synchronize reads and writes for multiple processes
 Applied Parallel Computing

        Dr. Richard Hanson
Senior Scientist, Numerical Methods



     • Rationale for parallel computation
     • A constrained optimization example –
       solving linear inequalities
     • Outline of communication needed for the
     • Review of the presentation

                          Why Use Parallel Computing?

     • The problem is large, so not all data fits in a
       single memory; or:

     • Using multiple processes is expected to
       improve performance

     • The improvements are significant, so
       inevitable complications are acceptable

                          Whither Parallel Methods

     • Code authors are motivated to use portable
       packages. This often implies use of MPI.
     • The mathematics and algorithm of the
       application determine where parallel
       technology improves performance.
     • This is the spirit of our example, illustrating
       the solution of a single large optimization

                          Constrained Optimization

     • Solve system of linear inequalities

          $Gx \ge h,\quad G \in \mathbb{R}^{m \times n}$
     • “Small” length solutions are needed.
     • Problem originates from an update:

                          Constrained Optimization
                          Example - Continued

     • Solution is the update $x_1 = x + x_0$,
       and a minimum value of $\|x\|_2$ is desired.
     • Thus

          $G x_1 = G(x + x_0) \ge h$, so
          $G x \ge h - G x_0$, and $h - G x_0$ takes the role of $h$.

                          Solution Method – Example/1

     • Based on a dual algorithm:
     • Solve the constrained least-squares system

          $\begin{bmatrix} G^T \\ h^T \end{bmatrix} u \cong \begin{bmatrix} 0 \\ 1 \end{bmatrix},\quad u \ge 0$

          $r = G^T u,\quad s = 1 - h^T u,\quad x = s^{-1} r$
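
A one-variable case may make the dual construction concrete. It assumes the system on this slide reads $[G^T; h^T]\,u \cong [0; 1]$, $u \ge 0$, with $r = G^T u$, $s = 1 - h^T u$, $x = s^{-1} r$, and that the inequalities being solved are $Gx \ge h$; the data are invented purely for illustration.

```latex
% Illustrative data (an assumption, not from the seminar):
% n = 1, m = 1, G = [1], h = [1], i.e. the single inequality x >= 1.
\begin{align*}
\begin{bmatrix} G^T \\ h^T \end{bmatrix} u
  = \begin{bmatrix} 1 \\ 1 \end{bmatrix} u
  &\cong \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad u \ge 0
  && \text{minimize } u^2 + (u - 1)^2 \;\Rightarrow\; u = \tfrac12 \\
r &= G^T u = \tfrac12, \qquad s = 1 - h^T u = \tfrac12 \\
x &= s^{-1} r = 1
\end{align*}
```

Here $x = 1$ satisfies $x \ge 1$ with equality and is the smallest-magnitude value that does so, which is the minimum-length solution the method is meant to return.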

                          Solution Method – Example/2

     • The IMSL Fortran Library has a code
       exactly for this problem:
     • MPI routines are called in the solution
       algorithm, as required by the logic.
     • Any number of processors can be used.
       Code ignores MPI calls if the number of
       processors is 1, since no parallelism is needed.

                          Solution Method – Example/3

     • Complications arise, but they come with the
     • Recast dual problem with the partitioning

          $\begin{bmatrix} G^T \\ h^T \end{bmatrix} u \equiv [A_1 : A_2 : \cdots : A_p]\, u \cong \begin{bmatrix} 0 \\ 1 \end{bmatrix},\quad u \ge 0$

                          Solution Method – Example/4

     • The partition is defined in the user's program.
       The columns $A_i$ are located on
       processor number $i$, $i = 1, \ldots, p$
     • The data for that column must be generated or
       copied to the processor before the routine is used.
     • IMSL provides ScaLAPACK_READ for this, but
       we will not discuss that routine here.

                          Solution Method – Example/5

       provides, at each node:
     • Primal solution
          $\begin{bmatrix} G^T \\ h^T \end{bmatrix} u \cong \begin{bmatrix} 0 \\ 1 \end{bmatrix},\quad u \ge 0$
     • Dual solution (used later) satisfies $w \le 0$

                          Solution Method – Example/6

     • Obtaining the solution requires each
       processor compute $r_i = A_i u_i$, $i = 1, \ldots, p$
     • Then compute the sum $r = \sum_{i=1}^{p} r_i$
       with the MPI_Reduce( ) routine.
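
The per-processor products and their sum can be checked serially before any MPI code is written. The sketch below is a single-process stand-in: the loop over `i` plays the role of the separate processors, and the running sum plays the role of the MPI_Reduce( ) call. The matrix, vector, and partition are invented for illustration, and the partition layout (first and last column of each block) is an assumption.

```fortran
program block_column_sum
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: nrow = 3, ncol = 6, p = 3
  real(dp) :: a(nrow, ncol), u(ncol), r(nrow), ri(nrow)
  integer :: ipart(2, p), i, j

  ! Illustrative data only.
  do j = 1, ncol
     do i = 1, nrow
        a(i, j) = real(i + 2*j, dp)
     end do
     u(j) = 1.0_dp / real(j, dp)
  end do

  ! Assumed layout: block i owns columns ipart(1,i) through ipart(2,i).
  ipart = reshape([1, 2, 3, 4, 5, 6], [2, p])

  r = 0.0_dp
  do i = 1, p
     ! Work that would be done locally on processor i.
     ri = matmul(a(:, ipart(1,i):ipart(2,i)), u(ipart(1,i):ipart(2,i)))
     ! Stand-in for the sum performed by MPI_Reduce.
     r = r + ri
  end do

  print *, 'max difference from full product:', &
           maxval(abs(r - matmul(a, u)))
end program block_column_sum
```

The printed difference should be at rounding level, confirming that summing the block products reproduces the full matrix-vector product.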

                          Solution Method – Example/7

     • The solution to the inequalities is given by

          $x = (1 - r_{n+1})^{-1}\, [r_1, \ldots, r_n]^T$

       and is first located on a single node.
     • The solution is then sent to all nodes with
       the routine MPI_Bcast( ).

                          Solution Method – Example/8

     • Constraint residuals are given by
          $f_i = [G_i : h_i] \begin{bmatrix} x \\ -1 \end{bmatrix}$
     • Use MPI_Gather( ) to construct $f = [f_1, \ldots, f_p]^T$
     • Sharpen results with the dual variables:

          Where $w_i = 0$, set $f_i = 0$

                          Review of Example

     • Complete code is given in the IMSL Fortran
       Library, Math Chapter 1
     • Using parallel computing technologies
       unavoidably carries the weight of increased
       complexity in software development.
     • Using IMSL library software, e. g.
       development time.
  Additional Visual Numerics
Product Offerings and Services


                          IMSL™ Numerical Libraries

     • IMSL C Numerical Library (CNL)
          – Written in C
          – Over 370 threadsafe, math, stat and
            finance algorithms
     • JMSL™ Numerical Library for Java™
          – 100% Java with IMSL numerics and charting
     • IMSL Fortran Numerical Library
          – Over 1,000 world-renowned algorithms
          – High Performance Computing

                          PV-WAVE Family

         – Language for building desktop visual data
           analysis applications
         – Understand and visualize data
         – Web based software
         – Understand what data means from anywhere,
           at anytime
         – Time Series Analysis application

                          Technical Consulting Services

       • Custom Algorithm and Application
       • Ongoing Interaction with Customers to Maintain
         and Improve Applications
       • Diverse Backgrounds and Expertise
            –   Computer Scientists
            –   Industry experts
            –   Visual Data Analysis (VDA) experts
            –   PhDs in various disciplines including math and stat
       • Combination of Training and Consulting
