Tony Drummond's presentation — simula.no

The U.S. DOE Advanced CompuTational Software (ACTS) Collection
Tony Drummond
Lawrence Berkeley National Laboratory
LADrummond@lbl.gov
SIMULA Research Laboratory - May 2008

OUTLINE
• Motivation
• Introduction to the DOE ACTS Collection
• Interfaces to the ACTS Collection
• Software Sustainability Requirements
• References

Development of High-End Computer Simulations: Where are the applications?
• Accelerator Science
• Astrophysics
• Biology
• Chemistry
• Earth Sciences
• Materials Science
• Nanoscience
• Plasma Science
• ...
Commonalities:
• Major advancements in science (http://acts.nersc.gov/MatApps)
• Increasing demands for computational power
• Reliance on available computational systems, languages, and software tools

Software Development and Evolution
• min[time_to_first_solution] (prototype)
• min[time_to_solution] (production)
• Outlive complexity
  • Increasingly sophisticated models
  • Model coupling
  • Interdisciplinary
• Sustained performance
  • Increasingly complex algorithms
  • Increasingly diverse architectures
  • Increasingly demanding applications
• Software evolution (long-term deliverables): min[software-development-cost], max[software_life], and max[resource_utilization]

THE U.S. DOE ACTS COLLECTION
Goal: The Advanced CompuTational Software (ACTS) Collection makes reliable and efficient software tools more widely used, and more effective in solving the nation's engineering and scientific problems.
References:
• L.A. Drummond, O. Marques: An Overview of the Advanced CompuTational Software (ACTS) Collection. ACM Transactions on Mathematical Software, Vol. 31, pp. 282-301, 2005.
• http://acts.nersc.gov

The Advanced CompuTational Software Collection (ACTS)
Components:
• Solid base: non-commercial and open source tools developed at DOE laboratories and universities.
• Independent tool evaluations and consultation, provided through acts-support@nersc.gov.
• High-level user support: problem identification, tool and interface selection, specific tuning parameter configurations, installation, documentation, etc.
• Training and dissemination: workshops, lectures, active conference participation (acts.nersc.gov).
• Collaborations with HPC centers, computational sciences research centers (national and international level), and software and computer vendors.

Category / Tool / Functionalities
Numerical ($Ax = b$, $Az = \lambda z$, $A = U \Sigma V^T$, PDEs, ODEs):
• Trilinos: Algorithms for the iterative solution of large sparse linear systems.
• Hypre: Algorithms for the iterative solution of large sparse linear systems, intuitive grid-centric interfaces, and dynamic configuration of parameters.
• PETSc: Tools for the solution of PDEs that require solving large-scale, sparse linear and nonlinear systems of equations.
• OPT++: Object-oriented nonlinear optimization package.
• SUNDIALS: Solvers for systems of ordinary differential equations, nonlinear algebraic equations, and differential-algebraic equations.
• ScaLAPACK: Library of high-performance parallel dense linear algebra.
• SLEPc: Software library for the solution of large sparse eigenproblems on parallel computers.
• SuperLU: General-purpose library for the direct solution of large, sparse, nonsymmetric systems of linear equations.
• TAO: Large-scale optimization software.
Code Development:
• Global Arrays: Library for writing parallel programs that use large arrays distributed across processing nodes; it offers a shared-memory view of distributed arrays.
• Overture: Object-oriented tools for solving computational fluid dynamics and combustion problems in complex geometries.
Run Time Support:
• TAU: Set of tools for analyzing the performance of C, C++, Fortran, and Java programs.
Library Development:
• ATLAS: Tools for the automatic generation of optimized numerical software for ...

Software Sustainability
Parts of an application code and the concern attached to each:
• Algorithmic implementations: changes in algorithms sometimes lead to several years' advancement in computations. Needs flexibility!
• Application data layout: its performance is influenced by system parameters and by steps in the algorithm. Critical points: portability and scalability.
• Control
• I/O
• Tuned and machine-dependent modules: a new architecture requires extensive tuning and may even require new programming paradigms. This is difficult to maintain and not "very" portable.

Software Sustainability
USER's APPLICATION CODE (main control), built on what is already available:
• Application data layout: available libraries and packages
• Algorithmic implementations: available libraries and packages
• I/O: available libraries
• Tuned and machine-dependent modules: compilers + expert drivers + support

Critical Path for HPC Software Stack
From top to bottom:
• Scientific or engineering context: domain expertise, simulation codes, data analysis codes
• General purpose libraries: algorithms, data structures, code optimization
• Programming languages, O/S, compilers
• Hardware, middleware, firmware
The ACTS Collection (funded by DOE/ASCR, http://acts.nersc.gov) sits at the general purpose libraries level: Numerical Tools, Code Development, Run Time Support, Library Development.

ACTS Numerical Tools: Functionality
Computational problem: Systems of linear equations
Methodology: Direct methods
• LU factorization: ScaLAPACK (dense), SuperLU (sparse)
• Cholesky factorization: ScaLAPACK
• $LDL^T$ (tridiagonal matrices): ScaLAPACK
• QR factorization: ScaLAPACK
• QR with column pivoting: ScaLAPACK
• LQ factorization: ScaLAPACK

ACTS Numerical Tools: Functionality
Computational problem: Systems of linear equations (cont.)
Methodology: Iterative methods
Libraries listed: AztecOO (Trilinos), PETSc, Hypre
• Conjugate Gradient
• GMRES
• CG Squared
• Bi-CG Stab
• Quasi-Minimal Residual (QMR)
• Transpose-Free QMR

Structure of PETSc
From top to bottom:
• PETSc PDE application codes
• ODE integrators; visualization; nonlinear solvers and unconstrained minimization; interface
• Linear solvers: preconditioners + Krylov methods
• Object-oriented matrices, vectors, indices; grid management
• Profiling interface
• Computation and communication kernels: MPI, MPI-IO, BLAS, LAPACK

Hypre Conceptual Interfaces
• Linear system interfaces
• Linear solvers: GMG, ...; FAC, ...; Hybrid, ...; AMGe, ...; ILU, ...
• Data layout: structured, composite, block-structured, unstructured, CSR

Hypre Conceptual Interfaces to Solvers
List of solvers and preconditioners per conceptual interface:
• System interfaces: Struct, SStruct, FEI, IJ
• Solvers: Jacobi, SMG, PFMG, BoomerAMG, ParaSails, PILUT, Euclid, PCG, GMRES
[Table: availability of each solver under each system interface]

ACTS Numerical Tools: Functionality
Computational problem: Systems of linear equations (cont.)
Methodology: Iterative methods (cont.)
Libraries listed: AztecOO, PETSc, Hypre
• SYMMLQ (PETSc)
• Preconditioned CG (AztecOO, PETSc, Hypre)
• Richardson
• Block Jacobi preconditioner
• Point Jacobi preconditioner
• Least squares polynomials

ACTS Numerical Tools: Functionality
Computational problem: Systems of linear equations (cont.)
Methodology: Iterative methods (cont.)
Libraries listed: AztecOO, PETSc, Hypre
• SOR preconditioning
• Overlapping additive Schwarz
• Approximate inverse
• Sparse LU preconditioner
• Incomplete LU (ILU) preconditioner
• Least squares polynomials
Methodology: Multigrid (MG) methods
• MG preconditioner
• Algebraic MG
• Semi-coarsening (Hypre)

ACTS Numerical Tools: Functionality
Computational problem: Linear least squares problems (library: ScaLAPACK)
• Least squares: $\min_x \|b - Ax\|_2$
• Minimum norm solution: $\min_x \|x\|_2$
• Minimum norm least squares: $\min_x \|b - Ax\|_2$ and $\min_x \|x\|_2$
Computational problem: Standard eigenvalue problem (libraries: ScaLAPACK (dense), SLEPc (sparse))
• Symmetric eigenvalue problem: $Az = \lambda z$ for $A = A^H$ or $A = A^T$
Computational problem: Singular value problem (libraries: ScaLAPACK (dense), SLEPc (sparse))
• Singular value decomposition: $A = U \Sigma V^T$, $A = U \Sigma V^H$
Computational problem: Generalized symmetric definite eigenproblem (libraries: ScaLAPACK (dense), SLEPc (sparse))
• Eigenproblem: $Az = \lambda Bz$, $ABz = \lambda z$, $BAz = \lambda z$

ACTS Numerical Tools: Functionality
Computational problem: Non-linear equations
Methodology: Newton based (library: PETSc)
• Line search
• Trust regions
• Pseudo-transient continuation
• Matrix free

ACTS Numerical Tools: Functionality
Computational problem: Non-linear optimization
Libraries listed: OPT++, TAO
Methodology: Newton based
• Newton
• Finite-difference Newton
• Quasi-Newton
• Non-linear interior point
Methodology: CG
• Standard nonlinear CG
• Limited-memory BFGS
• Gradient projections
Methodology: Direct search
• No derivative information
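
The finite-difference Newton entry in the optimization table above can be illustrated with a minimal one-dimensional sketch. This is not the OPT++ or TAO implementation; the helper name `fd_newton_minimize`, the step size `h`, and the test function are assumptions chosen for illustration. Both derivatives are approximated with central differences, so only function values are required.

```python
import math


def fd_newton_minimize(f, x0, h=1e-4, tol=1e-8, max_iter=50):
    """Minimize a smooth 1-D function with Newton steps whose derivatives
    come from central finite differences (hypothetical helper)."""
    x = x0
    for _ in range(max_iter):
        # Central-difference approximations of f'(x) and f''(x).
        grad = (f(x + h) - f(x - h)) / (2.0 * h)
        hess = (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)
        if hess <= 0.0:
            raise ValueError("non-positive curvature; plain Newton step is unsafe")
        step = grad / hess
        x -= step
        if abs(step) < tol:
            break
    return x


def objective(x):
    # Convex test function whose minimum is at x = ln(2).
    return math.exp(x) - 2.0 * x


x_min = fd_newton_minimize(objective, x0=0.0)
```

Production packages add safeguards that this sketch omits, such as line searches or trust regions, which the preceding tables list as separate methodologies.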
TAO - Interface with PETSc
[Diagram: TAO interface with PETSc]

OPT++ Interfaces
Four major classes of problems are available:
• NLF0(ndim, fcn, init_fcn, constraint): basic nonlinear function, no derivative information available
• NLF1(ndim, fcn, init_fcn, constraint): nonlinear function, first derivative information available
• FDNLF1(ndim, fcn, init_fcn, constraint): nonlinear function, first derivative information approximated
• NLF2(ndim, fcn, init_fcn, constraint): nonlinear function, first and second derivative information available

ACTS Numerical Tools: Functionality
Computational problem: Non-linear optimization (cont.)
Methodology: Semismoothing
• Feasible semismooth: TAO
• Infeasible semismooth: TAO
Computational problem: Ordinary differential equations
Methodology: Integration
• Adams-Moulton (variable coefficient forms): CVODE (SUNDIALS), CVODES
• Backward Differentiation Formula; direct and iterative solvers: CVODE, CVODES
Computational problem: Nonlinear algebraic equations
Methodology: Inexact Newton
• Line search: KINSOL (SUNDIALS)
Computational problem: Differential algebraic equations
Methodology: Backward Differentiation Formula
• Direct and iterative solvers: IDA (SUNDIALS)

ACTS Tools: Functionality
Computational problem: Writing parallel programs
Support: Distributed arrays
• Shared memory: Global Arrays
• Distributed memory: CUMULVS (viz), Globus (Grid)
• Grid generation: OVERTURE
• Structured meshes: CHOMBO (AMR), Hypre, OVERTURE, PETSc
• Semi-structured meshes: CHOMBO (AMR), Hypre, OVERTURE
Support: Distributed computing
• Grid: Globus
• Remote steering: CUMULVS
• Coupling: PAWS

ACTS Tools: Functionality
Computational problem: Writing parallel programs (cont.)
Support: Distributed computing
• Check-point/restart: CUMULVS
Computational problem: Profiling
• Algorithmic performance (automatic instrumentation, user instrumentation): PETSc
• Execution performance (automatic instrumentation, user instrumentation): TAU
Computational problem: Code optimization
• Library installation (linear algebra tuning): ATLAS
Computational problem: Interoperability
• Code generation, language: BABEL, CHASM
• Components: CCA

How Does One Use ACTS Tools?
Language calls (ScaLAPACK):

      CALL BLACS_GET( -1, 0, ICTXT )
      CALL BLACS_GRIDINIT( ICTXT, 'Row-major', NPROW, NPCOL )
      :
      CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL )
      :
      CALL PDGESV( N, NRHS, A, IA, JA, DESCA, IPIV, B, IB, JB, DESCB,
     $             INFO )

Command lines (PETSc):
• -ksp_type [cg,gmres,bcgs,tfqmr,…]
• -pc_type [lu,ilu,jacobi,sor,asm,…]
More advanced:
• -ksp_max_it
• -ksp_gmres_restart
• -pc_asm_overlap
• -pc_asm_type <...>

Problem domain (Hypre conceptual interfaces):
• Linear system interfaces
• Linear solvers: GMG, FAC, Hybrid, ..., AMGe, ILU, ...
• Data layout: structured, composite, block-structured, unstructured, CSR

Tool to Tool Interoperability
One-sided interoperability:
• Ex. 1: PETSc
• Ex. 2: TAU
[Diagram: Tool A, Tool D]

High-level User Interfaces to the ACTS Collection
• High-level interfaces: PyACTS, matlabMPI, NetSolve, Star-P
• User view: $Ax = b$, $Az = \lambda z$, $A = U \Sigma V^T$, View_field(T1)
• Underlying tools: OPT++, AZTEC, ScaLAPACK, PAWS, Hypre, SuperLU, Globus, PETSc, TAO, CUMULVS, Chombo, PVODE, TAU, Global Arrays, Overture

PyACTS
Tony Drummond, Lawrence Berkeley National Laboratory
Vicente Galiano, Miguel Hernandez University
Violeta Migallón and José Penadés, University of Alicante
Goal: Provide a didactic tool for the ACTS Collection. Provide a Python-based interface to the ACTS Collection.
References:
• L. A. Drummond, V. Galiano, O. Marques, V. Migallón, J. Penadés: PyACTS: A High-level Framework for Fast Development of High Performance Applications. Lecture Notes in Computer Science, Vol. 4395, pp. 417-425, 2007.

PyACTS
• Python world: PyACTS (PyScaLAPACK, PySuperLU, ...), PyMPI, NumPy
• PyACTS wrappers: ScaLAPACK wrappers, SuperLU wrappers, ...
• Libraries: ScaLAPACK, ..., SuperLU

PyACTS: Basic Services
• Basic services: creation and modification of different data objects and parallel environment specifications (matrices, data layouts, ctx).
• I/O services: parallel read/write. ASCII and NetCDF are currently supported.
• Verification and validation: predicates and parameter type checking.
• Data conversion: interoperable objects between libraries.

PyACTS: Motivation
PyClimate (J. Saenz et al., Univ. Basque Country): support for common tasks during the analysis of climate variability data.
• Simple I/O operations
• Operations with COARDS-compliant NetCDF files
• Empirical Orthogonal Function (EOF) analysis
• Canonical Correlation Analysis (CCA)
• Singular Value Decomposition (SVD) analysis of coupled datasets
• Some linear digital filters
• Kernel-based probability-density function estimation
• Access to the DCDFLIB.C library from Python

PyACTS: Performance in PyClimate EOF calculations
[Figure: Empirical Orthogonal Function (EOF) calculation performance]

PyScaLAPACK: pvgesvd Performance
[Figure: pvgesvd performance]

PyACTS: Performance

    from PyACTS import *
    import PyACTS                              # binds the package name used below
    import PyACTS.PyPBLAS as PyPBLAS
    import time
    n = 500
    ACTS_lib = 1                               # ScaLAPACK library
    PyACTS.gridinit()                          # grid initialization
    alpha = Scal2PyACTS(2, ACTS_lib)           # convert scalar to PyACTS scalar
    beta = Scal2PyACTS(3, ACTS_lib)
    a = Rand2PyACTS(n, n, ACTS_lib)            # generate a random PyACTS array
    b = Rand2PyACTS(n, n, ACTS_lib)
    c = Rand2PyACTS(n, n, ACTS_lib)
    c = PyPBLAS.pvgemm(alpha, a, b, beta, c)   # call level 3 PBLAS routine
    PyACTS.gridexit()

Problem Statement: Software Sustainability
THE GOOD
• Many successful HPC stories have induced major advances in science and engineering.
• We have successfully run and scaled applications on 100,000+ processors.
THE BAD
• Portability across platforms is still an outstanding issue:
  • Readiness
  • Performance
  • Robustness and correctness
THE UGLY
• The multi-core and many-core era is knocking at the HPC door.

Software Quality Assurance
• Robustness
• Scalability
• Extensibility
• Interoperability
• User friendliness
• Documentation
• Periodic tests and evaluations (test engines and dependency graphs):
  • Versions (tools, systems, O/S, compilers)
  • Sanity check (robustness)
  • Interoperability (maintained)
  • Consistent documentation

ScaLAPACK's Software Structure
• Global: ScaLAPACK, PBLAS
• Local: LAPACK, BLACS
• Platform specific: BLAS, MPI/PVM/...

BLAS: Basic Linear Algebra Subprograms
BLAS levels:
• Level 1 BLAS: vector-vector
• Level 2 BLAS: matrix-vector
• Level 3 BLAS: matrix-matrix
[Figure: Mflop/s (100.0 to 10000.0) of BLAS 1, BLAS 2, and BLAS 3 versus order of matrix/vector on a 2.2 GHz AMD Opteron]
Design considerations:
• Portability
• Performance: development of blocked algorithms is important for performance!

ScaLAPACK: Data Layouts
• 1D block and column distributions
• 1D block-cyclic column and 2D block-cyclic distribution
• 2D block-cyclic distribution used in ScaLAPACK for dense matrices

Astrophysics Applications
Cosmic Microwave Background Analysis, BOOMERanG collaboration, MADCAP code (Apr. 27, 2000).
• The statistics of the tiny variations in the CMB (the faint echo of the Big Bang) allow the determination of the fundamental parameters of cosmology to the percent level or better.
• MADCAP (Microwave Anisotropy Dataset Computational Analysis Package) makes maps from observations of the CMB and then calculates their angular power spectra. (See http://crd.lbl.gov/~borrill.)
• Calculations are dominated by the solution of linear systems of the form $M = A^{-1}B$ for dense $n \times n$ matrices $A$ and $B$, scaling as $O(n^3)$ in flops. MADCAP uses ScaLAPACK for those calculations.
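
The 2D block-cyclic distribution named in the ScaLAPACK data-layout slide above can be sketched as an index mapping. This is a minimal illustration, not ScaLAPACK code: it assumes zero-based indices and that the first block lives on process (0, 0), whereas ScaLAPACK itself works with one-based Fortran indices and array descriptors. The function names are hypothetical.

```python
def owner_and_local(g, block, nprocs):
    """Map a zero-based global index to (owning process, local index)
    under a 1-D block-cyclic distribution that starts at process 0."""
    block_id = g // block                      # which block the index falls in
    owner = block_id % nprocs                  # blocks are dealt out round-robin
    local = (block_id // nprocs) * block + g % block
    return owner, local


def block_cyclic_2d(i, j, mb, nb, nprow, npcol):
    """Return ((process row, process column), (local row, local column))
    for global entry (i, j), with mb x nb blocks on an nprow x npcol grid."""
    prow, li = owner_and_local(i, mb, nprow)
    pcol, lj = owner_and_local(j, nb, npcol)
    return (prow, pcol), (li, lj)


def owner_map(m, n, mb, nb, nprow, npcol):
    """Owning (process row, process column) of every entry of an m x n matrix."""
    return [[block_cyclic_2d(i, j, mb, nb, nprow, npcol)[0] for j in range(n)]
            for i in range(m)]


if __name__ == "__main__":
    # 8 x 8 matrix, 2 x 2 blocks, 2 x 2 process grid (illustrative sizes).
    for row in owner_map(8, 8, 2, 2, 2, 2):
        print(" ".join(f"{p}{q}" for p, q in row))
```

Because the row and column dimensions are distributed independently, the 2D mapping is just the 1D block-cyclic mapping applied twice, which is why work stays balanced across the grid as a factorization shrinks the active part of the matrix.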
PETSc
[Diagram: PETSc structure, with the layers listed under "Structure of PETSc"]
Image provided by PETSc Development Team, ANL.

Basic Conjugate Gradient Algorithm
[Algorithm listing with synchronization points marked; it uses scalars and the vectors x, r, p (= search direction), and q]

Preconditioning Matrices
With the splitting $A = D - E - F$ ($D$ the diagonal, $-E$ the strictly lower triangular part, $-F$ the strictly upper triangular part) and relaxation parameter $\omega$:
• Gauss-Seidel: $M = D - E$. Uses the lower triangular part of $A$.
• Jacobi: $M = D$. Uses the diagonal of $A$.
• SOR: $M = \frac{1}{\omega}(D - \omega E)$. Uses the lower triangular part of $A$.
• SSOR: $M = \frac{1}{\omega(2 - \omega)}(D - \omega E) D^{-1} (D - \omega F)$. Uses the whole matrix $A$.

PETSc: Matrix Distribution
An 8 x 8 matrix (M=8, N=8) distributed by rows over three processes:
• proc 1: m=3, n=k1, rstart=0, rend=3
• proc 2: m=3, n=k2, rstart=3, rend=6
• proc 3: m=2, n=k3, rstart=6, rend=8

Software Dependency Graph
ScaLAPACK software structure: Global (ScaLAPACK, PBLAS), Local (LAPACK, BLACS), platform specific (BLAS, MPI/PVM/...).
Software dependency tree:
    ScaLAPACK: PBLAS, LAPACK
    LAPACK: BLAS
    PBLAS: BLACS, MPI
Computational platform dependency:
    ScaLAPACK: compiles=[compiler-list] options=[compile-options]
Software testing:
    ScaLAPACK: tests=[dir-list]
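
The dependency tree above is enough to derive a build and test order. The following is a minimal sketch under stated assumptions: the dictionary simply transcribes the three dependency lines from the slide, and `build_order` is a hypothetical helper, not one of the ACTS scripts.

```python
def build_order(deps):
    """Return packages in an order where every package appears after all of
    its dependencies (depth-first topological sort). Raises on cycles."""
    order = []
    state = {}  # package -> "visiting" or "done"

    def visit(pkg):
        if state.get(pkg) == "done":
            return
        if state.get(pkg) == "visiting":
            raise ValueError(f"dependency cycle through {pkg}")
        state[pkg] = "visiting"
        for dep in deps.get(pkg, ()):
            visit(dep)
        state[pkg] = "done"
        order.append(pkg)

    for pkg in deps:
        visit(pkg)
    return order


# Dependencies as written on the slide.
SCALAPACK_DEPS = {
    "ScaLAPACK": ["PBLAS", "LAPACK"],
    "LAPACK": ["BLAS"],
    "PBLAS": ["BLACS", "MPI"],
}

order = build_order(SCALAPACK_DEPS)
# order == ["BLACS", "MPI", "PBLAS", "BLAS", "LAPACK", "ScaLAPACK"]
```

A test engine could walk the same order so that a failure in a low-level package such as BLAS is reported before the packages that depend on it are exercised.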
Python-based scripts

Software Sustainability
Software testing engines (automatic):
• Errors/problems? Yes: fix/report and document. No: end.
• User-reported problems enter the same fix/report and document step.

Software Sustainability: Performance and Scalability
In addition to the automatic software testing engines:
• Profiling and tracing tools: TAU
• Auto-tuning (OSKI, ATLAS-like)
[Figure: execution time (seconds, 0 to 40) of PDPOSV versus problem size (1000 to 10000) for grid shapes 1x60, 2x30, 3x20, 4x15, 5x12, and 6x10]

Software Sustainability Requirement
ACTS Software Sustainability Center: sustainable software support as $t \to \infty$.

Open Challenges - Multi-core
• Improve interactions between tools, compilers, and hardware
• Software distribution and installation
• Automatic tuning and profiling (TAU, IPM, etc.)
• Automatic code generators (ATLAS-like)
• Debugging tools
• Tools and language interoperability

References
• ACTS Information Center: http://acts.nersc.gov
• Two upcoming journal issues dedicated to ACTS: ACM TOMS and IJHPCA
• Ninth ACTS Collection Workshop, August 19-22, 2008
