Scaling of First Principles Electronic Structure Methods on Future Architectures
W.A. Shelton
Computer Science and Mathematics, Oak Ridge National Laboratory
U.S. Department of Energy, UT-Battelle

Collaborators
Locally Self-consistent Multiple Scattering Method (Real Space)
• Oak Ridge National Laboratory: N.Y. Moghadam, D.M.C. Nicholson, G.M. Stocks, X.-G. Zhang, and B. Ujfalussy
• Pittsburgh Supercomputing Center: Y. Wang
• National Energy Research Supercomputer Center: A. Canning
Screened Methods (Tight-binding like methods)
• Oak Ridge National Laboratory: A. Smirnov
• University of Illinois (Urbana-Champaign): D.D. Johnson

Acknowledgement of Sponsors
• Department of Energy/Office of Science
  – Office of Advanced Scientific Computing Research
  – Mathematics, Information and Computer Science
  – Applied Mathematical Sciences Program
• Oak Ridge National Laboratory
  – Laboratory Directors Research and Development Program
• Computing Resources
  – Center of Computational Sciences located at Oak Ridge National Laboratory
  – Pittsburgh Supercomputing Center
  – National Energy Research Supercomputing Center located at Lawrence Berkeley National Laboratory

Motivation
The introduction of new architectures
• Rethink the mathematical model
• Design new algorithms that render a numerical solution
• Open new possibilities
  – Improved scaling
  – Solving problems that previously were untenable
  – Software technologies being used by researchers with access to less advanced hardware technologies

Motivation
• Nanoscale Science and Engineering Technology Initiative
SOFT MATERIALS
  – Synthetic Polymers and Bio-Inspired Materials
  – Systems Dominated by Organic-Inorganic Interconnections
  – Interfacing Nanostructures to Biological Systems
HYBRID SOFT-HARD MATERIALS
  – Carbon-Based Nanostructures
  – Characterization of Active Sites in Catalytic Materials
  – Nanoporous Membranes and Nanomaterials for Ultra-Selective Catalysis
COMPLEX HARD MATERIALS
  – Magnetism in Nanostructured Materials
  – Nanoscale Manipulation of Collective Behavior
  – Nanoscale Interface Science (Nanoparticles and Nanograins)
  – Electromagnetic Fields in Confined Structures
THEORY / MODELING / SIMULATION
  – Virtual Synthesis and Nanomaterials Design
  – Theoretical Nano-Interface Science

Multiscale Simulations
Connect Microscopic-level Processes to Macroscopic Response of Material

Scalable and Accurate First Principles Method
Atomistic Methods

Quantum Simulation Goals: Accuracy and Predictive Capabilities

Advances In Hardware Alone Are Not Sufficient

Linear Scaling Algorithms Will Enable Solutions to New Problems
The combination of new advanced computing platforms and new scaling algorithms will open new areas in quantum-level materials simulations

Algorithm Design for future generation architectures
• More accurate
• Spectral or pseudo-spectral accuracy
• Wider range of applicability
• Sparse representation
• Memory requirements grow linearly
• Each processor can treat thousands of atoms
• Make use of large number of processors
• Message-Passing
• Each atom/node local message-passing is independent of the size of the system
• Time consuming step of model
• Sparse linear solver
• Direct or preconditioned iterative approach

Density Functional Theory (DFT)
• DFT is in principle an exact method for treating the many-body quantum mechanical effects of electron exchange and correlation
• At the heart of this formulation is the assertion that the ground-state total energy of an electron system in the field of the atomic nuclei is a unique functional of the electronic charge density
• The total energy functional attains its minimum value when evaluated with the true ground-state electronic charge density. The quantum mechanical effects of electron exchange and correlation are contained in the non-local exchange-correlation potential $V_{\text{ex-corr}}(r, r')$.
  – Hence, the electronic interactions are explicitly accounted for by the fundamental quantity, the electronic charge density
  – Unfortunately, there is no known analytical form for $V_{\text{ex-corr}}(r, r')$

LSDA & Multiple Scattering Theory (MST)
Multiple Scattering Theory (MST)
• J. Korringa, Physica 13, 392 (1947)
• W. Kohn and N. Rostoker, PR 94, 1111 (1954)
MST Green function methods
• B. Gyorffy and M.J. Stott, "Band Structure Spectroscopy of Metals and Alloys", Ed. D.J. Fabian and L.M. Watson (Academic, 1972)
• J.S. Faulkner and G.M. Stocks, PR B 21, 3222 (1980)
Self-consistency loop (flowchart)
• Initial guess: $n_{in}(r)$, $m_{in}(r)$
• Calculate $V_{eff}[n, m]_{in}$
• Solve the Schrodinger equation:
  $\left\{ \left( \epsilon + \frac{\hbar^2}{2m} \nabla^2 \right) \mathbf{1} - V_{eff} \right\} G(r, r'; \epsilon) = \mathbf{1}\, \delta(r - r')$
• Recalculate $n_{out}(r)$, $m_{out}(r)$:
  $n(r) = -\frac{1}{\pi}\, \mathrm{Im} \int d\epsilon\, f(\epsilon)\, \mathrm{Tr}\, G(r, r'; \epsilon)$
  $m(r) = -\frac{1}{\pi}\, \mathrm{Im} \int d\epsilon\, f(\epsilon)\, \mathrm{Tr}\, [\sigma\, G(r, r'; \epsilon)]$
• Test: $n_{in}(r) = n_{out}(r)$? $m_{in}(r) = m_{out}(r)$?
  – No: mix in & out, then recalculate $V_{eff}$
  – Yes: calculate total energy

Complex Energy Plane
$\epsilon_f$ is the highest occupied electronic state in energy
(Figure: integration contours in the complex energy plane, axes Real $\epsilon$ and Im $\epsilon$, ending at $\epsilon_f$)
• Scattering is local since there are no states near the bottom of the energy contour
• Scattering is local since a large Im $\epsilon$ is equivalent to raising the temperature, which smears out the states
• Near $\epsilon_f$ scattering is non-local (metal)
• Semi-conductors and insulators could work well since they have no states at $\epsilon_f$
The scattering properties at complex energy can be used to develop highly efficient real-space and k-space methods

Multiple Scattering Theory
(Figure: cell geometry showing the site position $R_n$ and the cell-centered coordinate $r_n$)
• Green function
  $G(r, r'; \epsilon) = \sum_{LL'} \left[ Z_L^n(r_n; \epsilon)\, \tau_{LL'}^{nn}(\epsilon)\, Z_{L'}^n(r'_n; \epsilon) - Z_L^n(r_<; \epsilon)\, J_L^n(r_>; \epsilon)\, \delta_{LL'} \right]$
• Scattering path matrix
  $\tau^{nn'}(\epsilon) = t^n(\epsilon)\, \delta_{nn'} + \sum_{n'' \neq n} t^n(\epsilon)\, G(R_n - R_{n''}; \epsilon)\, \tau^{n''n'}(\epsilon)$
  $\tau^{nn'} = \left[ M^{-1} \right]_{nn'}$, with $M_{LL'}^{nn'} = t_l^{-1}\, \delta_{nn'} \delta_{LL'} - G_{LL'}^{nn'}$
  $M(\epsilon) = \begin{pmatrix} t_0^{-1} & -G_{01}(\epsilon) & \cdots & -G_{0m}(\epsilon) \\ -G_{10}(\epsilon) & t_1^{-1} & \cdots & -G_{1m}(\epsilon) \\ \vdots & \vdots & \ddots & \vdots \\ -G_{m0}(\epsilon) & -G_{m1}(\epsilon) & \cdots & t_m^{-1} \end{pmatrix}$
• $\tau^{nn'}$ is a generalization of the t-matrix: it converts an incoming wave at site $n'$ into an outgoing wave at site $n$ in the presence of all the other sites
• Free-space structure constants
  $G^{ij}(\epsilon) \sim -\frac{1}{4\pi} \frac{e^{i \epsilon^{1/2} |R_i - R_j|}}{|R_i - R_j|}$
  – $G^{ij}(\epsilon)$ decay slowly with increasing distance
  – $G^{ij}(\epsilon)$ contain free-electron singularities
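
The contrast between slow decay on the real energy axis and fast decay at complex or negative energy can be checked numerically. The sketch below is only an illustration: it assumes the scalar envelope exp(i ε^{1/2} R)/(4πR) of the free-space structure constants and ignores their angular-momentum structure. The function names and the sample energies are hypothetical choices, not taken from the slides.

```python
import cmath
import math


def propagator_magnitude(energy: complex, distance: float) -> float:
    """Return |exp(i*sqrt(energy)*R) / (4*pi*R)| for the scalar envelope."""
    # cmath.sqrt returns the principal root, so Im(kappa) >= 0 when Im(energy) >= 0.
    kappa = cmath.sqrt(energy)
    return abs(cmath.exp(1j * kappa * distance)) / (4.0 * math.pi * distance)


def decay_ratio(energy: complex, r_near: float, r_far: float) -> float:
    """Ratio of the envelope at a far site separation to that at a near one."""
    return propagator_magnitude(energy, r_far) / propagator_magnitude(energy, r_near)


if __name__ == "__main__":
    # Hypothetical sample energies: real axis, two complex energies, negative energy.
    samples = [
        ("real axis", 0.5 + 0.0j),
        ("Im = 0.5", 0.5 + 0.5j),
        ("Im = 2.0", 0.5 + 2.0j),
        ("negative energy", -1.0 + 0.0j),
    ]
    for label, energy in samples:
        print(f"{label:16s} ratio(R=10 / R=1) = {decay_ratio(energy, 1.0, 10.0):.3e}")
```

On the real axis only the 1/R factor remains, while an imaginary part of the energy (or a negative energy) adds an exponential factor exp(-Im(ε^{1/2}) R), which is the locality the complex-energy contour exploits.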

Real Space Algorithm Design
• Linear scaling
  – Each node performs a fixed size local calculation
    • Thus each node performs the same number of flops
• Message-Passing
  – Each atom/node local message-passing is independent of the size of the system
• Time consuming step of model
  – Reduce to Linear Algebra step
    • BLAS level 3

Real Space Parallel Implementation
• Green's function
  $G(r, r'; \epsilon) = \sum_{LL'} \left[ Z_L^n(r_n; \epsilon)\, \tau_{LL'}^{nn}(\epsilon)\, Z_{L'}^n(r'_n; \epsilon) - Z_L^n(r_<; \epsilon)\, J_L^n(r_>; \epsilon)\, \delta_{LL'} \right]$
• Scattering path matrix: real space
  $\tau(\epsilon) = M^{-1}$, $M = \left[ t^{-1}(\epsilon) - G(R_{ij}, \epsilon) \right]$
  t: scattering from single site
  G: structure constant matrix
• Once the size of M is fixed, increasing N does not affect the local calculation of $M^{-1}$
• The LSMS naturally scales linearly with increasing N

Matrix Inversion
Partition the $m(l_{max}+1)^2$-dimensional matrix $\tau^{-1}$, of size $M \times M$ with $M = M_1 + M_2$, into four blocks, with diagonal blocks A of size $M_1$ and D of size $M_2$:
$\tau^{-1} = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$
The upper-left $M_1 \times M_1$ block of $\tau$, written $\tau_{11}$, satisfies
$\tau_{11}^{-1} = A - B D^{-1} C$
Note that the $L \times L$ diagonal block of $\tau_{11}$ is the same $L \times L$ block that is desired. Take $A - B D^{-1} C$ and continue to partition until the desired matrix size $(l_{max}+1)^2$ of the central site is reached.

O(N) Scaling of Real Space Method
1998 Gordon Bell Prize: 1.02 TFLOPS on a 1500 node Cray T3E

O(N) Scaling of Real Space Method
(Figure: LSMS performance on LeMieux, GFLOPS versus number of processors from 0 to 3500, measured GFLOPS compared with linear scaling; peak annotated at 4.58 TFLOPS)
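
The block-partitioned inversion on the Matrix Inversion slide can be sketched as follows. This is a minimal illustration, assuming the standard Schur-complement identity (the upper-left block of the inverse is the inverse of A - B D^{-1} C); it is not the LSMS implementation. The helper names, the chunk size, and the diagonally dominant test matrix are all hypothetical, and a production code would use BLAS level 3 routines instead of pure-Python loops.

```python
def matmul(a, b):
    """Dense matrix product for lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def inverse(m):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def schur_reduce(m, n_keep):
    """Return A - B D^{-1} C, where A is the leading n_keep x n_keep block of m."""
    a = [row[:n_keep] for row in m[:n_keep]]
    b = [row[n_keep:] for row in m[:n_keep]]
    c = [row[:n_keep] for row in m[n_keep:]]
    d = [row[n_keep:] for row in m[n_keep:]]
    corr = matmul(matmul(b, inverse(d)), c)
    return [[a[i][j] - corr[i][j] for j in range(n_keep)] for i in range(n_keep)]


def central_block_of_inverse(m, block, chunk):
    """Eliminate trailing rows/columns chunk by chunk, then invert what is left.

    The result is the leading block x block part of the inverse of m.
    """
    current = [row[:] for row in m]
    while len(current) > block:
        n_keep = max(block, len(current) - chunk)
        current = schur_reduce(current, n_keep)
    return inverse(current)


def model_matrix(n):
    """Hypothetical stand-in for t^{-1} - G: dominant diagonal, decaying couplings."""
    return [[4.0 if i == j else -1.0 / (1 + abs(i - j)) ** 2 for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    m = model_matrix(8)
    print(central_block_of_inverse(m, block=2, chunk=3))
    print([row[:2] for row in inverse(m)[:2]])  # same 2 x 2 block from the full inverse
```

Only the block belonging to the central site is ever formed explicitly, so the cost of the local calculation depends on the size of the local matrix and not on the total number of atoms N.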

Real Space Accuracy
(Figure panels: fcc Cu, bcc Cu, bcc Mo, hcp Co)

Tight-Binding MST Representation
• Tight Binding Multiple Scattering Theory
  – Embed a constant repulsive potential
    • Shifts the energy zero allowing for calculations at negative energy
  – Rapidly decaying interactions: $G^{ij,s}(\epsilon) \sim e^{-(V_s - \epsilon)^{1/2} |R_i - R_j|}$
  – Free electron singularities are not a problem
  – Sparse representation
(Figure: screening potential, $V_s(R)$ = constant (2 Ryd) inside a sphere, $R \le R_{mt}$; $V_s(R) = 0$ for $R > R_{mt}$)

Screened Structure Constants
• Linear solve using m atom cluster that is less than the n atom system
  $G^s(\epsilon) = \left[ I - t^s G_{free}(\epsilon) \right]^{-1}$
• Easy to perform Fourier transform
  – K-space method: $G(k, \epsilon) = \sum_m G^{m,s}(\epsilon)\, e^{i k \cdot R_m}$
• Screened Structure Constants
  (Figure: screened structure constants $G^s$ on the left, unscreened on the right)
  – Screened structure constants rapidly go to zero, whereas the free space structure constants have hardly changed

Screened MST Methods
• Formulation produces a sparse matrix representation
  – 2-D case has tridiagonal structure with a few distant elements due to periodicity
  – 3-D case has scattered elements
    • Mainly due to mapping 3-D structure to a matrix (2-D)
    • A few elements due to periodic boundary conditions
• Require block diagonals of the inverse of the $\tau^{-1}(\epsilon)$ matrix
  – Block diagonals represent the site $\tau(\epsilon)$ matrix and are needed to calculate the Green's function for each atomic site
• Sparse direct and preconditioned iterative methods are used to calculate $\tau^{ii}(\epsilon)$
  – SuperLU
  – Transpose free Quasi-Minimal Residual Method (TFQMR)

Screened KKR Accuracy
(Figure panels: fcc Cu, bcc Cu, bcc Mo, hcp Co)
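
Because the screened structure constants decay rapidly, the lattice Fourier sum in the k-space method converges with a small cluster. The sketch below illustrates this on a one-dimensional chain with a purely hypothetical model G_m = exp(-κa|m|), chosen because its infinite sum has a closed form to compare against. None of the names or parameter values come from the slides.

```python
import cmath
import math


def screened_g(m: int, kappa_a: float) -> float:
    """Hypothetical screened coupling to the m-th neighbour on a 1-D chain."""
    return math.exp(-kappa_a * abs(m))


def lattice_sum(k_a: float, kappa_a: float, cluster: int) -> complex:
    """Truncated Fourier sum over neighbours m = -cluster .. +cluster."""
    return sum(screened_g(m, kappa_a) * cmath.exp(1j * k_a * m)
               for m in range(-cluster, cluster + 1))


def exact_sum(k_a: float, kappa_a: float) -> float:
    """Closed form of the infinite sum: (1 - r^2) / (1 - 2 r cos(ka) + r^2)."""
    r = math.exp(-kappa_a)
    return (1.0 - r * r) / (1.0 - 2.0 * r * math.cos(k_a) + r * r)


def truncation_error(k_a: float, kappa_a: float, cluster: int) -> float:
    """Absolute error made by stopping the sum at the given cluster radius."""
    return abs(lattice_sum(k_a, kappa_a, cluster) - exact_sum(k_a, kappa_a))


if __name__ == "__main__":
    for cluster in (1, 3, 10, 30):
        print(cluster, truncation_error(k_a=0.7, kappa_a=1.0, cluster=cluster))
```

The truncation error is bounded by 2 r^(cluster+1) / (1 - r) with r = exp(-κa), so each added shell of neighbours reduces the bound by a constant factor. A slowly decaying, unscreened coupling would not give this behaviour.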

Timing and Scaling of Scr-KKR-CPA

Conclusion
• Initial benchmarking of the Screened KKR method
  – SuperLU: scales as $N^{1.8}$ for finding the upper left block of the inverse of the $\tau^{-1}(\epsilon)$ matrix
  – TFQMR with block Jacobi preconditioner: scales as $N^{1.06}$ for finding the upper left block of the inverse of the $\tau^{-1}(\epsilon)$ matrix
• Extremely high sparsity (97%-99% zeros, increasing with system size)
• Large number of atoms on a single processor
• Real-space/Scr-KKR hybrid may provide the most efficient parallel approach for new generation architectures
• Single code contains
  – LSMS, KKR-CPA, Scr-LSMS and Scr-KKR-CPA
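
Scaling exponents such as the N^1.8 and N^1.06 quoted in the conclusion are commonly estimated with a straight-line fit in log-log space. The sketch below shows one way to do that. The timings it uses are synthetic, generated from an exact power law purely to exercise the fit; they are not the benchmark data behind the slides, and the function names are hypothetical.

```python
import math


def fit_exponent(sizes, times):
    """Least-squares slope of log(time) versus log(size)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def growth_factor(exponent: float, size_ratio: float = 2.0) -> float:
    """Factor by which the time grows when the system size grows by size_ratio."""
    return size_ratio ** exponent


def synthetic_timings(sizes, prefactor, exponent):
    """Synthetic timings t = prefactor * N**exponent (illustration only)."""
    return [prefactor * n ** exponent for n in sizes]


if __name__ == "__main__":
    sizes = [16, 32, 64, 128, 256]
    for label, p in (("direct-like", 1.8), ("iterative-like", 1.06)):
        times = synthetic_timings(sizes, prefactor=2.0e-3, exponent=p)
        print(label, fit_exponent(sizes, times), growth_factor(p))
```

With measured timings the fitted slope would carry noise, so it is worth fitting over as wide a range of system sizes as possible before comparing solvers.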