Documents
Resources
Learning Center
Upload
Plans & pricing Sign in
Sign Out

Secrets of Supercomputing

VIEWS: 7 PAGES: 52

									LA-UR-07-6793
Approved for public release;
distribution is unlimited




     Secrets of Supercomputing
                  The Conservation Laws
                      Supercomputing Challenge Kickoff
                            October 21-23, 2007

  I. Background to Supercomputing
  II. Get Wet! With the Shallow Water Equations
                   Bob Robey - Los Alamos National Laboratory
                  Randy Roberts – Los Alamos National Laboratory
                            Cleve Moler -- Mathworks


                                                                   1
                     Introductions
• Bob Robey -- Los Alamos National Lab, X division
   – brobey@lanl.gov, 665-9052 or home: robeyfamily@earthlink.net,
     662-2018
   – 3D Hydrocodes and parallel numerical software
   – Helped found UNM and Maui High Performance Computing
     Centers and Supercomputing Tutorials
• Randy Roberts -- Los Alamos National Lab, D Division
   – Java, C++, Numerical and Agent Based Modeling
   – rsqrd@lanl.gov
• Cleve Moler
   – Matlab Founder
   – Former UNM CS Dept Chair
   – SIAM President
   – Author of “Numerical Computing with Matlab” and “Experiments with
     Matlab”

                                                                         2
                  Conservation Laws
• Formulated as a conserved quantity
    – mass
                                                                  Change
    – momentum                            Conserved
                                          variable
    – energy
• Good reference is Leveque’s book and his
  freely available software package
  CLAWPACK (Fortran/MPI) and a 2D
  shallow water version Tsunamiclaw
Leveque, Randall, Numerical Methods for Conservation Laws
Leveque, Randall, Finite Volume Methods for Hyperbolic Problems
CLAWPACK http://www.amath.washington.edu/~claw/
Tsunamiclaw http://www.math.utah.edu/~george/tsunamiclaw.html
                                                                           3
     I. Intro to Supercomputing
• Classical Definition of Supercomputing
  – Harnessing lots of processors to do lots of small
    calculations
• There are many other definitions which usually
  include any computing beyond the norm
  – Includes new techniques in modeling, visualization,
    and higher level languages.
• Question for thought: With greater CPU
  resources is it better to save programmer work
  or to make the computer do bigger problems?

                                                          4
II. Calculus Quickstart
   Decoding the Language of
           Wizards




                              5
    Calculus Quickstart Goals
• Calculus is a language of mathematical wizards.
  It is convenient shorthand, but not easy to
  understand until you learn the secrets to the
  code.
• Our goal is for you to be able to
        READ calculus and         TALK calculus.

• Goal is not to ANALYTICALLY SOLVE calculus
  using traditional methods. In supercomputing we
  generally solve problems by brute force.

                                                    6
           Calculus Terminology
• Two branches of Calculus
   – Integral Calculus
   – Derivative Calculus
• P = f(x, y, t)
   – Population is a function of x, y, and t
• ∫f(x)dx – definite integral, area under the curve, or
  summation
• dP/dx – derivative, instantaneous rate of change, or
  slope of a function
• ∂P/∂x – partial derivative implying that P is a function of
  more than one variable

                                                                7
                    Matrix Notation
  a  c                                      This is just a system of equations


  b   d   0
                                                             a+c=0
                                                             b+d=0

   t   x
The first set of terms are state variables at time t and                           F
usually called U. The second set of terms are the flux            U
variables in space x and usually referred to as F.




                                                                               8
  Patterns for Parallel Programming
  Parallel Algorithms                  Implementation Patterns
• Data Parallel -– most                •   Message Passing
  common with MPI                      •   Threads
• Master/Worker – one                  •   Shared Memory
  process hands out the                •   Distributed Arrays, Global
  work to the other                        Arrays
  processes – great load
  balance, good with
  threads
• Pipeline – bucket brigade

Patterns for Parallel Programming, Mattson, Sanders, and Massingill, 2005

                                                                            9
                      Writing a Program
                         Data Parallel Model
Serial operations are done                                        Sections of distributed data are
on every processor so that                                        “owned” by each processor.
replicated data is the same                                       This is where the parallel
                                         P(400) – distributed     speedups occur.
on every processor.
                                          Ptot -- replicated

This may seem like a waste                                        Often ghost cells around each
of work, but it is easier than                                    processor’s data is a way to
synchronizing data values.                                        handle communication.



       Proc 1                    Proc 2                        Proc 3               Proc 4

      P(1-100)              P(101-200)                    P(201–300)              P(301-400)

        Ptot                      Ptot                          Ptot                 Ptot



                                                                                               10
         2007-2008 Sample
       Supercomputing Project
• Evaluation Criteria – Expo (Report slightly
  different). Use these to evaluate the
  following project.
  – 15% Problem Statement
  – 25% Mathematical/Algorithmic Model
  – 25% Computational Model
  – 15% Results and Conclusions
  – 10% Code
  – 10% Display           Evaluate Us!!

                                                11
             Get Wet!
 With the Shallow Water Equations
• The shallow water model for wave motion is
  important for water flow, seashore waves, and
  flooding
• Goal of this project is to model the wave motion
  in the shallow water tank
• With slight modifications this model can be
  applied to:
  – ocean or lake currents
  – weather
  – glacial movement


                                                     12
Output from a shallow water equation
    model of water in a bathtub.

    The water experiences 5 splashes which generate surface
    gravity waves that propagate away from the splash locations
    and reflect off of the bathtub walls. Wikipedia commons,
    Author Dan Copsey
    Go to shallow water movie.
    http://en.wikipedia.org/wiki/Image:Shallow_water_waves.gif




                                                                  13
           Mathematical Equations
            Mathematical Model
                         Shallow Water Equations
                                                    Notes: mass equals
  Conservation of Mass                              height because width,

       ht  (hu ) x  0                             depth and density are
                                                    all constant



  Conservation of Momentum                           Note: Force term,
                                                     Pressure P=½gh2

      (hu)t  (hu  gh ) x  0
                         2    1
                              2
                                     2                   h -> height
                                                         u -> velocity
                                                         g -> gravity



References: Leveque, Randall, Finite Volume Methods for Hyperbolic
Problems, p. 254
                                                                            14
Shallow Water Equations
     Matrix Notation

 h       uh    
 hu    2 1 2   0.
   t hu  2 gh  x
                   The maximum time
                   step is calculated so
wave speed  gh0   as to keep a wave
                   from completely
                   crossing a cell.




                                           15
             Numerical Model
• Lax-Wendroff two-step, a predictor-corrector
  method
  – Predictor step estimates the values at the zone
    boundaries at half a time step advanced in time
  – Corrector step fluxes the variables using the predictor
    step values
• Mathematical Notes for next slide:
  – U is a state variable such as mass or height.
  – F is a flux term – the velocity times the state variable
    at the interface
  – superscripts are time
  – subscripts are space

                                                               16
       The Lax-Wendroff Method
  Half Step


       n 2
             1
                                  t
  Ui 1           0.5(U  U ) 
                           n
                          i 1       ( Fi 1  Fi )
                                       n  n      n

                                 2x
                                      i
         2


 Whole Step


                           t n
             n 1
                     U i  ( Fi   Fi  )
                                        n         1              1
                         n
     Ui                                            2              2



                           x
                                               1              1
                                               2              2



Explanation graphic courtesy of Jon Robey and Dov Shlacter, 2006-2007
Supercomputing Challenge
                                                                        17
     Explanation of Lax-Wendroff Model

                                                              Data assumed to be
                                                              at the center of cell.
Physical model                                                             Ghost cell


                                                                       t
Original
                                                                       i
                                                                     Space index
Half-step                                                         t+.5
                                                                  i+.5

                                                            t+1
Full step
                                                            i



     Explanation graphic courtesy of Jon Robey and Dov Shlacter, 2006-2007
     Supercomputing Challenge. See appendix for 2D index explanation.                  18
           Extension to 2D
• The extension of the shallow water
  equations to 2D is shown in the following
  slides.
  – First slide shows the matrix form of the 2D
    shallow water equations
  – Second slide shows the 2D form of the Lax-
    Wendroff numerical method



                                                  19
2D Shallow Water Equations

     h          uh            vh      
     hu   hu 2  1 gh2    huv       
                  2                     0.
     hv  t  huv
                         hv 2  1 gh2 
                           x       2     y
       U               F                   G


Note the addition of fluxes in the y direction and a flux cross term
in the momentum equation. The U, F, and G are shorthand for
the numerical equations on the next slide. The U terms are the
state variables. F and G are the flux terms in x and y.



                                                                       20
     The Lax-Wendroff Method
 Half Step
                      n 2
                            1
                                                                t
                  U i 1 , j     0.5(U  U ) n
                                              i 1, j
                                                           n
                                                                   ( Fi 1, j  Fi ,nj )
                                                                        n

                                                               2x
                                                          i, j
                        2


                      n 2
                            1
                                                                t
                  U i, j 1      0.5(U i , j 1  U i , j ) 
                                         n            n
                                                                   (Gin, j 1  Gin, j )
                            2                                  2y
Whole Step

           n 1                      t n  1             n 2
                                                               1
                                                                      t n  1            n 2
                                                                                               1

      U            U        n
                                       ( Fi  1 ,2j  Fi  1 , j )     (Gi , j 2 1  Gi , j  1 )
                                     x                               y
          i, j              i, j
                                               2            2                       2            2




                                                                                                       21
   2D Shallow Water Equations
  Transformed for Programming
      Letting H = h, U = hu and V = hv so that our
      main variables are the state variables in the first
      column gives the following set of equations.

      H           U                 V         
      U   U 2 / H  1 gH 2    UV / H        
                     2                          0.
      V  t  UV / H
                             V 2 / H  1 gH 2 
                               x          2      y

H is height (same as mass for constant width, depth and density)
U is x momentum (x velocity times mass)
V is y momentum (y velocity times mass)



                                                                   22
         Sample Programs
• The numerical method was extracted from
  the McCurdy team’s model (team 62) from
  last year and reprogrammed from serial
  Fortran to C/MPI using the programming
  style from one of the Los Alamos team’s
  project (team 51) with permission from
  both teams.
• Additional versions of the program were
  made in Java/Threads and Matlab

                                        23
                  Programming Tools
                     Three options
1.       Matlab
     –     Computation and graphics integrated into Matlab desktop
2.       Java/Threads
     –     Eclipse or Netbeans workbench
     –     Graphics via Java 2D and Java Free Chart
3.       C/MPI
     –     Eclipse workbench -- An open-source Programmers
           Workbench http://www.eclipse.org.
     –     PTP (parallel tools plug-in) – adds MPI support to Eclipse
           (developed partly at LANL)
     –     OpenMPI – a MPI implementation (developed partly at LANL)
     –     MPE -- graphics calls that come with MPICH. Graphics calls
           are done in parallel from each processor!

                                                                     24
   Initial Conditions and Boundary
              Conditions
• Initial conditions
   – velocity (u and v) are 0 throughout the mesh
   – height is 2 with a ramp to the height of 10 at the right
     hand boundary starting at the mid-point in the x
     dimension
• Boundary conditions are reflective, slip
   – hbound=hinterior; uxbound=0; vxbound=vinterior
   – hbound=hinterior; uybound=uinterior; vybound=0
   – If using ghost cells, force zero velocity at the
     boundary by setting Uxghost= -Uinterior

                                                                25
        Results/Conclusions
• The Lax-Wendroff model accurately
  models the experimental wave tank
  – matches wave speed across the tank
• Some of the oscillations in the simulation
  are an artifact of the numerical model
  – OK as long as initial wave is not too steep
  – numerical damping technique could be added
    but is beyond the scope of this effort

                                               26
           Acknowledgements
Work used by permission:
• Awash: Modeling Wave Movement in a Ripple Tank,
  Team 62, McCurdy High School, 2006-2007
  Supercomputing Challenges
• A Lot of Hot Air: Modeling Compressible Fluid Dynamics,
  Team 51, Los Alamos High School, 2006-2007
  Supercomputing Challenge
We all have bugs and thanks to those who found mine
• Randy Roberts and Jon Robey for finding and fixing a
  bug in the second pass
• Randy Leveque for finding a missing square in the
  gravity forcing term

                                                        27
              Lab Exercises
• TsunamiClaw

• Matlab
• Experimental demonstration

• Java Serial
• Java Parallel

• C/MPI

                               28
         Java Wave Structure
• Wave class does most of the work
  – main(String[] args) calls start()
  – start() creates a WaveProblemSetup
  – start() calls methods to do initialization and
    boundary conditions
  – start() calls methods to iterate and update the
    display
Java Wave Structure (continued)
• WaveProblemSetup stores the new and old
  arrays
• swaps the new and old arrays when asked
  to by Wave
     Java Wave Program Flow
• Create arrays for new, old, and temporary
  data
• Initialize data
• Set boundary data to represent correct
  boundary conditions
• Iterate for the given number of iterations
     Java Wave Iteration Flow
• Update physics into new arrays from data
  in old arrays
• Set boundary data to represent correct
  boundary conditions with updated arrays
• Update display
• Swap new arrays with old arrays
             Java Threads
• How do you take advantage of new Multi-
  Core processors?
• Run parts of the problem on different cores
  at the same time!
     Java Threads (continued)
• WaveThreaded program
  – partitions the problem into domains using
    SubWaveProblemSetup objects
  – runs calculations on each domain in separate
    threads using WaveWorker objects
  – adds complexity with synchronization of
    thread's access to data
C/MPI Program Diagram
        Allocate memory
       Set Initial Conditions
          Initial Display

      Update Boundary Cells
       MPI Communication
       External Boundaries

            First Pass
             x half step
             y half step

          Second Pass

         Swap new/old
        Graphics Output
       Conservation Check       Repeat


       Calculate Runtime
    Close Display, MPI & exit
                                         35
                       MPI Quick Start
•   #include <mpi.h>
•   MPI_Init(&argc, &argv)

•   MPI_Comm_size(Comm, &nprocs) // get number of processors
•   MPI_Comm_rank(Comm, &myrank) // get processor rank 0 to nproc-1

•   // Broadcast from source processor to all processors
•   MPI_Bcast(buffer, count, MPI_type, source, Comm)

•   // Used to update ghost cells
•   MPI_ISend(buffer, count, MPI_type, dest, tag, Comm, req)
•   MPI_IRecv(buffer, count, MPI_type, source, tag, Comm, req+1)
•   MPI_Waitall(num, req, status)

•   // Used for sum, max, and min such as total mass or minimum timestep
•   MPI_Allreduce(&num_local, &num_global, count, MPI_type, MPI_op, Comm)

•   MPI_Finalize()

•   Web pages for MPI and MPE at Argonne National Lab (ANL) -- http://www-
    unix.mcs.anl.gov/mpi/www/
                                                                             36
                   Setup
• The software is already setup on the
  computers
• For setup on home computers, there are
  two parts. First download the files from the
  Supercomputing Challenge website for the
  lab in C/MPI if you haven’t already done
  that.
• Untar the lab files with
  “tar –xzvf Wave_Lab.tgz”

                                             37
         Setting up Software
   Instructions in the README file
Setting up System Software    Setting up User’s
• Need Java, OpenMPI and        workspace
  MPE package from            • Download eclipse
  MPICH                         software including
• Download and install          eclipse, PTP and PLDT
  according to instructions   • Install according to
  in openmpi_setup.sh           instructions in
• Can install in user’s         eclipse_setup.sh
  directory with some         • Import wave source files
  modifications                 and setup eclipse
                                according to instructions
                                in eclipse_setup.sh

                                                            38
                   Lab Exercises
• Try modifying the sample program (Java and/or C
  versions)
   – Change initial distribution. How sharp can it be before it goes
     unstable?
   – Change number of cells
   – Change graphics output
   – Try running 1, 2, or 4 processes and time the runs. Note that you
     can run 4 processes even if you are on a one processor system.
   – Switch to PTP debug or Java debug perspective and try stepping
     through the program
• Comparing to data is critical
   – Are there other unrealistic behaviors of the model?
   – Design an experiment to isolate variable effects. This can greatly
     improve your model.


                                                                     39
            Appendix A.
    Calculus and Supercomputing
• Calculus and Supercomputing are intertwined.
  Why?
• Here is a simple problem – Add up the volume of
  earth above sea-level for an island 500 ft high by
  half a mile wide and twenty miles long.


• Typical science homework problem using simple
  algebra. Can be done by hand. Not appropriate
  for supercomputing. Not enough complexity.

                                                  40
             Add Complexity
• The island profile is a jagged mountainous terrain
  cut by deep canyons. How do we add up the
  volume?

• Calculus – language of complexity
  – Addition – summing numbers
  – Multiplication – summing numbers with a constant
    magnitude
  – Integration – summing numbers with an irregular
    magnitude

                                                       41
          Divide and Conquer
• In discrete form
                               ∑ -- Summation symbol
                               ∆ -- delta symbol or x2-x1




• Divide the island into small pieces and sum up
  the volume of each piece.
• Approaches the solution as the size of the
  intervals grows smaller for a jagged profile.


                                                            42
           Divide and Conquer
• In Continuous Form – Integration




• Think of the integral symbols as describing a shape
  that is continuously varying
• The accuracy of the solution can be improved by
  summing over smaller increments
• Lots of arithmetic operations – now you have a
  “computing” problem. Add more work and you have a
  “supercomputing” problem.


                                                        43
             Derivative Calculus
                Describing Change
• Derivatives describe the change in a variable
  (numerator or top variable) relative to another variable
  (denominator or bottom). These three derivatives
  describe the change in population versus time, x-
  direction and y-direction.


                P P     P
                  ,   and
                t x     y


                                                             44
    Appendix B.
Computational Methods
   1. Eulerian and Lagrangian
      2. Explicit and Implicit




                                 45
Two Main Approaches to Divide up
           Problem
• Eulerian – divide up by spatial coordinates
  – Track populations in a location
  – Observer frame of reference
• Lagrangian – divide up by objects
  – Object frame of reference
  – Easier to track attributes of population since they
    travel with the objects
  – Agent based modeling of Star Logo uses this
    approach
  – Can tangle mesh in 2 and 3 dimensions

                                                          46
                            Eulerian
Eulerian – The area stays fixed and has a                  Population moves out of cell
  Population per area. We observe the
  change in population across the
  boundaries of the area.                                     Eulerian




Lagrangian – The population stays
  constant. The population moves with        Population moves and so does region

  velocity vx and vy and we move with
  them. The size of the area will change if
  the four vertexes of the rectangle move at Lagrangian
  different velocities. Changes in area will
  result in different densities.


                                                                                      47
        Explicit versus Implicit
• Explicit – In mathematical shorthand, Un+1=
  f(Un). This means that the next timestep values
  can be expressed entirely on the previous
  timestep values.
• Implicit – Un+1=f(Un+1,Un). Next timestep values
  must be solved iteratively. Often uses a matrix or
  iterative solver.
• We will stick with explicit methods here. You
  need more math to attempt implicit methods.

                                                  48
    Appendix C
Index Explanation for 2D Lax
         Wendroff




                               49
               Programming
• Most difficult part of programming this method is
  to keep track of indices – half step grid indices
  cannot be represented by ½ in the code so they
  have to be offset one way or the other.
• Errors are very difficult to find so it is important
  to be very methodical in the coding.
• Next two slides show the different sizes of the
  staggered half-step grid and the relationships
  between the indices in the calculation (courtesy
  Jon Robey).

                                                     50
                               i
              0        1       2       3       4
                                                   1st Pass
    0
                       y       y       y
    1              x       x       x       x             0,0     -- 1,0 | 1,1
                       y       y       y
                                                       j,i     -- j+1,i | j+1,i+1
j   2              x       x       x       x
                       y       y       y
    3              x       x       x       x
                       y       y       y                       X step grid          Main grid
    4




        0,0       -- 0,1 | 1,1
    j,i       -- j,i+1 | j+1,i+1



              Y step grid          Main grid                                                    51
                               i
              0        1       2       3       4
                                                   2nd Pass
    0
                       y       y       y
    1              x       x       x       x             1,1     -- 0,0 | 0,1
                       y       y       y
                                                       j,i     -- j-1,i-1 | j-1,i
j   2              x       x       x       x
                       y       y       y
    3              x       x       x       x
                       y       y       y                       Main grid            X step grid
    4




        1,1       -- 0,0 | 1,0
    j,i       -- j-1,i-1 | j,i-1



              Main grid            Y step grid                                                    52

								
To top