					Getting Started with HPC
On Iceberg
     Michael Griffiths
     Corporate Information and Computing Services
     The University of Sheffield
     Email m.griffiths@sheffield.ac.uk
           Outline
•   e-Science
•   Review of hardware and software
•   Accessing
•   Managing Jobs
•   Building Applications
    – Single processor
    – Shared memory multiprocessor tasks
    – Parallel tasks
• Resources
• Getting Help
e-Science

• More science relies on computational experiments
• More large, geographically disparate, collaborative
  projects
• More need to share/lease resources
   – Compute power, datasets, instruments, visualization
e-Science Requirements

• Simple and secure access to remote resources across
  administrative domains
• Minimally disruptive to local administration policies and
  users
• Large set of resources used by a single computation
• Adapt to non-static configuration of resources
                       Types of Grids
•Cluster Grid
   •Beowulf clusters
•Enterprise Grid, Campus
Grid, Intra-Grid
   •Departmental clusters,
    servers and PC network
•Cloud, Utility Grid
   •Access resources over internet on demand
•Global Grid, Inter-grid
   •White Rose Grid, National Grid Service,
   Particle physics data grid
Three Applications of Grid Computing
• Compute grids
• Data grids
• Collaborative grids
                            Grid Types:
                            Data Grid
•  Computing network stores large volumes of data across the network
•  Heterogeneous data sources
[Diagram: engine flight data from an airline flows via the Grid between London and New York airports, a diagnostics centre, a maintenance centre, and European and American data centres]
Grid Types: Collaborative
     •   Internet videoconferencing
     •   Collaborative Visualisation
EGEE



  • The EGEE project brings together experts from over 27
    countries
     – Build on recent advances in Grid technology.
     – Developing a service Grid infrastructure in Europe,
        • available to scientists 24 hours a day.
Available Grid Services
 • Access Grid
 • White Rose Grid
    – Grid research
    – HPC Service
 • National Grid Service
    – Compute Grid
    – Data Grid (SRB)
 • National HPC Services
    – HPCx and CSAR (part of NGS)
 • Portal Services
Review: Hardware 1
•   AMD-based system supplied by Sun Microsystems
•   Processors: 252
•   Cores: 624
•   Performance: 435 GFLOPs
•   Main Memory: 2.296 TB
•   Filestore: 8 TB
•   Temporary disk space: 18 TB
•   Physical size: 4 racks
•   Power usage: 36 kW
Review: Hardware 2
•   Older V20 and V40 servers for the GridPP community
•   Dual headnode
     – Node 1: login node
     – Node 2: cluster services (including sge master), behaves as failover node
•   435 cores for general use
     – 96 Sun X2200 nodes, each with 4 cores and 16 GB of RAM
     – 23 "Hawley" nodes, each with 8 cores and 32 GB of RAM
•   Comparing L2 cache
     – AMD Opteron: 1 MB
     – UltraSPARC III Cu (Titania): 8 MB
Review: Hardware 3
[Photo: inside an X2200 unit]
Review: Hardware 4
• Two main interconnect types: Gigabit Ethernet (commodity) and
  InfiniBand (more specialist)
   – Gigabit Ethernet – supported as standard; good for job farms and
     small to mid-size systems
   – InfiniBand – high-end solution for large parallel applications;
     has become the de facto standard for clusters (4Gb/s)
        InfiniBand specifications
•High data rates of up to 1880 MBits/s
 ConnectX™ IB HCA Card, Single Port 16Gb/s InfiniBand
•Low latency of ~1 µs
    •Gigabit Ethernet latency is of the order of 100 µs
•SilverStorm 24-Port InfiniBand DDR Switch
Review: Hardware 5
• 64-bit vs 32-bit
   – Mainly useful for programs requiring large memory –
     available on bigmem nodes
   – Greater floating point accuracy
   – Future-proof: 32-bit systems are becoming obsolete in HPC
White Rose Grid
[Diagram: the White Rose Grid – racks of service processors and worker nodes at each site, interconnected via the YHMAN network]
    Review: Software 1
• Redhat 64bit / Scientific Linux on Opteron
• Portland and GNU compilers
• Sun Grid Engine v6
• MPICH
• Ganglia
    Review: Software 2
•    Maths and Statistical
      –   Matlab 2009a, Scilab 5
      –   R 2.0.1
•    Engineering and Finite Element
      –   Fluent 6.2.16, 6.1.25 and 6.1.22, also gambit, fidap and tgrid
      –   Ansys v90
      –   Abaqus
      –   CFX 5.7.1
      –   DYNA 91a
•    Visualisation
      –   IDL 6.1
      –   OpenDX
Review: Software 3
• Development
   – MPI, mvapich2, openmpi
          • mvapich2, Hawley nodes
         • OpenMPI, Hawley nodes and using GigE
   – OpenMP
   – Nag, 20
   – ACML
• Grid
   – Globus 2.4.3 (via gpt 3.0)
   – SRB s-client tools to follow
    Accessing 1: Registration
• Registration
   – Details at http://www.shef.ac.uk/wrgrid/register
   – WRG users complete form at
       • http://www.wrgrid.group.shef.ac.uk/cms/WRG_form_newuser.pdf
   – e-Science Certificate registration optional
Accessing 2: Logging in
• ssh client
   – putty, SSH Secure Shell Client (from
     http://www.shef.ac.uk/wrgrid/trainingresources/SSHSecureShellClient-3.2.9.exe)
• X-Windows
   – Exceed 3d (just start exceed and login using ssh client)
   – Cygwin
   – Note: when using SSH secure shell client
       • From menu: edit-> settings
       • Select: Connection->tunneling
       • Tick Tunnel X11 connections
Accessing 3: Linux vs Solaris
• For end users, things are much the same
• RedHat Enterprise 5 (Scientific Linux)
• BASH is the default shell (use the up and down keys for
  history, type "history", use tab for auto-completion)
• Setting an environment variable in BASH looks like
• export environment_var="setting"
Accessing 4: Login Environment
• Paths and environment variables have been set up.
  (change things with care)
• BASH, CSH and TCSH are set up by default; more exotic
  shells may need additional variables for things to
  work correctly
• Install any e-Science certificates in your .globus directory
Resources 1: Filestore
• Two areas of filestore available on Iceberg.
   – A permanent, secure, backed up area in your home
     directory /home/username
   – data directory /data/username
      • Not backed up to tape
      • Data is mirrored on the storage server
Resources 2: Scratch area
• Temporary data storage on local compute nodes
   – I/O much faster than NFS mounted /home and /data
• Data not visible to other worker nodes and not backed up
• Create a directory named after your username in /scratch on a worker
  and work from this directory (a minimal sketch follows below)
• The data in the /scratch area is deleted periodically when the
  worker is not accessed by any process or job
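• A minimal sketch of working in the scratch area (the file and program
  names here are hypothetical, and $USER is assumed to hold your username):
   mkdir -p /scratch/$USER          # create your own scratch directory on the worker
   cd /scratch/$USER                # work from the fast local disk
   cp ~/input.dat .                 # copy a (hypothetical) input file from /home
   ./myprogram input.dat            # run a (hypothetical) I/O-heavy program locally
   cp results.out /data/$USER/      # copy results back to mirrored storage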
Resources 3: Storage Allocations
• Storage allocations for each area are as follows:
   – Your home directory has a filestore quota of 5 GB, but you can
     request additional space.
   – If you change directory to /data you will see a directory
     labelled by your username.
   – In /data you can store 50 GB of files; you can request
     additional space.
Resources 4: Important Notes
• The data area is not backed up.
• Check your quota regularly; if you go over quota the
  account will become frozen and you will need to contact
  iceberg-admins
• Check quota using the command quota
• If you exceed your quota, remove files using the rm command
Resources 5 : Transferring Data
• Command line tools such as scp, sftp
• Use sftp tools such as winscp for windows
   – http://winscp.net/eng/index.php
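• For example, from a Linux or Mac desktop (the login hostname
  iceberg.shef.ac.uk used here is an assumption; substitute your own username):
   # copy a file from your desktop to your /data area on iceberg
   scp mydata.dat username@iceberg.shef.ac.uk:/data/username/
   # or open an interactive sftp session
   sftp username@iceberg.shef.ac.uk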
            Running programs on iceberg
•   Iceberg is the gateway to the cluster of worker nodes and the
    only one where direct logging in is allowed.
•   Iceberg's main purpose is to allow access to the worker
    nodes but NOT to run cpu intensive programs.
•   All cpu intensive computations must be performed on the
    worker nodes. This is achieved by the qsh command for the
    interactive jobs and qsub command for the batch jobs.
•   Once you log into iceberg, taking advantage of the power of a
    worker node for interactive work is done simply by typing qsh
    and working in the new shell window that is opened. This
    apparently trivial task will in fact have queried all the worker
    nodes for you and started a session on the least loaded worker
    in the cluster.
•    The next set of slides assume that you are already working
    on one of the worker nodes (qsh session).
Managing Jobs 1: Sun Grid Engine Overview

• Resource management system, job scheduler,
  batch system…
• Can schedule Serial and Parallel jobs
   – Serial jobs run in individual host queues
   – Parallel jobs must include a parallel environment request (-
     pe <pe_name> N)
[Diagram: job scheduling on the cluster – the SGE master node assigns submitted jobs (X, U, N, Y, Z, O) to slots in Queue-A, Queue-B and Queue-C on the SGE worker nodes, according to configured queues, policies, priorities, share/tickets, resources and users/projects]
Managing Jobs 2: Job Scheduling
• Job schedulers work predominantly with “batch” jobs
  - require no user input or intervention once started
• Installation here also supports interactive use via
  “qsh”
Managing Jobs 3: Working with SGE jobs

• There are a number of commands for querying and
  modifying the status of a job running or queued by
  SGE
   – qsub (submit a job to SGE)
   – qstat (query job status)
   – qdel (delete a job)
Managing Jobs 4: Submitting Serial Jobs
• Create a submit script (example.sh):
  #!/bin/sh
  # Scalar benchmark
  echo "This code is running on" `/bin/hostname`
  /bin/date

• The job is submitted to SGE using the qsub
  command:
  $ qsub example.sh
        Managing Jobs 5: Options Used with SGE
-l h_rt=hh:mm:ss     The wall clock time. This parameter must be specified,
                     failure to include this parameter will result in the error
                     message: “Error: no suitable queues”.
-l mem=memory        sets the memory limit e.g. –l mem=10G


-l h_vmem=memory Sets the limit of virtual memory required (for parallel
                 jobs per processor).
-help                Prints a list of options


-pe ompigige np      Specifies the parallel environment to be handled by
                     the Score system. np is the number of nodes to be
                     used by the parallel job. Please note: this is always
                     one more than needed as one process must be
                     started on the master node, which, although does not
                     carry out any computation, is necessary to control the
                     job.
      Managing Jobs 6: Options Used with SGE

-cwd                 Execute the job from the current working directory;
                     output files are sent to the directory from which the
                     job was submitted, not to the user's home directory.
-m be                Send mail at the beginning and at the end of the job
                     to the owner
-S shell             Use the specified shell to interpret the script rather
                     than the C shell (default).
-masterq iceberg.q   Specifies the name of the master scheduler as the
                     master node (iceberg)
-v                   Export all environment variables to all spawned
                     processes.
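• As an illustration, several of these options can be combined in a
  submit script and the script submitted with qsub in the usual way;
  the resource values below are only examples:
   #!/bin/sh
   #$ -l h_rt=01:30:00     # wall clock time limit (must be specified)
   #$ -l mem=8G            # example memory request
   #$ -cwd                 # run from, and write output to, the submission directory
   #$ -m be                # email at the beginning and end of the job
   ./myprogram             # hypothetical executable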
Managing Jobs 7: qsub
qsub arguments:

qsub -o outputfile -j y -cwd ./submit.sh

OR in submit script:

     #!/bin/bash
     #$ -o outputfile
     #$ -j y
     #$ -cwd
     /home/horace/my_app
Managing Jobs 8: Interactive Use
• Interactive but with a dedicated resource
• “qsh”
   – Then use as your desktop machine
   – Fluent, matlab…
Managing Jobs 9: Deleting Jobs with qdel
• Individual Job
$ qdel 151
gertrude has registered the job 151 for deletion
• List of Jobs
$ qdel 151 152 153
• All Jobs running under a given username
$ qdel -u <username>
Managing Jobs 9: Monitoring Jobs with qstat
• To list the status and node properties of all nodes:
   qstat (add –f to get a full listing)

• Information about a user's own jobs and queues is provided by
  the qstat -u username command, e.g.
   qstat -u fred

• To monitor a job and show its memory usage: qstat -f -j jobid |
  grep usage
         Managing Jobs 10: qstat Example
job-ID prior name               user         state submit/start at queue                                  slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------

206951 0.51000 INTERACTIV bo1mrl     r 07/05/2005 09:30:20 bigmem.q@comp58.iceberg.shef.a 1
206933 0.51000 do_batch4 pc1mdh    r 07/04/2005 16:28:20 long.q@comp04.iceberg.shef.ac. 1
206700 0.51000 l100-100.m mb1nam    r 07/04/2005 13:30:14 long.q@comp05.iceberg.shef.ac. 1
206698 0.51000 l50-100.ma mb1nam    r 07/04/2005 13:29:44 long.q@comp12.iceberg.shef.ac. 1
206697 0.51000 l24-200.ma mb1nam    r 07/04/2005 13:29:29 long.q@comp17.iceberg.shef.ac. 1
206943 0.51000 do_batch1 pc1mdh    r 07/04/2005 17:49:45 long.q@comp20.iceberg.shef.ac. 1
206701 0.51000 l100-200.m mb1nam    r 07/04/2005 13:30:44 long.q@comp22.iceberg.shef.ac. 1
206705 0.51000 l100-100sp mb1nam    r 07/04/2005 13:42:07 long.q@comp28.iceberg.shef.ac. 1
206699 0.51000 l50-200.ma mb1nam    r 07/04/2005 13:29:59 long.q@comp30.iceberg.shef.ac. 1
206632 0.56764 job_optim2 mep02wsw r 07/03/2005 22:55:30 parallel.q@comp43.iceberg.shef 18
206600 0.61000 mrbayes.sh bo1nsh   r 07/02/2005 11:22:19 parallel.q@comp51.iceberg.shef 24
206911 0.51918 fluent cpp02cg    r 07/04/2005 14:19:06 parallel.q@comp52.iceberg.shef 4
206954 0.51000 INTERACTIV php04awb r 07/05/2005 10:06:17 short.q@comp01.iceberg.shef.ac 1
Managing Jobs 11: Monitoring Job Output

• The following is an example of submitting a SGE job
  and checking the output produced
   qsub -pe mpich 8 myjob.sh
      job <131> submitted
   qstat -f (is the job running?)
   tail -f myjob.sh.o131
Managing Jobs 12: SGE Job Output
• When a job is queued it is allocated a job number.
  Once it starts to run, output sent to standard
  output and standard error is spooled to files called
   – <script>.o<jobid>
   – <script>.e<jobid>
Managing Jobs 13: Reasons for Job Failures
  –   SGE cannot find the binary file specified in the job script
  –   Required input files are missing from the startup directory
  –   An environment variable is not set (LM_LICENSE_FILE etc.)
  –   Hardware failure (e.g. MPI ch_p4 or ch_gm errors)
Managing Jobs 14: SGE Job Arrays
• Add to qsub command or script file (with #$ at
  beginning of line)
   – “ –t 1-10 “
• Would create 10 tasks from one job
• Each task has $SGE_TASK_ID set in the
  environment
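• A minimal sketch of a job array script, assuming each task reads its
  own numbered input file (the file and program names are hypothetical):
   #!/bin/sh
   #$ -cwd
   #$ -t 1-10                                  # create 10 tasks from this one job
   # each task sees its own value of SGE_TASK_ID (1..10)
   ./myprogram input.$SGE_TASK_ID > output.$SGE_TASK_ID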
Specifying The Memory Requirements of a
Job
• Policies that apply to queues
   – Default memory requirement for each job is 4GB
   – Jobs will be killed if memory exceeds amount requested
• Determine memory requirements for a job as follows
   – qstat -f -j jobid | grep mem
   – The reported figures will indicate
     - the currently used memory ( vmem )
     - the maximum memory needed since startup ( maxvmem )
     - cumulative memory_usage*seconds ( mem )
   – The next time you run the job, use the reported value of
     vmem to specify its memory requirement (see the example below)
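• For example, if qstat reported a vmem of just under 10 GB, the job
  might be resubmitted with a request such as (values illustrative only):
   qsub -l h_rt=08:00:00 -l mem=10G myjob.sh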
Managing Jobs 15: SGE Parallel Environments
• Parallel environments on Iceberg
   –   ompigige
   –   openmp
   –   openmpi-ib
   –   mvapich2-ib
• See later
  Managing Jobs 16: Job Queues on Iceberg
Queue name      Job size limit   System specification
short.q         8 cpu hours
long.q          168 cpu hrs      2GB memory per cpu
parallel.q      168 cpu hrs      Jobs requiring multiple cpus (V40s)
openmp.q        168 cpu hrs      Shared memory jobs using openmp
parallelx22.q   168 cpu hrs      Jobs requiring multiple cpus (X2200s)
Managing Jobs 17: Interactive Computing

• Software that runs interactively should not be run on
  the head node.
• Instead you must run interactive jobs on an
  execution node (see the qsh command below).
• The time limit for interactive work is 8 hours.
   – Interactive work on the head node will be killed off.
Checkpointing Jobs
• Simplest method for checkpointing
   – Ensure that applications save configurations at regular
     intervals so that jobs may be restarted (if necessary) using
     these configuration files.
• Using the BLCR checkpointing environment
   – BLCR commands
   – Using BLCR checkpoint with an SGE job
Checkpointing Using BLCR
• BLCR commands
  – cr_run, cr_checkpoint, cr_restart
• Run the code
  – cr_run ./executable
  – To checkpoint a process with process id PID
    cr_checkpoint -f checkpoint.file PID
  – To restart the process from a checkpoint
    cr_restart checkpoint.file
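• A minimal sketch of this workflow from the command line (the program
  name is hypothetical):
   cr_run ./myprogram &                      # run the program under checkpoint control
   PID=$!                                    # note its process id
   cr_checkpoint -f checkpoint.file $PID     # write a checkpoint of that process
   cr_restart checkpoint.file                # later, restart from the checkpoint file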
Using BLCR Checkpoint with An SGE Job
• A checkpoint environment has been setup called BLCR - it's
  accessible using the test cstest.q queue.
• An example of a checkpointing job would look something like:
  #
  #$ -l h_cpu=168:00:00
  #$ -q cstest.q
  #$ -c hh:mm:ss
  #$ -ckpt blcr
  cr_run ./executable >> output.file
• The -c hh:mm:ss option tells SGE to checkpoint over the
  specified time interval.
• The -c sx option tells SGE to checkpoint if the queue is
  suspended, or if the execution daemon is killed.
                      Tutorials

On iceberg copy the contents of the tutorial directory to
   your user area into a directory named sge:
 cp -rp /usr/local/courses/sge sge
 cd sge
In this directory the file readme.txt contains all the
   instructions necessary to perform the exercises.
Building Applications 1: Overview
• The operating system on iceberg provides full
  facilities for,
   – scientific code development,
   – compilation and execution of programs.
• The development environment includes,
   – debugging tools provided by the Portland test suite,
   – the eclipse IDE.
        Building Applications 2: Compilers
•   C and Fortran programs may be compiled using the GNU
    or Portland Group compilers. Invocation of these compilers is
    summarised in the following table:




    Language               GNU Compiler      Portland
    C                      gcc               pgcc
    C++                    g++               pgCC

    FORTRAN 77             g77               pgf77
    FORTRAN 90/95                            pgf90
Building Applications 3: Compilers
• All of these commands take the filename containing
  the code to be compiled as one argument followed
  by numerous options.
• Example
     – pgcc myhelloworld.c -o hello
• Details of these options may be found through the
  UNIX man facility,
• To find details about the Portland f90 compiler use:
  man pgf90
Building Applications 4: Compiler Options
             Option                Effect

-c                    Compile, do not link.

-o exefile            Specifies a name for the
                      resulting executable.
-g                    Produce debugging information
                      (no optimization).
-Mbounds              Check arrays for out of bounds
                      access.
-fast                 Full optimisation with function
                      unrolling and code reordering.
        Building Applications 5: Compiler Options

      Option                           Effect

-Mvect=sse2       Turn on streaming SIMD extensions (SSE) and
                  SSE2 instructions. SSE2 instructions operate on
                  64 bit floating point data.
-Mvect=prefetch   Generate prefetch instructions.

-tp k8-64         Specify target processor type to be an
                  Opteron processor running a 64 bit system.
-g77libs          Link time option allowing object files generated
                  by g77 to be linked into programs (n.b. may
                  cause problems with parallel libraries).
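• For instance, several of these options might be combined on one
  compile line (the source file name is hypothetical):
   % pgf90 -fast -Mvect=sse2 -tp k8-64 -o mycode mycode.f90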
Building Applications 6: Sequential Fortran
• Assuming that the Fortran 77 program source code is contained
  in the file mycode.f, to compile using the Portland group
  compiler type:
  % pgf77 mycode.f
• In this case the code will be output into the file a.out. To run this
  code issue:
  % ./a.out
  at the UNIX prompt.
• To add some optimization, when using the Portland group
  compiler, the -fast flag may be used. Also -o may be used to
  specify the name of the compiled executable, i.e.:
  % pgf77 -o mycode -fast mycode.f
• The resultant executable will have the name mycode and will
  have been optimized by the compiler.
Building Applications 7: Sequential C
• Assuming that the program source code is contained
  in the file mycode.c,
• to compile using the Portland C compiler, type:
  % pgcc -o mycode mycode.c
• In this case, the executable will be output into the file
  mycode which can be run by typing its name at the
  command prompt:
  % ./mycode
Memory Issues
• Programs using less than 2 GB of memory require no modification
• Large memory is associated with the heap or data memory segment;
  if this exceeds 2 GB use the following compiler flags
• C/C++ compilers
   – pgcc -mcmodel=medium
• Fortran compilers
   – pgf77/pgf90/pgf95 -mcmodel=medium
   – g77 -mcmodel=medium
Setting available memory using ulimit
• ulimit provides control over available resources for
  processes
   – ulimit -a          report all available resource limits
   – ulimit -s XXXXX    set maximum stack size
• Sometimes it is necessary to set the hard limit, e.g.
   – ulimit -sH XXXXXX
Useful Links for Memory Issues
• 64 bit programming memory issues
   – http://www.ualberta.ca/CNS/RESEARCH/LinuxClusters/64-bit.html
• Understanding Memory
   – http://www.ualberta.ca/CNS/RESEARCH/LinuxClusters/mem.html
Building Applications 8: Debugging
• The Portland group debugger is a
   – symbolic debugger for Fortran, C and C++ programs.
• It allows the control of program execution using
   – breakpoints and
   – single stepping,
• and enables the state of a program to be checked by
  examination of
   – variables
   – and memory locations.
Building Applications 9: Debugging
• PGDBG debugger is invoked using
   – the pgdbg command as follows:
   – pgdbg <arguments> program arg1 arg2 ... argn
   – arguments may be any of the pgdbg command line
     arguments,
   – program is the name of the target program being debugged,
   – arg1, arg2, ... argn are the arguments to the program.

• To get help from pgdbg use:
  pgdbg -help
Building Applications 10: Debugging
• PGDBG GUI
  – invoked by default using the command pgdbg.
  – Note that in order to use the debugging tools applications
    must be compiled with the -g switch thus enabling the
    generation of symbolic debugger information.
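• As an illustration (the program name is hypothetical):
   % pgcc -g -o myprog myprog.c    # compile with symbolic debug information
   % pgdbg ./myprog                # start PGDBG on the executable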
Building Applications 11: Profiling
• PGPROF profiler enables
   – the profiling of single process, multi process MPI or SMP
     OpenMP, or
   – programs compiled with the -Mconcur option.
• The generated profiling information enables the
  identification of portions of the application that will
  benefit most from performance tuning.
• Profiling generally involves three stages:
   – compilation
   – execution
   – analysis (using the profiler)
          Building Applications 12: Profiling
•    To use profiling it is necessary to compile your program with
     the options indicated in the table below:


          Option                          Effect
    -Mprof=func      Insert calls to produce function level pgprof
                     output.
    -Mprof=lines     Insert calls to produce line level pgprof
                     output.
    -Mprof=mpi       Link in the MPI profile library that intercepts MPI
                     calls to record message sizes and count
                     message sends and receives, e.g.
                     -Mprof=mpi,func.
    -pg              Enable sample based profiling.
Building Applications 13: Profiling
• The PG profiler is executed using the command
  % pgprof [options] [datafile]
   – Datafile is a pgprof.out file generated from the program
     execution.
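• A typical line-level profiling session might therefore look like this
  (the program name is hypothetical):
   % pgcc -Mprof=lines -o myprog myprog.c    # build with line-level profiling calls
   % ./myprog                                # run the program; this writes pgprof.out
   % pgprof pgprof.out                       # analyse the profile data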
Shared Memory Applications 1: OpenMP

• Source code containing OpenMP compiler directives
  can be compiled on symmetric multiprocessor nodes
• On Iceberg
   – 2 × dual core (AMD Opteron)
   – 2 × quad core (AMD Shanghai)
Shared Memory Applications 2: Compiling
OpenMP
• SMP source code is compiled using the PGI
  compilers with the -mp option.
• To compile C, C++, Fortran77 or Fortran90 code,
  use the -mp flag as follows,
   –   pgf77 [compiler options] -mp filename
   –   pgf90 [compiler options] -mp filename
   –   pgcc [compiler options] -mp filename
   –   pgCC [compiler options] -mp filename
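• For example, to build an optimised OpenMP C program (the file name
  is hypothetical):
   % pgcc -fast -mp -o myomp myomp.c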
Shared Memory Applications 3: Simple
OpenMP Makefile
# Simple openmp makefile
# C compiler and options
CC= pgcc -fast -mp
LIB= -lm
# Object files
OBJ= main.o \
     another.o
# Compile
myapp: $(OBJ)
    $(CC) -o $@ $(OBJ) $(LIB)

.c.o:
    $(CC) -c $<
# Clean out object files and the executable.
clean:
    rm *.o myapp
Shared Memory Applications 4: Specifying
Required Number of Threads
• The number of parallel execution threads at
  execution time is controlled by setting the
  environment variable OMP_NUM_THREADS to the
  appropriate value.
• For the csh or tcsh this is set using,
  setenv OMP_NUM_THREADS 2
• or for the sh or bash shell use,
  export OMP_NUM_THREADS=2
Shared Memory Applications 5: Starting an
OpenMP Interactive Shell
• To start an interactive shell with NPROC processors
  enter,
  qsh -pe openmp NPROC -v
  OMP_NUM_THREADS=NPROC
• Note
   – although the number of processors required is specified
     with the -pe option,
   – it is still necessary to ensure that the
     OMP_NUM_THREADS environment variable is set to the
     correct value.
Shared Memory Applications 6: Submitting
an OpenMP Job to Sun Grid Engine
•   The job is submitted to a special parallel environment that ensures the
    job occupies the required number of slots.
•   Using the SGE command qsub the openmp parallel environment is
    requested using the -pe option as follows,
     – qsub -pe openmp 2 -v OMP_NUM_THREADS=2 myjobfile.sh

•   The following job script, job.sh is submitted using, qsub job.sh
    Where job.sh is,

         #!/bin/sh
         #$ -cwd
         #$ -pe openmp 4
         #$ -v OMP_NUM_THREADS=4
         ./executable
Parallel Programming with MPI 1:
Introduction
• Iceberg is designed with the aim of running MPI
  (message passing interface ) parallel jobs,
• the sun grid engine is able to handle MPI jobs.
• In a message passing parallel program each process
  executes the same binary code but,
   – executes a different path through the code
   – this is SPMD (single program multiple data) execution.
• Iceberg uses
   – the openmpi-ib and mvapich2-ib implementations provided over
     InfiniBand (Quadrics/ConnectX)
   – using the IB fast interconnect at 16 Gbit/s.
            Parallel Programming with MPI 2:
                     Hello MPI World!
#include <mpi.h>
#include <stdio.h>
int main(int argc,char *argv[])
{
int rank; /* my rank in MPI_COMM_WORLD */
int size; /* size of MPI_COMM_WORLD */
/* Always initialise mpi by this call before using any mpi functions. */
MPI_Init(& argc , & argv);
/* Find out how many processors are taking part in the computations. */
MPI_Comm_size(MPI_COMM_WORLD, &size);
/* Get the rank of the current process */
MPI_Comm_rank(MPI_COMM_WORLD, & rank);
if (rank == 0)
printf("Hello MPI world from C!\n");
printf("There are %d processes in my world, and I have rank %d\n",size, rank);
MPI_Finalize();
}
Parallel Programming with MPI 3:
Output from Hello MPI World!
• When run on 4 processors the MPI Hello World program
  produces the following output,

  Hello MPI world from C!

  There are 4 processes in my world, and I have rank 2
  There are 4 processes in my world, and I have rank 0
  There are 4 processes in my world, and I have rank 3
  There are 4 processes in my world, and I have rank 1
Parallel Programming with MPI 4:
Compiling MPI Applications Using Myrinet
on V40’s
• To compile C, C++, Fortran77 or Fortran90 MPI code
  using the portland compiler, type,

      mpif77 [compiler options] filename
      mpif90 [compiler options] filename
      mpicc [compiler options] filename
      mpiCC [compiler options] filename
              Parallel Programming with MPI 4:
      Compiling MPI Applications Using Gigabit ethernet
                          on X2200’s

•   To compile C, C++, Fortran77 or Fortran90 MPI code using the
    portland compiler, with OpenMPI type,

        export MPI_HOME="/usr/local/packages5/openmpi-pgi/bin"
        $MPI_HOME/mpif77 [compiler options] filename
        $MPI_HOME/mpif90 [compiler options] filename
        $MPI_HOME/mpicc [compiler options] filename
        $MPI_HOME/mpiCC [compiler options] filename
Parallel Programming with MPI 5:
Simple Makefile for MPI
# MPI Makefile for intrompi examples.
.SUFFIXES: .f90 .f .o
# Comment out one of these lines using a # to select either the
# mpich-gm or the OpenMPI compiler:
MPI_HOME = /usr/local/mpich-gm2_PGI
MPI_HOME = /usr/local/packages5/openmpi-pgi

MPI_INCLUDE = $(MPI_HOME)/include

# C compiler and options
CC = ${MPI_HOME}/bin/mpicc
CLINKER = ${CC}
COPTFLAGS = -O -fast
F90 = ${MPI_HOME}/bin/mpif90
FLINKER = $(F90)
FOPTFLAGS = -O3 -fast
LINKER = $(CLINKER)
OPTFLAGS = $(COPTFLAGS)
# Object files
OBJ = ex1.o \
      another.o
# Compile
ex1: $(OBJ)
    $(CC) -o $@ $(OBJ) $(LIBS)
.c.o:
    $(CC) -c $<
# Clean out object files and the executable.
clean:
    rm *.o ex1
Parallel Programming with MPI 6:
Submitting an MPI Job to Sun Grid Engine
• To submit an MPI job to sun grid engine,
   – use the openmpi-ib parallel environment,
   – which ensures that the job occupies the required number of slots.
• Using the SGE command qsub,
   – the openmpi-ib parallel environment is requested using the
     -pe option as follows,

     qsub -pe openmpi-ib 4 myjobfile.sh
Parallel Programming with MPI 7:
Sun Grid Engine MPI Job Script
• The following job script, job.sh is submitted using,
    – qsub job.sh
    – job.sh is,

   #!/bin/sh
   #$ -cwd
   #$ -pe openmpi-ib 4
   #$ -q parallel.q
   # SGE_HOME to locate sge mpi execution script
   #$ -v SGE_HOME=/usr/local/sge6_2
   /usr/mpi/pgi/openmpi-1.2.8/bin/mpirun ./mpiexecutable
Parallel Programming with MPI 9:
Sun Grid Engine MPI Job Script
• Using this executable directly the job is submitted using qsub in
  the same way but the scriptfile job.sh is,
  #!/bin/sh
  #$ -cwd
  #$ -pe mvapich2-ib 4
  #$ -q parallel.q

   # MPIR_HOME from submitting environment
   #$ -v MPIR_HOME=/usr/mpi/pgi/mvapich2-1.2p1

   $MPIR_HOME/bin/mpirun_rsh -rsh -np 4 -hostfile
   $TMPDIR/machines ./mpiexecutable
            Parallel Programming with MPI 10:
           Sun Grid Engine OpenMPI Job Script
•   Using this executable directly the job is submitted using qsub in
    the same way but the scriptfile job.sh is,
    #!/bin/sh
    #$ -cwd
    #$ -pe ompigige 4
    #$ -q parallelx22.q

    # MPIR_HOME from submitting environment
    #$ -v MPIR_HOME=/usr/local/packages5/openmpi-pgi

    $MPIR_HOME/bin/mpirun -np 4 -machinefile $TMPDIR/machines ./mpiexecutable
Parallel Programming with MPI 10:
Extra Notes
•   Number of slots required and parallel environment
    must be specified using -pe openmpi-ib NSLOTS
•   The correct SGE queue set up for parallel jobs must
    be specified using -q parallel.q
•   The job must be executed using the correct
    PGI/Intel/gnu implementation of mpirun. Note also:
    – Number of processors is specified using -np NSLOTS
    – Specify the location of the machinefile used for your
      parallel job, this will be located in a temporary area on the
      node that SGE submits the job to.
Parallel Programming with MPI 10:
Pros and Cons.
• The downside to message passing codes is that they are
  harder to write than scalar or shared memory codes.
    – The system bus on a modern cpu can pass in excess of
      4Gbits/sec between the memory and cpu.
    – A fast ethernet between PC's may only pass up to 200Mbits/sec
      between machines over a single ethernet cable and
        • this can be a potential bottleneck when passing data between compute
          nodes.
• The solution to this problem for a high performance cluster such
  as iceberg is to use a high performance network solution, such
  as the 16Gbit/sec interconnect provided by infiniband.
    – The availability of such high performance networking makes
      possible a scalable parallel machine.
Supported Parallel Applications on Iceberg

•   Abaqus
•   Fluent
•   Matlab
•   For information see documentation at
    – http://www.wrgrid.group.shef.ac.uk/forum/
        • See the iceberg users forum for Research Computing
Getting help
• Web site
   – http://www.shef.ac.uk/wrgrid/
• Documentation
   – http://www.shef.ac.uk/wrgrid/documents
• Training (also uses the learning management system)
   – http://www.shef.ac.uk/wrgrid/training
• Forum
   – http://www.wrgrid.group.shef.ac.uk/forum/index.php
• Contacts
   – http://www.shef.ac.uk/wrgrid/contacts.html
                      Tutorials

On iceberg copy the contents of the tutorial directory to
   your user area into a directory named sge:
 cp -rp /usr/local/courses/sge sge
 cd sge
In this directory the file readme.txt contains all the
   instructions necessary to perform the exercises.

				