OpenMP Tutorial
Seung-Jai Min
(smin@purdue.edu)
School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN
High-Performance Parallel Scientific Computing 2008 Purdue University
Shared Memory Parallel Programming in the Multi-Core Era
• Desktop and Laptop
– 2, 4, 8 cores and … ?
• A single node in distributed memory clusters
– Steele cluster node: 2 8 (16) cores
– /proc/cpuinfo
• Shared memory hardware accelerators
– Cell processors: 1 PPE and 8 SPEs
– Nvidia Quadro GPUs: 128 processing units
OpenMP: Some syntax details to get us started
• Most of the constructs in OpenMP are compiler directives or pragmas.
– For C and C++, the pragmas take the form:
#pragma omp construct [clause [clause]…]
– For Fortran, the directives take one of the forms:
C$OMP construct [clause [clause]…]
!$OMP construct [clause [clause]…]
*$OMP construct [clause [clause]…]
• Include files
#include "omp.h"
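The directive form above can be sketched in C as follows. This is a minimal illustration, not code from the original slides: the function name count_region_threads is hypothetical, the construct is parallel, and the clauses are num_threads and reduction. The #ifdef guard is an assumption added so that the file still compiles when OpenMP is not enabled (with GCC, for example, OpenMP is enabled with -fopenmp).

```c
#ifdef _OPENMP
#include <omp.h>   /* OpenMP runtime API, only needed when OpenMP is enabled */
#endif

/* Hypothetical helper: runs one parallel region and returns how many
   threads executed it. */
int count_region_threads(int requested)
{
    int count = 0;

    /* construct: parallel
       clauses:   num_threads(requested), reduction(+:count) */
    #pragma omp parallel num_threads(requested) reduction(+:count)
    {
        count += 1;   /* each thread increments its own private copy */
    }

    /* The private copies are summed into count at the end of the region. */
    return count;
}
```

If the compiler does not recognize the pragma, it is ignored and the function simply returns 1, which is one reason directive-based parallelism can be added to a sequential program incrementally.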
How is OpenMP typically used?
• OpenMP is usually used to parallelize loops:
– Find your most time-consuming loops.
– Split them up between threads.
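As a rough sketch of what "splitting a loop between threads" looks like, the function below adds two arrays element by element. The name vector_add and its parameters are hypothetical and chosen only for illustration. The loop is assumed to have independent iterations, which is what makes it safe to parallelize this way.

```c
/* Hypothetical example: element-wise sum c[i] = a[i] + b[i]. */
void vector_add(const double *a, const double *b, double *c, int n)
{
    int i;

    /* Create a team of threads and divide the iterations among them.
       The loop variable i is made private to each thread automatically;
       a, b, c and n are shared. */
    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
    }
}
```

Removing the pragma gives back the sequential version unchanged, so the same source can be compiled with or without OpenMP.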
Sequential Program / Parallel Program
    void main() { int i, k, N=1000; double A[N], B[N], C[N]; for (i=0; i …

A structured block
    … 1) goto more;
    }
    printf(" All done \n");

Not a structured block
    if(count==1) goto more;
    #pragma omp parallel
    {
    more: do_big_job(id);
        if(++count>1) goto done;
    }
    done: if(!really_done()) goto more;
Structured Block Boundaries
• In C/C++: a block is a single statement or a group of statements between braces {}
    #pragma omp parallel
    {
        id = omp_get_thread_num();
        A[id] = big_compute(id);
    }

    #pragma omp for
    for (I=0;I …
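The two kinds of block boundary can be sketched as follows. This is an illustrative example under stated assumptions, not the original slide code: big_compute here is a stand-in that returns 10.0 * id, fill_per_thread and scale are hypothetical names, and the fallback definition of omp_get_thread_num exists only so the file compiles without OpenMP.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback (assumption): without OpenMP there is a single thread, id 0. */
static int omp_get_thread_num(void) { return 0; }
#endif

/* Stand-in for the slide's big_compute(); the real work is not specified. */
static double big_compute(int id) { return 10.0 * id; }

/* Block delimited by braces: every thread runs the whole group of statements.
   Returns the number of threads that wrote an entry of A. */
int fill_per_thread(double *A, int capacity)
{
    int used = 0;

    #pragma omp parallel reduction(+:used)
    {
        int id = omp_get_thread_num();   /* declared inside the block: private */
        if (id < capacity) {             /* guard against more threads than slots */
            A[id] = big_compute(id);
            used += 1;
        }
    }
    return used;
}

/* Block that is a single statement: the for loop following "omp for". */
void scale(double *x, int n, double s)
{
    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < n; i++)
            x[i] *= s;
    }
}
```

Declaring id inside the braces gives each thread its own copy, which avoids the race that a shared id variable would cause.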