A Pattern Language for Parallel Programming

Shared by: Lingjuan Ma (posted 10/29/2011)
CSS 555 - Spring 2010

Fumitaka Kawasaki







CSS 555: Evaluating Software Design

Design Patterns



 A design pattern is a “solution to a problem in context”.

 Pattern name

 Description of the context

 Forces (Goals and constraints)

 Solution








Simple Example of a Pattern

 Name: FixFlatTire

 Context: Our car got a flat tire on the road.

 Goals and constraints: The goals and constraints at this point are efficiency and correctness.

 Solution:

1. Open the tire repair kit that you keep in your car just in case this sort of thing happens.

2. Remove the offending object from the tire; this usually requires pliers.

3. Take the rasp tool included in the kit, then quickly insert and remove it from the hole to roughen and clean the rubber.

4. Take the plug and cover it in cement. Both the plug and cement are included in the kit. Use the included insertion tool to stick the plug into the hole. About 1/2" of the plug should remain outside the tire.

5. Quickly pull the insertion tool straight out. This should leave the plug in the hole.

6. Cut the plug flush with the surrounding tire treads.

7. Remember that a plug is a temporary fix. You'll want to get the tire internally patched or replaced as soon as possible.


Design Patterns in Software Engineering

 Originally introduced to software engineering by Beck and Cunningham.

 Became prominent in the area of object-oriented programming.

 The book Design Patterns: Elements of Reusable Object-Oriented Software, by Gamma, Helm, Johnson, and Vlissides, affectionately known as the GoF (Gang of Four) book, gives a large collection of design patterns for object-oriented programming.




Example: Design Patterns in

Object-Oriented Programming (1)

 Creational patterns

 Abstract Factory groups object factories that

have a common theme.

 Builder constructs complex objects by separating

construction and representation.

 Factory Method creates objects without

specifying the exact class to create.

 Prototype creates objects by cloning an existing

object.

 Singleton restricts object creation for a class to

only one instance.


Example: Design Patterns in

Object-Oriented Programming (2)

 Structural patterns

 Adapter allows classes with incompatible interfaces to work

together by wrapping its own interface around that of an

already existing class.

 Bridge decouples an abstraction from its implementation so

that the two can vary independently.

 Composite composes zero-or-more similar objects so that

they can be manipulated as one object.

 Decorator dynamically adds/overrides behaviour in an

existing method of an object.

 Facade provides a simplified interface to a large body of

code.

 Flyweight reduces the cost of creating and manipulating a

large number of similar objects.

 Proxy provides a placeholder for another object to control

access, reduce cost, and reduce complexity




Example: Design Patterns in

Object-Oriented Programming (3)

 Behavioral patterns

 Chain of responsibility delegates commands to a chain of processing

objects.

 Command creates objects which encapsulate actions and parameters.

 Interpreter implements a specialized language.

 Iterator accesses the elements of an object sequentially without exposing

its underlying representation.

 Mediator allows loose coupling between classes by being the only class

that has detailed knowledge of their methods.

 Memento provides the ability to restore an object to its previous state

(undo).

 Observer is a publish/subscribe pattern which allows a number of observer

objects to see an event.

 State allows an object to alter its behavior when its internal state

changes.

 Strategy allows one of a family of algorithms to be selected on-the-fly at

runtime.

 Template method defines the skeleton of an algorithm as an abstract

class, allowing its subclasses to provide concrete behavior.

 Visitor separates an algorithm from an object structure by moving the

hierarchy of methods into one object.
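
As an illustration of one entry in the list above, here is a minimal Python sketch of the Observer pattern. The class name `Subject` and its methods `subscribe` and `publish` are hypothetical names chosen for this example; they do not come from the slides or from the GoF book.

```python
class Subject:
    """Publisher side of a publish/subscribe relationship (illustrative only)."""

    def __init__(self):
        self._observers = []  # callables to notify, in subscription order

    def subscribe(self, callback):
        # Register an observer; any callable taking one argument will do.
        self._observers.append(callback)

    def publish(self, event):
        # Every registered observer gets to see the same event.
        for callback in self._observers:
            callback(event)


log = []
subject = Subject()
subject.subscribe(log.append)
subject.subscribe(lambda event: log.append(event.upper()))
subject.publish("saved")
```

The subject knows nothing about its observers beyond the fact that they can be called, which is the loose coupling the pattern is meant to provide.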


What is a Pattern Language?

 Many patterns together form a language.

 The term pattern language was introduced by Christopher Alexander in 1977 to refer to common problems in the design and construction of buildings and towns.

 Patterns are organized into a structure so that the user can design a complex system by working through the collection of patterns.

 Thus, a pattern language embodies a design methodology and provides domain-specific advice to the designer.




A Pattern Language for

Parallel Programming










Why Parallel Programming?

 To solve a given problem in less time.

 To solve bigger problems within a given

amount of time.

 To achieve better solutions for a given problem within a given amount of time.








Difficulties in Parallel

Programming (1)

 Parallel programs are more difficult to write than sequential ones.

 Concurrency introduces several new classes of potential software bugs, such as race conditions.

 Communication and synchronization between tasks are the main obstacles to good performance.
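
To make the race-condition point concrete, here is a minimal sketch using Python's standard `threading` module. The names `Counter` and `run_workers` are hypothetical and exist only for this example. The lock serializes the read-modify-write on the shared counter, which is also a small instance of the synchronization cost mentioned above.

```python
import threading


class Counter:
    """Shared counter whose increments are protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The read-modify-write below is the critical section. Without the
        # lock, two threads could read the same old value and lose an update.
        with self._lock:
            self.value += 1


def run_workers(num_threads, increments_per_thread):
    """Start num_threads threads that each increment one shared counter."""
    counter = Counter()

    def work():
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value
```

With the lock in place, the final value always equals `num_threads * increments_per_thread`; removing the lock makes lost updates possible, although whether they actually show up depends on the interpreter and on timing.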






Difficulties in Parallel

Programming (2)

 A variety of parallel architectures, memory architectures, and parallel programming environments.

 Parallel Architectures:

   Single Instruction, Single Data (SISD)

   Single Instruction, Multiple Data (SIMD)

   Multiple Instruction, Single Data (MISD)

   Multiple Instruction, Multiple Data (MIMD): SPMD, MPMD

 Memory Architectures:

   Shared Memory: SMP, NUMA

   Distributed Memory: MPP, Cluster

   Hybrid Systems

 Parallel Programming Environments:

   OpenMP, MPI, Java, CUDA, OpenCL

 Specialized Parallel Computers:

   FPGA, GPGPU, ASIC, Vector Processors








A Pattern Language for

Parallel Programming

 Finding Concurrency: The Finding Concurrency design space is concerned with structuring the problem, where the available concurrency is identified and exposed for use in the algorithm design phase.

 Algorithm Structure: The Algorithm Structure design space is concerned with structuring the algorithm, where high-level structures for organizing a parallel algorithm are identified.

 Supporting Structures: The Supporting Structures design space is an intermediate phase between algorithms and source code, where the organization of the parallel program and the management of shared memory are considered.

 Implementation Mechanisms: The Implementation Mechanisms design space is concerned with how the patterns of the higher-level spaces are mapped into particular programming environments.

Figure 1: Overview of the pattern language (Finding Concurrency, then Algorithm Structure, then Supporting Structures, then Implementation Mechanisms)






The Finding Concurrency Design Space

Figure 2: Overview of the Finding Concurrency design space. Patterns: Decomposition (Task Decomposition, Data Decomposition); Dependency Analysis (Group Tasks, Order Tasks, Data Sharing); Design Evaluation.


Example: The Task Decomposition Pattern

In this pattern, we decompose a problem into tasks that can execute concurrently.

Context: The first step in designing a parallel algorithm is a good understanding of the target problem: identifying the computationally intensive parts of the problem, the key data structures, and the relationships among them. Task and data decomposition is the next step of the design process. Finding the available concurrency in tasks and suitable algorithms is challenging. Sometimes it is easier to focus on the data, decompose the data, and then identify the tasks related to the data. In any case, tasks must be identified because parallel algorithms need them.

Forces: The goals and constraints at this point are flexibility (in terms of the number of tasks generated), efficiency (minimizing task creation and context-switch overhead while keeping all the processors fully occupied), and simplicity (so that tasks can be debugged and maintained easily).

Solution: There are two keys to effective task decomposition: the independence of tasks and the load balancing of tasks. That is, the tasks should be sufficiently independent that managing dependencies takes minimal effort, and the work should be evenly distributed among them. A good strategy for identifying tasks is to start with too many tasks and later merge them. Common ways of finding tasks are functional decomposition (each task corresponds to a distinct function call), loop splitting (distinct iterations of a loop are mapped onto tasks), and data-driven decomposition (each task updates a different chunk of a large data structure).
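
The loop-splitting idea in the Solution can be sketched as follows, assuming Python and its standard `concurrent.futures` module. The function names `split_range`, `sum_of_squares_chunk`, and `parallel_sum_of_squares`, and the sum-of-squares workload itself, are hypothetical choices for this example. The sketch deliberately creates more tasks than workers, in the spirit of "start with too many tasks and later merge them".

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(n, num_tasks):
    """Split range(n) into num_tasks contiguous chunks of near-equal size."""
    base, extra = divmod(n, num_tasks)
    chunks = []
    start = 0
    for i in range(num_tasks):
        size = base + (1 if i < extra else 0)  # spread the remainder evenly
        chunks.append(range(start, start + size))
        start += size
    return chunks


def sum_of_squares_chunk(chunk):
    """One task: the original loop body applied to one chunk of iterations."""
    return sum(i * i for i in chunk)


def parallel_sum_of_squares(n, num_tasks=8, num_workers=4):
    """Map independent chunks onto a pool of workers and combine the results."""
    chunks = split_range(n, num_tasks)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return sum(pool.map(sum_of_squares_chunk, chunks))
```

The chunks share no state, so the tasks are independent, and near-equal chunk sizes address load balancing. In CPython, threads mainly illustrate the structure here; a process pool or a compiled language would be needed for an actual speedup on CPU-bound work.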


The Algorithm Structure Design Space

Figure 3: Overview of the Algorithm Structure design space. Patterns: Organize By Tasks (Task Parallelism, Divide and Conquer); Organize By Data Decomposition (Geometric Decomposition, Recursive Data); Organize By Flow of Data (Pipeline, Event-Based Coordination).
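
As one example from the "Organize By Flow of Data" group, the Pipeline pattern can be sketched with one thread per stage and a queue between neighbouring stages. This is a minimal illustration using Python's standard `queue` and `threading` modules; the names `stage`, `run_pipeline`, and the `_DONE` sentinel are assumptions made for this example.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the data stream


def stage(func, inbox, outbox):
    """Run one pipeline stage: apply func to each item until the sentinel."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)  # pass the shutdown signal downstream
            return
        outbox.put(func(item))


def run_pipeline(items, funcs):
    """Push items through the stages in funcs; return the outputs in order."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    threads = [
        threading.Thread(target=stage, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(funcs)
    ]
    for thread in threads:
        thread.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(_DONE)

    results = []
    while True:
        out = queues[-1].get()
        if out is _DONE:
            break
        results.append(out)
    for thread in threads:
        thread.join()
    return results
```

Because each stage is a single thread reading from a FIFO queue, items leave the pipeline in the order they entered, while different stages can work on different items at the same time.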


The Supporting Structures Design Space

Figure 4: Overview of the Supporting Structures design space. Patterns: Program Structures (SPMD, Master/Worker, Loop Parallelism, Fork/Join); Data Structures (Shared Data, Shared Queue, Distributed Array).


The Supporting Structures Design Space (cont.)

                  Task         Divide and  Geometric      Recursive  Pipeline  Event-Based
                  Parallelism  Conquer     Decomposition  Data                 Coordination
SPMD              ****         ***         ****           **         ***       **
Loop Parallelism  ****         **          ***
Master/Worker     ****         **          *              *          *         *
Fork/Join         **           ****        **                        ****      ****

Table 1: Relationship between Supporting Structures and Algorithm Structure








The Supporting Structures Design Space (cont.)

                  OpenMP  MPI   Java
SPMD              ***     ****  **
Loop Parallelism  ****    *     ***
Master/Worker     **      ***   ***
Fork/Join         ***           ****

Table 2: Relationship between Supporting Structures and Programming Environment










The Implementation Mechanisms Design Space

Figure 5: Overview of the Implementation Mechanisms design space. Patterns: UE (unit of execution) Management, Synchronization, Communication.


Conclusion

 Patterns help us describe expert solutions to

parallel programming.

 They give us a language to describe the

architecture of parallel software.

 They provide a roadmap to the frameworks

we need to support general purpose

programmers.

 And they give us a way to systematically map programming languages onto parallel algorithms.


Questions?










A Pattern Language for

Parallel Programming ver2.0

 The patterns will be changing.

 Interested readers should consult the

link for updates.

http://parlab.eecs.berkeley.edu/wiki/patterns/patterns










Architectural Patterns

These patterns define the overall architecture for a program.

 Pipe-and-filter: view the program as filters (pipeline stages) connected by pipes (channels). Data flows through the filters, each of which takes input and transforms it into output.

 Agent and Repository: a collection of autonomous agents updates state managed on their behalf in a central repository.

 Process control: the program is structured analogously to a process control pipeline with

monitors and actuators moderating feedback loops and a pipeline of processing stages.

 Event based implicit invocation: The program is a collection of agents that post events

they watch for and issue events for other agents. The architecture enforces a high level

abstraction so invocation of an agent is implicit; i.e. not hardwired to a specific

controlling agent.

 Model-view-controller: An architecture with a central model for the state of the program,

a controller that manages the state and one or more agents that export views of the

model appropriate to different uses of the model.

 Bulk Iterative (AKA bulk synchronous): A program that proceeds iteratively ... update

state, check against a termination condition, complete coordination, and proceed to the

next iteration.

 Map reduce: the program is represented in terms of two classes of functions. One class

maps input state (often a collection of files) into an intermediate representation. These

results are collected and processed during a reduce phase.

 Layered systems: an architecture composed of multiple layers that enforces a separation

of concerns wherein (1) only adjacent layers interact and (2) interacting layers are only

concerned with the interfaces presented by other layers.

 Arbitrary static task graph: the program is represented as a graph that is statically

determined meaning that the structure of the graph does not change once the

computation is established. This is a broad class of programs in that any arbitrary graph

can be used.
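
The Map reduce entry above can be illustrated with a small word-count sketch in Python. The function names `map_word_counts`, `reduce_counts`, and `word_count` are hypothetical, and in-memory strings stand in for the "collection of files" mentioned in the description; this is a sketch of the structure, not of any particular MapReduce framework.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_word_counts(text):
    """Map phase: turn one document into an intermediate word-count table."""
    return Counter(text.lower().split())


def reduce_counts(left, right):
    """Reduce phase: merge two intermediate tables into one."""
    return left + right


def word_count(documents, num_workers=4):
    """Apply the map function to every document, then reduce the results."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(map_word_counts, documents))
    return reduce(reduce_counts, partials, Counter())
```

The map calls are independent of each other, so they can run concurrently; only the reduce phase needs to see more than one document's results.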




Computational Patterns

These patterns describe computations that define the components in a program's architecture.

 Backtrack, branch and bound: Used in search problems ... where instead of exploring all possible points

in the search space, we continuously divide the original problem into smaller subproblems, evaluate

characteristics of the subproblems, set up constraints according to the information at hand, and

eliminate subproblems that do not satisfy the constraints.

 Circuits: used for bit level computations, representing them as Boolean logic or combinational circuits

together with state elements such as flip-flops.

 Dynamic programming: recursively split a larger problem into subproblems, but with memoization to reuse past subsolutions.

 Dense linear algebra: represent a problem in terms of dense matrices using standard operations defined in terms of the Basic Linear Algebra Subprograms (BLAS).

 Finite state machine: Used in problems for which the system can be described by a language of strings.

The problem is to define a piece of software that distinguishes between valid input strings (associated

with proper behavior) and invalid input strings (improper behavior).

 Graph algorithms: a diverse collection of algorithms that operate on graphs. Solutions involve preparing

the best representation of the problem as a graph, and developing a graph traversal that captures the

desired computation.

 Graphical models: probabilistic reasoning problems where the problem is defined in terms of probability

distributions represented as a graphical model.

 Monte Carlo: A large class of problems where the computation is replicated over a large space of

parameters. In many cases, random sampling is used to avoid exhaustive search strategies.

 N-body: Problems in which each member of a system depends on the state of every other member of the system. The problems typically involve some scheme to approximate the naive O(N^2) exhaustive sum.

 Sparse Linear Algebra: Problems represented in terms of sparse matrices. Solutions may be iterative or direct.

 Spectral methods: Problems for which the solution is easier to compute once the domain has been

transformed into a different representation. Examples include Z-transform, FFT, DCT, etc. The transform

itself is included in this class of problems.

 Structured mesh: Problem domains are mapped onto a regular mesh and solutions computed as averages

over neighborhoods of points (explicit methods) or as solutions to linear systems of equations (implicit

methods)

 Unstructured mesh: The same as the structured mesh problems, but the mesh lacks structure and hence the computations involve scatter and gather operations.
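
As a small illustration of the Dynamic programming entry above, the following Python sketch computes the length of a longest common subsequence by recursive splitting with memoization. The problem choice and the names `lcs_length` and `solve` are assumptions made for this example; `functools.lru_cache` is the standard-library memoization decorator.

```python
from functools import lru_cache


def lcs_length(a, b):
    """Length of a longest common subsequence of sequences a and b."""

    @lru_cache(maxsize=None)  # memoization: each (i, j) subproblem is solved once
    def solve(i, j):
        if i == len(a) or j == len(b):
            return 0  # one sequence is exhausted
        if a[i] == b[j]:
            return 1 + solve(i + 1, j + 1)
        # Split into two smaller subproblems and keep the better answer.
        return max(solve(i + 1, j), solve(i, j + 1))

    return solve(0, 0)
```

Without the cache, the same `(i, j)` subproblems would be recomputed exponentially many times; with it, the work is bounded by the number of distinct subproblems, `len(a) * len(b)`.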




Algorithm Patterns

These patterns describe parallel algorithms used to implement the

computational patterns.

 Task parallelism: Parallelism is expressed as a collection of explicitly

defined tasks. This pattern includes the embarrassingly parallel

pattern (no dependencies) and separable dependency pattern

(replicated data/reduction).

 Data parallelism: Parallelism is expressed as a single stream of tasks

applied to each element of a data structure. This is generalized as an

index space with the stream of tasks applied to each point in the

index space.

 Recursive splitting: A problem is recursively split into smaller problems until the problem is small enough to solve directly. This includes the divide and conquer pattern as a subset, wherein the final result is produced by reversing the splitting process to assemble solutions to the leaf-node problems into the final global result.

 Pipeline: Fixed coarse grained tasks with data flowing between them.

 Geometric decomposition: A problem is expressed in terms of a

domain that is decomposed spatially into smaller chunks. Solution is

composed of updates across chunk boundaries, updates of local

chunks, and then updates to the boundaries of the chunks.

 Discrete event: a collection of tasks that coordinate among

themselves through discrete events. This pattern is often used for GUI

design and discrete event simulations.

 Graph partitioning: Tasks generated by decomposing recursive data

structures (graphs)
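
The Recursive splitting entry above, in its divide-and-conquer form, can be sketched as a merge sort that forks a thread for one half of each split down to a fixed depth. This is a minimal illustration in Python; the names `merge` and `parallel_merge_sort` and the `depth` cutoff are assumptions made for this example.

```python
import threading


def merge(left, right):
    """Combine two sorted lists into one sorted list."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out


def parallel_merge_sort(data, depth=2):
    """Sort data, forking a thread per split until depth reaches zero."""
    if len(data) <= 1:
        return list(data)  # small enough to solve directly
    mid = len(data) // 2
    if depth <= 0:
        left = parallel_merge_sort(data[:mid], 0)
        right = parallel_merge_sort(data[mid:], 0)
    else:
        result = {}

        def sort_left():
            result["left"] = parallel_merge_sort(data[:mid], depth - 1)

        worker = threading.Thread(target=sort_left)
        worker.start()  # fork: the left half is sorted concurrently
        right = parallel_merge_sort(data[mid:], depth - 1)
        worker.join()   # join: wait for the left half before merging
        left = result["left"]
    return merge(left, right)
```

The `merge` step is the "reversing the splitting process" part of the description: leaf results are assembled pairwise back into the global result. The depth cutoff keeps the number of threads bounded at roughly `2 ** depth`.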


Software Structure Patterns

Program structure

 SPMD: One program is used by all the threads or processes, but each one, based on its ID, executes a different path or works on a different segment of the data.

 Strict data parallel: A single instruction stream is applied to multiple data

elements. This includes vector processing as a subset.

 Loop level parallelism: Parallelism is expressed in terms of loop iterations that

are mapped onto multiple threads or processes.

 Fork/join: Threads are logically created (forked), used to carry out a

computation, and then terminated (joined).

 Master-worker/Task-queue: A master sets up a collection of work-items (tasks); a collection of workers pull work-items from the master (a task-queue), carry out the computation, and then go back to the master for more work.

 Actors: a collection of active software agents (the actors) interact over

distinct channels.

 BSP: The Bulk Synchronous model from Leslie Valiant.
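
The SPMD entry in the list above can be sketched in Python as follows. Every thread runs the same function and uses only its ID (called `rank` here) to decide which elements it owns. The names `spmd_vector_add`, `program`, and `num_units` are hypothetical, and threads stand in for whatever units of execution a real environment would provide.

```python
import threading


def spmd_vector_add(a, b, num_units=4):
    """Add two equal-length lists element by element, SPMD style."""
    n = len(a)
    result = [None] * n

    def program(rank):
        # Every unit executes this same code; only `rank` differs.
        # Cyclic distribution: unit r owns indices r, r + num_units, ...
        for i in range(rank, n, num_units):
            result[i] = a[i] + b[i]

    units = [threading.Thread(target=program, args=(rank,))
             for rank in range(num_units)]
    for unit in units:
        unit.start()
    for unit in units:
        unit.join()
    return result
```

No two units write the same index, so no locking is needed for `result`.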

Data Structure Patterns

 Shared queue: this pattern describes ways to implement any of the common queue data structures and manage them in parallel.

 Distributed array: An array data type that is distributed among the threads or processes involved in a parallel computation.

 Shared hash table: A hash table shared/distributed among a set of threads or

processes with any concurrency issues hidden behind an API.

 Shared data: a “catch all” pattern for cases where data is shared within a shared memory region but the data cannot be represented in terms of a well-defined and common high-level data structure.
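
The Shared queue pattern pairs naturally with the Master-worker/Task-queue program structure. The sketch below uses Python's thread-safe `queue.Queue` as the shared queue; the function name `master_worker` and its parameters are assumptions made for this example.

```python
import queue
import threading


def master_worker(tasks, func, num_workers=3):
    """Apply func to every task using workers that pull from a shared queue."""
    task_queue = queue.Queue()       # the shared queue of work-items
    results = {}
    results_lock = threading.Lock()  # protects the shared results table

    # Master: enqueue every work-item before any worker starts.
    for index, task in enumerate(tasks):
        task_queue.put((index, task))

    def worker():
        while True:
            try:
                index, task = task_queue.get_nowait()
            except queue.Empty:
                return  # all work was queued up front, so empty means done
            value = func(task)
            with results_lock:
                results[index] = value

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return [results[i] for i in range(len(tasks))]
```

Workers that finish early simply pull the next item, so the load balances itself even when tasks take unequal time. Treating an empty queue as "finished" is only valid here because the master enqueues everything before the workers start.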






Execution Patterns

Process/thread control patterns

 CSP or Communicating Sequential Processes: Sequential processes execute independently

and coordinate their execution through discrete communication events.

 Data flow: sequential processes organized into a static network with data flowing

between them.

 Task-graph: A directed acyclic graph of threads or processes is defined in software and

mapped onto the elements of a parallel computer.

 SIMD: A single stream of program instructions executes in parallel for different lanes in a data structure. There is only one program counter for a SIMD program. This pattern includes vector computations.

 Thread pool: The system maintains a pool of threads that are utilized dynamically to

satisfy the computational needs of a program. The pool of threads work on queues of

tasks. Work stealing is often used to enforce a more balanced load.

 Speculation: a thread or process is launched to pursue a computation, but any update to

the global state is held in reserve to be entered once the computation is verified as valid.

Coordination Patterns

 Message passing: two sided and one sided message passing

 Collective communication: reductions, broadcasts, prefix sums, scatter/gather etc.

 Mutual exclusion: mutexes and locks

 Point to point synchronization: condition variables, semaphores

 Collective synchronization: e.g. barriers

 Transactional memory: transactions with roll-back to handle conflicts.
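
As a sketch of collective synchronization from the list above, the following Python example uses the standard `threading.Barrier` to separate a "publish" phase from a "read" phase, so that every thread ends up with the sum of all contributions. The function name `barrier_all_sum` and the two-phase layout are assumptions made for this example, and `values` is assumed to be non-empty.

```python
import threading


def barrier_all_sum(values):
    """Each thread contributes one value; every thread computes the total."""
    n = len(values)
    slots = [0] * n       # one private slot per thread for its contribution
    totals = [None] * n   # what each thread sees after the barrier
    barrier = threading.Barrier(n)

    def unit(rank):
        slots[rank] = values[rank]  # phase 1: publish the local contribution
        barrier.wait()              # nobody proceeds until all have published
        totals[rank] = sum(slots)   # phase 2: safe to read every slot

    threads = [threading.Thread(target=unit, args=(rank,)) for rank in range(n)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return totals
```

Without the barrier, a fast thread could compute its sum before a slower thread had written its slot. The same publish, synchronize, read shape is what a reduction followed by a broadcast achieves in message-passing environments.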









