Patterns for Parallel Computing

David Chou
david.chou@microsoft.com blogs.msdn.com/dachou

> Outline

An architectural conversation
• Concepts
• Patterns
• Design Principles
• Microsoft Platform

> Concepts

Why is this interesting?
• Amdahl’s law (1967)
• Multi-core processors
• Virtualization
• High-performance computing
• Distributed architecture
• Web-scale applications
• Cloud computing

→ Paradigm shift!

> Concepts

Parallel Computing == ??
• Simultaneous multi-threading (Intel Hyper-Threading, IBM Cell microprocessor for PS3, etc.)
• Operating system multitasking (cooperative, preemptive; symmetric multiprocessing, etc.)
• Server load-balancing & clustering (Oracle RAC, Windows HPC Server, etc.)
• Grid computing (SETI@home, Sun Grid, DataSynapse, Digipede, etc.)
• Asynchronous programming (AJAX, JMS, MQ, event-driven, etc.)
• Multi-threaded & concurrent programming (java.lang.Thread, System.Threading.Thread, Cilk, LabVIEW, etc.)
• Massively parallel processing (MapReduce, Hadoop, Dryad, etc.)

→ Elements and best practices in all of these

> Patterns

Types of Parallelism
• Bit-level parallelism (microprocessors)
• Instruction-level parallelism (compilers)
• Multiprocessing, multi-tasking (operating systems)
• HPC, clustering (servers)
• Multi-threading (application code)
• Data parallelism (massive distributed databases)
• Task parallelism (concurrent distributed processing)

→ Focus is moving “up” the technology stack…

> Patterns > HPC, Clustering

Clustering Infrastructure for High Availability

> Patterns > HPC, Clustering

High-Performance Computing
[Diagram: two browsers, two Web/App servers, and two A-Z data stores]

> Patterns > HPC, Clustering > Example

Microsoft.com
• Infrastructure and Application Footprint
– 7 Internet data centers & 3 CDN partnerships
– 120+ Websites, 1000s of apps and 2500 databases
– 20-30+ Gbits/sec Web traffic; 500+ Gbits/sec download traffic

• 2007 stats (microsoft.com):
– #9 ranked domain in U.S.; 54.0M UU for 36.0% reach
– #5 site worldwide; reaching 287.3M UU
– 15K req/sec, 35K concurrent connections on 80 servers
– 600 vroots, 350 IIS Web apps & 12 app pools
– Windows Server 2008, SQL Server 2008, IIS7, ASP.NET 3.5

• 2007 stats (Windows Update):
– 350M UScans/day, 60K ASP.NET req/sec, 1.5M concurrent connections
– 50B downloads for CY 2006
– Update egress: MS, Akamai, Level3 & Limelight (50-500+ Gbits/sec)

> Patterns > Multi-threading

Multi-threaded programming
[Diagram: execution time, sequential vs. concurrent]

> Patterns > Multi-threading

Multi-threading
• Typically, functional decomposition into individual threads
• But, explicit concurrent programming brings complexities
– Managing threads, semaphores, monitors, deadlocks, race conditions, mutual exclusion, synchronization, etc.

• Moving towards implicit parallelism
– Integrating concurrency & coordination into mainstream programming languages
– Developing tools to ease development
– Encapsulating parallelism in reusable components
– Raising the semantic level: new approaches
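
The contrast between explicit and more implicit concurrency can be sketched in a few lines. This is only an illustration in Python, not code from the talk; the function names (`add_many`, `run_explicit`, `count_up`, `run_with_pool`) are hypothetical. The first half manages threads and a lock by hand; the second half hands the same work to a reusable executor and avoids shared state altogether.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Explicit concurrency: the programmer owns the threads and the lock.
counter = 0
counter_lock = threading.Lock()

def add_many(n):
    """Increment the shared counter n times under mutual exclusion."""
    global counter
    for _ in range(n):
        with counter_lock:  # without the lock, concurrent updates can be lost
            counter += 1

def run_explicit(num_threads=4, n=10_000):
    """Start, then join, every thread by hand and return the final count."""
    global counter
    counter = 0
    threads = [threading.Thread(target=add_many, args=(n,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

# More implicit: the executor hides thread management, and each task
# returns a value instead of mutating shared state, so no lock is needed.
def count_up(n):
    return sum(1 for _ in range(n))

def run_with_pool(num_tasks=4, n=10_000):
    with ThreadPoolExecutor(max_workers=num_tasks) as pool:
        return sum(pool.map(count_up, [n] * num_tasks))
```

Both functions compute the same total; the second has no lock to forget and no thread lifecycle to manage.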

> Patterns > Multi-threading > Example

Photobucket
• 2007 stats:
– +30M searches processed / day
– 25M UU/month in US, +46M worldwide
– +7B images uploaded
– +300K unique websites link to content
– #31 top 50 sites in US
– #41 top 100 sites worldwide
– 18th largest ad supported site in US

• Scaling the performance:
– Browser handles concurrency
– Centralized lookup
– Horizontal partitioning of distributed content

[Diagram: Web Browser; API, PIC, and Thumbs content pods; Images, Albums, and Groups content pods; Metadata; Membership]

> Patterns > Data Parallelism

Data Parallelism
• Loop-level parallelism
• Focuses on distributing the data across different parallel computing nodes
– Denormalization, sharding, horizontal partitioning, etc.

• Each processor performs the same task on different pieces of distributed data
• Emphasizes the distributed (parallelized) nature of the data
• Ideal for data that is read more than written (scale vs. consistency)
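
As a minimal sketch of "the same task on different pieces of data", the following Python splits a list into slices and applies one function to every slice in parallel. The names `partition`, `sum_of_squares`, and `parallel_sum_of_squares` are made up for this illustration. Threads are used to keep the example self-contained; in CPython a process pool would be the usual choice for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(data, num_parts):
    """Split data into num_parts contiguous slices of near-equal size."""
    size, extra = divmod(len(data), num_parts)
    parts, start = [], 0
    for i in range(num_parts):
        end = start + size + (1 if i < extra else 0)
        parts.append(data[start:end])
        start = end
    return parts

def sum_of_squares(chunk):
    """The one task that every worker runs, each on its own slice."""
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(data, num_workers=4):
    chunks = partition(data, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    # Combining the per-slice results is the only sequential step.
    return sum(partials)
```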

> Patterns > Data Parallelism

Parallelizing Data in Distributed Architecture
[Diagram: browsers and Web/App servers in front of an Index and data stores, with the data shown as A-Z, then split into A-M / N-Z, then into A-G / H-M / N-S / T-Z]
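
The A-G / H-M / N-S / T-Z split can be expressed as a small range lookup. This is a sketch under assumptions: the slide does not say how keys are routed, and `SHARD_UPPER_BOUNDS`, `SHARD_NAMES`, and `shard_for` are hypothetical names. It routes a key by its first letter using a binary search over the shard boundaries.

```python
import bisect

# Hypothetical shard map matching the four-way alphabetical split.
# Each entry is the last letter that the corresponding shard covers.
SHARD_UPPER_BOUNDS = ["G", "M", "S", "Z"]
SHARD_NAMES = ["A-G", "H-M", "N-S", "T-Z"]

def shard_for(key):
    """Return the name of the shard responsible for the given key."""
    first = key[:1].upper()
    if not ("A" <= first <= "Z"):
        raise ValueError(f"key must start with a letter A-Z: {key!r}")
    # bisect_left finds the first boundary that is >= the key's first letter.
    return SHARD_NAMES[bisect.bisect_left(SHARD_UPPER_BOUNDS, first)]
```

Splitting a shard further only requires adding a boundary and a name to the two lists; callers of `shard_for` do not change.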

> Patterns > Data Parallelism > Example

Flickr
• 2007 stats:
– Serve 40,000 photos / second
– Handle 100,000 cache operations / second
– Process 130,000 database queries / second

• Scaling the “read” data:
– Data denormalization
– Database replication and federation
  • Vertical partitioning
  • Central cluster for index lookups
  • Large data sets horizontally partitioned as shards
  • Grow by binary hashing of user buckets
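
The slide does not spell out what "binary hashing of user buckets" means. One plausible reading, assumed here and not confirmed by the source, is that the number of buckets doubles and one more bit of a user's hash decides whether that user stays put or moves to the new sibling bucket. The sketch below (hypothetical names `bucket_for` and `moves_on_split`) shows why doubling is convenient: a user in bucket `b` of `n` can only land in `b` or `b + n` after the split.

```python
import zlib

def bucket_for(user_id, num_buckets):
    """Map a user to a bucket; num_buckets is assumed to be a power of two."""
    # crc32 is stable across runs, unlike Python's built-in hash() for str.
    h = zlib.crc32(user_id.encode("utf-8"))
    return h % num_buckets

def moves_on_split(user_ids, old_buckets):
    """Return the users whose bucket changes when the bucket count doubles."""
    return [u for u in user_ids
            if bucket_for(u, old_buckets) != bucket_for(u, old_buckets * 2)]
```

Because each old bucket feeds exactly two new ones, growth can proceed one bucket at a time without reshuffling users across unrelated shards.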

> Patterns > Data Parallelism > Example

MySpace
• 2007 stats:
– 115B pageviews/month
– 5M concurrent users @ peak
– +3B images, mp3, videos
– +10M new images/day
– 160 Gbit/sec peak bandwidth

• Scaling the “write” data:
– MyCache: distributed dynamic memory cache
– MyRelay: inter-node messaging transport handling +100K req/sec, directs reads/writes to any node
– MySpace Distributed File System: geographically redundant distributed storage providing massive concurrent access to images, mp3, videos, etc.
– MySpace Distributed Transaction Manager: broker for all non-transient writes to databases/SAN, multi-phase commit across data centers

> Patterns > Task Parallelism

Task Parallelism
• Functional parallelism
• Focuses on distributing execution processes (threads) across different parallel computing nodes
• Each processor executes a different thread (or process) on the same or different data
• Communication takes place usually to pass data from one thread to the next as part of a workflow
• Emphasizes the distributed (parallelized) nature of the processing (i.e. threads)
• Need to design how to compose partial output from concurrent processes
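
In contrast to data parallelism, task parallelism runs different functions at the same time. The following Python sketch, with hypothetical task names (`word_count`, `longest_word`, `char_histogram`, `analyze`), runs three unrelated analyses over the same text concurrently and then composes their partial outputs into one result.

```python
from concurrent.futures import ThreadPoolExecutor

def word_count(text):
    return len(text.split())

def longest_word(text):
    return max(text.split(), key=len, default="")

def char_histogram(text):
    hist = {}
    for ch in text.lower():
        if ch.isalpha():
            hist[ch] = hist.get(ch, 0) + 1
    return hist

def analyze(text):
    """Run three different tasks concurrently, then compose their outputs."""
    tasks = {"words": word_count,
             "longest": longest_word,
             "letters": char_histogram}
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = {name: pool.submit(fn, text) for name, fn in tasks.items()}
        # Composition step: gather each task's partial output under its name.
        return {name: fut.result() for name, fut in futures.items()}
```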

> Patterns > Task Parallelism > Example

Google
• 2007 stats:
– +20 petabytes of data processed / day by +100K MapReduce jobs
– 1 petabyte sort took ~6 hours on ~4K servers replicated onto ~48K disks
– +200 GFS clusters, each at 1-5K nodes, handling +5 petabytes of storage
– ~40 GB/sec aggregate read/write throughput across the cluster
– +500 servers for each search query < 500ms

• Scaling the process:
– MapReduce: parallel processing framework
– BigTable: structured hash database
– Google File System: massively scalable distributed storage
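
The shape of a MapReduce job can be shown with a toy word count. This is a single-machine sketch in Python, not Google's implementation or API; the names `map_phase`, `shuffle`, `reduce_phase`, and `mapreduce_word_count` are chosen for this example, and a thread pool stands in for the cluster.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def map_phase(document):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(mapped_outputs):
    """Group all emitted values by key, as a framework does between phases."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return groups

def reduce_phase(item):
    """Combine all values for one key into a single result."""
    key, values = item
    return key, sum(values)

def mapreduce_word_count(documents, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_phase, documents))    # parallel over splits
        groups = shuffle(mapped)
        return dict(pool.map(reduce_phase, groups.items()))  # parallel over keys
```

The map and reduce functions hold no shared state, which is what lets a real framework spread them over many machines and rerun them after a failure.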

> Design Principles

Parallelism for Speedup
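
Amdahl's law, listed under Concepts, gives an upper bound on the speedup that parallelism can deliver: if a fraction p of a program can run in parallel on n processors, the speedup is 1 / ((1 - p) + p / n). The helper below is a small illustration; the function name `amdahl_speedup` is an assumption of this sketch.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Speedup predicted by Amdahl's law: 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)
```

For a program that is 90% parallelizable, 10 processors give a speedup of about 5.3, and no number of processors can push it past 10, because the serial 10% remains.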

> Design Principles

Parallelism for Scale-up
• Sequential → Parallel
– Convert sequential and/or single-machine program into a form in which it can be executed in a concurrent, potentially distributed environment

• Over-decompose for scaling
– Structured multi-threading with a data focus

• Relax sequential order to gain more parallelism
– Ensure atomicity of unordered interactions

• Consider data as well as control flow
– Careful data structure & locking choices to manage contention
– Use parallel data structures
– Minimize shared data and synchronization

• Continuous optimization
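
Several of these principles can be combined in one short sketch. The Python below (hypothetical names `count_chunk` and `parallel_histogram`) over-decomposes the input into many more chunks than workers, lets each task build a private partial result so nothing is shared, and merges the results in whatever order they finish. Relaxing the order is safe here because adding counts is commutative.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor, as_completed

def count_chunk(chunk):
    """Build a private partial result; nothing is shared, so no lock is needed."""
    return Counter(chunk)

def parallel_histogram(items, workers=4, chunks_per_worker=8):
    # Over-decompose: many more chunks than workers, so that uneven
    # chunks do not leave workers idle.
    num_chunks = max(1, workers * chunks_per_worker)
    size = max(1, -(-len(items) // num_chunks))  # ceiling division
    chunks = [items[i:i + size] for i in range(0, len(items), size)]

    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(count_chunk, c) for c in chunks]
        # Relaxed ordering: merge in completion order, on one thread only.
        for fut in as_completed(futures):
            total.update(fut.result())
    return total
```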

> Design Principles > Example

Amazon
• Principles for Scalable Service Design (Werner Vogels, CTO, Amazon)
– Autonomy
– Asynchrony
– Controlled concurrency
– Controlled parallelism
– Decentralize
– Decompose into small well-understood building blocks
– Failure tolerant
– Local responsibility
– Recovery built-in
– Simplicity
– Symmetry

> Microsoft Platform

Parallel computing on the Microsoft platform
• Concurrent Programming (.NET 4.0 Parallel APIs)
• Distributed Computing (CCR & DSS Runtime, Dryad)
• Cloud Computing (Azure Services Platform)
• Grid Computing (Windows HPC Server 2008)
• Massive Data Processing (SQL Server “Madison”)

→ Components spanning a spectrum of computing models

> Microsoft Platform > Concurrent Programming

.NET 4.0 Parallel APIs

• Task Parallel Library (TPL)
• Parallel LINQ (PLINQ)
• Data Structures
• Diagnostic Tools

> Microsoft Platform > Distributed Computing

CCR & DSS Toolkit
• Concurrency & Coordination Runtime
• Decentralized Software Services
• Supporting multi-core and concurrent applications by facilitating asynchronous operations
• Dealing with concurrency, exploiting parallel hardware and handling partial failure
• Supporting robust, distributed applications based on a light-weight state-driven service model
• Providing service composition, event notification, and data isolation

> Microsoft Platform > Distributed Computing

Dryad

[Diagram: Dryad software stack. Components shown include sed, awk, grep, etc.; legacy code; C#; C++; Perl; PSQL; queries; vectors; machine learning; Distributed Shell; SSIS; DryadLINQ; SQL Server; Dryad; job queueing and monitoring; Distributed Filesystem; CIFS/NTFS; Cluster Services; and multiple Windows Server nodes]

• General-purpose execution environment for distributed, data-parallel applications
• Automated management of resources, scheduling, distribution, monitoring, fault tolerance, accounting, etc.
• Concurrency and mutual exclusion semantics transparency
• Higher-level and domain-specific language support

> Microsoft Platform > Cloud Computing

Azure Services Platform
[Diagram: multiple ASP.NET web roles, web service roles, and job worker roles; Table, Blob, Cache, and Queue storage services; SQL Data Services and BI Services (application and reference data); Service Bus (connectivity bindings); Access Control Service (identities & roles); Workflow Service (service orchestration)]

• Internet-scale, highly available cloud fabric
• Auto-provisioning 64-bit compute nodes on Windows Server VMs
• Massively scalable distributed storage (table, blob, queue)
• Massively scalable and highly consistent relational database
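
One common way to combine web roles, worker roles, and a queue is for the front end to enqueue jobs that back-end workers pick up independently. That arrangement is an assumption here; the slide names the pieces but does not describe the flow. The sketch below uses Python's in-process `queue.Queue` as a stand-in for a queue service and does not use any Azure API; `web_role`, `worker_role`, and `run` are hypothetical names.

```python
import queue
import threading

def web_role(job_queue, requests):
    """Front end: accept requests and enqueue them instead of doing the work."""
    for request in requests:
        job_queue.put(request)

def worker_role(job_queue, results, results_lock):
    """Back end: pull jobs off the queue until a stop marker arrives."""
    while True:
        job = job_queue.get()
        if job is None:                  # stop marker, a convention of this sketch
            break
        with results_lock:
            results.append(job.upper())  # placeholder for real processing

def run(requests, num_workers=3):
    job_queue = queue.Queue()
    results, results_lock = [], threading.Lock()
    workers = [threading.Thread(target=worker_role,
                                args=(job_queue, results, results_lock))
               for _ in range(num_workers)]
    for w in workers:
        w.start()
    web_role(job_queue, requests)
    for _ in workers:
        job_queue.put(None)              # one stop marker per worker
    for w in workers:
        w.join()
    return sorted(results)
```

Because the two sides only share the queue, the number of workers can change without touching the front end.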

> Microsoft Platform > Grid Computing

Windows HPC Server

• #10 fastest supercomputer in the world (top500.org)
– 30,720 cores
– 180.6 teraflops
– 77.5% efficiency

• Image multicasting-based parallel deployment of cluster nodes
• Fault tolerance with failover clustering of head node
• Policy-driven, NUMA-aware, multicore-aware, job scheduler
• Inter-process distributed communication via MS-MPI

> Microsoft Platform > Massive Data Processing

SQL Server “Madison”

• Massively parallel processing (MPP) architecture
• +500 TB to PB-scale databases
• “Ultra Shared Nothing” design
– IO and CPU affinity within symmetric multi-processing (SMP) nodes
– Multiple physical instances of tables w/ dynamic re-distribution
  • Distribute / partition large tables across multiple nodes
  • Replicate small tables
  • Replicate + distribute medium tables
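
The reason to distribute large tables and replicate small ones is that each node can then join its slice of the large table against a local copy of the small table, with no cross-node traffic. The Python below is a toy model of that idea, not SQL Server code; `node_for`, `distribute`, `replicate`, and `local_joins` are hypothetical names, and rows are plain tuples.

```python
import zlib

def node_for(key, num_nodes):
    """Pick the node that owns a row, using a stable hash of its key."""
    return zlib.crc32(str(key).encode("utf-8")) % num_nodes

def distribute(rows, key_index, num_nodes):
    """Hash-partition a large table so each node holds a disjoint slice."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[node_for(row[key_index], num_nodes)].append(row)
    return nodes

def replicate(rows, num_nodes):
    """Copy a small table to every node."""
    return [list(rows) for _ in range(num_nodes)]

def local_joins(large_parts, small_copies, large_key, small_key):
    """Each node joins its slice against its own copy of the small table."""
    joined = []
    for part, copy in zip(large_parts, small_copies):
        lookup = {row[small_key]: row for row in copy}
        joined.extend(big + lookup[big[large_key]]
                      for big in part if big[large_key] in lookup)
    return joined
```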

> Resources

For More Information
• Architect Council Website (blogs.msdn.com/sac)
– This series (blogs.msdn.com/sac/pages/council-2009q2.aspx)

• .NET 4.0 Parallel APIs (msdn.com/concurrency)
• CCR & DSS Toolkit (microsoft.com/ccrdss)
• Dryad (research.microsoft.com/dryad)
• Azure Services Platform (azure.com)
• SQL Server “Madison” (microsoft.com/madison)
• Windows HPC Server 2008 (microsoft.com/hpc)

Thank you!
david.chou@microsoft.com blogs.msdn.com/dachou

© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.


				