What is Grid Computing?
Cevat Şener
Dept. of Computer Engineering, METU
Why Do We Need It?
Our computational needs are infinite, whereas our financial resources are finite.
Users will always want more and more powerful computers.
Try to utilize the potentially hundreds of thousands of computers that are interconnected in some unified way.
Need seamless access to remote resources.
February 2007 2
Evolution
[Figure: evolution of computing platforms, ordered by increasing Performance + QoS: Personal; SMP, Super; Cluster; Cluster of Clusters; The Global Grid]
What is Grid?
An infrastructure that couples
Computers (e.g., PCs, clusters, ...)
Software (e.g., special purpose applications)
Databases (e.g., access to human genome database)
Special Instruments (e.g., radio telescope)
People (e.g., researchers)
across the Internet and presents them as a unified, integrated (single) resource.
An Analogy
“The (Computational) Grid is analogous to the Electricity (Power) Grid, and the vision is to offer dependable, consistent, pervasive, and inexpensive access to high-end resources irrespective of their physical location and the location of access.”
The Grid Impact!
“The global computational grid is
expected to drive the economy of the 21st century similar to the electric power grid that drove the economy of the 20th century”
The Internet and …
[Diagram: networks are joined into internetworks, which are joined in turn into the internetwork of internetworks (The Internet)]
… The Grid
[Diagram: clusters are joined into clusters of clusters, which are joined in turn into the cluster of clusters (The Grid)]
Grid and Web Services Standards
Grid and Web services started far apart in applications and technology.
They have been converging, most recently on WSRF (the WS-Resource Framework).
Convergence of core technology standards allows a common base for business and technology services.
The Value of Open Standards
Distributed Computing: Grid (Globus OGSA)
Applications: Web Services (SOAP, WSDL, UDDI)
Operating System: Linux
Information: World Wide Web (HTML, HTTP, J2EE, XML)
Communications: e-mail (POP3, SMTP, MIME)
Networking: The Internet (TCP/IP)
Standards Involved
SOA Standards: WSDL, UDDI, BPEL, WS-Profile, WS-Security, WS-Choreography, and many others…
Grid Standards: OGSI (extension to WSDL), WS-Resource, WS-ResourceLifetime, WS-ResourceProperties, WS-RenewableReferences, WS-ServiceGroup, WS-BaseFaults
Computational Grids
A network of geographically distributed resources.
Each user should have a single login account to access all resources.
Resources may be owned by diverse organizations.
Computational Grids
Grids are typically managed by grid middleware (gridware).
Gridware can be viewed as a special type of middleware that enables sharing and manages grid components based on user requirements and resource attributes (e.g., capacity, performance, availability…).
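A minimal sketch of how gridware might match a user requirement against resource attributes. The `Resource` and `Requirement` classes, their field names, and the sample values are hypothetical and chosen only for illustration; real grid middleware uses far richer descriptions and policies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical resource description (not taken from any real gridware)."""
    name: str
    cpus: int          # capacity
    gflops: float      # performance
    available: bool    # availability


@dataclass(frozen=True)
class Requirement:
    """Hypothetical user requirement."""
    min_cpus: int
    min_gflops: float


def match(req, resources):
    """Return the available resources that satisfy req, fastest first."""
    candidates = [
        r for r in resources
        if r.available and r.cpus >= req.min_cpus and r.gflops >= req.min_gflops
    ]
    return sorted(candidates, key=lambda r: r.gflops, reverse=True)


resources = [
    Resource("cluster-a", 64, 500.0, True),
    Resource("cluster-b", 256, 2000.0, False),  # fast, but not available
    Resource("smp-c", 16, 120.0, True),         # available, but too small
    Resource("cluster-d", 128, 900.0, True),
]
selected = match(Requirement(min_cpus=32, min_gflops=400.0), resources)
print([r.name for r in selected])
```

Sorting by performance is just one possible policy; a scheduler could equally rank by cost, load, or ownership.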
Methods of Grid Computing
Distributed Supercomputing
High-Throughput Computing
On-Demand Computing
Data-Intensive Computing
Collaborative Computing
Logistical Networking
Distributed Supercomputing
Combining multiple high-capacity resources on a computational grid into a single, virtual distributed supercomputer.
Tackle problems that cannot be solved on a single system.
High-Throughput Computing
Uses the grid to schedule large numbers of
loosely coupled or independent tasks, with the goal of putting unused processor cycles to work.
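A minimal sketch of the high-throughput pattern: a bag of independent tasks handed to whichever worker is idle. Local threads stand in for grid nodes here, and `screen_compound` with its placeholder arithmetic is a hypothetical task; a real grid scheduler would dispatch jobs to remote machines instead.

```python
from concurrent.futures import ThreadPoolExecutor


def screen_compound(compound_id):
    """Hypothetical independent task: return (id, score) using placeholder arithmetic."""
    return compound_id, (compound_id * 37) % 101


def run_bag_of_tasks(task_ids, workers=4):
    """Run every task on a pool of workers and collect the scores by task id."""
    # The tasks share no state, so any idle worker may take the next one.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(screen_compound, task_ids))


scores = run_bag_of_tasks(range(1000))
print(len(scores), scores[3])
```

Because the tasks do not communicate, adding workers only changes how quickly the bag is emptied, not the results.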
On-Demand Computing
Uses grid capabilities to meet short-term requirements for resources that are not locally accessible.
Models real-time computing demands.
Data-Intensive Computing
The focus is on synthesizing new information from data that is maintained in geographically distributed repositories, digital libraries, and databases.
Particularly useful for distributed data mining.
Collaborative Computing
Concerned primarily with enabling and enhancing human-to-human interactions.
Applications are often structured in terms of a virtual shared space.
Logistical Networking
Global scheduling and optimization of data movement.
Contrasts with traditional networking, which does not explicitly model storage resources in the network.
Called "logistical" because of the analogy it bears with the systems of warehouses, depots, and distribution channels.
Who Needs Grid Computing?
A chemist may utilize hundreds of processors to screen thousands of compounds per hour.
Teams of engineers worldwide pool resources to analyze terabytes of structural data.
Meteorologists seek to visualize and analyze petabytes of climate data with enormous computational demands.
...
More and More Application Areas
High Energy Physics
Biomedicine
Earth Sciences
Computational Chemistry
Astronomy
Geo-Physics
Financial Simulation
...
An Example: LHC from EGEE
The Large Hadron Collider (LHC) is located at CERN, Geneva, Switzerland.
Scheduled to go into production in 2007.
Will generate 10 Petabytes of information per year.
This information must be processed and stored somewhere.
It is beyond the scope of a single institution to manage this problem.
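To put the 10 Petabytes per year figure in perspective, the following sketch converts it into an average sustained data rate. It assumes decimal units (1 PB = 10^15 bytes) and a 365-day year; the slide states neither, and a real detector produces data in bursts rather than at a constant rate.

```python
PETABYTE = 10**15                     # assumption: decimal petabyte
SECONDS_PER_YEAR = 365 * 24 * 3600    # assumption: 365-day year


def average_rate_mb_per_s(petabytes_per_year):
    """Average data rate in megabytes (10**6 bytes) per second."""
    return petabytes_per_year * PETABYTE / SECONDS_PER_YEAR / 10**6


rate = average_rate_mb_per_s(10)
print(f"{rate:.0f} MB/s")  # prints "317 MB/s"
```

Under these assumptions the data would have to be moved, stored, and processed at roughly 317 MB/s around the clock, which helps explain why the load is spread across many institutions.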
Grid People
Grid developers
Tool developers
Application developers
End Users
System Administrators
Grid Developers
Very small group.
Implementers of a grid “protocol” who provide the basic services required to construct a grid.
Tool Developers
Implement the programming models used by application developers.
Implement basic services similar to conventional computing services: user authentication/authorization, process management, data access and communication.
Tool Developers
Also implement new (grid) services such as: resource location, fault detection, security, electronic payment.
Application Developers
Construct grid-enabled applications for end-users, who should be able to use these applications without concern for the underlying grid.
Provide programming models that are appropriate for grid environments and services that programmers can rely on when developing (higher-level) applications.
System Administrators
Balance local and global concerns.
Manage grid components and infrastructure.
Some tasks still not well delineated due to
the high degree of sharing required.
Grid Architecture
[Diagram labels: Applications; Diverse global services; User Applications; Collective services; Core Services and Abstractions; Resource and Connectivity protocol; Fabric; Local OS]
Workflows as Application Model
An application is developed as a workflow containing one or more jobs.
Connections among jobs are all off-line, through files.
[Figure: a workflow drawn as a DAG (directed acyclic graph)]
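A minimal sketch of such a workflow, assuming each job declares the files it reads and writes. The job and file names are hypothetical. A job depends on whichever job produces one of its input files, and a topological sort of the resulting DAG gives a valid execution order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: jobs are connected only through the files they exchange.
jobs = {
    "preprocess": {"reads": ["raw.dat"], "writes": ["clean.dat"]},
    "simulate": {"reads": ["clean.dat"], "writes": ["sim.out"]},
    "analyze": {"reads": ["clean.dat"], "writes": ["stats.out"]},
    "report": {"reads": ["sim.out", "stats.out"], "writes": ["report.txt"]},
}


def build_dependencies(jobs):
    """Map each job to the set of jobs that produce its input files."""
    producer = {f: name for name, job in jobs.items() for f in job["writes"]}
    return {
        name: {producer[f] for f in job["reads"] if f in producer}
        for name, job in jobs.items()
    }


deps = build_dependencies(jobs)
order = list(TopologicalSorter(deps).static_order())
print(order)
```

Files with no producer (here `raw.dat`) are treated as external inputs. `TopologicalSorter` raises `CycleError` if the declared files imply a cycle, which would mean the workflow is not a DAG.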
Workflows as Application Model
Jobs could be executed sequentially or in parallel.
A job may contain tasks interconnected through on-line MPI calls.
[Figure: sequential and parallel job execution]
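A minimal sketch of mixing sequential and parallel execution: jobs whose predecessors have all finished form a wave that runs in parallel, and the waves themselves run one after another. The job graph and `run_job` are hypothetical, local threads stand in for grid nodes, and the MPI communication inside a job is not modeled.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical job graph: each job maps to the set of jobs it must wait for.
deps = {
    "prepare": set(),
    "sim_a": {"prepare"},
    "sim_b": {"prepare"},
    "merge": {"sim_a", "sim_b"},
}


def run_job(name):
    """Placeholder for submitting one job and waiting for it to finish."""
    return f"{name} done"


def run_in_waves(deps, workers=4):
    """Run ready jobs in parallel, wave by wave; return the waves in order."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while sorter.is_active():
            ready = sorted(sorter.get_ready())  # jobs with no unfinished predecessor
            list(pool.map(run_job, ready))      # parallel within the wave
            sorter.done(*ready)                 # unlock the jobs that depend on them
            waves.append(ready)
    return waves


waves = run_in_waves(deps)
print(waves)  # [['prepare'], ['sim_a', 'sim_b'], ['merge']]
```

Waiting for a whole wave before starting the next keeps the sketch short; a scheduler could instead start each job as soon as its own predecessors finish.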