					                   The Grid
A Technology for Widely Distributed Supercomputing



                       By

                    Chris Day
                    100030883




                   COMP 4223
          Advanced Computer Architecture

                   Submitted to:
             Dr. Agnieszka Bogobowicz

                  April 8th 2002
Table of Contents



Introduction

          Definition of a Grid

          Uses of Grid Technology

          Sharing and Security

          The Differences in Grid Technology

Grid Architecture

          The Layered Grid Architecture

          The Globus Project

Grid Implementations

          The NASA Information Power Grid

          Sample Implementation - A Small Scale Grid

          SETI@home

The Future of Grids

Conclusion

Bibliography




Table of Figures



Figure 1: The Layered Grid Protocol Architecture and Example

Figure 2: Models in the Aviation Safety Simulation

Figure 3: National Air Space Simulation Environment




                                    Introduction


Definition of a Grid

       A Grid is a high-performance, widely distributed system. It allows coordinated

and flexible sharing and collaboration of resources between Virtual Organizations.

Virtual Organizations (VOs) can include individuals, institutions, or resources which

are defined by sharing rules. There are typically a large number of resources

available to a Grid, including hardware, software, data, scientific instruments, or

anything that can be connected to a computer. Grid computing combines these

resources to generate enormous computing power. For example, if there were several

processors on a Grid, their power could be combined to execute a computing-

intensive task. There are many different definitions and terms used to describe Grids.

One way to think of a Grid is as one massive virtual computer with many users.



Uses of Grid Technology

       Grids can be used for simulations, collaborative sharing of information, data

computations, or analysis with high-end instruments, and a Grid is not limited to

just one application. Details of specific examples are given later in this report. In

the most common applications for Grids, processing power is usually the most

important resource. Such Grids are often referred to as computational Grids. They are

valuable because in many instances the processing power of a system is not being used to its



full potential, and these cycles are wasted. As an example of this inefficiency, roughly

ninety percent of the students in the class stated that they leave their computers on all

the time. Another source of wasted computing cycles is in a resource such as a

university computer lab where there can be several machines constantly idle. While

there are many wasted cycles, it is important to note that Grids do not serve as a

source of free cycles. Unless one has complete access, a person cannot simply

connect to the Grid and perform any task they want. Grid computing, as a rule, is

about controlled sharing.



Sharing and Security

       The sharing we are concerned with is not simply file exchange, but direct

access to a computer's resources (for example, hardware, software, and data). Obviously,

if direct access to a system's resources can be obtained, the sharing and security of Grids

must be highly regulated. Sharing rules must be established and controlled. Users do

not want someone else using their processing power while they need it, and they do

not want to allow access to personal files. These sharing relationships can become

very complex, and as a result, users of a Grid must be careful about what is shared,

who is allowed to share, and the circumstances under which sharing is permitted. These

relationships can also be dynamic, for example, with different types of access

available to students and professors. The owner of a resource is responsible for

supplying the sharing rules, and will likely limit access.
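
As an illustration of owner-supplied sharing rules, the following is a minimal sketch in Python. It is not part of any Grid toolkit: the role names, resource names, and the "only when idle" condition are assumptions chosen to mirror the student/professor example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SharingRule:
    """One owner-defined rule: which role may use which resource, and when."""
    role: str             # hypothetical role, e.g. "student" or "professor"
    resource: str         # hypothetical resource name, e.g. "cpu" or "storage"
    only_when_idle: bool  # True: access is granted only while the owner is not using the machine


def is_allowed(rules, role, resource, owner_is_idle):
    """Return True if some rule grants `role` access to `resource` right now."""
    for rule in rules:
        if rule.role == role and rule.resource == resource:
            if owner_is_idle or not rule.only_when_idle:
                return True
    return False  # default deny: anything not explicitly shared stays private


# Hypothetical policy supplied by the owner of one lab machine.
lab_rules = [
    SharingRule(role="professor", resource="cpu", only_when_idle=False),
    SharingRule(role="student", resource="cpu", only_when_idle=True),
]

print(is_allowed(lab_rules, "student", "cpu", owner_is_idle=True))
print(is_allowed(lab_rules, "student", "cpu", owner_is_idle=False))
print(is_allowed(lab_rules, "student", "storage", owner_is_idle=True))
```

The default-deny return value reflects the point made above that the owner decides what is shared; personal files are never exposed unless a rule says so.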




The Differences in Grid Technology

       It is a common misconception that a Grid is no different from the Internet, or

that distributed computing can accomplish the same things as a Grid. What sets Grids

apart from traditional distributed computing is the direct access to resources on the

Grid. While Grid computing is primarily done over the Internet, the HTTP protocol

that supports the WWW is not sufficient on its own for Grid technology. The

reason is that it follows the client/server model of communication, in which

one system only makes requests and the other only responds. This model neither meets

the needs nor fits the philosophy of Grid applications, where each system has the same

capabilities, similar to a peer-to-peer communications model. It is for this reason that

new protocols and standards need to be developed. In the next section the

requirements for such a new Grid architecture will be explained.




                                   Grid Architecture



       Grids require standard protocols and syntax for sharing, just as the WWW

required HTTP and HTML. In order for a Grid architecture to be successful, it must

specify how distributed systems interact with each other to perform certain tasks. The

protocol architecture defines the basic system for users to establish, manage, and use

sharing relationships. Interoperability, the ability of one system to work with another,

is the most vital concern when designing a protocol because it defines how well the

systems interact with each other. Standard protocols also improve portability across languages,

platforms, and programming environments.

The Layered Grid Architecture

[Figure 1 diagram. Left side, layers from top to bottom: Application, Collective, Resource, Connectivity, Fabric. Right side, example: Simulation GUI (Application); Scheduling (Collective); Resource management (Resource); storage, network, and processing resources (Fabric).]

Figure 1: The Layered Grid Protocol Architecture and Example




       The left-hand side of Figure 1 describes a layered Grid protocol, and the right-

hand side gives an example of this.

       The fabric layer interfaces to local control, and makes the resources available

to be shared. These would include computational resources or storage resources.

       The connectivity layer describes communication and authentication protocols

to allow easy and secure communication. This enables the exchange of information

between the resources on the fabric layer.

       The resource layer builds on the connectivity layer to share single

resources. This includes things such as information and management protocols.

       The collective layer focuses on coordinating multiple resources. This layer

handles interaction across collections of resources and can perform a wide variety of

sharing operations. For example:

        • Directory services

        • Co-allocation, scheduling and brokering

        • Monitoring & diagnostic services

        • Data replication services

        • Grid-enabled programming systems

        • Workload management systems

        • Software discovery systems

        • Community authorization servers

        • Collaborating services




       Finally, the Application layer includes the user applications that are in the grid

environment. Inter-grid protocols also need to be defined to allow different

organizations to share data.

       The right-hand side of Figure 1 shows a visualization of the Grid

architecture in action using a simulation example. The application layer contains the

GUI and human interface. The collective layer schedules the resources and the

resource layer is responsible for managing the resources (knowing when they are

available). The connectivity layer is responsible for connecting the resources in the

fabric layer.
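
The way each layer builds on the one below it can be sketched in a few lines of Python. This is only a toy model of the layering described above, not the Globus implementation: the class names, the resource names, and the user names are all assumptions made for the illustration.

```python
class Fabric:
    """Fabric layer: the local resources that are made available for sharing."""
    def __init__(self):
        # Hypothetical resources, one of each kind shown in Figure 1.
        self.resources = {"cpu-1": "processing", "disk-1": "storage", "net-1": "network"}


class Connectivity:
    """Connectivity layer: authentication and communication with fabric resources."""
    def __init__(self, fabric, trusted_users):
        self.fabric = fabric
        self.trusted_users = set(trusted_users)

    def connect(self, user, name):
        if user not in self.trusted_users:
            raise PermissionError(f"{user} is not authenticated")
        if name not in self.fabric.resources:
            raise KeyError(name)
        return name


class Resource:
    """Resource layer: manages single resources (here, whether each one is busy)."""
    def __init__(self, connectivity):
        self.connectivity = connectivity
        self.busy = set()

    def is_available(self, user, name):
        return self.connectivity.connect(user, name) not in self.busy

    def allocate(self, user, name):
        if not self.is_available(user, name):
            raise RuntimeError(f"{name} is busy")
        self.busy.add(name)


class Collective:
    """Collective layer: coordinates multiple resources (a trivial scheduler)."""
    def __init__(self, resource_layer):
        self.resource_layer = resource_layer

    def schedule(self, user, kind):
        fabric = self.resource_layer.connectivity.fabric
        for name, resource_kind in fabric.resources.items():
            if resource_kind == kind and self.resource_layer.is_available(user, name):
                self.resource_layer.allocate(user, name)
                return name
        return None  # nothing of this kind is free


def run_simulation(collective, user):
    """Application layer: asks the collective layer for the resources it needs."""
    return [collective.schedule(user, kind) for kind in ("processing", "storage")]


stack = Collective(Resource(Connectivity(Fabric(), trusted_users=["alice"])))
first = run_simulation(stack, "alice")
second = run_simulation(stack, "alice")
print(first)   # the processing and storage resources are allocated
print(second)  # both are now busy, so nothing can be scheduled
```

Each class only talks to the layer directly beneath it, which is the property the layered architecture is meant to provide.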



The Globus Project

       Over the past few years there has been much research leading to new

developments involving Grids. There are now protocols, services, and tools that directly

address the challenges which occur when building Grids. The Globus Project is an

organization of researchers that develops Grid protocols and tools. They have released

a software package called the Globus Toolkit to provide the tools and software

needed to build a Grid and Grid-based applications. The Globus Toolkit has become

the most widely adopted grid technology solution because it does an excellent job of

meeting the requirements for grid architecture described in the previous section. The

Globus Toolkit is open source, and many organizations, such as IBM, have

contributed to it. The toolkit deals with security, information infrastructure, resource

management, data management, communication, fault detection and portability in

Grids. The current version of the Globus Toolkit (version 2.0) includes many grid

protocols. The most important ones will be described here: (Note: More information

can be found at www.globus.org)

     • The Grid Security Infrastructure (GSI) protocols build on the Transport Layer

       Security (TLS) protocol to allow single sign-on, delegation, integration with

       local security systems, and user-based trust relationships. The GSI uses public

       key encryption, X.509 certificates, and the Secure Sockets Layer (SSL).

     • In Globus, the Grid Resource Information Service (GRIS) is based on the

       Lightweight Directory Access Protocol (LDAP). It provides a standard resource

       information model that allows resources to be queried for their current configuration

       and status.

     • The Grid Index Information Service (GIIS) indexes all GRIS services

       (resources) so they can be searched if, for example, one wanted to identify all

       the available systems in a particular area.

     • The Grid Resource Access and Management (GRAM) protocol, which is based on HTTP, is

       used for allocating and controlling resources. When a job is submitted,

       GRAM allocates the resources and manages all active jobs.

     • GridFTP, which builds on the File Transfer Protocol (FTP), is a

       management protocol for data transfers. It uses GSI for authentication and

       adds new extensions such as partial file transfer.

To improve portability, the Globus Toolkit provides system-level APIs.
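
The relationship between a per-resource information service and an index that searches across them can be illustrated with a small in-memory sketch in Python. It does not use LDAP or any Globus API; the class names, attribute names, and the site label are assumptions, and the hostnames are only illustrative.

```python
class ResourceInfoService:
    """Stands in for a GRIS: reports one resource's configuration and status."""
    def __init__(self, hostname, cpu_mhz, site, free=True):
        self.hostname = hostname
        self.cpu_mhz = cpu_mhz
        self.site = site
        self.free = free

    def query(self):
        # Return the current state each time, so status changes are visible.
        return {
            "hostname": self.hostname,
            "cpu_mhz": self.cpu_mhz,
            "site": self.site,
            "free": self.free,
        }


class IndexService:
    """Stands in for a GIIS: keeps track of registered services and searches them."""
    def __init__(self):
        self._services = []

    def register(self, service):
        self._services.append(service)

    def search(self, **criteria):
        """Return hostnames whose current query() result matches every criterion."""
        matches = []
        for service in self._services:
            info = service.query()
            if all(info.get(key) == value for key, value in criteria.items()):
                matches.append(info["hostname"])
        return matches


index = IndexService()
index.register(ResourceInfoService("cdgrid1.dyn.dhs.org", 500, site="lab"))
index.register(ResourceInfoService("cdgrid2.dyn.dhs.org", 166, site="lab", free=False))
index.register(ResourceInfoService("cdgrid3.dyn.dhs.org", 400, site="lab"))

available = index.search(site="lab", free=True)
print(available)
```

The search asks each registered service for its status at query time, which corresponds to identifying all the available systems in a particular area.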




                              Grid Implementations

       The development of Grids started in the mid-1990s, but only recently has the

technology begun to approach the theory. Now there are several implementations

of Grids in use. There are large implementations such as the National Technology

Grid and the European DataGrid and more specific Grids such as the Particle Physics

Data Grid for simulations on high-energy and nuclear physics experiments. One of

the more recent implementations is a cancer Grid that connects hospitals to allow for

collaboration and sharing of information to aid in cancer research. There is also a

Canadian grid in the very early stages of development. In the following sections a

few implementations of grids will be explained in detail.



NASA Information Power Grid

       NASA's Information Power Grid (IPG) is a high-performance computational

Grid. It links high-end computers and resources all over the United States and is

accessible by researchers in many organizations. The IPG uses the open protocols and

standards defined in the Globus Toolkit to provide common grid

services, and also defines systems of its own. Major systems such as the NASA

Wind Tunnels, the NASA Earth Observation System, and the Instrument Data

Archives, which produce data, are all linked together. This is important because

scientific collaborators must analyze large volumes of data at multiple centers, labs

and universities around the world.



       An application of the IPG is a simulation for aviation safety, which simulates

the entire commercial airspace in the United States. This is a huge simulation

involving multiple systems and sub-systems interacting with each other. First, there

must be a plane model, which includes models for the engine, wings, human crew,

airframe, stabilizer and landing gear. A simulation of the engine models, for example,

would return results such as graphs of altitude vs. time or thrust vs. time. Figure 2 shows

these models.




Figure 2: Models in the Aviation Safety Simulation (source: http://www.ipg.nasa.gov/)

Different portions of this simulation were developed by different teams of

researchers in different locations. For example, the engine models were developed in

California, landing gear models in Virginia, and engine models in Ohio, so these are

widely distributed resources that must be combined. The airplane models are all in a

virtual national airspace with variables like weather data, surface data, airline

schedule data, digital flight data and radar tracks. In this virtual airspace there are

22,000 (US) commercial flights per day. The whole simulation outline can be seen in

Figure 3. This large-scale application is well suited to a grid because a grid provides the

scheduling and data stream management needed to support applications such as the aviation

safety simulation. It involves executing multiple simulations across multiple resources.

Programs on the IPG typically use a GUI, and the way that jobs are submitted to the

grid will be explained in the next section.




Figure 3: National Air Space Simulation Environment (source: http://www.ipg.nasa.gov/)

       To perform simulations like this, a great deal of computing power is needed. In its current state,

the IPG has the following resources:

    • About 800 CPU nodes in a half dozen SGI Origin 2000s

    • A 1024-node O2K and a Cray SV-1 are currently being added

    • Several workstations and clusters at research centers (otherwise idle Sun and

       SGI workstations)

    • 300 nodes in a Condor pool (see footnote 1)

    • Wide area network interconnects of at least 100 Mbit/s

    • Storage resources include 50-100 TB of archival information uniformly and

       securely accessible from all IPG systems.



        The IPG will continue to grow in size. The Earth Observation System can

generate 918 GB/day, simulations can generate many TB/day, and data from space

probes is always increasing. The data has an indefinite lifetime and is used repeatedly

by experts and researchers. The next objective planned for the IPG is

to make real-time analysis of experiment data possible. This would allow for human

operation in the Aviation Safety Simulation.
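
To put the growth figures in perspective, the short Python calculation below estimates how quickly the Earth Observation System alone would produce as much data as the upper end of the 50-100 TB archive. It is a rough sketch that assumes a constant 918 GB/day and 1 TB = 1000 GB; neither assumption comes from the IPG documentation.

```python
GB_PER_TB = 1000            # assumption: decimal terabytes
EOS_GB_PER_DAY = 918        # rate quoted in the text
ARCHIVE_TB = 100            # upper end of the 50-100 TB archive quoted in the text

# Days needed for the Earth Observation System to generate one archive's worth of data.
days_to_fill = ARCHIVE_TB * GB_PER_TB / EOS_GB_PER_DAY

# Volume generated in one year at the same constant rate.
tb_per_year = EOS_GB_PER_DAY * 365 / GB_PER_TB

print(f"Days to generate {ARCHIVE_TB} TB: {days_to_fill:.0f}")
print(f"Data generated per year: {tb_per_year:.1f} TB")
```

Under these assumptions a single instrument system matches the whole archive in under four months, which is consistent with the statement that the IPG must continue to grow.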




Sample Implementation - A Small Scale Grid

        Grids do not always have to involve high-end systems and multi-million dollar

scientific instruments. According to Sun Microsystems' Peter Jeffcock, "Grid

computing's biggest benefits may be for very small operations with just 10-20 machines"

(source: http://www.gridcomputingplanet.com/features/article/0,,3291_946331,00.html ). In

this example, a small scale Grid was set up in order to become familiar with a grid

environment. Due to lack of available computers and the time it takes to set up and

configure each one, only three systems were on this grid. These were not powerful


1 Condor is a Grid technology for high-throughput computing. It identifies which computers are available
and determines the best machine or machines to run jobs on.


systems; they included a Pentium Celeron 500MHz (laptop), a Pentium 166 MHz and

an AMD K6-2 400MHz. The first step was to install the Globus Toolkit (version

2.0). The toolkit was downloaded from www.globus.org. Since the Globus Toolkit

will only run on Linux and Solaris systems, this implementation used the same Linux

distribution (Red Hat 7.2) running a version 2.4.3 kernel on every system to avoid portability issues.

Next, the systems had to be configured to run on a Grid. Each system required a

hostname and a certificate from Globus. The certificate allows for the single sign-on

access described earlier. The hostnames of the systems are as follows:

            cdgrid1.dyn.dhs.org - Pentium Celeron 500MHz

            cdgrid2.dyn.dhs.org - Pentium 166 MHz

            cdgrid3.dyn.dhs.org - AMD K6-2 400MHz



       The Globus Toolkit is used in the NASA IPG, so the low-level commands for

submitting jobs are similar; however, the IPG usually uses GUIs. The syntax for

submitting a job in Globus is as follows:

globus-job-run <hostname> </path/executable> <arguments>

For example, to send a job to the Pentium 166 MHz system on the Grid:

globus-job-run cdgrid2.dyn.dhs.org -stage /bin/echo Hello World

This would run the /bin/echo program and output "Hello World". The "-stage" option

tells Globus to copy (stage) the executable from the submitting machine to the remote

system before running it there.
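
When jobs are submitted from a program rather than typed by hand, the command line can be assembled as a list of arguments. The Python sketch below only builds and prints the command using the syntax shown above; the helper name build_job_run_command is hypothetical, and nothing is sent to a Grid.

```python
import shlex


def build_job_run_command(hostname, executable, arguments=(), stage=False):
    """Assemble a globus-job-run command line as a list of arguments.

    Follows the form used in this report:
    globus-job-run <hostname> [-stage] </path/executable> <arguments>
    """
    command = ["globus-job-run", hostname]
    if stage:
        command.append("-stage")
    command.append(executable)
    command.extend(arguments)
    return command


cmd = build_job_run_command(
    "cdgrid2.dyn.dhs.org", "/bin/echo", ["Hello", "World"], stage=True
)
print(shlex.join(cmd))
```

On a machine with the Globus Toolkit installed and a valid certificate, the list could be passed to subprocess.run; keeping it as a list avoids quoting problems when arguments contain spaces.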




    The Globus Toolkit also defines its own language called the Resource

Specification Language (RSL). It is used to describe jobs and the resources needed to

run them.

The same program using RSL would be defined as follows:

In the file hello.rsl include

& (count=1)

  (executable=/bin/echo)

  (arguments="Hello World")

This file is run with a different command called globusrun:

globusrun -s -r cdgrid3.dyn.dhs.org -f hello.rsl

This would output “Hello World” just like the first program. There are also easy ways

to manage your jobs. For example, you can check the status of, kill, or retrieve a job at

any time. Although each of these systems contained different resources, once on the

grid they were all able to do the same thing. Since the Pentium 166 MHz is a much

slower system by today's standards, it would be advantageous to submit any large

computing tasks to the other two faster systems on the Grid. The following example

submits a job to multiple resources:

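File name: job1

#!/bin/csh -f

# Locate the Globus tools directory (backticks capture the command's output)

set tools_bin = `gtk2/bin/globus-tools-path -bindir`

# Print the name of the host the job is running on

echo -n "From host: " ; $tools_bin/globus-hostname

# Print the two arguments and their sum on one line, e.g. "2 + 9 = 11"

echo -n $1

echo -n " + "

echo -n $2

echo -n " = "

echo "scale=4; $1+$2" | /usr/bin/bc -l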



Now on the command line of the Pentium 166MHz system, use the following

commands:

globus-job-run -args 2 9 \
-: cdgrid1.dyn.dhs.org -stage ./job1 \
-: cdgrid2.dyn.dhs.org -stage ./job1



The output of running this would be:

From host: cdgrid1.dyn.dhs.org

2 + 9 = 11

From host: cdgrid2.dyn.dhs.org

2 + 9 = 11

       To submit a single job to multiple resources and have its work spread across

them, the job is usually sent to a job manager, which handles scheduling. The

purpose of this experiment was to become familiar with how to submit jobs to a grid

in different ways. These are simple applications that provide the basics of a Grid.
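
One simple policy a job manager could use to spread work across the three machines in this experiment is to divide it in proportion to processor speed. The Python sketch below is only an illustration of that idea: using clock speed as a measure of performance is an assumption, the function name split_work is hypothetical, and this is not how the Globus job manager itself schedules work.

```python
def split_work(total_units, machines):
    """Divide total_units among machines in proportion to clock speed.

    Uses the largest-remainder method so the parts always add up to total_units.
    `machines` maps a hostname to its clock speed in MHz.
    """
    total_speed = sum(machines.values())
    shares = {name: total_units * mhz / total_speed for name, mhz in machines.items()}
    allocation = {name: int(share) for name, share in shares.items()}

    # Hand the units lost to rounding down to the machines with the largest fractions.
    leftover = total_units - sum(allocation.values())
    by_fraction = sorted(machines, key=lambda name: shares[name] - allocation[name], reverse=True)
    for name in by_fraction[:leftover]:
        allocation[name] += 1
    return allocation


# The three systems used in this small scale Grid.
grid = {
    "cdgrid1.dyn.dhs.org": 500,
    "cdgrid2.dyn.dhs.org": 166,
    "cdgrid3.dyn.dhs.org": 400,
}

plan = split_work(100, grid)
for host, units in plan.items():
    print(host, units)
```

With 100 units of work, the slow Pentium 166 MHz receives the smallest share, which matches the observation that large tasks are better sent to the two faster systems.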




SETI@home

           The SETI@home project is a search for extraterrestrial intelligence. It uses

the Arecibo radio telescope in Puerto Rico, which can capture 35 GB of data per day, and

this data needs to be processed in many steps. Anyone around the world can

download the SETI@home software, which runs as a screensaver and processes

some data whenever the computer is idle. SETI@home averages 35 TeraFLOPS (footnote 2),

which is faster than the top 20 supercomputers combined (footnote 3). It is debatable whether the

SETI@home project is a true grid. It asks people to offer their processing power for a

particular purpose. The client gets the tasks from a central server. In order for it to be

a grid under all definitions, SETI@home would have to connect to a grid and send

the tasks to any computers that were idle. Even if SETI@home is not a real grid, it is

still an important application that demonstrates the significance of grids. These types of

distributed computing applications go beyond the client/server

view of communication and have a great deal in common with grid computing. While

they are not true grid technology, the popular peer-to-peer file sharing programs such as

Napster, Morpheus, or Gnutella also have a lot in common with grids. The difference

is that they typically offer only file sharing with no write control; that is, you can only

download and not upload.
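
The pattern described above, in which idle clients fetch tasks from a central server, can be sketched in Python as follows. This is a toy model and not the SETI@home protocol: the class names are hypothetical, the work units are made-up lists of numbers, and taking the maximum stands in for the real signal analysis.

```python
from collections import deque


class WorkServer:
    """Central server: hands out work units and collects the results."""
    def __init__(self, work_units):
        self._pending = deque(work_units)
        self.results = {}

    def request_work(self):
        # Return the next unit, or None when nothing is left to process.
        return self._pending.popleft() if self._pending else None

    def submit_result(self, unit_id, result):
        self.results[unit_id] = result


class Client:
    """A volunteer computer that processes one unit each time it is idle."""
    def __init__(self, name, server):
        self.name = name
        self.server = server

    def on_idle(self):
        unit = self.server.request_work()
        if unit is None:
            return False
        unit_id, samples = unit
        # Placeholder analysis: report the strongest sample in the unit.
        self.server.submit_result(unit_id, max(samples))
        return True


server = WorkServer([(0, [1, 5, 2]), (1, [7, 3]), (2, [4])])
clients = [Client("home-pc", server), Client("lab-pc", server)]

# Every client gets a turn in each round until no client finds work.
while any([client.on_idle() for client in clients]):
    pass

print(server.results)
```

Note that the clients never talk to each other; all coordination goes through the server, which is the main reason the text questions whether this is a true grid.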




2 According to the SETI@home webpage at http://setiathome.ssl.berkeley.edu/
3 As of November 2001 using the LINPACK benchmarks


                                    The Future of Grids



        The SETI@home project is a brilliant example of what grids can do by using

idle machines connected to the Internet. It suggests that Grids will be the source for

the most processing power in the future. With the increasing number of computers in

the world, there are going to be more and more wasted cycles. Think what would be

possible if all idle computers were connected. It is expected that "the next big thing

will be grid computing" (source: http://domino.ngi.ibm.com/patrick/pages/inthenews/eweek_grid.html),

according to John Patrick, IBM's vice-president for Internet strategies. It will be a

major leap forward in high performance computing as more organizations use Grids.

There are no commercial grids yet, but they are not far away. An implementation

would be similar to a power company, but instead of paying for electricity, you would pay for

membership in a grid that gives you access to its processing power (and other

resources).

        Experts from Microsoft have recently said the days of HTTP are coming to an

end because of the limitations of the client/server communications model it uses

(source: http://zdnet.com.com/2100-1105-845220.html). This might mean it will be

replaced by a protocol similar to one that Grids use. It is important to remember that

Grids will not be the next generation of the Internet. Grids are additional protocols

and services that build on IP. If something is connected to the grid, it is usually




connected to the Internet. Due to their focus on sharing, grids are designed to work

together with other technologies, and not compete with them. Grids will also not

eliminate the need for supercomputers. For example, a Grid may have several

thousand processors accessible, but this does not mean supercomputers will

be thrown out, because there will still be a need for systems with low latencies and

high communication bandwidths. In fact, most grid implementations use

supercomputers, such as in the NASA IPG.

       Grid protocols can be built into existing applications. For example, Web

browsers use TLS for authentication but do not support features like single sign-on or

delegation. If the GSI extensions to the TLS protocol were built into web browsers, they

would allow single sign-on to multiple web servers.




                                     Conclusion



       While Grids are still a relatively new technology in comparison with other

technologies, they have many advantages. Grids offer benefits in cost, a high degree

of scalability, low hardware requirements, low maintenance, as well as a low risk of

failure. Grids are able to connect almost anything using the standards and protocols

that are being developed. They will let businesses and people send data faster, share software

easily, store a lot more information, and gain access to supercomputer

processing power.

       IBM's ASCI White is rated at 12 TeraFLOPS and cost $110 million.

SETI@home is rated at 35 TeraFLOPS and so far has cost less than $500,000.

Which would you rather use?
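
The comparison can be made concrete by dividing cost by rated performance. The Python sketch below uses only the figures quoted above; since the SETI@home cost is given as an upper bound, its cost per TeraFLOPS is also an upper bound and the resulting ratio is a lower bound.

```python
def cost_per_teraflops(cost_dollars, teraflops):
    """Dollars spent per TeraFLOPS of rated performance."""
    return cost_dollars / teraflops


asci_white = cost_per_teraflops(110_000_000, 12)   # $110 million, 12 TeraFLOPS
seti_at_home = cost_per_teraflops(500_000, 35)     # under $500,000, 35 TeraFLOPS
ratio = asci_white / seti_at_home

print(f"ASCI White: ${asci_white:,.0f} per TeraFLOPS")
print(f"SETI@home:  under ${seti_at_home:,.0f} per TeraFLOPS")
print(f"ASCI White costs at least {ratio:.0f} times more per TeraFLOPS")
```

On these figures ASCI White works out to roughly $9.2 million per TeraFLOPS against under $15,000 for SETI@home.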




                                        Bibliography

The Grid: Blueprint for a New Computing Infrastructure, I. Foster and C. Kesselman,
Morgan Kaufmann, 1999

The Grid, A Critical Review of Current Status and Future Directions in Grid
Technology, D. Hently, 2000

The Physiology of the Grid: An Open Grid Services Architecture for Distributed
Systems Integration, I. Foster, C. Kesselman et al., 2002

The Anatomy of the Grid: Enabling Scalable Virtual Organizations, I. Foster, C.
Kesselman, et al., 2001

The Globus Project: http://www.globus.org. Accessed March 2002

Condor: http://www.nas.nasa.gov/Groups/Condor/. Accessed March 2002

NASA Information Power Grid: http://www.ipg.nasa.gov/. Accessed March 2002

SETI@home: Search for Extraterrestrial Intelligence,
http://setiathome.ssl.berkeley.edu/. Accessed March 2002

IBM Asci White:
http://www-1.ibm.com/servers/eserver/pseries/hardware/largescale/supercomputers/asciwhite/ Accessed
April 2002

LINPACK Benchmarks: http://www.netlib.org/linpack/. Accessed March 2002

Grid Forum: www.gridforum.com. Accessed March 2002

Particle Physics Data Grid: http://www.ppdg.net/. Accessed March 2002

Grid Canada: http://www.gridcanada.ca. Accessed April 2002



