Grid Performance Evaluation and Modelling by hcj




The Grid is a highly heterogeneous environment that can potentially provide seamless, fast and
efficient access to a range of resources that are distributed over a wide area. At the moment there
are no commonly accepted ways to systematically measure and understand the metrics
that could make Grid performance evaluation and modelling an engineering discipline, rather than
the ad hoc exercise it currently is. After reviewing some ongoing efforts, we first break
down the Grid into manageable components. Then, for each constituent component, we describe
its characteristics and define metrics that can be used for understanding its performance.

1        Introduction

Performance evaluation and modelling of computer-based systems has always been, to say the
least, a contentious and problematic exercise. Tension often arises from the differing
viewpoints of the stakeholders. For example, application scientists are normally interested only
in the fast and reliable execution of their application, whereas systems operators typically want a
system that is easy to configure, manage, and maintain. The vendor, in turn, often wants to
highlight the good aspects of its machine and play down any bad ones. Other stakeholders, such
as a funding body, may have further criteria, such as cost or the reliability of the vendor.
Consequently, attempts to create “standard” performance measurements and methodologies have
so far been only partially successful. It should be noted that these attempts have
been made only on sequential or homogeneous parallel systems. The evaluation and modelling of the
Grid will introduce a raft of new concerns and issues, as well as providing an effective way to
address the difficult issues around the functioning of grid applications and middleware.

1.1 The Necessity of Performance Evaluation and Modelling: Motivation

In the past, performance evaluation and modelling of computer systems was performed, in the
main, for one of three reasons:
1.   To help purchase the best system to execute a suite of applications,
2.   To understand architectural concerns and look at ways of enhancing future systems,
3.   To help optimise an application's performance, based on knowledge of how the application
     should execute on that architecture.

The evaluation and modelling of the Grid in some ways changes our fundamental reasons for
undertaking this task, as we are no longer looking at a single system but rather at
potentially large collections of resources, both hardware and software, working together to
provide services to an application. In addition, we are no longer considering a quiet and
controlled system in which to do our performance evaluation; we are now forced to cope with a
wide-area distributed system in which we may not have exclusive control over the components,
and in which components may fail or become a bottleneck while the evaluation tests are being
carried out.

2         Ongoing Efforts

2.1       The GGF Grid Benchmark Research Group
The GB-RG [6] plans to advance the efficient use of grids by defining metrics that measure the
performance of grid applications and architectures and rate the functionality and efficiency of
grid architectures. These metrics should assist good engineering practice by allowing alternative
implementations to be compared quantitatively. The defined tasks will be specified as
paper-and-pencil benchmarks that can be implemented, in principle, on any existing or future
grid environment. The GB-RG also aims to provide reference implementations of the tasks
that grid users and developers can use as starting points for assessing grid
implementations. It appears that this working group has made little real progress since its

2.2       IETF Benchmarking Methodology Working Group (BMWG)
The goal of the BMWG [7] is to make a series of recommendations concerning the measurement
of the performance characteristics of various internetworking technologies; further, these
recommendations may focus on the systems or services that are built from these technologies.
The BMWG is focussed primarily on Internet networking technologies and does not currently
address Grid technologies.

2.3       Grid Assessment Probes (GRASP)
To provide an insight into the stability, robustness, and
performance of the Grid, the GRASP project [8] has developed a set of probes that exercise
basic grid operations by simulating simple grid applications. The probes can be run on a grid
testbed, collecting performance data such as compute times, network transfer times, and
middleware overheads. The GRASP system is currently designed to use only the Globus
infrastructure and tests the following activities:
     Check for a valid grid proxy,
     Perform a basic authentication to all nodes involved in the probe,
     Validate the configuration file, if necessary,
     Check directory sizes to ensure that target directories can accommodate files to be
     Query the Globus MDS to find available information on all nodes involved in the probe.

The GRASP project is at an early stage of development; the developers aim to produce a suite of
representative Grid application benchmarks to test and evaluate Grid environments.
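
As a rough illustration of the kind of data such a probe collects, the sketch below times the
stages of a toy "application" separately. It is not GRASP code and makes no Globus calls; the
run_probe helper and the two stand-in stages are hypothetical names introduced only for this
example.

```python
import time
from typing import Callable, Dict


def run_probe(stages: Dict[str, Callable[[], None]]) -> Dict[str, float]:
    """Run each named stage once and return its wall-clock time in seconds."""
    timings: Dict[str, float] = {}
    for name, stage in stages.items():
        start = time.perf_counter()
        stage()
        timings[name] = time.perf_counter() - start
    return timings


def fake_compute() -> None:
    """Stand-in for the compute phase of a probe (assumption: purely local work)."""
    sum(i * i for i in range(100_000))


def fake_transfer() -> None:
    """Stand-in for a data transfer: copies 1 MB in memory, no network involved."""
    payload = bytes(1_000_000)
    _ = bytearray(payload)


if __name__ == "__main__":
    result = run_probe({"compute": fake_compute, "transfer": fake_transfer})
    for name, seconds in result.items():
        print(f"{name}: {seconds:.6f} s")
```

In a real probe the stages would be replaced by actual job submission and file transfer
operations, and the middleware overhead would be whatever remains once the compute and transfer
times are subtracted from the end-to-end time.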

2.4       GridBench
GridBench [10] is a work package of the CrossGrid project [9]. The goal of GridBench is to propose
a set of performance metrics to describe the performance capacity of the Grid and its applications.
The work package aims to develop and implement GridBench, a suite of benchmarks that are
representative of typical Grid workloads. The benchmarks will be used to estimate the
performance of different Grid configurations, to identify factors that affect end-to-end application
performance, and provide application developers with initial estimates of expected application

2.5     Integrated Performance Analysis of Computer Systems (IPACS)
The IPACS Project [11] is funded by a German 'High-Performance-Computing' programme.
Within this project, grid benchmarks and the technologies for grid-adaptive applications will be
developed. The grid benchmarks will be used to analyse and to parameterise the grid
environment in the first instance. These benchmarks will be the basis for the further development
and optimisation of grid products and applications. IPACS started in the summer of 2002; no
results or downloads are available yet.

2.6     Summary

Most of the efforts described above are adapting existing techniques for benchmarking from
various disciplines to the Grid, without taking into account the unique features of the Grid.
While this is understandable, since it makes sense to leverage existing knowledge, it is unclear
whether such benchmarks can provide answers to the questions that are posed by grid
applications and systems. In the following sections, we will briefly review the state of the art in
computer system benchmarking (Section 3), the architecture of the Grid, particular features that
are unique to the Grid (Section 4), and some recommendations for grid benchmarking and
performance modelling (Section 5).

3       A Historical View of Performance Evaluation and Modelling

3.1     Introduction

A variety of benchmarks have been used to assess the performance of computer
architectures. Typically, the benchmarks can be classified into three categories:
a) Low-level: these determine the rates at which a machine can perform fundamental
   operations, reported in measures such as MIPS, flop/s, and memory I/O rates.
b) Application kernels: these are typical core application algorithms, such as matrix operations
   or an FFT.
c) Full applications: these have all the major components of a complete application, such as the
   NAS Parallel Benchmarks, which are based on computational fluid dynamics applications, or
   the GAMES [16] code.
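
The difference between the first two categories can be sketched in a few lines. The example
below is only an illustration under stated assumptions: it is written in pure Python, so the
rates it reports say more about the interpreter than the hardware, and the function names
(low_level_flops, matmul_kernel, rate) are hypothetical rather than taken from any benchmark
suite.

```python
import time
from typing import Callable, Tuple


def time_call(fn: Callable[[int], int], arg: int) -> Tuple[int, float]:
    """Call fn(arg); return (operation count reported by fn, elapsed seconds)."""
    start = time.perf_counter()
    ops = fn(arg)
    return ops, time.perf_counter() - start


def low_level_flops(n: int) -> int:
    """Low-level test: n multiply-add steps, i.e. 2*n floating-point operations."""
    acc = 0.0
    x = 1.000001
    for _ in range(n):
        acc = acc * x + 1.0
    return 2 * n


def matmul_kernel(size: int) -> int:
    """Kernel test: dense size x size matrix multiply, roughly 2*size**3 operations."""
    a = [[float(i + j) for j in range(size)] for i in range(size)]
    b = [[float(i - j) for j in range(size)] for i in range(size)]
    _ = [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
         for i in range(size)]
    return 2 * size ** 3


def rate(ops: int, seconds: float) -> float:
    """Operations per second; guards against a zero-length interval."""
    return ops / seconds if seconds > 0 else float("inf")


if __name__ == "__main__":
    for label, fn, arg in (("low-level", low_level_flops, 200_000),
                           ("kernel", matmul_kernel, 40)):
        ops, seconds = time_call(fn, arg)
        print(f"{label}: {rate(ops, seconds):.3e} flop/s")
```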

4       The Architecture of the Grid

4.1     Characteristics and Components

4.1.1   Characteristics
One simple way to look at the Grid is as a collection of computing resources connected by
network links. In this model, the computing resources are endpoints in a graph representing the
connections between the resources; the network connecting them makes these separate resources
into a grid.
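
This graph view can be written down directly. The following is a minimal sketch, assuming an
undirected graph in which every link is usable in both directions; the GridGraph class and the
resource names are hypothetical and serve only to illustrate the model.

```python
from collections import deque
from typing import Dict, Set


class GridGraph:
    """Undirected graph: nodes are computing resources, edges are network links."""

    def __init__(self) -> None:
        self._links: Dict[str, Set[str]] = {}

    def add_resource(self, name: str) -> None:
        self._links.setdefault(name, set())

    def add_link(self, a: str, b: str) -> None:
        self.add_resource(a)
        self.add_resource(b)
        self._links[a].add(b)
        self._links[b].add(a)

    def reachable_from(self, start: str) -> Set[str]:
        """Breadth-first search: every resource connected to start, including itself."""
        seen = {start}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for neighbour in self._links.get(current, ()):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        return seen


if __name__ == "__main__":
    grid = GridGraph()
    grid.add_link("cluster-a", "storage-b")
    grid.add_link("storage-b", "desktop-c")
    grid.add_resource("isolated-d")  # has no links, so it is not part of the grid
    print(sorted(grid.reachable_from("cluster-a")))
```

A resource with no path to the others, such as the isolated one above, is simply a separate
machine: it is the connectivity that turns the collection into a grid.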

There are two major types of grid:
      a) An intra-organizational grid; this is a grid within a single administrative domain. An
         example is a tool to make use of the desktops within a single organization.
      b) An inter-organizational grid; this is a grid that crosses administrative domains. In
         addition to the issues faced by software for an intra-organizational grid, software for an
         inter-organizational grid must deal with the complex issues of authentication, trust, and

The Grid differs from classical computer systems in several respects:
a) Grids are typically physically large, often spanning thousands of kilometres. This implies a
   speed-of-light delay of milliseconds (10^-3 seconds), compared with the clock period of a
   typical personal computer, which is now measured in fractions of a nanosecond (10^-9 seconds).
   This leads to a different application mix from that appropriate for a single parallel computer
   or computing resource.
b) Grid resources are shared. Even if the computational services are dedicated, the network
   connecting them is typically shared with many users (if it is the Internet, with millions of
   users). Because resources are shared, the sort of experimental reproducibility based on the
   use of dedicated resources, so common in computer system measurement, is rarely feasible or
   realistic in the Grid.
c) Grid resources are heterogeneous. While it is possible to build a grid out of homogeneous
   components (e.g., the same computing platform at all endpoints), this is uncommon.
d) Faults are a fact of life on the Grid. Grid software and algorithms must deal with faults; this
   differs greatly from the single-system case (at least in most areas of technical computing),
   where software and algorithms are designed under the assumption that
   faults are exceedingly rare. Benchmarks, particularly for grid usability and productivity,
   must be sensitive to how the system responds to faults.
e) Grid applications need to operate securely on distributed resources using various security
   measures (such as firewalls, or technologies such as Kerberos). The extra security
   needed in a grid can affect an application's performance in ways that would not
   affect an application on a traditional single-site platform. It is
   important that we understand the impact that different levels of security have on grid
4.1.2    Components

A grid consists of heterogeneous hardware and software components linked together over,
potentially, a wide area via the Internet. We can categorise a grid as consisting of hardware and
software services… something generic here as to why we want this!

4.2      Hardware Services:

Can’t say much about this side. Basically stuck with what’s out there. Maybe worth thinking
about the means of characterising these statistically at a later stage…

                         Client
                         Servers

4.3       Software Services:
         Software layers and their effect… e.g. GT + WS == GS, SOAP engines,…
         Client – client side processing used to initiate some action!
         Information Services – registration, lookup, update, and remove services and associated
         Security – authorisation, authentication, assertion…
         Communications – put/get information + data
         Batch Systems -
         Spawning – starting, stopping and removing jobs.
         Servers – Apache, Tomcat….
         Serialisation
         Runtime libraries and language…

5         Measurements and Metrics

5.1       Measurements

Thoughts: maybe advocate…
         SI units -
         Hockney’s ideas -
         Is there an IETF effort?


         Maybe look at IETF Benchmarking Methodology (BMWG),

          Test set up
          Test Considerations
          Reporting Format:

          Existing definitions - use IETF where possible.

          The Meaning of
Common Definitions and discussion

6       Goals and Requirements for Grid Benchmarks

As described in Section 3, there are many well-understood techniques for understanding the
performance of a single computer system. Thus, a grid benchmark should measure the
features of the Grid that are (roughly) independent of the performance of a computational
endpoint. That is, the characteristics measured by a grid benchmark should be (nearly)
orthogonal to those measured by existing benchmarks for single systems.

Grid benchmarks should be reproducible. Reproducibility is a hallmark of good experimental
science. Since the Grid is a shared resource and in most cases cannot be completely controlled by
the benchmarker, good grid benchmarks will need to use statistical techniques to provide valid
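
One way to approach this, offered here only as a sketch, is to repeat each measurement many
times and report robust summary statistics rather than a single figure. The summarise function
and the sample values below are hypothetical; the choice of median and interquartile range is
an assumption of this example, made because they are less sensitive than the mean to the
occasional very slow run that a shared network produces.

```python
import statistics
from typing import Dict, List


def summarise(samples: List[float]) -> Dict[str, float]:
    """Summarise repeated timings (in seconds) of the same benchmark."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to estimate spread")
    q1, _, q3 = statistics.quantiles(samples, n=4)  # quartile cut points
    return {
        "n": len(samples),
        "median": statistics.median(samples),
        "iqr": q3 - q1,
        "mean": statistics.mean(samples),
        "stdev": statistics.stdev(samples),
    }


if __name__ == "__main__":
    # Made-up transfer times in seconds; the last run hit a congested link.
    timings = [1.02, 0.98, 1.05, 1.01, 4.70]
    for key, value in summarise(timings).items():
        print(f"{key}: {value:.3f}")
```

Reporting the spread alongside the central value lets two runs of the same benchmark be
compared even when the individual measurements cannot be reproduced exactly.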

Usability benchmarks are much harder to quantify (and to reproduce). However, such
benchmarks are needed to help improve the robustness and reliability of grid software.

7       Summary and Conclusions


[2] Top500,
[3] ParkBench,
[4] NPB,
[5] The SPEC Benchmarks,
[6] Grid Benchmarking Research Group,
[7] IETF Benchmarking Methodology (BMWG),
[8] Grid Assessment Probes (GRASP),
[9] CrossGrid,
[10] GridBench,
[11] Integrated Performance Analysis of Computer Systems (IPACS), http://www.ipacs-
[12] M.A. Frumkin and L. Shabanov, Arithmetic Data Cube as a Data Intensive Benchmark, NAS
     Tech Report NAS-03-005,
[13] R.F. Van der Wijngaart, R. Biswas, M. Frumkin, and H. Feng, Beyond the NAS Parallel
     Benchmarks: Measuring Performance of Dynamic and Grid-oriented Applications, Workshop
     on the Performance Characterization of Algorithms, July 2001.
[14] B. Plale, C. Jacobs, Y. Liu, C. Moad, R. Parab, and P. Vaidya, Benchmark Details of Synthetic
     Database Benchmark/Workload for Grid Resource Information, Indiana University Technical
     Report 583, August 2003, 27 pp.
[15] C.A. Lee, C. De Matteis, J. Stepanek, and J. Wang, Cluster Performance and the Implications
     for Distributed, Heterogeneous Grid Performance,
     Heterogeneous Computing Workshop 2000, pp 253-261,
[16] GAMES-UK,
