Grid Computing and Alternative Distributed Computing


									             Grid Computing


Alternative Distributed Computing Solutions

                                Noman Islam
                                Oct, 2007
                                NU-FAST, Karachi
The defining characteristic of a grid [1]:
 “The essence of grid computing lies in the
 efficient and optimal utilization of a wide
 range of heterogeneous, loosely coupled
 resources in an organization tied to
 sophisticated workload management
 capabilities or information virtualization.”
 A Three Point Check List for Grids [4]

1. Coordinates resources that are not subject to
   centralized control
  –   A Grid integrates and coordinates resources and
      users that live within different control domains
2. Uses standard, open, general-purpose
   protocols and interfaces
  –   built from multi-purpose protocols and interfaces
      that address such fundamental issues as
      authentication, authorization, resource discovery,
      and resource access
3. Delivers nontrivial qualities of service
  – Allows its constituent resources to be used in a
    coordinated fashion to deliver various qualities of
    service to meet complex user demands, so that the
    utility of the combined system is significantly greater
    than that of the sum of its parts
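The checklist above can be read as a simple conjunction of three tests. The following is a minimal sketch of that reading; the `SystemProfile` class, its field names, and the two example systems are hypothetical and only illustrate how the three points combine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemProfile:
    """Hypothetical description of a distributed system (illustrative only)."""
    name: str
    decentralized_control: bool     # point 1: not subject to centralized control
    open_standard_protocols: bool   # point 2: standard, open, general-purpose protocols
    nontrivial_qos: bool            # point 3: nontrivial qualities of service


def is_grid(profile: SystemProfile) -> bool:
    """A system passes the checklist only if all three points hold."""
    return (profile.decentralized_control
            and profile.open_standard_protocols
            and profile.nontrivial_qos)


# The boolean values below are assumptions chosen for illustration.
cluster_scheduler = SystemProfile("local cluster scheduler", False, True, True)
multi_site_grid = SystemProfile("multi-institution grid", True, True, True)

print(is_grid(cluster_scheduler))  # False
print(is_grid(multi_site_grid))    # True
```

A system that fails any single point, such as a scheduler under one administrative domain, does not qualify under this reading even if it meets the other two.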
Grid Fills a Crucial Gap [1]
 Introduction to Cluster Computing
• A group of tightly coupled computers that work
  together closely so that in many respects they
  can be viewed as though they are a single computer
• They are often connected to each other through
  a fast LAN
• Cluster Categories
  – High-availability (HA) clusters
  – Load-balancing clusters
  – High-performance computing (HPC) clusters
An example of a cluster
     Grid Vs Cluster Computing
• The key differences between grids and traditional clusters
  are that grids connect collections of computers which do
  not fully trust each other, or which are geographically
  dispersed
• Grid computing is optimized for workloads which consist
  of many independent jobs or packets of work, which do
  not have to share data between the jobs during the
  computation process. Grids serve to manage the
  allocation of jobs to computers which will perform the
  work independently of the rest of the grid cluster.
  Resources such as storage may be shared by all the
  nodes, but intermediate results of one job do not affect
  other jobs in progress on other nodes of the grid.
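The workload pattern described above, many independent jobs that never exchange intermediate results, can be sketched in a few lines. This is only an illustration: local threads stand in for grid nodes, and `run_job` and `run_all` are hypothetical names, not part of any grid middleware.

```python
from concurrent.futures import ThreadPoolExecutor


def run_job(job_id: int) -> int:
    """One independent unit of work: sums squares over its own input range only."""
    start = job_id * 1000
    return sum(n * n for n in range(start, start + 1000))


def run_all(job_ids, workers=4):
    """Hand each job to any free worker; jobs never share intermediate data."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results line up with job_ids.
        return list(pool.map(run_job, job_ids))


results = run_all(range(8))
print(len(results))  # 8
```

Because no job reads another job's state, the allocator is free to place each one on whichever worker is available, which is the property that lets a grid tolerate slow or distant nodes.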
• Grids consist of heterogeneous resources
  (integrating storage, networking, and
  computation resources), whereas clusters
  typically provide only computational resources
• Clusters usually contain a single type of
  processor and operating system; grids can
  contain machines from different vendors
  running various operating systems
• Grids are dynamic by their nature. Clusters
  typically contain a static number of processors
  and resources; resources come and go on the
  grid. Resources are provisioned onto and
  removed from the grid on an ongoing basis
• Grids are inherently distributed over a local,
  metropolitan, or wide-area network. Usually,
  clusters are physically contained in the same
  complex in a single location; grids can be (and
  are) located everywhere. Cluster interconnect
  technology delivers extremely low network latency
• Grids offer increased scalability. Physical proximity and
  network latency limit the ability of clusters to scale out;
  due to their dynamic nature, grids offer the promise of
  high scalability
• But cluster and grid computing are becoming
  complementary. Many grids incorporate clusters among
  the resources they manage. Indeed, a grid user may be
  unaware that the workload is in fact being executed on a
  remote cluster. And while there are differences between
  grids and clusters, these differences afford them an
  important relationship because there will always be a
  place for clusters -- certain problems will always require
  a tight coupling of processors
• As networking capability and bandwidth
  advance, problems that were previously
  the exclusive domain of cluster computing
  will be solvable by grid computing. It is
  vital to comprehend the balance between
  the inherent scalability of grids and the
  performance advantages of tightly coupled
  interconnections that clusters offer
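The balance described above can be made concrete with a toy cost model. The sketch below assumes that compute divides perfectly across nodes and that every synchronization step costs one network round trip; the node counts, latencies, and workload size are illustrative assumptions, not measurements.

```python
def run_time(total_work_s, nodes, sync_rounds, latency_s):
    """Toy model: perfectly divisible compute plus one round trip per sync step."""
    return total_work_s / nodes + sync_rounds * latency_s


# All figures below are illustrative assumptions, not measurements.
CLUSTER = {"nodes": 64, "latency_s": 0.00001}   # few nodes, fast interconnect
GRID = {"nodes": 1024, "latency_s": 0.05}       # many nodes, wide-area latency
WORK_S = 36_000                                 # total compute in CPU-seconds

for label, sync_rounds in [("independent jobs", 0), ("tightly coupled", 100_000)]:
    cluster_t = run_time(WORK_S, CLUSTER["nodes"], sync_rounds, CLUSTER["latency_s"])
    grid_t = run_time(WORK_S, GRID["nodes"], sync_rounds, GRID["latency_s"])
    winner = "grid" if grid_t < cluster_t else "cluster"
    print(f"{label}: cluster={cluster_t:.1f}s grid={grid_t:.1f}s -> {winner}")

# Improved wide-area networking (assumed 1 ms latency) flips the tightly coupled case.
faster_grid_t = run_time(WORK_S, GRID["nodes"], 100_000, 0.001)
print(faster_grid_t < run_time(WORK_S, CLUSTER["nodes"], 100_000, CLUSTER["latency_s"]))
```

Under these assumed numbers the grid wins when jobs are independent, the cluster wins when the work synchronizes often, and a lower wide-area latency moves the tightly coupled workload into the grid's range.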
         Introduction to P2P
• P2P is a class of applications that takes
  advantage of resources (storage, cycles,
  content, human presence) available at the
  edges of the Internet
• A pure peer-to-peer network does not
  have the notion of clients or servers, but
  only equal peer nodes that simultaneously
  function as both "clients" and "servers" to
  the other nodes on the network.
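The idea of equal peers that act as both client and server can be sketched with a single class that exposes both roles. This is an in-memory illustration only; the `Peer` class and its methods are hypothetical and do not model any particular P2P protocol.

```python
class Peer:
    """A node that both serves content to others and requests content from them."""

    def __init__(self, name):
        self.name = name
        self.files = {}       # content this peer can serve
        self.neighbors = []   # other peers it knows about

    def connect(self, other):
        """Links are symmetric: neither side is 'the server'."""
        self.neighbors.append(other)
        other.neighbors.append(self)

    def serve(self, filename):
        """Server role: return local content, or None if absent."""
        return self.files.get(filename)

    def fetch(self, filename):
        """Client role: ask each neighbor in turn and keep a local copy."""
        for peer in self.neighbors:
            data = peer.serve(filename)
            if data is not None:
                self.files[filename] = data   # this peer can now serve it too
                return data
        return None


alice, bob, carol = Peer("alice"), Peer("bob"), Peer("carol")
alice.connect(bob)
bob.connect(carol)
alice.files["notes.txt"] = "grid vs p2p"

print(bob.fetch("notes.txt"))    # bob acts as a client of alice
print(carol.fetch("notes.txt"))  # bob now acts as a server for carol
```

Within the space of two calls, `bob` plays both roles, which is the sense in which a pure P2P network has no fixed clients or servers.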
               Grid Vs P2P
• Grids were motivated by the requirements of
  professional communities needing to access
  remote resources, federate datasets, and/or
  pool computers for large-scale simulations and
  data analyses. They were initially developed to
  address the needs of scientific collaborations,
  but commercial interest is growing
• P2P has been popularized by grass roots,
  mass-culture file-sharing and highly parallel
  computing applications that scale in some
  instances to hundreds of thousands of nodes
• Grids integrate resources that are more
  powerful, more diverse, and better connected
  than typical P2P resources
  – Grid resource: a cluster, storage system, database, or
    scientific instrument administered in an organized
    fashion according to some well-defined policy.
• P2P systems often deal with intermittent participation and
  highly variable behavior.
  – The major resources are home computers.
• Grids often involve only modest numbers
  of participants, although the amount of activity can
  be large.
  – Early Grid implementations did NOT address
    scalability and self-management as priorities
• P2P has far larger communities
• In grids, much work has gone into creating and operating
  persistent, multipurpose infrastructure services for
  authentication, authorization, discovery, resource access,
  and data movement. Less effort has been devoted to
  managing participation in the absence of trust
• P2P offers high scalability, fault tolerance,
  self-configuration, and automatic problem determination.
  P2P systems have tended to focus on the integration of
  simple resources (individual computers) by protocols. The
  persistence properties of such infrastructures are not
  specifically engineered but are rather emergent
• A P2P system lacks a central point of management; this
  makes it ideal for providing anonymity. Grid
  environments, on the other hand, usually have some
  form of centralized management and security (for
  instance, in resource management or workload
  scheduling)
• Lack of centralization means:
   – More scalable
   – More tolerant of single-point failures than grid computing
     systems. (Although grids are much more resilient than tightly
     coupled distributed systems, a grid inevitably includes some key
     elements that can become single points of failure)
• The key to building grid computing systems is finding a
  balance between decentralization and manageability --
  not an easy chore
• Also, while an important characteristic of grid
  computing is that resources are dynamic, in P2P
  systems the resources are much more dynamic
  in nature and generally are more fleeting than
  resources on a grid
• A final distinction between the two systems is
  standards -- the general lack of standards in the
  P2P world contrasts with the host of standards in
  the grid universe. And, thanks to entities like the
  Global Grid Forum, the grid universe has a
  mechanism for refining existing standards and
  creating new ones
Common Object Request Broker Architecture
          Grid Vs CORBA
 – OGSA and CORBA are both based on the
   concept of service-oriented architecture
 – CORBA assumes object orientation (after all,
   it is part of the name), but grid computing
   does not
 – There are also issues of interoperability
   among different platforms in CORBA
Distributed Computing Environment
• The Distributed Computing Environment (DCE) is a
  software system developed in the early 1990s by a
  consortium that included Apollo Computer (later part of
  Hewlett-Packard), IBM, Digital Equipment Corporation,
  and others. The DCE supplies a framework and toolkit
  for developing client/server applications. The framework
  includes a remote procedure call (RPC) mechanism
  known as DCE/RPC, a naming (directory) service, a time
  service, an authentication service, an authorization
  service and a distributed file system (DFS) known as DCE/DFS.
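Two of the pieces listed above, a naming (directory) service and a remote procedure call mechanism, fit together in a way that a short sketch can show. The code below is an in-process toy and does not use DCE's actual APIs: the `Directory` class, `server_dispatch`, `remote_call`, and the `math.add` service name are all hypothetical, and JSON stands in for the wire format.

```python
import json


class Directory:
    """Stand-in for a naming (directory) service: maps service names to handlers."""

    def __init__(self):
        self._services = {}

    def register(self, name, handler):
        self._services[name] = handler

    def lookup(self, name):
        return self._services[name]


def server_dispatch(handler, request_bytes):
    """Server side: unmarshal the request, run the procedure, marshal the reply."""
    request = json.loads(request_bytes)
    result = handler(*request["args"])
    return json.dumps({"result": result}).encode()


def remote_call(directory, service, *args):
    """Client stub: looks like a local call but goes through marshalled messages."""
    handler = directory.lookup(service)
    request_bytes = json.dumps({"args": list(args)}).encode()
    # In a real system this step would be a network hop to another machine.
    reply_bytes = server_dispatch(handler, request_bytes)
    return json.loads(reply_bytes)["result"]


directory = Directory()
directory.register("math.add", lambda a, b: a + b)
print(remote_call(directory, "math.add", 2, 3))  # 5
```

The caller only names the service and passes arguments; locating the handler and converting arguments to and from bytes is hidden in the stub, which is the convenience an RPC framework aims to provide.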
             Grid Vs DCE
• Not so much an architecture but an
  environment, DCE facilitates distributed
  computing; grid computing (in the form of
  OGSA) is more of an end-to-end
  architecture designed to encapsulate
  many of the intricacies of the mechanics of
  distributed computing
• We have examined Grid Computing and
  its importance at the enterprise level
• We have also analyzed the similarities and
  differences between grid computing and
  four major distributed computing systems
• Based on the benefits of these paradigms,
  we can expect these approaches to
  eventually converge
[1] Matt Haynos (Program Director, Grid Marketing and Strategy, IBM),
    “Perspectives on grid: Grid computing -- Next-generation distributed
    computing”

[2] Yin Chen, “Grid Vs Peer-to-Peer”

[3] Wikipedia, the Free Encyclopedia

[4] Ian Foster, “What is the Grid? A Three Point Check List”, 2002
