

International Telecommunication Union

    Distributed Computing: Utilities, Grids & Clouds
    ITU-T Technology Watch Report 9

           Terms such as ‘Cloud Computing’ have gained a lot of attention, as they are used
           to describe emerging paradigms for the management of information and computing
           resources. This report describes the advent of new forms of distributed computing,
           notably grid and cloud computing, the applications that they enable, and their
           potential impact on future standardization.


      Telecommunication Standardization Policy Division
      ITU Telecommunication Standardization Sector
                                              ITU-T Technology Watch Reports

ITU-T Technology Watch Reports are intended to provide an up-to-date assessment of promising
new technologies in a language that is accessible to non-specialists, with a view to:
  Identifying candidate technologies for standardization work within ITU.
  Assessing their implications for ITU Membership, especially developing countries.

Other reports in the series include:
   #1 Intelligent Transport System and CALM
   #2 Telepresence: High-Performance Video-Conferencing
   #3 ICTs and Climate Change
   #4 Ubiquitous Sensor Networks
   #5 Remote Collaboration Tools
   #6 Technical Aspects of Lawful Interception
   #7 NGNs and Energy Efficiency
   #8 Intelligent Transport Systems

This report was prepared by Martin Adolph. It has benefited from contributions and comments from
Ewan Sutherland and Arthur Levin.
The opinions expressed in this report are those of the authors and do not necessarily reflect the views
of the International Telecommunication Union or its membership.
This report, along with previous Technology Watch Reports, can be found at
Your comments on this report are welcome; please send them to tsbtechwatch@itu.int or join the
Technology Watch Correspondence Group, which provides a platform to share views, ideas and
requirements on new/emerging technologies.
The Technology Watch function is managed by the ITU-T Standardization Policy Division (SPD).


                                                ITU 2009

All rights reserved. No part of this publication may be reproduced, by any means whatsoever, without the
prior written permission of ITU.

                             Distributed Computing:
                             Utilities, Grids & Clouds
The spread of high-speed broadband networks in developed countries, the continual increase in
computing power, and the growth of the Internet have changed the way in which society manages
information and information services. Geographically distributed resources, such as storage
devices, data sources, and supercomputers, are interconnected and can be exploited by users
around the world as a single, unified resource. To a growing extent, repetitive or
resource-intensive IT tasks can be outsourced to service providers, which execute the task and
often provide the results at a lower cost. A new paradigm is emerging in which computing is
offered as a utility by third parties whereby the user is billed only for consumption. This
service-oriented approach from organizations offering a large portfolio of services can be
scalable and flexible.

This report describes the advent of new forms of distributed computing, notably grid and cloud
computing, the applications that they enable, and their potential impact on future
standardization. The idea of distributing resources within computer networks is not new. It
dates back to remote job entry on mainframe computers and the initial use of data entry
terminals. This was expanded first with minicomputers, then with personal computers (PCs) and
two-tier client-server architecture. While the PC offered more autonomy on the desktop, the
trend is moving back to client-server architecture with additional tiers, but now the server
is not in-house.

Improvements not only in computer component technology but also in communication protocols
paved the way for distributed computing. Networks based on Systems Network Architecture (SNA),
created by IBM in 1974, and on ITU-T’s X.25, approved in March 1976 1, enabled large-scale
public and private data networks. These were gradually replaced by more efficient or less
complex protocols, notably TCP/IP. Broadband networks extend the geographical reach of
distributed computing, as the client-server relationship can extend across borders and
continents.

A number of new paradigms and terms related to distributed computing have been introduced,
promising to deliver IT as a service. While experts disagree on the precise boundaries between
these new computing models, the following table provides a rough taxonomy.

   New Computing Paradigms   New Services                          New or enhanced Features
   Cloud computing           Software as a Service (SaaS)          Ubiquitous access
   Edge computing            Infrastructure as a Service (IaaS)    Reliability
   Grid computing            Platform as a Service (PaaS)          Scalability
   Utility computing         Service-Oriented Architecture (SOA)   Virtualization
                                                                   Exchangeability / Location independence
                                                                   Cost-effectiveness

It is difficult to draw lines between these paradigms: some commentators say that grid, utility
and cloud computing refer to the same thing; others believe there are only subtle distinctions
among them, while others would claim they refer to completely different phenomena. 2 There are
no clear or standard definitions, and it is likely that vendor A describes the feature set of
its cloud solution differently than vendor B.

The new paradigms are sometimes likened to the electric power grid, which provides universal
access to electricity and has had a dramatic impact on social and industrial development. 3
Electric power grids are spread over large geographical regions, but form a single entity,
providing power to billions of devices and customers, in a relatively low-cost and reliable
fashion. 4 Although owned and operated by different organizations at different geographical
locations, the components of grids appear highly heterogeneous in their physical
characteristics. Their users only rarely know about the details of operation,

Distributed Computing: Utilities, Grids & Clouds (March 2009)                                                   1
Figure 1: Stack of a distributed system (layers from top to bottom)

   Clients (e.g., web browser and other locally installed software, devices)
   Middleware services (e.g., for load balancing, scheduling, billing)
   Resource entity 1 (e.g., application server) | Resource entity 2 (e.g., virtual system) | Resource entity 3 (e.g., database, storage) | Resource entity n
   Resource interconnecter
   Shared resources
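
The middleware layer in Figure 1 is described only by its duties (load balancing, scheduling, billing). The following minimal Python sketch illustrates how those duties could fit together. It is not taken from any real grid or cloud product: the class names, the "lowest relative load" placement rule, and the per-CPU-second price are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """A resource entity from Figure 1 (hypothetical model)."""
    name: str
    capacity: int      # assumed unit: maximum number of concurrent jobs
    running: int = 0   # jobs currently placed on this resource


@dataclass
class Middleware:
    """Toy mediator that schedules jobs, balances load and meters usage."""
    resources: list
    usage_log: dict = field(default_factory=dict)  # client -> CPU-seconds

    def submit(self, client, cpu_seconds):
        # Scheduling: only resources with spare capacity are candidates.
        free = [r for r in self.resources if r.running < r.capacity]
        if not free:
            raise RuntimeError("all resources are at capacity")
        # Load balancing: choose the resource with the lowest relative load.
        target = min(free, key=lambda r: r.running / r.capacity)
        target.running += 1
        # Metering: record consumption so that it can be billed later.
        self.usage_log[client] = self.usage_log.get(client, 0.0) + cpu_seconds
        return target.name

    def invoice(self, price_per_cpu_second):
        # Billing: charge each client only for what it consumed.
        return {client: seconds * price_per_cpu_second
                for client, seconds in self.usage_log.items()}


mw = Middleware([Resource("application-server", 2), Resource("virtual-system", 4)])
placements = [mw.submit("client-a", 30.0),
              mw.submit("client-b", 10.0),
              mw.submit("client-a", 20.0)]
print(placements)
print(mw.invoice(price_per_cpu_second=0.01))
```

The client never names a resource; it only submits work, which mirrors the idea that the resources behind the middleware appear as one entity.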

or the location of the resources they are using.

In general terms, a distributed system is “a collection of independent computers that appears
to its users as a single coherent system” (Andrew S. Tanenbaum) 5. A second description of
distributed systems, by Leslie Lamport, points out the importance of considering aspects such
as reliability, fault tolerance and security when going distributed: “You know you have a
distributed system when the crash of a computer you’ve never heard of stops you from getting
any work done.” 6

Even without a clear definition for each of the distributed paradigms, clouds and grids have
been hailed by some as a trillion dollar business opportunity. 7

Shared resources

The main goal of a distributed computing system is to connect users and IT resources in a
transparent, open, cost-effective, reliable and scalable way.

The resources that can be shared in grids, clouds and other distributed computing systems
include:

   • Physical resources
        o Computational power
        o Storage devices
        o Communication capacity
   • Virtual resources, which can be exchanged and are independent of their physical location,
     like virtual memory
        o Operating systems
        o Software and licenses
        o Tasks and applications
        o Services

Figure 1 outlines a possible composition of a distributed system. Similar system stacks have
been described, e.g., specifically for clouds 8 and grids 9, and in a simplified stack with
three layers 10: application layer, mediator (= resource interconnecter), and connectivity
layer (= shared resources).

The client layer is used to display information, receive user input, and communicate with the
other layers. A web browser, a Matlab computing client or an Oracle database client are
examples of the applications that can be addressed.

A transparent and network-independent middleware layer plays a mediating role: it connects
clients with requested and provisioned resources, balances peak loads between multiple
resources and customers, regulates the access to limited resources (such as processing time on
a supercomputer), monitors all activities, and gathers statistics, which can later be used for
billing and system management. The middleware has to be reliable and always available. It
provides interfaces to the layers above and below it, which can be used by programmers to
shape the system according to their needs. These interfaces enable the system to be scalable
and extensible, and to handle peak loads, for instance during the holiday season (see Box 1).

Different resources can be geographically dispersed or hosted in the same data center.
Furthermore, they can be interconnected. Regardless of the architecture of the resources, they
appear to the user/client as one entity. Resources can be formed into

         Box 1: Amazon.com holiday sales 2002-2008

                                                                  2002   2003   2004   2005   2006   2007   2008
          Number of items ordered on peak day                     1.7m   2.1m   2.8m   3.6m   4m     5.4m   6.3m
          Average number of items ordered per second on peak day  20     24     32     41     46     62.5   72.9

         Amazon.com, one of the world’s largest online retailers, announced that 6.3 million
         items were ordered on the peak day of the holiday season on 15 December 2008 – a
         multiple of the items sold on an ordinary business day. This is 72.9 items per second
         on average.
         Source: Amazon.com press releases, 2002-2008
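
The per-second averages in Box 1 follow from dividing the peak-day order count by the number of seconds in a day. The short Python check below reproduces that arithmetic. Its inputs are the rounded figures printed in the box, so results for some years can differ slightly from the published row; the function and variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

# Peak-day orders from Box 1, in millions of items (rounded as published).
peak_day_orders_millions = {
    2002: 1.7, 2003: 2.1, 2004: 2.8, 2005: 3.6,
    2006: 4.0, 2007: 5.4, 2008: 6.3,
}


def average_items_per_second(items_in_millions):
    """Average order rate over a full 24-hour day."""
    return items_in_millions * 1_000_000 / SECONDS_PER_DAY


for year, millions in peak_day_orders_millions.items():
    print(year, round(average_items_per_second(millions), 1))
```

An average over 24 hours understates the true peak: orders are not spread evenly over the day, so the infrastructure has to be sized for a higher short-term rate.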

virtual organizations, which again can make use of other resources.

Provided that the service meets the technical specifications defined in a Service Level
Agreement, for some users the location of the data is not an issue. However, the users of
distributed systems need to consider legal aspects, questions of liability and data security,
before outsourcing data and processes. These issues are addressed later in this report.

Grid computing

Grid computing enables the sharing, selection, and aggregation by users of a wide variety of
geographically distributed resources owned by different organizations and is well-suited for
solving IT resource-intensive problems in science, engineering and commerce.

Grids are very large-scale virtualized, distributed computing systems. They cover multiple
administrative domains and enable virtual organizations. 11 Such organizations can share their
resources collectively to create an even larger grid.

For instance, 80,000 CPU cores are shared within EGEE (Enabling Grids for E-sciencE), one of
the largest multi-disciplinary grid infrastructures in the world. This brings together more
than 10,000 users in 140 institutions (300 sites in 50 countries) to produce a reliable and
scalable computing resource available to the European and global research community. 12
High-energy physics (HEP) is one of the pilot application domains in EGEE, and is the largest
user of the grid infrastructure. The four Large Hadron Collider (LHC) experiments at CERN 13,
Europe’s central organization for nuclear research, together send more than 150,000 jobs per
day to the EGEE infrastructure and generate hundreds of terabytes of data per year. This is
done in collaboration with the Open Science Grid (OSG 14) project in the USA and the Nordic
Data Grid Facility (NDGF 15).

The CERN grid is also used to support research communities outside the field of HEP. In 2006,
the ITU-R Regional Radio Conference (RRC06 16) established a new frequency plan for the
introduction of digital broadcasting in the VHF (174-230 MHz) and UHF (470-862 MHz) bands. The
complex calculations involved required non-trivial dependable computing capability. The tight
schedule at the RRC06 imposed very stringent time constraints for performing a full set of
calculations (less than 12 hours for an estimated 1000 CPU-hours on a 3 GHz PC).

The ITU-R developed and deployed a client-server distributed system consisting of 100
high-speed (3.6 GHz) hyper-threaded PCs, capable of running 200 parallel jobs. To complement
the local cluster and to provide additional flexibility and reliability to the planning
system, it agreed with CERN to use resources from the EGEE grid infrastructure (located at
CERN and other institutions in Germany, Russia, Italy, France and Spain).

UNOSAT 17 is a humanitarian initiative delivering satellite solutions to relief and
development organizations within and outside the UN system for crisis response, early recovery
and vulnerability reduction. UNOSAT uses the grid to convert uncompressed satellite images
into JPEG2000 ECW 18 files. UNOSAT has already been involved in a number of joint activities

      Box 2: Folding@home: What is protein folding and how is folding linked to disease?

      Proteins are biology’s workhorses, its “nanomachines.” Before proteins can carry
      out these important functions, they assemble themselves, or “fold.” The process
      of protein folding, while critical and fundamental to virtually all of biology, in
      many ways remains a mystery. Moreover, when proteins do not fold correctly (i.e.
      “misfold”), there can be serious consequences, including many well known
      diseases, such as Alzheimer’s, Mad Cow (BSE/CJD), Huntington’s, Parkinson’s,
      and many cancers.
      Folding@home uses distributed computing to simulate problems millions of times
      more challenging than previously achieved, by interconnecting idle computer
      resources of individuals from throughout the world, represented as red dots in the
      Figure above (May 2008). More than 400,000 CPUs are active, corresponding to a
      performance of 4.5 PFLOPS.
      Source: http://folding.stanford.edu/
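
The principle behind projects like the one in Box 2 is to cut a large computation into independent work units that idle machines can process separately. The Python sketch below simulates that idea in a single process. It is a toy model under stated assumptions and not the actual Folding@home or BOINC protocol: the round-robin hand-out, the volunteer names and the sum-of-squares stand-in workload are all hypothetical.

```python
from collections import deque


def split_into_work_units(start, stop, unit_size):
    """Cut the range [start, stop) into independent chunks."""
    return [(lo, min(lo + unit_size, stop))
            for lo in range(start, stop, unit_size)]


def compute_unit(unit):
    """Stand-in for the real simulation: sum of squares over one chunk."""
    lo, hi = unit
    return sum(n * n for n in range(lo, hi))


def run_project(units, volunteers):
    """Hand units to volunteers in turn and combine the partial results."""
    pending = deque(units)
    completed_by = {name: 0 for name in volunteers}
    partial_results = []
    while pending:
        for name in volunteers:
            if not pending:
                break
            unit = pending.popleft()                     # volunteer fetches a unit
            partial_results.append(compute_unit(unit))   # and returns its result
            completed_by[name] += 1
    return sum(partial_results), completed_by


units = split_into_work_units(0, 1000, 100)
total, completed_by = run_project(units, ["pc-1", "pc-2", "pc-3"])
print(total, completed_by)
```

Because the units do not depend on each other, adding volunteers shortens the elapsed time without changing the combined result, which is what makes this class of problem suitable for donated, unreliable machines.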

with ITU, particularly in providing satellite imagery for humanitarian work 19.

In volunteer computing, individuals donate unused or idle resources of their computers to
distributed computing projects such as SETI@home 20, Folding@home 21 (see Box 2) and
LHC@home 22. A similar mechanism has also been implemented by ITU-R, which uses idle PCs of
ITU’s staff at nighttime to carry out the monthly compatibility analysis of HF broadcasting
schedules.

The resources of hundreds and thousands of PCs are organized with the help of middleware
systems. The Berkeley Open Infrastructure for Network Computing (BOINC 23) is the most
widely-used middleware in volunteer computing made available to researchers and their
projects.

Grid technology has emerged from the scientific and academic communities and entered the
commercial world. For instance, the world’s largest company and banking group 24 HSBC uses a
grid with more than 3,500 CPUs operating in its data centers in four countries to carry out
derivative trades, which rely on making numerous calculations based on future events, and risk
analysis, which also looks to the future, calculating risks based on available information 25.
The German shipyard FSG 26 uses high performance computing resources to solve complex and
CPU-intensive calculations to create individual ship designs in a short time. On-demand access
to resources that are not available locally or are only needed temporarily reduces the cost of
ownership as well as the technical and financial risks in ship design. By increasing the
availability of computing resources and helping to integrate data, grid computing enables
organizations to address problems that were previously too large or too complex for them to
handle alone. Other commercial applications of grid computing can be found in logistics,
engineering, pharmaceuticals and the ICT sector. 27
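
The RRC06 figures quoted earlier in this section (an estimated 1000 CPU-hours of work, a 12-hour deadline, 200 parallel jobs on the local cluster) can be related with simple arithmetic. The Python sketch below makes two idealizing assumptions that the report does not state: the work divides perfectly across jobs, and every job runs at the speed of the 3 GHz reference PC. The function names are illustrative.

```python
import math

TOTAL_WORK_CPU_HOURS = 1000   # estimate quoted for a 3 GHz PC
DEADLINE_HOURS = 12           # full set of calculations in less than 12 hours
LOCAL_PARALLEL_JOBS = 200     # capacity of the 100 hyper-threaded PCs


def ideal_wall_clock_hours(work_cpu_hours, parallel_jobs):
    """Elapsed time if the work is split evenly with no overhead."""
    return work_cpu_hours / parallel_jobs


def min_parallel_jobs(work_cpu_hours, deadline_hours):
    """Smallest number of parallel jobs that meets the deadline."""
    return math.ceil(work_cpu_hours / deadline_hours)


print(ideal_wall_clock_hours(TOTAL_WORK_CPU_HOURS, LOCAL_PARALLEL_JOBS))  # 5.0
print(min_parallel_jobs(TOTAL_WORK_CPU_HOURS, DEADLINE_HOURS))            # 84
```

Under these assumptions the local cluster alone would finish in about 5 hours, so the 200-job capacity leaves a margin of more than a factor of two. The additional EGEE resources then serve the flexibility and reliability goals mentioned in the text more than raw necessity.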

   Utility computing                                       However, in many cases it proves useful to
                                                           employ data centers close to the customer,
   The shift from using grids for non-
                                                           for example to ensure low rates of latency
   commercial scientific applications to using
                                                           and packet loss in content delivery
   them in processing-intensive commercial
                                                           applications. For example, content delivery
   applications led to also using distributed
                                                           providers such as Akamai 29 or Limelight
   systems for less challenging and resource-              Networks 30 built their networks of data
   demanding tasks.                                        centers around the globe, and interconnect
   The concept of utility computing is simple:             them with high-speed fiber-optic backbones.
   rather than operating servers in-house,                 These are directly connected to user access
   organizations subscribe to an external                  networks, in order to deliver to a maximum
   utility computing service provider and pay              of users simultaneously, while minimizing
   only for the hardware and software                      the path between the user and the desired
   resources they use. Utility computing relies            content.
   heavily on the principle of consolidation,
                                                           Cloud computing
   where physical resources are shared by a
   number of applications and users. The                   Over the years, technology and Internet
   principal resources offered include, but are            companies such as Google, Amazon,
   not     limited  to,    virtual   computing             Microsoft and others, have acquired a
   environments (paid per hour and data                    considerable expertise in operating large
   transfer), and storage capacity (paid per               data centers, which are the backbone of
   GB or TB used).                                         their businesses. Their know-how extends
                                                           beyond physical infrastructure and includes
   It is assumed that in-house data centers
                                                           experience with software, e.g., office suites,
   are idle most of the time due to over-
                                                           applications for process management and
   provisioning. Over-provisioning is essential
                                                           business intelligence, and best practices in
   to be sure they can handle peak loads (e.g.,
                                                           a range of other domains, such as Internet
   opening of the trading day or during holiday
                                                           search,     maps,     email    and      other
   shopping seasons), including unanticipated
                                                           communications applications. In cloud
   surges in demand. Utility computing allows
                                                           computing, these services are hosted in a
   companies to pay only for the computing
                                                           data center and commercialized, so that a
   resources they need, when they need
                                                           wide range of software applications are
   them. 28 It also creates markets for resource
                                                           offered by the provider as a billable service
   owners to sell excess capacities, and
                                                           (Software as a Service, SaaS) and no
   therefore make their data centers (and
                                                           longer need to be installed on the user’s
   business) more profitable. The example of
                                                           PC. 31    For example, instead of Outlook
   online retailer Amazon was mentioned in
                                                           stored on the PC hard drive, Gmail offers a
   Box 1. To increase efficiency, one Amazon
                                                           similar service, but the data is stored on
   server can host, in addition to a system
                                                           the providers’ servers and accessed via a
   managing the company’s e-commerce
                                                           web browser.
   services, multiple other isolated computing
   environments used by its customers. These               For small and medium-sized enterprises,
   virtual      machines       are      software           the ability to outsource IT services and
   implementations of ‘real’ computers that                applications not only offers the potential to
   can be customized according to the                      reduce overall costs, but also can lower the
   customers’     needs:    processing    power,           barriers to entry for many processing-
   storage capacity, operating system (e.g.,               intensive activities, since it eliminates the
   Linux, MS Windows), software, etc.                      need for up-front capital investment and
                                                           the necessity of maintaining dedicated
   With     the    increasing     availability of
                                                           infrastructure. Cloud providers gain an
   broadband networks in many countries,
                                                           additional source of revenue and are able to
   some computer utility providers do not
                                                           commercialize their expertise in managing
   necessarily need to be geographically
                                                           large data centers.
   distributed or in close proximity to clients:
   providers tend to build their data centers in           One main assumption in cloud computing
   areas with the lowest costs, e.g., for                  consists of infinite computing resources
   electricity, real estate, etc. and with access          available on demand and delivered via
   to renewable energy (e.g. hydroelectric).               broadband. However that is not always the

Distributed Computing: Utilities, Grids & Clouds (March 2009)                                               5
                                            ITU-T Technology Watch Reports
case. Problems faced by users in developing countries include the high cost of
software and hardware, a poor power infrastructure, and limited access to
broadband. Low-cost computing devices 32 equipped with free and open source
software might provide a solution for the first problem. Although the number of
broadband Internet subscribers has grown rapidly worldwide, developed economies
still dominate subscriptions, and the gap in penetration between developed and
developing countries is widening 33. Internet users without broadband access
are disadvantaged with respect to broadband users, as they are unable to use
certain applications, e.g., video and audio streaming or online backup of
photos and other data. Ubiquitous and unmetered access to broadband Internet is
one of the most important requirements for the success of cloud computing.

Applications available in the cloud include software suites that were
traditionally installed on the desktop and can now be accessed via a web
browser (e.g., for word processing, communication, email, business intelligence
applications, or customer relationship management). This paradigm may save
license fees and the costs of maintenance and software updates, which makes it
attractive to small businesses and individuals. Even some large companies have
adopted cloud solutions with the growing capacities, capabilities and success
of the service providers. Forrester Research suggests that cloud-based email
solutions would be less expensive than on-premise solutions for up to 15,000
email accounts. 34 Another approach is to outsource certain tasks to the cloud,
e.g., spam and virus filtering, and to keep other tasks in the corporate data
center, e.g., the storage of the mailbox.

Other typical services offered include web services for search, payment,
identification and mapping.

Utility and cloud providers
The list of providers of utility and cloud computing services is growing
steadily. Besides many smaller providers specialized in cloud and grid
services, such as 3tera 35, FlexiScale 36, Morph Labs 37 and RightScale 38, the
list includes some of the best known names in web and enterprise computing, of
which three (still) have their core activities in other areas (online retail,
Internet search, software): 39

   Amazon Web Services (AWS) provide companies of all sizes with an
   infrastructure platform in the cloud, which includes computational power,
   storage, and other infrastructure services. 40 The AWS product range
   includes EC2 (Elastic Compute Cloud), a web service that provides computing
   capacity in the cloud, and S3 (Simple Storage Service), a scalable storage
   service for the Internet that can be used to store and retrieve any amount
   of data, at any time, from anywhere on the web.

   Google App Engine is a platform for building and hosting web applications
   on infrastructure operated by Google. The service is currently in
   “preview”, allowing developers to sign up for free and to use up to 500MB
   of persistent storage and enough CPU and bandwidth for about 5 million page
   views a month. 41

   Salesforce.com is a vendor of Customer Relationship Management (CRM)
   solutions, which it delivers using the software as a service model. CRM
   solutions include applications for sales, service and support, and
   marketing. Force.com is a Platform-as-a-Service product of the same vendor
   that allows external developers to create add-on applications that
   integrate into the CRM applications and to host them on the vendor’s
   infrastructure. 42

   The Azure Services Platform (Azure) is a cloud services platform hosted in
   Microsoft data centers, which provides an operating system and a set of
   developer services that can be used individually or together. After
   completing its “Community Technology Preview” launched in October 2008, the
   services will be priced and licensed through a consumption-based model. 43

While there are different pricing models, so-called consumption-based models,
sometimes referred to as “Pay As You Go” (PAYG), are quite popular. They
measure the resources used to determine charges, e.g.,
   Computing time, measured in machine hours
   Transmissions to and from the data center, measured in GB
   Storage capacity, measured in GB
   Transactions, measured as application requests
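The consumption-based charging described on this page can be illustrated with a short calculation. The following is only a minimal sketch: the class names, the function `monthly_charge` and all unit prices are hypothetical and are not taken from any provider's price list. It simply multiplies each of the four metered quantities (machine hours, GB transferred, GB stored, application requests) by an assumed unit price and adds the results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical unit prices; not taken from any real provider."""
    per_machine_hour: float      # computing time
    per_gb_transferred: float    # transmissions to and from the data center
    per_gb_stored: float         # storage capacity
    per_1000_requests: float     # transactions (application requests)


@dataclass(frozen=True)
class Usage:
    """Resources metered during one billing period."""
    machine_hours: float
    gb_transferred: float
    gb_stored: float
    requests: int


def monthly_charge(usage: Usage, rates: Rates) -> float:
    """Add up the four metered components; zero usage gives a zero charge."""
    return (
        usage.machine_hours * rates.per_machine_hour
        + usage.gb_transferred * rates.per_gb_transferred
        + usage.gb_stored * rates.per_gb_stored
        + usage.requests / 1000 * rates.per_1000_requests
    )


# Assumed example values, chosen only to make the arithmetic easy to follow.
EXAMPLE_RATES = Rates(per_machine_hour=0.10, per_gb_transferred=0.15,
                      per_gb_stored=0.12, per_1000_requests=0.01)
EXAMPLE_USAGE = Usage(machine_hours=720, gb_transferred=50,
                      gb_stored=100, requests=2_000_000)

if __name__ == "__main__":
    # 72.00 (compute) + 7.50 (transfer) + 12.00 (storage) + 20.00 (requests)
    print(f"{monthly_charge(EXAMPLE_USAGE, EXAMPLE_RATES):.2f}")
```

Because no fixed fee appears in the sum, a customer with no metered usage in a period is charged nothing, which is the property that distinguishes this model from a subscription.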

In these types of arrangements, customers are not tied to monthly subscription
rates or other advance payments; they pay only for what they use.

Cloud computing and information policy
While the main focus of this report is on the impact of distributed computing
on future standards work, it should be noted that the continued and successful
deployment of computing as a utility presents other challenges, including
issues of privacy, security, liability, access, and regulation. Distributed
computing paradigms operate across borders, and raise jurisdiction and law
enforcement issues similar to those of the Internet itself. These issues are
briefly described below.

Reliability and liability: As with any other telecommunications service, users
will expect the cloud to be a reliable resource, especially if a cloud provider
takes over the task of running “mission-critical” applications, and will expect
clear delineation of liability if serious problems occur. Although service
disruptions will become increasingly rare, they cannot be excluded. Data
integrity and the correctness of results are other facets of reliability.
Erroneous results, or data lost or altered due to service disruptions, can have
a negative impact on the business of the cloud user. The matters of
reliability, liability and QoS can be determined in service-level agreements.

Security, privacy, anonymity: It may be the case that the levels of privacy and
anonymity available to the user of a cloud will be lower than those available
to the user of desktop applications. 44 To protect the privacy of cloud users,
care must be taken to guard the users’ data and the applications for
manipulating that data. Organizations may be concerned about the security of
client data and proprietary algorithms; researchers may be concerned about
unintended release of discoveries; individuals may fear the misuse of sensitive
personal information. Since the physical infrastructure in a distributed
computing environment is shared among its users, any doubts about data security
have to be

Access and usage restrictions: In addition to privacy concerns, the possibility
of storing and sharing data in clouds raises concerns about copyright,
licenses, and intellectual property. Clouds can be accessed at any time, by any
user with an Internet connection, from any place. Licensing, usage agreements
and intellectual property rights may vary in different participating countries,
but the cloud hides these differences, which can cause problems.

Governments will need to carefully consider the appropriate policies and levels
of regulation or legislation to provide adequate safeguards for distributed
computing, e.g., by mandating greater precision in contracts and service
agreements between users and providers, with a possible view to establishing
some minimal levels of protection. These may include:
   Basic thresholds for reliability;
   Assignment of liability for loss or other violation of data;
   Expectations for data security;
   Privacy protection;
   Expectations for anonymity;
   Access and usage rights.

Gartner summarizes seven issues cloud customers should address before migrating
from in-house infrastructure to external resources: privileged user access,
regulatory compliance, data location, data segregation, data recovery,
investigative support, and long-term viability. 45

While different users (e.g., individuals, organizations, researchers) may have
different expectations for any of these points when they “outsource” their data
and processes to a cloud or grid, it is necessary that both providers and
policy makers address these issues in order to foster user trust and to handle
eventual events of damage or loss.

Future standardization work
Parallels can be drawn between the current state of distributed computing and
the early days of networking: independent islands of systems with little
interoperability, only a few standards, and proprietary management interfaces:

     “The problem is that there’s no standard to move things around. I

     think it’s the biggest hurdle that cloud computing has to face today. How
     do we create an open environment between clouds, so that I can have some
     things reside in my cloud and some things in other people’s data center?
     A lot of work needs to be done.” Padmasree Warrior, CTO, Cisco 46

The key to realizing the full benefits of cloud and grid computing may well lie
in standardization, particularly in the middleware layer and the area of
resource interconnection.

In addition to the questions about reliability, liability, trust, etc.,
discussed above, the users of distributed computing infrastructure are also
likely to be concerned about portability and interoperability.

Portability, the freedom to migrate data on and off the clouds of different
providers without significant effort and switching costs, should be a major
focus of attention in standardization. Furthermore, standardized solutions for
automation, monitoring, provisioning and configuration of cloud and grid
applications need to be found, in order to provide interoperability. Users may
want to employ infrastructure and services from different providers at the same
time.

Today’s services include both proprietary and open source solutions. Many of
them provide their own APIs (application programming interfaces) that improve
interoperability by allowing users to adapt their code and applications
according to the requirements of the service. However, the APIs are essentially
proprietary and have not been the subject of standardization, which means that
users cannot easily extract their data and code from one site to run on
another. Instead they need to repeat adaptation efforts for each cloud service
used. Global standards could allow services of different vendors to
interoperate. Standardized interfaces would allow users to use the same code on
different distributed computing solutions, which could additionally decrease
the risk of a total loss of data.

On the provider side, there could be an interest in standards for distributed
network management, memory management and load balancing, identity management
and security, and standards that allow for scalability and extensibility of
their infrastructure. Standards work in these areas will need to be aware of
those who contend that such efforts are premature and could impede
innovation. 47

Other SDOs
Among the bodies engaged in the standardization of distributed computing
concepts are:

Common Component Architecture Forum (CCA)          http://www.cca-forum.org
Distributed Management Task Force (DMTF)           http://www.dmtf.org
Globus Alliance                                    http://www.globus.org
Organization for the Advancement of Structured     http://www.oasis-open.org
Information Standards (OASIS)
Open Grid Forum (OGF)                              http://www.ogf.org
Optical Internetworking Forum (OIF)                http://www.oiforum.com
TeleManagement Forum (TMF)                         http://www.tmforum.org

The objective of the Common Component Architecture (CCA) Forum, formed by
members of the academic high-performance computing community, is to define
standard interfaces that a framework has to provide to components, and can
expect from them, in order to allow disparate components to be interconnected.
Such standards would promote interoperability between components developed by
different teams across different institutions.

The Distributed Management Task Force (DMTF) is a group of 160 member companies
and organizations that develops and maintains standards for systems management
of IT environments in enterprises and the Internet. These standards enable
management interoperability among multi-vendor systems, tools, and solutions
within the enterprise in a platform-independent and technology-neutral way.
DMTF standards include:

   Common Information Model (CIM). Defines how managed elements in an IT
   environment are represented as a common set of objects and relationships

   between them. This is intended to allow consistent management of these
   elements, and to interconnect them, independent of their manufacturer or
   provider.

   Web-Based Enterprise Management (WBEM) is a set of standardized system
   management technologies for the remote management of heterogeneous
   distributed hardware and software

   Open Virtualization Format (OVF) is an open standard used in the resource
   layer for packaging and distributing virtual appliances or more generally
   software to be run in virtual machines.

The OGF is an open community committed to driving the rapid evolution and
adoption of applied distributed computing. This is critical to developing new,
innovative and scalable applications and infrastructures that are seen as
essential to productivity in the enterprise and the scientific community.
Recommendations developed by the OGF cover middleware and resource
interconnection layers and include:

   Open Grid Services Architecture (OGSA), which describes a service-oriented
   grid computing environment for business and scientific use.

   Distributed Resource Management Application API (DRMAA), a high-level
   specification for the submission and control of jobs to one or more
   Distributed Resource Management Systems (DRMS) within a grid

   Configuration Description, Deployment, and Lifecycle Management (CDDLM)
   Specification, a standard for the management, deployment and configuration
   of grid service lifecycles or inter-organization resources.

The Globus Alliance is a community of organizations and individuals developing
fundamental technologies for the grid. The Globus Toolkit is an open source
grid middleware component that provides a standard platform for services to
build upon. The toolkit includes software for security, information
infrastructure, resource management, data management, communication, fault
detection, and portability.

The TeleManagement Forum (TMF) is an industry association focused on
transforming business processes, operations and systems for managing and
economizing online information, communications and entertainment services.

Existing Internet standards, such as HTTP, XML, SSL/TLS, developed at W3C,
IETF, etc. play an important role in the communication between client and

ITU-T
The ITU-T has approved a number of Recommendations that indirectly impact on
distributed computing. These concern technical aspects, for instance the work
on multimedia coding in Study Group 16, or on telecommunication security in
Study Group 17, as well as operational aspects, accounting principles and QoS,
treated in Study Groups 2, 3 and

ITU-T Study Groups 13 and 15 48 have liaisons with the Optical Internetworking
Forum (OIF), which provides interoperability agreements (IAs) that standardize
interfaces for the underlying communication infrastructure to enable the
resources to be dynamically interconnected.

ITU-T Recommendations of the E-Series (“Overall network operation, telephone
service, service operation and human factors”) address some of these points and
provide, inter alia, definitions related to QoS (E.800) and propose a framework
of a Service Level Agreement (E.860).

Recommendations in the ITU-T M.3000 series describe the Telecommunication
Management Network protocol model, which provides a framework for achieving
interconnectivity and communication across heterogeneous operation systems and
telecommunication networks. The TMF multi-technology network management
solution is referenced in ITU-T Recommendation M.3170.0 ff.

This Report describes different paradigms for distributed computing, namely
grid, utility and cloud computing. The spread of communication networks, and in
particular the growth of affordable broadband in developed countries, has
enabled

organizations to share their computational
resources. What originally started as grid
computing,     temporarily  using    remote
supercomputers or clusters of mainframes
to address scientific problems too large or
too complex to be solved on in-house
infrastructures, has evolved into service-
oriented business models that offer physical
and virtual resources on a pay as you go
basis – as an alternative to often idle, in-
house data centers and stringent license
The user can choose from a huge array of
different solutions according to its needs.
Each provider offers its own way of
accessing the data, often in the form of
APIs. That complicates the process of
moving from one provider to another, or to
internetwork different cloud platforms.
Increased focus on standards for interfaces,
and other areas suggested in the report,
would enable clouds and grids to be
commoditized       and     would     ensure
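
The portability concern raised in this report can be made concrete with a small sketch. It assumes a common storage interface that two providers both implement; the names `BlobStore`, `ProviderA`, `ProviderB` and `migrate` are hypothetical, and the two providers are in-memory stand-ins rather than real services. The point is only that, once an interface is shared, the code that moves data from one provider to another does not need to know either provider's internals.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class BlobStore(ABC):
    """Hypothetical common interface; no such standard is implied to exist."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...

    @abstractmethod
    def keys(self) -> List[str]: ...


class ProviderA(BlobStore):
    """In-memory stand-in that keeps objects in a dictionary."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def keys(self) -> List[str]:
        return sorted(self._objects)


class ProviderB(BlobStore):
    """In-memory stand-in with a different internal layout (list of records)."""

    def __init__(self) -> None:
        self._records: List[Tuple[str, bytes]] = []

    def put(self, key: str, data: bytes) -> None:
        # Replace any existing record with the same key, then append.
        self._records = [r for r in self._records if r[0] != key]
        self._records.append((key, data))

    def get(self, key: str) -> bytes:
        for stored_key, data in self._records:
            if stored_key == key:
                return data
        raise KeyError(key)

    def keys(self) -> List[str]:
        return sorted(k for k, _ in self._records)


def migrate(source: BlobStore, target: BlobStore) -> int:
    """Copy every object from source to target using only the shared interface."""
    copied = 0
    for key in source.keys():
        target.put(key, source.get(key))
        copied += 1
    return copied
```

Without a shared interface, `migrate` would have to be rewritten for every pair of providers, which is the repeated adaptation effort that the report identifies as a barrier to moving between clouds.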


   Notes, sources and further reading

     Foster, I. and Kesselman, C. “The grid: blueprint for a new computing infrastructure,” Morgan Kaufmann Publishers Inc., San
        Francisco, CA, 1998
     Tanenbaum, A. S. and van Steen, M. “Distributed Systems: Principles and Paradigms”
     Anderson, R. “Security Engineering: A Guide to Building Dependable Distributed Systems,” chapter 6
     Buyya et al. “Market-Oriented Cloud Computing: Vision, Hype, and Reality for Delivering IT Services as Computing Utilities,”
      Tutschku, K. et al. “Trends in network and service operation for the emerging future Internet,” Int J Electron Commun (AEU)
      Delic, K. A. and Walker, M. A. “Emergence of the academic computing clouds,” Ubiquity 9, 31 (Aug. 2008), 1-1.
      ECW is an enhanced compressed wavelet file format designed for geospatial imagery.
      Jaeger et al. “Cloud Computing and Information Policy: Computing in a Policy Cloud?”
      infoDev Quick guide: Low-cost computing devices and initiatives for the developing world,
      See UNCTAD Information Economy Report 2007-2008,
        http://www.unctad.org/Templates/webflyer.asp?docid=9479&intItemID=1397&lang=1&mode=highlights, and ITU World
        Telecommunication/ICT Indicators Database 2008 (12th Edition), http://www.itu.int/ITU-D/ict/publications/world/world.html
      Delaney “Google plans services to store users' data,” Wall Street Journal
      Greenberg (Forbes.com) “Bridging the clouds,” http://www.forbes.com/technology/2008/06/29/cloud-computing-3tera-tech-cio-
