Cloud Computing - Dashboard - University of Illinois - Engineering by malj


									Cloud Computing

          Reza Farivar
Slides adapted from cloud computing course CS 598
         By Prof. Roy Campbell, Reza Farivar
          Objectives and Syllabus
• Introduction to some of the major developments in cloud computing
• Teach how to re-think batch processing computational
  problems to fit the MapReduce programming paradigm,
  and other streaming computations in terms of Storm
• Through hands-on experience in labs, reinforce and
  deepen your knowledge of Hadoop MapReduce and Storm

• What is meant by
   – Cloud Computing
   – Utility Computing
   – {Infrastructure, Platform, Software} as a Service
• Why do corporations need to pay attention
• General principles
• Research

                                     Tremendous Buzz

• “Not only is it faster and more flexible, it is cheaper. […] the
  emergence of cloud models radically alters the cost benefit
  decision” (FT, Mar 6, 2009)
• “Cloud computing achieves a quicker return on investment”
  (Lindsay Armstrong of , Dec 2008)
• “Economic downturn, the appeal of that cost advantage will be
  greatly magnified” (IDC, 2008)
• “Revolution, the biggest upheaval since the invention of the PC
  in the 1970s […] IT departments will have little left to do once
  the bulk of business computing shifts […] into the
  (Nicholas Carr, 2008)
• “No less influential than e-business” (Gartner, 2008)
• The economics are compelling, with business applications made
  three to five times cheaper and consumer applications five to
  10 times (Merrill Lynch, May, 2008)
• Domestic cloud computing estimated to grow at 53% (June, 2011)
Gartner’s 2011 Hype Cycle

         Cloud Computing
    A Computing paradigm where the
      boundaries of computing will be
determined by economic rationale rather
             than technical limits
       Professor Ramnath Chellappa
              Emory University
It is not just Grid, Utility, or Autonomic computing
                NIST Definition
July 5, 2011:

The NIST Definition of Cloud Computing identified
cloud computing as:

a model for enabling ubiquitous, convenient, on-
demand network access to a shared pool of
configurable computing resources (e.g., networks,
servers, storage, applications, and services) that can
be rapidly provisioned and released with minimal
management effort or service provider interaction.

       Cloud Characteristics
• On-demand self-service
• Ubiquitous network access
• Location-independent resource pooling
• Rapid elasticity
• Pay per use

                  Delivery Models
• Software as a Service (SaaS)
   – Use provider’s applications over a network
• Platform as a Service (PaaS)
   – Deploy customer-created applications to a cloud
   – App Engine
• Infrastructure as a Service (IaaS)
   – Rent processing, storage, network capacity, and other
     fundamental computing resources
   – EC2, S3

                 Software Stack
Client            Mobile (Android), Thin client (Zonbu),
                  Thick client (Google Chrome)
Services          Identity, Integration, Payments, Mapping,
                  Search, Video Games, Chat
Application       Peer-to-peer (Bittorrent), Web app
                  (twitter), SaaS (Google Apps, SAP)
Platform          Java Google Web Toolkit, Django, Ruby on
                  Rails, .NET
Storage           S3, Nirvanix, Rackspace Cloud Files, Savvis
Infrastructure    Full virtualization (GoGrid), Management
                  (RightScale), Compute (EC2), Platform
     NIST: Interactions between Actors in
               Cloud Computing
[Figure: actors shown include Cloud Auditor, Cloud Broker, Cloud Provider. Legend:]
• The communication path between a cloud provider and a cloud consumer
• The communication paths for a cloud auditor to collect auditing information
• The communication paths for a cloud broker to provide service to a cloud consumer
• By 2015, those companies who have adopted
  Big Data and extreme information
  management (their term for this area) will
  begin to outperform their unprepared
  competitors by 20% in every available
  financial metric.
  – Gartner

        Forbes Predictions 2011
• Cloud Adopters Embrace Cloud For Both
  Innovation and Legacy Optimization
• Replace most new procurement with cloud
• Start with private clouds as a stepping stone to
  public clouds.
• Get real about security. Move to private clouds as
  a back up to public clouds.
• The Bottom Line: Cloud Adoption Provides a Path
  to the Next Generation Enterprise

 Google Trends: cloud computing, virtualization

             Utility Computing

“Computing may someday be organized as a public
utility, just as the telephone system is organized as a
                       public utility”
                  John McCarthy, 1961

   Perils of Corporate Computing
• Own information systems 
• However
  – Capital investment 
  – Heavy fixed costs 
  – Redundant expenditures 
  – High energy cost, low CPU utilization 
  – Dealing with unreliable hardware 
  – High-levels of overcapacity (Technology and Labor) 

                 NOT SUSTAINABLE
                 Google: CPU Utilization

Activity profile of a sample of 5,000 Google Servers over a period of 6 months

Google: Energy Overhead

Google: Service Disruptions

           Utility Computing
• Let economy of scale prevail
• Outsource all the trouble to someone else
• The utility provider will share the overhead
  costs among many customers, amortizing the
• You only pay for:
   – the amortized overhead
   – Your real CPU / Storage / Bandwidth usage
      Why Utility Computing Now
•   Large data stores
•   Fiber networks
•   Commodity computing
•   Multicore machines
                          Utility Computing
• Huge data sets
• Utilization/Energy
• Shared people
       Data Intensive Computing
• Data collections too large to transmit economically over
  the Internet (petabyte-scale data collections)
• Computation produces small data output containing a
  high density of information
• Implemented in Clouds
• Easy to write programs, fast turn around.
• MapReduce.
      • Map(k1, v1) -> list (k2, v2)
      • Reduce(k2,list(v2)) -> list(v3)
• Hadoop, Pig, HDFS, HBase
• Sawzall, Google File System, BigTable
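
The Map and Reduce signatures above can be illustrated with a minimal, single-process word count sketch in Python. This is not Hadoop code: the names map_fn, reduce_fn, and run_mapreduce are hypothetical, and the shuffle step is only simulated with an in-memory dictionary.

```python
from collections import defaultdict

def map_fn(doc_id, text):
    """Map(k1, v1) -> list(k2, v2): emit (word, 1) for every word."""
    return [(word.lower(), 1) for word in text.split()]

def reduce_fn(word, counts):
    """Reduce(k2, list(v2)) -> list(v3): sum the counts for one word."""
    return [sum(counts)]

def run_mapreduce(records, mapper, reducer):
    """Simulate map, shuffle (group by key), and reduce in one process."""
    groups = defaultdict(list)
    for k1, v1 in records:
        for k2, v2 in mapper(k1, v1):
            groups[k2].append(v2)  # shuffle: collect values per key
    return {k2: reducer(k2, values) for k2, values in groups.items()}

# Hypothetical input: (document id, document text) pairs
docs = [("d1", "the cloud is a utility"), ("d2", "the utility is shared")]
word_counts = run_mapreduce(docs, map_fn, reduce_fn)
```

In a real cluster the map calls run in parallel near the data, and the grouping is done by the framework between the two phases.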

    Cloud Interoperability Standards
•   Open Cloud Computing Interface – Infrastructure
•   EC2 API
•   Simple Storage Service (S3) API
•   Windows Azure Storage Service REST APIs
•   Windows Azure Service Management REST APIs
•   Deltacloud API
•   Rackspace Cloud Servers API
•   Rackspace Cloud Files API
•   Cloud Data Management Interface
•   vCloud API
•   GlobusOnline REST API
CLOUD  from an economic viewpoint:
1. Common Infrastructure
       – pooled standardized resources, statistical multiplexing
2. Location-independence
       – ubiquitous availability meeting performance requirements
       – latency reduction and user experience enhancement
3. Online connectivity
       – an enabler of other attributes ensuring service access
       – (not discussed here)
4. Utility Pricing
       – usage-sensitive or pay-per-use pricing
       – benefits environments with variable demand levels
5. on-Demand Resources
       – scalable, elastic resources provisioned and de-provisioned without delay or costs
         associated with change
• Sometimes in contrast with each other (as we will see)
Cloudonomics: A Rigorous Approach to Cloud Benefit Quantification, Joe Weinman,
1. The Value of Common Infrastructure
• Economies of scale
   – Reduced overhead costs
   – Buyer power through volume purchasing
• Statistics of Scale
   – For infrastructure built to peak requirements: multiplexing
     demand → higher utilization
      • Lower cost per delivered resource than unconsolidated workloads
   – For infrastructure built to less than peak: multiplexing
     demand → less unserved demand
      • Lower penalty from lost revenue or service-level agreement violations

A useful Measure of “Smoothness”
• The coefficient of variation CV
   – Not the same as the variance σ² or the correlation coefficient
• Ratio of the standard deviation σ to the absolute value of
  the mean |μ|: CV = σ / |μ|
• “smoother” curves:
   – large mean for a given standard deviation
   – or smaller standard deviation for a given mean
• Importance of smoothness:
   – a facility with fixed assets servicing highly variable demand will
     achieve lower utilization than a similar one servicing relatively
     smooth demand.
• Multiplexing demand from multiple sources may reduce
  the coefficient of variation CV
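
A minimal Python sketch of the smoothness measure, assuming made-up demand curves. It computes CV for two hypothetical workloads and for their multiplexed sum, and includes the standard result that aggregating n independent workloads with the same mean and standard deviation scales CV by 1/sqrt(n). The function names are illustrative only.

```python
import math
import statistics

def coefficient_of_variation(samples):
    """CV = standard deviation / |mean| (population standard deviation)."""
    return statistics.pstdev(samples) / abs(statistics.fmean(samples))

def cv_of_independent_aggregate(cv_single, n):
    """CV of the sum of n independent workloads with equal mean and sigma.

    The mean grows by n, sigma only by sqrt(n), so CV shrinks by sqrt(n).
    """
    return cv_single / math.sqrt(n)

# Hypothetical demand curves that happen to peak at different times
workload_a = [10, 30, 10, 30]
workload_b = [30, 10, 30, 10]
multiplexed = [a + b for a, b in zip(workload_a, workload_b)]

cv_a = coefficient_of_variation(workload_a)
cv_sum = coefficient_of_variation(multiplexed)
cv_four = cv_of_independent_aggregate(cv_a, 4)
```

The two curves here are perfectly complementary, which is the best case: the sum is flat. Perfectly correlated workloads would leave CV unchanged, so multiplexing them gains nothing.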

Coefficient of variation CV

But what about dependent workloads?

 Common Infrastructure in Real World
• Correlated demands:
   – private, mid-size and large-size providers can experience similar
     statistics of scale
• Independent demands:
   – Midsize providers can achieve similar statistical economies to an
     infinitely large provider
• Available data on economy of scale for large providers is
   – use the same COTS computers and components
   – Locating near cheap power supplies → everyone can do that
   – Early entrant automation tools → 3rd parties take care of it
• Take away lesson: you don’t need to be as large as to compete! 
   – At least according to “Value of Common Infrastructure”

2. Value of Location Independence
• We used to go to the computers, but applications, services,
  and content now come to us!
   – Through networks: wired, wireless, satellite, etc.
• But what about latency?
   – Human response latency: 10s to 100s of milliseconds
   – Latency is correlated with:
       • Distance (strongly)
       • Routing algorithms of routers and switches (second-order effects)
   – Speed of light in fiber: only 124 miles per millisecond
   – Unacceptable: a Google word suggestion that takes 2 seconds
   – Unacceptable: VoIP with latency of 200 ms or more
• Supporting a global user base requires a dispersed service
   – Coordination, consistency, availability, partition-tolerance
   – Investment implications
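
A rough Python sketch of what the fiber figure above implies for data center placement. It assumes a straight fiber run and ignores routing, queuing, and server processing time, so real latencies will be higher. The function names and sample distances are illustrative.

```python
FIBER_MILES_PER_MS = 124.0  # speed of light in fiber, from the slide

def min_round_trip_ms(distance_miles):
    """Lower bound on round-trip time over a straight fiber run."""
    return 2 * distance_miles / FIBER_MILES_PER_MS

def max_distance_miles(rtt_budget_ms):
    """Farthest a data center can be while meeting a round-trip budget."""
    return rtt_budget_ms * FIBER_MILES_PER_MS / 2

# Hypothetical numbers: a user 6,200 miles away, and a 50 ms budget
rtt_far_user = min_round_trip_ms(6200)
reach_for_50ms = max_distance_miles(50)
```

A user 6,200 miles from the nearest facility spends at least 100 ms on propagation alone, which is why a global user base pushes providers toward dispersed centers.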
Covering a large area with centers

        4. Value of Utility Pricing
• As mentioned before, economy of scale might not
  be very effective
• But cloud services don’t need to be cheaper to be
• Consider a car
  – Buy or lease for $10 per day
  – Rent a car for $45 a day
  – If you need a car for 2 days in a trip, buying would be
    much more costly than renting
  – It depends on the demand
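
The car example can be worked through in a short Python sketch. It assumes the owned car costs $10 for every day of the period, whether used or not, while the rental costs $45 only on days of use. The 30-day period is a hypothetical choice for illustration.

```python
OWN_COST_PER_DAY = 10.0   # from the slide: buy or lease
RENT_COST_PER_DAY = 45.0  # from the slide: rental

def owning_cost(period_days):
    """Owned capacity is paid for every day, used or idle (assumption)."""
    return OWN_COST_PER_DAY * period_days

def renting_cost(days_used):
    """Rented capacity is paid for only on the days it is used."""
    return RENT_COST_PER_DAY * days_used

def break_even_utilization():
    """Fraction of days in use above which owning becomes cheaper."""
    return OWN_COST_PER_DAY / RENT_COST_PER_DAY

# Hypothetical 30-day period with a 2-day trip
own_total = owning_cost(30)
rent_total = renting_cost(2)
threshold = break_even_utilization()
```

Under these assumptions, renting wins whenever the car is needed less than about 22% of the time, even though the daily rental rate is 4.5 times higher.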

Utility Pricing in Detail

          Utility Pricing in Real World
1. In practice demands are often highly spiky
   – News stories, marketing promotions, product launches,
     Internet flash floods (Slashdot effect), tax season,
     Christmas shopping, processing drone footage for a
     one-week border skirmish, etc.
2. Often a hybrid model is the best
   – You own a car for the daily commute, and rent a car when
     traveling or when you need a van to move
   – Key factor is again the ratio of peak to average demand
   – But we should also consider other costs
      • Network cost (both fixed costs and usage costs)
      • Interoperability overhead
      • Reliability and accessibility
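
A minimal Python sketch of the hybrid idea: own a baseline and rent only the overflow. It reuses the $10 and $45 per-unit daily rates from the car example as an assumption. The demand series is made up, and the network, interoperability, and reliability costs listed above are deliberately left out.

```python
OWN_RATE = 10.0   # cost per owned unit per day, paid even when idle (assumed)
RENT_RATE = 45.0  # cost per rented unit per day, paid only when used (assumed)

def hybrid_cost(demand, owned):
    """Total cost of owning `owned` units and renting any overflow."""
    own_part = OWN_RATE * owned * len(demand)
    rent_part = RENT_RATE * sum(max(0, d - owned) for d in demand)
    return own_part + rent_part

def best_owned_capacity(demand):
    """Brute-force the owned capacity level with the lowest total cost."""
    return min(range(max(demand) + 1), key=lambda c: hybrid_cost(demand, c))

# Hypothetical spiky week: flat demand of 1 unit with a one-day spike to 5
demand = [1, 1, 1, 1, 1, 1, 5]
best = best_owned_capacity(demand)
pure_rent = hybrid_cost(demand, 0)
pure_own = hybrid_cost(demand, max(demand))
hybrid = hybrid_cost(demand, best)
```

With a peak-to-average ratio of about 3.2, owning one unit and renting the spike costs 250, against 350 for owning to peak and 495 for renting everything.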

5. Value of on-Demand Services

E.g. Penalty Costs for exponential

          Behavioral Cloudonomics
• humans do not always make purely rational and
  quantitative decisions
• Cons:
   – “loss aversion:” people generally get less satisfaction from
     gaining a dollar than they feel pain from losing one
   – customers may recognize the financial advantage of pay-
     per-use, but avoid it due to a “flat-rate” bias
      • E.g. fear of an unexpected large monthly cell phone bill → a
        flat-rate plan is preferred over measured service
• Pros:
   – special attraction of “free”
   – The lack of upfront investment in using the cloud becomes
     extremely attractive

           Computational Complexity
• Satisfying demands with constraints (e.g. distance) is
  computationally intractable
   – based on a transformation of BOOLEAN 3-SATISFIABILITY
• Even if there is exactly the right aggregate capacity in a
  distributed cloud system, it may be impossible to find the
  right assignment of capacity to demand
   – E.g. map scheduling in Hadoop vs File chunk locations in HDFS
• Common Infrastructure and Location Independence (latency
  optimization) are usually a tradeoff
   – we can choose to optimize the statistics of scale by building
     fewer consolidated facilities, and we can choose to optimize user
     experience by building more, dispersed facilities
   – determining an optimal tradeoff is intractable
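
A toy Python sketch of the second bullet above: aggregate capacity can match aggregate demand and still leave no feasible assignment once each demand can reach only some sites. The site names, users, and reachability sets are hypothetical. The search is brute force, so its running time grows exponentially with the number of demands. This is not the 3-SAT reduction itself, only an illustration of the constraint.

```python
from itertools import product

def feasible_assignment(demands, capacity, reachable):
    """Try every demand -> site mapping; return one that fits, else None."""
    sites = list(capacity)
    for choice in product(sites, repeat=len(demands)):
        load = {s: 0 for s in sites}
        ok = True
        for demand, site in zip(demands, choice):
            if site not in reachable[demand]:  # e.g. a distance constraint
                ok = False
                break
            load[site] += 1
        if ok and all(load[s] <= capacity[s] for s in sites):
            return dict(zip(demands, choice))
    return None

# Two sites with one unit each: total capacity 2 equals total demand 2
capacity = {"A": 1, "B": 1}
reachable_tight = {"u1": {"A"}, "u2": {"A"}}       # both users only near A
reachable_loose = {"u1": {"A"}, "u2": {"A", "B"}}  # u2 can also use B

no_fit = feasible_assignment(["u1", "u2"], capacity, reachable_tight)
fit = feasible_assignment(["u1", "u2"], capacity, reachable_loose)
```

The tight case mirrors map scheduling in Hadoop when the needed file chunks all sit on nodes that are already busy.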
• Weinman mentions a “classification” of clouds
  based on economic models
  – Compute Clouds
  – Hotel Clouds
  – Rental Car Clouds
  – Restaurant Clouds
  – Etc.
• What do you think about each category? Can
  you come up with others?

