					  Cloud Computing Systems

Amazon Web Services and EC2

                     Lin Gu

  Hong Kong University of Science and Technology
                   Sept. 26, 2011
                  Cloud Systems
• Infrastructure as a Service (IaaS): basic compute and
  storage resources
   – E.g., Amazon AWS/EC2, VMware vCloud
• Platform as a Service (PaaS): cloud application
   – E.g., Google App Engine, Windows Azure
• Software as a Service (SaaS): cloud applications
   – E.g., Google Docs, Microsoft Office Web Companions, Office
Commerce Department Statistics
• % of capital equipment budget spent on IT in 2000?
• % of utilized server capacity on average?
Economist Survey on IT, 2008
         Elasticity – Provisioning for Peak
Real World Server Utilization Is 5% to 20%
• Provision for peak?
• Painful to under-provision
• Do we know the “peak”?

[Figure: Provisioning for peak. Without elasticity, resources are wasted (shaded areas) during non-peak times.]
       Elasticity – Pay as You Go

1. You pay ONLY for what you use
2. ONLY when you use it
3. With the ability to SCALE up and down
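The difference between provisioning for peak and paying as you go can be made concrete with a little arithmetic. The sketch below is only an illustration: the hourly demand figures and the per-server capacity are made-up assumptions, and the price is the $.10 per server hour figure quoted later in these slides.

```python
from math import ceil

# Hypothetical hourly demand (requests/second) over one day; not from the slides.
HOURLY_DEMAND = [40, 30, 25, 20, 20, 30, 60, 120, 200, 260, 300, 320,
                 310, 290, 280, 300, 340, 400, 380, 300, 220, 150, 90, 60]
REQS_PER_SERVER = 50    # assumed capacity of one server (requests/second)
PRICE_PER_HOUR = 0.10   # dollars per server-hour, as quoted in these slides


def servers_needed(demand):
    """Smallest number of servers that can carry the given demand."""
    return max(1, ceil(demand / REQS_PER_SERVER))


def fixed_fleet_hours(demand):
    """Server-hours when the fleet is sized for the busiest hour all day."""
    return servers_needed(max(demand)) * len(demand)


def elastic_fleet_hours(demand):
    """Server-hours when the fleet is resized every hour to match demand."""
    return sum(servers_needed(d) for d in demand)


def fixed_fleet_utilization(demand):
    """Fraction of the peak-sized fleet's capacity that is actually used."""
    capacity = servers_needed(max(demand)) * REQS_PER_SERVER * len(demand)
    return sum(demand) / capacity


fixed = fixed_fleet_hours(HOURLY_DEMAND)
elastic = elastic_fleet_hours(HOURLY_DEMAND)
print(f"fixed fleet:   {fixed} server-hours, ${fixed * PRICE_PER_HOUR:.2f}")
print(f"elastic fleet: {elastic} server-hours, ${elastic * PRICE_PER_HOUR:.2f}")
print(f"utilization of fixed fleet: {fixed_fleet_utilization(HOURLY_DEMAND):.0%}")
```

With a spikier demand curve than this one, the utilization of the fixed fleet drops further, toward the 5% to 20% range cited above.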
         Incremental Scalability
• Traditional in-house IT services are difficult to scale
   – Large Up-Front Investment
   – Invest Ahead of Demand
   – Load is Unpredictable
• The scaling process should be incremental
   – But sometimes you cannot predict the growth
Seasonal Spikes
Diurnal, seasonal, and occasional fluctuations
“Every year, we take the busiest minute of the busiest
  hour of the busiest day and build capacity on that, we
  built our systems to (handle that load) and we went
  above and beyond that.” *
“Yet something went terribly wrong. As procrastinating
  taxpayers came home from work on the East Coast on
  Tuesday and began to file their returns, the company's
  servers began to overload…”

  -- Scott Gulbransen
  Intuit Spokesman
Solution: Integrate users, logic, and data at larger scale
Statistical Multiplexing, and more…
• Scale capacity on demand
• Turn fixed costs into variable costs
• Always available, high reliability
• Follow established APIs and conceptual models
• Reduced time to market
• Focus on product & core competencies
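Statistical multiplexing means that tenants whose peaks fall at different times can share a pool that is smaller than the sum of their individual peaks. The following sketch illustrates the idea with three hypothetical tenants; the tenant names and load numbers are invented for illustration only.

```python
# Hypothetical load (in servers) per time slot for three tenants
# whose peaks fall in different slots. Not data from the slides.
TENANT_LOADS = {
    "retail_site": [2, 2, 3, 8, 3, 2],
    "batch_jobs":  [9, 6, 2, 1, 1, 5],
    "tax_filing":  [1, 1, 2, 3, 9, 4],
}


def dedicated_capacity(loads):
    """Servers needed if every tenant provisions for its own peak."""
    return sum(max(series) for series in loads.values())


def shared_capacity(loads):
    """Servers needed if all tenants share one pool sized for the combined peak."""
    combined = [sum(slot) for slot in zip(*loads.values())]
    return max(combined)


print("dedicated:", dedicated_capacity(TENANT_LOADS))
print("shared:   ", shared_capacity(TENANT_LOADS))
```

The shared pool can never need more than the dedicated total, and it needs strictly less whenever the peaks do not coincide.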
                  Amazon Web Services

A set of APIs and programming models which give developer-level
access to Amazon’s infrastructure and business data
• Infrastructure As A Service
   – Amazon Elastic Compute Cloud
• Platform As A Service
   – Amazon Simple Queue Service
   – Amazon Simple Storage Service
• Data As A Service
   – Amazon E-Commerce Service
   – Amazon Historical Pricing
• People As A Service
   – Amazon Mechanical Turk
• Search As A Service
   – Alexa Web Information Service

• Commercially usable and
• Monthly billing
• Self-serve model:
   – Sign up as developer
   – Choose services
   – Agree to service licenses
   – Enter payment info
   – Start coding
    Amazon Elastic Compute Cloud

• Virtual machine with various OS and pre-installed software packages
• Elastic Capacity
• 1.7 GHz x86, 1.7 GB RAM, 160 GB Disk, 250 MB/Second Network
• Network Security Model
• Pricing:
   – $.10 per server hour
   – $.10 - $.18 per GB data transfer
                AMI and instances

• Amazon Machine Image (AMI):
   – Bootable, pre-defined or user-built
   – OS: Fedora, CentOS, Gentoo, Debian,
     Ubuntu, Windows Server
   – Software packages: LAMP, mpiBLAST, Hadoop

• Instance:
   – Running copy of an AMI
   – Launch in less than 2 minutes
   – Start/stop programmatically
       Other Available Configurations
• Large Instance: $0.40 per instance-hour
   – 7.5 GB of memory
   – 4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each)
   – 850 GB of instance storage
   – 64-bit platform
• Extra Large Instance: $0.80 per instance-hour
   – 15 GB of memory
   – 8 EC2 Compute Units (4 virtual cores with 2 EC2 Compute Units each)
   – 1690 GB of instance storage
   – 64-bit platform
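The hourly prices above make it easy to estimate what a run costs. The sketch below uses only the 2011-era prices and compute-unit counts quoted in these slides (not current pricing); the dictionary keys and function names are illustrative, and prices are kept in whole cents to avoid floating-point rounding.

```python
# Prices in cents per instance-hour, as quoted in these slides.
# "small" is a label chosen here for the $.10 per server hour configuration.
PRICE_CENTS_PER_HOUR = {"small": 10, "large": 40, "extra_large": 80}

# EC2 Compute Units per instance, as quoted in these slides.
COMPUTE_UNITS = {"large": 4, "extra_large": 8}


def run_cost_cents(instance_type, instances, hours):
    """Cost in cents of running `instances` machines for `hours` hours each."""
    return PRICE_CENTS_PER_HOUR[instance_type] * instances * hours


def cents_per_compute_unit_hour(instance_type):
    """Hourly price divided by the number of EC2 Compute Units."""
    return PRICE_CENTS_PER_HOUR[instance_type] / COMPUTE_UNITS[instance_type]


# Example: ten Large instances for one day.
cost = run_cost_cents("large", instances=10, hours=24)
print(f"10 large instances for 24 h: ${cost / 100:.2f}")
for name in COMPUTE_UNITS:
    print(name, cents_per_compute_unit_hour(name), "cents per compute-unit-hour")
```

At these prices the Large and Extra Large instances cost the same per compute unit, so the choice between them depends on memory and storage needs rather than price.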
                  Amazon EC2 At Work
• Startups
   – Cruxy – Media transcoding
   – GigaVox Media – Podcast Management

• Larger businesses:
   – High-Impact, Short-Term Projects
   – Development Host

• Science / Research:
   – Hadoop / MapReduce
   – mpiBLAST

• Load-Management and Load Balancing Tools:
   – Pound
   – WeoGeo
   – RightScale
            EC2 SOAP/Query API
• Images:
   – RegisterImage
   – DescribeImages
   – DeregisterImage
• Image Attributes:
   – ModifyImageAttribute
   – DescribeImageAttribute
   – ResetImageAttribute
• Instances:
   – RunInstances
   – DescribeInstances
   – TerminateInstances
   – GetConsoleOutput
   – RebootInstances
• Security Groups:
   – CreateSecurityGroup
   – DescribeSecurityGroups
   – DeleteSecurityGroup
   – AuthorizeSecurityGroupIngress
   – RevokeSecurityGroupIngress
• Keypairs:
   – CreateKeyPair
   – DescribeKeyPairs
   – DeleteKeyPair
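In the Query API, each action listed above is invoked as an HTTP request whose query string carries an Action parameter plus the action's arguments. The sketch below only builds such a URL and sends nothing. It is deliberately incomplete: a real request must also carry authentication parameters (access key, timestamp, signature), which are omitted here. The helper name, the AMI ID, and the API version string are placeholders chosen for illustration.

```python
from urllib.parse import urlencode

EC2_ENDPOINT = "https://ec2.amazonaws.com/"


def build_query_url(action, api_version, **params):
    """Build an unsigned EC2 Query API URL for the given action.

    Authentication parameters are intentionally left out, so the
    resulting URL would be rejected by the real service.
    """
    query = {"Action": action, "Version": api_version}
    query.update(params)
    # Sort the parameters so the output is stable and easy to compare.
    return EC2_ENDPOINT + "?" + urlencode(sorted(query.items()))


# Placeholder values: "ami-12345678" is not a real image, and the
# version string should be replaced by the API version you target.
url = build_query_url(
    "RunInstances",
    "2011-07-15",
    ImageId="ami-12345678",
    MinCount=1,
    MaxCount=2,
    InstanceType="m1.small",
)
print(url)
```

In practice a client library takes care of signing and sending the request; the point here is only that each action name maps to one request.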
          Azure Node Structure
• A node is a management unit of the Fabric Controller (FC)
  – Contains an FC Agent (FA) in the Hyper-V root partition
  – Each role instance runs in a Guest OS with a Guest Agent (GA)
  – The FA delegates to GAs to handle VM status
              Pros and cons
• Cheap (to begin)
• Scalable: as (reasonably) many servers as you need
• Upgrade to more virtual processors
• Fault tolerant: Failover machines
• No hardware required, no up-front investment
However, …
• Random IP Addresses
• Costs accrue
• Non-persistent storage
    Amazon Simple Storage Service

• Object-Based Storage
• 1 B – 5 GB / object
• Fast, Reliable, Scalable
• Redundant, 99.99% Availability Goal
• Private or Public
• Per-object URLs & ACLs
• BitTorrent Support
• Pricing:
   – $.15 per GB per month storage
   – $.01 for 1000 to 10000 requests
   – $.10 - $.18 per GB data transfer
    Amazon Simple Storage Service (S3)
• Objects:
   – Opaque data to be stored (1 byte … 5 Gigabytes)
   – Authentication and access controls

• Buckets:
   – Object container – any number of objects
   – 100 buckets per account / buckets are “owned”

• Keys:
   – Unique object identifier within bucket
   – Up to 1024 bytes long
   – Flat object storage model

• Standards-Based Interfaces:
   – REST and SOAP
   – URL-Addressability – every object has a URL
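The limits and the URL-addressability described above can be sketched in a few lines. This is an illustration, not the S3 client API: the function names are made up, the bucket and key are placeholders, 5 GB is interpreted as 5 × 1024³ bytes (an assumption), and the URL uses the common bucket-as-hostname form.

```python
from urllib.parse import quote

MAX_KEY_BYTES = 1024               # "Up to 1024 bytes long"
MIN_OBJECT_BYTES = 1               # "1 byte ... 5 Gigabytes"
MAX_OBJECT_BYTES = 5 * 1024 ** 3   # assumption: 5 GB read as 5 GiB


def check_put(key, size_bytes):
    """Raise ValueError if the key or object size violates the slide's limits."""
    key_len = len(key.encode("utf-8"))
    if not 1 <= key_len <= MAX_KEY_BYTES:
        raise ValueError(f"key is {key_len} bytes, limit is {MAX_KEY_BYTES}")
    if not MIN_OBJECT_BYTES <= size_bytes <= MAX_OBJECT_BYTES:
        raise ValueError(f"object size {size_bytes} is outside the allowed range")


def object_url(bucket, key):
    """URL of an object; '/' in the key is kept, other unsafe characters are escaped."""
    return f"http://{bucket}.s3.amazonaws.com/{quote(key)}"


# Flat storage model: the key is one opaque string. The "/" characters
# look like folders but are simply part of the name.
store = {}
key = "photos/2011/cat 1.jpg"
check_put(key, size_bytes=3)
store[("my-bucket", key)] = b"abc"
print(object_url("my-bucket", key))
```

Because the namespace is flat, listing "folders" is really a prefix match on keys rather than a directory traversal.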
               S3 SOAP/Query API
• Service:
   – ListAllMyBuckets
• Buckets:
   – CreateBucket
   – DeleteBucket
   – ListBucket
   – GetBucketAccessControlPolicy
   – SetBucketAccessControlPolicy
   – GetBucketLoggingStatus
   – SetBucketLoggingStatus
• Objects:
   – PutObject
   – PutObjectInline
   – GetObject
   – GetObjectExtended
   – DeleteObject
   – GetObjectAccessControlPolicy
   – SetObjectAccessControlPolicy
               Windows Azure Storage
The Windows Azure storage services provide storage for
binary and text data, messages, and structured data
   – Blob service: storing binary and text data
   – Queue service: storing messages that may be accessed by a
   – Table service: structured storage for non-relational data
   – Windows Azure drives: mounting an NTFS volume accessible
     to code running in your Windows Azure service
• “Programmatic access to the Blob, Queue, and Table
  services is available via the Windows Azure Managed
  Library and the Windows Azure storage services REST
  API”
    Amazon Simple Queue Service

• Scalable Queuing
• Elastic Capacity
• Reliable, Simple, Secure
• Pricing:
   – $.10 per 1000 messages
   – $.10 - $.18 per GB data transfer
• Queues: persistent, named message container
   – Messages: Up to 256KB of data per message
   – Messages are stored redundantly across multiple servers
     and datacenters
• A reliable, highly scalable hosted distributed queue
  for storing messages
   – Scalable:
      • Unlimited number of queues per account
      • Unlimited number of messages per queue
   – Runs within Amazon's high-availability datacenters
• Amazon's messaging infrastructure as a web service
• Platform-agnostic, allowing any computer on the
  Internet to add or read messages through the
  defined API
           SQS SOAP/Query API
• Queues:
   – ListQueues
   – DeleteQueue
   – SetVisibilityTimeout
   – GetVisibilityTimeout
• Messages:
   – SendMessage
   – ReceiveMessage
   – DeleteMessage
   – PeekMessage
• Security:
   – AddGrant
   – ListGrants
   – RemoveGrant
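The visibility timeout is what makes the send / receive / delete cycle reliable: a received message is hidden from other readers for a while, and it reappears if the reader never deletes it. The toy class below imitates that behaviour in memory. It is not the SQS client; the class and method names are invented for illustration, and time is passed in explicitly so the example is deterministic.

```python
import itertools

MAX_MESSAGE_BYTES = 256 * 1024  # "Up to 256KB of data per message"


class ToyQueue:
    """In-memory stand-in for a queue with a visibility timeout."""

    def __init__(self, visibility_timeout=30.0):
        self.visibility_timeout = visibility_timeout
        self._messages = {}            # id -> [body, time it becomes visible]
        self._ids = itertools.count(1)

    def send_message(self, body, now):
        if len(body.encode("utf-8")) > MAX_MESSAGE_BYTES:
            raise ValueError("message too large")
        msg_id = next(self._ids)
        self._messages[msg_id] = [body, now]   # visible immediately
        return msg_id

    def receive_message(self, now):
        """Return (id, body) of the first visible message and hide it, else None."""
        for msg_id, entry in self._messages.items():
            body, visible_at = entry
            if visible_at <= now:
                entry[1] = now + self.visibility_timeout
                return msg_id, body
        return None

    def delete_message(self, msg_id):
        """Remove a message for good once it has been processed."""
        self._messages.pop(msg_id, None)


queue = ToyQueue(visibility_timeout=30)
job = queue.send_message("transcode video 42", now=0)
print(queue.receive_message(now=1))    # worker A gets the message
print(queue.receive_message(now=10))   # hidden from worker B for now
print(queue.receive_message(now=31))   # A never deleted it, so it reappears
queue.delete_message(job)
```

A worker that crashes mid-task therefore does not lose the message; another worker picks it up after the timeout expires.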
          Infrastructure as a Service
[Diagram: Elastic Compute connected to Simple Storage Service (Store) and Simple Queue Service (Message)]
 Azure Apps – Overview
[Diagram: The Internet reaches the Windows Azure Data Center via TCP or HTTP. Behind load balancers (LB) sit a Web Site (WCF, IIS as Host) and a Worker Service (Managed Interface Call).]
