Web Scale Computing

Introduction to Amazon Web Services and Cloud Computing
Amazon’s Three Businesses

• Consumer (Retail) Business
   – Over one hundred million active customer accounts
   – Eight countries: US, UK, Germany, Japan, France, Canada, China, Italy
• Seller Business
   – Sell on Amazon websites
   – Use Amazon technology for your own retail website
   – Leverage Amazon’s massive fulfillment center network
• IT Infrastructure Business
   – Cloud computing infrastructure for hosting web-scale
   – Hundreds of thousands of registered customers
Amazon Technical Heritage

• Technology investment in the billions of dollars

• Amazon is itself a $30B mission-critical real-time
  online transaction processing enterprise

• Distributed computing infrastructure honed for 15+
The Cloud is Suddenly Everywhere
 What is Cloud Computing?

An analogy: think of electricity

Power is a utility service - available to you on-demand and
you pay only for what you use.

You simply plug into a vast electrical grid managed by experts to get a low
cost, reliable power supply – available to you with much greater efficiency
than you could generate on your own.
 What is Cloud Computing?

Cloud Computing is also a utility service - giving you
access to technology resources managed by experts
and available on-demand.

You simply access these services over the internet, with no up-front costs and
you pay only for the resources you use.
Attributes of Cloud Computing

• No capital expenditure
• Pay as you go and pay only for what you use
• True elastic capacity; Scale up and down
• Improves time to results
• Managed – You can focus on what differentiates
  your work / research instead of managing the
  undifferentiated heavy lifting of infrastructure
Cloud Computing: A Natural Evolution

Industry Trends         Cloud Computing

Software as a Service   Host any solution in a scalable, reliable environment
Grid Computing          Take advantage of thousands of networked servers for virtually unlimited compute power
Virtualization          Employ virtual machines for complete flexibility in your development
Service Oriented        Use web services to programmatically control infrastructure from within apps
 What is Amazon Web Services?

Amazon Web Services is a cloud computing platform that provides
flexible, scalable, and cost-effective IT infrastructure for
businesses of all sizes around the world…

…running on the same reliable, secure technology platform used to
power Amazon’s global web properties.
AWS Computing Platform
The Cloud Scales: Amazon S3 Growth

[Chart: Total Number of Objects Stored in Amazon S3, with data points of 2.9 Billion, 14 Billion, 40 Billion, 102 Billion, and 196 Billion; callout: Peak Requests per second]
 The Cloud Scales: AWS Global Reach

AWS Regions
 US East (Northern Virginia)
 US West (Northern California)
 Europe (Dublin)
 Asia Pacific (Singapore)

AWS CloudFront Locations
Ashburn, VA / Dallas, TX / Los Angeles, CA / Miami, FL /
Jacksonville, FL / Newark, NJ / New York, NY / Palo Alto, CA /
Seattle, WA / St. Louis, MO / Amsterdam / Dublin / Frankfurt /
London / Hong Kong / Tokyo / Singapore
AWS Pace of Innovation

2005
» Amazon SQS
» Amazon Mechanical Turk

2006
» Amazon EC2
» Amazon S3
» Developer Portal & Forums

2007
» Amazon SimpleDB
» Amazon Flexible Payments
» S3 in Europe
» EC2 new instance types
» AWS Start-Up Challenge

2008
» Premium Support
» Amazon CloudFront
» EC2 Elastic IP addresses & Availability Zones
» Windows Server, MySQL, Oracle, & JBoss on EC2
» Lower Data Transfer Costs
» Public Data Sets
» Elastic Block Store
» EC2 SLA
» EC2 in EU
» S3 Tiered Pricing

2009
» EC2 Reserved Instances
» New SimpleDB Features
» IBM on EC2
» Windows Server 2008 on EC2
» Amazon RDS
» Amazon Virtual Private Cloud
» Amazon Elastic MapReduce
» EBS Shared Snapshots
» Monitoring, Auto Scaling & Elastic Load Balancing for EC2
» AWS Import/Export
» AWS Services in N. California
» AWS Multi-Factor Authentication
» AWS Management Console
» AWS Economics Center
» AWS in Education
» AWS Security Center
» SAS70 Type II Audit
» More services in EU
» Lower EC2 Pricing
» Lower S3 Pricing
» Lower pricing for Outbound Data Transfer
» AWS Solution Provider Program

2010
» Amazon Simple Notification Service
» RDS Multi-Availability Zone Support
» S3 Reduced Redundancy Storage
» New Locations and Features for CloudFront
» S3 Bucket Policies
» Cluster Instances for EC2
» Amazon Linux AMI
» Oracle on EC2
» New EC2 Features
» SUSE Linux on EC2
» Micro Instances
» Lower Pricing for EC2 High Mem Instances
» Identity & Access Management
» AWS Services in Singapore
» RDS Reserved Database Instances
» RDS Read Replicas & Lower Pricing
» Lower Outbound Transfer Pricing
» Data Transfer Usage Tiers
» Consolidated Billing for AWS
» Amazon S3 Versioning Feature
» EC2 High Memory Instances
 What You Want

Your Idea → Product / Research / Work Result

Your Idea → Undifferentiated “Heavy Lifting” → Successful Product / Research / Work Result
 Heavy Lifting = Price of Admission…

• Contract negotiation
• Server hosting
• Bandwidth management
• Purchase decisions
• Moving facilities
• Scaling and managing physical growth
• Heterogeneous hardware
• Legacy software
• Coordinating large teams
 It Gets Worse…

Your Idea → Undifferentiated “Heavy Lifting” → Initial Success

[Diagram label: Improvement Loop]
Predicting Infrastructure Needs

[Chart: Compute Power, comparing Predicted Usage with Actual Usage]


Example: Wall Street App on Amazon EC2

[Chart: Number of EC2 Instances per day, Wednesday 4/22/2009 through Tuesday 4/28/2009, on an axis from 300 to 3000]
• 3000 CPUs for one firm’s risk management processes
• 300 CPUs on
     Example: Video App on Amazon EC2

[Chart: Number of EC2 Instances per day, 4/12/2008 through 4/20/2008]
• Scaled to peak of 5,000 instances in 3 days
• Launch of Facebook
The Dirty Little Secret

On-Premise Infrastructure
• 30%: Your Business
• 70%: Managing All of the “Undifferentiated Heavy Lifting”
AWS Goal: Flip This Equation

On-Premise Infrastructure
• 30%: Your Work
• 70%: Managing All of the “Heavy Lifting”

AWS Cloud-Based Infrastructure
• 70%: More Time to Focus on Your Work
• 30%: Configuring Your Cloud Assets
AWS Principles

                 Easy to Use
The Bottom Line Benefit

The AWS Cloud turns capital expenses into variable costs while
preserving flexibility and enhancing the scalability, availability, and
security of IT infrastructure.
Common Use Cases

• High performance computing, batch data
  processing, and large scale analytics
• Web site hosting
• Application hosting/SaaS hosting
• Internal IT application hosting
• Content delivery and media distribution
• Storage, backup, and disaster recovery
• Development and test environments
The Cloud for HPC

• Lots of data
   – TB, PB-sized data sets
   – Continuous stream
• Lots of compute
   – [Cycle-hungry code] * [lots of data]
   – Need for parallelism
• Lots of people
   – Cooperation
   – Sharing
   – Multiple locations
AWS - HPC Use Cases

• CFD – Computational Fluid Dynamics
   – OpenFOAM on EC2
   – CloudFlu
• Molecular Modeling
   – Eli Lilly, Pfizer
• Sequence Analysis
   – CloudBioLinux
• Engineering Design
• Energy Trading & Financial Modeling
• I/O-intensive Applications
• Graphics / 3D Rendering
Implications of Scale

• Eliminate the cost and complexity of procuring,
  configuring and operating expensive in-house compute

• Increase the speed of innovation and output by accessing
  compute resources in minutes instead of months

• Scale compute resources up to the size and time
  appropriate for each workload, then shut them down
  when no longer needed

• New challenges: data management, data processing, data
AWS Scalability

• Architectural design
   – Networking substrate designed for redundancy and ability
     to add capacity at each link
   – Multiple redundant facilities within each geographic region
   – Multiple redundant transit points and transit providers for
     each facility
   – Traditional facility-level redundancy (UPS, generator, etc.)
   – Loosely coupled software architecture highly tolerant of
     infrastructure failure
AWS Scalability (continued)

• Capacity investments
  – Substantial hardware inventory
  – Designed and built to withstand massive loads
      • Example: S3 exceeds 200K rps
      • Another Example: EC2 on-demand spinup of thousands
        of compute instances for customers
  – AWS handles more load per day than all of Amazon’s global
    retail sites
  – Ensures that no single application can dominate the entire
AWS Performance

• Very high, dedicated bandwidth between Amazon
  EC2 fleet and all other services
• Network interconnects within a rack and between
  racks are designed not to be bottlenecks
• Intelligent use and deployment of Agg Routers, Core
  Routers, and Load Balancers
• Constant focus on throughput and latency
• Latency comparable with latency found in customer-
  owned and operated data centers
Amazon Elastic Compute Cloud

• Amazon EC2: on-demand compute power
   – Obtain and boot new server instances in minutes
   – Quickly scale capacity up or down
• Key features:
   – Support for Windows, Linux, and OpenSolaris
   – Supports all major web and application
   – Deploy across Availability Zones for reliability
   – Elastic IPs provide greater flexibility
   – Persistent storage with Elastic Block Store
   – Elastic Load Balancing and Auto-Scaling built-in
   – Amazon CloudWatch monitors status and usage
• Service Level Agreement: 99.95%
Amazon EC2 Regions and Availability Zones

• US East (Northern Virginia): Availability Zones A, B, C, D
• EU (Dublin): Availability Zones A, B

Amazon EC2 Regions:
US East (Northern Virginia) / US West (Northern California) / EU (Dublin) /
Asia Pacific (Singapore)
Amazon EC2 Pricing Options

• On-Demand Instances
   – Pay as you go for compute power
   – Pay only for what you use, no up-front commitments or long-term contracts
   – Unix/Linux instances start at $0.085/hour USD in the US East Region
• Reserved Instances
   – Pay a low up-front fee and receive a significant discount on the hourly pricing for that instance
   – 1- or 3-year terms
   – Helps ensure that compute capacity is available when it is needed
• Spot Instances
   – Enables you to bid on unused Amazon EC2 capacity
   – Spot Price is based on supply/demand and is determined
   – If the Spot Price is below your bid, your instances will start
   – If the Spot Price rises above your bid, your instances will stop
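
The Spot Instance start/stop rule can be sketched in a few lines of Python. This is only an illustration of the rule as stated on the slide, not the EC2 API: the function names, the hourly price series, and the bid are all hypothetical, and treating a Spot Price exactly equal to the bid as "running" is an assumption the slide does not cover.

```python
from decimal import Decimal

def spot_instance_running(spot_price: Decimal, bid: Decimal) -> bool:
    """Instances run while the Spot Price does not exceed the bid.

    Assumption: a price exactly equal to the bid keeps instances running.
    """
    return spot_price <= bid

def simulate(bid: Decimal, prices: list[Decimal]) -> list[bool]:
    """Return the running/stopped state for each observed Spot Price."""
    return [spot_instance_running(price, bid) for price in prices]

# Hypothetical Spot Prices (USD per hour), invented for this sketch.
HYPOTHETICAL_PRICES = [
    Decimal("0.030"),
    Decimal("0.035"),
    Decimal("0.050"),  # rises above the bid: instances stop
    Decimal("0.032"),  # falls below the bid again: instances start
]

states = simulate(Decimal("0.040"), HYPOTHETICAL_PRICES)
hours_running = sum(states)
```

With the sample series above, the instances run in three of the four periods and are stopped only while the price sits above the bid.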
Amazon EC2 Instance Types

• Standard Instances
  – Well suited for most applications
• High Memory Instances
  – Offer large memory sizes for high throughput applications,
    including database and memory caching applications
• High CPU Instances
  – Have proportionally more CPU resources than memory
    (RAM) and are well suited for compute-intensive
• Cluster Compute Instances
  – Low latency, 10 Gbps networking between instances
Amazon EC2 Pricing (US East Region)

• Billed for actual usage on a monthly basis
• Standard Instances
   – Linux/UNIX starting at $0.085 per hour
   – Windows starting at $0.12 per hour
• High CPU Instances
   – Linux/UNIX starting at $0.17 per hour
   – Windows starting at $0.29 per hour
• High Memory Instances
   – Linux/UNIX starting at $1.20 USD per hour
   – Windows starting at $1.44 USD per hour
• Cluster Compute Instances
   – Linux/UNIX at $1.60 per hour
   – Linux/UNIX with 2x NVidia “Fermi” GPU at $2.10 per hour
• + Data Transfer Costs
• Reserved Instances
   – Make a low, one-time payment for each instance
   – Receive lower pricing for that instance
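
As a rough worked example of pay-as-you-go billing, the starting hourly rates listed on this slide can be turned into a small cost calculator. This is a sketch under stated assumptions: the dictionary keys and the function name are hypothetical, only the "starting at" rates from the slide are used, and data transfer and Reserved Instance discounts are left out.

```python
from decimal import Decimal

# Starting hourly rates in USD for the US East Region, copied from the slide.
# The (family, platform) keys are hypothetical labels chosen for this sketch.
STARTING_HOURLY_RATE = {
    ("standard", "linux"): Decimal("0.085"),
    ("standard", "windows"): Decimal("0.12"),
    ("high_cpu", "linux"): Decimal("0.17"),
    ("high_cpu", "windows"): Decimal("0.29"),
    ("high_memory", "linux"): Decimal("1.20"),
    ("high_memory", "windows"): Decimal("1.44"),
    ("cluster_compute", "linux"): Decimal("1.60"),
    ("cluster_gpu", "linux"): Decimal("2.10"),
}

def on_demand_cost(family: str, platform: str, instances: int, hours: int) -> Decimal:
    """Compute-only charge: rate x instances x hours (data transfer is extra)."""
    return STARTING_HOURLY_RATE[(family, platform)] * instances * hours

# Ten standard Linux instances running for one day.
ten_standard_for_a_day = on_demand_cost("standard", "linux", instances=10, hours=24)
```

Ten standard Linux instances for 24 hours come to 0.085 × 10 × 24 = 20.40 USD of compute charges at the starting rate.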
Amazon Public Datasets

• Free, centralized data repository enables low-cost
  collaboration for AWS cloud-based applications
• Pre-built data repositories for immediate use:
   – Ensembl Annotated Human Genome
   – 3-D PubChem Library
   – UGI Virtual Conformer Library
   – 1980, 1990, and 2000 U.S. Census Bureau data
   – U.S. Department of Labor statistical data
   – Many more coming all the time…
• Share your own datasets with the AWS community
Amazon Elastic MapReduce

• Hadoop implementation built on Amazon EC2
• Crunch any amount of data held in Amazon S3
• Use cases: web indexing, data mining, log file analysis,
  machine learning, financial analysis, scientific
  simulation, and bioinformatics research
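
To make the MapReduce model behind the log file analysis use case concrete, here is a minimal single-process sketch in plain Python. It does not use Hadoop or the Elastic MapReduce API; the function names, the log line format, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import chain

def map_phase(line: str) -> list[tuple[str, int]]:
    """Emit (status_code, 1) for a log line whose last field is the status."""
    fields = line.split()
    return [(fields[-1], 1)] if fields else []

def shuffle(pairs) -> dict[str, list[int]]:
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Sum the counts collected for one key."""
    return key, sum(values)

def run_job(lines: list[str]) -> dict[str, int]:
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())

# Hypothetical log lines, invented for this sketch.
SAMPLE_LOG = [
    "GET /index.html 200",
    "GET /missing 404",
    "GET /about.html 200",
]

counts = run_job(SAMPLE_LOG)
```

In a real Hadoop job the map and reduce functions run in parallel across many machines; the structure of the computation is the same.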
Amazon Simple Storage Service

• Scalable data storage in-the-cloud
• Over 196 billion objects, 198,000 requests/second
• Highly available and durable
• Pay-as-you-go pricing:
   – Storage: tiered $0.18/GB to $0.15/GB
   – Data Transfer Out: tiered $0.17/GB to $0.10/GB
   – Data Transfer In: $0.10/GB
   – Requests: nominal charges
• Service Level Agreement: 99.9%
AWS Import/Export

• Get your data into AWS faster - load it onto a
  portable storage device and ship it to an Amazon
  data center

• Faster than Internet transfer and more cost effective
  than upgrading your connectivity

• Use cases: data migration, offsite backup, direct data
  interchange, disaster recovery
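
A quick back-of-the-envelope calculation shows why shipping a device can beat an Internet transfer. The sketch below is illustrative only: the function name, the 80% link utilization, the 1 TB data size, and the 10 Mbps link are assumptions, and a terabyte is taken as 10^12 bytes.

```python
SECONDS_PER_DAY = 86_400

def transfer_days(data_terabytes: float, link_mbps: float, utilization: float = 0.8) -> float:
    """Days needed to push the data over a link at the given utilization.

    Assumes 1 TB = 10**12 bytes and 1 Mbps = 10**6 bits per second.
    """
    total_bits = data_terabytes * 1e12 * 8
    effective_bits_per_second = link_mbps * 1e6 * utilization
    return total_bits / effective_bits_per_second / SECONDS_PER_DAY

# Hypothetical case: 1 TB over a 10 Mbps connection.
days_for_one_tb = transfer_days(1, 10)
```

Under these assumptions, 1 TB takes about 11.6 days to upload over a 10 Mbps link, which is the kind of gap that makes shipping a storage device attractive.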
Amazon RDS – Relational Database Service

• Easy to provision a new relational database with only
  a simple API call
• Offload common administrative tasks to AWS
   – Leverage the proven AWS infrastructure
   – Take advantage of automated backups
• Use your existing code and tools
• Scale up easily with only a simple API call
• Integrates well with AWS
   – Low latency from Amazon EC2
• Pay only for what you use, no up-front commitments
Amazon SimpleDB

• Simple, scalable storage solution for structured data
      – Provides core database functionality for data storage and
      – No schema, no data modeling, no DBA

item              description       color             material
123               Sweater           Blue, Red
789               Shoes             Black             Leather
PUT (item, 123), (description, Sweater), (color, Blue), (color, Red)

Domain = MyStore
[‘description’ = ‘Sweater’]
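
The schema-less, multi-valued attribute model in the example above can be mimicked with an in-memory stand-in. This is not the SimpleDB API: the `Domain` class and its `put`, `get`, and `query` methods are hypothetical names used only to show how an item can hold several values for one attribute and be found by attribute match.

```python
from collections import defaultdict

class Domain:
    """In-memory stand-in for a domain: items map attribute names to value sets."""

    def __init__(self, name: str):
        self.name = name
        self._items = defaultdict(lambda: defaultdict(set))

    def put(self, item_name: str, *attributes: tuple[str, str]) -> None:
        # No schema: any attribute may be added, and may hold several values.
        for attribute, value in attributes:
            self._items[item_name][attribute].add(value)

    def get(self, item_name: str) -> dict[str, list[str]]:
        return {attr: sorted(values) for attr, values in self._items[item_name].items()}

    def query(self, attribute: str, value: str) -> list[str]:
        """Return names of items that have the given attribute value."""
        return sorted(
            name for name, attrs in self._items.items()
            if value in attrs.get(attribute, ())
        )

store = Domain("MyStore")
store.put("123", ("description", "Sweater"), ("color", "Blue"), ("color", "Red"))
store.put("789", ("description", "Shoes"), ("color", "Black"), ("material", "Leather"))

matches = store.query("description", "Sweater")
```

Item 123 ends up with two values for `color` and no `material` attribute at all, which mirrors the table on the slide.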
Amazon Simple Queue Service

• Reliable, highly scalable, hosted queue for messaging
• Build automated workflows for all applications
• Coordinate multiple Amazon EC2 instances


Producer → Queue → Consumer
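
The producer/consumer pattern that a hosted queue supports can be sketched locally with Python's standard `queue` and `threading` modules. This is a stand-in, not the SQS API: the job names and the `None` sentinel are assumptions, and the local queue is strictly first-in, first-out, which should not be read as a statement about the ordering SQS provides.

```python
import queue
import threading

def producer(work_queue: queue.Queue, jobs: list[str]) -> None:
    """Enqueue each job, then a sentinel telling the consumer to stop."""
    for job in jobs:
        work_queue.put(job)
    work_queue.put(None)

def consumer(work_queue: queue.Queue, results: list[str]) -> None:
    """Take jobs off the queue until the sentinel arrives."""
    while True:
        job = work_queue.get()
        if job is None:
            break
        results.append(f"done: {job}")

work_queue: queue.Queue = queue.Queue()
results: list[str] = []
jobs = ["resize image-1", "resize image-2", "resize image-3"]  # hypothetical jobs

producer_thread = threading.Thread(target=producer, args=(work_queue, jobs))
consumer_thread = threading.Thread(target=consumer, args=(work_queue, results))
producer_thread.start()
consumer_thread.start()
producer_thread.join()
consumer_thread.join()
```

The producer and consumer never call each other directly; the queue is the only thing they share, which is what lets multiple EC2 instances be coordinated the same way.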

AWS Multi-Factor Authentication

A recommended opt-in security feature of your
Amazon Web Services (AWS) account
AWS MFA Benefits

• Helps prevent anyone with unauthorized knowledge
  of your e-mail address and password from
  impersonating you

• Requires a device in your physical possession to gain
  access to secure pages on the AWS Portal or to gain
  access to the AWS Management Console

• Adds an extra layer of protection to sensitive
  information, such as your AWS access identifiers
Management and Operations

• AWS affords numerous management options
• Utilize existing IT management systems
   – Amazon VPC enables existing management and operations
     systems, security policies, etc. to extend to cloud resources
   – AWS partners with numerous management platform
• Utilize cloud-oriented third-party providers
   – RightScale, Elastra and many others
• Leverage AWS APIs to build custom solutions
   – API-based control enables existing workflow applications to
     manage AWS resources
AWS Features for Management and Operations

• AWS provides management and operations
  primitives to enable plugging AWS in anywhere
   – CloudWatch provides real-time API-based monitoring data
   – AutoScaling provides API-based touch-less scaling out and
     in for EC2 resources
   – Elastic Load Balancing provides APIs for touch-less
     balancing of load across a dynamic fleet of EC2 instances
     across AZs
• AWS Management Console offers GUI for
  management and visibility
• All AWS resources can be allocated, managed and de-
  allocated via a flexible API set
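
The idea of touch-less scaling out and in driven by monitoring data can be illustrated with a simple threshold rule. This sketch is not the Auto Scaling or CloudWatch API: the function name, the CPU thresholds, and the fleet size limits are hypothetical values chosen for the example.

```python
def desired_capacity(
    current: int,
    avg_cpu_percent: float,
    *,
    scale_out_above: float = 70.0,  # hypothetical threshold
    scale_in_below: float = 30.0,   # hypothetical threshold
    minimum: int = 1,
    maximum: int = 20,
) -> int:
    """Return the instance count a threshold-based policy would ask for."""
    if avg_cpu_percent > scale_out_above:
        target = current + 1   # scale out under heavy load
    elif avg_cpu_percent < scale_in_below:
        target = current - 1   # scale in when the fleet is idle
    else:
        target = current       # load is within the comfortable band
    # Never leave the configured fleet size limits.
    return max(minimum, min(maximum, target))

busy_fleet = desired_capacity(4, avg_cpu_percent=85.0)
idle_fleet = desired_capacity(4, avg_cpu_percent=10.0)
```

A monitoring loop would call such a function with each new utilization reading and then use the API to launch or terminate instances to match the result.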
AWS Management Console

Mechanical Turk

A flexible, scalable workforce with a programmatic interface

• Scalable
   – Use only the Workers you need, when you need them
   – Maintain the flexibility your business demands
   – Hire workers for 15 minutes – or 15 years
• Lower Risk and Costs
   – Significantly reduce headcount expenses
   – Turn staffing from fixed cost to variable cost
   – Pay for only what you use
• Faster Turnaround Time
   – Get more done, faster with more concurrent Workers
   – Programmatic Interface – part of your systems flow
   – Get work done around the clock, around the world
Mechanical Turk Use Cases

•   Data Cleansing / Normalization
•   Tagging, Categorizing, De-duping
•   Transcription, Image Recognition
•   Information Collection & Research
•   Data Entry
•   Site Moderation and Quality Testing

What Amazon Web Services is to infrastructure,
Mechanical Turk is to staffing:
AWS in Education

Enable the worldwide academic community to easily leverage the benefits of
Amazon Web Services for teaching and research.

• Teaching Grants for educators using AWS in courses (plus access to
  selected course content resources).

• Research Grants for academic researchers using AWS in their work.

• Project Grants for student organizations pursuing entrepreneurial
  endeavors; Tutorials for students that want to use AWS for self-directed

• Solutions for university administrators looking to use cloud computing to
  be more efficient and cost-effective in the university’s IT Infrastructure.

AWS in Education Success Stories
Q & A / Discussion
