Introduction to Windows Azure
Cloud Computing Futures Group, Microsoft Research
Roger Barga, Jared Jackson, Nelson Araujo,
Dennis Gannon, Wei Lu, and Jaliya Ekanayake
Data centers range in size from “edge”
facilities to megascale.
Economies of scale
      Approximate costs for a small data center (1,000 servers) and a
      larger, 100K-server data center.

Technology       Cost in small data center     Cost in large data center     Ratio
Network          $95 per Mbps/month            $13 per Mbps/month            7.1
Storage          $2.20 per GB/month            $0.40 per GB/month            5.1
Administration   ~140 servers/administrator    >1000 servers/administrator   7.1

Each data center is 11.5 times the size of a football field.
A bunch of machines in data centers
Fabric Controller (FC)
   Owns all data center hardware
   Uses inventory to host services
   Deploys applications to free resources
   Maintains the health of those applications
   Maintains the health of hardware
     If a node goes offline, the FC tries to recover it
     If a failed node can’t be recovered, the FC migrates its
     role instances to a new node:
       A suitable replacement location is found
       Existing role instances are notified of the change
   Manages the service life cycle starting from bare metal
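The recover-or-migrate behavior described above can be sketched as a toy model. This is only an illustration under stated assumptions: the class names, the "least-loaded healthy node" placement rule, and the notification format are all hypothetical and are not taken from the real Fabric Controller.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    healthy: bool = True
    recoverable: bool = True                    # hypothetical flag: can the node be recovered in place?
    roles: list = field(default_factory=list)   # names of role instances hosted on this node


class FabricControllerSketch:
    """Toy model of the recover-or-migrate steps; not the real FC."""

    def __init__(self, nodes):
        self.nodes = {n.name: n for n in nodes}
        self.notifications = []                 # (existing role instance, message) pairs

    def handle_offline(self, name):
        node = self.nodes[name]
        node.healthy = False
        # Step 1: try to recover the node in place.
        if node.recoverable:
            node.healthy = True
            return "recovered"
        # Step 2: find a suitable replacement location.
        # Assumed rule: pick the healthy node hosting the fewest role instances.
        candidates = [n for n in self.nodes.values() if n.healthy]
        if not candidates:
            return "no capacity"
        target = min(candidates, key=lambda n: len(n.roles))
        # Step 3: notify existing role instances of the change.
        moved = list(node.roles)
        for role in moved:
            for other in self.nodes.values():
                if other.healthy:
                    for existing in other.roles:
                        self.notifications.append(
                            (existing, f"{role} is moving to {target.name}"))
        # Step 4: migrate the role instances to the new node.
        target.roles.extend(moved)
        node.roles.clear()
        return f"migrated to {target.name}"
```

Calling `handle_offline` on a node marked as unrecoverable moves its role instances to another healthy node and records one notification per surviving role instance.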
Up to 7 Guest VMs
A Host Virtual Machine
An Optimized Hypervisor

At Minimum
  CPU: 1.5-1.7 GHz x64
  Memory: 1.7 GB
  Network: 100+ Mbps
  Local Storage: 500 GB
Up to (Extra Large)
  CPU: 8 Cores
  Memory: 14.2 GB
  Local Storage: 2+ TB
Azure Platform
  Diagram: Web Role, Worker Role, Storage, Tables
A closer look
  Diagram: Application, Compute, Storage, Fabric
  Storage: Blobs, Drives, Tables, Queues
  Data is exposed via .NET and RESTful interfaces
  Data can be accessed by:
    Windows Azure apps
    Other on-premise applications or cloud applications
Account           Container                     Blob
Number of Blob Containers
 Can have as many Blob Containers as will fit within the
 storage account limit
Blob Container
 A container holds a set of blobs
 Set access policies at the container level
   Private or publicly accessible
 Associate Metadata with Container
   Metadata are <name, value> pairs
   Up to 8KB per container
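As a rough illustration of the container rules above, the sketch below models a container with a container-level access policy and <name, value> metadata capped at 8KB. The class, the helper names, and the way the size is counted (UTF-8 bytes of all names and values) are assumptions made for this example; they are not the storage service's API or its exact accounting.

```python
MAX_METADATA_BYTES = 8 * 1024  # "Up to 8KB per container"


def metadata_size(metadata):
    """Assumed accounting: UTF-8 bytes of every name plus every value."""
    return sum(len(k.encode("utf-8")) + len(v.encode("utf-8"))
               for k, v in metadata.items())


class ContainerSketch:
    """Hypothetical in-memory stand-in for a blob container."""

    def __init__(self, name, public=False):
        self.name = name
        self.public = public   # access policy is set at the container level
        self.metadata = {}     # <name, value> pairs
        self.blobs = {}        # blob name -> bytes

    def set_metadata(self, metadata):
        if metadata_size(metadata) > MAX_METADATA_BYTES:
            raise ValueError("container metadata exceeds 8KB")
        self.metadata = dict(metadata)

    def can_read(self, authenticated):
        # Public containers are readable by anyone; private ones
        # require an authenticated caller in this sketch.
        return self.public or authenticated
```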
Block Blob
 Targeted at streaming workloads
 Each blob consists of a sequence of blocks
   Each block is identified by a Block ID
 Size limit 200GB per blob

Page Blob
 Targeted at random read/write workloads
 Each blob consists of an array of pages
   Each page is identified by its offset from the start of the blob
 Size limit 1TB per blob
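The difference between the two blob types can be sketched in a few lines: a block blob is assembled from blocks named by Block ID, while a page blob is a fixed-size array addressed by byte offset. The classes below are hypothetical in-memory models, and the 512-byte page size and the upload-then-commit flow are assumptions of this sketch rather than statements from the slides.

```python
PAGE_SIZE = 512  # assumed page size for this sketch


class BlockBlobSketch:
    """A blob as an ordered sequence of blocks, each identified by a Block ID."""

    def __init__(self):
        self.uploaded = {}     # Block ID -> bytes, uploaded but not yet committed
        self.committed = {}    # Block ID -> bytes that make up the blob
        self.block_list = []   # committed order of Block IDs

    def put_block(self, block_id, data):
        self.uploaded[block_id] = data

    def commit(self, block_ids):
        # The blob's content becomes the listed blocks, in the listed order.
        blocks = {}
        for block_id in block_ids:
            if block_id in self.uploaded:
                blocks[block_id] = self.uploaded[block_id]
            else:
                blocks[block_id] = self.committed[block_id]
        self.committed = blocks
        self.block_list = list(block_ids)
        self.uploaded = {}

    def read(self):
        return b"".join(self.committed[b] for b in self.block_list)


class PageBlobSketch:
    """A blob as an array of pages, each identified by its offset from the start."""

    def __init__(self, size):
        if size % PAGE_SIZE:
            raise ValueError("size must be a multiple of the page size")
        self.data = bytearray(size)

    def write_pages(self, offset, data):
        if offset % PAGE_SIZE or len(data) % PAGE_SIZE:
            raise ValueError("writes must be page-aligned")
        if offset + len(data) > len(self.data):
            raise ValueError("write past end of blob")
        self.data[offset:offset + len(data)] = data

    def read(self, offset, length):
        return bytes(self.data[offset:offset + length])
```

Blocks can be uploaded in any order and sequenced at commit time, which suits streaming uploads; pages can be overwritten in place at any aligned offset, which suits random read/write workloads.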
Storage hierarchy (diagram): Account > Container > Blob > Block or Page
  Container shown: images
  Blob shown: MOV1.AVI, made up of Block or Page 1, Block or Page 2,
  Block or Page 3
Diagram: Producers (P1, P2), queued messages 4 3 2 1, Consumers (C1)
Scalable message paths
Provides loose
Any number of
One week of
Maximum size 8KB
Visibility timeout
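The visibility timeout is easiest to see in a small model: a message that has been handed to a consumer is hidden for a while, and reappears if the consumer does not delete it in time. The class below is a hypothetical in-memory sketch, not the queue service's API; the 30-second default and the explicit `now` argument (used to keep the example deterministic) are assumptions.

```python
import itertools


class QueueSketch:
    """Toy queue with a visibility timeout and a per-message size limit."""

    def __init__(self, visibility_timeout=30.0, max_message_bytes=8 * 1024):
        self.visibility_timeout = visibility_timeout
        self.max_message_bytes = max_message_bytes
        self._messages = []              # entries are [id, body, visible_at]
        self._ids = itertools.count(1)

    def put(self, body):
        if len(body) > self.max_message_bytes:
            raise ValueError("message exceeds the maximum size")
        self._messages.append([next(self._ids), body, 0.0])

    def get(self, now):
        """Return (id, body) of the first visible message and hide it, else None."""
        for msg in self._messages:
            if msg[2] <= now:
                msg[2] = now + self.visibility_timeout
                return msg[0], msg[1]
        return None

    def delete(self, message_id):
        """Called by a consumer once it has finished processing the message."""
        self._messages = [m for m in self._messages if m[0] != message_id]
```

If a consumer crashes after `get` and never calls `delete`, the message becomes visible again once the timeout passes and another consumer can pick it up.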
Provides Structured Storage
   Massively Scalable Tables
       Billions of entities (rows) and TBs of data
       Can use thousands of servers as traffic grows
       Data is replicated several times
   A storage account can create many tables
   Table names are scoped by account
   A table is a set of entities (i.e. rows)
   An entity is a set of properties (columns)
   Required properties
       PartitionKey, RowKey and Timestamp
   Diagram: entities grouped into Partition 1 and Partition 2
Source : Windows Azure Table – Programming Table Storage
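The entity model described in the table slides can be sketched as a dictionary of partitions, each mapping a RowKey to an entity. This is a hypothetical in-memory illustration only; the class and method names are invented for the example, and the real service's replication and scale-out are not modeled.

```python
from datetime import datetime, timezone


class TableSketch:
    """Toy table: entities are addressed by (PartitionKey, RowKey)."""

    def __init__(self, name):
        self.name = name
        self.partitions = {}   # PartitionKey -> {RowKey -> entity dict}

    def insert(self, partition_key, row_key, **properties):
        rows = self.partitions.setdefault(partition_key, {})
        if row_key in rows:
            raise KeyError("entity already exists")
        entity = dict(properties)
        # The three required properties are filled in on every entity.
        entity.update(PartitionKey=partition_key, RowKey=row_key,
                      Timestamp=datetime.now(timezone.utc))
        rows[row_key] = entity
        return entity

    def get(self, partition_key, row_key):
        return self.partitions[partition_key][row_key]

    def query_partition(self, partition_key):
        """Return the entities of one partition, ordered by RowKey."""
        rows = self.partitions.get(partition_key, {})
        return [rows[key] for key in sorted(rows)]
```

Queries that stay within one partition touch a single dictionary here, which mirrors why the choice of PartitionKey matters for how work is spread across servers.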
A Windows Azure Drive is a Page Blob formatted as a single-volume
NTFS Virtual Hard Drive (VHD)
  Drives can be up to 1TB

A VM can dynamically mount up to 8 drives
A Page Blob can only be mounted by one VM at a time for
read/write access

Remote Access via Page Blob
  Can upload the VHD to its Page Blob using the blob interface, and then
  mount it as a Drive
  Can download the Drive through the Page Blob interface
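The two mounting limits above (at most 8 drives per VM, and one VM at a time per Page Blob) can be expressed as a small bookkeeping sketch. The class and method names are hypothetical, and this only tracks who has mounted what; it does not model the Drive API or the VHD itself.

```python
MAX_DRIVES_PER_VM = 8  # "A VM can dynamically mount up to 8 drives"


class DriveMountsSketch:
    """Toy registry of which VM has mounted which page blob."""

    def __init__(self):
        self.mounted_by = {}   # page blob name -> VM name
        self.drives = {}       # VM name -> set of page blob names

    def mount(self, vm, page_blob):
        owner = self.mounted_by.get(page_blob)
        if owner is not None and owner != vm:
            raise RuntimeError(f"{page_blob} is already mounted by {owner}")
        mounted = self.drives.setdefault(vm, set())
        if page_blob not in mounted and len(mounted) >= MAX_DRIVES_PER_VM:
            raise RuntimeError(f"{vm} already has {MAX_DRIVES_PER_VM} drives mounted")
        mounted.add(page_blob)
        self.mounted_by[page_blob] = vm

    def unmount(self, vm, page_blob):
        if self.mounted_by.get(page_blob) == vm:
            del self.mounted_by[page_blob]
            self.drives[vm].discard(page_blob)
```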
A closer look

  Web Role: HTTP requests pass through the load balancer to IIS
  (ASP.NET, WCF, etc.)
  Worker Role: main() { … }
  Each role instance runs an Agent

Using queues for reliable messaging

  1) Receive work (Web Role: ASP.NET, WCF, etc.)
  2) Put work in queue
  3) Get work from queue (Worker Role: main() { … })
  4) Do work

  To scale, add more of either

Queues are the application glue
• Decouple parts of the application, so they are easier to scale independently
• Resource allocation: different priority queues and backend servers
• Mask faults in worker roles (reliable messaging)
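The four-step flow (receive work, put it in a queue, get it from the queue, do the work) can be sketched with threads standing in for role instances. This is a minimal sketch under stated assumptions: `run_pipeline`, the squaring "work", and the `None` shutdown sentinel are all hypothetical, and Python's in-process `queue.Queue` is not durable the way a cloud queue is.

```python
import queue
import threading


def run_pipeline(requests, worker_count=2):
    """Front end enqueues requests; worker threads dequeue and process them."""
    work = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            item = work.get()            # 3) get work from queue
            if item is None:             # sentinel: no more work for this worker
                work.task_done()
                return
            outcome = item * item        # 4) do work (placeholder computation)
            with lock:
                results.append((item, outcome))
            work.task_done()

    workers = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in workers:
        thread.start()
    for request in requests:             # 1) receive work
        work.put(request)                # 2) put work in queue
    for _ in workers:
        work.put(None)
    work.join()
    for thread in workers:
        thread.join()
    return sorted(results)
```

Raising `worker_count` is the in-process analogue of adding worker role instances: the producer side does not change because the two sides only share the queue.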

Use Inter-role communication for performance
• TCP communication between role instances
• Define your ports in the service models
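For direct TCP communication between role instances, a bare-bones sketch with Python's standard `socket` module is shown below. It runs both ends on localhost; the function names, the `ack:` reply format, and the use of an OS-assigned port are assumptions of this example, whereas on the platform the ports would come from the service model.

```python
import socket
import threading


def start_echo_instance(host="127.0.0.1", port=0):
    """Start a one-shot TCP endpoint standing in for a role instance.

    Returns (port, thread); port 0 asks the OS for any free port.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, port))
    server.listen(1)
    actual_port = server.getsockname()[1]

    def serve():
        conn, _ = server.accept()
        with conn:
            data = conn.recv(1024)
            conn.sendall(b"ack:" + data)
        server.close()

    thread = threading.Thread(target=serve, daemon=True)
    thread.start()
    return actual_port, thread


def send_to_instance(port, payload, host="127.0.0.1"):
    """Connect to another instance, send a small payload, return the reply."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(payload)
        return conn.recv(1024)
```

Unlike the queue pattern, this path gives no buffering or redelivery: if the receiving instance is down, the sender has to handle the failure itself.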
Points of interest

     Data is exposed via .NET and RESTful interfaces
     Data can be accessed by:
         Windows Azure apps
         Other on-premise applications or cloud applications
Diagram: Your App runs against the Development Fabric and Development Storage
         (other labels: Home, Source, Version)
Application Works Locally
What is the ‘Value Add’?
Provide a platform that is scalable and available
     Services are always running, rolling upgrades/downgrades
     Failure of any node is expected, state has to be replicated
     Failure of a role (app code) is expected, automatic recovery

     Services can grow to be large, provide state management
     that scales automatically
     Handle dynamic configuration changes due to load or failure
     Manage data center hardware: from CPU cores, nodes, and racks
     to network infrastructure and load balancers.
Key takeaways
Cloud services have specific design considerations
  Always on, distributed state, large scale, fault tolerance
  Scalable infrastructure demands a scalable architecture
     Stateless roles and durable queues

Windows Azure frees service developers from
 many platform issues
Windows Azure manages both services and servers
Architecture diagram:
  Web Role: Web Portal, Web Service, Job registration
  Job Management Role: Scaling Engine, Job Scheduler
  Job Registry: Azure Table
  Global queue, Worker …
  NCBI databases, Database updating
  Azure Blob: databases, temporary data, etc.
•   Always design with failure in mind
    - On large jobs it will happen, and it can happen anywhere
•   Factoring work into optimal sizes has large performance impacts
    - The optimal size may change depending on the scope of the job
•   Test runs are your friend
    - Blowing $20,000 of computation is not a good idea
•   Make ample use of logging features
    - When failure does happen, it’s good to know where
•   Cutting 10 years of computation down to 1 week is great!!
    - A few cloud development headaches are probably worth it
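Several of these lessons (design for failure, factor work into sized tasks, log failures) can be combined in a short sketch. The helper names, the fixed-size splitting rule, and the retry count are hypothetical choices for illustration; finding the right task size for a real job still needs the test runs mentioned above.

```python
import logging

log = logging.getLogger("job")


def split_into_tasks(items, task_size):
    """Factor a job into tasks of at most task_size items each."""
    if task_size < 1:
        raise ValueError("task_size must be at least 1")
    return [items[i:i + task_size] for i in range(0, len(items), task_size)]


def run_with_retries(task, process, max_attempts=3):
    """Run process(task), logging each failure and retrying up to max_attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return process(task)
        except Exception:
            # Log where and when it failed so the failure can be traced later.
            log.exception("task %r failed on attempt %d/%d",
                          task, attempt, max_attempts)
    raise RuntimeError(f"task {task!r} failed after {max_attempts} attempts")
```

Smaller tasks mean less work is lost per failure but more per-task overhead, which is one reason the best size can change with the scope of the job.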
Thank you!
