A Survey on Cloud Storage Systems

                                         Team : Xiaoming

 No existing taxonomy of cloud storage systems
 Detailed survey for users
 Starting point for researchers

Category                      Definition                                                          Example
Instance storage              Storage that comes with virtual machine images.                     Amazon EC2 instance storage
Object storage                Storage of binary objects provided in the form of Web services.     Amazon Simple Storage Service (S3)
                              An object can be any type of file.
Block storage                 Virtual block devices that can be attached to VM instances and      Amazon Elastic Block Store (EBS)
                              used like local disks.
Semi-structured data storage  Database service for storing semi-structured data with high         Amazon SimpleDB
                              availability, high scalability, and high performance.
Relational database storage   Relational database servers on VM instances in clouds.              Amazon Relational Database Service (RDS)
Distributed file system       Distributed storage provided through file system interfaces         Google File System
                              with high availability and high scalability.
Online drive/folder service   Storage space provided in the form of a virtual drive or folder     Microsoft SkyDrive
                              on the Internet.
Commercial Cloud Providers

Vendor     Instance  Object              Block    Semi-structured  Relational        Distributed         Online
           storage   storage             storage  data storage     database storage  file system         folder/drive
Amazon     EC2       S3                  EBS      SimpleDB         RDS               N/A
Microsoft  Azure VM  Azure Blob          Azure    Azure Table      SQL Azure         N/A                 SkyDrive/Mesh
Google     N/A       Google Storage for  N/A      BigTable         N/A               Google File System

Commercial Cloud Providers
 Windows Azure Blob
    - Distributed storage for large items. Each item can be up to 50 GB.
    - Azure Blob storage is organized into containers. Each container holds blobs, and each blob is made of blocks.
    - All access to Azure Blob is through an HTTP REST interface.
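
The container, blob, and block hierarchy described above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the class and method names (`Container`, `Blob`, `append_block`, `get_or_create`) are hypothetical and are not the Azure SDK or its REST interface; the 50 GB limit is the figure quoted on the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Per-blob size limit as quoted on the slide (50 GB).
MAX_BLOB_BYTES = 50 * 1024**3


@dataclass
class Blob:
    """Hypothetical model: a blob is an ordered sequence of blocks."""
    name: str
    blocks: List[bytes] = field(default_factory=list)

    @property
    def size(self) -> int:
        # Total size is the sum of the sizes of all blocks.
        return sum(len(block) for block in self.blocks)

    def append_block(self, data: bytes) -> None:
        # Reject a block that would push the blob past the limit.
        if self.size + len(data) > MAX_BLOB_BYTES:
            raise ValueError("blob would exceed the 50 GB limit")
        self.blocks.append(data)

    def read(self) -> bytes:
        # Reading a blob concatenates its blocks in order.
        return b"".join(self.blocks)


@dataclass
class Container:
    """Hypothetical model: a container maps blob names to blobs."""
    name: str
    blobs: Dict[str, Blob] = field(default_factory=dict)

    def get_or_create(self, blob_name: str) -> Blob:
        return self.blobs.setdefault(blob_name, Blob(blob_name))


container = Container("videos")
blob = container.get_or_create("intro.mp4")
blob.append_block(b"abc")
blob.append_block(b"def")
```

Uploading a large item block by block, as modeled here, is what makes the blob-of-blocks layout convenient for large files.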

 Windows SQL Azure
    - SQL Azure provides web-facing database functionality as a utility service.
    - Clients connect to the cloud-based database using the Tabular Data Stream (TDS) protocol.
    - Queries are written in Transact-SQL.
    - Applications and tools already in use with other relational databases work seamlessly with SQL Azure.

 Windows Azure Table
    - Provides structured storage for maintaining service state.
    - Structured storage is provided in the form of tables. Each table contains a set of entities, and each entity is made up of a set of named properties.
    - Supports LINQ, ADO.NET Data Services, and REST.
    - Azure Table can be thought of as a fancy spreadsheet. The state of an entity is stored in the columns of the spreadsheet.
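
The table, entity, and named-property model above can be illustrated with a minimal sketch. The `Table` class and its `insert` and `query` methods are hypothetical names chosen for this example; they are not the Azure Table API, LINQ, or ADO.NET Data Services.

```python
from typing import Any, Dict, List


class Table:
    """Hypothetical model: a table holds entities, each a set of named properties."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.entities: List[Dict[str, Any]] = []

    def insert(self, **properties: Any) -> Dict[str, Any]:
        # An entity is stored as a dict of property name -> value.
        entity = dict(properties)
        self.entities.append(entity)
        return entity

    def query(self, **criteria: Any) -> List[Dict[str, Any]]:
        # Return entities whose properties match every given criterion.
        return [
            entity for entity in self.entities
            if all(key in entity and entity[key] == value
                   for key, value in criteria.items())
        ]


# Service state kept as rows of a "spreadsheet".
sessions = Table("sessions")
sessions.insert(user="alice", state="checkout", items=3)
sessions.insert(user="bob", state="browsing")
matches = sessions.query(state="checkout")
```

In this sketch, `matches` contains only the entity for `alice`, since it is the only one whose `state` property equals `"checkout"`.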

Commercial Cloud Providers
 Amazon Elastic Block Store (EBS)
    - Off-instance storage that persists independently of the life of an instance.
    - Storage volumes behave like raw, unformatted block devices.
    - Volumes range from 1 GB to 1 TB and can be mounted on EC2 instances.

 Amazon S3
    - Object storage that is designed to make web-scale computing easier for developers.
    - Users can store persistent data organized in buckets and objects.
    - Uses standards-based REST and SOAP interfaces designed to work with any Internet development toolkit.
    - An unlimited number of objects can be stored, each containing 1 byte to 5 GB of data.
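
The bucket and object organization above, together with the per-object size bounds, can be sketched as follows. This is an in-memory illustration only: the `Bucket` class and its `put` and `get` methods are hypothetical and do not correspond to the S3 REST or SOAP interfaces; the 1 byte and 5 GB bounds are the figures quoted on the slide.

```python
from typing import Dict

# Per-object size bounds as quoted on the slide.
MIN_OBJECT_BYTES = 1
MAX_OBJECT_BYTES = 5 * 1024**3  # 5 GB


class Bucket:
    """Hypothetical model: a bucket maps object keys to object data."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        # Enforce the per-object size bounds; the object count is unbounded.
        if not MIN_OBJECT_BYTES <= len(data) <= MAX_OBJECT_BYTES:
            raise ValueError("object size must be between 1 byte and 5 GB")
        self.objects[key] = data

    def get(self, key: str) -> bytes:
        return self.objects[key]


bucket = Bucket("survey-data")
bucket.put("reports/2011.txt", b"cloud storage survey")

# An empty object falls below the 1-byte minimum and is rejected.
try:
    bucket.put("empty.txt", b"")
    rejected = False
except ValueError:
    rejected = True
```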

 Amazon Relational Database Service (RDS)
    - Provides cost-effective and resizable storage capacity.
    - Applications and tools in use with existing MySQL databases work seamlessly with Amazon RDS.

 Amazon SimpleDB
    - Non-relational database that offloads the work of database administration.
    - Users can focus on application development without worrying about infrastructure provisioning, high availability, or software maintenance.

Commercial Cloud Providers - Use Cases
 Creating a web application with relational data
    - SQL Azure or Amazon RDS can be used.
 Creating a parallel processing application, storage for data analysis, backup and recovery
    - Examples: financial modeling at a bank, new drug development at a pharmaceutical company.
    - Azure Blob or Amazon S3 can be used to store intermediate data.
 Creating a scalable web application, gaming application, or metadata index
    - Examples: online ticketing system, news video site.
    - Azure Table or Amazon SimpleDB can be used.
 Applications that require a database, a file system, or access to raw block-level storage
    - Amazon EBS or Azure Drive can be used.
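
The use-case recommendations above amount to a lookup table, sketched below as a simple decision aid. The dictionary keys are paraphrases of the use cases on this slide, and the `suggest` helper is a hypothetical name introduced for this example.

```python
from typing import Dict, List

# Use case -> services recommended on the slide.
USE_CASES: Dict[str, List[str]] = {
    "relational web application": ["SQL Azure", "Amazon RDS"],
    "parallel processing, data analysis, backup and recovery": ["Azure Blob", "Amazon S3"],
    "scalable web application, gaming, metadata indexing": ["Azure Table", "Amazon SimpleDB"],
    "database, file system, or raw block-level storage": ["Amazon EBS", "Azure Drive"],
}


def suggest(keyword: str) -> List[str]:
    """Return the services listed for every use case that mentions the keyword."""
    keyword = keyword.lower()
    services = {
        service
        for use_case, candidates in USE_CASES.items()
        if keyword in use_case
        for service in candidates
    }
    return sorted(services)


backup_options = suggest("backup")
web_options = suggest("web")
```

A keyword such as "web" matches both the relational and the scalable web application use cases, so `suggest` returns the union of their services.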
Academic Cloud Systems

System      Instance  Object                    Block    Semi-structured  Distributed
            storage   storage                   storage  data storage     file system
Eucalyptus  VM        S3                        EBS      N/A              N/A
Nimbus      VM        Cumulus                   N/A      N/A              N/A
OpenNebula  VM        N/A                       N/A      N/A              N/A
OpenStack   VM        OpenStack object storage  N/A      N/A              N/A
Hadoop      N/A       N/A                       N/A      HBase            Hadoop Distributed File System (HDFS)
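
The comparison above can be encoded as data so that it can be queried by storage category. This is a sketch only: the field names and the `providers_of` helper are hypothetical, and `None` stands for the N/A entries in the table.

```python
from typing import Dict, Optional

# One row per system; None marks an N/A cell in the table above.
ACADEMIC: Dict[str, Dict[str, Optional[str]]] = {
    "Eucalyptus": {"instance": "VM", "object": "S3", "block": "EBS",
                   "semi_structured": None, "dfs": None},
    "Nimbus": {"instance": "VM", "object": "Cumulus", "block": None,
               "semi_structured": None, "dfs": None},
    "OpenNebula": {"instance": "VM", "object": None, "block": None,
                   "semi_structured": None, "dfs": None},
    "OpenStack": {"instance": "VM", "object": "OpenStack object storage", "block": None,
                  "semi_structured": None, "dfs": None},
    "Hadoop": {"instance": None, "object": None, "block": None,
               "semi_structured": "HBase", "dfs": "HDFS"},
}


def providers_of(category: str) -> Dict[str, str]:
    """Return the systems that offer the given storage category, with the service name."""
    return {
        system: row[category]
        for system, row in ACADEMIC.items()
        if row[category] is not None
    }


object_stores = providers_of("object")
block_stores = providers_of("block")
```

Querying the `block` category returns only Eucalyptus, which makes the sparseness of the table easy to see.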

Academic Cloud Systems
 Eucalyptus
 [Figure: Eucalyptus storage architecture. Labels: SOAP/REST-based tools; Cluster A and Cluster B, each with a Storage Controller.]

 S3 mainly used for VM image storage
 Typical configuration contains one storage server per cluster

Academic Cloud Systems
 Nimbus
  - Cumulus service is used for VM image storage
  - Cumulus can be configured to use various storage backends
 OpenNebula
  - Two ways to manage VM images: shared (NFS) and non-shared (SSH)

Academic Cloud Systems
 OpenStack
  - OpenStack object storage used for VM image management
  - Uses disk blocks directly instead of file systems
 Hadoop
  - The HDFS interface is not fully compatible with the POSIX standard, nor is the system optimized for file I/O
  - HBase is built on top of HDFS

Conclusions and Future work
 Virtualized I/O performance of cloud storage services is not yet comparable to local disk
 Academic cloud systems do not yet provide a rich set of storage services
 Future work: performance tests for commercial storage services
 Future work: further investigation of design and implementation details
 Future work: include emerging services from other providers
