Cloud Computing with the Windows Azure Platform

Document Sample
Cloud Computing with the Windows Azure Platform Powered By Docstoc
					Chapter 13: Exploiting SQL
Azure Database's Relational
              Cloud-based relational database management systems (RDBMSs)
      are relatively uncommon because they are difficult to implement with the
      enterprise-grade scale-out and availability capabilities IT departments have
      come to expect from traditional schemaless Entity-Attribute-Value (EAV)
      data stores, such as the Google App Engine’s datastore and Amazon Web
      Services’ SimpleDB. Google offers an SQL-like query dialect called GQL
      and Amazon provides a set of SimpleDB API calls to emulate simple SQL-
      like SELECT statements. Neither of these query ―languages‖ approaches
      the capability of ANSI SQL to deliver complex result sets tailored to a
      particular structure and having a precise, predictable composition. These
      tables also lack the notion of relational INNER or OUTER JOINs, but the
      App Engine’s datastore can emulate many:one associations with db.Key
      and one:many associations with list(db.Key) data types.
 The “Comparing Google App Engine, Amazon SimpleDB and Microsoft
SQL Server Data Services” post of May 6, 2008 to the OakLeaf Systems
blog (,
google-app-engine-amazon.html) describes differences in the
capabilities of these three EAV datastores at the time. SQL Server Data
Services (renamed SQL Data Services when this book was written) will be
retired when Microsoft releases SQL Azure as a commercial product in late
             Before you can query relational or EAV cloud-based tables, you
      must specify a relational schema or define entities and their properties, as
      well as populate tables or entity collections with data. Most early cloud
      databases or datastores will be based on schemas of on-premises databases
      and contain at least a nominal amount of historical data. Undoubtedly,
      getting data-intensive applications up and running in the cloud will be
     faster and simpler with an RDBMS rather than EAV tables as the datastore.
     This chapter covers migrating schemas and data from on-premises SQL
     Server [Express] instances to SQL Azure Database (SADB) running in
     Microsoft data centers, as well as executing DML and DQL queries against
  Migrating schemas and data from on-premises databases to SADB in the
cloud was essentially a manual process when this chapter was written.
Windows Azure evangelist Steve Marx said in an August 26, 2008, Tweet,
“APIs for automatic deployment to Windows Azure are coming before
launch” (, The
forthcoming Data Hub aggregation services will enable synchronization
between on-premises SQL Server instances and SQL Azure in the cloud.
This chapter will be updated online when a CTP of Data Hub services
becomes available, probably shortly after the Professional Developers
Conference (PDC) 2009.

Creating and Populating SADB
            A major issue with cloud-based databases, regardless of whether
     they use the relational or EAV data model, is populating them with existing
     historical or reference data. This issue especially applies to large data sets,
     such as clinical research data, genetic information, personal health records
     (PHR), electronic medical records (EMR), and transactional data for
     business intelligence (BI) applications, such as data mining, and
     governmental statistics.
             The speed at which you can add rows to remote databases over a
     wide-area network (WAN) is limited by the bandwidth of the connection
     from the source file or table to the destination table. Most large
     organizations have multiple 1.544Mbps Digital Signal 1 (DS1 or T1) or
     faster connections between data centers and the Internet; individuals usually
     make do with DSL or cable connections having upload speeds less than
      T1’s but considerably higher download speeds. To avoid extremely long
      upload (or download) times for very large tables, Amazon Web Services
      (AWS) offers a beta version of the AWS Import/Export service, which
      relies on shipping high-capacity disk drives between the source data center
      and an Amazon data center.
              The detail page for the AWS Import/Export service,,, includes the following
      table that shows the time required to transfer a 1TB table at common
      connection speeds:
  Available Internet         Theoretical Minimum        When to Consider AWS
  Connection                 Number of Days to          Import/Export?
                             Transfer 1TB at 80%
                             Network Utilization
  T1 (1.544 Mbps)            82 days                    100GB or more
  10 Mbps                    13 days                    600GB or more
  T3 (44.736Mbps)            3 days                     2TB or more
  100 Mbps                   1 to 2 days                5TB or more
  1,000 Mbps                 Less than 1 day            60TB or more
  The AWS Import/Export service is a sophisticated extension to the
traditional “sneakernet” file transfer approach. Microsoft didn’t offer a
similar service when this chapter was written in mid-September 2009.
             Although the maximum size of a SQL Azure database was 10GB
      when this book was written, sharding large source files or tables into
      multiple databases is a common practice. The later ―Sharding Tables to
      Overcome Database Size Limits‖ section describes several approaches to
      dividing large tables into manageable units.
  You can expect bulk uploading data to SADB, which uses SQL Server’s
traditional Tabular Data Stream (TDS) protocol over TCP port 1433, to be
faster than REST or SOAP protocols for the same amount of data due to
the latter’s much larger amount of overhead characters. Microsoft also
promises to implement the bulk copy protocol (bcp) in the PDC timeframe.

Generating the Northwind Sample Database
in SADB from the instnwind.sql Script
            The most common method for automating the creation of schemas
    for SQL Server databases is to execute a script containing a series of T-
    SQL DDL commands to add tables, columns, primary-key and foreign-key
    constraints, default values, indexes, triggers, views, stored procedures, and
    other database objects. If complex tables contain less than about 1,000
    rows, incorporating INSERT statements to populate the empty tables that
    the script creates and testing them for compatibility with the source data is
           The instnwind.sql script included in the Northwind and pubs Sample
    Databases for SQL Server 2000 download (,
    eebc53a68034&displaylang=en) generates tables, relationships,
    views, and stored procedures for Microsoft Access’s original Northwind
    database as well as five additional tables: CustomerCustomerDemo,
    CustomerDemographics, EmployeeTerritories, Regions, and Territories.

Conforming instnwind.sql to SADB T-SQL DDL
           The \WROX\Azure\Chapter13 folder contains
    InstnwindFromNorthwind.sql, which is a version of instnwind.sql without
    the additional tables and with the following modifications to accommodate
    SADB’s limitations on T-SQL DDL statements:

            The USE master and CREATE DATABASE commands ending
             with USE Northwind are removed because SADB doesn’t
             support T-SQL’s USE command, filename instructions, or other
             related database-wide commands. You can use the SQL Azure
             portal to CREATE and DROP databases quickly.
            The if exists() functions are removed because SADB
             substitutes sys.tables, sys.views, sys.procedures, and related
             system tables for sysobjects. Dropping and creating the database,
             and then running the entire procedure is easier than modifying
             each function. See the later ―Updating IF EXISTS() Tests That
               Use sysobjects Arguments‖ section for more details on syntax
              varchar(max), nvarchar(max), and varbinary(max)
               datatypes replace text, ntext, and image datatypes, as
               mentioned in Chapter 12.
              A clustered primary-key index must be added to tables without
               such an index because SADB doesn’t support heap tables. All
               original Northwind tables have clustered primary-key indexes.
               The additional tables have explicitly PRIMARY KEY
               NONCLUSTERED indexes.
             The SQL Azure ―Transact-SQL Reference (SQL Azure Database)‖
      topic (,
      us/library/ee336281.aspx) includes sections that define
      supported, partially supported, and unsupported T-SQL statements.
 InstnwindFromNorthwind.sql has 9,000 lines and is 935KB in size; the
original instnwind.sql script has 9,885 lines and is 2.01MB in size.

Updating IF EXISTS() Tests That Use sysobjects
             The original style of instnwind.sql’s object-existence tests for SQL
      Server 2000 is as follows:
if exists (select * from sysobjects where id = object_id('dbo.Order Details')
    and sysstat & 0xf = 3)
    drop table "dbo"."Order Details"

             SQL Server 2005 and 2008 also execute such commands as
      expected. SADB doesn’t recognize a sysobjects table, so the syntax
      required is as follows:
IF EXISTS (SELECT * FROM sys.tables WHERE name = 'Order Details'
    AND type = 'U')
    DROP TABLE [Order Details]

             U is the code for user objects to distinguish them from system
      objects. Substitute views, triggers, functions, or procedures
      for tables and change the DROP command’s argument accordingly.
Running InstnwindFromNorthwind.sql in SSMS
           To run InstnwindFromNorthwind.sql in from SSMS 2008 [Express],
    do the following:

          1. Open the SQL Azure portal, select the project you created in
             Chapter 12, and click the Manage link to open the Server
             Administration page.
          2. If you created the Northwind database in Chapter 12 and added
             any objects to it, select Northwind in the Database Name list,
             click the Drop Database button, and confirm your choice.
          3. Click the Create Database button, type Northwind in the Name
             Your Database text box, and click Create.
          4. Open SSMS 2008 [Express] and click Cancel if the Connect to
             Database dialog appears.
          5. Choose FileOpen (Ctrl+O) or File Open with New
             Connection (Ctrl+Shift+O) to open the File Open dialog.
          6. Navigate to \WROX\Azure\Chapter13 and double-click the
             InstnwindFromNorthwind.sql node to display the file in the
             Query Editor window and open the Connect to Database dialog.
          7. Complete any missing login information in the login dialog, click
             the Details button and verify that Northwind is specified in the
             Connect to Database text box and the Encrypt Connection check
             box is marked (see Chapter 12’s ―Connecting to SADB Master
             and User Databases with SSMS 2008 [Express]‖ section for login
          8. Click Connect, acknowledge the spurious ANSI NULLS error
             message, and verify that CloudNode.dbo.UserDb appears in the
             Available Services toolbar text box or Status Bar (see Figure 13-

          Figure 13-1. The InstnwindFromNorthwind.sql T-SQL script
          loaded in the SSMS [Express] Query Editor window.

          9. Click Execute to run the script.
          After about a minute or so, assuming a DSL connection with an
   upload speed of about 512Kbps, a few messages appear in the lower pane
   of the Query Editor window (see Figure 13-2).

         Figure 13-2. Messages returned to the lower pane of the Query
         Editor window after execution of the
         InstnwindFromNorthwind.sql T-SQL script. [f1302.bmp]

          Click the New Query button, confirm the connection parameters,
   verify existence of all tables by executing SELECT * FROM
   sys.tables to display the table properties, and execute SELECT *
   FROM TableName and compare the number of rows with the on-
   premises versions to verify that all INSERT operations succeeded.

Using SSIS to Populate a Local
AdventureWorksLT2008 Database from
          SQL Server 2008 Books Online describes SQL Server Integration
   Services (SSIS, formerly known as Data Transformation Services, DTS) as

                Microsoft Integration Services is a platform for building
                enterprise-level data integration and data transformations
                solutions. You use Integration Services to solve complex
                business problems by copying or downloading files,
                sending e-mail messages in response to events, updating
                data warehouses, cleaning and mining data, and managing
                SQL Server objects and data. The packages can work alone
                or in concert with other packages to address complex
                business needs. Integration Services can extract and
                transform data from a wide variety of sources such as XML
                data files, flat files, and relational data sources, and then
                load the data into one or more destinations.
                Integration Services includes a rich set of built-in tasks and
                transformations; tools for constructing packages; and the
                Integration Services service for running and managing
                   packages. You can use the graphical Integration Services
                   tools to create solutions without writing a single line of
                   code; or you can program the extensive Integration
                   Services object model to create packages programmatically
                   and code custom tasks and other package objects.
            SSIS was one of two methods of moving data into or out of SQL
      Azure August 2009 CTP and was the only practical method at the time for
      moving data from SQL Azure to a local database. You can’t start the Script
      Wizard from SSMS [Express]’s Object Explorer because the CTP doesn’t
      support SQL Azure connections to Object Explorer.
  SSIS is a component of SQL Server 2008 Developer Edition and higher
and isn’t included with SQL Server 2008 Express. You can install the SQL
Server 2008 Trial Edition (180 days life) or the SQL Server 2008 R2 CTP
  Alternatively, you can purchase SQL Server 2008 Developer Edition from and other software retailers for about $42.00. The Developer
edition includes all SQL Server Standard Edition components and features;
it is licensed for development only and not for production use.
             As mentioned in Chapter 12’s ―Reviewing Current SQL Azure
      Database Documentation and Resources‖ section, the Windows Azure
      Platform Training Kit – August 2009 Update (or later) contains several
      demos, hands-on labs, and presentations. Among these is a demo called
      ―Moving Data Into and Out of SQL Azure‖ which is located in your
      C:\WindowsAzurePlatformKit\Demos\MovingDataUsingSSIS folder,
      assuming you installed it to the default location.
 Despite the demo’s name, it doesn’t cover moving data into SADB.

Installing Prerequisites for Moving Data Using SSIS
             This demo requires installing the Training Kit from the location
      described in Chapter 12 and SSIS in conjunction with a non-Express
      version of SQL Server 2008. This chapter’s examples assume that you log
      on with the sa credentials you established when creating your SQL Azure
     account, so creating the HolTestUser account isn’t required. You create the
     equivalent of the HolTestDB database in later sections.
             You must also download and attach the AdventureWorksLT2008
     (lite) database to the SQL Server 2008 instance from CodePlex by
     following these steps:

           1. Navigate to the Microsoft SQL Server 2008 Database Product
              Samples Service Release (SR) 2 project on CodePlex
           2. Click the appropriate SQL2008.AdventureWorks_All_Databases.
              ... .msi file to suit your hardware architecture (x86, x64, or ia64)
              and accept the license agreement to download the installer
           3. Run the installer, accept the MS-PL license, and click Next to
              open the Custom Setup dialog.
           4. If you don’t want to create all the databases included in the
              download, select the Entire Feature Will Be Unavailable option
              for all databases except AdventureWorks2008LT, as shown in
              Figure 13-3.

           Figure 13-3. Creating the AdventureWorks2008LT database
           only. [f1303.bmp]

           5. Click Next and Install to begin the file installation and database
              creation process.
 If you encounter a problem with creating AdventureWorksLT2008 with the
preceding procedure, download from the Microsoft
SQL Server 2008 Database Product Samples Service Release (SR) 1
project (,
leases.aspx?ReleaseId=18407), extract the
AdventureWorksLT2008_Data.mdf and AdventureWorksLT2008_Log.ldf
files to \Program Files\Microsoft SQL
Server\MSSQL10.SQLServer\MSSQL\Data, and attach the Data file to
your SQL Server Instance with SSMS.
            Figure 13-4 shows SSMS with AdventureWorksLT2008 installed
      and connected to Object Explorer.

             Figure 13-4. Exploring the contents of the
             AdventureWorksLT2008 Database.        [f1304.bmp]

Verifying Demo Prerequisites with the Configuration
            The Configuration Wizard for the Hello Windows Azure Demo tests
      your computer’s configuration for the components necessary to run the
      SSIS demo. Full Text Search must be installed to enable the Configuration
             To run the Configuration Wizard, do the following.

             1. Navigate to the
                folder and double-click the StartHere.cmd icon to open the
                Configuration Wizard.
             2. Click Next to open the Steps Summary window and click Next to
                test the configuration. Figure 13-5 shows the Configuration
                Wizard with all prerequisites installed.

             Figure 13-5. Completing the Configuration Wizard’s
             prerequisite tests. [f1305.bmp]

             3. Click Next and Close to dismiss the wizard.
  If any prerequisites are missing, the wizard usually provides a link or
instructions for installation.
Creating a T-SQL Script with SSMS to Populate an SADB
            SSMS has the capability to generate T-SQL scripts for creating
    database schema and populating tables automatically. You can execute the
    script from SSMS [Express], as you did for the Northwind database, or
    execute it by a SQL Server Integration Services (SSIS) package in
    conjunction with other T-SQL commands against the master database, such
    DatabaseName to eliminate the need to involve the SQL Azure Portal in
    database generation operations.
           To generate a script that creates a schema for the
    AdventureWorksLT2008 database in SADB and then uploads data for
    tables and other objects, do the following:

          1. Choose Start All Programs SQL Server 2008 SQL Server
             Management Studio to launch SSMS, and then connect to the
             local AdventureWorksLT2008 database.
          2. If you chose the Open New Query Window At Startup option, as
             recommended earlier in the chapter, open Object Explorer, click
             the toolbar’s Connect to Object Explorer button, and connect to
             the local AdventureWorksLT2008 database.
          3. Right-click any database node and select Tasks Generate
             Scripts to open the Script Wizard, and click Next to bypass the
             Welcome dialog.
          4. Select the AdventureWorksLT2008 database in the Select
             Database list and mark the Script All Objects in the Select
             Database check box (see Figure 13-6).

          Figure 13-6. Selecting the local AdventureWorksLT2008
          database and setting the “Script all objects in the selected
          database” option. [f1306.bmp]

          5. Click Next to open the Choose Script Options dialog and accept
             the default script options except those in the following table (see
             Figure 13-7):
  Script Option           Value     Reason for Non-Default Value
  Convert UDDTs to        True      SQL Azure doesn’t support user-defined
  base types                        data types.
  Script extended         False     SQL Azure doesn’t support extended
  properties                        properties.
  Script USE              False     SQL Azure doesn’t support the USE
  DATABASE                          DATABASE command.
  Script Data             True      Generating INSERT statement for adding
                                    data is a very important feature.

            Figure 13-7. Setting script options to conform a few T-SQL
            statements to SADB requirements, where possible.

            6. Click Next to open the Output Option dialog and select the Script
               to New Query Window option.
            7. Click Next to open the Generate Script Progress dialog and then
               click Finish to generate the script. After a minute or two, the
               dialog appears as shown in Figure 13-8 later in this chapter.
              Clicking SSMS’s Execute button to run the script generates more
      than 100 errors due to additional limitations in SADB’s T-SQL repertoire
      that the Script Wizard’s available options don’t handle. To learn more
      about the corrections required to make the script compatible with SADB,
      see the Windows Azure Platform Training Kit’s Migrating Databases to
      SQL Azure Hands-on Lab’s Exercise 1, Task 2—Modifying the Exported
      Script from SQL Server 2008 (OPTIONAL), steps 1 to 30.
  Future SQL Azure versions might eliminate the need to manually correct
unsupported T-SQL syntax generated by SSMS’s Query Wizard. Even if
not, a future version of the SQL Azure Migration Wizard, which the later
“Migrating Schemas and Data from the On-Premises
AdventureWorksLT2008 Database to SQL Azure” section describes, is
likely to be able to accomplish this task.
Using the Pre-Built AdventureWorksLT2008_Azure.sql
Script to Populate an SQL Azure Database
              Fortunately, the Toolkit team included a copy of the SADB-
      compatible script, AdventureWorksLT2008_Azure.sql with the assets for
      the demos. You can find a tested copy of the script in the
      \WROX\Azure\Chapter13 folder. Do the following to use this corrected
      script to create and populate tables in a newly created SQL Azure
      AdventureWorksLT2008 database:

             1. If you’ve created the Azure AdventureWorksLT2008 database in
                an SQL Azure instance, use the SQL Azure portal to delete it.
             2. In the SQL Azure portal, create an empty
                AdventureWorksLT2008 database using the procedure described
                in the earlier ―Generating the Northwind Sample Database in
                SADB from the instnwind.sql Script‖ section.
             3. Launch SSMS [Express] and open
                AdventureWorksLT2008_Azure.sql in a new query window with
                a connection to the new AdventureWorksLT2008 SQL Azure
             4. Click Execute to add tables, data, and other AdventureWorks
                database objects to the database.
 Disregard the deliberately introduced and non-fatal error near the end of
the script, which is caused by attempting to execute an unsupported SORT
IN TEMP DB option when indexing a view.

Designing the Package to Import Data from SQL Azure
to On-Premises SQL Server Tables
             SADB v1 doesn’t support BACKUP or RESTORE operations, so
      backing up SADB data requires importing it to an on-premises or network-
      accessible SQL Server database with the same or similar schema that can
      back up the data generate in the cloud. To start the package design process,
      do the following:

             1. Open SQL Server Business Intelligence Development Studio
           (BIDS) from the StartAll ProgramsSQL Server 2008SQL
           Server Business Intelligence Development Studio menu
      2.   Open the New Project dialog by choosing FileNewProject.
      3.   Select Business Intelligence Projects in the Project types list.
      4.   Select Integration Services Project in the Templates pane.
      5.   Type CloudSalesImport in the Solution Name text box (see
           Figure 13-8).

      Figure 13-8. Naming the new business Intelligence (BI) project.

      6. Click OK to create the CloudSalesImport Solution. Double-click
         the package.dtsx node in Solution Explorer to open its designer
        If you receive the error message shown in Figure 13-9 after
completing step 6 and search on the error message, you receive a hit on the
―Cannot create a new ssis project in bids 2008‖ thread
4aca-8165-cf5ed5e2d8d4) of the SQL Server Integration Services
forum. The response in this thread refers you to Knowledge Base article
818419, ―BUG: You receive a "CustomMarshalers not found" error
message when you move an element in the XML Schema designer‖
(, This KB article
applies to Visual Studio 2002 and 2003, and when this book was written
was last reviewed on May 16, 2007. However, the article does point to the
source of the problem, which is a version conflict in msadds.dll and related
designer libraries when you install a Visual Studio 2010 beta version before
installing SQL Server 2008.

      Figure 13-9. The error message resulting from problems with
      VS 2008 and 2010 designer file versions.    [f1309.bmp]
  A search on “msdds.dll” results in a hit on Microsoft Connect’s “mscorsvw
- msiexec.exe - never ending installation” bug report 481903
Feedback.aspx?FeedbackID=481903), which accurately describes the
locations of and versioning issues with msdds*.dll libraries. The only
reliable workaround for this problem with VS 2010 Beta 1 is to remove the
VS 2010 beta installation, remove SQL Server 2008, install VS 2008 and
VS2008 SP1, and then reinstall SQL Server 2008 and SQL Server 2008
            Following is a summary of the basic operations in the
      CloudSalesImport package design process:

            1. Create an OLE DB Connection Manager for the local (on-
               premises) AdventureWorksLT2008 database.
            2. Create an ADO.NET Connection Manager for the SADB (cloud)
               AdventureWorksLT2008 database.
            3. Delete all records from the local SalesLT.SalesOrderDetail table
               to conform with referential integrity constraints with a DELETE
               FROM SalesLT.SalesOrderDetail T-SQL statement.
            4. Delete all records from the local SalesLT.SalesOrderHeader table
               with a DELETE FROM SalesLT.SalesOrderHeader T-
               SQL statement.
            5. Import all SADB (cloud) SalesLT.SalesOrderHeader rows to the
               local SalesLT.SalesOrderHeader table changing the datatype of
               the CreditCardApprovalCode field from nvarchar to varchar in
               the process.
            6. Import all SADB (cloud) SalesLT.SalesOrderDetail rows to the
               local SalesLT.SalesOrderDetail table.

Create Connection Managers for the Local (Source) and SADB (Cloud)
             SSIS requires Connection Manager objects to connect to supported
      databases. SQL Azure requires an ADO.NET Connection manager. To
      create new OLE DB and ADO.NET Connection Managers, do the
            1. With the empty CloudSalesImport package open and the Control
               Flow tab selected, right-click inside the Connection Managers
               pane and choose New OLE DB Connection to open the
               Configure OLE DB Connection Manager dialog.
            2. Click New to open the Connection Manager dialog, type
               localhost in the Server Name combo box, accept the default Use
               Windows Security option, select AdventureWorksLT2008 from
               the Select or Enter a Database Name list, click Test Connection
               to verify connectivity, and click OK to dismiss the message (see
               Figure 13-10.)

            Figure 13-10. Defining an OLE DB Connection Manager for the
            local (destination) database.  [f1310.bmp]

 Click the All button to display all Connection Manager properties.

            3. In the Connection Managers pane, right-click
               localhost.AdventureWorksLT2008 and rename it
            4. Right-click inside the Connection Managers pane and choose
               New ADO.NET Connection to open the Configure ADO.NET
               Connection Manager dialog.
            5. Click New to open the Connection Manager dialog, type or paste
               the SADB Server Name from the connection string, including the
               tcp: prefix, in the Server Name combo box. Select the Use
               SQL Server Authentication option, type your sa User Name and
               Password text boxes, and mark the Save My Password check
               box. Type AdventureWorksLT2008 in the Select or Enter a
               Database Name text box, click Test Connection to verify
               connectivity, and click OK to dismiss the message (see Figure
 You must type the database name because the SADB CTP doesn’t
provide metadata to the ADO.NET provider.
            Figure 13-11. Defining an ADO.NET Connection Manager for
            the SADB (cloud) database.    [f1311.bmp]

            6. In the Connection Managers pane, right-click
               tcp:servername.AdventureWorksLT2008 and rename
               it CloudDB.AdventureWorksLT2008.

Add and Test Execute SQL Tasks to Clear the Local SalesOrder and
SalesHeader Tables
             Control Flow tasks determine package execution. For this example,
      you must clear the SalesLT.SalesOrderDetail and
      SalesLT.SalesOrderHeader tables with Execute SQL Statement Tasks
      before importing the data from the cloud database by doing the following:

            1. Drag the Execute SQL Statement Task from the Toolbox’s
               Control Flow Items and drop it on the Designer’s Control Flow
               page surface.
            2. Right-click the Task and select Edit to open the Execute SQL
               Task Editor dialog.
            3. Change the Name of the task to Clear Sales Order Details, accept
               or select OLE DB as the Connection Type, and select the
               LocalDB.AdventureWorksLT2008 Connection Manager in the
               Connection list. Type DELETE FROM
               SalesLT.SalesOrderDetail as the SQL Statement, set
               BypassPrepare to False, and click Parse Query to verify the
               connection and your T-SQL syntax (see Figure 13-12).

            Figure 13-12. Defining an Execute SQL Statement Task to clear
            the local SalesLT.SalesOrderDetail table.  [f1312.bmp]

  The Execute SQL Task item has small red circle containing a white “X,”
indicating a missing Connection Manager until you click the OK button to
update the item.
 If you change the SQLSourceType value from Direct Input to File
Connection, you can substitute a T-SQL script for the SQL Statement by
clicking the Browse button and navigating to the appropriate *.sql file.

            4. Click OK to close the dialog and update the Execute SQL Task
            5. Repeat steps 1 through 4, but drop the new task below the
               previous task and connect the arrow to the new task. Change the
               Name to Clear Sales Order Headers and the SQL Statement to
               DELETE FROM SalesLT.SalesOrderHeader (see
               Figure 13-13).

            Figure 13-13. Defining an Execute SQL Statement Task to clear
            the local SalesLT.SalesOrderHeader table. [f1313.bmp]

            6. Click OK to save the changes to the second task and press F5 to
               build and run the package. After a second or two, the two SQL
               Statement Task items will turn green, you receive a ―Package
               execution completed.…‖ message, and the Output pane opens
               with the messages shown in Figure 13-14.

            Figure 13-14. Testing execution of the two Execute SQL
            Statement Tasks. [f1314.bmp]

            7. Click the ―Package execution completed.…‖ message to return to
               design mode.
            8. Launch SSMS and execute SELECT * FROM
               SalesLT.SalesOrderDetail and SELECT * FROM
               SalesLT.SalesOrderHeader queries to verify that the two
               tables are empty.

Add and Test Data Flow Tasks to Import SalesOrder Data from an SADB to
a Local Database Table
             The final procedure is to add two Data Flow Tasks to copy the data
      from the cloud to the local database and test data transfer by doing the
1. Drag a Data Flow Task from the Toolbox’s Control Flow Items,
   drop it on the Designer’s Control Flow page surface below the
   Clear Sales Order Headers task, and connect them.
2. Right-click and rename the Data Flow Task to Import Sales
   Order Headers, right-click it again and choose Edit to select the
   Data Flow tab and display its empty pane.
3. Drag an ADO NET Source item from the Tool Box’s Data Flow
   Sources group to the Data Flow pane.
4. Right-click the ADO NET Source item, rename it CloudDB
   Sales Headers and right-click it again and choose Edit to open the
   ADO.NET Source Editor dialog.
5. Select the CloudDB.AdventureWorksLT2009 ADO.NET
   Connection Manager in the ADO.NET Connection Manager list,
   accept the default Table or View as the Data Access Mode and
   select “SalesLT”.”SalesOrderHeader” in the Name of
   the Table or the View list (see Figure 13-15).

Figure 13-15. Configuring an ADO NET Source for an SADB
(cloud) table in the ADO.NET Source Editor. [f1315.bmp]

6. Click the Preview button to verify that the connection to the
   SADB behaves as expected by displaying the first few rows of
   the source table in a grid (see Figure 13-16).

Figure 13-16. Verifying the first few rows of the SADB table
from which to import data.        [f1316.bmp]

7. Click the Columns link in the ADO.NET Source Editor’s left
   column to open the Columns page in which you can edit column
   names to conform to those of the local table, if necessary (see
   Figure 13-17).

Figure 13-17. Mapping External (source) Column to Output
(destination) Column names, if necessary. [f1317.bmp]

8. Click OK to close the dialog and save the ADO NET Source
                configuration, including the default mappings.
            9. Click and drag an OLE DB Destination Data Flow Task from the
                Tool Box’s Data Flow Destinations group to the Data Flow pane
                below the CloudDB Sales Headers item. Connect the two items,
                and rename the new Data Flow Task to LocalDB Sales Headers.
            10. Right-click the LocalDB Sales headers Data Flow Task and click
                Edit to open the OLE DB Destination Editor. Select
                CloudDB.AdventureWorksLT2008 as the ADO.NET Connection
                Manager, accept the default Table or View – Fast Load as the
                Data Access Mode, and select [SalesLT].[SalesOrderHeader] as
                the Name of the Table or the View. Select the Keep Identity
                check box to overwrite the destination table’s identity column
                values (see Figure 13-18.)

            Figure 13-18. Configuring an OLE DB Destination for an SQL
            Server (local) table in the OLE DB Destination Editor.

 Marking the Keep Identity check box has the same effect as specifying
SET IDENTITY INSERT ON for the table, which is available for OLE DB
destinations only. Insertion options appear only if you specify Table or
View – Fast Load as the Data Access Mode. (That’s why OLE DB, rather
than ADO.NET (which doesn’t support this feature), is specified as the
driver type for the local database.)

            11. Click OK to close the dialog and apply the new settings, which
                don’t remove the red error indicator. Choose ViewErrors from
                the main menu to open the Errors pane, which displays two
                ―Validation error. Import Sales Order Headers: Import Sales
                Order Headers: Column "CreditCardApprovalCode" cannot
                convert between unicode and non-unicode string data types‖
                messages (look ahead to Figure 13-19).
 SADB requires all character data types to be encoded as Unicode
(nchar or nvarchar); the local database’s CreditCardApprovalCode field.
12. Delete the connection between CloudDB Sales Headers and
    LocalDB Sales Headers, drag a Data Conversion item from the
    Toolbox’s Data Transformations group, connect the CloudDB
    Sales Headers item to it, and connect it to the LocalDB Sales
    Headers item.
13. Rename the Data Conversion item to CreditCard Approval Code.
    Right-click it and choose Edit to open the Data Conversion
    Transformation Editor dialog.
14. Scroll to and mark the Available Input Columns’
    CreditCardApprovalCode check box to add it to the lower Input
    Columns list, add a varchar prefix to the Output Alias, and
    select String (DT_STR) to replace Unicode string
    (DT_WSTR) in the Data Type column. The white ―X‖ in the red
    circle remains visible (see Figure 13-19).

Figure 13-19. Creating a new Alias Field to transform the
datatype of an Input Column’s character field from nvarchar
(Unicode) to varchar.     [f1319.bmp]

15. Click OK to close the Data Conversion Transformation Editor
    dialog, click the Control Flow tab to expose the LocalDB Sales
    Headers Task, right-click the task, and click Edit to open the
    OLE DB Destination Editor.
16. Click the Mappings item in the OLE DB Destination Editor’s left
    pane to display the Available Input Columns and Available
    Destination Columns lists, and scroll the lists in the upper pane to
    expose the varcharCreditCardApprovalCode in the Available
    Input Columns list and CreditCardApprovalCode in the
    Available Destination Columns list.
17. Drag the varcharCreditCardApprovalCode’s icon to the
    CreditCardApprovalCode’s icon to replace the original mapping
    (see Figure 13-20).

Figure 13-20. Remapping the CreditCardApprovalCode’s
Destination Column to the new varchar alias column.
            18. Click OK to save the mapping as well as remove the error icon
                and Errors list items, and press F5 to test the import process to
                the SalesLT.SalesOrderHeader table. After a few seconds, the
                Output pane displays the operations executed and the number of
                rows imported (see Figure 13-21).

            Figure 13-21. Testing execution of the
            SalesLT.SalesOrderHeader tasks.        [f1321.bmp]

            19. Select the Package.dtsx item in Solution Explorer, choose File
                Save As, change the filename to CloudSalesImportPackage.dtsx,
                and click Save to save the package changes.
            20. Execute a SELECT * FROM [SalesLT].[SalesOrderHeader]
                query in SSMS [Express] to verify that the data (especially the
                CredCardApprovalCode column) has imported as expected.
            21. Repeat steps 1 through 11, substituting Sales Order Details for
                Sale Order Headers and SalesOrderDetail for SalesOrderHeader.
 Add the Sales Order Details Data Flow Task below the Sales Order
Headers Data Flow Task and connect the two. The SalesOrderDetail
tables have no datatype conflicts, so you can accept the default mapping
by opening the Mapping dialog and clicking OK.

            22. Press F5 to compile and execute your package (see Figure 13-
                22). Press Shift+F5 to return to design mode and press Ctrl+S to
                save your package design changes.

            Figure 13-22. Testing execution of the
            SalesLT.SalesOrderDetail tasks.        [f1322.bmp]

            23. Click the Execution Results tab to display a detailed list of task
                execution operations and elapsed times (see Figure 13-23).

            Figure 13-23. Viewing the Execution Results page for
            importing SalesOrderHeader and SalesOrderDetail records.
  The \WROX\Azure\Chapter13\CloudSalesImport folder contains the
CloudSalesImport.sln solution file and a CloudSalesImport subdirectory
with the other supporting files for the preceding project. To run the sample
you must open the CloudDB Connection Manager and specify your SADB
instance name, type your User ID and Password, and save the changes to
compile and run the package. You don’t need to modify the LocalDB
Connection Manager because it uses Windows rather than SQL Server

Migrating Schemas and Data from the On-
Premises AdventureWorksLT2008 Database
to SQL Azure
             George Huey’s SQL Azure Migration Wizard (MigWiz) offers a
      streamlined alternative to the SQL Server Management Studio (SSMS)
      Script Wizard for generating SQL Azure schemas that conform to the
      service’s current Data Definition Language (DDL) limitations. You can
      download the binaries, source code, or both from CodePlex
      (,; a Microsoft Public License
      (Ms-PL) governs use of the software. You can learn more about MigWiz
      and watch a screencast from Wade Wegner’s SQL Azure Migration Wizard
      post (,
      l-azure-migration-wizard/) of September 1, 2009.
             MigWiz generates a T-SQL script, similar to that created by SSMS
      [Express]’s Script Wizard, and applies a set of regular expressions (regex)
      in a NotSupportedByAzureFile.Config file (,
      reFile.xml) to remove or modify statements that SADB doesn’t
      support. For example, the following element from the file’s
      <TableStatement> group replaces NOT FOR REPLICATION
      modifiers with a WarningMessage (see Figure 13-24):
<NotSupported Text="\sNOT\sFOR\sREPLICATION" ReplaceWith="" SeverityLevel="0"
ReplaceString="true" DisplayWarning="true" WarningMessage="Removed 'NOT FOR
REPLICATION' because it is not supported in this version of SQL Server." />

             Figure 13-24. Warning messages issued by the SQL Server
             Migration Wizard for unsupported T-SQL syntax in black for
             corrections (NOT FOR REPLICATION) or red for not corrected
             items (ROW GUID COLUMN). [f1324.bmp]

             The ―Using the SQL Azure Migration Wizard with the
      AdventureWorksLT2008 Sample Database‖ post
      azure-migration-wizard-with.html) of September 8, 2009, to
      the OakLeaf Systems blog demonstrates the most common workflow for
      duplicating an on-premises SQL Server 2008 schema for the
      AdventureWorksLT2008 in SQL Azure running in a Microsoft data center.
      The first section of this post illustrates a few problems with the early
      MigWiz v.0.2.3 of 9/2/2009 and workarounds for them. For example, many
      AdventureWorksLT2008 tables include uniqueidentifier (GUID)
      columns for which the Script Wizard inserts ROWGUIDCOLUMN modifiers,
      which generated red warnings in the script’s Result Summary page (refer to
      Figure 13-24). Eliminating these fatal errors required adding the following
      element to the <TableStatement> group:
<NotSupported Text="\]\sROWGUIDCOL" ReplaceWith="]" SeverityLevel="0"
ReplaceString="true" DisplayWarning="true" WarningMessage="Removed 'ROW GUID
COLUMN' because it is not supported in this version of SQL Server." />

              George Huey incorporated a workaround to enable processing T-
      SQL scripts generated by SSMS [Express]’s Script Wizard by adding a
      Parse TSQL for SQL Incompatibles check box to modify the workflow of
      the later MigWiz v.0.2.7 of September 21, 2009. This version also corrects
      the ROW GUID COLUMN and a few other errors. The post’s ―Running an
      SSMS-Created Script That Populates Table Data‖ section describes how to
      run this or later MigWiz versions with T-SQL scripts generated by SSMS
Using the Bulk Copy Process to Populate
SADB Tables
                SADB’s August 2009 CTP didn’t support SQL Server’s command-
         line bcp.exe application for high-speed bulk INSERT operations.
         According to a reply on August 25, 2009, by Stan Kitsis, an SQL Azure
         program manager, to a ―Transferring data from SQL Azure‖ thread
         74d5c6125f72) in the SQL Azure—Getting Started forum, bcp
         ―…should be in the next update of the service.‖
                You can learn more about using bcp from SQL Server Books
         Online’s ―Importing and Exporting Bulk Data by Using the bcp Utility‖
         topic (,

Populating Tables with the INSERT … BULK
                SQL Server 2008 R2’s INSERT statement gains a BULK modifier
         with a syntax similar to bcp’s. Following is the syntax of the INSERT …
         BULK command from SQL Server 2008 R2 Books Online
              [ TOP ( expression ) [ PERCENT ] ]
              [ INTO ]
              { <object> | rowset_function_limited
                [ WITH ( <Table_Hint_Limited> [ ...n ] ) ]
              [ ( column_list ) ]
              [ <OUTPUT Clause> ]
              { VALUES ( { DEFAULT | NULL | expression } [ ,...n ] ) [ ,...n
              | derived_table
            | execute_statement
            | <dml_table_source>
            | DEFAULT VALUES
    [ database_name . [ schema_name ] . | schema_name . ]
    [ table_name | view_name ]
    ( <column_definition> )
    [ WITH (
        [ [ , ] CHECK_CONSTRAINTS ]
        [ [ , ] FIRE_TRIGGERS ]
        [ [ , ] KEEP_NULLS ]
        [ [ , ] KILOBYTES_PER_BATCH = kilobytes_per_batch ]
        [ [ , ] ROWS_PER_BATCH = rows_per_batch ]
        [ [ , ] ORDER ( { column [ ASC | DESC ] } [ ,...n ] ) ]
        [ [ , ] TABLOCK ]
      ) ]
[; ]

<object> ::=
    [ server_name . database_name . schema_name .
      | database_name .[ schema_name ] .
      | schema_name .

<dml_table_source> ::=
        SELECT <select_list>
        FROM ( <dml_statement_with_output_clause> )
            [AS] table_alias [ ( column_alias [ ,...n ] ) ]
    [ WHERE <search_condition> ]
        [ OPTION ( <query_hint> [ ,...n ] ) ]

<column_definition> ::=
 column_name <data_type>
    [ COLLATE collation_name ]
    [ NULL | NOT NULL ]

<data type> ::=
[ type_schema_name . ] type_name
    [ ( precision [ , scale ] | max ]

               External tools use the BULK modifier to upload a binary data stream.
       According to Books Online, ―This option is not intended for use with tools
       such as SQL Server Management Studio, SQLCMD, OSQL, or data access
       application programming interfaces such as SQL Server Native Client.‖
       [Emphasis added.] However, Stan Kitsis wrote on August 24 and 29, 2009,
       in a response to the thread mentioned in the preceding section:
                As mentioned in a few other threads, we are also working
                on enabling support for bcp and INSERT BULK statement
                (not to be confused with BULK INSERT). So you might
                want to consider them as options too.…
                BULK INSERT is different [from INSERT … BULK] in
                that the parsing of the data file is done by the server and not
                the client. The reason we don't support BULK INSERT is
                that it requires you to have access to the physical server—
                the filepath is relative to the server.

Sharding Tables to Overcome
Database Size Limits
          Wikipedia briefly describes sharding as ―a method of horizontal
  partitioning in a database or search engine‖ and continues with this more
  detailed description:

                Horizontal partitioning is a design principle whereby rows
                of a database table are held separately, rather than splitting
                by columns (as for normalization). Each partition forms
                part of a shard, which may in turn be located on a separate
                database server or physical location. The advantage is the
                number of rows in each table is reduced (this reduces index
                size, thus improves search performance). If the sharding is
                based on some real-world aspect of the data (e.g. European
                customers vs. American customers) then it may be possible
                to infer the appropriate shard membership easily and
                automatically, and query only the relevant shard.
         Sharding is in practice far more difficult than what’s described in the
  preceding definition. Although it has been done for a long time by hand-
  coding (especially where rows have an obvious grouping, as per the earlier
  example), this is often inflexible. A desire exists to support sharding
  automatically, both in terms of adding code support for it, and for
  identifying candidates to be sharded separately.
       Where distributed computing is used to separate load between
multiple servers (either for performance or reliability reasons) a shard
approach may also be useful.
       Sharding large database tables by geographical location, date,
product type, or similar columnar data is a common practice. For example,
you might shard customer and related contact, order, and invoice tables by
country or region. Large organizations might need more granular sharding
by state, county, canton, or the like. These partitioning techniques are
called value ranges. Alternatively, partitions can be created on hash values
or table lookup to equalize the size of shards on multiple servers. Dare
Obasanjo’s ―Building Scalable Databases: Pros and Cons of Various
Database Sharding Schemes‖ post of January 16, 2009
chemes.aspx) discusses the benefits and drawbacks of vertical,
horizontal, range-based, key or hash, and dictionary-based partitioning.
        Cross-shard data access is expensive so slowly changing (reference)
data often is replicated on each server holding related data. For example, if
order header and detail data for customers is sharded by location, each
database holding a shard would include a product table with a record for
every product sold in the location. This obviously leads to a substantial
amount of data redundancy and a requirement for real-time synchronization
of product table updates to enable enforcing referential integrity between
order line items and products tables. The effect of redundant data is
multiplied by three because of SADB’s data replication for high
availability. Denormalizing the data by adding required product data to the
line items is an alternative that might result in more efficient queries and
disk space utilization. Obasanjo discusses this issue in his later ―Building
Scalable Databases: Denormalization, the NoSQL Movement and Digg‖
post of September 10, 2009 (,
       Fan-out queries, which run against multiple databases in parallel
with result sets combined by a UNION ALL operation on the client, won’t
be supported until a version of SADB later than v1, which is scheduled to
     release at PDC 2009. As workload increases, you can scale out data access
     by adding more servers and placing fewer partitions on each server. PDC
     2009’s ―Scaling out Web Applications with Microsoft SQL Azure
     Databases‖ session by David Robinson, this book’s Technical Editor,
     promises to shed more light on partitioning and fan-out, according to this
                     Microsoft SQL Azure and the Windows Azure Platform
                     provide all the necessary building blocks to develop and
                     host rich Internet applications that can service an Internet-
                     scale user population while storing nearly limitless amounts
                     of data. Come learn how to build such applications and see
                     firsthand the power and elasticity of Microsoft SQL Azure
                     and the performance gains seen while using scale-out
                     development patterns.
Simon Munro and Ayende Rahien Tackle Sharding in SQL Azure
and NHibernate]
 Simon Munro’s “The Trouble With Sharding” post of September 10, 2009
sharding/) discusses the issues associated with sharding to achieve
scalable relational databases. Ayende Rahien’s “SQL Azure, Sharding and
NHibernate: A call for volunteers” post of 9/6/2009 observes:
  I was quite surprised to hear that SQL Azure has a 10GB limit for each
  database. That drastically reduce the amount of effort that I guess SQL
  Azure takes. At a guess, I would say it is simply replicated instances of
  databases instead of real SQL on the cloud.
  One of the nice premises of working on the cloud is that you get
  transparent scaling. 10GB limit is not transparent. The answer from
  Microsoft seems to be that you need to implement Sharding. That is,
  you spread your logical database over several physical databases.
  Usually it is done on physical database instances for the purpose of
  speeding up application because you get can parallelize the queries. In
  this case, you would need this because each database is pretty small.
  Sharding is a term that was invented by Google, and a few years ago
  several Google engineers decided that they want[ed] to use Sharding
  with Hibernate. Thus, the Hibernate Shards project was born, bringing
  transparent sharding support to Hibernate.
 The equivalent project for NHibernate was started, but porting was never
complete. This is a call for volunteers to help continue the port of Hibernate
Shards to NHibernate. You now have a very clear goal for why you would
want that.
             Ayende Rahien’s surprise about the maximum size of SADB
      databases isn’t warranted. SADB’s predecessor, SQL Data Services (SDS)
      CTPs limited a Container, which corresponds approximately to an
      SADB database, to 1GB of Blob entities and 100MB of Flexible entities
      (, http:// to provide
      transactional consistency for entity groups. Microsoft says Azure EAV
      Tables can expand to ―billions of rows.‖ Sharding these tables by
      PartitionKey values supports load balancing and allows transactional
      updates to a single partition with snapshot isolation.
      / folder contains sample files and code for the ―Scaling Out SQL Azure
      with DB Sharding‖ demo project. This project lets you select the
      destination country or order date as the sharding criterion.

Creating SADB Logins and Users,
and Assigning Roles
      y/ folder contains a ―Managing Logins and Security in SQL Azure‖ demo,
      which explains how to create a new SADB HolTestUser login by executing
      with SSMS [Express] a CREATE LOGIN HoLTestUser WITH
      statement against the SADB instance’s master database.
             Passwords that don’t meet SQL Server 2008’s complexity
      requirements return an error. Following are password complexity
      requirements from SQL Server Books Online’s ―Password Policy‖ section:
           The password does not contain all or part of the account name of
            the user. Part of an account name is defined as three or more
            consecutive alphanumeric characters delimited on both ends by
            whitespace such as a space, tab, and return, or any of the
            following characters: comma (,), period (.), hyphen (-),
            underscore (_), or pound sign (#).
           The password is at least eight characters long.
           The password contains characters from three of the following
            four categories:

                   Latin uppercase letters (A through Z)
                   Latin lowercase letters (a through z)
                   Base 10 digits (0 through 9)
                   Non-alphanumeric characters such as: exclamation point
                    (!), dollar sign ($), number sign (#), or percent (%).
          The demo also demonstrates executing CREATE USER
   HoLTestUser FOR LOGIN HoLTestUser from the HoLTestDB
   user database followed by EXEC sp_addrolemember
   'db_owner', 'HoLTestUser' to assign the HoLTestUser to the
   db_owner built-in fixed database role. According to SQL Server 2008
   Books Online ( and
   us/library/ms189121.aspx), SQL Server 2008 supports the fixed
   database roles shown in the following table:
Fixed Database Role                    Security Permissions
db_accessadmin                         Add or remove access for Windows
                                       logins, Windows groups, and SQL
                                       Server logins.
db_backupoperator                      Back up the database.
db_datareader                          Run a SELECT statement against any
                                       table or view in the database.
db_datawriter                          Add, delete, or change data in all user
db_ddladmin                            Run any Data Definition Language
                                       (DDL) command in a database.
db_denydatareader                      Cannot read any data in the user tables
                                       within a database.
db_denydatawriter                        Cannot add, modify, or delete any
                                         data in the user tables within a
db_owner                                 Perform all allowable configuration
                                         and maintenance activities on the
                                         database, and drop the database.
db_securityadmin                         Modify role membership and manage
          No significant difference exists between the SQL Server 2008 and
   SADB T-SQL syntax for creating a LOGIN or USER, or adding a USER to
   a fixed database role.

          Cloning schemas of on-premises databases for new SADB databases
   and importing historical data probably will be the first task faced by new
   SQL Azure developers. Although members of the SQL Azure Team
   promise ―APIs for automatic deployment to Windows Azure‖ by the time
   the commercial SQL Azure version releases, it’s likely that developers will
   need to process T-SQL scripts for creating tables, views, stored procedures,
   triggers, and other SQL Server 2008 objects for SADB databases.
          The early part of this chapter describes the problems you’ll face
   when you try to run raw T-SQL scripts generated by SSMS [Express]’s
   Query Wizard. Substantial differences in SQL Server 2008’s T-SQL DDL
   grammar and datatypes supported by SADB require extensive editing of
   auto-generated scripts to eliminate execution errors. George Huey’s open-
   source SQL Azure Migration Wizard has the potential to greatly reduce or
   eliminate manual editing of auto-generated scripts for creating and
   populating SADB tables.
          SADB’s August 2009 CTP, which was the foundation for the
   original version of this chapter, doesn’t support SQL Server’s Bulk Copy
   Procedure utility or SQL Server 2008 R-2’s new INSERT … BULK
   command. Microsoft promises support for both operations by the projected
   November 2009 release date. In the meantime, INSERT operations
   generated by the Query Wizard are adequate for populating a few thousand
table rows, but you’ll probably want to create SSIS packages for importing
data to and exporting it from SADB tables. When this chapter was written,
SSIS was the most practical means to back up SADB tables for disaster
recovery purposes.
        SADB Business Edition’s 10GB database size limit means that
you’ll need to implement database sharding to gain the scalability of Azure
Tables or other datastores that use the EAV model, such as Google’s
BigTable (the datastore for App Engine), Amazon’s SimpleDB, or open-
source Hadoop. As Simon Munro’s blog post explains, sharding today is
more an art than a science, but Microsoft promises that the patterns &
practices group will address the issue with a future ―best practices‖ paper.
In the meantime, you’re on your own without sharding frameworks and
tools to support you.

Shared By: