Exploring the Benefits of SQL Server 2008

Document Sample
Exploring the Benefits of SQL Server 2008 Powered By Docstoc
					      What’s in It for You:
  Exploring the Benefits of the
SQL Server 2008 Database Engine

                Sharon Dooley


          ©2009 Learning Tree International. All Rights Reserved.
                                                                                        LEARNING TREE INTERNATIONAL                White Paper

        T A B L E              O F         C O N T E N T S

      Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 1        Introduction
      1. Enhancements for the Developer . . . . . . 2
                                                                                  Flashback to late October 2005: “It” finally arrived! “It” was
          1.1. The MERGE Statement . . . . . . . . . . . . . . 2                  the long-awaited release of SQL Server 2005. The magnitude
          1.2. Change Data Capture . . . . . . . . . . . . . . . 4                of the changes made to the software was almost frightening,
          1.3. Row Constructors . . . . . . . . . . . . . . . . . . 5             and the SQL Server community was quite vocal in demand-
          1.4. Beyond Relational Data . . . . . . . . . . . . . 5                 ing that Microsoft never repeat that scenario again. Microsoft
                                                                                  listened and promised that they would aim for a two- to
                1.4.1. Spatial Data Types . . . . . . . . . . . . 5
                                                                                  three-year interval between releases in the future.
                1.4.2. FILESTREAM . . . . . . . . . . . . . . . . . . 6
                1.4.3. DATE and TIME Data Types . . . . 7                         Now it’s early 2009, and SQL Server 2008 has been available
                                                                                  for four months. Many members of that same SQL Server
          1.5. Additional Smaller (but Useful)
               Features . . . . . . . . . . . . . . . . . . . . . . . . . . . 8   community who demanded more frequent releases are now
                1.5.1. Intellisense . . . . . . . . . . . . . . . . . . 8         complaining that it’s too soon!

                1.5.2. Stored Procedure Debugger . . . 8                          Is it too soon for you? Is SQL Server 2008 a “must-have” up-
      2. Enhancements for the DBA . . . . . . . . . . . . 8                       grade? In this paper, I’ll describe the features I personally find
                                                                                  most compelling, focusing on the core database engine and
          2.1. Policy-Based Management . . . . . . . . . . . . . . 8
                                                                                  covering three basic topics: Enhancements for the developer,
          2.2. Auditing . . . . . . . . . . . . . . . . . . . . . . . . . . 10
                                                                                  Enhancements for the Database Administrator (DBA) and
          2.3. Transparent Date Encryption . . . . . . . 10                       performance Enhancements. Of course, these topics overlap,
          2.4. External Key Management . . . . . . . . . 11                       so if you don’t find what you are looking for in a particular
          2.5. Availability . . . . . . . . . . . . . . . . . . . . . . . 11      section, try one of the others.
          2.6. Additional Smaller (but Useful)                                    Many of the new features that I’ll describe are available only
               Features . . . . . . . . . . . . . . . . . . . . . . . . . . 11
                                                                                  in the Enterprise Edition of SQL Server 2008. If some of these
                2.6.1. Hot-Add Hardware . . . . . . . . . . 11
                                                                                  interest you, you’ll want to compare the features of the various
                2.6.2. Backup Compression . . . . . . . . 11                      editions at http://www.microsoft.com/sqlserver/2008/en/us/
                2.6.3. PowerShell Integration . . . . . . . 11                    editions-compare.aspx
      3. Performance Enhancements . . . . . . . . . . . 12
                                                                                  I hope that this paper will help you decide whether or not
          3.1. Management Data Warehouse                                          SQL Server 2008 is a “must-have” upgrade for you.
               and the Data Collector . . . . . . . . . . . . . 12
          3.2. Sparse Columns . . . . . . . . . . . . . . . . . . . . 15
          3.3. Filtered Indexes . . . . . . . . . . . . . . . . . . . 15
          3.4. Resource Governor . . . . . . . . . . . . . . . . 16
          3.5. Data Compression . . . . . . . . . . . . . . . . . 17

      Summary and References . . . . . . . . . . . . . . . . 18
      About Learning Tree International . . . . . . . 19
      About the Author . . . . . . . . . . . . . . . . . . . . . . . . . 19

1-800-843-8733 • www.learningtree.ca
©2009 Learning Tree International. All Rights Reserved.
                                                                                                      SQL Server 2008 Database Engine             1
  1-800-843-8733 • www.learningtree.ca                                     LEARNING TREE INTERNATIONAL         White Paper

1. Enhancements for the Developer
1.1. The MERGE Statement
Prior to SQL Server 2008, Transact-SQL did not support an operation often called an “Upsert,” but it was possible to work
around this lack by writing code that looked like this:
      -- If the row is there, update it. Otherwise, insert it.
      UPDATE mytable SET colx = value
      WHERE akey = somevalue
      IF @@ROWCOUNT = 0
          INSERT …
This worked reasonably well for a single-row operation but did not handle the multirow case at all. In order to solve
problems like refreshing a data warehouse or tracking available inventory based on a day’s sales, the ANSI/ISO standard
specified a MERGE statement. In SQL Server 2008, however, Microsoft has implemented this specification and has
provided some enhancements to it as well.

The MERGE statement is extremely powerful, but also quite complex. To use the MERGE statement, you need to understand
two tables:
      1. The Source table. The table or table-expression from which the new and changed data comes.
      2. The Target table. This is the table that will be modified by the data in the source table.
When you are comparing two tables, there are three possible results:
      1. There is a match between a row in the Source table and a row in the Target table based on some key.
      2. There is a row in the Source table with no matching row in the Target table.
      3. There is a row in the Target table with no matching row in the Source table.
When you use the MERGE statement, you may need to deal with all three possibilities.

Example 1
There are two tables involved in this example. Purchases contains details of purchases by product. Purchase History contains
historical information about product purchases. Periodically, the contents of the Purchases table need to be placed in the
Purchase History table. Now that we understand the data and the business rules, we are ready to look at the MERGE statement
that will accomplish the task.

Source table (Purchases)

      ProductID                  PurchaseDate             Quantity    Amount

             1                       20080507                25         125.00

             2                       20080409               101        3100.45

             3                       20080509                50         750.99

             4                       20080509                43         400.00

             5                       20080608               200         400.00

             6                       20080615               300         705.00

©2009 Learning Tree International. All Rights Reserved.
                                                                                         SQL Server 2008 Database Engine    2
   1-800-843-8733 • www.learningtree.ca                                  LEARNING TREE INTERNATIONAL           White Paper

Target table (Purchase History)

      ProductID                 PurchaseDate              Quantity   Amount

             1                      20080506                 5        400.95

             2                      20080409                100      3000.45

             3                      20080407                 30       401.32

             4                      20080506                200       605.34

             5                      20080507                150       300.00

Observe the italicized rows in the two tables. You should see that the Purchases table contains an entry for ProductID 2 on
20080409 and that the PurchaseHistory table contains an entry for the same day and product. You should also notice that there
is an entry for Product 6 in the Purchases table and that there are no entries for that product in the purchase history table.
We have two of the three cases listed above:

    1. There is a match between a row in the Source table and a row in the Target table.

    2. There is a row in the Source table with no matching row in the Target table.

The business requirements are that if there is a match, the values for Quantity and Amount from the Purchases table are to
replace the amounts in the PurchaseHistory table, and that a row that exists in the Purchases table with no matching row in
the Target table should be inserted.

Now that we understand the data and the business rules, we are ready to look at the MERGE statement that will accomplish
the task.
      MERGE dbo.PurchaseHistory AS Target
      USING dbo.Purchases AS Source
      ON Target.ProductID = Source.ProductID
      AND Target.PurchaseDate = Source.PurchaseDate
                 UPDATE SET Target.Quantity = Source.Quantity,
                                           Target.Amount = Source.Amount
                 INSERT (ProductID, PurchaseDate, Quantity, Amount)
                 VALUES(Source.ProductID, Source.PurchaseDate,

Notice that “;” at the end? It’s not optional; each MERGE statement must be terminated with a semi-colon.

©2009 Learning Tree International. All Rights Reserved.
                                                                                      SQL Server 2008 Database Engine         3
  1-800-843-8733 • www.learningtree.ca                                               LEARNING TREE INTERNATIONAL             White Paper

Results of the above MERGE operation:

      ProductID                 PurchaseDate               Quantity             Amount

             1                      20080506                   5                 400.95

             1                      20080507                   25                125.00

             2                      20080409                  101               3100.45

             3                      20080506                   30                401.32

             3                      20080409                   50                750.99

             4                      20080509                   43                400.00

             4                      20080506                  200                605.34

             5                      20080507                  150                300.00

             5                      20080608                  200                400.00

             6                      20080615                  300                705.00

This simple example only begins to demonstrate the power of the new MERGE statement. Furthermore, when combined
with SQL Server 2008 Integration services, applying incremental changes to a data warehouse becomes much easier. If this is
a capability you need, I suggest you refer to Books Online for additional information. (You can download the documentation
and install it without actually installing SQL Server 2008.)

1.2. Change Data Capture
One of the problems people face when trying to maintain a data warehouse or even a read-only query database is
determining what has changed in the source data over a given time. We have used techniques such as triggers and marking
rows with update dates, but these approaches have required a lot of programming and often created a maintenance burden
as table structures change. In SQL Server 2008, Microsoft has introduced Change Data Capture to solve these problems. This
capability records insert, update and delete operations on a table in a separate Change table. The change table has exactly the
same structure as the original table (same columns, same data types, etc.) but contains four additional columns:

             Column                           Contents

                                              The log sequence number assigned to the transaction when it was committed. This serves to
           __$start_lsn                       tie all the rows affected by a transaction together and can be used to put the transactions in
                                              the order that they happened.

            __$seqval                         Can be used to further order the records within a transaction.

                                              The change that was done: INSERT – 1, DELETE – 2, UPDATE before image – 3,
                                              UPDATE after image – 4.

                                              Bit map that specifies the columns that were changed. Every bit will be set for INSERT and
       __$update_mask                         DELETE operations. For UPDATE operations, only the bits for the columns mentioned in
                                              the SET clause of the update statement will be set.

©2009 Learning Tree International. All Rights Reserved.
                                                                                                  SQL Server 2008 Database Engine          4
   1-800-843-8733 • www.learningtree.ca                                 LEARNING TREE INTERNATIONAL              White Paper

The Change table includes both the before and after images for an update operation. To use this function, you must first
enable Change Data Capture for the database. Once you have done that, you can enable Change Data Capture for tables
where you need it. The capture of the changed data is a “behind the scenes” operation, meaning the capture process reads the
transaction log and writes the data to the Change tables. Several different functions are available for retrieving data from the
Change tables as well as functions for converting time-of-day to Log Sequence Numbers (LSNs) so that it’s easier to define
the interval for which you want to retrieve changes. Note that there is no function that will purge the change tables; you will
have to develop this code yourself.
If you plan to use Change Data Capture, you need to keep in mind that it will have a negative effect on performance. There will
be at an additional write for INSERT and DELETE operations and two additional writes for UPDATE operations. In addition,
you’ll need extra disk space for storing the change data.

1.3. Row Constructors
When I teach SQL to beginners, they always want to write:
   INSERT INTO MyTable(ColumnA, ColumnB, ColumnC)
   VALUES (‘A’, ‘B’,’C’)
      VALUES (‘D’, ‘E’, ‘F’)
They are disappointed when this doesn’t work and want to know why they have to keep repeating the beginning of the
INSERT statement. SQL Server 2008 solves this problem with Row Constructors. Now you can write a single INSERT and
insert many rows:
     INSERT INTO MyTable(ColumnA, ColumnB, ColumnC)
     VALUES (‘A’, ‘B’,’C’),
              (‘D’, ‘E’, ‘F’)
You can also use Row Constructors to build a derived table as long as you provide a table name and column names:
    SELECT *
    VALUES(100, 10, $35.00, 5),(101, 12, $2.95, 100),(103, 17, $3.75, 20,))
    OrderLine(OrderID, ProductID, UnitPrice, Quantity)

1.4. Beyond Relational Data
Earlier versions of SQL Server provided some capabilities that went beyond traditional relational database data types. The
most notable of these was SQL Server 2005’s XML data type. This allowed the storage of semi-structured and un-structured
data within a relational database. Powerful query functions allowed you to retrieve both structured and semi- or un-structured
data in the same query. SQL Server 2008 goes further into the Beyond Relational area with its new spatial data types and its
FILESTREAM column property.

1.4.1. Spatial Data Types
The term Spatial Data refers to the types of data required for database management systems used to handle mapping data, data
collected from GPS devices, etc., along with displaying that data in a graphical format. Many different types of applications can
use this data: cities that maintain statistical information about crimes, mobile applications that direct and provide information
about restaurants in the area, or a real estate firm that wants to display a map of available houses on its Web site. This type of
application is called a Geographical Information System or GIS.
SQL Server 2008 provides two different types of Spatial Data: Geometry and Geography.

©2009 Learning Tree International. All Rights Reserved.
                                                                                     SQL Server 2008 Database Engine           5
   1-800-843-8733 • www.learningtree.ca                                 LEARNING TREE INTERNATIONAL              White Paper

Geometry data is two dimensional, usually referred to as Flat Earth data. With Flat Earth data, space is defined by its coor-
dinate points, similar to the X and Y coordinates of a graph. This type of data is used to represent data as it is presented on
a two-dimensional map. The coordinates can represent inches, meters, feet, miles or any other measure as long as the same
measure is used for both coordinates. So you might use geometry data to store information about the distances between two
cities or to define the polygon created by multiple points. For example, connecting three points would form a triangle and
connecting four points would form a rectangle or a square. You could also use this data to show the restaurants located within
a certain number of miles from a given location. SQL Server 2008’s geometry data conforms to the Open Geospatial Consor-
tium (OGC) Simple Features for SQL Specification version 1.1.0.
Geography data handles three-dimensional data or ellipsoidal data, usually referred to as “round earth” data. Points in
this system are defined by latitude and longitude. This is the type of data collected by GPS systems, and coordinates in this
system can form a solid, such as a cube or a pyramid.
Both the geometry and geography data types are implemented as .NET data types, which means that unlike a scalar data type
such as an integer, they are manipulated with methods. For example, if you have two integers, you can compute the distance
between them by subtracting one from another. To compute the area defined by two integers, you simply multiply them e.g.,
2” by 4” gives an area of 8”. To compute the area of a polygon defined by geometry, you use the STArea() method. Consider
the following table that contains the points that define the boundaries of cities:

      CREATE TABLE Cities
      (City NVARCHAR(50) NOT NULL,
        Region NVARCHAR(10),
        Boundaries GEOMETRY)

To find the area of Philadelphia, Pennsylvania, you’d use a query that looks like this:
      SELECT City, Region, Boundaries.STArea() AS Area
      FROM Cities
      WHERE City = ‘Philadelphia’

      AND Region = ‘PA’

If your applications require spatial data, you’ll want to delve further into these complex data types.

Since the inception of SQL Server, there has been an ongoing debate between whether large objects (images and text) should
be stored in the database or remain in the file system with the database containing the location of the file. Proponents of
storing these objects in the database pointed to the fact that when the objects are in the database, they’re backed up with the
database and that pointers can’t become stale since there are no pointers. The other side pointed to the convenience of hav-
ing the file in the file system since you can easily open it with the application that created it. If you ever had to load a Visual
Basic image control with data from a SQL Server image column, you know that it was not a pretty process. Even though SQL
Server 2005’s varbinary(max) and varchar(max) data types made the process easier, it still wasn’t ideal. You were also limited
to a maximum of 2GB for each large object.

With SQL Server 2008, the debate is finally over, thanks to the FILESTREAM column property. This gives you the best of
both worlds—data lives in the file system but is managed through the database. It gets backed up with the database, but you
can also retrieve the data using the standard Windows API calls that underlie ADO.NET. These manipulations can still retain
transactional consistency like any other database operations, and you can also manipulate the data with Transact-SQL as
you would the varbinary(max) and varchar(max) data types. In addition, the only size limit becomes the maximum size of an
operating system file.

©2009 Learning Tree International. All Rights Reserved.
                                                                                     SQL Server 2008 Database Engine            6
   1-800-843-8733 • www.learningtree.ca                                       LEARNING TREE INTERNATIONAL            White Paper

You can enable the FILESTREAM property when you install SQL Server or you can enable it later. In either case, you will
need to consider what types of retrieval you want to permit. You can allow access to FILESTREAM data from Transact-SQL.
If you want to have access from Windows for clients running on the SQL Server machine (for example, CLR functions
and procedures), you need to enable FILESTREAM for file I/O streaming access. If remote clients are going to access
FILESTREAM data from Windows (as a Windows Forms or Active Server Page application), you also need to allow remote
clients to have streaming access to FILESTREAM data. If you are enabling FILESTREAM after installation, you’ll need to
use SQL Server Configuration Manager to set these options. You’ll also need to use sp_configure to set the file stream
access level.

1.4.3. DATE and TIME Data Types
Since the first version of SQL Server, application developers have had to deal with SQL Server’s datetime data type. Because
every date field included a time, searching for an exact calendar date was impossible. datetime columns used eight bytes of
storage. (There is also a smalldatetime that only used four bytes.) In addition, because the earliest date that could be stored
was 1753-01-01, the data type was not usable for some applications. Also, the time portion of a datetime data type pretended
to hold milliseconds, but in fact recorded ticks of the machine’s clock, so the only values you saw were 000, 300 and 900. Every
developer learned the techniques for extracting the date portion or the time portion of a datetime column early in his or her
career. In SQL Server 2008, Microsoft has introduced“pure”date and time data types. date columns can hold a date in the
range January 1, 0001 to December 31, 9999. This may help librarians, but the archaeologists and early historians are still left out.
wwA time column has an accuracy of 100 milliseconds and provides a user-configurable scale. A time column uses three to five
bytes of storage and has a range of 00:00:00.0000000 through 23:59:59.9999999.

All the familiar date manipulation functions (GETDATE, DATENAME, DATEPART, DATEADD, etc.) work with the new data
types as well as the old data types (which are still supported). The old data types can easily be converted to the new ones. The
new types also have the same flexible input formats as the old ones. In addition, there are some new functions which make it
much easier to get consistent dates regardless of the actual time zone.

Here’s an example of using the new data types:
      CREATE TABLE ShowNewDataTypes
      ( OldDateTimeType datetime,
      DateType date,
      TimeType time,
      TimeType3Nanoseconds time(3),
      TimeType7Nanoseconds time(7))
      INSERT ShowNewDataTypes (OldDateTimeType,DateType,TimeType,TimeType3Nanoseconds,
      VALUES(GETDATE(), GETDATE(), GETDATE(), ‘09:55:41.1234567 AM’,
      ‘09:55:41.1234567 AM’)

      SELECT * FROM ShowNewDataTypes


     OldDateTimeType                           DateType       TimeType        TimeType3Nanoseconds       TimeType7Nanoseconds

    2009-01-09:57:25.223                      2009-01-31   09:57:25.2230000        09:55:41.123              09:55:41.1234567

I think these new data types are going to eliminate the problems we’ve had working with the datetime data type in the past,
and I am looking forward to using them.

©2009 Learning Tree International. All Rights Reserved.
                                                                                         SQL Server 2008 Database Engine           7
   1-800-843-8733 • www.learningtree.ca                                 LEARNING TREE INTERNATIONAL               White Paper

1.5. Additional Smaller (but Useful) Features
1.5.1. Intellisense
Visual Studio users have long had something called “Intellisense”—a feature that literally writes code for you as you type.
When you type the name of an object followed by a period, you get a drop-down list of possible properties and/or methods.
But people who are using either Query Analyzer or a Query Window in Management Studio haven’t had this kind of help.
In SQL Server 2008 Management Studio, the Query Window provides Intellisense at last! However, because of the non-
procedural nature of SQL, it’s not always available. For example, when you type SELECT, you won’t get a list of possible col-
umn names to include because the name of the table isn’t available yet. But after you provide the FROM clause, you’ll
get potential columns to include in your WHERE, GROUP BY and ORDER BY clauses.
1.5.2. Stored Procedure Debugger
For many years, Microsoft has provided an add-in that allowed you to debug stored procedures and functions. In its original
incarnation, the stored procedure was an add-in to Visual Basic. While it was useful, its use was limited since you couldn’t
modify the stored procedure while debugging it. But it was better than the other methods we had—mostly PRINT statements
embedded in the stored procedure. In SQL Server 2000, Microsoft finally added the stored procedure debugger to Query
Analyzer’s object browser. And then, in SQL Server 2005 Management Studio, they took it away and moved it back into
Visual Studio. But in SQL Server 2008’s Management Studio, the stored procedure debugger is back where I think it belongs.
It’s somewhat more powerful than its predecessors, but it still won’t let you modify the code. In addition, the debugging
capabilities are not just limited to permanent objects such as stored procedures and user-defined functions. You can also
debug Transact-SQL scripts.

2. Enhancements for the DBA
2.1. Policy-Based Management
This is one of the things I like best. We’ve been able to define and enforce rules (constraints) for our data for a long time. But
until now, we haven’t had a way of specifying, except in the standards document, rules for our databases and servers. SQL
Server 2008’s Policy-Based Management provides this ability. You can define policies at the database and the server level. For
example, whether or not you permit the use of the old SQL Mail, you can only create a server-level policy because this is a
server-wide feature. If you want to enforce specific naming conventions for tables and such in a specific database, you would
create a database-level policy. Then, on the other hand, if you want to enforce the naming conventions for all databases, you
would create a server-level policy. Once you have the policies you want defined, you can enforce these policies on groups of
servers, making sure all the servers in your organization obey the same rules.
A policy contains a condition that is applied to a target and has an automation schedule. Most policies allow four different
choices for automation:
• On change prevent—roll back the attempt to violate the policy and issue an error message
• On change log—allow the policy violation but record it in the error log
• On schedule—evaluate compliance with policies on a regular schedule
• On demand—evaluate compliance when requested
You can set up alerts so that you will be notified if there are policy violations. This saves you from having to continually review
the logs for violations.
A condition consists of one or more logical expressions. Expressions can be combined with AND and OR. Unfortunately, the
expression language is not Transact-SQL but rather a Visual Basic-like language similar to that used in Reporting Services and
Integration Services, and DBAs who are not familiar with those tools will have a bit of a learning curve to overcome if they
want to go beyond the common expressions that can be created with the Graphical User Interface (GUI).

©2009 Learning Tree International. All Rights Reserved.
                                                                                      SQL Server 2008 Database Engine           8
   1-800-843-8733 • www.learningtree.ca                                  LEARNING TREE INTERNATIONAL               White Paper

When you build a condition, you start with a “facet.” A facet is some aspect of a database or a server and is a collection of
properties of that aspect. For example, if you want to prevent people from starting the names of stored procedures with sp_,
you’d use the Stored Procedure facet and select the @name property. Your condition would look like this:

Once you have created the condition(s) for your policy, you can create the policy itself. We’ll create a policy that applies to all
databases with On Change Prevent enforcement.

©2009 Learning Tree International. All Rights Reserved.
                                                                                      SQL Server 2008 Database Engine            9
   1-800-843-8733 • www.learningtree.ca                                   LEARNING TREE INTERNATIONAL               White Paper

Now, if someone tries to create a stored procedure whose name starts with sp_, it will not work. Policy-based management
implements this with the Data Definition Language (DDL) triggers introduced in SQL Server 2005. This is an extremely
powerful capability. I’ve only shown you the simplest possible policy, but I hope that this gives you an understanding of
what this feature can provide.

2.2. Auditing
SQL Server 2008 provides an auditing capability. It is based on the SQL Profiler technology and allows you to audit exactly
the same events provided by that tool. However, the setup is more automatic than trying to do the same thing with Profiler.
You can audit all security-related changes: adding or deleting logins, granting database access, granting object permissions,
etc. In addition, you can audit SELECT, INSERT, UPDATE, DELETE and EXECUTE operations. The audit trail can be written
to a file, the Windows Application Event Log or the Windows Security Log. An audit is defined with a Server Audit. This
gives the audit a name and specifies the destination for the audit trail. You can set it up so that the SQL Server will stop if
there is an audit failure.
For each Server Audit, you can create one or more Server Audit Specifications. Here, you define the server level events (such
as creating logins or creating databases) that you want to audit. You can also create Database Audit Specifications. Here, you
can specify the database security activities (authorizing users, granting permissions, etc.) as well as the Data Manipulation
Language (DML) actions that you want to audit. You can limit both specifications to a specific login or user. While useful,
this capability does not provide everything that I think it should provide. For example, when an update statement is recorded,
the audit trail does not show what the data was changed to nor does it show the value (if any) used in the WHERE clause.
Reading the audit trail from the logs is difficult—it’s simply a continuous stream of text, but if you save the audit trail in a file,
you can use a supplied function, fn_get_audit_file, which will present the audit trail in a more readable fashion.
The GUI is very difficult to use, but it does make sure you get all the pieces in the right order. The Transact-SQL commands
are straightforward and are well documented. I think they are easier to use than the GUI, particularly since they allow you to
specify more than one principal. You’ll still have to have a separate line for each securable that you want to audit.
The audit does capture most of the details—who did it, what they did, when they did it. And it’s available to you with only
some relatively tedious setup work. It may save you from having to buy a third party product, and if you start using it now,
I’m sure that we’ll see many improvements in the next release of SQL Server.

2.3. Transparent Date Encryption
SQL Server 2005 introduced column-level encryption (often referred to as “cell level” encryption). This capability provided
several different ways of encrypting the data in a column. You can use pass phrases, keys (both symmetric and asymmetric) or
certificates. The purpose of this type of encryption is to make sure people who manage to get to a column even though they
don’t have permission to do so cannot see sensitive data such as credit card numbers or salaries. The data is stored in the data-
base files in its encrypted form and there is no plain text available. The data is encrypted whether SQL Server is running or
SQL Server is stopped.SQL Server 2008 extends the encryption capabilities with Transparent Data Encryption (TDE). It’s
important to understand the different goal of TDE. When SQL Server is stopped, users with proper operating system
permissions can view the contents of database files with an editor such as WordPad. The new TDE is designed to protect “data
at rest.” This encryption is done as a part of the input/output (I/O) process that SQL Server uses when reading and writing
the database pages and is invisible to the user of the data. The entire database will be encrypted. Note that any backups of an
encrypted database will also be encrypted and you will need the database encryption key to restore the database.
To use TDE, you must create the required keys and set an option that causes the database to be encrypted.

©2009 Learning Tree International. All Rights Reserved.
                                                                                       SQL Server 2008 Database Engine            10
   1-800-843-8733 • www.learningtree.ca                                 LEARNING TREE INTERNATIONAL              White Paper

2.4. External Key Management
Security professionals identify three ways an individual can be authenticated:
1. Something you know (e.g., a password),
2. Something you are (e.g., sophisticated retinal scanners, fingerprints)
3. Something you have (e.g., a “smart card” or other device)
SQL Server 2005’s encryption required a password to encrypt and decrypt data. This meant that database administrators
found that key management—keeping the keys secure—became an important part of their jobs. In SQL Server 2008,
Microsoft has provided the ability to use Hardware Security Modules (HSMs) to secure the encryption keys. These devices
store encryption keys on hardware or software modules. This is a more secure solution because the encryption keys do
not reside with encryption data. Microsoft’s implementation uses the Microsoft Cryptographic Application Programming
Interface (API), which provides only a subset of the capabilities available with many HSMs.

2.5. Availability
Database mirroring and peer-to-peer replication were introduced in SQL Server 2005, and both have been enhanced in
SQL Server 2008. Database mirroring can now recover a damaged page in the principal from the undamaged copy in
the mirror. This happens automatically when SQL Server detects the damaged page. The log records used for database
mirroring are now compressed also, which reduces the load on the network and speeds up the process, and peer-to-peer
replication now has built-in conflict detection. When SQL Server sees that there is a conflicting update on one of the
peer servers, it stops sending new data to that server. When the conflict is resolved (a manual, rather than an automated,
process), replication to that server will resume.

2.6. Additional Smaller (but Useful) Features
2.6.1. Hot-Add Hardware
When SQL Server 2008 is running on Windows 2003 or later, it is possible to add CPUs and have SQL Server take
advantage of them. A server restart is not required. Note, however, that the hardware must provide this ability before
you can take advantage of it. This capability is only available on a 64-bit operating system. In the 32-bit edition, you must
enable AWE and use a switch on the SQL Server startup command to take advantage of this capability.

2.6.2. Backup Compression
SQL Server 2008 allows database, differential, log, file and file group backups to be compressed. By default, backup
compression is disabled, but you can enable it for the server as a whole, in which case all backups will be compressed or
you can also enable it or disable it as part of the backup command only. The advantages of backup compression are that the
files are smaller and the backup is faster. The disadvantage is that the compression process makes heavy demands on the
CPU. However, the time it takes to decompress the data during a restore is outweighed by the substantial reduction in I/O.

2.6.3. PowerShell Integration
PowerShell is a powerful scripting language used by Windows Administrators to manage the Windows environment.
In SQL Server 2008, PowerShell is fully integrated with SQL Server. This provides an ideal environment for mixing operating
system and SQL Server commands. For example, you can create a backup with SQL Server and use PowerShell to copy it
to a network share:
         Invoke-Sqlcmd –Query “BACKUP DATABASE AdventureWorks
                   TO DISK = ‘C:\backups\AW.bak’
                   -ServerInstance “INSTRUCTOR\TEST2008”
            Copy-Item c:\backups\AW.bak \\BackupStore\AdvWorks\AW.bak
To use PowerShell with Windows 2003, you will need to download it from www.microsoft.com/downloads

©2009 Learning Tree International. All Rights Reserved.
                                                                                     SQL Server 2008 Database Engine            11
   1-800-843-8733 • www.learningtree.ca                                  LEARNING TREE INTERNATIONAL              White Paper

 3. Performance Enhancements
 3.1. Management Data Warehouse and the Data Collector
 The Management Data Warehouse (MDW) is a SQL Server database that serves as a repository for historical performance
 data. A data collector process runs on a regular basis to collect the data and store it in the database. The repository can be
 created on the server where the Data Collector runs or it can be on a different server. You may want to put it on a server of
 its own if the process of writing to the repository is affecting the performance of your production server. The same repository
 can hold data from multiple servers.
 The MDW and Data Collector are very easy to set up with the Wizard in Management Studio, and once the Data Collector
 has been running for a short time, you’ll be able to view the results in the three built-in reports provided. All three reports let
 you drill down to deeper levels of information. Below are illustrations of the initial page of each report.
 Server Activity Report

©2009 Learning Tree International. All Rights Reserved.
                                                                                      SQL Server 2008 Database Engine           12
   1-800-843-8733 • www.learningtree.ca                   LEARNING TREE INTERNATIONAL   White Paper

Disk Usage Report


©2009 Learning Tree International. All Rights Reserved.
                                                                  SQL Server 2008 Database Engine   13
   1-800-843-8733 • www.learningtree.ca                                   LEARNING TREE INTERNATIONAL               White Paper

Query Statistics Report

The Data Collector can only collect data from SQL 2008 servers. However, one of the MVPs (Microsoft Most Valuable
Professionals) has posted a collector for SQL 2005 servers on Codeplex (www.codeplex.com), a free, open source
download named the “SQL Monitor Tool.”
The Management Data Warehouse will give you more insight into what’s going on with your SQL Server than you’ve ever
been able to gain with SQL Server tools, and it’s easily a rival to some of the third party monitoring tools available.

3.2. Sparse Columns
When you are designing a database, you may find that you have wide tables with many columns that are likely to have a lot of null val-
ues. For example, consider a product catalog table for an office supply store. There are likely to be many product attributes that have no
relationship to each other: a laptop computer may include a wireless mouse, while a coffee maker may include a free sample of filters.
Among the problems this type of data brings are the amount of storage required to store the null values and the fact that the high per-
centage of null values often mean that indexes aren’t selective enough to be used. SQL Server 2008 has solved this problem by creating
one-to-one relationships and only populating the subordinate tables when there are actual values for a row.

©2009 Learning Tree International. All Rights Reserved.
                                                                                       SQL Server 2008 Database Engine            14
   1-800-843-8733 • www.learningtree.ca                                   LEARNING TREE INTERNATIONAL               White Paper

Sparse columns have also been introduced in SQL Server 2008 to provide optimized storage for null values.You create a table con-
taining sparse columns in exactly the same way that you create any other table. The only difference is the keyword SPARSE in the
column definition:
         CREATE TABLE …
                      MySparseColumn nvarchar(50) SPARSE null, …
You can refer to the columns in SELECT, INSERT, UPDATE and DELETE statements just as you refer to any other column.You can
also define a column set, which is simply a name given to all of the sparse columns in the table. This allows you to retrieve all the
sparse columns, or to insert or update data in the sparse columns.You need to keep in mind, however, that the column set will be
returned as an XML string, not as discrete values. If you insert or update the column set, you must provide the data in XML format.
The main benefit of sparse columns is the storage savings they provide. The Books Online site has some useful charts that will help
you estimate how much space you would save.
3.3. Filtered Indexes
The new filtered indexes capability is often grouped with sparse columns, but while there is a relationship between sparse columns
and filtered indexes, you may want to use filtered indexes for non-sparse columns as well. A filtered index includes a condition
in its definition. Only values that conform to that definition will be included in the index. This will reduce the index size and the
amount of index maintenance that is required. Because of the smaller index size, filtered indexes also make queries more efficient.
Only non-clustered indexes can be filtered. Consider the Production.Product table in AdventureWorks. It looks like this:
            CREATE TABLE [Production].[Product](
                           [ProductID] [int] IDENTITY(1,1) NOT NULL,
                           [Name] [dbo].[Name] NOT NULL,
                           [ProductNumber] [nvarchar](25) NOT NULL,
                           [MakeFlag] [dbo].[Flag] NOT NULL,
                           [FinishedGoodsFlag] [dbo].[Flag] NOT NULL,
                           [Color] [nvarchar](15) NULL,
                           [ListPrice] [money] NOT NULL,
                           [ProductSubcategoryID] [int] NULL,
                           [ProductModelID] [int] NULL,
                           [SellStartDate] [datetime] NOT NULL,
                           [SellEndDate] [datetime] NULL,
                           [DiscontinuedDate] [datetime] NULL,
                           [rowguid] [uniqueidentifier] ROWGUIDCOL                NOT NULL,
                           [ModifiedDate] [datetime] NOT NULL,
Suppose that we have a lot of queries for accessories, which are ProductSubCategoryIDs 27 – 36. To optimize those queries, we cre-
ate a filtered index that only includes those subcategory IDs:
            CREATE NONCLUSTERED INDEX ProductAccessories
                    ON Production.Product (Name, ListPrice)
            WHERE ProductSubcategoryID >= 27
            AND ProductSubcategoryID <= 36;

©2009 Learning Tree International. All Rights Reserved.
                                                                                       SQL Server 2008 Database Engine            15
   1-800-843-8733 • www.learningtree.ca                                      LEARNING TREE INTERNATIONAL                 White Paper

When you issue a query that has a where clause referencing ProductSubCategoryIDs in that range, the filtered index will be used.

3.4. Resource Governor
Many organizations have diverse workloads using the same SQL Server.You may have OLTP (OnLine Transaction Processing)
applications (such as OrderEntry) sharing a SQL Server with DSS (Decision Support Systems) applications (such as reporting). The
two types of applications require different types of resources and are competing for SQL Server’s memory and CPU. The solution
has always been to separate the different types of applications onto different SQL Servers. That’s still the best solution, but if it’s not
possible in your organization, SQL Server 2008 gives you a new tool for handling the situation. It’s called Resource Governor. The
goal is to allocate resources to classes of applications.
To use Resource Governor, you need to understand three things:

• Resource Pools—A container for the CPU and memory resources. When you create a resource group, you specify the minimum
  and maximum CPU and memory that pool is allowed to have.

• Workload Groups—A workload group is assigned to a resource pool.

• Classifier Function—A classifier function determines which workload group a connection should be placed in.

Let’s see how the Resource Governor works to handle the mixed OrderEntry/Reporting scenario described above. We will have two
Resource Pools as shown below:

         Name                     Minimum % CPU           Maximum % CPU           Minimum % Memory              Maximum % Memory

   OrderEntryPool                             30                 50                           30                           50

    ReportingPool                             10                 50                           10                           50

We’ll also have two workload groups: OrderEntryGroup and ReportingGroup. The OrderEntryGroup will be assigned to the Order-
EntryPool and the ReportingGroup will be assigned to the ReportingPool. Now we have to create a classifier function. Our function
will be based on Login Name and will look like this:
            CREATE FUNCTION DemoClassifier() RETURNS SYSNAME
                           DECLARE @groupname AS sysname
                           IF SUSER_NAME() = ‘ReportingUser’
                                       SET @groupname = ‘ReportingGroup’
                           IF SUSER_NAME() = ‘OrderEntryUser’
                                       SET @groupname = ‘OrderEntryGroup’
                           RETURN @groupname

©2009 Learning Tree International. All Rights Reserved.
                                                                                           SQL Server 2008 Database Engine             16
   1-800-843-8733 • www.learningtree.ca                                    LEARNING TREE INTERNATIONAL                White Paper

Any connections that aren’t for either OrderEntryUser or ReportingUser will be placed in the Default pool, which is predefined
by SQL Server itself. With these three things in place, and a small amount of setup to enable the Resource Governor, it will
begin doing its work. Microsoft supplies several dynamic management views that will allow you to determine how well the
Resource Governor is working for you. Here’s a sample query:
             SELECT s.group_id, CAST(g.name as nvarchar(20)) AS GroupName,
             s.session_id, s.login_time,
             CAST(s.host_name as nvarchar(20)) AS HostName,
             CAST(s.program_name AS nvarchar(20)) AS ProgramName
                                  FROM sys.dm_exec_sessions AS s
             INNER JOIN sys.dm_resource_governor_workload_groups AS g
                                  ON g.group_id = s.group_id
             WHERE g.name NOT IN (‘internal’, ‘default’)
             ORDER BY g.name

The results shown were based on a slightly different set of resource pools and workload groups than those I described above, but
as you can see, this feature may really help you if you’re struggling with the needs of diverse workloads on the same server.

3.5. Data Compression
One of the most costly elements of any database management system is input/output (I/O). The more you can reduce I/O, the
faster queries will be. In SQL Server 2008, Microsoft has introduced the ability to compress tables and indexes. This will result in
a reduction in storage and in I/O at the cost of extra CPU cycles. There are two types of compression: Row and Page.

Row compression is not compression in the sense of WinZip or Windows compressed files. Rather, it stores data types in smaller
than normal formats. The vardecimal data type property Microsoft introduced in SQL Server 2005 Service Pack 2 is an example of
this. With row compression, data types may take less than their required number of bytes. For example, take the integer data type.
By default, an integer data type requires four bytes of storage. However, with row compression, if the actual value will fit in one
byte, only one byte will be used. Another example is the way fixed length character strings are handled. By default, a CHAR(5)
column will occupy five bytes of storage with blanks as needed to make five bytes. But if there are blanks, row compression strips
them away, using only the number of bytes needed for the non-blank data.

Page compression is done in three steps:
1. Row compression
2. Prefix compression
3. Dictionary compression

The Row compression step uses the techniques previously described above.

©2009 Learning Tree International. All Rights Reserved.
                                                                                        SQL Server 2008 Database Engine                17
   1-800-843-8733 • www.learningtree.ca                                     LEARNING TREE INTERNATIONAL                 White Paper

In Prefix compression, prefix strings are extracted from the data and placed in the page header. Consider the following data:
         aaabb            aaaab
The first value has a prefix of three a’s while the second has a prefix of four a’s. Once the prefixes are in the page header, the data
values will be modified so that they look like this:
         3bb              4b
which means that the first n characters are to be taken from the prefix. In the above example, 10 characters are replaced by 9 (the 4
a’s stored in the page header, 3 for the first string and 2 for the second string). That doesn’t seem like much, but in actual practice—
with more data—the savings are much greater.
The final step is the Dictionary compression. In this step, SQL Server looks for repeated values anywhere in the page, without
regard to which column they are in. The repeated values will be placed in the page header and replaced with a pointer to that
value in the actual data. This could mean substantial space savings—think about how many times the word “street” is going
to appear in a table of addresses. From the outside, retrieval and update of the data is unchanged, but there is now the cost
of reassembling the compressed data when it is being presented.
Row compression will be less CPU-intensive than Page compression, and you probably want to try that first if you’re think-
ing about using compression. Both types of compression are specified in the CREATE or ALTER table statements. Microsoft
provides a stored procedure named sp_estimate_data_compression_savings that you may want to use before deciding that
compression is the right thing for you.

Summary and References
I hope you’ve learned enough to start thinking about installing SQL Server 2008. If you’re eager to try some of the things I’ve
talked about, you can download the evaluation edition of SQL Server 2008 from http://www.microsoft.com/sqlserver/2008/

If you’ve heard about a new feature that I didn’t talk about and want to learn more or if you want to dig more deeply into the
features I have talked about, you can download the SQL Server 2008 Books Online from

You don’t have to have SQL Server 2008 to download the Books Online.

I’m sold on SQL Server 2008, but I know a database upgrade is a considerable task that many people aren’t eager to do,
particularly when a release is so new. However, if you’re already using SQL 2005, then you’ve really done the hard part—
you’ve gotten over the learning curve of that release. For those of you who are still using SQL 2000, you’ll have a lot of things
to catch up on, but, at this point, there’s no longer any good reason to upgrade to 2005. Just “leapfrog” to 2008. I hope you’ll
start exploring it. Build yourself a test environment (my friends all use virtual machines for this purpose), download the
evaluation edition and see whether or not you find some of the features as appealing as I do.

©2009 Learning Tree International. All Rights Reserved.
                                                                                          SQL Server 2008 Database Engine             18
  1-800-843-8733 • www.learningtree.ca                                            LEARNING TREE INTERNATIONAL             White Paper

About Learning Tree International                                                About the Author
Learning Tree International is a leading worldwide                               Sharon Dooley
provider of vendor-independent training to managers
and IT professionals in business and government                                  sharond@voicenet.com
organizations. Since 1974, over 1,900,000 course
participants from over 65,000 organizations worldwide                            Sharon Dooley has worked with SQL Server since
have enhanced their skills and extended their knowledge                          its first release, nearly 20 years ago. Her major areas
under the guidance of expert instructors with real-                              of specialization are database design and performance
world experience. Learning Tree develops, markets and                            tuning, though she also does a lot of administration.
delivers a broad, proprietary library of instructor-led                          She has a consulting practice and teaches and writes
courses focused on the latest information technologies,                          courses for Learning Tree International, where she is
management practices and key business skills.                                    SQL Server Curriculum Manager. Ms. Dooley is the
                                                                                 author of SQL Server 7.0 Essential Reference
Learning Tree International annually trains over                                 (New Riders) and the lead author of Professional
94,000 professionals in its Education Centres around                             SQL Server 6.5 Administration (Wrox). Sharon is an
the world. Learning Tree also provides training in a                             active member of PASS (Professional Organization
number of additional cities and on site at customer                              for SQL Server) and has presented at numerous
locations in 43 countries. For more information                                  conferences including the Pass Community Sumits,
about Learning Tree products and services, please                                SQL Connections, and Microsoft’s TechEd. She was
call 1-800-843-8733, or visit our Web site at                                    named a Microsoft MVP in 2001 and was an MVP
www.learningtree.ca                                                              for SQL Server until 2004.

    Alexandria, VA            Atlanta                Boston         Chicago

       Dallas               Los Angeles           New York City   Philadelphia

     Reston, VA            Rockville, MD             London         Ottawa

        Paris                 Tokyo                  Toronto      Stockholm

©2009 Learning Tree International. All Rights Reserved.
WP SQL08 CA0905
                                                                                               SQL Server 2008 Database Engine             19

Shared By: